Wiktionary:Beer parlour


Welcome to the Beer Parlour! This is the place where many a historic decision has been made, and where important discussions are being held daily. If you have a question about fundamental aspects of Wiktionary—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list below (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don’t make personal attacks, don’t change other people’s posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page and consider before posting here whether one of our other discussion rooms may be a more appropriate venue for your questions or concerns.

Sometimes discussions started here are moved to other pages for further development. In particular, changes to a major policy or guideline may be discussed on the corresponding talk page and “simple votes” (as opposed to drawn-out discussions) can be conducted on our votes page.

Questions and answers typically remain visible on this page for one to two months, but they can always be found in the appropriate monthly archive (based on the date discussion was initiated). While we make a point to preserve all discussions that were started here, talk that is clearly not appropriate for this page may be deleted. Enjoy the Beer parlour!



Parsing the principal parts in the headword lines of Latin verbs

Previous discussion:

As a consequence of the change to Latin verb definitions initiated in the above-linked discussion, some editors raised the concern that some readers might be confused by the mismatch between the grammatical status of the verb form which serves as the Latin lemma (the first-person singular present active indicative) and the status quo novus of how Latin verbs are defined (with the English infinitive). Accordingly, the suggestion was made to parse the first principal part of Latin verbs (the other three principal parts had already been parsed since time immemorial). The code to institute this was added, but soon after removed again because “[it] wildly change[d] the visual outcome of what Latin verbs have always been and [had] not [been] changed with a valid vote”. Given that consensus for the addition is not yet apparent, I thought I'd try to reinvigorate the discussion here.

Since the occasion has presented itself, I would like to propose a reform to how Latin verbs are parsed. I shall use for my example the old grammarians' favourite, Latin amō (I/to love). Currently, Latin verbs are parsed thus:

and restoring the parsing of the first principal part as it stood 14–17 November 2023 would result in:

In both these schemes, however, the parsings are badly and inconsistently conceived. All the parsings are abbreviated; if they weren't, this is what we'd see:

  • amō first-person singular present active indicative (present active infinitive amāre, first-person singular perfect active indicative amāvī, accusative supine amātum); first conjugation

The first three principal parts (hereafter PPs) are all active, so why is only the third PP labelled as such? The first and third PPs are both first-person, singular, and indicative, so why is only the first PP labelled as such? This doesn't really make sense. Note that, for normal Latin verbs' principal parts, no person but the first person, no number but the singular number, and no voice but the active voice is ever mentioned. Only the tense (present or perfect) and the mood (indicative or infinitive) vary; accordingly, they are the important features to parse. If we restricted ourselves thereto, amō would look like this:

Of course, the initial concern about the aforementioned mismatch remains in that scheme, so what I actually propose is:

Thoughts? 0DF (talk) 03:14, 1 July 2024 (UTC)

I agree that we can leave out "active" from all parts: I doubt anyone would expect unlabeled forms to be passive, so treating the active as an unmentioned default seems unproblematic. So I support replacing "perfect active" with "perfect indicative". I think it's not so obvious though that the citation form is first-person singular. While "first-person singular present indicative" is a bit long, possibly we could use "1sg present indicative" and "1sg perfect indicative" with a link explaining what the abbreviation 1sg means?--Urszag (talk) 03:35, 1 July 2024 (UTC)Reply
I suggest using standard linguistic glosses throughout:
With tooltips spelling everything out in plain English. Nicodene (talk) 09:56, 1 July 2024 (UTC)Reply
This one reads the smoothest. Fay Freak (talk) 14:16, 1 July 2024 (UTC)Reply
I don’t think this is a good idea. Tooltips don’t display on mobile devices, and the abbreviations are difficult for ordinary readers to understand. — Sgconlaw (talk) 16:17, 1 July 2024 (UTC)Reply
We already do that for gender and number. Nicodene (talk) 21:05, 1 July 2024 (UTC)Reply
Funny that I specifically considered mobile users and reached the opposite conclusion. It is advised either way to keep the lines short and avoid line breaks ergo as well vertical space. Our mobile presentation can look different, by including a question-mark toolbutton in place of relying on tooltips. And link to w:List of glossing abbreviations or something; we have not shied away from having an Appendix:Glossary, so @Sgconlaw is inconsequential here, though Nicodene has not expressly thought it through. Fay Freak (talk) 23:15, 1 July 2024 (UTC)Reply
My opinion may be a bit radical. Our headwords are already five times more verbose than normal paper dictionaries, because of course, we can afford it, but visual clutter is still a thing (i.e. scouting out the real information drowned in labels). Even as they currently stand I would make them shorter:
amō (perfect amāvī, supine amātum, infinitive amāre)
More thorough explanations can be held in a centralised appendix, which should somehow be accessible and reader-friendly. As a side note, I notice only now our Latin headwords have moved infinitives from the fourth place (as they're in all Latin dictionaries) to the first one; I don't see any obvious reason why we would be breaking such a secular practice in Latin lexicography. Catonif (talk) 11:24, 2 July 2024 (UTC)
I’d be happy with that. Plus perhaps a footnote/toolbutton/whatever with a message along the lines of:
  • “The four principal parts are as follows: first-person singular present active indicative (first-person singular perfect active indicative, accusative supine, present active infinitive).”
Nicodene (talk) 12:29, 2 July 2024 (UTC)Reply
Seeing as the headword will never need to link to another page (since it's identical to the name of the page), I guess we could put a tooltip on the headword itself that when hovered on, explains that it is a first-person present indicative form--unless that would be too distracting. Like this:
amō (perfect amāvī, supine amātum, infinitive amāre)
--Urszag (talk) 13:56, 2 July 2024 (UTC)Reply
I suppose that is enough. More complete information about the other three principal parts can always be found by just clicking on them. Nicodene (talk) 16:02, 2 July 2024 (UTC)Reply
I support this--i.e., (1) having tooltips so that all the relevant information about the verb forms is presented, but (2) without cluttering the entry, and (3) moving the infinitive to the end to match the order used by other dictionaries. I would also support a solution that used straightforward abbreviations (e.g. "1st person sing."). I am concerned about the status quo, which has simply removed any indication of the lemma not being an infinitive form. Andrew Sheedy (talk) 18:48, 2 July 2024 (UTC)Reply
Re reordering the principal parts, I was surprised to discover that Charlton Thomas Lewis's An Elementary Latin Dictionary (amō) and Félix Gaffiot's Dictionnaire illustré latin-français (ămo) do indeed give the principal parts in the order present indicative, perfect indicative, supine, present infinitive. However, I'm not seeing universality here. Harm Pinkster's Woordenboek Latijn/Nederlands (amō via Logeion) gives only the present indicative and present infinitive; Joaquim Affonso Gonçalves' Lexicon Magnum Latino-Sinicum (Amo) gives the first- and then second-person singular present active indicative (amō, amās), followed by the present active infinitive; whereas Lewis & Short's A Latin Dictionary (ămo) gives them in the order present indicative, perfect indicative, supine, omitting the present infinitive altogether. Our order is that which prevails on Wikipedia and in Joseph Henry Allen and James Bradstreet Greenough's New Latin Grammar for Schools and Colleges (see § 1.26.1 thereof). Pace Catonif, I am not persuaded that we are in fact "breaking…a secular practice in Latin lexicography" by ordering verbs' principal parts as we do. 0DF (talk) 20:38, 2 July 2024 (UTC)Reply
Aha! Well conducted research. Not trying to persuade you, legitimately thought that was a thing, mostly due to the Italian lexicographic tradition for Latin verbs we're taught in schools, which is actually first person singular indicative present, second person singular indicative present, perfect, supine, infinitive: all four Italian-Latin works I have access to work like that. I guess if it is not as rigid as I was taught we can go with whatever is most common in Anglophone literature, or really, we can keep the current system and eventually discuss this separately if we feel like we have to.
Anyways, I also agree with having a tooltip (even though there's always the mobile issue), maybe however just containing "first person singular"? Since indicative, present, and active is all what you'd expect anyways from a lemma, and the first-person status is also shared with the perfect. Catonif (talk) 22:10, 2 July 2024 (UTC)Reply
Given that around 30-40% of our readers are using mobile devices (source), I Oppose any further use of hover titles beyond what is already in place. Would definitely Support removing the word "active" and re-adding the caption that labels the first principal part as the first-person singular present (indicative? not sure if we need to mention that). This, that and the other (talk) 23:54, 8 July 2024 (UTC)
0DF's idea of tooltips might work with changes in Module:headword and a JavaScript gadget to show the content of tooltips. That is, you would tap "present indicative" in the text below and a box would pop up above that with "first principal part: the first-person singular present active indicative". I guess the gadget would also have to detect taps outside the box or the label and dismiss the popup box. Module:headword doesn't currently support tooltips on labels, but they could be added to the module or inserted manually into the labels by Module:la-headword.
amō present indicative (present infinitive amāre, perfect indicative amāvī, supine amātum); first conjugation
This, does this sound like a feasible workaround for the mobile site? (I am not sure I can take this on right now. I'm already very behind on requests to do Wiktionary things.) — Eru·tuon 21:11, 25 July 2024 (UTC)Reply
Yes, some kind of solution that can be activated by both hover and touch is needed. We could do it with JS. As with you, I am rather busy right now (a new teaching semester has begun) so may not be able to get onto it either. This, that and the other (talk) 23:00, 25 July 2024 (UTC)Reply
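For illustration only, here is a minimal Python sketch of the data the label-plus-tooltip idea discussed above would need: a short label shown in the headword line and a full parse held back for a hover or tap. It is not Module:headword or Module:la-headword code; the names PrincipalPart, render_headword and tooltip_for are hypothetical, and the full parses are taken from the unabbreviated example earlier in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrincipalPart:
    form: str     # the Latin form, e.g. "amāvī"
    label: str    # short label shown in the headword line
    tooltip: str  # full parse, revealed on hover or tap


def render_headword(lemma, others, conjugation):
    """Build the plain-text headword line from the short labels only."""
    inner = ", ".join(f"{p.label} {p.form}" for p in others)
    return f"{lemma.form} {lemma.label} ({inner}); {conjugation}"


def tooltip_for(parts, label):
    """Return the full parse that a tap or hover on `label` would reveal."""
    for part in parts:
        if part.label == label:
            return part.tooltip
    return None


# Hypothetical data for amō; the parses follow the spelled-out example above.
AMO = PrincipalPart("amō", "present indicative",
                    "first-person singular present active indicative")
OTHERS = [
    PrincipalPart("amāre", "present infinitive", "present active infinitive"),
    PrincipalPart("amāvī", "perfect indicative",
                  "first-person singular perfect active indicative"),
    PrincipalPart("amātum", "supine", "accusative supine"),
]

line = render_headword(AMO, OTHERS, "first conjugation")
print(line)
```

A gadget along the lines described would then only need to look up the tooltip for the tapped label and dismiss the box on any tap elsewhere; that interaction is not modelled here.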

Should minimum wage be in a PIE root category?

Sorry for a header so specific; what I mean to address is actually this general issue, but I wasn't able to capture it in a concise enough title. It is something that has bugged me for quite some time already, but now that {{etymon}} has started to be greatly employed and even deals with categorisation (not without disagreements, see GP § June 2024), it seems like this has become a very pressing issue. For the sake of an example, take Category:English terms derived from the Proto-Indo-European root *mey- (small). At the moment it contains mostly sensible entries and is an actually enjoyable and relatively useful category. But notice entries such as minimum wage (where the question would be, does that need an etymology in the first place) and, most importantly, note that once we fully undergo proper automated consistency, this will contain thousands of entries containing the prefix mono- (e.g. monobrominated, monochromaticity, monotheistically), making the category overly cluttered and making non-obvious results much harder to find in it.

My approach for this (automation aside, this is how I have always used {{root}}) is to only add to that category the terms that aren't themselves derivatives (or derivable, tricky here) of other entries that are already in the category. Essentially the category would contain only the basic terms from which all the other ones can be derived.
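As a rough sketch of that rule (not how {{root}} or {{etymon}} actually work), the filter could look like this in Python. The function name basic_members and the derivation data are hypothetical and purely illustrative; the sketch also assumes each entry lists its direct parents only.

```python
def basic_members(candidates, derived_from):
    """Keep only the candidates that are not derived from another candidate.

    candidates   -- set of terms that would otherwise land in the category
    derived_from -- maps a term to the set of terms it is directly built from
    """
    return {
        term
        for term in candidates
        if not (derived_from.get(term, set()) & candidates)
    }


# Illustrative data only: which entries are built from which.
CANDIDATES = {"minimum", "minimum wage", "minimum wage law"}
DERIVED_FROM = {
    "minimum wage": {"minimum", "wage"},
    "minimum wage law": {"minimum wage", "law"},
}

kept = basic_members(CANDIDATES, DERIVED_FROM)
print(sorted(kept))
```

Because only direct parents are checked, a term whose parent is missing from the candidate set would slip through even if a more distant ancestor is in the category; that is one of the places where the "derivable" part gets tricky.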

I can see how this can seem unappealing to those seeking full module automation and steel-hard consistency across all entries, although I hope many can agree that whoever chose to make root categories a thing didn't intend them to be an endless list of monotonousness.

Catonif (talk) 17:42, 1 July 2024 (UTC)Reply

I don't see why an MWE like minimum wage needs an etymology. If an MWE were to have an etymology, it would seem useful to exclude it from etymology trees. This kind of thing is also a problem under Derived terms headers, where it would not surprise me to find minimum wage law. DCDuring (talk) 18:32, 1 July 2024 (UTC)Reply
It was originally agreed that this template would not be deployed in multiword terms, not sure who's ignoring that. I don't see the problem with single-term entries being categorized. Vininn126 (talk) 19:21, 1 July 2024 (UTC)Reply
Side note, but can we make an exception for that when a multiword term can't be broken down into its constituents? It would be really dumb to exclude Hong Kong, for instance. Theknightwho (talk) 20:19, 1 July 2024 (UTC)Reply
I think that was mentioned in the thread, I can't remember. Vininn126 (talk) 20:25, 1 July 2024 (UTC)Reply
The exceptions to the MWE exception do need to be respected. Do we have a category for such terms? If not, we could benefit from having one. DCDuring (talk) 21:19, 1 July 2024 (UTC)Reply
I am aware of a few other multi-word entries that use {{root}}, namely Ku Klux Klan ("Ku Klux" being a split form of Ancient Greek κύκλος (kúklos)) and sgian dubh (borrowed together from Scottish Gaelic; neither word is used separately in English).
This discussion reminds me of a similar one from June about the etymon template. As for this discussion, I wondered the same thing, but I feel that the following rules of thumb are likely to prevent people from getting too riled up:
  • Only lemmas should be included, unless sufficiently distinct (such as inflected forms of be) or suppletive (such as people). Where something like datum versus data falls, I don't know.
  • Each one must be a single word or morpheme; this may include, I suppose, every word prefixed with insert prefix here (I won't go crazy with it, though). However, "unsplittable" terms such as the ones described above can be treated as one word.
  • WT:COALMINE scenarios... I'm not sure.
  • Descendant hubs should not use {{root}} or etymology trees, though using the {{etymon}} template is okay for passing information to other pages.
-BRAINULATOR9 (TALK) 02:18, 2 July 2024 (UTC)Reply
@Catonif: minimum wage is:
  • English
  • A term
  • Derived from the Proto-Indo-European root *mey- (small).
So there is absolutely no justification for it to not be in a category called Category:English terms derived from the Proto-Indo-European root *mey- (small).
It seems like what you really want to do is to change the category itself. What you're describing (i.e. a category without thousands of mono- words) might be more accurately dubbed Category:Common English words derived from the Proto-Indo-European root *mey- (small). I don't know how we could decide which terms are "common" enough to include. Also @Vininn126: the previous BP discussion was about etymology trees on multi-word entries, not the template itself. Ioaxxere (talk) 04:17, 2 July 2024 (UTC)Reply
@Ioaxxere: It's also spelled with the letters "a","e","g","i","m","n","u", and "w", but a decision was made to not have "spelled with" categories for letters that are part of the normal orthography of a language. We absolutely should have single words and morphemes in all the applicable derivation categories, but derivational categories for all the parts of a multi-word expression is unnecessary overkill. Do we really want categories for function words like "a", "and", "of", "or", "the", etc. in phrase entries? Chuck Entz (talk) 05:46, 2 July 2024 (UTC)Reply
@Chuck Entz: I agree. That's why we don't have a category called Category:English terms spelled with A. My point is that if the category Category:English terms derived from the Proto-Indo-European root *mey- (small), as literally specified, is too large to be useful, then it should simply be deleted, rather than post hoc negotiating "well, actually, it's not all the English terms...". Ioaxxere (talk) 06:26, 2 July 2024 (UTC)Reply
@User:Ioaxxere The principle that you seem to require us to follow is that once something is specified, the specification must remain unchanged, either for all time or perhaps until you decide otherwise. This would seem to mean that nothing should be specified unless it is perfectly specified for all time. Good luck with that. I've always thought that humans, both individually and in groups, were at their best when learning and adjusting their institutions accordingly.
To me making the simple adjustment of excluding MWEs by default from the listing in question and requiring exceptions to be made manually seems reasonable to cover the cases where the MWE has a non-trivial etymology. Occasionally the etymology of an MWE is relevant to an etymon tree. But usually it is not. For example, I don't understand why anyone finds adding a trivial etymology to a multi-part taxonomic name a good use of their time or of a user's attention, but some do. DCDuring (talk) 14:23, 2 July 2024 (UTC)Reply
I think it's a fair point that these kinds of categories are doomed to be highly incomplete and arbitrary, especially if it's left up to manual placement of "root" templates. It's not so obvious that this is something that makes sense as a category (if we're just presenting a manually curated list with no pretensions at being comprehensive, wouldn't that almost be more appropriately presented in an appendix?). I definitely think it seems especially low value to include multi-word terms in these kinds of categories, and that isn't a very difficult exclusion criterion to apply (although the use of "term" instead of "word" in the category name doesn't help with making this criterion apparent). But the conversation has also brought up prefixed or derived single words, which is a much harder criterion to follow. (Incidentally, it seems like "mono-" probably doesn't come from *mey- after all, though that doesn't resolve the issue of if we want all the mono- words included in some category or another.)--Urszag (talk) 14:34, 2 July 2024 (UTC)Reply
I agree with User:Urszag here. Ioaxxere (talk) 18:26, 2 July 2024 (UTC)Reply
Well put. Vininn126 (talk) 18:35, 2 July 2024 (UTC)Reply
Then I propose we simply change these categories to read "English words derived from the Proto-Indo-European root...". Andrew Sheedy (talk) 18:53, 2 July 2024 (UTC)Reply
Agree with Urszag and Sheedy. But what about Ku Klux Klan and similar? DCDuring (talk) 19:35, 2 July 2024 (UTC)
@Andrew Sheedy That doesn't work, because we need an exception for terms with spaces that aren't decomposable in the given language. Theknightwho (talk) 21:03, 2 July 2024 (UTC)Reply
@Ioaxxere I think a short-term way out of this mess would be to allow control over categorization: there should be parameters to tell the module that certain parts of the tree should not generate categories. There may be better names for the parameters I'm suggesting.
  • first the easy part:
    • Prevent addition of categories to the current entry only without affecting the drawing of the tree:
      • |nocat=[language code or spec of node to be uncategorized]
      • |endcat=[language code or spec of the highest node to be uncategorized]
This tells it to show categories up to the node in question, but not those of any of its ancestors. If you don't think the ancestry of a minor morpheme in an Old English ancestor should be added to the categories for a Japanese calque of an English term, you would just put |endcat=ang, or whatever the spec is for the Old English morpheme itself. If the node given is the current entry, no category at all would be added, so you could use |nocat=[spec for "the" in the current entry name] to keep it from showing any categories for "the".
  • more complicated:
    • The same as above, but the parameters would control {{etymon}} in all the other entries that use the entry as a node in their trees. Thus, a parameter in an Old English entry could prevent a certain node in its tree from being used to add categories for any of its children, and another could tell all of its children not to look past a certain ancestral node in its tree when adding categories.
That's all I have time for right now, but at least it should give the basic idea. Chuck Entz (talk) 15:14, 3 July 2024 (UTC)
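To make the "easy part" concrete, here is a minimal Python sketch of one possible reading of the two proposed parameters. It is an assumption-laden model, not {{etymon}} code: the function categories_for, the category wording, and the placeholder terms are all hypothetical. It assumes a single linear chain of ancestors, that nocat suppresses the category of the named node only, and that endcat keeps the named node's category but drops everything above it.

```python
def categories_for(chain, nocat=frozenset(), endcat=None):
    """Return category names for an entry's ancestors.

    chain  -- ancestors ordered from nearest to most remote,
              as (language_code, term) pairs
    nocat  -- language codes whose node should add no category (assumed reading)
    endcat -- language code of the last node allowed to add a category;
              its own ancestors are ignored (assumed reading)
    """
    categories = []
    for code, term in chain:
        if code not in nocat:
            # Hypothetical category wording, for illustration only.
            categories.append(f"terms derived from {code} {term}")
        if code == endcat:
            break  # do not look past this node
    return categories


# Placeholder chain: real language codes, made-up terms.
CHAIN = [
    ("enm", "term-1"),
    ("ang", "term-2"),
    ("gem-pro", "*term-3"),
    ("ine-pro", "*term-4"),
]

stopped = categories_for(CHAIN, endcat="ang")
skipped = categories_for(CHAIN, nocat={"gem-pro"})
print(stopped)
print(skipped)
```

The "more complicated" part, where a parameter on an ancestor's entry controls categorisation in every entry that uses it as a node, would need the lookup to happen on the ancestor's page and is not modelled here.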
@Chuck Entz: Those are interesting ideas. But I think it would be better to come up with some simple and consistent rules that the template can enforce rather than letting editors arbitrarily cut off whichever categories they like.

Ryukyuan kanji entries

I'm a bit concerned about the large number of unsourced kanji entries we have for the non-Okinawan Ryukyuan languages. I note that they were generally added to pages en masse by users who like(d) to bounce about between languages (e.g. [1] [2] [3]), and most are completely unsourced. In some cases, they don't seem to make much sense, either: e.g. Miyako 食ー, which I think has been inferred from JLect or from Nikolay Nevskiy’s Miyakoan dictionary, which gives "foː", but I can find no dictionary to verify the kanji spelling, and it seems implausible that we'd have a lone ー after the kanji, given that's not where the morpheme boundary is.

There are a handful of entries that do provide a source for the kanji spelling, like Kikai , and although JLect isn't seen as very reliable by some contributors, it's better than nothing. However, I really think we should remove all of the unsourced entries, as they look strongly like inferences to me. Before I nominate them, though, I wanted to hear what others have to say first. @Eirikr @Fish bowl @Chuterix @Lattermint @Poketalker @Kwékwlos @Mellohi! @TongcyDai Theknightwho (talk) 19:41, 1 July 2024 (UTC)

Also pinging @荒巻モロゾフ.
Recently, for some new Yonaguni entries I create that source the reference of the original word, I tend to put main headword at hiragana (used in Dunanmunui Jiten). The alternative kanji however, is entirely inferred from etymology/semantics; feel free to remove them if you don't like it. Chuterix (talk) 20:14, 1 July 2024 (UTC)Reply
Most Ryukyuan languages, except for Shuri, have little to no literary tradition. Hiragana would be best suited, since the kanji is meant to signify a possible Japanese cognate but not all etymologies are correct. Kwékwlos (talk) 21:38, 1 July 2024 (UTC)Reply
Agreed with Kwekwlos. We should move all Ryukyuan entries to hiragana. For Shuri, kanji should only be an alternative spelling. Chuterix (talk) 21:48, 1 July 2024 (UTC)Reply
IFF kanji spellings are used in texts written in those respective languages, then great, we should include those somewhere (whether as main or alt-spelling entries, I currently have no strong opinions). ‑‑ Eiríkr Útlendi │Tala við mig 22:22, 1 July 2024 (UTC)Reply
I've expressed similar concerns at Beer Parlour (March) but do not have the knowledge to comment further. —Fish bowl (talk) 22:01, 1 July 2024 (UTC)Reply
My thoughts on this are still the same:

We should lemmatize at what native speakers have used the most, absent a standard orthography, regardless of if it seems inconsistent or "ad-hoc". Defective or variant orthographies are not specific to Ryukyuan, and in other cases, we list the variants as alternative forms with the "standard" or most-common form as the lemma. (Or in the case of two differently-pronounced words represented by the same orthography, we disambiguate in the etymology + pronunciation sections)

For Okinawan in particular, there are several works written in mixed script (Kanji & kana), and it looks to be the traditional orthography as well, so I wouldn't support a move to solely kana, and definitely not the Latin script. The same level of research should be done for the other languages as well; if they are more-written in the Latin script or katakana [or hiragana], then shifts can be made, but the research needs to be done first.

The same applies here. I would highly recommend doing a deep dive into what speakers use most often. And again, I would not support a move of Okinawan entries to hiragana. AG202 (talk) 17:19, 3 July 2024 (UTC)Reply
@AG202 I completely agree with you re Okinawan; I had hoped I'd made it clear that it's not in the scope of this thread, as I'm aware it has its own literary tradition that is best handled in the same way we handle Japanese. Theknightwho (talk) 20:49, 4 July 2024 (UTC)Reply

Category:English arbitrarily coined terms

Should we have a category for words that were completely "made up" like quark, grok, or frabjous? Searching "arbitrary formation" on the OED reveals many more results. (accepting suggestions if anyone has better ideas for a name)

Personally   Support. Ioaxxere (talk) 21:29, 2 July 2024 (UTC)Reply
Wouldn’t this just be “Category:English nonce terms”? — Sgconlaw (talk) 04:06, 4 July 2024 (UTC)Reply
@Sgconlaw: Checking the contents of that category I see that very few of them would fit into my proposed category. Ioaxxere (talk) 05:21, 4 July 2024 (UTC)Reply
@Ioaxxere: in your view, what is a “completely made up” term and how does it differ from a nonce term? — Sgconlaw (talk) 10:14, 4 July 2024 (UTC)Reply
ex nihilo vs nonce. Vininn126 (talk) 10:18, 4 July 2024 (UTC)Reply
@Vininn126: seems to me that “completely made up” terms are also “terms invented for the occasion”, so the former could quite happily be included in “English nonce terms” without the need for an additional category. — Sgconlaw (talk) 10:23, 4 July 2024 (UTC)Reply
No they are not. Nonce terms are "made for a single occasion". One could argue that normal affixation could also relate to nonce terms, wherein the speaker realizes it's not fully "lexicalized" the way other words are, and they are not restricted to new stems. Ex nihilo is a new stem, it may be nonce, it may catch on and become fully lexicalized. Vininn126 (talk) 10:25, 4 July 2024 (UTC)Reply
‘Arbitrary’ isn't all that descriptive.
I'd suggest ‘ex nihilo coinages’. Nicodene (talk) 05:03, 4 July 2024 (UTC)Reply
@Nicodene: How about "English terms coined ex nihilo"? Ioaxxere (talk) 05:21, 4 July 2024 (UTC)Reply
Sounds good to me. Nicodene (talk) 05:30, 4 July 2024 (UTC)Reply
Related: what do we do if only a part is ex nihilo, as is often the case in pharmacology, it turns out? → ipamorelin and its whole suffix. Fay Freak (talk) 05:31, 4 July 2024 (UTC)
I have created Category:English terms coined ex nihilo. @Fay Freak: I'm not sure if there's a sensible way to define this kind of thing. In your example, the "arbitrary" part (ipa-) is pretty substantial but in other cases it might be a single syllable or even letter. Consider entries like doge#Etymology_2 and forgor. Ioaxxere (talk) 17:42, 25 July 2024 (UTC)Reply

RQ for Rollbacker (and Patroller)

I'm not sure what the bar for these rights is, but I would like to request these tools as I believe they would be helpful on those days I end up watching RC for vandalism (which has been happening more frequently as of late). I will not roll back anything other than obvious vandalism, and my primary usage of patroller would be to simply review un-patrolled edits. — BABRtalk 08:34, 3 July 2024 (UTC)

I nominated you in WT:WL. — Fenakhay (حيطي · مساهماتي) 15:58, 3 July 2024 (UTC)Reply
Approved. Vininn126 (talk) 08:48, 4 July 2024 (UTC)Reply

Gender Only, or Gender + Number for Nouns in Translation Tables?

I've noticed that many noun entries in translation tables are annotated with gender only, with no indication of number, though no specific guidance or policy on this point is provided in the documentation related to translation tables.

Should number indeed be left out of translation table entries, perhaps unless the number of the translated noun differs from the number of the original noun? On the other hand, the argument for including number, along with gender, in translation tables, would be that it provides the most possible context for a reader who is more unfamiliar with the language in question.

If the consensus is that number should only be included if it differs from the number of the original English entry, I would suggest this policy be made explicit either in the translation table "add translation" forms themselves, or in Wiktionary:Translations.

Hermes Thrice Great (talk) 10:01, 3 July 2024 (UTC)Reply

I wouldn't support a policy of routinely including number. The point as I see it of including gender is that (for many commonly used languages) gender is lexically specific and relatively arbitrary relative to the meaning and potentially also the form of the word. Number is usually non-arbitrary and semantically transparent. I would agree with a policy of including number only when it differs from English, as in "meubles (fr) m pl" at furniture.--Urszag (talk) 05:42, 6 July 2024 (UTC)Reply
My impression is that by far the most common situation in noun translations tables is "English singular noun is translated into another language as Other-Language singular noun": I see no reason to indicate number in that case, since it's the 'default'. Whether to indicate it when the English and foreign-language numbers are plural (which is different from the default, but matches) is debatable. Obviously, it would be helpful to indicate where the English vs foreign-language numbers differ, as Urszag says. - -sche (discuss) 21:09, 7 July 2024 (UTC)Reply
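The rule both replies converge on (always show gender, show number only where it differs from the English headword) is simple enough to sketch in a few lines of Python. This is an illustration only: the function name annotation and the "sg"/"pl" codes are hypothetical and do not reflect the translation-adder's actual behaviour.

```python
def annotation(gender, number, english_number="sg"):
    """Return the gender, plus the number only when it differs from English."""
    parts = [gender]
    if number != english_number:
        parts.append(number)
    return " ".join(parts)


# French "meubles" under English "furniture": the numbers differ, so show "pl".
print(annotation("m", "pl", english_number="sg"))
# A singular translation of a singular English noun: gender alone.
print(annotation("f", "sg", english_number="sg"))
```

Under this sketch a plural translation of a plural English headword gets no number label; as noted above, whether that matching-plural case should be labelled is the debatable part.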

Full entries for alternative forms in Chinese

@Justinrleung, Fish bowl, Wpi, Mar vin kaiser, Kc_kennylau Right now, etymology 2 of soft redirects to , while under etymology 3 of is a full entry containing the pronunciation of Min in the different varieties of Min. I was wondering if a full entry for the Min word for "foot" at is allowed, since 漢語方音字彙 lists the readings of under the entry for as 訓讀训读. What are the rules regarding entries for alternative forms in Chinese?

As a side note, perhaps should be adopted as the main form for the Min word for "foot" instead of , which we currently use. is used only in Taiwan sources, and as far as I know, is used only for Hokkien. On the other hand, is used in Mainland China publications for all varieties of Min. RcAlex36 (talk) 12:40, 3 July 2024 (UTC)Reply

I would argue that {{zh-see}} should only be used for orthographic variants that one would consider as representing the same underlying character, so things like traditional/simplified or 異體字; those that involve intermediate steps which cannot simply be summarised as "orthographic variant" (e.g. modern kun'yomi, modern borrowings for the character pronunciation, romanisations, puns) would be a full entry and use {{alt form}} (or preferably a template that specifies the type of alt form). Often the latter type would require (or already have) its own set of attestation separate from the main entry, so it would be more convenient to have a definition line (as in the {{alt form}}) where quotes can be added to.
Some examples: and redirecting to ; redirecting to will continue to use {{zh-see}}, while for ; for ; or for ; der for will use {{alt form}}. – wpi (talk) 14:39, 3 July 2024 (UTC)Reply
I agree with the general intuition from wpi. I also would like to add that {{zh-see}} should only be used when it can entirely replace a whole etymology section. Any proper subset of an etymology section, such as restrictions to certain pronunciations, definitions, and/or lects, should use {{alt form}} for sure. — justin(r)leung { (t...) | c=› } 14:46, 3 July 2024 (UTC)

Is it ever necessary to use {{etymon}} in a redirect?

I'm referring to pages that look like this:

#REDIRECT [[some page]]
{{etymon|en|id=something}}

Here are the disadvantages:

  • Harder to follow an {{etymon}} chain, as you have to check both the redirect and the redirect target to find an {{etymon}} call.
  • Worse performance, as the template has to search both the redirect and the redirect target for the parent.
  • A *lot* of corner cases. Say we have {{etymon|en|id=someID}} on some_redirect and {{etymon|en|id=someID}} on redirect_target. Should this be allowed? It has to be, because otherwise you can never add {{etymon}} on a page until verifying that the same ID isn't already used on any of its redirects, which is obviously inconvenient. But if we do that, now [[some_redirect#English:_someID]], when clicked, takes you to the wrong place!
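The last corner case can be illustrated with a toy model. This is only a sketch and not how MediaWiki or {{etymon}} is actually implemented: the `Page` class, the `resolve` helper, and the page names are hypothetical, and the only assumption is that following a link to a redirect lands on the redirect's target with the fragment preserved.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    """Hypothetical stand-in for a wiki page."""
    title: str
    # Anchor IDs that {{etymon}} calls on this page would generate.
    anchors: set = field(default_factory=set)
    # Title of the redirect target, or None for an ordinary page.
    redirect_to: Optional[str] = None


def resolve(pages, link):
    """Return (title, anchor) of the page a reader lands on after clicking `link`."""
    title, _, anchor = link.partition("#")
    page = pages[title]
    if page.redirect_to is not None:
        # The redirect is followed, so the reader never sees the redirect page itself.
        page = pages[page.redirect_to]
    return page.title, anchor


pages = {
    "some_redirect": Page("some_redirect", {"English:_someID"},
                          redirect_to="redirect_target"),
    "redirect_target": Page("redirect_target", {"English:_someID"}),
}

# The link names the redirect, but the reader ends up at the target's anchor.
landing = resolve(pages, "some_redirect#English:_someID")
```

In this model the anchor generated on the redirect itself can never be reached by a link, which is the ambiguity described in the last bullet.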

Thus, unless there's a very good reason for this to be supported, I would like to remove all {{etymon}} uses from redirects.

Pinging @Theknightwho who pushed for this to be supported. Ioaxxere (talk) 14:16, 3 July 2024 (UTC)Reply

@Ioaxxere I didn't push for this to be supported - I just did the implementation, so that pre-existing attempts to do this weren't completely broken. Theknightwho (talk) 14:21, 3 July 2024 (UTC)Reply
@Theknightwho: I say "pushed" because you called it the "most sensible solution". But maybe I'm misinterpreting what you meant. Ioaxxere (talk) 16:03, 3 July 2024 (UTC)Reply
I don't think this is optimal - I think allowing for alt forms in the title of the pointed-to page would be better. Vininn126 (talk) 14:29, 3 July 2024 (UTC)Reply

categorizing modern English verbs as "class 4 strong verbs" etc

At Wiktionary:Requests for cleanup#Cat:English_class_4_strong_verbs, User:Mahagaja and I questioned whether it makes sense to be presenting modern English verbs as still having the class system they had back in PIE. Many verbs which were historically one class now behave like another class, or class distinctiveness has been lost, a lot has changed over the last few thousand years. I suggested that if anyone wants an etymology category, renaming the cats like "English verbs derived from PIE PGmc class X verbs" would make the intended(?) purpose and scope clearer, but alternatively it might make more sense to just not be categorizing this. What do you think? - -sche (discuss) 16:13, 3 July 2024 (UTC)Reply

The strong verb class system only dates back to Proto-Germanic, not Proto-Indo-European, AFAIK. I would prefer not categorizing them at all, but if we do, then yes, "derived from Proto-Germanic class ## verbs" makes more sense than calling them class ## verbs synchronically. —Mahāgaja · talk 16:18, 3 July 2024 (UTC)Reply
(That's its own issue, then, because the category descriptions are defining themselves in terms of the PIE conditions of the words, with no obvious reference to PGmc.) - -sche (discuss) 16:23, 3 July 2024 (UTC)Reply
I think the category descriptions are supposed to be giving background information, not defining criteria (even if that's not clear from how they are written).--Urszag (talk) 17:01, 3 July 2024 (UTC)Reply
IMO English verbs should not be categorized according to the Proto-Germanic strong verb system because most of the classes no longer have any coherence in modern English. (German is a different story. We still categorize modern German verbs according to this system because most of the classes have not lost their coherence in modern German.) Having this be an etymology category (reflecting what language? Middle English, Old English, Proto-West-Germanic, ...?) doesn't make a lot of sense IMO. Benwing2 (talk) 22:28, 3 July 2024 (UTC)Reply
I think it makes sense if all the verbs in the category are irregular due to still behaving like they're a member of a particular class, but it's probably better to give them a different name, as "English class 4 strong verbs" implies this is a common/agreed upon way of classifying English verbs. Theknightwho (talk) 20:43, 4 July 2024 (UTC)Reply
I read somewhere that someone tried to group English irregular verbs into classes and came up with 26 of them. Needless to say, there's no standard way of forming such classes; dictionaries just list the principal parts. Benwing2 (talk) 20:59, 4 July 2024 (UTC)Reply

What defines "transitive"?

I am in the process of converting {{indtr}} to use {{+obj}}, but {{indtr}} seems to play fast and loose with the "transitive" label so I'd like to get a sense of what people think "transitive" means. In my book, a "transitive" verb takes a direct object, and a verb whose only object is taken through a preposition is not a transitive verb. However, {{indtr}} labels such usages as transitive with em or similar. (Note that {{indtr}} doesn't actually categorize such verbs in e.g. CAT:Portuguese transitive verbs due to the way it implements the labels; I'm not sure if that was intended, though.) My questions are:

  1. Do we agree that a verb usage like Portuguese pegar em (to touch) is intransitive, even though it's translated in English using a transitive verb? (IMO yes.)
  2. If so, should we label the verb using {{lb|pt|intransitive}}, thereby categorizing it into CAT:Portuguese intransitive verbs? (IMO yes.)
  3. What about verbs like Latin serviō (to serve) (which takes a dative object) in languages like Latin that have a case system? Should these be labeled as intransitive? (IMO yes. This is what Gaffiot does, for example.)

I should add that Italian generally follows the above rules, and it's useful to do so because all non-reflexive transitive verbs (according to the above definitions) take avere as an auxiliary, but intransitive verbs can take avere or essere. I think Spanish does too, and Italian and especially Spanish make little use of {{indtr}} compared with Portuguese and French. Benwing2 (talk) 22:23, 3 July 2024 (UTC)Reply

From what I figured when I specifically researched this question, in order to write government labels, transitivity depends on semantic properties and thus can be mediated through adpositions, so I am pleased to see that the author of the template {{indtr}}, which I had not known of until now, due to generally editing other languages, understood it the same way. Many reference works dance around the mines when defining the concept, of course, particularly Wikipedia, whose manifold authors in one article have different understandings without realizing, giving a false impression of a coherent article. The German Wikipedia at least was and is pretty explicit about the optional restriction to only direct objects: the introduction says "Als intransitiv werden … Verben bezeichnet, die kein (oder, je nach Definition, kein direktes) Objekt haben" ("Verbs that have no object, or depending on the definition no direct object, are called intransitive"), and there is then a whole section about “conceptual (or terminology) variants”. Fay Freak (talk) 23:25, 3 July 2024 (UTC)
As a consequence, I disagree with your first conclusion.
Labels {{lb|langname|intransitive}} and {{lb|langname|transitive}} seem to have the purpose of disambiguating English equivalents, so one would have to reject the idea that they should categorize at all, were one not to know that other editors use the labels with different focus. In other words, the labels are polysemous, contrary to what we who describe language, trapped in its linearity, are used to intuiting. That is one reason why I avoid {{lb|langname|intransitive}} and {{lb|langname|transitive}} completely, preferring to specify government by {{+obj}}, formerly and in its new version, and use unambiguous verbose glosses. Fay Freak (talk) 23:38, 3 July 2024 (UTC)
The simplest criterion and the one I think that is most commonly used by English speakers is that "transitive" means "takes a direct object", which excludes verbs that take prepositional phrases as their complements. I guess there could be unclear edge cases in some languages like the use in Spanish of "personal a" for animate patients of otherwise transitive verbs. The necessity of accusative case isn't entirely clear to me: I believe it is traditionally treated as diagnostic in Latin, but it seems like there might be a tradition in Polish of recognizing some verbs that take a genitive or instrumental complement as transitive if they can be passivized.--Urszag (talk) 03:03, 4 July 2024 (UTC)Reply
There is a great deal of sloppiness in labeling English phrasal verb senses as transitive or intransitive. I think it derives from including full definitions that are arguably SoP in the phrasal-verb L2s. DCDuring (talk) 18:54, 4 July 2024 (UTC)
I'm opposed to calling any verb which takes a complement (be it a direct object - "verbes transitifs directs" in French - or a prepositional object - "verbes transitifs indirects") intransitive, though I agree it's not satisfactory to call them all transitive and leave it at that. Why not use "prepositional transitive" when needed? PUC 18:20, 4 July 2024 (UTC)
@User:PUC I assume that your opposition does not extend to all languages. DCDuring (talk) 18:54, 4 July 2024 (UTC)Reply
@PUC Concepts like "indirect transitive" appear to be Romance-specific. Benwing2 (talk) 19:01, 4 July 2024 (UTC)Reply
And even then, not found in all Romance languages. So I think it's much better to just call them intransitive; the fact that there is a preposition that can be (and often is not) attached is clear from the use of {{+obj}}. Benwing2 (talk) 19:02, 4 July 2024 (UTC)
To clarify, I have so far found the concept of "indirect transitive" only in the TLFi French dictionary and Michaelis Portuguese dictionary. It is not found in any other Portuguese dictionary, nor in any Spanish, Galician or Catalan dictionary that I have consulted. The Spanish, Galician or Catalan dictionaries label verbs as transitive only if they take a direct object; the Priberam and Infopedia Portuguese dictionaries are sloppy about the use of the labels "transitive" and "intransitive", sometimes calling verbs that only take a prepositional object "transitive" and sometimes "intransitive". The policy I'm following is that a verb that takes a prepositional object is transitive if and only if it also takes a direct object; hence "to base (something in something else)" is transitive, but "to confide (in someone)" is intransitive. Benwing2 (talk) 19:42, 4 July 2024 (UTC)
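The policy stated in the comment above can be written down as a small decision rule. This is only an illustrative sketch: the function name `transitivity_label` and the complement tags `"direct"` and `"prepositional"` are hypothetical and are not part of {{indtr}}, {{+obj}} or any Wiktionary module.

```python
def transitivity_label(complements):
    """Label a verb sense from the set of complement types it takes.

    Rule sketched here: a sense is "transitive" if and only if it takes a
    direct object; a prepositional object alone does not make it transitive.
    """
    return "transitive" if "direct" in complements else "intransitive"


# "to base (something in something else)": direct + prepositional object
base_label = transitivity_label({"direct", "prepositional"})
# "to confide (in someone)": prepositional object only
confide_label = transitivity_label({"prepositional"})
```

Under this rule the prepositional complement would still be recorded separately (for instance with {{+obj}}); it just has no influence on the label.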
I'm getting the impression we may need to adopt various definitions of transitive per-language. Vininn126 (talk) 19:02, 4 July 2024 (UTC)Reply
In Polish grammars a requirement is generally that it can form the passive. Some verbs can take "accusative" arguments but cannot form the passive, and most scholars analyze the accusative argument as more of an adverbial. Vininn126 (talk) 18:22, 4 July 2024 (UTC)Reply
@Vininn126 Are you referring to what are often called "cognate accusatives"? Benwing2 (talk) 19:04, 4 July 2024 (UTC)Reply
@Benwing2 No, remember Talk:przejść? Vininn126 (talk) 19:10, 4 July 2024 (UTC)Reply
@Vininn126 Hmm, can you give a sentence with one of these adverbial accusatives in them? I'm having a hard time understanding what's being referred to. Benwing2 (talk) 19:12, 4 July 2024 (UTC)Reply
here's an article.
Basically people might create nonce formations like jezioro zostało obeszłe or something but they're highly non-grammatical, despite being able to say "Jaś obszedł jezioro". Vininn126 (talk) 19:13, 4 July 2024 (UTC)Reply
I think I recall this happening in Russian, too, where some transitive verbs don't have a passive participle. Zaliznyak labels them (depending on the verb) as either missing the passive participle entirely or as having a "rare and awkward" passive participle (which in practice won't ever be encountered). There are also a very small number of intransitive verbs that do have a passive participle; I think these are verbs that for whatever reason use an object in some other case but sound transitive. It's also interesting to me that the article you linked mentions that Polish in general avoids the passive, unlike German; this is like Spanish and unlike English. Benwing2 (talk) 19:51, 4 July 2024 (UTC)Reply

"Encyclopedic" as a deletion reason

Discussion moved from Wiktionary:Requests for deletion/English#Sticks Nix Hick Pix. (to be archived at Talk:Sticks Nix Hick Pix)

I've forked this from that discussion because it concerns broader issues and was starting to take it over. Chuck Entz (talk) 17:56, 5 July 2024 (UTC)Reply

@Chuck Entz You're about the only one who's actually given any substantive reasons. Inqil and Fay gave no reasons at all, Sgconlaw said "we don't have these" but didn't say why, and Fenakhay simply said "encyclopedic" without saying encyclopedic in what respect. When I talk of a "false dichotomy", what I mean is a false dichotomy between dictionary definition and encyclopedia entry. The idea that there is some bright, hard-and-fast line between dictionary definition and encyclopedia entry is fantasy. There are Wikipedia articles about parts of speech and Wiktionary entries about places and occasionally events too. Furthermore, the term "encyclopedic" has now become a catch-all excuse for deleting or revising almost anything. One frequent usage I've seen of "encyclopedic" is to claim that an entry is too detailed, but that's clearly NOT the case here. Purplebackpack89 16:25, 5 July 2024 (UTC)

Well it is true that many encyclopedic entries can be lexicographical and vice versa, however when we say that a term is encyclopedic, it means it is purely non-lexicographical and has zero rationale for inclusion— unlike those tons of encyclopedic entries we keep such as toponyms, anthroponyms, and any abbreviations. Inqilābī 17:16, 5 July 2024 (UTC)Reply
a) If by encyclopedic you meant "non-lexicographical", you should have said "non-lexicographical" at the outset. And "non-lexicographical" isn't much better because it is also an amorphous idea. Purplebackpack89 17:34, 5 July 2024 (UTC)Reply
(After edit conflict) Just because there isn't a perfect bright line between the two concepts doesn't mean either is invalid. As I've explained to people in the past: an encyclopedia is about things: ideas, events, people, places and things. A dictionary is about the words, phrases, etc. used to refer to those as words, phrases, etc.
Yes, Wikipedia has articles about parts of speech- but Wiktionary doesn't (not in mainspace, anyway). Part of speech is information about the terms, that we give in the entries for them. This is illustrated by the fact that definitions for "verb" as a part of speech are under a "Noun" part of speech header, since the word "verb" is a noun.
Likewise, we don't have entries for things like List of ethnic slurs, even though we have lots of entries for ethnic slurs.
There's overlap between an encyclopedia and a dictionary as far as definitions, because they have to be clear about what the terms refer to and thus give some of the same information that an encyclopedia article contains. There's also overlap in encyclopedia articles, because they often contain information about the names and terminology used for the article subjects. Still, overlap isn't identity.
Of course, whether something is encyclopedic or not is sometimes not clear- but that just means we need to discuss it. Also, a wiki is a community that decides things via consensus. All of the rules you cite were originally arrived at by consensus. Right or wrong, your opinions are just your personal opinions unless there is a consensus that agrees with them. Chuck Entz (talk) 18:57, 5 July 2024 (UTC)Reply
I think you've actually acquiesced to the balance of what I've said.
And if there's a lack of clarity of the term "encyclopedic" (and it's pretty clear there is!), we need to tighten the language. Purplebackpack89 20:14, 5 July 2024 (UTC)
I think this is a good opportunity to clarify "Wiktionary:Criteria for inclusion#Wiktionary is not an encyclopedia", which currently states:

Care should be taken so that entries do not become encyclopedic in nature; if this happens, such content should be moved to Wikipedia, but the dictionary entry itself should be kept.

Wiktionary articles are about words, not about people or places. Articles about the specific places and people belong in Wikipedia.

The first sentence seems to be addressing the point that a definition (for example, for a scientific concept like relativity) should not become like a Wikipedia entry in length. Thus, the entry itself should remain but the encyclopedic information should be moved to Wikipedia (if it isn't already there).
The second sentence comes closer to addressing the general issue about when an entry is "encyclopedic" and so is not sufficiently lexical for inclusion in the dictionary, but it does not explain very much. It needs to be read in conjunction with "Wiktionary:What Wiktionary is not", which states: "Wiktionary is not an encyclopedia, a genealogy database, or an atlas; that is, it is not an in-depth collection of factual information, or of data about places and people. Encyclopedic information should be placed in our sister project, Wikipedia." We should discuss whether this makes it clear enough to determine when a term is "encyclopedic" and so inherently unsuitable for the dictionary.
Following the discussion here, we should have a formal vote to amend and clarify the CFI. — Sgconlaw (talk) 18:12, 5 July 2024 (UTC)

The general idea of the RfD and now of this post is that the word "Encyclopedic" is thrown around way too casually as a catch-all for everything. The secondary concern is attempts to establish a bright line between encyclopedia entry and dictionary definition are folly. There's just too many similarities and too much of a gray area. Let's take a more specific look at the two clauses Sgconlaw references. I believe both are in need of some revision:

  1. The first one needs to use greater precision than the catch-all "Encyclopedic". Instead of saying, "entries do not become encyclopedic in nature", perhaps a better phraseology might be, "entries and definitions do not become overly lengthy or detailed".
  2. Given my druthers, I'd dispense with the second, or reword it, because there are already many exceptions and need to be more. There is a long list of places that are acceptable entries or definitions. If a person or a fictional character becomes genericized, they are permitted a definition. There are also nicknames for individual people or groups of people.

Furthermore, there may need to be guidance at RfD that simply saying "Encyclopedic" isn't a thorough enough rationale for deletion, and greater specificity is required. Purplebackpack89 18:32, 5 July 2024 (UTC)

If we do that, could we also legislate on bare "Keep" votes without rationale? PUC 19:58, 5 July 2024 (UTC)
When someone says keep or delete without any further statement, it generally means the rationale is the same as that of the previous voters in the post. Inqilābī 20:37, 7 July 2024 (UTC)Reply
Purple, may I suggest reading encyclopedia vs dictionary (on Wikipedia)? Just because you mentioned being unsure what type of content is encyclopedic material and what type of content is dictionary material. Hope that helps clear things up.

That being said, even putting aside it being encyclopedic material, it fails CFI as the term is SOP. The general expectation for multi-word terms is that the words put together have a different meaning than they do apart, but that doesn't seem to be the case here. The definition is literally just every word that makes up the term, hence it's SOP. And, while we do sometimes make exceptions to certain CFI policies, we do so when there is agreement that the terms would be helpful for translations. Such an exception wouldn't apply here; there isn't really any translation-based usage for this term that I can think of. So, no hard feelings Purple, but this does not meet our CFI, even if you challenged the current definition of "encyclopedic". — BABRtalk 01:16, 6 July 2024 (UTC)
I think most of what you've said belongs at the RfD discussion as it focuses on the specific rather than the general. Purplebackpack89 14:16, 10 July 2024 (UTC)Reply

Moratorium on editing other languages' etymology sections for the purpose of English etymology trees

I'd like to request a moratorium on the editing of other languages' etymology sections for the purpose of populating English etymology trees (outside of adding {{etymon}} based on the already existing etymology). This has led to several conflicts and cleanups due to English editors wanting to display an etymology tree, haphazardly editing another language's etymology and causing misinformation to spread for other editors to clean up. It's been brought up to such users several times (primarily on Discord), but it looks like the problem has been continuing. As such, I'm bringing it to BP for a wider audience.

Whether it be the creation of problematic PIE reconstructions as detailed at Wiktionary:Requests for deletion/Reconstruction § Reconstruction:Proto-Indo-European/stéh₂tus, the editing of Spanish guayaba based on a misreading of an already faulty source for the tree at English poggers, the creation of {{etymon}} for nonexistent Middle Irish entries as stated at Wiktionary:Grease pit/2024/June § Template:etymon for nonexistent entries added to other entries, or the editing of Welsh entries for the purpose of the tree in English Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch by an editor with very little experience, it's become clearer to me that certain editors are more focused on the gamification aspect of trees than on propagating pertinent and accurate information. See also: Wiktionary:Beer parlour/2024/June § Use of etymology trees made with Template:etymon in the entries for multi-word terms.

This was mainly sparked by the edit at guayaba because I don't even know where to begin to fix it, and it doesn't seem like {{etymon}} allows a derivation from a parent language with no attested term. I'm tired at this point of bringing this up on an individual basis to users and having to play cleanup, and had I known it would've exploded like this I would've voted oppose at the vote instead of abstain until this was made more clear. It's gotten out of hand. AG202 (talk) 00:49, 6 July 2024 (UTC)Reply

  Support 100%. The fact that {{etymon}} seems to be in large part used to add "cool" trees to terms like United States of America, pneumonoultramicroscopicsilicovolcanoconiosis and such makes it clear that too many people view this as a game. At the time this template was being created, I had a bad feeling about the design and usage and expressed my concerns; unfortunately these concerns appear to be borne out. Benwing2 (talk) 02:48, 6 July 2024 (UTC)Reply
Being entirely fair, a lot of this seems to be the work of one editor (@Akaibu), but I agree with @AG202: I have removed {{etymon}} from Welsh Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch, as the template call was essentially malformed by not even accounting for mutations, and I don't feel I'm in a position to correct it. While I'm all in favour of having etymology trees, I think we shouldn't be afraid to simply revert its addition to an entry if an editor is completely inexperienced with a given language, just as we would with any other part of an etymology section. There's no big rush here; the trees will get added over time, but it has to be done by editors who know what they're doing. Theknightwho (talk) 04:08, 6 July 2024 (UTC)Reply
I really wish it were mostly just one user. And the reason I've brought it to light is because it's led to mini edit wars as with the revision history at þeir that only ended when I brought it to Discord to remind folks yet again to not edit languages they're not familiar with. And then I myself had to go find and revert all the tree additions. AG202 (talk) 04:15, 6 July 2024 (UTC)Reply
I will add that I haven't taken a close look at the {{etymon}} syntax but from what I've seen it needs an overhaul. This implies we should not be adding it all over the place yet because all the uses will have to be fixed by bot. Benwing2 (talk) 04:43, 6 July 2024 (UTC)Reply
Just for clarity's sake, after posting the above image, I was (inadvertently) pinged by @Akaibu on Discord in the English channel who stated:

>I really wish it were mostly just one user. ~ @AG202

continues to bitch and moan about something that was still a single user's doing

This is just highly inappropriate. They've since replied that it was a joke about "[themselves] myself being the cause of the problems" and "saying one deserves more credit for causing trouble", but I still do not find it appropriate. This adds to their other messages on Discord pushing back against simple asks to not edit languages that they're not knowledgeable about (on multiple instances), along with misunderstanding basic tenets of how this project works. For a user with under 500 edits (a large chunk related to this effort), I'm concerned about constructive editing for the future. AG202 (talk) 04:45, 6 July 2024 (UTC)
  Support, and your statement that "certain editors are more focused on the gamification aspect of trees rather than propagating pertinent and accurate information" is spot on. PUC 08:05, 6 July 2024 (UTC)
  Support. I do feel that people just wanna use the new tool wherever, without checking the results.
(Also, should we gamify making good etymologies?) CitationsFreak (talk) 09:34, 6 July 2024 (UTC)Reply
  Support. Going to the first redlink or the first place we are sure is fine. It's better to be slow and sure than fast and wrong. It's better to make a request somewhere or discuss it at WT:ES than anything. Vininn126 (talk) 09:41, 6 July 2024 (UTC)Reply
  Support. @VGPaleontologist --{{victar|talk}} 02:06, 7 July 2024 (UTC)Reply
  Support. Theknightwho (talk) 02:23, 7 July 2024 (UTC)Reply

Criteria for "...terms inherited from..." categories

Related to the discussion above about PIE root categories, what kinds of bounds do we want on the membership of categories like Category:Latin terms inherited from Proto-Italic? I noticed this now includes words like solstitium and tessellatus, presumably because of the etymology trees. While I don't see an issue with putting these in Category:Latin terms derived from Proto-Indo-European, I don't think compound words that were put together after the ancestor language should go in this kind of category, much less words like tessellatus where only the suffixes are inherited, and the base is a borrowing. For simplicity, I think it would be best not to have words categorized as "inherited" unless they are inherited from only one etymon in the relevant ancestor language. @Ioaxxere would you be able to clarify if this is intended behavior, or a bug? Urszag (talk) 01:29, 6 July 2024 (UTC)Reply

@Urszag: the two entries you linked were using the template incorrectly which is what caused the unwanted categories to be added. I've gone ahead and fixed the entries. Ioaxxere (talk) 01:36, 6 July 2024 (UTC)Reply
Thanks for fixing my mistake. I hadn't realized it was erroneous to use "from" for affixed words, although that makes sense given the analogy to Template:from.--Urszag (talk) 01:39, 6 July 2024 (UTC)Reply
@Ioaxxere: What would be the correct derivation keyword to use for univerbations or multi-word phrases such as ab ante (or, if they don't end up being forbidden, cases like United States of America, LASER etc, which are now in "English terms inherited from Proto-West Germanic", "English terms inherited from Middle English", etc.)? I noticed the same issue here ("ab ante" being wrongly placed in "Latin terms inherited from Proto-Italic"), but wasn't sure about the semantically correct way to fix it. Should they be equated to compounds and marked with "af" (despite not containing any affixes)? I don't quite see in what circumstances the current behavior of "from" with multi-word etymologies will be appropriate, so maybe it should trigger an error message?--Urszag (talk) 04:37, 6 July 2024 (UTC)Reply
@Urszag: Yes, compounds are af. Maybe the name is a little misleading but it's explained in the documentation. Both of the examples you listed were added by @Akaibu who I hope can resolve the issue. Switching af and from still results in a valid etymology in this case (just not one that matches reality) so it's hard to automatically judge when someone has done it by accident. Ioaxxere (talk) 14:27, 6 July 2024 (UTC)Reply
I don't understand yet what the hypothetically valid alternative etymology would be in cases like this. Are there any concrete examples of entries derived from more than one etymon in the same language where it would be correct to use "from" rather than "af"? Seeing an example would help me understand better why it's necessary to distinguish the two keywords and their behaviors in this context. (I tried to look at examples of Template:from, but it seems to be barely used judging from the "What links here" tool.) Would "from" be reserved for cases like "a word used to have multiple forms that then merged into one" (but wouldn't that be more of a case for "influence"?) or "a term/phrase was derived from modification of a preexisting phrase" (but wouldn't that call for syntax like "from|en>united states of America...", like at religion of piss)?--Urszag (talk) 15:04, 6 July 2024 (UTC)Reply
@Urszag: Yes, you're correct. One example of from with more than one etymon would be cytotech which might be written |from|cytotechnologist>id1|cytotechnician>id2. Ioaxxere (talk) 15:38, 6 July 2024 (UTC)Reply
Thanks. While I think I understand this now, I think the way it currently works is unintuitive and is likely to lead to a lot of cases of "from" being used in contexts where it isn't appropriate according to these criteria, and where it will create categorization errors. Making it the default and describing it as "unspecified derivation type" doesn't help: "from" sounds a lot more generic than it really is, in contrast to "af" which sounds more specific than it really is. E.g. Reconstruction:Latin/ad vix was created with {{etymon|la|id=barely_vulg|la>ad>to|la>vix>barely}}, where the absence of a keyword apparently makes it be treated as "from", which inaccurately puts this entry into Category:Latin terms inherited from Proto-Italic.
Also, even in cases like "cytotech", would it really be accurate to describe this term as "inherited from Middle English" in a hypothetical situation where "cytotechnologist" and "cytotechnician" are inherited from Middle English, but the shortened form arose only in Modern English?
If the plan is to keep the current behavior of "from" with regard to inheritance categories the same, what would you think of making "af" instead the default when there is more than one etymon, all of which are in the same language as the entry?--Urszag (talk) 12:37, 9 July 2024 (UTC)Reply
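The default proposed in the comment above can be sketched as follows. This is a hypothetical illustration of the proposal only, not the actual Lua code behind {{etymon}}: the function `default_keyword`, the `(language code, term)` tuple representation, and the placeholder term `*example` are all assumptions made for the sketch.

```python
def default_keyword(entry_lang, etymons):
    """Pick the keyword to assume when an {{etymon}} call omits one.

    Proposed rule: if there is more than one etymon and all of them are in
    the same language as the entry, treat the term as a compound/affixation
    ("af"); otherwise fall back to the generic "from".
    """
    all_same_lang = all(lang == entry_lang for lang, _term in etymons)
    if len(etymons) > 1 and all_same_lang:
        return "af"
    return "from"


# Latin "ad vix", built from two Latin etymons with no keyword given.
ad_vix = default_keyword("la", [("la", "ad"), ("la", "vix")])
# A single etymon in an ancestor language (placeholder term) keeps "from".
single = default_keyword("la", [("itc-pro", "*example")])
```

With a default like this, an entry such as Reconstruction:Latin/ad vix would no longer fall through to "from", so it would not pick up the inheritance category by accident; explicit uses of "from" with several etymons, as in the cytotech example, would be unaffected.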

merge "pronominal" into "reflexive"

We have two labels pronominal and reflexive that are supposed to reflect a difference made in the linguistic tradition of certain Romance languages, whereby a "pronominal" verb is a reflexive verb whose meaning isn't obviously reflexive. Unfortunately in practice there's absolutely no consistency in how these labels are used, and the distinction doesn't reflect anything in the actual syntax of the verbs, but only an extremely subjective judgment as to whether a given sense has a sufficiently "reflexive" meaning to it. On top of this, the actual display of the labels doesn't do anything but muddy the waters: reflexive displays as reflexive while pronominal displays as takes a reflexive pronoun. Furthermore, the pronominal label doesn't seem to be used outside certain Romance languages even though there are several other languages (e.g. Slavic languages) that have a similar concept of reflexive verbs that may or may not be semantically reflexive. Finally, whether a verb is semantically as well as syntactically reflexive should be obvious from the specified meaning of the verb, i.e. the pronominal label adds absolutely nothing of value to the entry beyond what reflexive would do. So given all this I propose simply bot-replacing pronominal with reflexive. Benwing2 (talk) 02:44, 6 July 2024 (UTC)

I'm not sure about this. I agree the distinction doesn't seem particularly necessary in terms of usage label text. This blog post makes a four-way distinction between reflexive, reciprocal, idiomatic pronominal, and essentially pronominal verbs. The last category seems to be fairly small, and it is possible to characterize it relatively unambiguously in terms of these verbs normally not occurring without a reflexive pronoun (compare Latin deponent verbs, which can be identified by the absence of attested active-morphology forms in most parts of their paradigm). It looks like currently, we have separately named categories for Category:French pronominal verbs, Category:French reflexive verbs, Category:French reciprocal verbs, although the first contains only one verb. We seem to already treat these verbs differently by including the pronoun "se" in the entry name, at least in the case of se barrer and a number of verbs in Category:French reflexive verbs, such as se passionner, se casser, se pouvoir; what's our policy on this? We have essentially duplicated information about the reflexive sense at passionner and pouvoir but not at casser.--Urszag (talk) 04:58, 6 July 2024 (UTC)Reply
@Urszag But this isn't at all how the label 'pronominal' is used here. It is used facultatively for reflexive senses of verbs (including those that are also used non-reflexively) that are idiomatic, that's all. Benwing2 (talk) 05:13, 6 July 2024 (UTC)Reply
BTW the policy for Spanish and Portuguese is that verbs are lemmatized with the reflexive pronoun only if they don't occur non-reflexively. This is different from the practice with Italian, which strictly separates reflexive and non-reflexive verbs into different lemmas. Benwing2 (talk) 05:14, 6 July 2024 (UTC)Reply
French appears to mostly follow the Spanish and Portuguese practice. Benwing2 (talk) 05:15, 6 July 2024 (UTC)Reply
Yes, I see that the current usage of the labels doesn't follow this or any other clear distinction (there are even some cases like se la péter that have both labels), so a bot replacement seems like it wouldn't remove information. I don't oppose that, but it seems like an opportunity to consider the question of whether it is possible to make a non-subjective distinction between the concept of pronominal and reflexive verbs, and to what extent our entries should mark this or leave them undistinguished. Since labels are sometimes used to add words to categories, that made me think about the categorization of these verbs, but it seems like pronominal doesn't actually even place a verb in Category:French pronominal verbs. Is there any easy way to see which pages use a certain label? I just noticed that pronominal is used not only in the lb template but also in other templates such as Template:indtr.--Urszag (talk) 05:35, 6 July 2024 (UTC)Reply
@Urszag {{indtr}} underlyingly uses the label machinery to handle things like .pronominal and other parameters preceded by a dot. You can see which pages use a given label by visiting Special:WhatLinksHere/Wiktionary:Tracking/labels/label/pronominal or a language-specific subcategory such as Special:WhatLinksHere/Wiktionary:Tracking/labels/label/pronominal/fr for French. Note that I'm in the process of converting all uses of {{indtr}} to a combination of {{lb}} and {{+obj}}, which is why I'm running into this issue. Benwing2 (talk) 05:41, 6 July 2024 (UTC)Reply
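The tracking-page lookup described above can be sketched as a small helper. This is only an illustration: the function name `label_tracking_page` is hypothetical, and the only thing taken from the discussion is the page-title pattern itself (label, optionally followed by a language code).

```python
def label_tracking_page(label, lang=None):
    """Build the WhatLinksHere title for a label, optionally per language.

    Assumption: the title pattern is exactly the one quoted in the thread;
    nothing here queries the wiki.
    """
    base = f"Special:WhatLinksHere/Wiktionary:Tracking/labels/label/{label}"
    # A language code (e.g. "fr") narrows the page to that language's uses.
    return f"{base}/{lang}" if lang else base


if __name__ == "__main__":
    print(label_tracking_page("pronominal"))
    print(label_tracking_page("pronominal", "fr"))
```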
@Urszag I am also finding various examples e.g. Portuguese dedicar where "pronominal" is used even with explicitly reflexive meanings. Benwing2 (talk) 03:13, 9 July 2024 (UTC)Reply
  Support merging. In my experience "reflexive" is simply the term used in English-based learning materials where Romance languages use "pronominal". You could distinguish between the different functions of these verbs that Urszag listed, but I imagine none of the editors adding labels to these verbs have those distinctions in mind. They are most likely just using the terminology of their native language. Ultimateria (talk) 08:10, 9 July 2024 (UTC)Reply

where does Medieval Latin begin?

This came up in a WT:RFDO topic. I'd like to establish clearly where Medieval Latin begins, so we can determine whether categories like CAT:Proto-West Germanic terms borrowed from Medieval Latin are legitimate, or should be emptied by fixing the terms in them to refer to Late Latin, Vulgar Latin or some other variety. AFAIK Medieval Latin begins no earlier than 600 AD; anything prior is Late Latin. User:Theknightwho and User:Nicodene agree with me, but User:Victar claims that Medieval Latin begins in the 4th century AD with Christian writers such as Jerome. What's the consensus here? Benwing2 (talk) 04:44, 7 July 2024 (UTC)

‘Late antiquity extends roughly from 200 to 600, and the grammarians active during this period are often known as the Late Latin grammarians [...] The early Middle Ages (600—800) was characterized by the need to study Latin as a foreign language [...]’ - Mantello & Rigg (1996), Medieval Latin: An introduction and bibliographical guide, page 288.
Nicodene (talk) 05:23, 7 July 2024 (UTC)Reply
I would not expect it to be used of Latin earlier than 500 AD (or 476 if we use historical events as a marker). The Dictionary of Medieval Latin from British Sources apparently focuses on texts from between 540 and 1600.--Urszag (talk) 09:12, 7 July 2024 (UTC)
What Nicodene and Urszag said. From the Early Middle Ages; for me Boethius is a marker, himself Late Latin. The 4th century even in Christian writers is clearly far from Medieval. It is bizarre to view Augustine as Medieval Latin, though the so-called Church Fathers be all at fault for the decline and fall of the Roman Empire. This is the same fallacy as calling the Qurʔān Classical Arabic only because it is the basis of Classical Arabic. Fay Freak (talk) 09:40, 7 July 2024 (UTC)
I don't really have that much of a dog in this, but this is from {{R:ine:EIEC|xxi}}: "[M]edieval Latin, a rather generic designation for Latin of the third century AD and later. (The cutoff date between Latin and medieval Latin follows that of the Oxford Latin Dictionary)". Personally, I also find this early, but 7th century seems quite late. If one used the fall of the Roman Empire, that would be end of the 5th century, and the works of Boethius would be the start of the 6th century. --{{victar|talk}} 05:07, 8 July 2024 (UTC)
Quoting what @Nicodene said in response to this in the other thread: "The Oxford Latin Dictionary set an approximate cut-off of 200 AD for the end of Classical Latin (the date I use as well), not the start of 'Medieval Latin'. I hate to say it, but the authors of the EIEC are simply mistaken." Benwing2 (talk) 05:11, 8 July 2024 (UTC)
Also, to repeat (and add to) what I said: if what Victar is claiming is true, it either leaves no room for Late Latin, or means that we have to start treating Late Latin as a period of Medieval Latin; neither of which makes much sense to me. Theknightwho (talk) 05:27, 8 July 2024 (UTC)
TKW, "Victar is claiming is true"? These aren't my claims, and I was citing Urszag and Fay Freak. Please see their replies above. {{R:itc:EDL|14}}, going off of Weiss, claims Late Latin spans 3rd~4th c. to 5~6th c., leaving Medieval Latin to begin in the 5~6th century. That would allow for ML borrowings into 5th century Frankish/Proto-West Germanic, as well as Proto-Slavic. --{{victar|talk}} 05:34, 8 July 2024 (UTC)Reply
I think that's pushing it. Benwing2 (talk) 05:54, 8 July 2024 (UTC)Reply
And by that you mean to say you think de Vaan's/Weiss' dates are wrong? --{{victar|talk}} 05:56, 8 July 2024 (UTC)Reply
If "Late Latin spans 3rd~4th c. to 5~6th c", then Medieval Latin should start 6th~7th century, not 5~6th century. It's pushing it to infer from Late Latin having a 5th-6th century ending date that Medieval Latin can start as early as c 425 AD. That seems exceedingly unlikely to me. Benwing2 (talk) 05:59, 8 July 2024 (UTC)Reply
"then Medieval Latin should start 6th~7th century": No it wouldn't. See https://pasteboard.co/4tt1HXHqRvUq.png from the EDL. --{{victar|talk}} 06:03, 8 July 2024 (UTC)Reply
I take c 5th/6th century to mean c 500 AD. You can't take it to mean a 200 year range and arbitrarily pick the earliest possible date as the beginning of Medieval Latin. Benwing2 (talk) 06:06, 8 July 2024 (UTC)Reply
In any case, I think you'll have a hard time getting consensus on a date for Medieval Latin before 500 AD at the earliest, and you're kinda tilting at windmills trying to do so. Benwing2 (talk) 06:08, 8 July 2024 (UTC)Reply
That shows Late Latin lasting into the 5th~6th c., as we've been saying.
Also there really is no calling anything prior to 476 (at the earliest) 'medieval' in any sense. Nicodene (talk) 06:12, 8 July 2024 (UTC)Reply
Nicodene, in Benwing's opening statement it is claiming the 7th century as the start of Medieval Latin. I am fine with a 5th~6th century start to ML, which still allows for some very late PWG and SL borrowings. --{{victar|talk}} 06:16, 8 July 2024 (UTC)Reply
As I said, you're trying to impose an artificially early date on Medieval Latin so you can borrow from Medieval Latin into PWG. I don't buy it. PWG is < 500 AD, Medieval Latin is > 500 AD, hence no overlap. Benwing2 (talk) 06:25, 8 July 2024 (UTC)Reply
And you keep trying to impose some finite date, when it's actually porously 5th~6th century. The end of PWG itself too is vague, and probably better also labeled 5th~6th century, as many scholars would call the 6th century Malberg glosses still Frankish.
In the end, it really doesn't matter. If all those entries on CAT:Proto-Slavic terms derived from Medieval Latin and CAT:Proto-West Germanic terms borrowed from Medieval Latin were changed to Vulgar or Late Latin, it would be of little consequence. You came at me hot, though, and so I'm just giving my understanding of the scholarship on the issue. --{{victar|talk}} 07:33, 8 July 2024 (UTC)
Specialists generally place the cutoff in the sixth century AD (beginning, end, or somewhere in between) give or take a few decades. The cutoff is often tied to the death of a scholar, for instance Boethius or even more so Isidore of Seville. The latter lived to see the last gasps of the old Roman order. Nicodene (talk) 08:29, 8 July 2024 (UTC)Reply
I have to agree that anything before 476 AD can't be considered medieval. Also, Wikipedia claims that Etymologiae (c. 625) is Late Latin, so maybe the date should be pushed even later...? Ioaxxere (talk) 13:19, 8 July 2024 (UTC)Reply
The endpoint of "Late Latin" can be put at various places: Wikipedia says some would put it as late as 900 CE. My viewpoint is that if we make use of the term "Medieval Latin", it is best to define it in terms of the same date range commonly recognized for the Medieval period/Middle Ages in historical periodization. While the start of the Middle Ages isn't entirely fixed by convention (The Catholic Encyclopedia suggests you could take 375, 476, or 609) our entry at Middle Ages and Wikipedia both describe it as starting at 500 CE. If we expect this definition of "Medieval" to be the most common prior expectation for our readers, I think it seems strange to cut off several centuries from the category bearing that name. Sure, the division is arbitrary since there is no sharp transition between Late Latin and Medieval writers, but the same applies to Classical and Late Latin, and Medieval and New Latin.--Urszag (talk) 14:08, 8 July 2024 (UTC)Reply
This claim is for his generation. Isidore, over 60 when authoring his work, must have employed the Late Latin language he learnt when a bairn, like some of our seniors appear to relate to 20th-century English, and foreign languages, better than Generation Alpha slang. Idiolects aren’t all updated at the same time, so chronolects have intersections in reality. Fay Freak (talk) 14:13, 8 July 2024 (UTC)Reply
To give a real-world example, Latin plastrum is only attested in Medieval Latin, so the etymology on PWG *plastr is {{bor|gmw-pro|ML.|plastrum}}, supported by {{R:nl:NEW|plaaster}}. What should this be changed to in cases like this, Vulgar Latin? --{{victar|talk}} 20:57, 8 July 2024 (UTC)Reply
"Vulgar Latin" is a problematic term. I would lean towards saying it's not necessary to distinguish different kinds of Latin in the context of categories for borrowings into Proto-West-Germanic, and thus "Proto-West Germanic terms borrowed from Latin‎", "Proto-West Germanic terms borrowed from Medieval Latin", "Proto-West Germanic terms borrowed from Early Medieval Latin‎" and "Category:Proto-West Germanic terms borrowed from Vulgar Latin" might be better as just one category. In that case, it could simply use {{bor|gmw-pro|la|plastrum}}. If more context is desired, another format could be "from {{bor|gmw-pro|la|emplastrum}} via a clipped form {{m|la|plastrum}} (attested in Medieval Latin)."--Urszag (talk) 21:27, 8 July 2024 (UTC)Reply
Latin plastrum needs a label for a conjectured chronolect, starred “Late Latin”. I commented four weeks ago about this missing functionality. Fay Freak (talk) 21:34, 8 July 2024 (UTC)Reply
@Victar @Urszag I agree here with Fay Freak that if we actually believe this term existed in PWG, it needs to be derived from a hypothesized Late Latin term. I should add that the earliest cites listed in Du Cange and DMLBS are c. 1200 AD; not even Early Medieval Latin. @Fay Freak I saw your comment but didn't respond because I wasn't sure (and still am not sure) what you're asking for exactly. Can you give an example? Benwing2 (talk) 22:15, 8 July 2024 (UTC)Reply
@Benwing2 I think Fay Freak was saying exactly the same thing as you, but as a FF-ism. Theknightwho (talk) 22:30, 8 July 2024 (UTC)Reply
@Benwing2: I think the thing is simple, but people are unsure how to fit it in. The claimed dialect or chronolect label can be based on conjecture rather than attestation, so merits a star, or something else, if the idea from my side to put it beside language names rather than word forms is erratic, though it is just a general icon for reconstruction and I wouldn’t know which other sign to invent. By the same reasoning a sense has to be marked as reconstructed when the term is attested in but a part of the distinct meanings. Fay Freak (talk) 22:32, 8 July 2024 (UTC)Reply
Another example is PWG *lubistik, where the etymology lists multiple stages of Latin: Borrowed from Medieval Latin lubisticum, libisticum, from Late Latin levisticum, corrupted from Latin ligusticum. Detailing the different forms of Latin helps to give a sense of chronology, which just using plain Latin doesn't afford. --{{victar|talk}} 22:44, 8 July 2024 (UTC)Reply
I suspect a lot of these terms are wanderwords that didn't exist at the PWG stage. For example, if there was really a PWG plăstr, wouldn't we expect OE plæster not #plaster? Benwing2 (talk) 22:48, 8 July 2024 (UTC)Reply
Old High German pflastar exhibits p > pf, which points to it being borrowed before this change occurred, i.e. in Proto-West Germanic. What happens with a lot of Latin borrowings is that they get later reinforced by Latin individually and even later by Old French. --{{victar|talk}} 23:03, 8 July 2024 (UTC)
Not necessarily; the p -> pf change could have survived as a surface filter for hundreds of years after it first occurred. Benwing2 (talk) 23:45, 8 July 2024 (UTC)Reply
I didn't realize you were a PWG editor. --{{victar|talk}} 00:44, 9 July 2024 (UTC)Reply
Cut the sarcasm. You're not a Latin "editor" either. Benwing2 (talk) 01:42, 9 July 2024 (UTC)Reply
No, however I am an editor who spends a large portion of their time focusing on Latin borrowings into West Germanic. OHG had no problem borrowing p from Latin, with examples like pensil, from Medieval Latin penicillum. --{{victar|talk}} 02:32, 9 July 2024 (UTC)Reply
I know that, but IMO it doesn't prove that much. Spanish borrowed some words from Latin with ie reflecting short ĕ hundreds of years after the initial sound change /ɛ/ -> ie under stress; Italian borrowed some words from Latin with closed /o/ reflecting Latin ŭ late into the Medieval and Early Renaissance period, more than 1,000 years after the corresponding sound change took place. Russian still sometimes makes the substitution /h/ -> г /g/ in borrowings.
In any case we seem to have 5-1 consensus that PWG can't borrow from Medieval Latin. Benwing2 (talk) 03:23, 9 July 2024 (UTC)Reply
"5-1 consensus"? Where do you see that? What's been discussed is the start date of Medieval Latin. If ML begins in 5th~6th century, 5th~6th century PWG can conceivably borrow from it. --{{victar|talk}} 03:42, 9 July 2024 (UTC)Reply
I've sorted out some of the details of plastrum.
Not the first time I've encountered a late borrowing from Romance into Latin that happens to be spelt the same way as we'd spell the 'Vulgar Latin' reconstruction. An even later example is consutura. Nicodene (talk) 03:15, 9 July 2024 (UTC)Reply
Thanks for creating an entry for it. --{{victar|talk}} 03:23, 9 July 2024 (UTC)Reply
@Nicodene, want to create an entry for *buttia, from buttis? --{{victar|talk}} 04:36, 9 July 2024 (UTC)Reply
@Victar Done. Let me know if there are others like this. Nicodene (talk) 22:50, 9 July 2024 (UTC)Reply
Bit late to the party, but I agree with Nicodene's comments in this discussion. The sixth century is the clear transitional period going by scholarship of Late Antiquity, with 600 or perhaps the Etymologiae being a logical cutoff point if we have to draw the line somewhere. — Mnemosientje (t · c) 10:35, 23 July 2024 (UTC)Reply

Is it necessary to draw a distinction between Late and Medieval Latin? What if we merge them as simply ‘Post-classical Latin’ or something? Since different sources can variously call the same etymon LL and ML, a merge might be helpful… Inqilābī 04:14, 9 July 2024 (UTC)Reply

Yes. They represent notably different stages and dialects. Late Latin was still spoken natively, while Medieval Latin was learned as a foreign language and has a lot of weirdnesses in it by comparison. Benwing2 (talk) 04:46, 9 July 2024 (UTC)Reply
The end period of natively spoken Latin overlaps with the early period of the use of Latin as a scholarly, learned Lingua Franca. Therefore, any switch-over date we pick is somewhat artificial and arbitrary. To make it easy, we should then use some century. It will not make a great deal of difference whether we let Medieval Latin “take over” per the 5th, 6th or 7th century. But perhaps we should allow the languages to overlap in time, depending on the evidence of use – whether it was a (possibly reconstructed) term used by the common people or a term used by a scholar or scribe. After all, Late Latin was not truly the ancestor of Medieval Latin in the sense in which Middle English is the ancestor of English.  --Lambiam 22:42, 17 July 2024 (UTC)Reply
There is a label "post-Classical" that I've used in the past (e.g. octoplus, although an IP replaced it with the more specific ML. label). As a label, it would in theory cover all of Late Latin, Medieval Latin and New Latin, which is a pretty broad time range. I can see why we might want to include a few more divisions. I'm not entirely on board with conceptualizing the boundary between "Late Latin" and "Medieval Latin" as a real internal transition rather than a convenient yet more-or-less arbitrary convention of periodization: while some analyses consider the transition between native use and "learned as a foreign language" use of Latin to be significant and something that happened around the start of the "Medieval" era, there isn't consensus either on the nature of that transition or its date, so I don't think our dictionary should commit to the idea that this is an essential criterion dividing Late Latin from Medieval Latin. (I don't think the presence of these labels in itself requires that theoretical commitment, but I wanted to push back a bit against the viewpoint mentioned by Benwing.)--Urszag (talk) 12:55, 9 July 2024 (UTC)

Unchecked proliferation of pages generated due to using {{uder}}

I noticed many editors who aren’t well-acquainted with etymology templates are using {{uder}}. I would recommend displaying a default warning after someone uses this template to deter people from using it. @Benwing2, Chuck Entz Inqilābī 20:28, 7 July 2024 (UTC)

@Inqilābī Could you please give some context as to the kinds of pages you mean, and what the problem is? Theknightwho (talk) 20:34, 7 July 2024 (UTC)Reply
Category:Undefined derivations by language. The previous template was {{etyl}} (which was the oldest etymology template and was non-specific in nature), later replaced by {{uder}} for smoother appearance. This template ought to be substituted with a legitimate etymology template such as {{der}}, {{inh}}, {{bor}}, {{lbor}}, etc. Using {{uder}} generates the aforesaid category and subcategories; even Wingerbot keeps generating these categories as part of its routine category creation. Inqilābī 20:49, 7 July 2024 (UTC)Reply
@Inqilābī My understanding is that the template is supposed to be used if you're not fully sure of the derivation. Ideally every entry would be more specific, but that's not feasible. Theknightwho (talk) 20:53, 7 July 2024 (UTC)Reply
I guess I would rather someone who doesn't know what the right template to use is, uses {{uder}} rather than just {{der}}, which lots of people are in the (unfortunate) habit of doing. Benwing2 (talk) 21:00, 7 July 2024 (UTC)Reply
As someone who does etymology cleanups, {{der}} and {{uder}} are the same thing to me. Using a wrong template isn’t the only issue though; wrong etymons are another problem, commonly found in very old entries. Changing {{bor}} to the more precise {{lbor}} is another maintenance task. Inqilābī 21:10, 7 July 2024 (UTC)
@Inqilābī Well, going forward shall we agree the following?
  1. {{uder}} should be used when you're unsure of the exact derivation, so it categorises the entry into a maintenance category so that it can be replaced with a more specific template if appropriate, or otherwise {{der}} if it's not.
  2. {{der}} should only be used when the derivation does not fit into one of the types of derivation covered by another template.
This seems like a useful distinction to me, at least. Theknightwho (talk) 21:16, 7 July 2024 (UTC)Reply
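As a rough illustration of the two rules proposed above, the choice could be written out like this. The function name, the derivation keys, and the mapping are all hypothetical and only cover the templates named in this thread; it is a sketch of the proposed convention, not of how any template or bot actually works.

```python
# Hypothetical mapping from a known derivation type to its specific template.
SPECIFIC_TEMPLATES = {
    "inherited": "inh",
    "borrowed": "bor",
    "learned borrowing": "lbor",
}


def choose_etymology_template(derivation=None, sure=True):
    """Pick a template name following the two rules proposed in the thread."""
    if not sure:
        # Rule 1: unsure of the exact derivation -> {{uder}}, which puts the
        # entry in a maintenance category for later review.
        return "uder"
    # Rule 2: {{der}} only when no more specific template covers the case.
    return SPECIFIC_TEMPLATES.get(derivation, "der")


if __name__ == "__main__":
    print(choose_etymology_template("borrowed"))
    print(choose_etymology_template(sure=False))
```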
@Theknightwho Yes, that was the original intention of these templates. Benwing2 (talk) 21:17, 7 July 2024 (UTC)Reply
@Inqilābī Please don't do that. If you're not sure of whether to use {{inh}} or {{bor}}, leave it alone. Benwing2 (talk) 21:17, 7 July 2024 (UTC)Reply
As someone who also does etymology cleanups, I would rather people be vague than wrong. With {{uder}}, people who know what they're doing can find these entries to fix them. It's already too easy for someone to just copypaste a derivation template from another language without changing the language codes. I find things like Bengali entries in Category:Pashto terms borrowed from Arabic, Bengali terms categorized as Assamese and vice versa. I even find entries for European languages in Category:Indonesian terms inherited from Malay. Then there are the etymologies where a term is borrowed from a neighboring language, which inherited it from another language, which borrowed it from yet another language, and the etymology uses {{bor}} for that last step, even though there was no direct contact between the first language and the last language. Of course, the people who make those errors would probably get the language codes wrong in {{uder}} if they even knew about it. My point is that unnecessary usage of {{uder}} is pretty minor compared to that kind of nonsense. Chuck Entz (talk) 22:26, 7 July 2024 (UTC)
@Theknightwho: I am pretty sure that is not the case. It’s only a cleanup template / category— and if unsure about the immediate etymon, using {{der}} is sufficient (but a {{rfe}} can be added alongside). It wasn’t created to exist permanently, and will be deprecated eventually. Inqilābī 21:02, 7 July 2024 (UTC)Reply
I'm a little confused, because wouldn't that mean it could have been wholesale replaced by {{der}} without any loss of specificity? I thought the whole point was that it created the maintenance category because the editor suspects a more specific derivation may be possible, whereas {{der}} does not necessarily imply that, meaning it's still useful if people add it. Theknightwho (talk) 21:05, 7 July 2024 (UTC)Reply
@Theknightwho That's right. {{der}} is supposed to be used only when {{inh}} and {{bor}} don't apply. I don't agree with User:Inqilābī that {{der}} is OK if you're not sure of the correct template. Benwing2 (talk) 21:07, 7 July 2024 (UTC)
Firstly, I think having {{der}} is okay in cases where the etymology says ‘ultimately from’. Another editor could subsequently add to the etymology if more data is available. {{rfe}} can always be used, and editors can fix etymologies from that category page. Originally, {{uder}} was created to prevent someone who was converting {{etyl}} to {{der}} wholesale, and not out of the desire to make a category for to-be-fixed etymologies.
But if people want to change the purpose of this template, then I have no objections- I am just stating what I thought was a misuse of the template and categories. Inqilābī 21:21, 7 July 2024 (UTC)Reply
@Inqilābī On a separate-but-related note, please don't change empty categories with {{auto cat}} to {{d}}, because if they stop being empty then they're still no longer part of the category tree, and it means someone else (read: an admin) has to actively undo your change, which is annoying. Empty categories get automatically categorised into Category:Empty categories, and are deleted routinely. However, these categories are maintenance ones that shouldn't be deleted anyway, since they will occasionally see new entries. Theknightwho (talk) 21:23, 7 July 2024 (UTC)Reply
I agree with this; I just did a run deleting empty categories a couple of days ago. Benwing2 (talk) 21:25, 7 July 2024 (UTC)Reply
Okay sorry, I didn’t know I wasn't supposed to do that with the {{uder}} categories, even though I have done this for a long time without anyone objecting. Inqilābī 21:32, 7 July 2024 (UTC)Reply
@Inqilābī Yes, IMO {{der}} is OK in 'ultimately from' cases but from what you said above it sounded like you were doing exactly that wholesale conversion of {{uder}} to {{der}}, which is absolutely wrong. Benwing2 (talk) 21:23, 7 July 2024 (UTC)Reply
Yeah, people don’t clearly understand what I say. This was funny because I am against indiscriminately using {{der}}, and I was the main reason this was halted after I started a BP discussion about the problem some years ago. Inqilābī 21:31, 7 July 2024 (UTC)Reply
My apologies, when you said {{der}} and {{uder}} are the same thing to me it sounded like you were using {{der}} indiscriminately. Benwing2 (talk) 21:35, 7 July 2024 (UTC)Reply
Few considerations
  • Future consideration: It seems like editors who use {{uder}} use it randomly (this includes copying and pasting from elsewhere) and not because they think it is to be used if they are uncertain about the specific etymology. Our default welcome message could also contain a link to a page listing the etymology templates with explanations about each of their purposes, which would help prevent a misuse of any of the templates by new editors. This in turn would rule out the necessity of any maintenance etymology template.
  • Immediate consideration: If empty pages aren’t regularly deleted then {{uder}} cleanup becomes difficult. I don’t want to click on every subcat to check which languages’ cleanups are complete. Hence I would suggest a bot deleting the empty pages periodically (from what I know this has not been done for the category in question), or actually allowing editors to tag the pages for deletion, or even creating a duplicate copy of the category which won’t contain empty subcats (I’m not sure if the last option is feasible). This makes life easier for people going across language entries to substitute this template with better etymology templates.

Inqilābī 12:24, 8 July 2024 (UTC)Reply

@Inqilābī Your second point confuses me: if you're monitoring them from Category:Undefined derivations by language, then you can already see which are empty. The next level up from that is Category:Entry maintenance subcategories by language, which is way too broad to be conducting routine entry maintenance from, since you generally want to pick a single area to focus on at any one time. Theknightwho (talk) 15:20, 8 July 2024 (UTC)Reply
@Theknightwho: Do you mean those (x e) things beside the category names? Wow I never realized until just now that it indicated the number of entries contained. Sorry for wasting everyone’s time!- but also thank you for pointing it out. Inqilābī 15:31, 8 July 2024 (UTC)Reply
@Inqilābī No worries. If it has subcategories, you can also click the little arrow to the left of the name to expand them. Theknightwho (talk) 15:33, 8 July 2024 (UTC)Reply
@Theknightwho: I am aware of those arrows, but for some reason every single arrow in this category actually appears grey, and I’m unable to expand them. Inqilābī 15:38, 8 July 2024 (UTC)Reply
@Inqilābī Yeah, it only shows subcategories - you can't use it to see pages, unfortunately. Theknightwho (talk) 15:39, 8 July 2024 (UTC)Reply
@Benwing2, well I would still urge periodically deleting empty subcats of undefined derivations using your bot, given that limitless new subcats can be created by anyone, while I prefer it be a short list of cleanup category which I don’t have to scroll through to be able to spot the non-empty ones. Thanks for considering! Inqilābī 11:59, 9 July 2024 (UTC)Reply

Voting to ratify the Wikimedia Movement Charter is ending soon

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hello everyone,

This is a kind reminder that the voting period to ratify the Wikimedia Movement Charter will be closed on July 9, 2024, at 23:59 UTC.

If you have not voted yet, please vote on SecurePoll.

On behalf of the Charter Electoral Commission,

RamzyM (WMF) 03:46, 8 July 2024 (UTC)Reply

etydate template

Originally, {{etydate}} displayed text inside square brackets and small text. However, this was removed later on while retaining the dot at the end of the text and the parameter |nodot=. Now, the final dot is not always needed, and editors often use other wording in the same sentence after the template-generated text. So I think it should be consistent with other etymology-line templates like {{doublet}}, {{calque}}, etc. which generate text but no dot, also because it's much less of a hassle to type a dot than |nodot=1. I'd like to know if people agree with or object to such a change. Svartava (talk) 12:15, 9 July 2024 (UTC)

I'd be in   support of this, as probably the biggest user of this template. All instances would need a bot update, but it would be much better imo. Similarly, I've considered removing "the" when a non-number was given, opting to manually have it, but maybe it's better to have it. Vininn126 (talk) 12:17, 9 July 2024 (UTC)Reply
  Support. (Adding the langcode as the first parameter might also be useful as it would help creating lists of terms in a given language by first attestation.) Einstein2 (talk) 12:34, 9 July 2024 (UTC)Reply
There was also talk of having categorization for dates at some point, but no one was able to come up with a concrete system. Vininn126 (talk) 12:57, 9 July 2024 (UTC)Reply
  Support removing the default dot and removing the |nodot= parameter in {{etydate}} because it's easily much less a hassle to type a dot.
Adding the langcode as the first parameter seems like a useful idea as well for creating lists of terms in a given language by first attestation. Kutchkutch (talk) 13:46, 9 July 2024 (UTC)Reply
  Support and I can do the bot changes. Benwing2 (talk) 18:39, 9 July 2024 (UTC)Reply
  Support BABRtalk 19:06, 9 July 2024 (UTC)Reply
  Support - Leasnam (talk) 02:26, 10 July 2024 (UTC)Reply
  Done; the cleanup is currently in progress. @Benwing2: If you can run a bot job adding . at the end of instances of {{etydate}}, here's a list of entries on which cleanup has not been done; the rest have been fixed. Thanks! Svartava (talk) 09:18, 11 July 2024 (UTC)
    @Svartava Can you tell me exactly what steps you took and in what order? In the future it would be better not to do half-cleanups like this, particularly when making a change that isn't idempotent such as adding or removing a period, because it's difficult to figure out how to do the bot changes correctly. It would be better to let me do it completely. Benwing2 (talk) 02:00, 12 July 2024 (UTC)Reply
    @Benwing2: I removed all instances of |nodot=1 and added periods after {{etydate}} on some of the pages where |nodot=1 wasn't used. The list mentioned above covers the entries where cleanup is yet to be done. All of those pages are among the ones that were not using |nodot=1, so on them it's just a matter of adding a period after {{etydate}}. Svartava (talk) 02:51, 12 July 2024 (UTC)
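The idempotency concern raised above can be illustrated with a minimal Python sketch. It is not the bot code that was actually run. It assumes that each {{etydate}} call contains no nested templates and that a period is wanted wherever one does not already follow the template; the function name `add_period_after_etydate` is hypothetical.

```python
import re

# Matches a simple {{etydate|...}} call with no nested templates (an assumption),
# but only when the call is not already followed by a period.
ETYDATE = re.compile(r"(\{\{etydate\|[^{}]*\}\})(?!\.)")


def add_period_after_etydate(wikitext: str) -> str:
    """Append a period after each {{etydate}} call that lacks one."""
    return ETYDATE.sub(r"\1.", wikitext)
```

Because the pattern skips calls that already end in a period, running the function a second time leaves the text unchanged, which is the property a half-finished manual cleanup makes hard to guarantee.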

Removing hiragana transliterations in Japanese

Hello, I propose that I run a bot task to remove the instances where we see hiragana used as part of the transliteration when linking to Japanese, e.g. {{t|ja|窯|tr=かま, kama}} → {{t|ja|窯|tr=kama}}. This is because the hiragana doesn't add anything more than the romanization already offers; the transliteration doesn't help those who can't read Japanese writing anyway; and, in general, I think |tr= should be reserved for Latin writing, as people who only know English can at least always derive something from it. I believe we had (maybe not everyone) agreed that we should only use the romanization in this case for Japanese, but please let me know what you think.

What I would do: 1. Likely make a tracking category for entries that use non-Latin in Japanese translations in {{t}}, as I believe the majority of uses of this are in translation sections; 2. Iterate over all translations, and for each one: 3. If the transliteration of the hiragana equals the romanization, simply remove the hiragana; else, save it for our review, in case there are any mismatches in transliteration out there.

If there's a more refined way to access any instances of the link module using non-Roman transliterations, that might also be a better substitute for step 1, but I don't know if that exists. Kiril kovachev (talkcontribs) 21:35, 9 July 2024 (UTC)Reply
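Steps 2 and 3 of the plan above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the kana table is deliberately tiny (plain syllables, no long vowels, small kana, sokuon or ん), the template is assumed to have exactly the shape {{t|ja|term|tr=kana, romaji}}, and the names `romanize` and `strip_hiragana` are hypothetical. Anything the table cannot handle is left untouched and queued for human review.

```python
import re

# Tiny illustrative kana table (an assumption): anything outside it goes to review.
KANA = {
    "あ": "a", "い": "i", "う": "u", "え": "e", "お": "o",
    "か": "ka", "き": "ki", "く": "ku", "け": "ke", "こ": "ko",
    "さ": "sa", "し": "shi", "す": "su", "せ": "se", "そ": "so",
    "た": "ta", "ち": "chi", "つ": "tsu", "て": "te", "と": "to",
    "な": "na", "に": "ni", "ぬ": "nu", "ね": "ne", "の": "no",
    "は": "ha", "ひ": "hi", "ふ": "fu", "へ": "he", "ほ": "ho",
    "ま": "ma", "み": "mi", "む": "mu", "め": "me", "も": "mo",
    "や": "ya", "ゆ": "yu", "よ": "yo",
    "ら": "ra", "り": "ri", "る": "ru", "れ": "re", "ろ": "ro",
    "わ": "wa",
}

# Matches {{t|ja|term|tr=...}} or {{t+|ja|term|tr=...}} with no other parameters.
T_JA = re.compile(r"\{\{(t\+?)\|ja\|([^|{}]+)\|tr=([^|{}]+)\}\}")


def romanize(kana):
    """Return a romanization, or None if any character is outside the table."""
    parts = [KANA.get(ch) for ch in kana]
    return None if None in parts else "".join(parts)


def strip_hiragana(wikitext):
    """Return (new_wikitext, review), where review lists templates left unchanged."""
    review = []

    def fix(match):
        name, term, tr = match.groups()
        if ", " not in tr:
            return match.group(0)  # only one transliteration given: nothing to do
        kana, romaji = tr.split(", ", 1)
        if romanize(kana) == romaji:
            return "{{%s|ja|%s|tr=%s}}" % (name, term, romaji)
        review.append(match.group(0))  # mismatch or unsupported kana: keep as is
        return match.group(0)

    return T_JA.sub(fix, wikitext), review
```

With this sketch, the example from the proposal is rewritten automatically, while a reading with long vowels or small kana falls outside the table and ends up in the review list instead of being changed.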

@Kiril kovachev Yes, I agree. I don't mind/quite like having hiragana displayed as rubytext (or we could even do Chinese-style and have it displayed after a slash), but having it in transliterations like this is generally pretty crap. Theknightwho (talk) 21:15, 10 July 2024 (UTC)Reply
@Theknightwho Do you think we should do either of these ideas, instead of outright removing it? Kiril kovachev (talkcontribs) 21:44, 10 July 2024 (UTC)Reply
@Kiril kovachev I think for now it's best to remove them, since any conversion to rubytext would need to be done manually, and a lot of them are quite sloppy.
We probably want to have a proper discussion about displaying multiple forms (i.e. rubytext), as it might make more sense to use the Chinese style in translation sections (where space is tight), as compared to other places. I'm still keen for us to display kana forms, though, since having to work backwards from transliterations is annoying. Theknightwho (talk) 21:57, 10 July 2024 (UTC)Reply
@Theknightwho Alright, I'll focus on removing them now, then, if that's what we want. I'm asking because if we eventually wish to convert the translations to give kana inline anyway, getting rid of it now would just make it harder for us later, no? Kiril kovachev (talkcontribs) 22:13, 10 July 2024 (UTC)Reply
@Kiril kovachev Hopefully it should be possible to automatically scrape kana in some cases, which should mitigate this. However, I don't think it's too much of a problem if we remove it now, since it would all need to be converted properly by hand anyway. Theknightwho (talk) 22:20, 10 July 2024 (UTC)Reply
Okay, gotcha. I'll figure it out one of these days hopefully then. Kiril kovachev (talkcontribs) 22:23, 10 July 2024 (UTC)Reply
Agreed also. Benwing2 (talk) 21:32, 10 July 2024 (UTC)Reply
@Kiril kovachev:
  Oppose simply deleting these hiragana readings. In Hepburn romanization (which is standard in the English Wiktionary), the long vowel in any intra-morpheme combination of an お段 (odan) kana + う (u) or お (o) is transcribed ō without distinction, じ and ぢ are both transcribed ji, and ず and づ are both transcribed zu, so it is untrue that the hiragana adds nothing, since one can't in all cases infer the hiragana from the Romanisation (the converse is also true, since kana do not distinguish case, whereas the Latin script does). It is also very common for learners of Japanese to be proficient in kana but be unfamiliar with many kanji (I am just such a learner). I would propose converting these hiragana readings to furigana, but as Theknightwho already noted, the smaller text size and lack of space in translation tables militate against this. I'm not thrilled about slashed translations à la Chinese traditional/simplified spellings and would prefer the kana remain in parentheses alongside the rōmaji, but it would be a tolerable solution.
0DF (talk) 22:54, 10 July 2024 (UTC)Reply
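The point that Hepburn is not one-to-one can be shown with a short Python sketch. The four-entry table covers only the kana pairs じ/ぢ and ず/づ, which Hepburn writes as ji and zu respectively; the function name `ambiguous_romaji` is hypothetical.

```python
from collections import defaultdict

# Hepburn readings for four kana; じ/ぢ and ず/づ collapse to the same spelling.
HEPBURN = {"じ": "ji", "ぢ": "ji", "ず": "zu", "づ": "zu"}


def ambiguous_romaji(table):
    """Group kana by romanization and keep only spellings shared by several kana."""
    groups = defaultdict(list)
    for kana, romaji in table.items():
        groups[romaji].append(kana)
    return {romaji: kana for romaji, kana in groups.items() if len(kana) > 1}
```

Any romanization that appears in the result cannot be converted back to kana without consulting the entry itself.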
@0DF Well, I've personally argued for having a 1-to-1 transliteration system in the past, but that doesn't seem to be overly popular, so for now you're right that it does add minor distinctions, which I chose to ignore because I didn't believe them to be overly significant. The reason is because you can just click on the link to see the kana if you wanted to see the original; after all, if you wanted to see the historical spelling, which is even more distant from the current pronunciation, you'd again have to do that too. The differences in kana and romaji don't affect the way a reader would pronounce the word, as far as I'm aware, which is in the first place why the romanization is identical for all those syllables.
But, I get that there is a difference, and that you won't be able to spell the word in kana correctly if all you have is the Hepburn. Fair enough. Maybe we can abstain from removing it for now.
I also have a few suggestions for what we can do with the kana, though, to keep it a bit smaller: as some dictionaries do, we can have the furigana given as a subscript on the kanji. Or as brackets after each kanji, but that's less readable IMO. Kiril kovachev (talkcontribs) 23:06, 10 July 2024 (UTC)Reply
@0DF I completely disagree with this. You are arguing based on a small number of edge cases, which can easily be determined for those small numbers of people who care, by looking at the lemma page. Benwing2 (talk) 23:39, 10 July 2024 (UTC)Reply
@Benwing2 Well, you could make the same argument for removing simplified Chinese, since - like kana - it can't be readily determined in only a quite small proportion of cases. Theknightwho (talk) 23:46, 10 July 2024 (UTC)Reply
@Theknightwho But the difference is that simplified Chinese *IS* the normal way of writing these lexemes for 95%+ of native speakers, which doesn't apply to kana in the case of words normally written with kanji. Benwing2 (talk) 23:58, 10 July 2024 (UTC)Reply
@Benwing2 It's trivial to find examples of words which are in free variation between the two. It's not safe to just assume the kana forms aren't used. Theknightwho (talk) 00:03, 11 July 2024 (UTC)Reply
@Theknightwho I'm not sure why you are arguing. Did you change your mind about removing kana from transliterations or are you just playing devil's advocate? Benwing2 (talk) 00:08, 11 July 2024 (UTC)Reply
@Benwing2 I think we should remove the ones given in manual transliterations, but I'm ultimately in favour of having kana displayed somehow. Theknightwho (talk) 00:12, 11 July 2024 (UTC)Reply
@Theknightwho I see. Personally I think furigana is enough. Benwing2 (talk) 01:04, 11 July 2024 (UTC)Reply

U4C Special Election - Call for Candidates

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hello all,

A special election has been called to fill additional vacancies on the U4C. The call for candidates phase is open from now through July 19, 2024.

The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. Community members are invited to submit their applications in the special election for the U4C. For more information and the responsibilities of the U4C, please review the U4C Charter.

In this special election, according to chapter 2 of the U4C charter, there are 9 seats available on the U4C: four community-at-large seats and five regional seats to ensure the U4C represents the diversity of the movement. No more than two members of the U4C can be elected from the same home wiki. Therefore, candidates must not have English Wikipedia, German Wikipedia, or Italian Wikipedia as their home wiki.

Read more and submit your application on Meta-wiki.

In cooperation with the U4C,

-- Keegan (WMF) (talk) 00:03, 10 July 2024 (UTC)Reply

MediaWiki:Gadget-SpecialSearch

Should we get rid of this old gadget (currently on by default)? It creates three new buttons at Special:Search (Google, Bing, Yahoo) which are meant to let you search Wiktionary with an alternative search engine, although the gadget is not working properly at the moment. @This, that and the other mentioned that the gadget might have been created at a point in time in which the built-in MediaWiki search was more "primitive". I think it is no longer necessary, but maybe some people would still like to be able to use it. Ioaxxere (talk) 04:21, 10 July 2024 (UTC)Reply

I don't care either way but if other people don't want to keep & fix it then let's just get rid of it.

I will add, though, you are able to link to google using [[google:]] (e.g. google:Wiktionary), and if you type google: in the search bar you'll be redirected to google as well. So if the goal is to search something on wiktionary and pull results on google, I assume a fix is possible but I'm not sure how helpful that would be. OTOH if the goal is to pull up google results on Wiktionary, I'm still not sure it would be helpful. Unless it could count the total number of results (since google removed that feature), but even that feels like a stretch. — BABRtalk 04:35, 10 July 2024 (UTC)

Looks like nobody cares that much. I'll take the gadget away and we'll see if anyone complains. This, that and the other (talk) 03:25, 19 July 2024 (UTC)Reply

I don't care about the issue, but 9 days in Summer seems like a very short time to expose any issue to objections. DCDuring (talk) 11:39, 19 July 2024 (UTC)Reply
Someone can still voice an objection after it is removed. I don't think there's any harm in removing the gadget now, and think it's a good way to see if the gadget being gone actually causes issues for anyone. Plus, FWIW, the gadget is not even functional right now and no one has voiced support for fixing it, so there's really no difference between it remaining installed or not. — BABRtalk 19:30, 19 July 2024 (UTC)

Mahagaja changing references to {{reflist|size=smaller}}

This was brought up before (link?), but User:Mahagaja has a habit of going around changing the references section on pages from <references /> to {{reflist|size=smaller}}. Assuming this is out of a visual preference, they can simply edit their private common.css to accommodate their personal taste. If we, as a project, wanted this font size as the default, that change would have been made in the backend. --{{victar|talk}} 18:28, 10 July 2024 (UTC)Reply

I don't remember anyone bringing this up before, but if we're not supposed to be allowed to do this, we shouldn't have |size= in {{reflist}}, or maybe we shouldn't have {{reflist}} at all. —Mahāgaja · talk 18:31, 10 July 2024 (UTC)Reply
What a bizarre reason to remove a feature (one can also always use <small><references /></small>), but I find it useful for notes on entries and references within discussions. Also, {{reflist}} is the only way you can have a reference inside a references list, which is helpful for notes. --{{victar|talk}} 18:47, 10 July 2024 (UTC)
I wasn't actually advocating removing either {{reflist}} or |size=; I was pointing out the oddity of providing a template that has a certain function, and then complaining when users use it. —Mahāgaja · talk 08:39, 11 July 2024 (UTC)Reply
And it has its uses here and there, but changing all instances in the references section is another thing entirely. Is this just for aesthetic reasons? You can add ol.references { font-size: smaller; } to your common.css file which will accomplish the same. --{{victar|talk}} 03:15, 12 July 2024 (UTC)Reply
That would make them smaller only for me, not for everyone. There's a reason that books have for centuries been printing footnotes at the bottom of the page in a smaller font size than the regular text: you don't want less important information like references to be written as large as the more important information. The difference between main text and footnotes is clearer to the reader when there's a size difference. —Mahāgaja · talk 12:44, 13 July 2024 (UTC)Reply
@Mahagaja: if your intention is to apply {{reflist|size=smaller}} to all entries, this is a matter that should be discussed first. — Sgconlaw (talk) 13:36, 13 July 2024 (UTC)Reply
Well, I never edit entries just to apply {{reflist|size=smaller}}, but if I'm editing, for some other reason, an entry that uses <references/>, I often change that too. Never in a billion years would it have occurred to me that anyone would be annoyed by that, but if they are, I'll stop changing it in existing entries. But I will keep using {{reflist|size=smaller}} in entries I create or entries I'm adding references to for the first time. —Mahāgaja · talk 14:06, 13 July 2024 (UTC)Reply
@Mahagaja: but that's exactly what should be discussed. It makes certain entries look different from others—I'm not sure that's a good idea. — Sgconlaw (talk) 14:09, 13 July 2024 (UTC)Reply
We're a wiki with hundreds of editors. It's inevitable that some entries look different from others, and that's never going to change. And you know how discussions of this type end up: lots of people express lots of different opinions, much more heat than light gets generated, eventually the conversation fizzles out without anything being resolved, and everyone goes back to doing things exactly the way they always have. —Mahāgaja · talk 14:20, 13 July 2024 (UTC)Reply

Language of surnames

Someone has had a lot of fun adding tons of Polish surnames as English entries, rendering Category:English terms borrowed from Polish practically unusable in the process.

Five years after a failed vote (Wiktionary:Votes/pl-2019-11/CFI policy for foreign given names and surnames), couldn't we have another go at devising a policy about this? PUC 18:52, 10 July 2024 (UTC)

The phenomenon of proper nouns "drowning out" regular words in categories is not unique to this situation. E.g. it's a little bit of work to spot the common nouns in Category:Latin feminine nouns in the second declension and Category:Latin masculine nouns in the first declension given all of the borrowed names in these categories. Does this mean it would be useful to have categories for these that exclude proper names? I once thought so, but I think I've seen some argument about how intersectional categories aren't necessary because there's supposed to be some way to generate them yourself—not that I remember how to do it. Since Category:English terms borrowed from Polish, Category:English proper nouns, and Category:English surnames all exist, in theory there's all the information needed to calculate the difference of these sets.--Urszag (talk) 19:23, 10 July 2024 (UTC)Reply
@Urszag I don't know how easy it is to do set differences. Maybe @Chuck Entz or @DCDuring or someone else who knows the search system would know. But one way to deal with your specific issue is to categorize proper nouns differently from (common) nouns in the above categories.
@PUC I completely agree we need a criterion preventing people from arbitrarily adding surname X as term in language Y. I remember this happening various times, leading to mass RFD's that haven't been resolved consistently. In inflected languages like Russian and Polish it's useful to know how to decline certain foreign names, but not arbitrary ones; an appendix would be sufficient for that. Benwing2 (talk) 23:36, 10 July 2024 (UTC)Reply
It's easy to do searches in the searchbox: 'incategory:"English terms borrowed from Polish" -incategory:"English proper nouns"' (There are 69; note the "-".). Using categories and templates (hastemplate:"template name") in combination is quick. Adding individual words or phrases to shorten the result is also quick. If you do these things, adding regex searches for very specific targetting (eg, rare typos) isn't much of a performance hit. See Help:CirrusSearch (at MediaWiki). DCDuring (talk) 01:54, 11 July 2024 (UTC)Reply
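The same category difference can also be requested from the MediaWiki search API rather than typed into the search box. The following Python sketch only builds the request URL and does not contact the server; the helper name `category_difference_url` and the default limit of 50 are assumptions made for illustration.

```python
from urllib.parse import urlencode

API = "https://en.wiktionary.org/w/api.php"


def category_difference_url(include, exclude, limit=50):
    """Build a search-API URL for pages in `include` but not in `exclude`."""
    # Same CirrusSearch syntax as in the search box: the leading "-" negates a term.
    srsearch = f'incategory:"{include}" -incategory:"{exclude}"'
    params = {
        "action": "query",
        "list": "search",
        "srsearch": srsearch,
        "srlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"
```

For example, `category_difference_url("English terms borrowed from Polish", "English proper nouns")` yields a URL whose JSON response lists the matching pages, so the result can be processed further by a script.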
We should make a subcat called "LangX proper nouns borrowed from LangY". CitationsFreak (talk) 06:11, 11 July 2024 (UTC)Reply

Pintupi-Luritja

Currently, Pintupi-Luritja does not have a script code assigned (just None), meaning that the one translation we have at peace comes up in CAT:Pintupi-Luritja terms in nonstandard scripts. For Pitjantjatjara, we have a special encoding pjt-Latn at MediaWiki:Gadget-LanguagesAndScripts.css which uses the same special characters (in addition to Pintupi-Luritja and Pitjantjatjara both being classified as part of the Western Desert dialect cluster). I personally do not have the ability to do any of these things, because I lack the rights, but I propose that:

  1. a (etymology-only? I don't know what the right handling is here) language code be created for the Western Desert (see w:Western Desert language for what would be included) cluster, or at least a family code,
  2. pjt-Latn be renamed to follow that code,
  3. and all Western Desert dialects be changed to use the special encoding.

Minor Pama-Nyungan languages are currently severely neglected and neither I nor anyone else can give them any attention in this state. Pinging @Soap who I discussed this with on the Discord. -saph668 (usertalkcontribs) 20:56, 10 July 2024 (UTC)Reply

I'm keen for us to not add more custom script codes, as these were essentially inherited from a time when we used different templates for every script. What we need to do is sort this out via CSS. For now, I've added the Latin script, so CAT:Pintupi-Luritja terms in nonstandard scripts is now empty. Theknightwho (talk) 22:02, 10 July 2024 (UTC)Reply
What would sorting it out via CSS entail? -saph668 (usertalkcontribs) 22:05, 10 July 2024 (UTC)Reply
@Saph668 There should be a way to specify it based on what's in the lang= tag, though @This, that and the other may know more. Theknightwho (talk) 22:21, 10 July 2024 (UTC)Reply
I'll look into this again.
I do agree with what @Saph668 says about Pama-Nyungan languages. There doesn't appear to have been any attempt to group them into even the most obvious and uncontroversial subfamilies (Western Desert, Kulin, ...) This, that and the other (talk) 06:17, 12 July 2024 (UTC)Reply
@Theknightwho see the last discussion at Wiktionary:Beer_parlour/2024/January#Deprecating_pjt-Latn. The issue is specifically with page titles and would need some Lua coding to fix. This, that and the other (talk) 06:19, 12 July 2024 (UTC)Reply
I thought the idea was that it needs a special font so that the underscores appear properly. See on the peace page, under "Translations to be checked", how Pintupi still appears with a normal font, but the closely related Pitjantjatjara language (which coincidentally is next to it alphabetically) uses a font that looks slightly more bunched together. I was only guessing, but my intuition told me that the reason we do this is because these languages are among the few that use letters with underscores as part of their alphabet, and that these might not render properly on some fonts, especially the . But I could be wrong. Soap 08:14, 11 July 2024 (UTC)

Are taxonomic names Latin or Translingual?

Following an RFV discussion, there has been some further discussion between me and @Benwing2 on how we should treat taxonomic names. So I think it is best to take this discussion to BP. The questions that stand to be resolved are:

  1. For specific epithets that are only attested in taxonomic names and not in the Latin literature, should we categorize them as Latin or as Translingual?
  2. If we do categorize them as Translingual, how should we deal with their inflections?

According to @Urszag on the linked RFV discussion, "other editors agree in the past, a taxonomic name by itself doesn't count as a usage of a word in the Latin language", but that isn't the same as a formal discussion, so here I am. Here are my concerns:

  • abbotti is currently a Translingual adjective that has no gender specified, but lycioides is currently m or f or n, even though they function identically in the context of being a specific epithet in Translingual. (This is because in Latin, abbotti is the genitive form of a noun, and lycioides is an adjective.)
  • We're kind of implying that Translingual is a gendered language...

Here are Ben Wing's concerns, to the best of my knowledge (I apologize if I have misrepresented him in any way and I am open to be corrected):

  • actinocarpus is also currently a Translingual adjective, and it is gendered as m, and the feminine and neuter forms are provided. Further, actinocarpa is currently marked as the "feminine" form of actinocarpus, whereas in Latin it would be instead "nominative feminine singular". He thinks that partially borrowing the inflectional structure from Latin is problematic, because "Translingual doesn't have any grammatical rules"; but then it also "seems wrong" to have to "make three lemmas for the masc/fem/neut varieties".
  • The whole taxonomic naming system is really "a restricted sort of Latin" and so they should be classified as Latin in the first place.

(Also pinging @Chuck Entz, Trooper57, DCDuring.)

--kc_kennylau (talk) 21:44, 10 July 2024 (UTC)Reply

After some discussion with @Nicodene it has been brought to light that in the (pre-)modern scientific Latin literature, the species names are declined as normal as in Latin (see noctula where the species name Vespertilio murinus is declined in the ablative as Vespertilione murino). This has changed my opinions a bit, and I now think that it would be reasonable to categorize them under Latin. --kc_kennylau (talk) 22:43, 10 July 2024 (UTC)Reply
  • When I address such things, I normally leave the L2 header alone, not because I don't have beliefs and preferences about them, but because there has been no codification, and not much interest in codifying, how such matters are addressed.
Specifically, Translingual does have simple rules about gender agreement, not unlike those of Latin, that are enforced by the Code authorities.
As to the inflection line for Translingual adjectives (ie, those we have not found in any vintage of Latin), almost all of which imitate Latin in form, it seems a good presumption that all three genders potentially occur, as all three genders are represented among genus names.
I would not be surprised to find that specific epithets that are homonyms of Latin adjectives are used with an apparent definition that differs in some way from the Latin.
The guardians of our Latin entries have not even allowed legal or medical Latin, or modern Latinate inscriptions or mottos (either on the grounds that they are SoP or that they are not properly formed, ie, not SoP) to sully the Latin categories. There is not even much enthusiasm (ie, citation effort) to include modern Church Latin, despite use in running text.
In principle, the same kind of problem can occur with genus names and perhaps names at other ranks. For example, Atlas is a synonym of Dicronorhina, Gaea of Euhagena, Zeus is a genus of ray-finned fish. We treat such terms as both Translingual and Latin (also English), albeit with different definitions.
Finally, in the past century (or longer) many specific epithets have been derived from poorly documented languages that were spoken near where specimens were found. Often there is no Latinate ending grafted on, so the epithet is invariant, its PoS is not obvious, and it is treated as an adjective.
I have no proposal to make and wonder what, if anything, is now being proposed. DCDuring (talk) 22:47, 10 July 2024 (UTC)Reply
Why not remove the genders? Then it can be Translingual and no assumption must be made.
On the other hand, I think specific epithets (or any part of the name) that are obviously construed as being Latin should be given a Latin section in addition to their Translingual section, where you can specify the declension. But if that's no good, then I also see no problem with simply saying that Translingual "can" be a gendered, and declined, language. It's not one language after all, it's any terms that aren't specific to one language, so if some of the vocabulary used translingually is declinable or gendered, is there really a problem? Kiril kovachev (talkcontribs) 22:47, 10 July 2024 (UTC)
I tend to agree with DCDuring and Kiril here, with one exception. I don't think the occasional use of taxonomic names in Latin literature is sufficient reason to treat all these terms as Latin. We have special rules for Translingual taxonomic names and templates for specific epithets that denote the gender, which is essential imo and cannot be omitted - this is where I differ from Kiril.
The only thorny patch arises due to the fact that Latin is an LDL, so even a single attestation of a declined taxonomic name (like Vespertilione murino) in a Latin text would give us license to add a Latin entry for the genus and the specific epithet. Perhaps time to revisit the idea of treating post-1500 Latin as a WDL. (Although in this case it's a moot point, as vespertilio and murinus both date back to classical Latin.) This, that and the other (talk) 23:14, 10 July 2024 (UTC)Reply
I have been following the criterion that any term that has any use at all in Latin text passes RFV (in accordance with Latin's classification as a LDL). So I haven't attempted to convert any term to Translingual when there are concrete attestations like "Secretum in Vespertilione murino et V. noctula foetidum". This, that and the other, it sounds like you would be in favor of a stricter criterion? Whereas kc_kennylau, it sounds like you are saying that because some species names are attested like this, we should include a Latin entry for any species name, even if zero attestations can be found for that specific term in running Latin text? I'm not seeing a consensus yet.
Even though I wouldn't agree with the viewpoint that binomial nomenclature is a form of Latin, it is of course undeniable that this naming system follows some conventions that are derived from Latin grammar. One of these is agreement in gender (masculine, feminine, or neuter) for adjectival epithets. So I don't think it's adequate to simply have an entry for the form "actinocarpus" with no indication that its feminine form is "actinocarpa" and its neuter form is "actinocarpum". Nor does it make sense to have these as separate, disconnected entries. I don't see it as problematic to include this in a Translingual entry, as Translingual is not itself a language that can have or lack grammatical rules as a whole: different entries in Translingual may belong to their own subsystems of communication that follow their own particular rules.--Urszag (talk) 01:33, 11 July 2024 (UTC)Reply

So, it seems like all the respondents so far (@DCDuring, Kiril kovachev, This, that and the other, Urszag) would agree with (or at least accept) a policy like this:

  1. If a specific epithet is found in the Latin literature, then it is to be formatted as ==Latin==, with all the gender and inflection information included (e.g. noctula).
  2. Otherwise, it is to be formatted as ==Translingual==, where there are at most three forms (masculine, feminine, neuter), and no inflection (e.g. actinocarpus).

(For the second point, it should be mentioned that the official guidelines do specify that the specific epithets are gendered and need to agree with the gender of the genus, when they originate as a Latin adjective.)

In addition, I would like to propose the following points:

  1. The Translingual genus names (e.g. Abroma) should be included in the gendered categories (e.g. Category:Translingual neuter nouns).
    (When {{taxoninfl}} was converted to use {{head}} by Special:Diff/59825365/72996213, |nogendercat=1 was specified because the original code did not categorise according to gender. I'm not sure if that was a deliberate decision, or they just forgot to categorize.)
  2. The various other relations between the various nouns and adjectives should be suffixes instead of inflections. For example, abdimii has the suffix -ii that forms specific epithets (even though it comes from the Latin genitive), and Acanthodii has the suffix -ii that forms classes (even though it comes from the Latin plural).
  3. We implicitly agree that these words are not Latin and thus do not have vowel length.
A question remains to be resolved, that I have mentioned above: abbotti originates as a genitive, and lycioides originates as an adjective whose three genders are the same (also see Point 5). When they function as specific epithets, they behave the same in all regards. Should we treat them the same? i.e. should they either both be m or f or n or both have no gender specified?

(All three approaches have potential problems:

  • If abbotti has no gender specified and lycioides is m or f or n, then it's inconsistent descriptively (as specific epithets). (I think this is the original intent of the official guideline, but the guideline itself kind of retains (reasonably so) some features of Latin.)
  • If abbotti becomes m or f or n, the Latinists might not like this. (In my opinion this seems to be the best solution, and if the Latinists don't want to accept neo-Latin then I guess they also cannot decide how Translingual grammar works.)
  • If lycioides becomes no gender specified, then this is kind of inconsistent to our previous ruling that specific epithets agree in gender with the genus.)

--kc_kennylau (talk) 11:46, 11 July 2024 (UTC)Reply

@Kc kennylau I agree with all those points! As for the final point, I principally want to say abbotti should have no gender, whereas lycioides have m or f or n, but you make a good point that, in as much as they aren't Latin, those two ought to behave virtually the same, i.e. be usable after any gender of genus. I had prepared a long argument about why I would propose what I said originally, but now I think we would be best to label them the same, probably with all three genders; as, in both cases, we aren't labelling the inherent gender of the adjective, but what genders of genus it can agree with, which in both cases is all of them. I guess for two- or three-termination adjectives, if those are used in taxonomy, I don't know, they would have one or two genders at the "base" form and links to feminine/neuter versions? Kiril kovachev (talkcontribs) 13:29, 11 July 2024 (UTC)Reply
(barbadensis and cervicornis would be examples. The latter might be a bit problematic.) --kc_kennylau (talk) 13:38, 11 July 2024 (UTC)Reply
Yes. A combination of both. The two main taxonomic codes explicitly state that the names are in Latin and Latinized Ancient Greek, but only a very restricted subset. Basically, all taxonomic writing used to be in Latin, then most of the writing was replaced by the vernacular except that a diagnosis describing a new name in Latin had to be provided, and finally that faded out, just leaving the names themselves. One could make the argument that taxonomic names really are Latin, but the extremely narrow context in which they're used doesn't allow for us to see verbs (except participles), prepositions, or accusative, dative, locative or vocative nominal/adjectival inflections.

As for the names themselves: they all theoretically have gender, but above the rank of genus it's usually impossible to know what it is. The names of genera are nouns in the nominative singular, but they have gender. Species, subspecies, varieties, etc. modify the name of the genus either as:

  1. An adjective in the nominative that agrees with the generic name in gender and number
  2. A noun in the genitive that agrees in gender and number with the referent (so abbottii isn't an adjective)

or as:

  1. A noun in apposition that only agrees with itself.

The genitive can be used for species named after someone or something, or it can be used to refer to some association with the referent, as in Sempervivum tectorum, which has historically been found growing on roofs, or parasitic species named after their host. The last case is the only way to determine the gender of a name above the rank of genus, since a species that parasitizes members of a taxonomic group would have a name that agrees in gender and number with the name of that group.

Thus a species in the genitive named after Mr. Smith would be smithi or smithii, one named after Ms. Smith would be smithae, after the Smith sisters would be smitharum and after Mr. Smith and at least one other Smith would be smithorum.

There's a lot more I could say, but I don't have time right now. Chuck Entz (talk) 15:07, 11 July 2024 (UTC)Reply
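As an illustration only, the four genitive endings listed above for the Smiths can be sketched as a small lookup. The function name and the honoree labels are hypothetical, invented for this example, and the sketch produces only the plain -i form, not the smithii variant.

```javascript
// Hypothetical helper, not an existing Wiktionary module or gadget.
// Maps who a species is named after to the genitive ending described above.
const GENITIVE_ENDINGS = new Map([
  ["man", "i"],           // Mr. Smith -> smithi
  ["woman", "ae"],        // Ms. Smith -> smithae
  ["women", "arum"],      // the Smith sisters -> smitharum
  ["menOrMixed", "orum"], // Mr. Smith and at least one other Smith -> smithorum
]);

function genitiveEpithet(surname, honoree) {
  const ending = GENITIVE_ENDINGS.get(honoree);
  if (ending === undefined) {
    throw new Error("Unknown honoree type: " + honoree);
  }
  // Specific epithets are written in lower case.
  return surname.toLowerCase() + ending;
}

console.log(genitiveEpithet("Smith", "man"));
console.log(genitiveEpithet("Smith", "menOrMixed"));
```

A lookup table is used rather than any attempt at Latin declension, which matches the point that these endings behave like a small closed set of suffixes.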

Just wanted to add that the "noun in apposition" is the rule for things that are not Latin. For example: piranga, from Old Tupi, is present in some scientific names (Issoca piranga, Pyrianoreina piranga) and remains the same regardless of whether the genus is masculine, feminine or neuter. Some authors can still choose to latinize them, Aulonastus pirangus and Sternostoma pirangae do exist, but just piranga is still valid. Trooper57 (talk) 15:33, 11 July 2024 (UTC)Reply
@Chuck Entz: I understand that we are following Latin rules, and that the guidelines intend the names to be Latin; but that does not change the descriptive reality of how they are currently used in the scientific community (I would suppose that most biologists don't know Latin), nor should this impact our policy making decisions. This is also apparent in the fact that 31.1.2 of this document had to spell out the various genitive endings, just as you did for the Smiths. In Point 4 of my proposals, these endings (-i, -orum, -ae, -arum) would be classified as Translingual suffixes for the purpose of the English Wiktionary.
Also, after reading your reply, I'm not really sure what your opinion is, regarding this matter.
@Trooper57: The apposition rule can also apply for Latin names, with an example given in the document being Cephenemyia (f) phobifer (m), or Acrochordonichthys (m) ischnosoma (n).
--kc_kennylau (talk) 16:32, 11 July 2024 (UTC)Reply
I didn't think to check it before now, but it seems that the essay Wiktionary:Taxonomic names has covered some of the topics discussed here.--Urszag (talk) 21:42, 11 July 2024 (UTC)Reply
Regarding "As for the names themselves: they all theoretically have gender, but above the rank of genus it's usually impossible to know what it is.", I note that LPSN lists genders above genus level. Examples: phylum Acidobacteriota as "neuter" and family Zavarziniaceae as "feminine".
Although re-reading your remarks, perhaps you intended a contrary interpretation of "above the rank of genus"?
—DIV (1.129.106.197 08:02, 22 July 2024 (UTC))Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── I suppose there is some merit in making a distinction between noun and adjective for specific epithets even when there is no descriptive difference, in that it is a nice balance between the Latinists and pragmatism. If we do so, then abdimii can be a noun with no gender specified (specific epithets that are nouns in the Latinist sense do not need to agree in gender with the genus, so there is no need to specify the gender thereof). Would people agree with this approach? --kc_kennylau (talk) 22:39, 11 July 2024 (UTC)Reply

I'd say it's at least as accurate (if not more so) to call them nouns as it is to call them adjectives, so I have no objection to that.--Urszag (talk) 22:45, 11 July 2024 (UTC)Reply

(By the way, there are currently 586 entries using the template {{la-epithet}}.) --kc_kennylau (talk) 14:27, 12 July 2024 (UTC)Reply

Automating taxonomic entries

I have recently made {{taxref}} which can automate the reference section for genus entries and tested it on Felis and Autoserica. Basically, those reference sections usually contain a link to Wikipedia, a link to Wikispecies, and a link to the Commons category. This new template can automatically detect if each link exists, given the Wikidata ID.

I am wondering if we should add more links to templates, and more generally if more things about taxonomic entries can be automated.

For example, I think {{taxfmt}} (which links to a Translingual entry) and {{taxlink}} (which links to Wikispecies) can be unified.

--kc_kennylau (talk) 13:32, 12 July 2024 (UTC)Reply
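A rough sketch of the "detect which links exist" idea, for illustration only. It assumes the input is the `sitelinks` object of a Wikidata entity, keyed by site ID (`enwiki`, `specieswiki`, `commonswiki`). The function name and the sample data are made up, and this is not a description of how {{taxref}} is actually implemented.

```javascript
// Hypothetical helper: given a Wikidata entity's sitelinks object,
// list which of the usual reference links could be generated.
function availableReferenceLinks(sitelinks) {
  const wanted = [
    ["Wikipedia", "enwiki"],
    ["Wikispecies", "specieswiki"],
    ["Commons", "commonswiki"],
  ];
  return wanted
    .filter(([, siteId]) => sitelinks[siteId] !== undefined)
    .map(([label, siteId]) => ({ label: label, title: sitelinks[siteId].title }));
}

// Hand-written sample data, not fetched from Wikidata.
const sampleSitelinks = {
  enwiki: { site: "enwiki", title: "Felis" },
  specieswiki: { site: "specieswiki", title: "Felis" },
};

console.log(availableReferenceLinks(sampleSitelinks).map((link) => link.label));
```

In this sample the Commons link is simply skipped because no `commonswiki` sitelink is present, which is the behaviour one would want when a link does not exist.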

Reasons not to combine {{taxfmt}} and {{taxlink}} are:
  1. Instances of taxonomic name within {{taxlink}}, but not {{taxfmt}}, are occasionally counted and used to create lists of taxonomic names "wanted" in principal namespace.
  2. There is no reason for every instance of {{taxfmt}} to check for the existence of a Translingual L2 section.
  3. The ease of converting one to the other as we add taxonomic-name entries.
Wikidata links are fine, where they exist. If there is no Wikidata link, then it is usually desirable to link to Species, WP, or Commons pages for taxa of a higher rank, which AFAICR wikidata does not do. What are also useful are some of the links to external databases, even though many of the external links do not have data not present in others.
BTW, I continue to believe that we do not need to have entries for 10,000,000 species, nor even just the 1,000,000+ described species. We need entries for those that are important for humans, often as evidenced by the existence of vernacular names that more-or-less correspond to species, genus, or higher-ranked taxon. DCDuring (talk) 19:18, 12 July 2024 (UTC)Reply
@DCDuring On your point about the large number of potential entries, many of those won't pass CFI at the end of the day. If a name is used in one paper and then only ever mentioned in taxonomic databases, I don't think that passes, quite frankly. Theknightwho (talk) 16:18, 18 July 2024 (UTC)Reply
We haven't decided that last point, about whether occurrence in a taxonomic database (or table in a print or electronic document, for that matter) was a use or a mention.
But your point would seem to argue against any simple automation of the creation of taxonomic entries and, I would argue, in favor of a system for directing new-entry creation efforts toward those names that had links thereto (ie, were "wanted") by other entries in principal namespace. There are a fair number of orphan or near-orphan taxonomic entries being created now. Some might be good candidates for RfV. DCDuring (talk) 17:23, 18 July 2024 (UTC)Reply

Implementing auto-glossary

I propose creating pages using {{auto-glossary}} in appendix space and adding User:Ioaxxere/auto-glossary.js to the gadgets list. We should start by beta-testing a couple of pages and get user feedback (@Vininn126). Also, if the gadget gets moved to MediaWiki space it would be ideal if I could be made an interface administrator so I could continue working on the gadget if need be and possibly help maintain our other gadgets as well. What do we think? (Pinging @Benwing2). Ioaxxere (talk) 05:19, 13 July 2024 (UTC)Reply

It seems nice to have. Vininn126 (talk) 09:54, 13 July 2024 (UTC)Reply
I am fine with installing this. Also I'm the one who suggested to Ioaxxere to post about becoming an interface admin. Benwing2 (talk) 18:51, 13 July 2024 (UTC)Reply
I support you for interface admin too. Kiril kovachev (talkcontribs) 19:53, 13 July 2024 (UTC)Reply

User:Ioaxxere/PagePreviews.js

As the name suggests, this is a script to display a preview of an entry when hovering over a link. To try it out, add

importScript("User:Ioaxxere/PagePreviews.js");

into your common.js page. Please try it out! My goal is to create a preview gadget that's good enough to be on by default to match Wikipedia's page previews. Ioaxxere (talk) 05:48, 14 July 2024 (UTC)Reply

Works great on mobile iOS – I tested it on both my iPhone and tablet. I really like how it skips straight to the definitions and remains in the single language that was linked. Just like its use on Wikipedia is to quickly preview an article without having to click it, this will be more convenient for readers who just want to find out the meaning of a word without having to leave the main entry one is viewing. Conversely, I do not really see any negatives here – The only point of discussion for me might be on what exactly is included in the preview, but I do prefer its current configuration for the added convenience. Easy support from me. LunaEatsTuna (talk) 21:16, 14 July 2024 (UTC)Reply

dated/archaic/obsolete in the glossary as well Wiktionary:Obsolete_and_archaic_terms

People seem to have a lot of confusion about these words. Some people seem to use "archaic" for obsolete words, or think that this is a sliding scale.

I propose we change the text in the proposed link to mention the same information as the glossary (i.e. archaic is for stylization) and also mention that this isn't really a scale of oldness, but rather markedness. Vininn126 (talk) 08:34, 14 July 2024 (UTC)Reply

Do you have some example of the confusion? DCDuring (talk) 14:50, 14 July 2024 (UTC)Reply
An exact edit, no. Although compare pośmiewać and the recent edit history. But it's been many a time the subject of discussion on Discord, maybe @PUC can chime in. Vininn126 (talk) 15:01, 14 July 2024 (UTC)Reply
Discord doesn't count as either authority or evidence. DCDuring (talk) 16:30, 14 July 2024 (UTC)Reply
@DCDuring No, but Vininn126's word that this has come up a lot on Discord means that this is an issue, and it's one I can attest to as well. People do get these confused - I've seen it myself. Theknightwho (talk) 17:10, 14 July 2024 (UTC)Reply
Why always so dismissive of the idea? Did you ignore the other part of my comment or did you see "Discord" and think "DISCORD BAD"? Vininn126 (talk) 19:18, 14 July 2024 (UTC)Reply
Why, yes, Discord IS bad, being a potential communication channel for cabals. DCDuring (talk) 21:05, 14 July 2024 (UTC)Reply
It's being communicated here with official examples, so I fail to see the issue. Vininn126 (talk) 08:18, 15 July 2024 (UTC)Reply
I am unable to assess the import of the labels in Polish. DCDuring (talk) 21:05, 14 July 2024 (UTC)Reply
‘Markedness’ felicitously puts into a single word the impressions I myself have formed through editing. My own thoughts, developing on the idea: ‘dated’ properly encompasses a rather small selection of terms, and ‘archaic’ even more so. A large part of terms common in the past and rare in the present actually elicit no recognition of antiquity from the reader, being neither part of the vocabulary of conventional archaisms, nor a word whose decline is apparent from the recent past. These uses—and I call them that because, from my experience, they mostly encompass senses of polysemous words, and not entire words—are, in my opinion, best described with a neutral {{lb|xx|now|uncommon}} or {{lb|xx|now|rare}}.
I would also like to make a comment on the nature of ‘obsolete’, which our glossary defines as ‘no longer likely to be understood’. I have seen this given as justification to not apply this label, in cases where an obsolete term bears enough similarity to an existing one to be understood even today. (‘Dated’ labels for alternative forms from the seventeenth century!) I find a good rule of thumb in such cases is that a term like this is obsolete when a reader no longer recognises it as antique, but simply wrong, novel or unusual. ―⁠Biolongvistul (talk) 16:01, 14 July 2024 (UTC)Reply
Any thoughts on how these labels should be applied (would be understood) for no-longer-widely-accepted taxonomic names? It isn't just aging taxonomists that might recognize such a term, but also users of older reference works, advocates of some less accepted taxonomic position, etc. DCDuring (talk) 16:36, 14 July 2024 (UTC)Reply
‘Historic’ maybe? ―⁠Biolongvistul (talk) 17:45, 14 July 2024 (UTC)Reply
The history can be quite recent, as DNA analysis has led to many changes in name, not to mention placement (hypernyms) and circumscription (hyponyms). Current experts in the taxonomy of a family or order would know about many old names, but view them as not suitable for their use. True obsolescence takes a long time. I am not sure that dated captures much that is relevant, but those with more exposure to the taxonomic community discourse may know better. DCDuring (talk) 21:05, 14 July 2024 (UTC)Reply
I think it's "dated" as soon as the scientific name changes, maybe "obsolete" as people stop using it. CitationsFreak (talk) 02:10, 16 July 2024 (UTC)Reply
Can "non-standard" be applied?
Specifically for taxonomic names, another possibility is "not validly published" (example: Thermodesulfobacteriota), but I feel that this might cause more confusion among general readers (unless, perhaps, linked to a glossary?).
My main concern with applying "dated" automatically is that sometimes the scientific community at large may ignore the standards/guidelines set by official bodies. For example, "Gibbs free energy" remains a common term, even though neither IUPAC (nor IUPAP, from my recollection) recommend it. At WP: "an increasing number of books and journal articles do not include the attachment 'free'," but keep in mind that this is several decades after the recommendation was made (1988). FWIW, I support the IUPAC recommendation!
Thermodesulfobacteriota became officially correct in 2021, but you will still see older forms being used — largely through lack of awareness of the change in status, and (hence) not necessarily being perceived as wrong/dated by author/reader.
Secondly, to me "dated" has connotations that the usage is old-fashioned (like groovy), and I'm not sure (yet) that that is a good fit for describing scientific words that have suddenly been superseded.
Actually, superseded could be a reasonable label too.
—DIV (1.129.106.197 08:28, 22 July 2024 (UTC))Reply
Markedness is already something we have to contend with, vis-a-vis colloquial, formal. Vininn126 (talk) 19:19, 14 July 2024 (UTC)Reply
  • Yeah, I have confusion about them. I generally go for dated=no hits in 50 years, archaic=nothing in 100 years, obsolete=nothing in 200+ years, or if Webster 1913 marks it as archaic. Sometimes it might be on the cusp of different tags, in which case I randomly choose, by instinct. Any tag telling us "you shouldn't use this word today" is better than none, IMHO. Newfiles (talk) 21:45, 14 July 2024 (UTC)Reply
    That's not always indicative of how the words are truly marked. Vininn126 (talk) 08:18, 15 July 2024 (UTC)Reply
    "Dated" definitely doesn't mean "no hits in 50 years." I've never seen anyone interpret it that way before. Take a look at the page linked in the header for this topic. That should give you a better sense of what the consensus has been in the past. Andrew Sheedy (talk) 17:11, 15 July 2024 (UTC)Reply
    I agree with you, @Andrew Sheedy. The rules of thumb described by @Newfiles strike me as overly strict. And I would query what corpus is being used. —DIV (1.129.106.197 08:33, 22 July 2024 (UTC))Reply
    I think @Newfiles' rules of thumb might be a bit rigid, but I see where they're coming from. I don't know if there is any "scientific" way of determining which label should be used, but I do agree generally that a term is "dated" if it feels old-fashioned when used in the present day, "archaic" if it feels very old-fashioned, and "obsolete" if it has fallen out of use for a long time, possibly centuries. So in my view it is a sliding scale. I am not in favour of using now in labels, because of the uncertainty this creates. — Sgconlaw (talk) 11:41, 22 July 2024 (UTC)Reply
    As I've said, this is about how the words are received when heard. Vininn126 (talk) 11:42, 22 July 2024 (UTC)Reply

Ban the POS "prepositional phrase"

I am going through and eliminating the POS "prepositional phrase" from Russian lemmas. I propose we ban new lemmas with "prepositional phrase" as the POS. IMO, all lemmas tagged as "prepositional phrase" are better identified as either an adjective, adverb, preposition or interjection. "Prepositional phrase" tells you nothing about the syntactic function of the phrase. In Russian, at least, there is no consistency whatsoever in whether a given multiword phrase headed by a preposition is identified as a "prepositional phrase" or as an adjective, adverb, preposition or interjection. I think it comes down to the laziness of the editor. Existing cases of the POS "prepositional phrase" have to be grandfathered in, but we can prohibit new ones with an edit filter. The only potential issue I see is that some prepositional phrases can function either as an adjective or an adverb, and we currently don't have a concise way of putting multiple POS's in a single entry. This means every once in a while, a prepositional phrase will have to be converted into two entries, an adjective and an adverb (or maybe identify as one of them and use |cat2= to categorize under the other, although that is less ideal). Benwing2 (talk) 03:01, 15 July 2024 (UTC)Reply

It looks like this header was specifically allowed by a vote in 2010. I don't really agree with calling a prepositional phrase an adjective or adverb; I guess some might be lexicalized to the extent of becoming essentially single words, but we also have entries for idioms that really have to be considered phrases rather than words, such as of a lifetime, on the surface, at first glance, up a tree.--Urszag (talk) 03:15, 15 July 2024 (UTC)Reply
@Urszag But you will find zillions of multiword adverbs here. There is no consistency whatsoever in whether multiword terms are characterized as "prepositional phrases" or "adverbs" etc. Why can't we call on the surface an adverb? It functions exactly like one syntactically. Benwing2 (talk) 03:21, 15 July 2024 (UTC)Reply
It's not that phrases can't be categorized as adverbs (on Wiktionary). What I meant was that I could get behind avoiding the term "prepositional phrase" if it was inaccurate—for example, if the term it was applied to wasn't really a phrase. But in cases where it is a perfectly accurate description, I don't see why it should be avoided. It seems consistent enough to call all phrases with the form of a prepositional phrase "prepositional phrases"; if some are currently called "adverbs", they could be changed to match the others, rather than the reverse. "Adverb" is a pretty diverse and heterogeneous category, so a lot of things can potentially fit in it. I can't think of a specific test to differentiate "on the surface" from a lexical adverb, but maybe someone else knows of one.--Urszag (talk) 03:47, 15 July 2024 (UTC)Reply
@Urszag We don't include "transitive verb" or "intransitive verb" as part of speech headers, even though those might be perfectly accurate. In many cases these are not phrases (since "phrase" doesn't apply to just anything with multiple words), and calling it a "prepositional phrase" obscures the actual function of the term, since it's too broad. Theknightwho (talk) 03:49, 15 July 2024 (UTC)Reply
To me, the analogy with "transitive verb" and "intransitive verb" cuts the other way. We call verbs "verbs", and don't try to include extra information about how they grammatically function in the POS header. Likewise, I think we should call prepositional phrases "prepositional phrases", and not bother with trying to make the header tell the reader details about how they function grammatically in a sentence: that's what the definition, examples, and if necessary usage notes are for, not what the Part of Speech header is for.--Urszag (talk) 15:31, 15 July 2024 (UTC)Reply
I completely agree. It feels like a crutch for people who don't like multiword terms. Theknightwho (talk) 03:48, 15 July 2024 (UTC)Reply
I, too, find that the usage of this POS is generally not helpful. I've eliminated most of its use from Polish entries. Vininn126 (talk) 08:23, 15 July 2024 (UTC)Reply
Quoting opening comment: "The only potential issue I see is that some prepositional phrases can function either as an adjective or an adverb, and we currently don't have a concise way of putting multiple POS's in a single entry. This means every once in a while, a prepositional phrase will have to be converted into two entries, an adjective and an adverb (or maybe identify as one of them and use |cat2= to categorize under the other, although that is less ideal)."
In English probably the majority of prepositional phrases can serve as both adjectives and adverbs. Do we have any facts to support the adverbial "Every once in a while"? I believe it is just wrong, at least for English. If there are languages other than English for which there is some good reason to remove the "prepositional phrase" PoS, so be it.
I really don't understand the motivation for this kind of indiscriminate, complicating, revolutionary change. Is it fun? Does it indulge a homogenizing, controlling impulse? DCDuring (talk) 12:29, 15 July 2024 (UTC)Reply
Because not all can be converted, and placing under one umbrella implies that these phrases all have the same syntactic behavior, which isn't true. Why must you always throw accusations like this around? It's unbecoming, rude, and frankly I'm tired of it. This is a poor attitude to have. Vininn126 (talk) 12:37, 15 July 2024 (UTC)Reply
My questions can be rephrased as "What is the "attitude" that warrants this kind of proposal?"
All categories are "umbrella classes", with a great deal of diversity of syntactic behavior. That each PP tends to modify sentences, phrases, adverbs, or adjectives with different frequencies might be a call for some kind of usage note, but I doubt that we will do the work to justify such a note, just as we haven't done the work to support even the broader claims on which this proposal is based. It does seem to be much easier to change things all at once in code than to engage in the one-definition-at-a-time effort to improve entry quality. DCDuring (talk) 13:27, 15 July 2024 (UTC)Reply
What is or isn't a prepositional phrase is rather clear-cut. Unless it changed since I was in middle school English, all prepositional phrases start with a preposition and have some more words afterwards (usually articles and nouns). Purplebackpack89 15:21, 15 July 2024 (UTC)Reply
That has nothing to do with what I said. DC's claim is that most prepositional phrases may need to be converted to both an adjective and an adverb, and my argument is that not all will be. By using prepositional phrase we are assuming that they all have the same syntactical behavior, when in reality, they do not. Vininn126 (talk) 15:23, 15 July 2024 (UTC)Reply
The fact that they do not is the reason why we have to preserve prepositional phrase as an acceptable POS... Purplebackpack89 17:17, 15 July 2024 (UTC)Reply
Huh? We need to know which ones act as both adverbs and adjectives, and which ones act as only adverbs, and which ones only as adjectives. If they always behaved as both, it would be predictable. Your logic makes no sense. Vininn126 (talk) 17:21, 15 July 2024 (UTC)Reply
His logic does not make sense, but do we feel confident that users would accurately distinguish the prepositional phrases by adverbs and adjectives? Just two years ago we had to find the common fallacy of categorizing predicatively used adjectives as adverbs, Wiktionary:Requests for deletion/Non-English § extrem, Talk:extrem. Sure we can do it, but I have not discerned the benefit of the eventual effort (only significant in English though, given the numbers in the category), instead of leaving the ambiguity, only that Benwing can apparently more parsimoniously conceptualize the parts of speech, since a rule or theory becomes less convincing with each exception. Fay Freak (talk) 20:26, 15 July 2024 (UTC)Reply
Trusting users to accurately distinguish things should not always be a priority. We should remain faithful to the truth, regardless of how complicated it is. Vininn126 (talk) 20:28, 15 July 2024 (UTC)Reply
I mean it is not wrong, and then can it be more wrong, or less correct or more vague if people prefer to employ the more general category “phrase”. “Prepositional phrase” could also be kept as a tracking category therefore. Fay Freak (talk) 20:34, 15 July 2024 (UTC)Reply
FWIW, Wiktionary consistently misuses the word "phrase". am I under arrest is not a linguistic phrase but that's what we call it. Benwing2 (talk) 20:36, 15 July 2024 (UTC)Reply
Is that relevant to this discussion? @Theknightwho suggested that some of the terms currently labeled "prepositional phrase" are not phrases, but most of the terms in Category:English prepositional phrases seem to qualify. I've relabeled cases like at the bottom of as prepositions. By the way, I noticed Category:English phrasal prepositions seems to be a manually curated category, but couldn't it be implemented automatically as the intersection of Category:English prepositions and Category:English multiword terms?--Urszag (talk) 21:48, 15 July 2024 (UTC)Reply
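To illustrate the intersection idea in the comment above: given the member lists of two categories, the automatic category would contain the titles present in both. This is only a sketch; the function name is hypothetical and the sample lists are hand-written, not the real category contents.

```javascript
// Hypothetical helper: titles that appear in both category member lists.
function intersectCategories(membersA, membersB) {
  const inB = new Set(membersB);
  return membersA.filter((title) => inB.has(title));
}

// Hand-written sample data, not actual category members.
const prepositions = ["of", "at the bottom of", "in front of", "under"];
const multiwordTerms = ["at the bottom of", "in front of", "on the surface"];

console.log(intersectCategories(prepositions, multiwordTerms));
```

Using a Set for one side keeps the lookup cheap even for large categories, and the result preserves the order of the first list.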
If you still have the problem with prepositional phrases that function as adjectives (but aren't adjectives) AND function as adverbs (but aren't adverbs), by deleting the prepositional phrase category, you're just replacing one problem with another problem. People are way too focused on trying to label prepositional phrases as adjectives or adverbs, but I don't think that that's an important or necessary distinction, certainly not important enough to delete a part of speech to implement. Purplebackpack89 22:08, 15 July 2024 (UTC)Reply
@Purplebackpack89 But they are adjectives and/or adverbs, whereas "prepositional phrase" doesn't refer to an independent part of speech. All we're doing at the moment is making it more difficult to tell which ones can be used as adjectives, which can be used as adverbs, and which can be both. Theknightwho (talk) 22:37, 15 July 2024 (UTC)Reply
A preposition is an independent part of speech. Why wouldn't a prepositional phrase be as well? I still say calling something that's preposition + article + noun an adjective or an adverb is inaccurate. Purplebackpack89 23:59, 15 July 2024 (UTC)Reply
What on earth are you even talking about? I feel like you are discussing something completely different. Basically the idea is to make prepositional phrases more like na bani. Vininn126 (talk) 08:09, 16 July 2024 (UTC)Reply
From what I gather, Purple doesn't know linguistics well but thinks he does. Benwing2 (talk) 08:34, 16 July 2024 (UTC)Reply
@DCDuring Do you find it fun caricaturing every opinion you disagree with? You continually pepper discussions with these kinds of passive-aggressive comments, and they are getting very tiresome. Theknightwho (talk) 21:50, 15 July 2024 (UTC)Reply
How are they passive? DCDuring (talk) 22:10, 15 July 2024 (UTC)Reply
You "pepper discussions with [...] aggressive comments"? CitationsFreak (talk) 01:11, 16 July 2024 (UTC)Reply
No comment right now on the proposal to remove "Prepositional Phrase" but it's a bit funny to me how our current list (outside of "Prepositional Phrase" which had its own vote) was institutionalized by a vote consisting of only 7 editors in a 5-1-1 vote as seen at Wiktionary:Votes/pl-2015-12/Part_of_speech. It was part of User:Daniel Carrero's vote-a-rama in 2015 to rightfully update Wiktionary's policies as of that time, as explained in Wiktionary:Beer parlour/2015/December § WT:EL new votes. It's been 9 years since then. Maybe it's time we do a full review of WT:EL, though I suspect it won't be as simple as it used to be. AG202 (talk) 23:30, 15 July 2024 (UTC)Reply
Delete. If they are just multi-word adverbs/adjectives/other things, just call them what they are. (I also support seeing which phrases are really just other parts of speech disguised as phrases. I know that tail between one's legs is only a phrase due to editing conflicts over if it should be a noun or adverb.) CitationsFreak (talk) 01:26, 16 July 2024 (UTC)Reply
@DCDuring, let's remember that "prepositional phrase" is a common header now chiefly because Equinox merged thousands upon thousands of separate adverb/adjective sections under a unified "prepositional phrase" header back in the day. I've always disagreed with that, as it seemed to me we were erasing a useful distinction. Moreover it's not always possible to word a definition both adjectivally and adverbially. PUC 17:54, 16 July 2024 (UTC)Reply
Yes, I remember. I also believe that we would be foolish not to recognize that many of the PP entries would probably benefit from {{&lit}} as they often have SoP as well as entry-worthy definitions, however this gets resolved. DCDuring (talk) 21:24, 16 July 2024 (UTC)Reply
  Support eliminating this header. Part of speech headers exist to describe the syntactic function, not the form, of a term. We got rid of "acronym", "initialism", and "abbreviation" for this reason. Ultimateria (talk) 20:12, 16 July 2024 (UTC)Reply
  Oppose. Many of the Italian entries I create are prepositional phrases. Splitting these into separate POS headers for adjectives and adverbs would be both inconvenient and misleading for almost all such entries. It’s also more intuitive for me to categorize these entries as prepositional phrases rather than as adjectives/adverbs. Imetsia (talk (more)) 22:55, 17 July 2024 (UTC)Reply
Could you provide an example where this would be detrimental? Vininn126 (talk) 22:57, 17 July 2024 (UTC)Reply
Just looking through my recent contributions, entries like a lume di candela, a coppia, a palla, and a gonfie vele would be affected. In these cases, we'd have to split them up into adjective and adverb headers, which I find to be an unintuitive way to label these constructions. Imetsia (talk (more)) 23:12, 17 July 2024 (UTC)Reply
I would classify most of these as adverbs, to be honest, in a syntactical sense. Vininn126 (talk) 23:13, 17 July 2024 (UTC)Reply
I don't see how you could. I have the example "cena a lume di candela" ("candlelit dinner"), in which case "a lume di candela" is modifying a noun. I could come up with "gara a coppia" ("competition between pairs of contestants"), "musica a palla" ("loud music"), "vendita ticket a gonfie vele" ("smooth-sailing ticket sales"); in which case the constructions are always modifying nouns. Imetsia (talk (more)) 23:17, 17 July 2024 (UTC)Reply
Taking a noun argument does not exclude it - often phrases built from a preposition + noun are adverbs anyway, asking the syntactic question "when", compare w nocy (at night). Vininn126 (talk) 23:19, 17 July 2024 (UTC)Reply
Aren't these just adjectives? CitationsFreak (talk) 03:51, 20 July 2024 (UTC)Reply
I would also say these are adverbs. In cena a luma di candela it is acting as an adjunct phrase modifying a noun, like in the English translation dinner by candle-light. Do these phrases share any other properties of adjectives, apart from the fact that they modify a preceding noun? Can you inflect them? Nominalise them? —Caoimhin ceallach (talk) 18:59, 9 August 2024 (UTC)Reply
If on the other hand we say that these adjuncts are adjectives, simply because they modify a noun, then we have to allow that almost any adverb can also be an adjective:
Do you see that oak there?
That guy yesterday at the newspaper stand, what was his name again?
They saw in her their saviour, their messiah almost.
In each of these the adverb is most straightforwardly interpreted as qualifying the noun it follows. Either every adverb needs an adjective section or we say that all adverbs and adverbial phrases can with some effort be forced to serve as adjuncts in noun phrases, but that this doesn't affect their true nature as adverbs. These are my thoughts on the matter anyway. —Caoimhin ceallach (talk) 23:52, 9 August 2024 (UTC)
Honestly this sounds like "it's too much work for me to figure out whether it's an adjective or adverb so I'd rather just put something unhelpful and let the reader figure it out". Benwing2 (talk) 23:02, 17 July 2024 (UTC)
No, for almost all cases, the prepositional phrase can act as both an adjective and an adverb. It's not a matter of "figuring out whether it's an adjective or adverb". And I don't think readers have been confused by the prepositional phrase header. Imetsia (talk (more)) 23:13, 17 July 2024 (UTC)
I don't think it's always about what leaves the reader confused, but rather about what's factually more precise, since some PP can be adverbs, some adjectives, and some both. Vininn126 (talk) 23:15, 17 July 2024 (UTC)
  Support removing this POS. —Caoimhin ceallach (talk) 20:33, 31 July 2024 (UTC)
This seems like something that should be a vote, particularly since it's overturning a 15-year-old vote. —Justin (koavf)TCM 19:35, 9 August 2024 (UTC)

Minor changes to CFI attestation section


At WT:RFVE#myxa, P Aculeius wrote a comment that drew my attention to some divergences between our current practice and the text of CFI. I want to propose two small changes to align the policy with practice.

1. What is a "citation"?

Currently CFI does not actually say what a "citation" is. Experienced Wiktionary editors know that (at least for WDLs) a citation is a quotation - a snippet of (usually) running text that includes the word and is referenced to a particular durably archived work. But CFI somehow manages to avoid saying that. The word citation means different things to different people and non-lexicographers may be unfamiliar with the applicable sense. Case in point: to us, "citation" and "quotation" are interchangeable, but to a Wikipedian, "citation" is synonymous with "reference" (Wikipedia:Citing sources).

This can be cleared up by adding a sentence at the beginning of WT:CFI#Number of citations:

{{l|en|citation|Citations|id=lexicography}}, in the form of [[WT:Quotations|quotations]], provide evidence that a term exists and provide examples of how it is used as part of a language.

For languages well documented on the Internet, three citations in which a term is used is the minimum number for inclusion in Wiktionary. […]

2. Dictionary entries are not "uses"

It is well understood by seasoned Wiktionarians that a mere listing of a word as a headword in a dictionary or glossary is a mention, not a use, and only relevant for the attestation of LDLs. But perhaps this is not so obvious to less experienced contributors. It wouldn't hurt for CFI to say this outright.

To help clear up any confusion, I propose to alter the existing example in WT:CFI#Conveying meaning from

For example, an appearance in someone’s online dictionary is suggestive, but it does not show the word actually used to convey meaning.

to:

For example, the fact that a dictionary contains an entry for the word is suggestive, but it does not show the word actually used to convey meaning.

(I note for completeness that users have differing views on whether the appearance of a word as a gloss in a dictionary is a use or a mention. That's why I chose the wording carefully, specifically using the word "entry" instead of the general term "appearance" to avoid any possible confusion.)

These changes are small and, I suspect, uncontroversial, so I'm asking here to see if there is support or opposition instead of opening a formal vote. If it is felt that a vote is required I will start one. This, that and the other (talk) 10:35, 17 July 2024 (UTC)

These are indeed minor changes, in that they only codify "common understandings" rather than addressing significant issues raised by these questions. The fact that the rest of the universe defines a "citation" as any reference to authority for a given point, while on Wiktionary the word is "commonly understood" to refer only to particular usage examples, and to exclude all authority is a significant and frankly nonsensical situation. To provide context, a common situation is that an editor will refer words that have no citations or attested usage in their entries to RFV because they're not mentioned in a particular dictionary, e.g. OED, or because the OED entry doesn't include a particular sense. But if Webster's Third New International Dictionary gives the word and supports the definition for the entry—or for that matter any number of other authorities—then the concern raised hasn't been addressed because dictionary entries don't count as citations, even though the reason why the word or sense was brought to RFV was because of what wasn't included in another dictionary!
Worse, the entry is then vulnerable to deletion as an unattested word—despite literal attestation from strong authority—because we want a minimum of three usage examples. Which is a fine goal to show that a word is actually in use—but this standard is often difficult to satisfy in the case of archaic or technical terms, which may be contained, mentioned, or defined in every textbook and manual of a subject, but may be used without any definition whatever only in old and largely inaccessible works that can't be easily located on the internet.
The case that brought this to discussion was a technical term for "the fused distal end of the lower mandible" of a bird, which evidently was occasionally found in ornithological literature of the late nineteenth and early twentieth century, but which was described as "rare" in its dictionary entries. Although three examples of use besides in dictionaries and glossaries were eventually found by another editor, I made several attempts to locate it in use via Google and Google Books, searching for the term in combination with keywords such as "ornithology", "avian", "anatomy", "birds", and for ornithological texts that might use the word to describe bird mandibles—and I came up with exactly one, which did little more than define it and explain that it was a rare term used by at least one important authority, cited by last name and year only, but for which it recommended the use of a different word.
Of course, Wiktionary hosts countless entries and definitions that lack three attested uses, but which go unchallenged because they appear to be correct anyway, and there seems to be no rush to go out and find them. But once someone discovers that a particular dictionary doesn't include the word or sense, it's not enough to show that others do—you have to go beyond that and find three uses that don't define or discuss the word, or else the entry can be deleted. To be clear, this is not an argument for allowing random rubbish on the internet, or "dictionary-only words", i.e. terms that have never been used for their intended purpose, but were simply invented as examples of words that someone thought should exist, like "hippopotomonstrosesquipedalian", or the fanciful collective plurals of animals that were never used before someone decided that a particular term would be amusing, and which are only ever used as examples of words.
Instead, it is an argument that when a word or sense can be cited to a strong authority—for instance a pre-internet dictionary not stuffed full of nonce words, or technical works that describe what something means and how or even whether it is or was in widespread use, and especially when multiple authorities can be cited, then it makes little sense to hold that it should then be marked as "unverified", or subject to deletion, due to a lack of sufficient attestation, even though it may have significantly more evidence of, and authority for both meaning and usage, than a significant proportion of other, unchallenged entries.
On Wikipedia, we have a number of tags that can be applied to articles, sections, or claims, to indicate that additional sources are needed or wanted, and that potentially-controversial material may be subject to deletion if it can't be verified. But on Wiktionary, even things that can easily be verified—and have been—are subject to deletion under the present CFI. It seems to me that what's really needed is an adjustment to policy that recognizes that words or senses that can be found in and cited to authority have at least a minimal degree of attestation, and that while additional sources or examples may be wanted, the entry or sense does not need to be deleted solely due to the fact that no editor has succeeded in finding—or perhaps even attempted to go out and find those sources or examples. If you can challenge an entry based on its inclusion or lack thereof in a dictionary, then the fact that it appears in one ought to be at least as relevant, otherwise there is a significant imbalance between inclusion and exclusion—one that seems to do a serious disservice to our readers. P Aculeius (talk) 13:08, 17 July 2024 (UTC)Reply
I responded to most of these points at WT:RFVE#myxa. I would only note here that what I'm proposing is nothing new; it is merely seeking to update the policy to reflect longstanding practice on this wiki. I'll let others respond but I don't think you'll see much support for your suggestions here. This, that and the other (talk) 14:44, 17 July 2024 (UTC)
Above, it is mentioned that the three cites rule is "difficult to satisfy in the case of archaic or technical terms, which may be contained, mentioned, or defined in every textbook and manual of a subject, but may be used without any definition whatever only in old and largely inaccessible works that can't be easily located on the internet." This is true for many modern minor geographical terms as well. There are geographical terms with legitimately cited English Wikipedia entries (cited from non-English materials) that can't meet Wiktionary's three cites. The three cites rule itself is a preposterous absurdity and the emperor has no clothes. But despite this truth, it is indeed because of the three cites rule that I can make entries for many legitimate but rare words like Banmendian that do not appear in other dictionaries, and many Tongyong Pinyin words or similar minor location names. Geographyinitiative (talk) 16:43, 17 July 2024 (UTC)
@P Aculeius What you've said is all very well, but it would be a major policy change. What might make more sense is to change the word "citation" to "quotation" on the policy page, to avoid any confusion over what the word "citation" means. Theknightwho (talk) 21:55, 17 July 2024 (UTC)
Are you building a false dichotomy? I know how to balance inclusion and exclusion.
Given that our manpower is limited, we “send to RFV” terms for which, on a case-by-case basis, there are indications of them being hoaxical, protological, or corrupted. That is for the positive assumption, which editors arrive at differently, that they have not been used at any point in time or corner of the earth, as opposed to you being unable to find them while specifically searching.
For Geographyinitiative’s topographic terms you thereby have a litmus test that if you believe it had to be in a secret military map it is better left included. For living languages I have been content for a spelling to exist only conceptually: Wiktionary:Requests for deletion/Non-English#5-DM-Banknote, Talk:5-DM-Banknote, after all it is the most used banknote in Europe. Inclusion may provide a picture more consistent with existing data. This is perhaps a wider implication of the “assume good faith” principle. But it is only me considering the general spirit, principles and purposes of the primary documents so much, while others outpope the Pope in their legalism; I would leave rixig as a term secured for some period and place in spite of only one quote and anything in general if our resources are convincingly specifically argued to suffer undercoverage: we need a flexibility clause for cases when one term points to more occurrences which we just don’t find, if a word has arisen organically in a community which presumably shared it rather than being an occasional literary creation. We know how to build a dictionary, if not for the CFI, indeed, wherewith we have been increasingly creative in order not to make the text work against its own goals. … Fay Freak (talk) 21:12, 17 July 2024 (UTC)Reply
  Support These minor changes are an improvement. It is possible that major changes might yield a greater improvement, but major changes will surely take longer to negotiate, and shouldn't preclude minor changes in the meanwhile. @Theknightwho, I don't think it makes sense to change the wording in the policy to "quotation" unless the "Citations" tab for each entry is likewise renamed. Besides that, I can see that it could be possible to cite an example of usage, or to cite appearance in a wordlist. (WP is perhaps more focussed on citations of concepts/opinions/facts.) So, optionally:

{{l|en|citation|Citations|id=lexicography}} of usage, in the form of [[WT:Quotations|quotations]], provide evidence that a term exists and provide examples of how it is used as part of a language.

—DIV (1.129.106.197 12:11, 22 July 2024 (UTC))
Yes, I could support that. Theknightwho (talk) 14:56, 22 July 2024 (UTC)
:-) One thing I overlooked is that the expansion button does currently read "quotations ▼". So it's not quite uniform already. —DIV (1.129.106.197 01:25, 23 July 2024 (UTC))
On further reflection, I feel that what we really mean must contain two elements, which are the reproduced passage and the details of where it came from. On its own "quotation" skews more to the first element, and "citation" skews more to the second element, so I'm now thinking that perhaps a word like "sourced" could also be inserted, as in:

{{l|en|citation|Citations|id=lexicography}} of usage, in the form of [[WT:Quotations|sourced quotations]], provide evidence that a term exists and provide examples of how it is used as part of a language.

Although arguably that would not quite be in keeping with the explanation/definition of WT:Quotations.
—DIV (1.129.106.197 01:37, 23 July 2024 (UTC))

consensus on inclusion/exclusion of "someone" in multiword English verb lemmas


Hi. I'd like to get consensus on how to handle the placement of "someone", "something", "one" and "it" in multiword verb phrases. I already made a BP post about this last month, here: Wiktionary:Beer parlour/2024/June#standardizing the form of phrase lemmas. Most of the rules I proposed were well-received, but there was disagreement over whether and when to include the word "someone" etc. in verb phrases (except when it occurs as someone's, where it's generally mandatory). The statistics indicate that mostly it is excluded in the lemma:

The upshot is that the vast majority don't contain someone or something.

Some people said that we should put "someone/something" in the lemma when it belongs in the middle, e.g. "see something through", because it's supposedly mandatory here. I note that this isn't actually the case; even expressions like this can have the object placed at the end if it's heavy, e.g. "I saw through all the projects that were assigned to me", whereas "I saw all the projects that were assigned to me through" is maybe possible but awkward to say the least. Furthermore, it's not generally the practice to include "someone" or "something" in the lemma, as exemplified by see through, which contains both the "see through it" and "see it through" senses.

What I do propose instead is to indicate the position of the object, especially when it goes before the preposition, in the headword but not the lemma. This would mean that under see through we'd have two separate entries with separate headwords, one for the meanings that are construed using "see through it" and another for the meanings that are construed using "see it through". This might be indicated something like this:

===Verb===
{{en-verb|see<,,saw,seen> through (sth/so)}}

# {{lb|en|transitive}} To perceive visually through something [[transparent]].
#: {{ux|en|Their fabric is so thin that I can '''see through''' these curtains.}}
#: {{ux|en|We '''saw through''' the water with ease; it was as clear as glass.}}
# {{lb|en|transitive|idiomatic}} To not be [[deceive|deceived]] by something that is [[false]] or [[misleading]]; to understand the hidden truth about someone or something.
#: {{ux|en|I'm surprised she doesn't '''see through''' his lies.}}
#: {{ux|en|I can '''see through''' his [[poker face]]. He isn't fooling anyone.}}
#* {{quote-book|en|title=Rationality and the Pursuit of Happiness: The Legacy of Albert Ellis|author=Michael E. Bernard|year=2010|passage=Now, when you awfulize you go beyond that and tell yourself, instead “It's horrible, awful and terrible!” You then mean several things, all of which are clearly unprovable and which any self-respecting Martian with an IQ of 100 could easily '''see through'''.}}
# {{lb|en|transitive|idiomatic}} To [[recognize]] someone's true [[motive]]s or [[character]].
#: {{ux|en|In that moment, I finally '''saw through''' her; this petition drive had nothing to do with her love for animals, and everything to do with impressing Michael, the cute intern.}}

===Verb===
{{en-verb|see<,,saw,seen> (sth/so) through}}

# {{lb|en|transitive|idiomatic}} To provide support or cooperation to (a person) throughout a period of time; to support someone through a difficult time.
#: {{ux|en|And may we all, citizens the world over, '''see''' these events '''through'''.}}
#* {{quote-song|en|title=w:Never, Never Gonna Give Ya Up|author=w:Barry White|year=1973|passage=Forever and ever, yeah / I'll '''see you through''' it}}
#* {{quote-song|en|year=1976|title=Coney Island Baby|author=w:Lou Reed|passage=The glory of love might '''see you through'''}}
# {{lb|en|transitive|idiomatic}} To do something until it is finished; to continue [[work on|working on]] (something) until it is finished.
#: {{syn|en|see out}}
#: {{cot|en|carry out}}
#: {{ux|en|Despite her health problems, Madame Prime Minister '''saw''' the project '''through'''.}}
#* {{quote-journal|en|date=2022 January 12|author=Sir Michael Holden|title=Reform of the workforce or death by a thousand cuts?|journal=RAIL|issue=948|page=25|text=But if the Government really wants our railway to reduce the level of its subsidy and improve value for taxpayers' money, then it must provide the political air cover to enable managers to get on and make the hard decisions that are needed... and then '''see''' them '''through'''.}}
# {{lb|en|transitive|idiomatic}} To constitute ample supply for one for.
#: {{ux|en|Those chocolates should '''see''' us '''through''' the holiday season.}}

Here, the format of the argument to {{en-verb}} is provisional. The proposed syntax displays as see through something/someone (for the first entry) and see something/someone through (for the second entry). Maybe there's an even more efficient but still understandable syntax. For example, the Italian verb module has a large list of built-in verbs and I may do the same here; then you can just say {{en-verb|see<@> through (sth/so)}} where @ means "consult the built-in verb list for see", so you don't have to duplicate the principal parts of see in each expression involving it if you don't want to.
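As an illustration only, the mapping from the provisional argument to the displayed headword can be sketched in Python. This is not the actual {{en-verb}} module code. The function name `display_form` and the `PLACEHOLDERS` table are hypothetical, and only the abbreviations "sth" and "so" come from the proposal above.

```python
import re

# Hypothetical abbreviation table; only "sth" and "so" appear in the proposal.
PLACEHOLDERS = {"sth": "something", "so": "someone"}


def display_form(arg: str) -> str:
    """Derive the displayed headword from a provisional en-verb argument."""
    # Drop the angle-bracket inflection spec attached to the verb,
    # e.g. "<,,saw,seen>" or "<@>".
    text = re.sub(r"<[^>]*>", "", arg)

    def expand(match: re.Match) -> str:
        # Expand each slash-separated abbreviation inside the parentheses.
        parts = match.group(1).split("/")
        return "/".join(PLACEHOLDERS.get(part, part) for part in parts)

    # Replace every parenthesised placeholder group with its expansion.
    return re.sub(r"\(([^)]*)\)", expand, text)


print(display_form("see<,,saw,seen> through (sth/so)"))
print(display_form("see<,,saw,seen> (sth/so) through"))
```

Under these assumptions the two calls give "see through something/someone" and "see something/someone through", matching the display described above. The `<@>` shorthand is treated the same way as an explicit inflection spec, since it only affects the principal parts.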

Note also that I'm about to change things so that multiword verbs link each word separately by default in the listed inflections instead of linking the whole expression as a green link; it seems overkill to have non-lemma entries for the inflections of all these expressions. Benwing2 (talk) 06:06, 18 July 2024 (UTC)

Seems good. I note that one/one's (used to placehold for a reflexive pronoun) is omitted from the list of placeholders. But having placeholders (something, someone etc.) with the same orthography as the core terms of the expression makes the placeholders seem to be a required part of the expression. Some dictionaries use parentheses for such cases. Perhaps we could just not embolden the placeholders. (We could then use parentheses to orthographically distinguish optional from required terms and placeholders.)
I also wonder how we could detect whether we have usage examples that actually match the definitions when someone (a person) is common, as well as something (inanimate). I mention that because the first definition in see sth/so through explicitly includes "(a person)" whereas the usex has an inanimate object. If we are going to have "something" and "someone" used and distinguished in our definitions, as we should, then we should make sure our cites and usexes fit. If it can't be done automagically, then we need some human engineering to induce contributors to clean these things up. We need to make such cleanup a bright shiny object. The only way I've experienced is cleanup lists. Creating such lists would be an important task for magic. DCDuring (talk) 14:42, 18 July 2024 (UTC)Reply
@DCDuring I agree with everything you say. I don't think it's possible to automatically determine whether a cite matches a usex and the header; this would have to be done manually, through cleanup lists as you suggest. Benwing2 (talk) 21:29, 18 July 2024 (UTC)
Do you have any ideas about how to generate useful cleanup lists? They don't have to be perfect in inclusion or selectivity. Actually, maybe all we need is to look at all the cases that have each of the placeholders in the headword and either of the standard indicators of usage examples: {{usex}} or *:. The numbers you've provided above are not vast. All one needs is sufficient Sitzfleisch. DCDuring (talk) 21:49, 18 July 2024 (UTC)
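A rough sketch of such a cleanup-list filter, in Python, might look like the following. It assumes the page texts are already available as a plain dictionary of title to wikitext (for example, extracted from a dump). The function name, the regular expressions and the sample pages are all hypothetical, and the matching is deliberately loose, since the list does not have to be perfect.

```python
import re

# Placeholder words (and the abbreviations from the proposal) to look for.
PLACEHOLDER_RE = re.compile(r"\b(something|someone|one|it|sth|so)\b")
# Captures the argument text of an {{en-verb|...}} headword template.
HEADWORD_RE = re.compile(r"\{\{en-verb\|([^{}]*)\}\}")
# Usage-example indicators: {{ux|...}}, {{usex|...}}, or a "#:" / "#*:" line.
USAGE_RE = re.compile(r"\{\{(?:ux|usex)\||^#\*?:", re.MULTILINE)


def cleanup_candidates(pages: dict[str, str]) -> list[str]:
    """Return titles with a placeholder in the headword and a usage example."""
    hits = []
    for title, wikitext in pages.items():
        headwords = HEADWORD_RE.findall(wikitext)
        has_placeholder = bool(PLACEHOLDER_RE.search(title)) or any(
            PLACEHOLDER_RE.search(headword) for headword in headwords
        )
        if has_placeholder and USAGE_RE.search(wikitext):
            hits.append(title)
    return sorted(hits)


# Made-up sample data, for illustration only.
sample_pages = {
    "see through": "{{en-verb|see<,,saw,seen> through (sth/so)}}\n"
                   "#: {{ux|en|I can see through it.}}",
    "run": "{{en-verb}}\n#: {{ux|en|I run.}}",
    "see someone off": "{{en-verb|see<,,saw,seen> someone off}}",
}
print(cleanup_candidates(sample_pages))
```

The output is only a worklist: someone still has to check by hand whether each usage example matches the "someone" or "something" of its definition.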
  • How would you handle cases like break up, where the noun can go on either side of the preposition in some cases? "I broke up the fight" and "I broke the fight up" are AFAICT synonymous, but I'm not sure that holds for all senses (does it work for sense 3, which I think of as break someone up)? Smurrayinchester (talk) 15:27, 18 July 2024 (UTC)
    @Smurrayinchester Hmm, this is interesting. It seems there may be at least four separate cases to consider:
    1. It is mandatory to place the noun or pronoun after the preposition, as in see through (it)/see through (the ruse).
    2. It is mandatory to place a pronoun before the preposition, but nouns normally go after, as in break up (the monotony).
    3. It is mandatory to place a pronoun before the preposition, but nouns can go before or after, as in break up (the class into groups) or break (the class) up (into groups).
    4. It is mandatory to place a pronoun before the preposition, but nouns normally go before, as in break (the class) up (into fits of laughter) or see (the project) through.
    If you look at the definition of break up in the Farlex Dictionary of Idioms [4], they seem to distinguish these cases: they identify case #4 with “In this usage, a noun or pronoun is commonly used between "break" and "up."” and case #3 with “In this usage, a noun or pronoun can be used between "break" and "up."”, while case #2 has no trailing verbiage. (Note, though, that by this criterion they put "break up into pieces" in case #4 instead of #3.) Under see through in the same dictionary [5], they have three separate headers, "see (one) through", "see (something) through" and "see through (someone or something)".
    A few things to add:
    1. "someone" and "something" are pronouns, and accordingly they get placed before the preposition whenever possible, even in case #2 above; "break up something" sounds a bit strange to me unless you put a pause between "up" and "something".
    2. Even in case #2 it's possible to put short nouns before the preposition, as in "break the monotony up", it's just not so natural.
    3. In case #4, sufficiently heavy/long nouns have to be placed after the preposition, as in the example I gave above: "I saw through all the projects that were assigned to me".
    Anyway, I'm not quite sure how to distinguish cases #2 - #4 above. It seems we have two choices: fit this into the header somehow or use labels or similar. My intuitive sense is that labels might be better, because otherwise there might be a lot of duplication of headers and because the labels can appropriately link to an appendix that explains the usage in more detail.
    BTW I'm sure there have been oodles of papers written on this topic but I don't know of any good ones. Can anyone find a definitive explanation of the above phenomena? Benwing2 (talk) 21:28, 18 July 2024 (UTC)

Wikimedia Movement Charter ratification voting results

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hello everyone,

After carefully tallying both individual and affiliate votes, the Charter Electoral Commission is pleased to announce the final results of the Wikimedia Movement Charter voting.  

As communicated by the Charter Electoral Commission, we reached the quorum for both Affiliate and individual votes by the time the vote closed on July 9, 23:59 UTC. We thank all 2,451 individuals and 129 Affiliate representatives who voted in the ratification process. Your votes and comments are invaluable for the future steps in Movement Strategy.

The final results of the Wikimedia Movement Charter ratification voting held between 25 June and 9 July 2024 are as follows:

Individual vote:

Out of 2,451 individuals who voted as of July 9 23:59 (UTC), 2,446 have been accepted as valid votes. Among these, 1,710 voted “yes”; 623 voted “no”; and 113 selected “–” (neutral). Because the neutral votes don’t count towards the total number of votes cast, 73.30% voted to approve the Charter (1710/2333), while 26.70% voted to reject the Charter (623/2333).

Affiliates vote:

Out of 129 Affiliates designated voters who voted as of July 9 23:59 (UTC), 129 votes are confirmed as valid votes. Among these, 93 voted “yes”; 18 voted “no”; and 18 selected “–” (neutral). Because the neutral votes don’t count towards the total number of votes cast, 83.78% voted to approve the Charter (93/111), while 16.22% voted to reject the Charter (18/111).
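The percentages in both tallies follow from dividing by yes plus no only, with neutral votes left out of the denominator. A minimal Python check of that arithmetic, using the counts reported above (the function name `approval_split` is just an illustrative choice):

```python
def approval_split(yes: int, no: int) -> tuple[float, float]:
    """Return (approve %, reject %) with neutral votes excluded from the total."""
    total = yes + no  # neutral votes are not counted as votes cast
    return round(100 * yes / total, 2), round(100 * no / total, 2)


# Individual vote: 1,710 yes, 623 no (113 neutral, excluded).
individual = approval_split(1710, 623)
# Affiliate vote: 93 yes, 18 no (18 neutral, excluded).
affiliates = approval_split(93, 18)

print(f"Individuals: {individual[0]:.2f}% approve, {individual[1]:.2f}% reject")
print(f"Affiliates:  {affiliates[0]:.2f}% approve, {affiliates[1]:.2f}% reject")
```

The counts are also internally consistent: 1,710 + 623 + 113 = 2,446 valid individual votes, and 93 + 18 + 18 = 129 affiliate votes.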

Board of Trustees of the Wikimedia Foundation:

The Wikimedia Foundation Board of Trustees voted not to ratify the proposed Charter during their special Board meeting on July 8, 2024. The Chair of the Wikimedia Foundation Board of Trustees, Nataliia Tymkiv, shared the result of the vote, the resolution, meeting minutes and proposed next steps.  

With this, the Wikimedia Movement Charter in its current revision is not ratified.

We thank you for your participation in this important moment in our movement’s governance.

The Charter Electoral Commission,

Abhinav619, Borschts, Iwuala Lucy, Tochiprecious, Der-Wir-Ing

MediaWiki message delivery (talk) 17:53, 18 July 2024 (UTC)

It's my understanding that the vote fell a bit short of the 2% quorum requirement as well. DCDuring (talk) 19:34, 19 July 2024 (UTC)

Deprecating MediaWiki:Gadget-Navigation popups


This gadget does not work correctly and has a replacement in the form of Page Previews (see #User:Ioaxxere/PagePreviews.js). If no one objects I will remove it from the gadgets list. Ioaxxere (talk) 04:03, 19 July 2024 (UTC)

As someone who uses Navigation popups, I notice the following differences: your popups do a better job of creating a preview of definitions for entries, but don't create previews for other pages, and lack the suite of other links that the 'Navigation popups' have in the "actions" dropdown, which I use to do things like get a preview of—or click through and go directly to—the edit history of the page upon hovering over the link (instead of having to click the link to go to the page, and then click the "history" tab and go that page, and then go back to my watchlist or the recent changes feed and do that for the next entry that catches my eye). I also like to use Navigation popups to preview diffs. So, I object to removing them (for now). The ideal solution IMO might be to incorporate the full functionality of the Navigation popups into your popups, but other (possibly easier?) solutions might be to make one or the other shift where it pops up so that they could just both be used. - -sche (discuss) 20:59, 19 July 2024 (UTC)Reply
@-sche: Unfortunately there's no way I can incorporate the full functionality of Navigation Popups into my gadget while still maintaining its minimalist aesthetic. Also, it seems like Navigation Popups has virtually no restrictions on what can be displayed, whereas I would like to ensure that the previewed content actually makes sense — so I won't try to match it on that front, either. But if there's something specific that I could add which would get you to switch, please let me know. Ioaxxere (talk) 05:35, 20 July 2024 (UTC)Reply
@-sche: Do you mind if I move this gadget into some userspace, since being actually functional ought to be a basic requirement for gadgethood? Ioaxxere (talk) 02:45, 31 July 2024 (UTC)Reply
It is actually functional: you hover over a link to a history page, it generates a useful preview of the recent history; you hover over a diff link, it generates a useful preview (failing if the diff is convoluted; truncating if it's long); you hover over a link to a mainspace page, it generates a preview (with definitions) some (I would need to test and quantify how much) of the time; and it always generates the suite of links to click to go directly to the edit window, etc. It'd be good to improve it further, but it's more functional than a number of our gadgets (most famously aWa) which don't even load in most skins. (I'd move or at least copy the red "this is broken" notice from the popups gadget to aWa accordingly, which might inspire someone to repair it.) I think it'd be ideal if one gadget combined the functionalities of both this gadget and the new gadget (or offset their windows so they were at least both usable at once), but at the moment they seem to do different things, and one is not a replacement for the other: the new gadget reliably generates previews of mainspace entries, but doesn't preview other kinds of links and pages; the old gadget's previews of mainspace pages are sometimes useless, but it generates previews of the other aforementioned kinds of pages, and its titular navigation links. - -sche (discuss) 05:15, 31 July 2024 (UTC)

Admin abuses


To admins: I opened a discussion on a problematic administrator at Wiktionary talk:Administrators#Admin abuses rv rights. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:05, 19 July 2024 (UTC)Reply

synonyms, antonyms ▲


Do you guys use these buttons to hide and show inline synonyms? I'm thinking of removing these (at least on mobile) as they create visual clutter for seemingly little benefit. I don't think we should even have so many nyms on a single definition that someone would want to collapse them. Ioaxxere (talk) 19:37, 19 July 2024 (UTC)

I strongly prefer having them hidden (which is what my preferences are set to) and I'd prefer having them hidden by default. I think there's more clutter added to entries by the -nyms themselves. I would be very opposed to removing the collapsibility feature. Andrew Sheedy (talk) 20:20, 19 July 2024 (UTC)Reply
Collapsed by default would be fine. (Thus, set to collapse syn and ant and cot by default, just as hyper and hypo and mer and hol are already set to collapse by default.) The option to unhide should not be taken away, though. Its button could be as unobtrusive as anyone likes, so long as it exists. It is true that list length can sometimes be trimmed via the final list item being a Thesaurus entry, and that is always nice to do when we get a chance. But it is not always practical, and there should not be any banning of semantic relations links. Quercus solaris (talk) 22:08, 19 July 2024 (UTC)Reply
I prefer them visible/expanded by default, because having them collapsed makes them hard to find even if you're an adept user actively looking for them, so it surely makes them unnoticeable for many people who would be interested in them but don't realize to look for them: there was a discussion semi-recently (which I can't relocate at the moment) about no longer collapsing coordinate terms, altform-inline, etc, because when they used to be collapsed, even I had gone to entries which I thought I added e.g. a coordinate term to, hadn't seen it (because it was collapsed and the 'expand' button was small and easily overlooked), hadn't found it when Ctrl-F-ing, and only noticed when I went to edit the wikitext of the page to add it that it was already there in the wikitext, just hidden from view. But since some people prefer having them collapsed, to me it seems most reasonable to let people individually opt in to collapsing content they don't want (vs relying on people to notice content is being hidden and specifically opt-out of it being hidden).
I notice that on at least some mobile devices the buttons seem to be the same size and font/colour as the definitions; perhaps we could change that to make the buttons look like the distinct thing that they are, so the entry was less of an undifferentiated lump of clutter? In general our mobile interface seems suboptimal, e.g. it's nonobvious how to get to the page history, and I'm not sure if my phone screen is just too small for the little 'circles' to display that allow selecting individual revisions to compare, but they don't display for me on my phone, though I see them if I access the mobile version of the site from my computer. (In turn, our desktop interface is uncompact, with wasted whitespace; one idea there would be to make the "synonyms:", "antonyms:" etc text stand out in some way and then optionally put them all on one line, "synonyms: foo, antonyms: bar, coordinate terms: baz".) - -sche (discuss) 22:16, 19 July 2024 (UTC)Reply
Agreed. That last idea would be A-OK in my opinion. A follow-up to my comment above. Wiktionary serves various user personas, and that's A-OK. Everyone from (1) people who barely even speak a language and don't want anything except (what Collins would call) a gem definition, to (2) people who can handle all of the semantic relations and would be well-served by having the option to see them, even if they are hidden by default (in service of the aforementioned users who either don't want or can't handle them). The knowledgeable users can simply unhide them if they choose to do so. The button to unhide could be unobtrusive but also should not be such a desperately hidden Easter egg that serendipitous discovery becomes unlikely. A nice balance can be struck. Quercus solaris (talk) 22:23, 19 July 2024 (UTC)Reply
I don't understand why we have individual [synonyms/antonyms] buttons on every sense for users who have them expanded by default. It's just clutter. There is no need to offer the capability to hide the nyms of one individual sense and leave all others expanded. I think the [synonyms/antonyms] buttons should only be visible to users who have chosen, via the sidebar toggle, to collapse these nyms by default. This, that and the other (talk) 04:12, 22 July 2024 (UTC)Reply

Applying {{ux}}

Could someone create a bot to automatically replace raw markup with {{ux}} in entries? This would have the benefit of allowing users to easily change the appearance of usage examples if desired as well as make it easier to extract structured data from Wiktionary. Maybe @JeffDoozan would be interested. Ioaxxere (talk) 23:13, 19 July 2024 (UTC)Reply

You need to be careful of this and {{co}}. Vininn126 (talk) 23:17, 19 July 2024 (UTC)Reply
Yes, and this request would have been easily implemented before the creation of {{co}}. I always manually apply the templates whenever I spot uncategorized usage examples, collocations, quotations, etc. Inqilābī 15:41, 20 July 2024 (UTC)Reply
Sounds good. Vininn126 (talk) 15:55, 20 July 2024 (UTC)Reply
My bot already does this for unambiguous bare UX entries that use the formatting recommended by WT:UX. JeffDoozan (talk) 15:54, 20 July 2024 (UTC)Reply
@JeffDoozan: Thank you, I did recall it was you. Do you know why your bot seems to have missed attire#Noun? Ioaxxere (talk) 16:22, 20 July 2024 (UTC)
@Ioaxxere: attire#Noun is ignored for not following the WT:UX guideline: "Example sentences should ... not contain wikilinks (the words should be easy enough to understand without additional lookup)". JeffDoozan (talk) 16:26, 20 July 2024 (UTC)
@JeffDoozan That text seems to have been present from the earliest revision of the page and doesn't seem to reflect current practice. I'll start a discussion about removing it. Are there any other usage example formats that your bot is ignoring? Ioaxxere (talk) 16:39, 20 July 2024 (UTC)Reply
@Ioaxxere: The bot is very careful to convert only text that is unambiguously a UX: sentences that are completely enclosed in italics, contain exactly one bolded item, do not contain wikilinks, do not contain templates, start with a capital letter A-Z, end with punctuation mark "." "!" or "?", do not contain "sibling" text at the same indentation level (except expected templates like ux, syn, coi, etc), do not contain any text or templates at a deeper indentation level except for non-English sections that may contain a single translation one level deeper that is italicized and contains bolded text. JeffDoozan (talk) 17:10, 20 July 2024 (UTC)Reply
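The sentence-level part of the criteria listed above can be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code: the function name `is_unambiguous_ux` and the sample sentences are made up, and the checks on sibling text and deeper indentation levels are left out entirely.

```python
import re

# Matches one bolded span in wikitext: '''word'''
BOLD = re.compile(r"'''(.+?)'''")

def is_unambiguous_ux(line: str) -> bool:
    """Return True if a bare wikitext line passes the sentence-level checks.

    Hypothetical helper: it covers only the per-sentence criteria and
    ignores sibling text and indentation.
    """
    text = line.strip()
    # Completely enclosed in italics: ''...''
    if len(text) < 5 or not (text.startswith("''") and text.endswith("''")):
        return False
    inner = text[2:-2]
    # No wikilinks and no templates
    if "[[" in inner or "{{" in inner:
        return False
    # Exactly one bolded item
    if len(BOLD.findall(inner)) != 1:
        return False
    plain = BOLD.sub(r"\1", inner)
    # Leftover quote markup means the italics were not one clean span
    if "''" in plain:
        return False
    # Starts with a capital A-Z and ends with ".", "!" or "?"
    return re.fullmatch(r"[A-Z].*[.!?]", plain) is not None

# Made-up sample lines, not taken from any entry
print(is_unambiguous_ux("''She wore formal '''attire''' to the gala.''"))
print(is_unambiguous_ux("''She wore [[formal]] '''attire''' to the gala.''"))
```

The second sample is rejected only because of its wikilink, which is the same reason given above for skipping attire#Noun.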
@JeffDoozan: It seems like periphery (corrected by me in diff) follows all these rules yet wasn't converted by your bot. Ioaxxere (talk) 19:46, 6 August 2024 (UTC)Reply
@Ioaxxere It turns out I never ran the bot on English entries. It's running now and should clean up about 14k bare UX items. JeffDoozan (talk) 20:01, 11 August 2024 (UTC)Reply
@JeffDoozan: Wow, thank you very much! Ioaxxere (talk) 20:03, 11 August 2024 (UTC)Reply

CAT:Dialects

I propose that this category be deleted. All dialects are languages in their own right, and all languages are ultimately dialects as well; hence a distinction between languages and dialects in our categories is unnecessary and is a source of chaos due to the lack of well-defined criteria (compare Old Icelandic, Old Novgorodian, the Prakrits, Arabic lects, Chinese lects, etc.). Further, many language varieties, as opposed to real dialects, are wrongly categorized as dialects (Italian English, Korean English, Sri Lankan English, etc.). On the other hand, if we are ready to systematically draw a line between them (say, by cleaning up the categorization, or by categorizing as both a language and a dialect), we would be justified in creating more linguistic categories, namely for Regiolekte, creoles, etc. Also, I noticed that lots of lect names are sums of parts, which should actually be deleted. What do you think? Inqilābī 15:30, 20 July 2024 (UTC)

Sister project boxes

Today I noticed that water has sister project boxes for Wikipedia, Commons, Wikiquote, and Wikiversity. Do we really need all those? Ioaxxere (talk) 16:28, 20 July 2024 (UTC)Reply

This is why variants of these templates exist that can be added under the Further reading section (e.g. {{pedialite}}). A hot take is that we should do that everywhere and retire floating sister project boxes from entries entirely. — SURJECTION / T / C / L / 22:37, 20 July 2024 (UTC)Reply
@Surjection I would   Support this for the floating boxes other than {{wikipedia}}. I think that one can still be useful. Ioaxxere (talk) 02:40, 21 July 2024 (UTC)Reply
That is a hot take. I personally don't have a strong opinion, but I do feel we should synchronize it perhaps through a wikidata entry, since often many things are kept there. Vininn126 (talk) 12:48, 21 July 2024 (UTC)Reply
Yes this is IMO totally pointless. I think we should in general ban project boxes for anything other than Wikipedia and in some cases Commons. Note that topic category pages are more likely to have Commons links than mainspace lemmas (AFAICT), which is probably fine. Benwing2 (talk) 01:09, 23 July 2024 (UTC)Reply

Wiktionary:Example sentences

I propose removing the point: Example sentences should [...] not contain wikilinks. This text was added in a 2007 vote, but it no longer seems to be followed, particularly in Chinese entries. There are many good reasons to wikify usage examples, even in English: see deafaz for an example. Ioaxxere (talk) 16:39, 20 July 2024 (UTC)

  Support. As with linking in definitions, the criterion should simply be that useful links are welcome and overlinking (a pointlessly excessive degree) will be pruned back. As with pruning bushes, no need for overpruning. Quercus solaris (talk) 17:56, 20 July 2024 (UTC)Reply
@Ioaxxere: I don't see how the deafaz entry illustrates a problem. Could you explain it? DCDuring (talk) 20:20, 20 July 2024 (UTC)Reply
@DCDuring: According to the policy, all the wikilinks in the example sentence should be removed, but this would obviously be less than ideal as all of the wikilinked terms are slang terms which are unknown to most English speakers. Ioaxxere (talk) 20:33, 20 July 2024 (UTC)Reply
Maybe change it to something like "Example sentences should [...] use wikilinks only sparingly" or something. Or, yes, just remove it; I don't think we want people to start wikilinking every word or even just most words, but I do agree there are cases where wikilinking is reasonable. - -sche (discuss) 20:39, 20 July 2024 (UTC)Reply
I think that would be preferable. The original idea was that usexes were supposed to be simple enough to actually illustrate the word to someone who didn't know it, not be the cause for looking up more words. We should continue to discourage wikilinking usexes, while being flexible when it makes sense. Andrew Sheedy (talk) 22:04, 20 July 2024 (UTC)
Yes, I agree. As an alternative to blue links, I ended up giving "translations" at we#Etymology 2, but that only makes sense in some contexts (and even then, I'm still unsure if it's the right approach). Theknightwho (talk) 22:50, 20 July 2024 (UTC)Reply
  Support relaxing the prohibition in general, but especially for non-English example sentences. The policy states that the words should be easy enough to understand without additional lookup, but readers of the English Wiktionary should not be assumed to understand other languages, and such wikilinks can be a convenience to language learners. Voltaigne (talk) 21:01, 20 July 2024 (UTC)Reply
  Support but not in blue. Useful for examples and for quotations. _1. Perhaps a l2 (link2) like word dashed, black ? _2. Hoping: that in the future, for Examples and Quotations, double-click links would be available for all words, with manual links provided if linking to a different form. and _3. Quotations linked to dozens of lemmata e.g. an Ancient Greek paragraph linking to 50 pages. A repository of texts?, at least some? Thank you ‑‑Sarri.greek  I 18:12, 21 July 2024 (UTC)
  Oppose at least complete removal. I think links in usage examples (and quotes, but that's not being discussed here) tend to be cluttersome and distracting, drawing attention away from the bolded word in question. I'd actually remove the {{ux}} on deafaz because the quotes do a decent job of showing the word in use and because I should not have to click on another page just to understand what this other word is. I suppose when it comes to foreign languages, things might be different, and exceptions may exist regardless, so perhaps this can be lightened at least by adding "generally" before "not". I also like Sarri.greek's suggestion of being able to double-click on any word or term as needed. -BRAINULATOR9 (TALK) 22:30, 21 July 2024 (UTC)Reply
  Support. Imetsia (talk (more)) 22:36, 21 July 2024 (UTC)Reply
  Support. This also needs to be changed at EL: WT:EL#Example sentences. The principle that words used in example sentences "should be easy enough to understand without additional lookup" is a good one and should be kept, perhaps with added guidance at WT:UX that difficult or unusual terms should only be used where it significantly adds to the illustrative value of the example sentence. This, that and the other (talk) 03:20, 22 July 2024 (UTC)
  Support as links can be useful. J3133 (talk) 05:31, 22 July 2024 (UTC)Reply
  Oppose: if a user is looking up deafaz, that implies they're reading/hearing that kind of slang, and they're either already familiar with most of it (in which case the links are mostly useless), or they should start with a basic course (in which case we should link to a basic course). The same goes for foreign languages. (The usex at deafaz could be considered a foreign language, so add a translation into "normal" English rather than a gazillion links.) I really dislike looking up Chinese entries because of the completely pointless sea of blue. MuDavid 栘𩿠 (talk) 07:58, 22 July 2024 (UTC)Reply
The philosophy that slang entries should only cater to those already familiar with it or who have taken a course in it (???) is simply not aligned with reality. No-one is suggesting that other languages should copy Chinese by linking every term, but that doesn't mean we should never have links. It's one of those silly, context-blind policies that's no longer fit for purpose. Theknightwho (talk) 18:58, 24 July 2024 (UTC)Reply
  Comment. It seems to me that two elements of the current policy are in conflict: (i) "the words should be easy enough to understand without additional lookup", and (ii) "place the term in a [typical/representative] context". This is why the example at deafaz runs into trouble: the editor adhered to (ii), and thereby infringed (i). The alternative would be to allow examples to disregard context, resulting in something like, "Past generations of schoolchildren often got a deafaz from their teacher for misbehaving." (an atypical/unrepresentative context). I am open to the idea raised above of providing a 'translation' for examples such as in deafaz and we, rather than hyperlinking.
A slightly different set of cases where conflict between the two existing requirements might arise are examples for highly technical terms, rare terms, archaic/obsolete terms, and so on. Although the two examples at Lebesgue integral don't need 'translation' or hyperlinking.
—DIV (1.129.106.197 13:22, 22 July 2024 (UTC))Reply
  Support. This policy seems written only for English usexes and is not followed at all in non-English usexes. It might make sense to have wording that reflects this; for English usexes it makes sense to only wikilink unfamiliar or "hard" words, but for non-English usexes it should be up to the language community what to do (e.g. Russian usexes tend to link to the lemma of most words, which IMO helps language learners a lot). Benwing2 (talk) 01:06, 23 July 2024 (UTC)Reply
For English usexes I would suggest to wikilink unfamiliar or "hard" words only if it's really impossible to avoid the use of such words in the first place. I agree that it's up to each language community what to do with their non-English usexes. However relying on a simple vocabulary is still probably preferable for all languages. --Ssvb (talk) 05:20, 25 July 2024 (UTC)Reply
  Support per Benwing2, though even some English usage examples need links occasionally, as noted above. Theknightwho (talk) 19:00, 24 July 2024 (UTC)Reply
My understanding is that usage examples are expected to be simple, while quotations are free to flex their language muscles. But the deafaz entry deliberately does everything completely backwards. Just compare:
  • a usex: "Two twos the mandem gave this likkle yute a deafaz."
  • a quotation: "But don't walk in the bikers' lane or you're gonna catch a deafaz to the brain."
Looks like the editors violated the policy on a whim for no good reason and now use this violation incident as a pretext for changing the policy. --Ssvb (talk) 04:57, 25 July 2024 (UTC)Reply
@Ssvb: I think it depends to some extent on what you consider "simple". The usage example (which I didn't write, by the way) is very short and is grammatically and semantically simple. If you're basing your opinion purely on "how many people can understand the example", you could claim that any usage example whatsoever written in an obscure extinct language is impossibly complex. Ioaxxere (talk) 17:58, 25 July 2024 (UTC)Reply
@Ioaxxere: Usage examples in other languages have English translations and this helps a lot. Also usage examples are normally constructed by Wiktionary editors, who are native speakers of that particular language. Does anyone even create usage examples for extinct languages? And if yes, then how do we know that they are actually correct? --Ssvb (talk) 23:55, 28 July 2024 (UTC)Reply
  Oppose: I think that striving for simplicity in usage examples and avoiding links in them is a good policy, so I don't see any need to change it. Is it really impossible to construct a much simpler example for deafaz, so that it doesn't mention the other obscure slang words? Maybe try to follow the https://simple.wikipedia.org/wiki/Wikipedia:How_to_write_Simple_English_pages guidelines for the English usage examples and apply similar rules to the other languages as well? It's fairly easy to implement a Lua script to automatically validate usage examples against the top850/top1500 most frequently used words and highlight the undesired words. --Ssvb (talk) 02:57, 25 July 2024 (UTC)
@Ssvb What you're proposing would effectively obliterate all usage examples in dialects other than standard English. No thanks - I strongly oppose that. Theknightwho (talk) 22:12, 25 July 2024 (UTC)Reply
@Theknightwho What I propose would only change "Two twos the mandem gave this likkle yute a deafaz" into something like the same sentence with its difficult or uncommon words highlighted.
Automatically highlighting the difficult/uncommon words and giving the editor a chance to come up with a different simpler example right on the spot. And also adding the uncorrected entries into the category of "excessively complicated usexes" to allow monitoring the policy compliance for the whole dictionary. Of course, the template could also have an option to suppress this highlighting in special cases.
In what way does this "obliterate" anything? Is there anything special about dialects? Why would you want to use uncommon words in the usage examples specifically for dialects? The headword itself obviously is allowed to be an uncommon word and it's already shown as bold. Alternative spelling variants of common top1500 words, such as "color" vs. "colour", are obviously also allowed. A real example would be very much appreciated to see what you are worried about. --Ssvb (talk) 23:04, 25 July 2024 (UTC)Reply
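The kind of check described above could look roughly like the sketch below. It is written in Python rather than Lua, and everything in it is an assumption for illustration: the tiny `COMMON_WORDS` set stands in for a real top-850/top-1500 list, the `<mark>` tags are a placeholder for whatever highlighting a template would use, and spelling variants such as "color"/"colour" are not handled.

```python
import re

# Tiny stand-in for a top-850/top-1500 list (hypothetical sample, not a real list)
COMMON_WORDS = {"a", "the", "this", "two", "gave", "or", "to", "in", "walk"}

WORD = re.compile(r"[A-Za-z']+")

def is_allowed(word, headword, common=COMMON_WORDS):
    """A word passes if it is the headword or is on the common-word list."""
    w = word.lower()
    if w == headword.lower() or w in common:
        return True
    # Naive plural handling: "twos" is treated as "two"
    return w.endswith("s") and w[:-1] in common

def flag_uncommon(usex, headword, common=COMMON_WORDS):
    """List the words in a usage example that are not on the common list."""
    return [w for w in WORD.findall(usex) if not is_allowed(w, headword, common)]

def highlight(usex, headword, common=COMMON_WORDS):
    """Wrap each flagged word in placeholder <mark> tags."""
    def mark(match):
        w = match.group(0)
        return w if is_allowed(w, headword, common) else "<mark>" + w + "</mark>"
    return WORD.sub(mark, usex)

usex = "Two twos the mandem gave this likkle yute a deafaz."
print(flag_uncommon(usex, "deafaz"))
print(highlight(usex, "deafaz"))
```

Because the check works one word at a time, a multi-word expression built from common words (here "two twos") passes unflagged in this sketch.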
@Ssvb It's the part where you'd have us suggest that usage examples only include one dialectal term at a time that I have a problem with. That's not how dialects work in real life, so in many cases it would be actively misleading to make the substitutions that you're suggesting. "Suddenly the gang gave this little youth a deafaz", which is what you're arguing for, is something nobody would actually say, because it's an unnatural mix of registers. Dialects are not merely the standard language with a few strange words sprinkled in here and there. Theknightwho (talk) 23:29, 25 July 2024 (UTC)Reply
@Theknightwho The current WT:UX policy already suggests including one term at a time: "Care should be taken that example sentences for rare or technical terms avoid including additional rare or technical terms". Do you have a real example of a real usex, where this policy is actually problematic? The deafaz entry isn't one of them, because even its quotations are simpler than its usex. Needless to say that the quotations are not restricted in terms of grammar or vocabulary, not to mention that they are used for attestation. --Ssvb (talk) 23:49, 25 July 2024 (UTC)Reply
@Ssvb These terms aren't rare or technical if you're familiar with the relevant dialect, so no, I don't agree that that applies. That policy has to be interpreted in a contextual way. Plus, looking at the citations for deafaz, I see numerous "common" terms being used with dialectal or slang senses: caption, heat, trapping etc. You yourself didn't even spot two twos in the usage example. Again, the only effect of what you're proposing would be to sanitise usage examples given in dialects other than standard English. Theknightwho (talk) 23:57, 25 July 2024 (UTC)Reply
@Theknightwho: Do you mean that my Lua script didn't spot two twos? That's its current design limitation, but I would argue that this isn't critical for its mission. BTW, I didn't suggest "Suddenly the gang gave this little youth a deafaz". My suggestion was to consider something like "Don't walk in the bikers' lane or you're gonna catch a deafaz to the brain", which was a part of one of the quotations. It's up to the native speakers to pick a good and natural-sounding usage example, but I hope that they can make an honest effort to come up with something reasonably simple. Also don't forget that it is me who prefers to keep the existing policy, while you belong to the group that is in favor of changing it. Your comments somehow strangely insinuate that it's allegedly the other way around. --Ssvb (talk) 01:18, 29 July 2024 (UTC)
@Ssvb I'd say that's because my view reflects existing practice, really, irrespective of what the letter of the policy says. Theknightwho (talk) 01:22, 29 July 2024 (UTC)Reply
  Support: It's basically established practice to do this for Koreanic languages. AG202 (talk) 00:17, 10 August 2024 (UTC)Reply
  Support Many languages on Wiktionary have just ignored this rule for years. I know this is not a court but I feel like the legal concept of Desuetude applies here. — BABRtalk 07:05, 10 August 2024 (UTC)Reply
  Support: They're useful, and I find the arguments against them uncompelling. — excarnateSojourner (ta·co) 03:00, 20 August 2024 (UTC)Reply

Transcriptions of the nurse vowel

Can someone remind me why Appendix:English pronunciation uses /ɜɹ/ for GenAm but many entries use /ɝ/? Was the recommendation revised after a bunch of entries had inherited the older recommendation? No big deal, I am just idly curious about it. Quercus solaris (talk) 18:03, 20 July 2024 (UTC)Reply

I don't know if it [the appendix] was intended as a firm "/ɜɹ/ good, /ɝ/ bad" declaration; there was a similar discussion of /əɹ/ vs /ɚ/ recently; each option (/ɜɹ/ vs /ɝ/) has arguments for it, maybe we just need to take a straw poll / vote about which to use. I will note that we almost never notate /ɑ˞/ or /ɔ˞/, it's always V+ɹ, so /ɜɹ/, /əɹ/ would be consistent with that and would require positing fewer phonemes / using fewer symbols (which is I suspect why the Appendix is set up like it is). In any case we should definitely add footnotes that /ɜɹ/ and /ɝ/ (and likewise for /ɑ˞/, etc) are the same phoneme, so anyone reading one work (or entry!) that has one, and another work or entry that has the other, knows they're not separate phonemes in English. - -sche (discuss) 20:47, 20 July 2024 (UTC)Reply

The Spanish salle problem

Is there some way we can amend the selected combined forms table at salir so that it doesn't claim that the forms salle (et al.) don't exist? I appreciate that this stems from a well known problem with the Spanish orthography that renders these forms "unwriteable", since the morphemes are "sal-le" (/ˈsalle/), which clashes with the orthography rules that dictate it should be read "sa-lle" (/ˈsaʝe/), but that's something we should be explaining in a footnote. What we shouldn't be doing is brushing it under the carpet by pretending they don't exist, since it's misleading to learners, who may well encounter these forms in speech; particularly given that salir is a really basic verb. Theknightwho (talk) 03:45, 21 July 2024 (UTC)Reply

I would support a usage note, and we can make entries for salle and such, if they're attestable, but they'd have to have a nonstandard label. However, while salir is a common verb, the usage that an imperative salirle brings is not nearly as common, and in this specific instance many choose to rephrase it as this page explains.
  • Looking at the RAE's historical corpus, salir gets 63857 hits, while salirle gets 314. Searching for sálgale gets 3 hits and salile gets me 0. salle gets 673, but the vast majority of them are pre-1600s and the usage is not the same as what's being discussed. After the 1600s, the only hits are clear cases where "salle" is part of the name. There are no hits for sal-le (and the RAE does keep track of hits with dashes). The CORPUS XXI also has no hits for sálgale, salile, nor sal-le, and the salle hits are all names of something.
  • Moving to Google Books, the queries are a bit harder to find, so I'll be using the construction with al encuentro behind it to illustrate. Out of the searches for salle al encuentro, only one is an actual use, with the others being a mention or explicitly talking about the fact that you can't write it. Sálgale al encuentro only has 3 pages of hits. Salile al encuentro has slightly more, but some of those usages are clearly salí (1p preterite) + le from back when pronouns could be added to indicative verbs as enclitics. Finally, sal-le al encuentro paints the same picture, as there are only one, maybe two, genuine uses for it in running text (with the others either simply repeating the text or talking about the phenomenon). Even when I expand the search to sal-le al, I get maybe one more use.
As such, overall, I don't think this is really that serious of an issue, and I highly doubt that a learner would come across it in running text. I doubt that they're likely to hear it in speech either considering how rare the other forms are as well. Folks clearly took the "you can't write this in Spanish!" headline and ran with it, but the data shows that it's not common at all. Honestly I'm not even sure if "sal-le" is attestable under CFI at this rate, and if it is, it'd be exceedingly rare. It's more trivia than anything. AG202 (talk) 05:46, 21 July 2024 (UTC)Reply
Knight did say that the term was encountered in speech more than in writing, so maybe there are more uses in recorded speech (like talk shows and vinyl recordings), although I do acknowledge that the data suggests that this form is rare.
(Also, do we need one use of the nonlemma form to make an entry, or three?) CitationsFreak (talk) 08:06, 21 July 2024 (UTC)Reply

German Low German and Low German

First of all, there's no such thing as "German Low German"; there's just one Low German, with western and eastern varieties. I've seen some people explain that "Low German" is the Low German spoken in the Netherlands (even if that were true, why not just call it Dutch Low Saxon, since it already has a name?), but then I notice that we classify Westphalian as German Low German (and it is spoken in both Germany and the Netherlands), so what's Low German and what's German Low German? It seems like no one really knows: I've seen people add Low Prussian entries (a dialect of colonial east German) sometimes as Low German and sometimes as German Low German. My proposal to fix this issue is to merge Low German with German Low German, and split it into "East Low German" or "Low Saxon" and into "West Low German". I'm not good at the technical side of Wiktionary, but I think it's also possible to automatically label entries as east or west: when someone labels an entry as Low Prussian, for example, it would automatically become East German, so we wouldn't need to write "East German, Low Prussian". Rakso43243 (talk) 13:32, 21 July 2024 (UTC)

  Support. Label aliases are a thing, yes, so we can make it that if someone writes Low Prussian it'll be categorised as East Low German. -saph668 (usertalkcontribs) 13:42, 22 July 2024 (UTC)Reply
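As a rough illustration of the idea (a label automatically implying its parent grouping), here is a minimal Python sketch. The data shape and the names `LABEL_PARENT` and `resolve` are hypothetical; Wiktionary's actual label data lives in Lua modules and is structured differently. The dialect groupings used are the ones mentioned in this discussion.

```python
# Hypothetical mapping from a dialect label to the grouping it implies
LABEL_PARENT = {
    "Low Prussian": "East Low German",
    "Mecklenburgisch": "East Low German",
    "Westphalian": "West Low German",
    "Dithmarsisch": "West Low German",
}

def resolve(label):
    """Return (grouping, label); grouping is None for an unknown label."""
    return (LABEL_PARENT.get(label), label)

# An editor writes only "Low Prussian"; the grouping is filled in automatically
print(resolve("Low Prussian"))
```

With a table like this, an editor would type only the dialect name and the east/west grouping would follow from the data rather than from a second manually added label.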
I've been asked for an opinion. Based on what I've seen on Discord, the current classification does not make sense. I'm not an expert, and that's all I can really say on the subject matter. Vininn126 (talk) 15:55, 22 July 2024 (UTC)Reply
  Support Many people wanted to have it, or even merge all as “Low German”, over the last decade, just didn’t care enough about systematic or coherent treatment of the language. The only reason why we have these langcodes is some kooky database we drew our language data from, when information was not easily accessible, in the 2000s, whence there have been proper ghost languages on Wiktionary. Fay Freak (talk) 21:16, 22 July 2024 (UTC)Reply
I tried to push for codes for West Low German and East Low German some while back, but with little support. --{{victar|talk}} 23:15, 22 July 2024 (UTC)Reply
  Support eliminating Low German; the current three-way division makes no sense. Can you elaborate your proposal more? What would happen to Dutch Low Saxon? Would "East Low German" and "West Low German" be etym-lang variants of Low German or separate L2 languages? Also how would we merge the lemmas? Merging templates across languages can be tricky depending on the particular languages, and generally requires someone who knows the grammar of the languages in question. Benwing2 (talk) 01:01, 23 July 2024 (UTC)Reply
My initial idea was to merge German Low German, Low German and Dutch Low Saxon as "Low German", which would be correct from a linguistic point of view but nearly impossible from a technical point of view. After consideration, I think making them separate L2s would be the best way to fix the problem. For example, Low Prussian has around 10 dialects, so if we merged all dialects of Low German into one language, we would need to cover the phonology of all 10 dialects. Now imagine what a Low German entry would look like: there would be at least 50 IPA transcriptions from all the dialects and subdialects, a trillion definitions depending on the dialect, and even more alternative forms; and don't forget that some dialects keep the genitive case and others have the ablative, so making grammar templates would be a horror. A West and East division wouldn't completely eliminate the problem, but it would reduce it.
The main problem with Dutch Low Saxon is the Dutch influence (orthography, vocabulary, grammar, etc.), but if we kept Dutch Low Saxon as a separate L2, why wouldn't we make, for example, East Pomeranian a separate L2 as well? For this reason I think we should merge Dutch Low Saxon with West Low German (plus, the Dutch Low Saxon entries are very low quality: they don't have any quotations and there are only 100 of them, so it wouldn't be a great loss).
To sum up, I propose merging German Low German and Dutch Low Saxon into Low German, and then separating them into East and West Low German, which would be distinct L2s. Rakso43243 (talk) 08:23, 23 July 2024 (UTC)
Having West Low German (WLG) and East Low German (ELG) as separate languages is a bad idea: Dithmarsisch (part of WLG, like from Klaus Groth) and Mecklenburgisch (part of ELG, like from Fritz Reuter) for example are more similar to each other than any of them to Westphalian (part of WLG), especially to South and East Westphalian, because of the westfälische/Westfälische Brechung... — This unsigned comment was added by 2a01:599:642:988c:68c6:f6ef:961:2f6e (talk) at 17:51, 23 July 2024.
Oppose merging Dutch Low Saxon and Low German. (That's the part I have a strong view about.) It is true that you can still argue that they form a kind of dialect continuum (see for example [6]), but for about a century they have drifted apart quite strongly under the influence of the respective official languages Dutch and High German. This is most obviously the case for the lexicon and orthography, but I suspect also for phonology, morphology, and syntax. In this case the national border has become a hard linguistic border. —Caoimhin ceallach (talk) 21:24, 31 July 2024 (UTC)

Am I allowed to correct Tahitian entries to Wikt-consensus orthography?

If you look at Category:Tahitian lemmas, you'll see that we use U+02BB for the letter ʻeta, apart from a couple short words that begin with ʻeta that used an ASCII apostrophe and that I recently copied over. However, when I copy-pasted 'e to ʻe, admin Chuck Entz got angry, accusing me of appointing myself "judge, jury and executioner for other people's hard work", and saying that I need to either reference a discussion permitting this or get consensus for the obvious. So here I am.

BTW, AFAICT the Tahitian govt hasn't decided whether ʻeta should be encoded as U+02BB or as U+02BC (neither is a perfect match), but it looks like we don't use 02BC here on Wikt. kwami (talk) 22:05, 22 July 2024 (UTC)Reply

@Kwamikagami: Since neither is a perfect match for it, are you aware of any proposal to add the properly rotated ʻeta to Unicode? 0DF (talk) 07:32, 23 July 2024 (UTC)Reply
AFAIK it's considered a graphic variant, to be handled by a Tahitian font, not a distinct character for the purposes of Unicode. But there seems to have been no official decision on which character to use. kwami (talk) 12:11, 23 July 2024 (UTC)Reply
@Kwamikagami: Here's my review of the orthographies of the related Polynesian languages that have /ʔ/:
Other things being equal, similar orthographies in cognate languages are to be preferred because it increases the likelihood that cognates will be homographic. Since the saltillo is apparently not an option for Tahitian and since none of Tahitian's cognate languages use ⟨ ʼ ⟩ (the modifier letter apostrophe, U+02BC) for /ʔ/, we have that reason to prefer ⟨ ʻ ⟩ to ⟨ ʼ ⟩ for /ʔ/ in Tahitian. There is the accessibility issue that both ⟨ ʻ ⟩ (U+02BB MODIFIER LETTER TURNED COMMA) and ⟨ ʼ ⟩ (U+02BC MODIFIER LETTER APOSTROPHE) are generally more difficult to input than ⟨ ' ⟩ (U+0027 APOSTROPHE), but given that Tahitian also uses the tārava (macron), which faces similar accessibility issues, that isn't a serious problem. However, it may be worth having hard redirects from otherwise-blank pages using bare vowels and U+0027 APOSTROPHE to pages spelt with the tārava and the ʻokina to catch the search queries of those who can't input the latter. 0DF (talk) 09:07, 24 July 2024 (UTC)Reply
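The redirect idea above can be sketched in code. This is only a minimal illustration, not an existing Wiktionary tool: the function name `ascii_fallback_title` is hypothetical, and it assumes that entry titles spell the glottal stop with U+02BB and mark long vowels with a macron. From such a title it derives the plain spelling (bare vowels, U+0027 APOSTROPHE) at which a redirect could be placed.

```python
import unicodedata

OKINA = "\u02bb"       # MODIFIER LETTER TURNED COMMA, used in entry titles
APOSTROPHE = "\u0027"  # ASCII APOSTROPHE, what most keyboards produce
MACRON = "\u0304"      # COMBINING MACRON (the tarava, once decomposed)


def ascii_fallback_title(title: str) -> str:
    """Return the easy-to-type spelling of a title (hypothetical helper)."""
    # Decompose so that each macron becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the macrons, leaving bare vowels.
    without_macrons = decomposed.replace(MACRON, "")
    # Recompose whatever other accents remain, then swap the glottal-stop letter.
    recomposed = unicodedata.normalize("NFC", without_macrons)
    return recomposed.replace(OKINA, APOSTROPHE)


if __name__ == "__main__":
    for title in ("mou\u02bba", "t\u0101rava", "\u02bbe"):
        print(ascii_fallback_title(title), "->", title)
```

A redirect would only be wanted where the fallback spelling differs from the title and the fallback page is otherwise blank; that check is left out of the sketch.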
I picked a word at random, mouʻa, to see how we've been handling this. The page history has:
(Chuck Entz talk moved page mou'a to mouʻa without leaving a redirect: ʻokina)
Which is precisely what he said I needed discussion for, because as far as he knew there was no consensus to do this. kwami (talk) 18:13, 24 July 2024 (UTC)Reply
@Kwamikagami: You can't attribute emotion to others ("admin Chuck Entz got angry") based on a text message. This makes it feel like you're trying to paint them as being irrational. Also, you're misrepresenting the context, which is the fact that you moved three language sections, and then you're disregarding that context here and pretending that you had only moved the Tahitian language section and then saying that it's "obvious". That makes it feel like you're trying to downplay the impact of your edits. --kc_kennylau (talk) 12:07, 23 July 2024 (UTC)Reply
I'm not contesting the other two, so only brought up Tahitian. That's the one where there is a clear consensus, but I don't want to continue to act on that consensus because I was reverted. If in the future I want to do more with Guarani or Tupi, I'll bring that up then.
He certainly did seem angry to me.
But could you answer the question? Can I follow Wiktionary consensus on Tahitian orthography? kwami (talk) 12:15, 23 July 2024 (UTC)Reply
Without starting a discussion how these should be handled before making these wide-sweeping actions. I believe kwami does more harm to the site than good and has not shown any improvement on cooperativeness or anything, and I honestly think we should discuss a ban. Vininn126 (talk) 12:10, 23 July 2024 (UTC)Reply
These were not "wide-sweeping" actions. A few edits, which when reverted I either conceded or came here to discuss. How is my coming to the Beer parlour reason to believe that I'm not being cooperative? That's what I've been instructed to do in the past: if there's disagreement on something I want to do, I should come here for a decision before doing anything more. Now you're effectively saying I should be banned because I came here. With one edit, I mentioned the consensus orthography to an editor who reverted me, and they said to go ahead and restore it, but I didn't want to do that without getting an okay here first. That pretty much defines 'cooperative' in my mind. kwami (talk) 12:20, 23 July 2024 (UTC)Reply
Typical behavior. Downplay your actions and make yourself seem innocent. I'm not going to get emotionally involved, I'm merely stating that I see 1) you made massive moves without consulting ANYONE 2) when confronted about it, you accused the other person of being angry while also downplaying your actions. That's how I, an outside observer, see the situation. No matter how heated you get I will not change that stance. I've never seen you take responsibility for your actions. You promised to clean up a lot of bad characters like "Old Slavic" and then never did, therefore your "cooperation" is meaningless, and the result of your editing is a mess that is harmful to the site. Vininn126 (talk) 12:26, 23 July 2024 (UTC)
"Massive moves"? We're talking about half a dozen articles, all involving Tahitian, where I was clearly following consensus. If I was inappropriate in my language above, it was because I was annoyed that Chuck could have verified in seconds that those moves were consistent with all other Tahitian entries.
I don't recall promising to clean up "Old Slavic", though I did recently change half a dozen to "Old Church Slavonic", after I'd satisfied myself that's what was intended by my source. (I didn't want to claim it was OCS at first, as I wasn't sure that the sources were referring specifically to that language.) Anyway, if there are other cases of "Old Slavic" that need to be fixed, please let me know, or show me where I made that promise so I can try to reconstruct it from there. kwami (talk) 12:35, 23 July 2024 (UTC)Reply
Vininn126 is referring to the letters which were labelled "Early Slavic", and you only did that after I deleted them for having invalid language headers after many months of no clean-up. Given that you had also used the mul templates, it was entirely unclear what the correct language was supposed to be. Theknightwho (talk) 20:26, 23 July 2024 (UTC)Reply
He's referring to a promise I made, but I don't know when/where that was. I don't recall any notification about those articles until you deleted them, which caused them to pop up on my notice or alert list (because I created them). I don't know if there are others, or how I would have known about them in the meantime. kwami (talk) 21:47, 23 July 2024 (UTC)Reply
I'm sensing a pattern here. Theknightwho (talk) 18:52, 24 July 2024 (UTC)Reply
Okay, Chuck clarified that it was not the moves that he objected to, but rather my lack of edit-summaries, because that obscured the authors of the material. I've now added empty edit summaries at both the source and target articles to clarify, and removed the duplicate material. kwami (talk) 20:49, 4 August 2024 (UTC)Reply

Admin abuses rv rights

Discussion moved from Wiktionary talk:Administrators.

Not sure if this is the right place to report the issue, but Fenakhay (talk • contribs), despite their great contributions, shows a pattern of randomly reverting other users’ additions to entries in various languages without explanation, while ignoring the legitimate questions as to why posted by these users on their talk page – unless one insists, as I have done, and only to get bitter replies; check the page for yourselves. I have made mistakes myself in the past here, but this is hardly in the spirit of collaboration of this project, even less so if it regularly comes from an admin. I wonder if other administrators can do something about it. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 20:38, 18 July 2024 (UTC)

@Benwing, Chuck Entz, EncycloPetey, Hippietrail, Paul G, Ruakh, SemperBlotto, Surjection – tagging the bureaucrats lest this goes unnoticed. The user is continuing the practice and no one is stopping them. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:41, 21 July 2024 (UTC)Reply

Fenakhay's far from the only admin who's abusive. On this project, some of the worst-behaved editors are the admins. They also have a bad habit of never holding each other accountable. Purplebackpack89 13:51, 21 July 2024 (UTC)Reply
Gosh, that doesn’t sound very reassuring, and at least a couple of bureaucrats seem to be ignoring this talk despite notification. Let’s just hope for the better though. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 14:50, 21 July 2024 (UTC)Reply
I'm unsure what you expect to happen. It sounds as though you are unaware of the job responsibilities of bureaucrats. They are not the police. They have the ability to grant or remove administrator rights from individuals, but they do not do so because one person makes a claim that an admin is abusing their privileges. The community would have to make that decision, then request a bureaucrat to make the change. --EncycloPetey (talk) 16:31, 21 July 2024 (UTC)
I expect fellow admins to look into the user’s behavior and decide what to do about it. Of course I don’t expect any of you to just remove their rights on the sole basis of my claim, but to review them. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 16:37, 21 July 2024 (UTC)Reply
Most (5/8) of the admins pinged above are not very active as admins. B, CE, & S are. This page is not watched much AFAICT. DCDuring (talk) 15:38, 21 July 2024 (UTC)Reply
Aside from SemperBlotto, who I had not noticed last edited two years ago, they have all edited recently, so I expect at least one of them to respond when they log in, considering I pinged them. Of the three you mentioned, Chuck Entz and Surjection have edited after being pinged so they are willingly ignoring this, at least for the moment. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 16:04, 21 July 2024 (UTC)Reply
It's easy for once-active admins to step back to being contributors and lose touch with the dominant line of thinking and personalities. DCDuring (talk) 16:10, 21 July 2024 (UTC)Reply
Keep in mind that these users may have notifications for pings turned off in their preferences. The best way to notify people of a discussion is to write something on their talk page. ArcticSeeress (talk) 17:31, 21 July 2024 (UTC)Reply
Also, it's hard to assess admin behavior when it involves a language one doesn't know well or at all. We don't have (m)any besides @Fenakhay who are both admins and have high level of Arabic knowledge. If one becomes a major contributor in an area to which few others contribute, it is easy to take offense at what one perceives as low-value contributions in the area. An attitude adjustment is required. DCDuring (talk) 18:18, 21 July 2024 (UTC)Reply
The point is another. Just take a look at this edit which is their latest revert involving my contributions. It’s not hard to tell there are no good reasons to perform this revert twice, and if one does that, they should care to provide an explanation. And as I said, the behavior extends to several other languages, not just Arabic. See their talk page and read the way the user has dealt with many others coming to ask simple clarifications. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 18:45, 21 July 2024 (UTC)Reply
I couldn't tell that from the edit. I don't know about the other languages involved either. I think I had interaction with Fenakhay once about something on an Arabic entry with some taxonomic content. I think I was wrong.
Wiktionary tries to "describe all words of all languages". We have entries in more than 4,000 languages, but only about 70 admins, some of whom are not very active, none of whom claim native or advanced knowledge in more than 9 languages, most only one or two, and many are only proficient in English. We don't have all that many active advanced contributors in Arabic. The end result is a governance problem that is hard to resolve without risking bad consequences: losing new, less-expert contributors or losing expert contributors, who may be stubborn, or authoritarian, or .... DCDuring (talk) 19:40, 21 July 2024 (UTC)Reply
Again, it’s not about languages. It’s about behavioral issues. You can’t just undo my edits, give no explanation, ignore my rationale, and then threaten me with a block because I didn’t stay quiet and expected you to discuss rather than dictate. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 19:55, 21 July 2024 (UTC)
I can't assess the behavioral issue without understanding the dictionary-building issue. DCDuring (talk) 23:23, 21 July 2024 (UTC)Reply
@IvanScrooge98 In fairness, Fenakhay has not given no rationale (see here). It is completely reasonable to revert someone who keeps trying to change the consensus over how a language's entries are laid out, especially when they haven't started a thread about it on the Beer Parlour or at Wiktionary talk:About Arabic, as has been suggested to you. This isn't about ownership - it's about how consensus works. You can't just ignore that, and it's not a good idea to pretend like you're being reverted for no reason, either, since you are obviously aware that there is a reason, even if you disagree with it. Theknightwho (talk) 20:03, 21 July 2024 (UTC)Reply
Since when is completing the etymology of an entry trying to force a different layout or to change the consensus on Arabic? I’m not the one acting like they own the pages. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 20:09, 21 July 2024 (UTC)Reply
I’m sorry, I misinterpreted part of your comment. If you look closely, that was my third attempt at getting an answer, after being ignored above when the user, blinded by a disagreement over one specific matter, performed bulk reverts removing other useful additions. And did not explain. They only gave a partial explanation after they did the same elsewhere, pretending it was all about not cluttering one section when they had actually undone other edits that sought to expand the entry and improve categorization. And they blocked me in the process. See Special:History/مهرگان. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 20:52, 21 July 2024 (UTC)Reply
(See also Special:History/qubbajt, regarding my previous interaction with the user.) [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 21:09, 21 July 2024 (UTC)Reply
While conversations on a user's Talk page can be progress, I would consider trying to get a discussion going on the entry's own Discussion page (infrequently viewed) or at the Tea Room (regularly viewed) if it relates to a specific entry. (Beer Parlour and Wiktionary_talk:About_Arabic have already been mentioned above in connection with broader proposals.) It may be that your proposals would attract a lot of support from other editors, in which case I wouldn't expect ensuing edits (allude to the discussion in the edit summary) to be reverted. Or if support is lukewarm but no opposition, there's a greater onus on anyone thinking of reverting the edit to provide sound reasons. Or maybe there'll be strong opposition, but you learn something out of it. —DIV (1.129.106.197 13:57, 22 July 2024 (UTC))Reply
The thing is I never made any “proposals” (as in changes on the usual layout and such), I simply expanded the entries according to the general practice which the user claims I don’t follow. I didn’t think of opening a discussion to gather consensus on whether a specific entry needs to be expanded or not, that is the very goal of Wiktionary as a whole. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 14:20, 22 July 2024 (UTC)Reply
@IvanScrooge98 I would say:
  1. Please take what Purple says with a big handful of salt; based on their past statements they have a grudge against several admins.
  2. Don't assume that admins or bureaucrats are willfully ignoring a discussion just because they don't say anything, as they tend to have lots of things going on at once and it takes time to formulate responses.
  3. Fenakhay's talk page does show a fair number of complaints about reversions, so you may have a point; but I think you would have more traction if you continued this discussion in the Beer Parlour. This page isn't read much whereas the Beer Parlour is.
Benwing2 (talk) 22:47, 22 July 2024 (UTC)Reply
Thanks and sorry for getting a little carried away but I was and am still very mad that an admin can get away with that attitude. I only noticed later I should have opened this at the Beer Parlour, that’s why I left a notice there. Maybe you can move this thread directly? I don’t have much else to say, I need some feedback from you guys. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 23:05, 22 July 2024 (UTC)Reply
@IvanScrooge98 Feel free to copy the contents of this thread over there and use the {{movedto}} and {{movedfrom}} templates to indicate that the stuff has been moved; that's what I would do. Benwing2 (talk) 23:09, 22 July 2024 (UTC)Reply
losing expert contributors, who may be stubborn, or authoritarian, or...
Wait, are you implying that merely questioning Fenakhay's use of administrative privileges will cause him to leave the project as a whole? Is his true raison d'être as a user the abuse of admin privileges and not enriching this dictionary? Is that what we should understand? 79.147.122.134 10:42, 24 July 2024 (UTC)Reply
after e/c. No. I'm simply explaining why there is not necessarily swift, enthusiastic support for chastising him or whatever it is that you might propose people do. But ChuckE says it better below. DCDuring (talk) 17:08, 24 July 2024 (UTC)Reply
That's not how I read it. The OP asked for us to "do something about it". That "something" could very well be enough to alienate someone if not done right. Anyone who does a lot of patrolling of new edits is going to accumulate messages on their talk page demanding to know why their edits were reverted. There are lots of people making good-faith edits who are sadly mistaken about their competence in working with dictionary entries. Arabic languages are tricky, with a writing system for most that leaves a lot to the knowledge of the reader for interpretation, and grammar that's quite foreign to those who speak other languages. There are similar issues with other languages that Fenakhay patrols. There are also a number of known cranks working with those languages who want to rewrite history to make their languages look more important or who insist on removing anything proscribed by their standard. This particular content dispute may be different, but we need to be careful not to read too much into the quantity of protests on someone's talk page. Chuck Entz (talk) 15:03, 24 July 2024 (UTC)
“Do something about it” means stop them from abusing their rights to dictate the content of the entries even when other editors make improvements such as providing further etymological details. I never asked for Fenakhay to be “alienated”, and I’m disappointed that you ignored this talk for days, as well as the issue I raise here, just to reply that you basically don’t care about potential abuses since they are also a competent editor and a useful administrator in other fields (which I never denied—all the opposite). I suppose I’m gonna have to give up and let them bully me; they already told me they are going to block me next time I dare edit some language against the “practice” (or rather, their will). [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 15:51, 24 July 2024 (UTC)Reply
But your edits weren’t even good enough for you to argue for your content instead of against the reverter? Nobody is motivated not to ignore you if you have not shown reflection about the merits and demerits of your bold good-faith edits. If you show a pattern of wavering self-awareness then troubleshooting will also have random patterns within an air of vague threats. All circumstantial that makes interaction unattractive. Maybe it’s you and not him that is bitter? Compare fundamental attribution error against self-serving bias. I have consistently observed Fenakhay to put actual thoughts into consideration, though less explanation. Why don’t I do these supposed mistakes but you do? I haven’t seen myself threatened, though I am obviously retarded. You could think about it, then you might point out double-think if there is any, but you need the ability to view things from different perspectives rather than in automated escalation spirals. Fay Freak (talk) 17:01, 24 July 2024 (UTC)Reply
As I have repeatedly explained, I tried to discuss with the editor, and after being ignored a couple times, they replied pretty harshly and still failed to point at where my edits were not in accordance with their understanding of “practice”, basically expecting me to just stay silent when my contributions are reverted, or not to contribute at all. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 17:16, 24 July 2024 (UTC)Reply

It’s been more than a week since I opened this thread. No one has shown any will to even check a fellow admin’s abuse or warn them about it. Meanwhile one of the admins I pinged here who didn’t even bother respond blocked me for over one day for “disruptive editing”. So consider this thread closed, I guess. Bullying has won. This project needs a fresh restart. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 09:25, 26 July 2024 (UTC)Reply

Look, I will be honest with you. I have had my run-ins with Fenakhay and have some concerns about the way they go about reverting other people's changes. However, when I look at your history I see multiple blocks by multiple admins going back to 2017 for disruptive editing, i.e. editing in languages you don't understand and making a mess of it, as well as edit warring; and I see no evidence that you either understand why what you're doing is bad or promise to not do this any more. This is probably why you are not getting traction on your complaints, not because "bullying has won" or because of any supposed conspiracy to protect admins. Benwing2 (talk) 17:46, 26 July 2024 (UTC)Reply
I saw little interest among other admins in taking this issue seriously even though Fenakhay’s behavior has involved several other editors. When I make a mistake—an actual mistake—I am more than willing to learn from it and repair. But I am not an admin. And when one or more admins seem more concerned about reverting my edits even when they are improvements or in any case are not disruptive, it means the problem is not with my history and, I’m sorry, that is not a conspiracy theory. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 19:00, 26 July 2024 (UTC)Reply
It definitely is with your history; it introduces a bias: “he again”, introducing novelties lacking conspicuous arguments. And now multiple people have looked what was wrong and have been unable to see much, and you are unable to see how they are only able to see that you are unable to provide much—signal for the noise you make—, this amounts to a conspiracy theory. Fay Freak (talk) 20:21, 26 July 2024 (UTC)Reply

Categorization of Borrowed Coined Terms

To use robot as an example, should borrowings of a coined term be labeled as coinages for the borrowing language? i.e. should only Czech have the related categories or should all the languages that are borrowing the Czech term be categorized under the coinage template? Akaibu (talk) 18:22, 23 July 2024 (UTC)Reply

Obviously not. ―⁠Biolongvistul (talk) 18:54, 23 July 2024 (UTC)Reply
No. Fay Freak (talk) 16:23, 24 July 2024 (UTC)Reply
No. — Fenakhay (حيطي · مساهماتي) 17:16, 24 July 2024 (UTC)Reply
Nope. Vininn126 (talk) 17:24, 24 July 2024 (UTC)Reply
nah — SURJECTION / T / C / L / 18:00, 24 July 2024 (UTC)Reply

Obscene gestures

I feel that entries such as middle finger and bras d'honneur should be grouped somehow, such as a Category:Obscene gestures Justin the Just (talk) 15:33, 24 July 2024 (UTC)Reply

No, we don’t even have sets of gestures, only related-to category Category:Body languages, which is small enough. If larger you can use the search function for terms containing both the category and a vulgar label or category. Fay Freak (talk) 16:23, 24 July 2024 (UTC)Reply

Changes to Southern Min romanization

(Notifying Atitarev, Benwing2, Fish bowl, Frigoris, Justinrleung, kc_kennylau, Mar vin kaiser, Michael Ly, ND381, RcAlex36, The dog2, Theknightwho, Tooironic, Wpi, 沈澄心, 恨国党非蠢即坏, LittleWhole):

After some discussion with @TongcyDai, we would like to make the following changes to the current system:

  1. Change the four nuclei ir, er, ee, ere to ṳ, o̤, e͘, o̤e respectively. The first three were agreed upon one year ago.
  2. (This is not a change:) What should we do with the Tones 6 and 9? Tâi-lô uses ǎ a̋ (caron; double acute) respectively, but 台字田 uses ã ă (tilde; breve) respectively.
  3. We would not link the POJ romanization if it uses the above four non-standard nuclei. (We will still label it as POJ.) (What about the two non-standard tones?)
  4. We would link to TL as well. (This would require a bot job to make sure that the existing POJ entries that are also TL are labelled correctly.)

Discussion is welcome. --kc_kennylau (talk) 17:23, 24 July 2024 (UTC)Reply
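To make point 1 of the proposal concrete, here is a rough sketch of the nucleus substitution. It is an illustration only and not the code of any existing Wiktionary module: the function name `convert_nuclei` is hypothetical, the input is assumed to be a lowercase syllable without tone marks, and the choice of code points (precomposed U+1E73 for ṳ, combining U+0324 and U+0358 elsewhere) is an assumption. The longest sequence, ere, is listed first so that it is not split into er plus a stray e.

```python
import re

# Proposed replacements for the four non-standard nuclei.
NUCLEI = {
    "ere": "o\u0324e",  # o with diaeresis below, then e
    "ir": "\u1e73",     # u with diaeresis below
    "er": "o\u0324",    # o with diaeresis below
    "ee": "e\u0358",    # e with combining dot above right
}

# Alternation order matters: "ere" must be tried before "er".
_NUCLEUS_PATTERN = re.compile("ere|ir|er|ee")


def convert_nuclei(syllable: str) -> str:
    """Rewrite ir, er, ee, ere in a toneless syllable (hypothetical helper)."""
    return _NUCLEUS_PATTERN.sub(lambda m: NUCLEI[m.group(0)], syllable)


if __name__ == "__main__":
    for syllable in ("tir", "ber", "kee", "kere", "kang"):
        print(syllable, "->", convert_nuclei(syllable))
```

Syllables that already carry tone diacritics would need the tone mark moved or reattached after the substitution, which this sketch does not attempt.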

The author of 台字田 uses æ to represent ee in TL reportedly. Also, using æ is somehow more consistent with other extension letters, which don't use the "Combining Dot Above Right" diacritic. TongcyDai (talk) 17:40, 24 July 2024 (UTC)Reply
Regarding how the ninth tone should be marked, in 2001, 張裕宏 (Tiuⁿ-Jūhông) wrote a treatise called 白話字基本論:台語文對應&相關的議題淺說 (Basic Theory of Pe̍h-ōe-jī: A Brief Discussion on Taiwanese Writing Correspondence and Related Issues), which was the earliest document to propose marking the ninth tone with a breve.
In Tiuⁿ's article 四十年 ê 文字思考 (Forty Years of Orthographic Contemplation), he mentions that traditionally, if one really wanted to indicate the ninth tone, it was generally written with the fifth tone mark. However, he felt that the ninth tone needed its own distinct diacritic, so he invented the method of marking the ninth tone with a breve. This practice has only become common among Pe̍h-ōe-jī users in recent years, possibly due to the popularity of his TJ 台語白話小詞典 (TJ Taiwanese Colloquial Dictionary).
In Pe̍h-ōe-jī input methods, PhahTaigi writes the ninth tone diacritic as a breve, while Lohankha and Gboard use a double acute accent. TongcyDai (talk) 18:00, 24 July 2024 (UTC)Reply
(The FHL input for POJ also uses a breve for tone 9.) — justin(r)leung (t...) | c=› } 18:02, 24 July 2024 (UTC)Reply
I am on board with <ir> (/ɯ ~ ɨ/) and <er> (/ə/) being represented as <ṳ> and <o̤> respectively, as these are the conventions used by ChhoeTaigi as well. (This would also include other rimes with such vowels.) The only Zhangzhou POJ non-dictionary work that I know of, the Acts of the Apostles in Zhangzhou dialect ([7]), uses <ɛ> for /ɛ/. Rev. Douglas' dictionary also uses <ɛ> for this vowel. <e͘> seems to have some traction (e.g., in "The Eclectic Nature of Penang Hokkien Vocabulary, Its Historical Background and Implications for Character Writing" by Catherine Churchman), but it has some confusion within the POJ community, as 張裕宏's POJ extensions use this same graph for /ə/ instead.
As for the tone marks, I am fine with using tilde and breve for tones 6 and 9, following the conventions of 台字田 (for both tones), 張裕宏 (for tone 9) and ChhoeTaigi (for tone 9). (We have already implemented breve for tone 9 in entries.) There is an alternative for tone 6, which is the breve used in Fielde's A pronouncing and defining dictionary of the Swatow dialect, but that would clash with tone 9.
I agree that we should not link to POJ romanization if there are vowels or tones that do not normally appear in POJ (based on Amoy and 通行腔 Taiwanese). We should probably also allow inclusion of any romanized forms (including TL) that are used in writing proper (as opposed to use as pronunciation aids, e.g. in brackets, ruby, or pronunciation guide given in a dictionary). I think the update to what entries are allowed would require a vote, as the previous vote only allowed POJ entries. — justin(r)leung (t...) | c=› } 18:01, 24 July 2024 (UTC)Reply
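For reference, the two sets of diacritics under discussion for tones 6 and 9 can be laid out side by side in code. This is a small sketch under stated assumptions: the table keys `"tl"` and `"tjt"` (for Tâi-lô and 台字田) and the function name `mark_tone` are made up for illustration, and the mark is simply attached to a single given vowel, so the question of which letter in a final carries the tone is not handled.

```python
import unicodedata

# Combining diacritics for tones 6 and 9 under the two conventions mentioned.
TONE_MARKS = {
    "tl": {6: "\u030c", 9: "\u030b"},   # caron, double acute
    "tjt": {6: "\u0303", 9: "\u0306"},  # tilde, breve
}


def mark_tone(vowel: str, tone: int, convention: str) -> str:
    """Attach the tone 6 or tone 9 mark to one vowel (hypothetical helper)."""
    mark = TONE_MARKS[convention][tone]
    # NFC gives a precomposed letter where Unicode has one.
    return unicodedata.normalize("NFC", vowel + mark)


if __name__ == "__main__":
    for convention in ("tl", "tjt"):
        for tone in (6, 9):
            print(convention, tone, mark_tone("a", tone, convention))
```

Note that a with double acute has no precomposed form in Unicode, so under the first convention tone 9 on a remains a two-code-point sequence even after normalization.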
Firstly, I would just like to say that I am appreciative that so many of the wikt sinitic community is willing to address this question. I really only wish here to share my opinion on the representation of /ɛ/. I can understand the arguments for <ɛ>, and I do believe the historical attestation is a good case for it. However, I strongly encourage the adoption of <e͘> for a few reasons.
My first point is that <ɛ> is certainly a more unusual symbol than <e͘> in the context of these romanisation systems. To my knowledge - and please, correct me if I am wrong - <ɛ> is not really a character used in neighbouring systems in a historical nor linguistic context. <e͘> at least benefits from containing a familiar common letter 'e'. Personally, I would say this way of representing the sound seems more 'natural' and even intuitive to the POJ system - but I understand if people disagree.
This lends to my second point which is that <e͘> is a sensible pairing for <o͘>, lending consistency to the diacritic as a marker of openness in the quality of the vowel (in mainstream Hokkien dialects which have the sound, of course.) Furthermore, as <o͘> is a much more well-established orthographic norm than any solution used to write <ɛ>, I think having the representation for it conform to <o͘> is a good idea.
Finally, I do get where the concern that it could be a little confusing comes from. On the flip side, this is a rarely written sound in POJ either way. I do also think that as we are here trying to introduce a form of extended POJ based on but not entirely beholden to historical and attested forms of various romanised systems, we do not really need to account for all current potential interpretations as we are not trying to create a unified POJ which accounts for all existing variations - I hope consistency is also valued next to attestation; and to that end I believe <e͘> has both whereas <ɛ> is more about the historical aspect.
I would not oppose <ɛ> if the rest of the editors believe it to be the superior choice. But I hope my arguments are at least thoughtful enough to warrant consideration. Eyteo (talk) 10:19, 27 July 2024 (UTC)
@Eyteo: Thanks for these points, which are all very good points. I do think internal consistency and overall "aesthetic" of the system are important considerations for sure. While I still hold to the issues that <e͘> has, I can see <e͘> as a competitive candidate here. I do think, though, that in principle, the Wiktionarian spirit should be weighing more heavily on "attestation" whenever possible. On this ground, I still think <ɛ> has an upper-hand as it has attestation in running text (alongside <o͘>, which is what we're comparing it to for "consistency"), rather than just being an idea that is only attested in reference works. That said, it's a close margin, so I think it's still hard to decide between the two options here. — justin(r)leung (t...) | c=› } 15:56, 27 July 2024 (UTC)Reply
@Justinrleung, TongcyDai: I would also like to bring to the discussion the following finals:
  • Found in the Longyan dialect: ēe iēe ōee iōa iōaⁿ;
  • Found in the Penang dialect (mostly for loanwords from Cantonese, except the first one): ōiⁿ e̍ek ēeng ēi eōi īng ōi ōu u̍k ūm ūng ȳ ȳn.
In particular, please note the tone placement. I suppose whatever proposed replacement for "ee" will also apply for these finals, but what about the other finals? --kc_kennylau (talk) 15:27, 26 July 2024 (UTC)Reply

Arabic-script affixes

Discussion moved from Wiktionary:Beer parlour/2024/April#Arabic-script affixes.

As we all know, Latin-script affixes are lemmatised with a hyphen (such as -ed); what about affixes in Arabic script? Currently, Arabic affixes are lemmatised in plain form without any hyphen, but get a connecting character in the headword (such as at ون) or do not get one (such as at وت). Ottoman Turkish affixes are lemmatised with the connecting character in the pagetitle (such as ـدن). Pashto affixes randomly do or do not get a hyphen (such as -تون versus تون). What is the correct way to do this? MuDavid 栘𩿠 (talk) 02:43, 17 April 2024 (UTC)Reply

@MuDavid I agree the current situation is a mess and should be harmonized. I would propose lemmatizing using the tatweel sign like in Ottoman Turkish, or if that is rejected, lemmatizing at the form without connector but placing the tatweel in the headword (like ون). Benwing2 (talk) 21:35, 17 April 2024 (UTC)Reply
@MuDavid, Benwing2:   Support lemmatising using the tatweel sign. 0DF (talk) 15:37, 21 April 2024 (UTC)Reply
I’m fine with the tatweel (I’ll try to remember the word) in lemma and headline. But someone should then clean up the mess… MuDavid 栘𩿠 (talk) 00:57, 24 April 2024 (UTC)Reply
@MuDavid, Benwing2, 0DF: Ottoman Turkish is lemmatized like that because I started filling the language late and correctly (recently mostly @Samubert96 is doing the job), as with ASCII hyphens the entry links become insufferable aesthetically and BiDi-wise, so we employ the so-called tatweel (a term barely attested in Arabic, and if so borrowed from the Unicode chart). The Arabic language pages, less so the Persian ones if I have observed correctly, were just kept where they had been created in Wiktionary’s dark ages, with the language-specific module adjustments in order not to change existing pages much. I noticed greater problems in 2019 and performed some complicated considerations due to Persian distinguishing connecting and non-connecting suffixes, after which Erutuon (talk · contribs) added the relevant capabilities to the modules, but whether the intended system is intelligible is open to critique. Fay Freak (talk) 01:44, 24 April 2024 (UTC)
@Fay Freak I did a lot of work on Module:affix to correctly support the different ways of handling affix indicators in Arabic-script languages; it's code I'd gladly get rid of if possible. If it's agreed to follow the Ottoman Turkish approach, I can clean up the other languages without too much difficulty, I think. Benwing2 (talk) 01:47, 24 April 2024 (UTC)Reply
@Benwing2 @MuDavid @Fay Freak The second option seems more preferable and "standard" to me, since the tatweel really isn't a counterpart of the hyphen. It should only appear in the headword to indicate affixes. And most languages using the Arabic script are already following it. - Ash wki (talk) 12:14, 4 May 2024 (UTC)Reply
@Ash wki: Status-quo bias. Do you achieve the same opinion if you ignore the situation of the Arabic-script pages we have? Though I also wonder where the idea that the “tatweel” is equivalent to the hyphen as a delimiter comes from, not to say this is another etymology question. The BiDi and CTL behaviour is advantageous though. Fay Freak (talk) 12:35, 4 May 2024 (UTC)
@Fay Freak I mean if we do use the Tatweel in the page title, then we are technically considering it as the Arabic script counterpart of the hyphen. And I'm not using the status quo to justify the practice. Just saying it's easier than having to, as @MuDavid said, clean up the mess. - Ash wki (talk) 12:59, 4 May 2024 (UTC)
@Ash wki: Thanks for clarification. If somebody has volunteered to clean it up then it hardly matters anymore. We make decisions for decades coming. Right now there is a mess and multiple okay options which we weigh, with different amount of work involved only up to the point of achieving a final uniform result. Fay Freak (talk) 13:08, 4 May 2024 (UTC)Reply
@Fay Freak @Ash wki Yeah, cleaning it up either way is not a big deal. Benwing2 (talk) 03:40, 5 May 2024 (UTC)Reply
@Ash wki, Benwing2, Fay Freak, MuDavid: So, what was the resolution in the end? 0DF (talk) 18:27, 21 May 2024 (UTC)Reply
@0DF I think we should follow the Ottoman Turkish approach everywhere. If others are in agreement I will proceed. Benwing2 (talk) 18:51, 21 May 2024 (UTC)Reply
@Benwing2: Splendid. 0DF (talk) 19:37, 21 May 2024 (UTC)Reply
I also agree with the Ottoman Turkish approach. MuDavid 栘𩿠 (talk) 00:44, 22 May 2024 (UTC)Reply
  Oppose. Unfortunately, I did not see this conversation until recently and didn't get a chance to read it until now. I only have two points to add:
  • Persian dictionaries generally match the current practice for Arabic affixes and list affixes without a hyphen or a kashida/tatweel (i.e. affixes are listed naked). And, FWIW, I don't think a kashida or a hyphen should even be in the header, since that does not match how Persian dictionaries list affixes. (I've never seen a dictionary use a kashida for anything other than as a place holder for diacritics. Though, perhaps @ZxxZxxZ could tell us if he's aware of any exceptions?)
  • It is actually quite difficult to type a kashida on a Persian keyboard. I assume this is also true for Arabic (though I'm not sure). Thus, lemmatising with kashida may cause accessibility issues, since it would be difficult for readers to type the necessary characters to look them up.
I do want to apologize to @Benwing2, since I did not read this discussion until after he had already done so much work. Perhaps we should put a pause on this and reopen this discussion in this month's Beer parlour with relevant editors tagged? I'm sure there are language-specific concerns that I am not aware of. -- BABR (talk) 07:42, 25 July 2024 (UTC)
@Babr FWIW It's easy to type a kashida at least on the Mac OS Arabic keyboard; it is on the ` key. Also if you look up any of the Persian suffixes or prefixes without the kashida, the first thing listed under {{also}} is the version with the kashida. As for whether/when to use a hyphen or kashida, longstanding practice is to use a kashida, hyphen or ZWNJ to mark a prefix or suffix in {{af}} and the like; otherwise it's impossible to distinguish affixes from non-affixes. Until recently, this was stripped when generating links for some languages (e.g. Persian, Arabic) but not others (e.g. Pashto, Ottoman Turkish). Consistent with the stripping or non-stripping, the term needs to be lemmatized appropriately; e.g. Ottoman Turkish affixes are generally lemmatized using the kashida. Anyway, I will not make any changes to other languages (I've only changed Persian and Pashto, and the latter was an utter mess before); please start a new BP discussion, pinging the participants of this discussion as well as the relevant Arabic script editors (I do not know who they are other than you for Persian and Fenakhay for Arabic; you might try using {{wgping}} although the relevant lists for Arabic, Persian, Urdu and Panjabi are large and contain several editors who are not currently active). Benwing2 (talk) 08:28, 25 July 2024 (UTC)Reply
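To make the stripping versus non-stripping distinction above concrete, here is a minimal sketch. It is not the actual Module:affix code (which is written in Lua); the function names and the `keepMarker` flag are hypothetical and exist only for illustration. The code points for the tatweel (U+0640) and ZWNJ (U+200C) are the standard Unicode ones.

```javascript
// Affix markers mentioned in the discussion.
const TATWEEL = "\u0640"; // ARABIC TATWEEL (kashida)
const ZWNJ = "\u200C";    // ZERO WIDTH NON-JOINER
const HYPHEN = "-";

// Hypothetical helper: remove a single marker from the start and/or end
// of a term, leaving the bare affix.
function stripAffixMarker(term) {
  const markers = [TATWEEL, ZWNJ, HYPHEN];
  let result = term;
  if (markers.includes(result.charAt(0))) {
    result = result.slice(1);
  }
  if (markers.includes(result.charAt(result.length - 1))) {
    result = result.slice(0, -1);
  }
  return result;
}

// Hypothetical helper: choose the page title a link should point to.
// keepMarker = true models lemmatizing with the marker in the page title;
// keepMarker = false models lemmatizing at the bare form.
function linkTarget(term, keepMarker) {
  return keepMarker ? term : stripAffixMarker(term);
}
```

Under these assumptions, the same marked-up affix links to two different page titles depending on the convention chosen, which is why the lemma form has to match whichever behaviour the link-generating code uses.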
Arabic-script editors were not notified of this conversation so notifying editors of Arabic script languages (as listed in module:workgroup ping):
Editors in the Arabic list
(Notifying Alarichall, Atitarev, Benwing2, Esperfulmo, Erutuon, عربي-٣١, Fay Freak, Assem Khidhr, Fenakhay, Fixmaster, Roger.M.Williams, Zhnka, Sartma):
Editors in the Urdu list
(Notifying AryamanA, ImprovetheArabicUnicode, Kutchkutch, Notevenkidding, RonnieSingh, Svartava, نعم البدل):
Editors in the Persian list
(Notifying Ariamihr, Atitarev, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Saranamd): BABRtalk 08:46, 25 July 2024 (UTC)Reply
  Support. Affix entries in scripts with spacing between words should have hyphens or their in-script equivalent in the lemma entry. In fact, the lack of the tatwil has discouraged me from editing any affix entries.
The tatweel is very easy to type on iOS for me so that is not an issue personally.--Saranamd (talk) 03:35, 26 July 2024 (UTC)Reply
But I have not yet seen any dictionaries use tatweel as a hyphen, at least, none of the dictionaries I regularly use do so; The only usage I could find in dictionaries was as a placeholder for diacritics. I'm happy to reconsider my position if there are actually dictionaries that do this, but currently, I am not aware of any. — BABRtalk 21:30, 28 July 2024 (UTC)Reply
Linux and BSD contain the tatweel in the default Persian keyboards. In the file /usr/share/X11/xkb/symbols/ir the following line key <AE11> { [ minus, Arabic_tatweel, underscore ] }; provides for it being accessible via the shift key + the super-accessible key right of the 0 key.
The line include "nbsp(zwnj2nb3nnb4)" adds ZWNJ, no-break space and narrow no-break space.
The Arabic layout includes the tatweel on the same place and ZWNJ, NNBSP, and BiDi characters on more obscure places, given that they are more exclusionary, but settings from the /usr/share/X11/xkb/symbols/nbsp file and others can be added in common desktop environments to modify the behaviour of space keys and other control keys as defined by keyboard layouts, for instance I have disabled caps-lock this way and only activate the functionality via both shift keys, or you specify another modifier key than Shift and AltGr. Fay Freak (talk) 09:06, 25 July 2024 (UTC)Reply
FWIW on mobile devices (incl. mine) it's three layers deep. It's not accessible on the main keyboard at all and on the secondary keyboard it doesn't even have a visible key; the key for it is hidden under the key for the tanwin diacritic.
keyboard -> numbers & symbols keyboard -> tanwin (press and hold).
Yeah, sorry, I guess that's still typable, but it's super out of the way and is literally hidden with other infrequently used characters (diacritics). And we are assuming that the average reader (who, statistically, is significantly more likely to be using a phone than an editor) knows how to find kashida, which may be true for all of us, but it's not true for a lot of average people! Imagine if we required English speakers to type § to find suffixes, because § is just as hidden as kashida for me! -- BABR (talk) 09:59, 25 July 2024 (UTC)
Mobile OS touchscreen editing is relatively insufferable in general due to hidden characters as you describe, mostly due to curly brackets and pipe letters and even equal signs in templates, though this is less relevant for other wikis as Wikipedia which are less template-heavy. I have only ever done it if I sat in the library and yet found a relevant quote but had nothing else with me for the variety of not looking upon screens, after using the library’s databases at home, otherwise anyone has a notebook computer there. I figure though the equipment may be different in developing countries, and there you get this practice in your community in the US even. For this one character you can do it though, at least via the Special characters in the editing box of Wiktionary → Arabic → there is the tatweel. Alternatively you can just use {{suf}} which does not even require a delimiter since it is implied by the template name and argument position. Fay Freak (talk) 10:25, 25 July 2024 (UTC)Reply

It's complicated to support the tatwil mark versus a hyphen. Tatwil aims at forcing the characters to appear in the connected form, however, not all characters are connected, like ⟨ء⟩. Imagine using the tatwil, ⟨ـء⟩. If we used the hyphen, we need to make sure it appears on the right ⟨-ء⟩, not ⟨-ء⟩. --Esperfulmo (talk) 13:47, 25 July 2024 (UTC)

Category:en:Kimchi


This is a completely useless category, I propose we remove it. Vininn126 (talk) 11:20, 25 July 2024 (UTC)Reply

You just need to go to w:Category:Kimchi and create the terms therein. Due to Category:ko:Kimchi we can’t delete it cleanly. Fay Freak (talk) 12:06, 25 July 2024 (UTC)Reply

Setting MediaWiki:Gadget-PagePreviews as a default gadget


After several changes and tweaks the gadget is in a pretty finalized state. If no one minds, in a few days I'll set it as a default gadget in order to conduct a larger-scale test. Ioaxxere (talk) 17:27, 25 July 2024 (UTC)Reply

No issues. Benwing2 (talk) 02:18, 26 July 2024 (UTC)Reply
@Ioaxxere: MediaWiki:Gadget-PagePreviews is now in CAT:E because you included executable wikitext in a comment and MediaWiki doesn't recognize JS comment syntax. Please fix it. Chuck Entz (talk) 05:28, 26 July 2024 (UTC)Reply
It's probably a good idea in general to put //<nowiki> at the top and //</nowiki> at the bottom of every JS page of any complexity- just to be safe. Chuck Entz (talk) 05:44, 26 July 2024 (UTC)Reply
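For other script authors, the wrapper suggested above might look like the following minimal sketch. The variable and function names and the template text are hypothetical and not taken from the gadget; the only point is that the template-like string sits between the two comment lines, so MediaWiki should treat it as plain text rather than as a real template call.

```javascript
//<nowiki>
// Hypothetical gadget snippet: the string below looks like wikitext, so
// without the surrounding nowiki comments MediaWiki could treat it as an
// actual template call when it parses the JS page.
const exampleLabel = "{{tlb|fi|derogatory}}";

// Purely illustrative function that uses the string.
function describeLabel() {
  return "Example label markup: " + exampleLabel;
}
//</nowiki>
```

To the JavaScript engine both wrapper lines are ordinary comments, so they do not change how the script runs.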
@Chuck Entz: fixed. Ioaxxere (talk) 06:09, 26 July 2024 (UTC)Reply
  Defaulted, enjoy. If you don't like it, you can easily disable it in your preferences. Ioaxxere (talk) 02:28, 31 July 2024 (UTC)
Great job! Looks nice and does its job. I checked your script on the page that I used to test my popup script, and your script handles most of the cases correctly.
I have several critiques though – two critical and two less so:
1. It's way too easy to accidentally hover over a link inside a popup while scrolling, thus triggering opening a new definition instead of the existing one. I would recommend turning off that behavior, as it introduces too much noise, volatility, and possible frustration for users when they accidentally lose the definitions they were reading.
2. (That's a delicate matter – I will message you the details privately so as not to give w:WP:BEANS advice, and will post here after it's solved.)
3. I would recommend to make use of Wikimedia's design system, Codex, including colors and icons (e.g. the popup uses a book icon very similar to  ).
4. It is best to use classes and not inline styles, both for the easiness of development and the ability of third parties to customize and reuse your code. Since it is a gadget now, you can create a .css page for it. JWBTH (talk) 03:39, 31 July 2024 (UTC)Reply
Also, it is advisable to take web accessibility into account and use HTML elements with the appropriate semantics (e.g. <h2> for headings, resetting styles on them if needed) and/or put ARIA roles (e.g. role="tooltip" on the tooltip element itself). JWBTH (talk) 12:27, 31 July 2024 (UTC)Reply
Now that (2) is solved – there was a XSS vulnerability allowing the attacker to run arbitrary JavaScript in the user's browser. For other script authors: please make sure you don't put unsanitized user input (e.g. from HTML attributes or the page's rendered text) into the page's HTML. JWBTH (talk) 17:50, 31 July 2024 (UTC)Reply
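As a general illustration of the advice above (not the fix that was actually applied to the gadget, which is not described in this thread), a minimal sketch of escaping text before it goes into HTML could look like this. The helper names and the `preview-popup` class are hypothetical; the `role="tooltip"` attribute follows the accessibility suggestion made earlier.

```javascript
// Escape the characters that have special meaning in HTML so that text
// taken from the page cannot be interpreted as markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: build popup markup from an untrusted title.
function buildPopupHtml(untrustedTitle) {
  return '<div role="tooltip" class="preview-popup">' +
    escapeHtml(untrustedTitle) +
    "</div>";
}
```

In a browser it is often simpler to avoid string-built HTML altogether and assign untrusted text through `textContent`, which never parses it as markup.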
I notice that it tries to display the section links in diff headers, which never works. Chuck Entz (talk) 05:52, 31 July 2024 (UTC)Reply
I'm impressed by this advance!
I share a few simple observations.
  • The new Wiktionary previews work in the SeaMonkey browser (whilst Wikipedia previews do not). Nice one, Ioaxxere!
  • I agree that having previewable links within a preview (JWBTH's first point) is potentially problematic. Potential solutions:
    • Turn off that specific behaviour for everyone by stripping all hyperlink markup out of the previewed text. (Wikipedia previews appear to do this.)
    • Turn it off by default, but available to be manually enabled.
    • Implement a (longer?) time delay before accessing previews of terms within an existing preview.
    • Retain clickable hyperlinks within the preview, but don't make them previewable. As a first impression, this last option would be my preference.
  • There is no preview for Wikipedia links. Sorry this might be way outside of your scope, but I am just thinking of users who see a blue hyperlinked word/term (without the little box+arrow icon) and find the preview behaviour a bit erratic. Currently a tooltip is displayed in place containing "w:..."; perhaps that tooltip could be tweaked to say "Wikipedia" rather than just "w"? To be fair, there is already also a URL shown in a kind of status bar tip (look may be browser-dependent).
  • There is no preview for external (non-Wikimedia) links. That's fine, and as expected.
—DIV (1.145.13.161 07:09, 31 July 2024 (UTC))Reply
Two technical gripes:
  1. Language metadata (tags) are stripped from e.g. usage examples,
  2. {{tlb}} are not shown at all (at least when they are on or after the headword line).
SURJECTION / T / C / L / 15:21, 31 July 2024 (UTC)Reply
Yeah, imho it also should display the headword line as that holds crucial information for many languages. And I'd also support simply turning off previews within previews. AG202 (talk) 15:53, 31 July 2024 (UTC)Reply
Replying to a few people:
  • @JWBTH: I don't really like the book icon... why is it half white and half black? But if you have any specific suggestions for improving accessibility I can implement that.
  • @AG202: Headword lines are tricky to work with because they sometimes contain very important information (gender, conjugation type) and are sometimes very long and repetitive. Imagine trying to squeeze wake up and smell the coffee into a preview! So I'm open to any suggestions on this front. Also, I don't want to turn off preview-in-preview because it is very useful when you get to a page like shish kebob, defined as "alternative spelling of shish kebab". However, I did increase the hover time to make it harder to open a preview by mistake (DIV's third suggestion).
  • I implemented previews for Wikipedia articles since it seemed like a good idea!
Ioaxxere (talk) 22:20, 31 July 2024 (UTC)Reply
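A longer hover time for links inside an existing preview, as described above, could be sketched roughly as follows. The delay values and function names are hypothetical; the gadget's real numbers and structure are not given in this thread.

```javascript
// Hypothetical delays in milliseconds.
const BASE_DELAY_MS = 300;   // links on the page itself
const NESTED_DELAY_MS = 900; // links inside an already open preview

// How long the pointer must rest on a link before a preview opens.
function requiredHoverMs(insidePreview) {
  return insidePreview ? NESTED_DELAY_MS : BASE_DELAY_MS;
}

// Decide whether the pointer has rested on the link long enough.
function shouldOpenPreview(hoverStartMs, nowMs, insidePreview) {
  return nowMs - hoverStartMs >= requiredHoverMs(insidePreview);
}
```

With these assumed values, a 400 ms hover opens a preview for a link on the page but not for a link inside a preview, which makes accidental preview-in-preview navigation less likely without disabling it.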
I don't think it's necessary to display the headword line, at least definitely not the forms. Leaving out tlb's however is dangerous, because they can contain labels like "derogatory". — SURJECTION / T / C / L / 10:40, 1 August 2024 (UTC)Reply
I imagine you're right about the tlb's, but it might be more evocative if you could cite an example. I found things like gratia gratiam parit and abstruse, where missing the tlb is not quite as big a deal. Presumably you have in mind situations where the reader might otherwise be unaware of the sensitive nature of the word (so presumably the reader has originally looked up some 'innocent' word). ——DIV (1.129.111.243 15:14, 1 August 2024 (UTC))Reply
{{tlb}} is common in Finnish entries. Category:Finnish derogatory terms has plenty of examples. — SURJECTION / T / C / L / 06:43, 2 August 2024 (UTC)Reply
Wow, quick work! Making the in-preview hover-time delay a bit longer may be a very fair compromise; from my brief experience after your adjustment I think it will indeed be considerably less likely that previews of previewed terms will be activated by mistake. And you're right that there would otherwise be quite a few rather trivial previews (other examples: "plural of ..." or "past participle of ..."). I had naïvely assumed that implementing previews of Wikipedia content would be a separate massive project; apparently this is already working quite well :-) Thanks. —DIV (1.129.111.243 14:52, 1 August 2024 (UTC))Reply
I don't think it is a fair compromise unfortunately. This is no good:
ingest
 
Besides, if the user scrolls, they generally don't pay attention to where their cursor ends up, and it often lands on a link, thus triggering a reload. Having to move the mouse out of the way of potential links while scrolling is not a good UI. I'm afraid a popup-in-popup is the only sound solution here if the behavior is kept. You can borrow some logic for handling such cases from w:MediaWiki:Gadget-ReferenceTooltips.js (e.g. when a child popup is hovered, parent popups shouldn't disappear, which is done using the .upToTopParent() method). JWBTH (talk) 15:18, 1 August 2024 (UTC)
FWIW, I quite like that I can go from page to page in the preview (if I need to go back to the first preview, I can simply unhover from the original word and repreview it). In fact, it's one of the features I found especially helpful (it's a much faster way of looking up glossary terms, for instance!). So it's worth keeping, at least as an option for some users. Or perhaps another compromise could be to have a little back arrow to the previous page? Andrew Sheedy (talk) 21:24, 1 August 2024 (UTC)Reply
@Surjection: I have added support for {{tlb}} in previews — see polski#Finnish for an example. @JWBTH: I have added a temporary lock which takes effect as the popup is moving down and prevents the issue you pointed out. @DIV: Yes, the code for scraping Wikipedia articles is extremely simple as I decided not to implement section previews. The main issue was actually dealing with interwiki links, which are so complex that there are probably many edge cases that aren't handled properly. The new version seems to be working okay for now... Ioaxxere (talk) 03:02, 4 August 2024 (UTC)Reply
Minor procedural note that if this is going to be on by default, the little "on by default" text (as seen in e.g. MediaWiki:Gadget-RhymesAdder) should be added to the summary. (So that people can tell which gadgets they've turned on vs we've turned on for them, and so they can tell which ones to turn back on to get back to 'default settings' if they have to turn all their gadgets off to troubleshoot cases of one gadget not playing nicely with another.) - -sche (discuss) 19:57, 5 August 2024 (UTC)Reply
  Done. SURJECTION / T / C / L / 21:39, 5 August 2024 (UTC)

Plural noun/adjective/verb etc. categories


Recently "noun plural form" categories have been deleted. While I understand this, it also raises the question of what to do with certain languages that do distinguish a special "plural" form that is not a simple inflection, but rather a derivation.

Some examples of this I know of are:

  • Afar - forms its noun plurals in a myriad of unpredictable ways, including various suffixes and root changes, with multiple possible per word (cf. búuk m (book) which has the plural forms buukitté f, buukwá f and abwáak m). Note also how the gender is not connected to the gender of the singular.
  • Creek - forms its noun plurals only for a handful of words, non-productively, but through a range of different suffixes (cf. wvcenv (white man) and wvcenvke but eppuce (son) and eppucetake). Creek does not have declension, but it has possessive inflection. Furthermore, Creek also features dual and plural verbs, which are also all derived but semantically related.
  • Tokelauan - has plural verbs for a small number of verbs, e.g. tautala and tāutatala. Again, unpredictable and not regularly formed.

Placing these in the "noun forms" or "verb forms" categories does not seem correct - there are many other forms that are part of a regular paradigm (with the exception of Tokelauan, where you don't generally have inflection at all). On the other hand, these are not lemmas, they are hosted under a lemma entry in print dictionaries and not considered a separate word by the speakers.

How should we handle these cases? Thadh (talk) 11:02, 26 July 2024 (UTC)Reply

None of the facts mentioned sounds like it absolutely compels treating these forms as derivations rather than inflections. So if speakers and other dictionaries treat them as non-lemmas, I don't see why we should do any differently. Languages can have multiple plural forms or plural forms that have a separate grammatical gender from the singular form (e.g. Italian braccio, braccia and bracci). There can be cases of inflection that are non-productive and only marked on a small number of words: e.g. in English be has some special inflected forms that don't exist for other verbs, like the distinctively first-person "am". I can see how you might argue that Muscogee/Creek plurals are derived rather than inflected forms (by analogy, pairs of nouns like "actor/actress" are not treated as forms of a single noun inflected for grammatical gender, but as separate lemmas), but it doesn't sound completely clearcut to me.--Urszag (talk) 11:25, 26 July 2024 (UTC)
@Urszag: Italian, unlike Afar and Creek, doesn't have any other type of noun inflection, so there is no need in a separate category. The latter two, however, make a clear distinction between case/possession and plurality. This is why I'm not sure it's a great idea to just dump all forms into a single category. Thadh (talk) 11:30, 26 July 2024 (UTC)Reply
It seems no different to me since all forms are dumped into a single category just as much in languages where plurality is uncontroversially inflectional, such as Latin (with Category:Latin noun forms). The considerations mentioned at Wiktionary:Beer_parlour/2024/May#get_rid_of_noun_and_adjective_plural_form_categories_once_and_for_all seem to apply about the same: these are said to be "trivial category intersections" per Theknightwho and users are expected to know how to use the intricacies of the search system to find them (I know how to search for categories using "incategory", but I'm not sure how to make a search find only pages that use Template:plural of).--Urszag (talk) 11:47, 26 July 2024 (UTC)Reply
@Urszag: Again, I think you misunderstand the point. There are languages out there that make a distinction between inflection and plural formation, as two distinct processes, with the plurals inflecting just like singulars do. Latin does not make this distinction - nominative plural is just another form of the lemma. The languages I mentioned above do - abwáak is not a "predicative plural" of búuk, it's the plural, with its own, proper inflections. I think it's important to reflect this distinction in our categories. Thadh (talk) 13:19, 26 July 2024 (UTC)Reply
You're right that I don't understand the point that you're making. When languages inflect nouns for categories other than number, it's normal for those inflections to be orthogonal to number. I don't see how that is relevant to the question of whether number marking is or isn't inflection. Latin singular nouns inflect for case (nominative, accusative, dative, etc.) and Latin plural nouns have separate forms, inflected likewise for case: e.g. it isn't really incorrect to say that inflecting equī for accusative case gives you equōs. So hypothetically, you could treat Latin plural noun forms as a separate lemma with their own parallel paradigm (compare fraces), although it isn't traditional to do so, and since plurals are systematically distinguished and generally predictably formed in one way, it's pretty convincing to treat plural forms as part of a unified paradigm with singular forms, which are all lemmatized by convention at the nominative singular form. If number marking in non-Latin languages is not inflectional, the reason for adopting that analysis won't just be because plural nouns inflect for case (or some other category) the same way as singular nouns do: all that means is that the case markers are agglutinative rather than fusional in relation to number marking. Inflection can be agglutinative as well as fusional. Turkish marks plural nouns with an inflectional suffix, and also inflects them for case with the same endings as singular nouns (e.g. nominative singular ev, ablative singular ev-den; nominative plural ev-ler, ablative plural ev-ler-den).--Urszag (talk) 14:33, 26 July 2024 (UTC)
@Urszag: Okay, let me rephrase.
  • English has the words "tree", "grove" and "forest".
  • Each of these words inflects for the genitive: tree - tree's, grove - grove's, forest - forest's.
  • Now imagine English forgoes the need of number marking for suppletion: instead of tree - trees you get tree (singular) - grove (paucal) - forest (plural).
I am proposing that we don't handle "grove" and "forest" as the same type of noun form as we do "tree's" and "grove's".
Now you'll say: "But Thadh, this is suppletion, not derivation! There is no common stem!" - but what about things like caravanist - caravan, grape - grapevine, rose - rosery? I'm sure there are better examples.
Now, do you agree that this system is fundamentally different from Latin or Slavic languages? It's not about agglutination, Afar isn't agglutinative, and Creek is bordering on polysynthetic - I work with Uralic languages which are about as agglutinative as can be, but they all form their plurals within a paradigm, rather than outside of it.
The Afar or Creek plurals simply don't form a paradigm with the singulars - they form a semantic pair. Thadh (talk) 14:52, 26 July 2024 (UTC)Reply
I just don't think any facts purely about the form of such words are relevant to the question of whether they are derivational or inflectional. I think I would agree about such plurals being derivational/lexical if singular and plural nouns don’t show any distinct syntactic behavior. For example: do morphologically plural nouns act as triggers of plural agreement on other words, such as verbs/adjectives/determiners? When distinct plural forms exist, is their use obligatory in contexts where the noun is semantically plural, or can the non-plural form be used in this context (as it can for nouns that have no distinct plural form)? That is, is a word like eppuce specifically singular, or undefined for plurality? In the case of English, one reason for categorizing "people" as a suppletive plural form of "person" is that it is practically required for most speakers in natural speech to replace one with the other depending on the context, e.g. "one person" alongside "three people": this suggests that they do not function as distinct words but as distinct forms of one word.--Urszag (talk) 03:39, 27 July 2024 (UTC)Reply
@Urszag: Creek doesn't have a third-person plural marker at all. But yes, the plural is obligatory when present and a numeral is used. I don't think that can be taken as evidence for anything though: English also has singular-only nouns, like "one news" being correct but "two news" being nonsensical.
Afar has, next to the singular and plural, also the collective and the collective plural, which are just as related to the singular and plural as those are to each other, but take the singular and plural agreement respectively. Some nouns have collectives, others don't, some nouns have plurals, others have both plurals and collectives, and a third one has plural collectives but no plurals.
Tokelauan is a bit of a case apart, since it doesn't really have any other kind of morphology at all. Thadh (talk) 12:56, 27 July 2024 (UTC)Reply
@Thadh: Which of the criteria for inflection vs. derivation do these plural formations break? --kc_kennylau (talk) 15:19, 26 July 2024 (UTC)Reply
@Kc kennylau: If you're referring to Haspelmath & Sims (2002), as you did on Discord, then many of the "differences" between inflection and derivation as given there are pretty nonsensical: Anything forms a "new concept", including inflection, anything is "limited" by semantics and imagination, and while plurality here is more concrete than the other types of inflection, that's true for any language with plural forms, including Latin.
I think I've stated it above pretty well: The plural forms in these languages do not form a paradigm, they form a semantic pair (and in the case of Afar, which can have collectives and plural collectives, a whole semantic cluster). In the case of Creek and Tokelauan, these are also closed classes, unlike inflected forms (with Afar it is slightly more open). Thadh (talk) 15:31, 26 July 2024 (UTC)
I'm gonna add to this just to say that I have discussed this with Thadh on Discord and fail to see why we need to treat Afar plurals as derivations rather than inflections. They appear to work much like broken plurals in Arabic, which we (rightly IMO) treat as inflections not derivations. I can't speak to Creek or Tokelauan as I haven't looked into them. Benwing2 (talk) 17:58, 26 July 2024 (UTC)Reply
Spitballing: for Creek and Tokelauan, if only a small number of words have plurals, it might be useful to categorize the main entries (the singulars?) as "Creek nouns with plurals", "Tokelauan verbs with plural forms" (or whatever better name we might decide on), similar to how we categorize Category:English nouns with irregular plurals, Category:Arabic nouns with long construct singular, etc. Alternatively (or additionally), I am not personally against categorizing "Tokelauan plural verbs" or something, and I could even get behind allowing individual languages with unusual pluralization situations to have "Foobar noun plural forms" added by the headword/definition templates systematically, if other people think it's appropriate; I just don't think most languages need such categories, and don't think they should be haphazard and often(?) manually added like before. - -sche (discuss) 22:00, 26 July 2024 (UTC)Reply
I agree with the last point, I think we need either some leeway for these kinds of languages or another way to handle them while keeping the plurals distinct in the categories. But indeed using "noun plural form" for Latin or Finnish makes no sense. Thadh (talk) 22:05, 26 July 2024 (UTC)Reply
No objection to this approach (i.e. to something like "Creek verbs with plurals" for the lemmas and/or "Creek plural verb forms" for the non-lemma plurals). Note that for example we categorize both the English nouns with irregular plurals and the plurals themselves. Also cf. 'LANG comparative adjectives' and 'LANG comparative adjective forms'. These can be implemented as lang-specific categories. Benwing2 (talk) 22:07, 26 July 2024 (UTC)Reply
Either of those sounds fine to me too.--Urszag (talk) 03:39, 27 July 2024 (UTC)Reply

Vote now to fill vacancies of the first U4C

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Dear all,

I am writing to you to let you know the voting period for the Universal Code of Conduct Coordinating Committee (U4C) is open now through August 10, 2024. Read the information on the voting page on Meta-wiki to learn more about voting and voter eligibility.

The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. Community members were invited to submit their applications for the U4C. For more information and the responsibilities of the U4C, please review the U4C Charter.

Please share this message with members of your community so they can participate as well.

In cooperation with the U4C,

RamzyM (WMF) 02:47, 27 July 2024 (UTC)Reply

Mass adding of request templates by User:Akaibu

User:Akaibu has been mass-adding a bunch of "request" templates to entries, such as thermal rocket (diff), drotebanol (diff) and even alternative forms such as wakeel.

I think this type of editing is distinctly unhelpful. Request templates should only be used where the request seeks to fill a real gap in our entry (e.g. only use {{rfp}} on a term whose pronunciation is not obvious, or {{rfquote}} where the context or usage of a term is not clear). Worse, alternative form entries do not need their own pronunciation and etymology sections if they are merely alternative spellings, so these should definitely not be requested.

I'd like to mass revert all these edits. This, that and the other (talk) 02:48, 27 July 2024 (UTC)Reply

I agree. There are some helpful additions as well (e.g. this diff also added Category:en:Tools), so any mass-revert should make sure to keep things like that. Theknightwho (talk) 03:00, 27 July 2024 (UTC)Reply
I've manually removed all but the rfquote templates from the alternative form entries I've done, as those are still valid for those entries. As for the others, I don't think they should be reverted on the basis of "being obvious", as what's obvious to a native speaker may not be obvious to non-native speakers. Akaibu (talk) 03:32, 27 July 2024 (UTC)Reply
@This, that and the other, Theknightwho, Akaibu: Adding {{rfp|en}} to drotebanol seems fine to me, since its pronunciation isn't all that obvious (I assume /dɹəʊˈtɛbənɒl/, for the record). However, I reverted the edit to thermal rocket; I hope you will all understand and agree with my edit summary. 0DF (talk) 06:26, 27 July 2024 (UTC)Reply
Yes, actually this is a good point. On looking at the non-altform terms to which Akaibu added {{rfp}}, I wouldn't know how to pronounce many of them! But the {{rfquote}} was not really warranted in my opinion. This, that and the other (talk) 13:38, 27 July 2024 (UTC)Reply
A bit of explanation about the request templates and where/when they are appropriate would be useful in WT:FAQ or even in WT:EL, because I can imagine that many newcomers may assume that request templates are generally useful additions, especially when creating new bare-bones entries themselves. Also, the documentation of each individual request template, e.g. Template:rfap, could explicitly mention in the "Usage" section that they are not always desirable. --Ssvb (talk) 13:12, 27 July 2024 (UTC)Reply
A little perspective: there are literally millions of entries where one or more requests could be appropriate in theory, but a very small number of volunteers who actually work on filling requests. The idea of the request templates is to let those volunteers know that someone is interested in particular entries so they can set their priorities. Mass adding of requests defeats this: if everything is a priority, nothing is a priority. Some requests are added by people who genuinely want or need to know- why should the volunteers ignore those in favor of the mass-added ones? Chuck Entz (talk) 15:18, 27 July 2024 (UTC)Reply
Re: requests for quotes, especially for unique strings like swamp Spanish oak, let's teach: User:Akaibu, if you go to Special:Preferences#mw-prefsection-gadgets and turn on "Quiet Quentin (QQ), a gadget assisting in finding citations (quotations)", it adds a "QQ" tab to the top of each page, which you can use to easily search and copy-paste quotes from Google Books. Use this to find and add quotes yourself! Or if you can't find any quotes: maybe the term doesn't meet CFI and needs to be RFVed. :) - -sche (discuss) 17:36, 27 July 2024 (UTC)Reply
Re: requests for pronunciation, I second Chuck (and I know Anatoli has said similar), it's a good idea to learn what's a useful request and what's not; multiword terms are usually silly to request pronunciation of because it's usually just "like foo + bar" (and in the past I sometimes fulfilled the requests like that; now I sometimes remove the requests).
More generally, re: how to make the requests categories feel more 'doable', I wonder if we could take some inspiration from Wikipedia: for one thing, they have a bot that dates request tags and categorizes accordingly so we could see which requests were oldest, and work on subcategories divided by month, which might make the whole process feel more like one in which progress was being made, and specific monthly subcategories were being eliminated.
(Another thing they do is, for requests to make something a Good Article, they display how many GAs a user has vs how many GA reviews they've done, so people can prioritize reviewing articles by people who themselves review articles, to encourage people to review articles, but that's probably not a good fit for us, nor easy to implement.) - -sche (discuss) 16:50, 27 July 2024 (UTC)Reply
i was told QQ is shit and shouldn't be used Akaibu (talk) 19:14, 27 July 2024 (UTC)Reply
QQ is better than nothing. It makes finding and creating quotations easier. But its information is unreliable and often inaccurate, so it can't be blindly trusted, everything needs to be double checked. The culprit is the Google Books service, which is used under the hood. If Google Books eventually improves, then QQ will improve as well. --Ssvb (talk) 14:15, 28 July 2024 (UTC)Reply
Ah, if you haven't yet developed a sense of whether the metadata Google Books generates is plausible, you'll need to learn to check/add quotes manually. Still, I encourage you to learn how to pitch in helpfully and don't just rely on other people to do it for you! :) - -sche (discuss) 16:34, 28 July 2024 (UTC)Reply

'The ick' and 'boop' newest entries in dictionary

BBC News story about updates to the Cambridge Dictionary. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:52, 27 July 2024 (UTC)Reply

Apabhramsas

This is a continuation of this discussion, which raised the long-proposed idea of merging the late Middle Indo-Aryan lects referred to as Sauraseni, Gurjara, Takka and Vracada Apabhramsas into one while preserving the lect specification in the form of labels, similar to the Prakrits. All the Middle Indo-Aryan editors have been notified, but I'm bringing this up in a more visible community forum before making any kind of major change. Svartava (talk) 17:33, 27 July 2024 (UTC)Reply

Has been done. Svartava (talk) 12:13, 3 August 2024 (UTC)Reply

Block of User:Purplebackpack89 by User:Theknightwho

The following discussion has been moved from the page Wiktionary:Beer parlour/2024/July.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Earlier this month, User:Purplebackpack89 was blocked for three days by User:Theknightwho.

Purplebackpack89 wrote in response to this block, "Blocking somebody for criticizing a close, or an undo of a close, is highly inappropriate. Blocking somebody for feeling victimized is 1984 territory. Criticism is disruption and criticizing an admin should NOT be a blockable offense". I am not seeing a basis in policy for this block. In response to my inquiry, Theknightwho wrote, "It was not due to this incident in isolation". I would like to see diffs demonstrating the overall behavior meriting such a consequence. bd2412 T 01:46, 28 July 2024 (UTC)Reply

@BD2412 See this diff where I explained to PB89 in detail why I blocked them at the time. In terms of diffs:
  1. PB89 would make a habit of claiming to be harassed/victimised, while simultaneously badgering other users on their talkpages, often forking pre-existing discussions to single-out certain things people have said ([8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] - and these go on, and on, and on - I'll provide more if you want). Any of these in isolation would be fine (for the most part), but the totality paints a picture of low-level intimidating behaviour. This diff from May, while not directly relevant to the block, is illustrative as to how bad it can get ([20], And I'm going to send you another email. They ARE out to get me.).
  2. PB89 frequently assumes bad faith in discussions while demanding good faith from others, often while making excuses for their own bad behaviour (which @Benwing2 can certainly attest to): [21] [22] [23] [24] [25] [26]. Again, this is only a small sample.
  3. Specifically on RFD, a few days earlier PB89 had removed a nomination by me ([27]), then speedy-closed it ([28]), which was obviously disruptive and unacceptable.
Also note that just after that block expired, PB89 then carried on in exactly the same way, to the point that Benwing2 had to threaten a one week's block if they kept doing it ([29]). Thankfully, this did seem to make a difference, as I haven't noticed any disruption from PB89 since then. Theknightwho (talk) 02:23, 28 July 2024 (UTC)Reply
So, let's review what's going on in what Knight is saying and what logical conclusions can be drawn:
  1. It's OK to block somebody if they feel harassed
  2. It's not acceptable to ask questions about changes or reverts other editors make to your edits
  3. If an editor thinks your actions make them uncomfortable, ignore how they feel, and not only keep doing what you're doing, escalate things, like you did before in a manner that resulted in multiple desysop requests.
  4. Even if you're acting in bad faith toward that editor, call him out if he notices you're treating him that way
  5. It's OK to block somebody for behavior that occurred as a result of your escalation and baiting
So, what I'm basically saying is, the way TheKnightWho is behaving, capped by his ridiculous comments above, demonstrates that his actions toward me are YET ANOTHER in his long, sordid history of escalation and combativeness, and, if anything, the diffs provided say more about HIS behavior than mine. Furthermore, if the rationale he seems to be using for this ridiculous block is applied more broadly, the project is headed for a dark future indeed. Purplebackpack89 02:48, 28 July 2024 (UTC)Reply
Okay, I can see that we have a problem here. I still tend to think the block was excessive, but there is also clearly a pattern of behavior on the part of Purplebackpack89 that could reasonably be seen as provocative. Now, of course, this isn't Wikipedia, but over there we have an Administrators' noticeboard where problematic patterns of behavior that are not purely vandalistic or promotional or the like are subject to community consideration rather than individual action. I also think that, even with the smaller scale of participation in this project, neither any admin nor any editor should be under the impression that either problematic behavior or questionable RfD nominations require their specific intervention to be correctly resolved. I think the following measures should be considered here:
  • An interaction ban between the parties, and
  • A limitation on Purplebackpack89 from participation in RfD discussions beyond initially expressing their opinion in such a discussion, and responding directly to questions or comments directed to them.
The latter would include user talk page confrontations regarding the substance of a discussion. bd2412 T 03:44, 28 July 2024 (UTC)Reply
@BD2412 I just want to reiterate one of the points that I made in my initial post on PB89's talkpage following the block, where I pointed out 4 separate instances that I could see on that very page (at the time) of them accusing an admin of bad faith for having tried to manage their behaviour. From a 5 minute check, I can find repeated interaction issues with no less than 6 admins going back years (me, Benwing2, Metaknowledge, Equinox, Kephir and Daniel Carrero), and I'm sure I could find a few more if I went through in detail; in at least one case, there was an interaction ban in place (Metaknowledge).
The point being that PB's modus operandi seems to be to make accusations to poison the well against any criticism of their behaviour, making it impossible to manage how they behave unless you want to be subject to a barrage of personal attacks and accusations of bad faith. Even in PB89's response here, you can see more instances of exactly the kind of things I was pointing out in the first place. In that context, I'm really not sure that an interaction ban would be productive, because they'll just carry on in exactly the same way, only to a different admin instead.
I should also add that this issue is pretty stale, too: it was 3 weeks ago, and there haven't been any issues since that time. Theknightwho (talk) 04:08, 28 July 2024 (UTC)Reply
Likewise, Knight, I could point out for a period of several hours, approximately 6:30 PM-9:15 PM PDT or 1:30-4:15 UTC, you've made no contributions except to collect diffs and respond to this thread.
Or, Knight, I could point out that you also have problematic behavior going on quite a while and involving a lot of editors other than myself.
The more you keep talking about me, the more you prove BD and my point that you seem to have trouble getting along with me. You seem to have completely ignored what BD said that it doesn't have to be YOU specifically who deals with any one editor or problem. You lack any introspection in a thread ostensibly about YOUR behavior.
Interaction ban needs to start now. RfD restriction is excessive and unnecessary. Purplebackpack89 04:23, 28 July 2024 (UTC)Reply
@Purplebackpack89: It also does not have to be you, specifically, that responds to perceived deficiencies in RfD, beyond what can be said in an initial vote in response. My sense is that you do try to provoke editors, and admins in particular, whom you believe to be in the wrong. Perhaps that is not your intent, but that is the impression that you are creating.
For the record, I am not suggesting any unilateral imposition of restrictions. I am leaving it to the community to determine whether my impressions are correct, and warrant the remedies proposed. bd2412 T 04:36, 28 July 2024 (UTC)Reply
No. I do NOT try to provoke. Why do you think I do, what's your evidence? Rather a bad-faith claim, no?
Telling another editor you disagree with them, that there is additional information they should consider, or that you feel put upon should NOT be treated as a provocation, nor should it be treated as grounds for a block or editing restrictions.
This goes back to something you or I said earlier: that we shouldn't live on a project that is intolerant of criticism of the admins. And a related issue is that there ARE admins who respond to questions or criticism incredibly poorly. Kephir was one, and Knight is another. Purplebackpack89 04:52, 28 July 2024 (UTC)Reply
PB89, the only reason why I am talking about you is because I was specifically asked to justify why I blocked you three weeks ago, so you cannot possibly argue that that demonstrates I have trouble getting along with you. Please think about the implications of what you're saying. I also think you're being quite misleading here: you know very well that the issue was not "criticism", but the fact that you consistently assume bad faith during discussions, as you've been told over and over. Theknightwho (talk) 04:53, 28 July 2024 (UTC)Reply
What, that you don't reflect on your own behavior and your entire contribution to this discussion is a "but Purplebackpack..." with a collection of diffs construing perfectly reasonable criticism in the baddest-faith manner possible? You could have a) ignored this discussion entirely, b) admitted maybe the block was in error, or c) defended your block in a less-winded, less-time-consuming manner. You did none of those things, you completely ignored points people made about your own behavior, and instead spent literally hours better spent on the mainspace edits you usually do defending yourself and attacking me. Purplebackpack89 05:17, 28 July 2024 (UTC)Reply
Literally, Knight, if you applied the rationale you made for blocking me on yourself, you'd have to block yourself too! Purplebackpack89 05:28, 28 July 2024 (UTC)Reply
This isn't Wikipedia. Interaction bans aren't practical because we don't have a massively overburdened maze of bureaucracy like Wikipedia does, so people cannot reasonably be expected not to interact. Given the overwhelming evidence above, I cannot foresee any other viable resolution to this than PBP eventually getting blocked permanently. — SURJECTION / T / C / L / 08:09, 28 July 2024 (UTC)Reply
I support an interaction ban and agree that Theknightwho has shown poor judgement (on several occasions) using admin tools. —Justin (koavf)TCM 16:15, 28 July 2024 (UTC)Reply
@bd2412: I've already made a post about the problematic nature of PBP's behaviour a few weeks ago. PUC 08:17, 28 July 2024 (UTC)Reply
I don't think an interaction ban is a satisfactory resolution. I consider Purplebackpack89 to have exhibited uncooperative behavior in interaction with various other users. I haven't seen any signs that Purplebackpack89 is apologetic or intending to show less belligerent interaction patterns in the future. In this conversation, we see that Purplebackpack89 has already accused bd2412 of making "a bad-faith claim": will we need to add bd2412 to the interaction ban list? The argument that "no one specific editor or moderator needs to be the one to deal with problematic behavior" isn't convincing when it seems like any specific moderators who have done something Purplebackpack89 dislikes are faced with these kinds of accusations.--Urszag (talk) 10:02, 28 July 2024 (UTC)Reply
I agree. This user does not display appropriate behavior. Vininn126 (talk) 13:07, 28 July 2024 (UTC)Reply
Despite the concerns that an interaction ban will not work, we really won't know until we have tried. I would prefer that we take intermediary measures rather than absolute ones. I would also note to Purplebackpack89 that any condescension directed towards TheKnightWho in this thread is poorly considered. I specifically requested evidence of diffs justifying his imposition of a block. While I continue to think that the block was excessive relative to the annoyance, TheKnightWho can not be faulted for taking the request seriously and responding thoroughly. bd2412 T 13:56, 28 July 2024 (UTC)Reply
@BD2412 One other thing I forgot to mention: it's also worth considering that this was the second block imposed in a relatively short timeframe, as Fenakhay had imposed a 1 day block a month earlier for intimidating behaviour/harassment ([30]). 3 days for a second block is quite typical. Theknightwho (talk) 14:03, 28 July 2024 (UTC)Reply
Fenakhay's block was also rather questionable and should be examined for its appropriateness. As I said above, criticism and discussion should NOT be labeled intimidation. A better thing to label intimidation is bad blocks, such as Fenakhay's and Knight's. Making a comment doesn't prevent other editors from editing in the way blocking somebody does.
And it is appropriate to discuss Theknightwho's behavior in this thread. Purplebackpack89 15:04, 28 July 2024 (UTC)Reply
@Purplebackpack89 Genuine question: do you think I don't actually believe any of this, and that I'm just out to get you? In what way am I acting in bad faith here? Theknightwho (talk) 15:25, 28 July 2024 (UTC)Reply
I DEFINITELY think you've had opportunities to disengage from me and have chosen not to (and I'm not the first editor to point out you have a problem with disengagement; even people who supported you holding on to your tools said you have a problem walking away from fights). I think you engaged with me in ways I either can't (not an admin) or don't (I don't patrol your edits except when they are on pages I've created or edited).
Furthermore, it FEELS LIKE you were, and are, fishing for ANYTHING that would justify you blocking or chastising me. Purplebackpack89 15:44, 28 July 2024 (UTC)Reply
@Purplebackpack89 That isn't what I asked. Do you think I don't believe what I'm saying? That's what it means for someone to be acting in bad faith. Theknightwho (talk) 15:54, 28 July 2024 (UTC)Reply
There is no point in trying. Let me make clear here that I will not enforce any interaction bans, since such sanctions are not Wiktionary policy, nor is there significant consensus to institute them as such. I do not expect any other administrators to enforce them either. — SURJECTION / T / C / L / 18:15, 29 July 2024 (UTC)Reply
Wiktionary's lack of a formal dispute-resolution process and meaningful user conduct remedies isn't a good thing. If Wikipedia is bureaucratic, this place is the Wild West. Two-way interaction bans might be the mildest conceivable sanction for questionable conduct and/or interpersonal disputes. It's asking two users with a fractious history to make a conscious effort to limit future interaction in the interest of avoiding further conflict. It doesn't account for nuances (power imbalances, differences in severity of conduct, etc.), but it can create enough of a buffer for productive users who mix like oil and water to work in the same space, provided they're both willing to abide by the interaction ban's terms. Things can get complicated if both editors work in the same topic area, or if they're both particularly prolific commenters in discussions. But TKW and I don't work in the same areas. I don't comment in discussions much outside of things involving my work. TKW has also shown more restraint recently than in the past. I think a two-way interaction ban between us would be feasible and beneficial. For PBP89 and TKW? Less certain, but worth a try, I'd say. WordyAndNerdy (talk) 19:01, 29 July 2024 (UTC)Reply
No. Vininn126 (talk) 19:08, 29 July 2024 (UTC)Reply
What are you disagreeing with, specifically? WordyAndNerdy (talk) 19:11, 29 July 2024 (UTC)Reply
The need for Interaction bans. Vininn126 (talk) 19:20, 29 July 2024 (UTC)Reply
Interaction bans are the lowest setting on the User Conduct Policy machine. I am no longer comfortable contributing in the absence of policy safeguards. Do you want to lose out on my contributions and those of others alienated by Wiktionary's policy vacuum and frequently hostile atmosphere? Do you want Wiktionary to have the kind of policy structure that can support a large and diverse pool of contributors? Or do you wish policy – such as it is – to remain structured toward protecting a status quo favoured by a select few? WordyAndNerdy (talk) 19:37, 29 July 2024 (UTC)Reply

(Links for BD and anyone who didn't see them at the time:) Some background is in this May discussion + this June discussion, the latter already linked. As I said in the first of those, I don't know how to solve the general issue, because (as noted by various editors) both editors behave in ways that escalate the temperature of situations they're involved in, not limited to interactions with each other — so while in any specific situation it may be possible to say "the person more in the right / wrong in this situation is X / Y" (and overall, one editor does more good for the project), and an I-Ban might tamp down on a specific circumstance where each editor's individual issues surface / interface, solving the underlying issues that result in one or the other's behaviour being a topic here every few months is harder. I also worry that I-Bans generally (and here specifically) invite baiting, where one editor or the other comments on a deletion request in a way that doesn't mention, but rebukes known opinions of and inflames, the other. But I'm not opposed to trying it, if other people want to.
More generally, we face a problem Wikipedia also faces, which (as I remarked in the earlier BP discussions) is: if other volunteers don't spend their time monitoring or dealing with someone who makes questionable edits, do we stop the person who does? I noticed this user adding some good but also some malformatted unreferenced/speculative etymologies to entries, so I time-consumingly checked several of their other contribs; in the past, some makers of similar edits have objected to checking as 'stalking', but... if I instead raise "X is making problematic edits" in a general forum, that seems neither friendlier nor (crucially) likely to result in the edits being fixed: I brought this up in a general forum and no-one's had time to fix it. (I suspect it'll eventually be "fixed" by some general-cleanup bot run very reasonably converting the (''chemistry'') nonlabels to {{label}}s, unaware of the RFC thread and the actual solution.) - -sche (discuss) 16:51, 28 July 2024 (UTC)Reply

Checking a particular user's edits should not be considered stalking or anything remotely like that. How else are we supposed to keep an eye on problematic editors? Furthermore their edits are public, anyone can look at them. Vininn126 (talk) 16:57, 28 July 2024 (UTC)Reply
Also, to reiterate: nothing new has happened here. It’s not clear why this issue from 3 weeks ago suddenly warrants an interaction ban, when the only bad interactions I’ve had with PB89 recently have been caused by this very thread, and they’ve been entirely one-sided. @-sche I don’t want to be difficult, but if you’re going to equate my behaviour with PB89’s (even if only in part), I’d appreciate some diffs to actually show that’s warranted; particularly given you’ve said you’re worried about baiting happening in this specific case. I’ve brought receipts to this thread, so if my behaviour is now under scrutiny then I’d appreciate the same courtesy. Theknightwho (talk) 19:05, 28 July 2024 (UTC)Reply
I do not see this as a parallel situation between two problematic editors. In reality, although User:Theknightwho has in the past unnecessarily escalated some situations they've been involved in, more recently they've shown a great deal of restraint, and have not had a problem acknowledging when they've made mistakes. This is not the case for User:Purplebackpack89, who not only has no idea when to back down but (a) seems completely unable to admit mistakes, (b) consistently believes everyone they disagree with is acting in bad faith (i.e. is "out to get" them), (c) has a terrible signal-to-noise ratio in terms of contributions they make vs. trouble they stir up, (d) frequently makes mistakes in their contributions (which completely merits the oversight they obviously loathe) and (e) based on statements made on their user page, does not accept some basic, settled Wiktionary principles, such as WT:Sum of parts. Their general modus operandi, as pointed out before, is to badger and harass anyone who challenges their contributions while accusing anyone who calls them out for this of having a grudge against them and being "out to get" them. Unless their behavior changes dramatically, I think they will end up with a permablock after they have utterly exhausted the community's patience. In general I would ask User:BD2412 to evaluate all the evidence before coming to conclusions about who is in the right vs. the wrong; although I know this was not their intention, the effect of inserting themselves into this issue has been to cause an issue that had died down to flare up again. Benwing2 (talk) 19:47, 28 July 2024 (UTC)Reply
I don't think the situation needs to be parallel for an i-ban to be useful to its resolution. This is not a right vs. wrong evaluation, but one of what will be likely to make the project work most smoothly. bd2412 T 21:15, 28 July 2024 (UTC)Reply
In truth, Purple has had run-ins with a lot of users, basically anyone who has called them out on anything. I don't think we can reasonably create interaction bans between them and everyone else (other than by blocking them). Benwing2 (talk) 22:19, 28 July 2024 (UTC)Reply
Benny, you've made a lot of grandiose and exaggerated generalizations, but you haven't backed them up with a single diff, not one. Some are inaccurate (for example, c, as I've created over 600 entries). Through e, you also seem to be attempting to criminalize suggesting changes to CFI, which any member is welcome to propose. Your comments are inappropriate and should be ignored. Purplebackpack89 22:28, 28 July 2024 (UTC)Reply
First, there's already the diffs, linked by TKW. The evidence is here, why re-show it? (This is not to say that new evidence isn't accepted.)
Second, Ben said that your good edits (which include your entries) are outweighed by your bad ones, not that all your edits are bad in point C.
Third, it's not wanting to change CFI that's the issue, but instead you acting as if your ideas are Wikt rule, and harassing people who don't see it that way. CitationsFreak (talk) 08:10, 29 July 2024 (UTC)Reply
Y'all gotta stop referring to my behavior as harassment. What's REALLY harassing is Knight's questionable blocks, and Benny threatening equally questionable ones. My behavior doesn't prevent anyone from editing the project, Knight's bad block does. Knight has serious problems dealing with other people, and I'm starting to realize Benny may have some too. For example, he got incredibly upset when I suggested he withdraw an RfD that had 5 keep votes and 1 delete vote, and since then his behavior toward me has been rather confrontational. Frankly, he's guilty of some of the things he accuses me of. I am NOT the real problem here, the problem here is that we've elevated people to admin that have no people skills and refuse to hold each other accountable. Purplebackpack89 11:41, 29 July 2024 (UTC)Reply
I call a duck a duck. Multiple people agree that your behavior is unacceptable. You can either not be hypocritical (i.e. expect others to admit to mistakes when called out) and admit that you can be at fault and that your behavior is problematic and change it, or you can continue and ultimately be banned for unacceptable conduct. Vininn126 (talk) 11:58, 29 July 2024 (UTC)Reply
@Purplebackpack89 Could you please provide some diffs of your own to evidence what you've just said? I've provided evidence about your behaviour, so I think it's time for you to step up and do that as well. Otherwise, these are just personal attacks. Quite frankly, you don't get carte blanche to say things like this just because you're upset; you need to back it up. Theknightwho (talk) 16:33, 29 July 2024 (UTC)Reply
Above I linked to your two de-sysop requests. And the diffs you've provided above suggest that you were involved in provoking my supposedly-bad behavior. If you were to look at the diffs you yourself provided to examine your and Benny's behavior, you would conclude that the two of you respond poorly to criticism and can't drop a stick. As for carte blanche, you've seemingly been allowed carte blanche to treat other editors combatively, so you don't really get to talk about other editors' behavior. Purplebackpack89 16:42, 29 July 2024 (UTC)Reply
@Purplebackpack89 Which of those diffs shows me responding poorly to criticism? I don’t see responses by me in any of them. Failed desysop votes from long before we ever interacted are not relevant here, either: you need to provide actual specifics, for once. Theknightwho (talk) 17:44, 29 July 2024 (UTC)Reply
"My behavior doesn't prevent anyone from editing the project": I've already shown beyond a reasonable doubt that your behaviour did play a role in driving out an editor from the project. It is also very offputting to me personally; not to the point that I'd leave just because of you, but if more people behaved like you do I would definitely reconsider my participation here. PUC17:31, 29 July 2024 (UTC)Reply
Behavior does absolutely prevent you from editing, unacceptable conduct is one of the default options when banning someone. Vininn126 (talk) 17:34, 29 July 2024 (UTC)Reply
And it takes two to tango. Perhaps we need to ask ourselves why most of the diffs Knight provided are either interactions between me and Knight, or interactions between me and Benwing. That does not look good for Knight and Benwing. Purplebackpack89 11:55, 29 July 2024 (UTC)Reply
perhaps because other people have decided to simply not bother exhausting themselves by interacting with you? —Fish bowl (talk) 17:47, 29 July 2024 (UTC)Reply
For the love of everything on this godforsaken Earth, can we get a block?! There have been several threads at this point about this user, with several admins stating that they've been problematic and disruptive. Why does it take so long for us to take a stand and block them? Why must other users suffer through cleaning up messes, dealing with threads like these, and way more for admins to be more proactive about blocks? This has been brought up so many times to the point where I'm starting to believe there's some ulterior motive as to why Wiktionary is so block-averse. I'd say that people don't like conflict here, but if that were the case, then we'd limit the number of times we have to go through this. Please, let's put a stop to this chaos. AG202 (talk) 17:52, 29 July 2024 (UTC)Reply
@AG202 Because Wiktionary has a chronic problem with long-term disruptive users accusing admins of conflicts of interest as a way to escape being sanctioned, as evidenced by the fact I have been dragged through the coals yet again for issuing a three day second-block to a user that is widely perceived as disruptive. That is not BD2412’s fault, but PB89’s behaviour in this thread has been appalling.
It’s incredibly obvious this is what’s going on, because the same users with a grudge pop up in these discussions every single time, and seem to feel they can throw around personal attacks with impunity. On Wikipedia, doing that kind of thing on WP:ANI would result in a hefty block. Theknightwho (talk) 17:56, 29 July 2024 (UTC)Reply
@AG202 I also believe that PB89 needs a long-term block due to their consistent unacceptable behavior. Since by the letter of the law I am not an uninvolved admin, I don't feel I should be the one issuing it, but I welcome any uninvolved admin to evaluate the evidence and make up their mind whether such a block is appropriate. Benwing2 (talk) 18:45, 29 July 2024 (UTC)Reply
I am in the same position. Vininn126 (talk) 18:48, 29 July 2024 (UTC)Reply
I have instituted a three-month block. — SURJECTION / T / C / L / 18:54, 29 July 2024 (UTC)Reply
Uh, is there no way someone could be blocked from editing discussion pages and yet be able to contribute to the dictionary mainspace? Inqilābī 20:03, 29 July 2024 (UTC)Reply
Thank you, and @Benwing2: I understand that, so no worries on your end. AG202 (talk) 20:21, 29 July 2024 (UTC)Reply


Requests for interaction bans

I am requesting an interaction ban between myself and Theknightwho, as well as a separate interaction ban between myself and Fay Freak. I'm not going to re-litigate the circumstances that precipitated these requests. The context can be found in the previous BP threads linked by -sche above. This is the bare-minimum policy safeguard I'd accept to feel comfortable contributing here.

It would only be fair to require TKW to be subject to some form of restriction/oversight on issuing blocks if PBP89's RfD participation is to be selectively limited. WordyAndNerdy (talk) 17:48, 29 July 2024 (UTC)Reply

Pinging others who might be interested in considering this remedy with TKW: @Mahagaja, @LlywelynII, @Huhu9001. WordyAndNerdy (talk) 19:07, 29 July 2024 (UTC)Reply
Just FYI, it looks to me like you have pinged people who have had prior conflicts with User:Theknightwho. This comes across as canvassing, which isn't generally allowed. Benwing2 (talk) 20:10, 29 July 2024 (UTC)Reply
This is the fourth time WAN has canvassed uninvolved users who she thinks have personal issues with me (past instances: [31] [32], and these two go together: [33] [34]). There is a clear pattern at this point, and it feels like a way to manufacture consensus to further a personal grudge. Theknightwho (talk) 20:16, 29 July 2024 (UTC)Reply
More canvassing that's just occurred: [35]. Theknightwho (talk) 23:05, 29 July 2024 (UTC)Reply
That isn't "canvassing." BD2412 is already a participant in this discussion. In any case, Wiktionary appears to have no prohibition on "canvassing." To break away from unproductive tit-for-tat wikilawyering, are there topic areas you'd prefer I minimize contribution in to decrease chances of incidental run-ins? My main beats are fandom slang and LGBT-related terms. WordyAndNerdy (talk) 23:13, 29 July 2024 (UTC)Reply
This is not canvassing. There's no vote taking place, and as WAN said, BD has already participated in this discussion. And please, what I've said before today includes you too. There's no need to continue discussions like this; it doesn't help your case. AG202 (talk) 23:30, 29 July 2024 (UTC)Reply
Okay, then it's smearing me on someone's userpage in an attempt to manufacture consensus and discredit me. I don't really care what word we specifically use for it. Alright - I'm out. Theknightwho (talk) 23:39, 29 July 2024 (UTC)Reply
To be fair, you do have a personal stake in this discussion, and thus AG202's request above seems a bit misplaced. What topic areas would you prefer I limit contribution in to avoid the chances of us bumping into each other as part of the two-way interaction ban? WordyAndNerdy (talk) 23:47, 29 July 2024 (UTC)Reply
I don't think that's necessary, as no issues have come up in the time since you returned to editing. Theknightwho (talk) 23:52, 29 July 2024 (UTC)Reply
From your end, perhaps. From my end, the issue has been ongoing and unresolved for over a year now. No remedy has been forthcoming during that time. The community doesn't seem to have the stomach for it. You've still got the power to implement retaliatory blocks. And I've made it clear that I will not return as a contributor without bare-minimum user-conduct protections. Making an effort to stay away from each other seems to be the best option available here. WordyAndNerdy (talk) 00:05, 30 July 2024 (UTC)Reply
You know what? I'm not invested in this project enough to struggle and plead to contribute to it. There are better ways to spend my time than having to push through hostility and indifference to do free work. Wiktionary desperately needs some form of dispute resolution process and user conduct policy. But I'm tired. You cannot bring change to those who don't want it. You can't find agreement with those who only see the errors of others. WordyAndNerdy (talk) 06:46, 30 July 2024 (UTC)Reply
@Benwing2: "Canvassing" would mean instructing people to participate in this discussion and telling them what to say. I did neither. I simply signposted. Not everyone has this page watchlisted. And treating questions over TKW's admin conduct as the isolated grievance of a single problem user (Dan Polansky, PBP89) rather than part of a larger pattern affecting a broader group of users is how we got here.
@TKW: TKW stated above that PBP89 has a "habit of claiming to be harassed/victimised." I could provide numerous examples of TKW using similar accusations to shut down scrutiny of his own conduct. I'm not interested in personally re-litigating TKW's conduct. I'm interested in pursuing a policy remedy that could allow us both to contribute in relative harmony. WordyAndNerdy (talk) 20:29, 29 July 2024 (UTC)Reply

Maybe someone will think this is a bad move (by an editor who's 'involved' [commented] in the discussion, no less) and undo it, but I am boldly closing this (and T:archive-top is the closest template I can find for that purpose) on the grounds that it's accomplished about as much in the positive direction as it seems like it's going to, PBP has been blocked, and especially with the ill-advised pings, the discussion is just reopening old disputes. Since multiple editors (coming at this from different angles or 'sides') above have said they're not interested in relitigating old conduct, a separate/new discussion which avoids focusing on specific editors, about the general question of "Should we introduce interaction bans?" and would they be workable, might be productive (inasmuch as the biggest impediment to implementing them in any specific case is that they aren't A Thing). - -sche (discuss) 20:58, 29 July 2024 (UTC)Reply

You are overstepping here. I have requested interaction bans between myself and TKW, and between myself and Fay Freak. Others may be similarly inclined to seek interaction bans. Such measures are sometimes necessary to allow wikis to be safe and accessible to all contributors. Sweeping this under the rug as is Wiktionary's usual wont will only drive a major systemic issue deeper. I'll also note that I haven't found a prohibition on "canvassing"/signposting in Wiktionary policy (including under WT:VP, where one might expect to find it). WordyAndNerdy (talk) 21:19, 29 July 2024 (UTC)Reply
Not really related to the original thread, as -sche mentioned. Vininn126 (talk) 21:21, 29 July 2024 (UTC)Reply
Problem solved. WordyAndNerdy (talk) 21:23, 29 July 2024 (UTC)Reply
@WordyAndNerdy I am pessimistic about the outcome. Wiktionary admins do not even respect their own policies such as wt:Blocking policy. They pretty much enjoy the feeling that they ARE the rules, so I am not really surprised to see they immediately picked up "canvassing", which is not based on any rules or policies, to incriminate and shut up the opposite side. Although I would support a new user conduct policy, I doubt it will make any difference.
But thank you for all your efforts anyway. -- Huhu9001 (talk) 11:25, 30 July 2024 (UTC)Reply

Language code for Baltic German

I would like to request adding a language code for Baltic German on this platform. A lot of Estonian terms (and Latvian terms) are derived from Baltic German and there's currently no real way of displaying that, other than Baltic {{der|et|de|<term>}}, which not only looks ugly, but is also wrong. It categorizes the term to [[CAT:Estonian terms derived from German]], which is incorrect, as there is a clear distinction (at least in Estonian) between terms derived from (High) German and the Baltic German dialect spoken here. As such, the code could also be etymology-only. In essence, the Baltic German dialect is a vernacular dialectal form of a mixture of High and Low German with a clearly recognisable regional flavour (Estonian and Latvian dialects) in pronunciation, morphology, syntax and vocabulary. EKI (Institute of the Estonian Language) has an online dictionary of Baltic German, with a myriad of sources for various terms: https://arhiiv.eki.ee/dict/bss/. I feel like having a language code for Baltic German is justified. Joonas07 (talk) 19:31, 1 August 2024 (UTC)Reply

My comprehension of German is pretty rudimentary, but from my understanding, there is a distinction with Baltic German and central European German varieties, so I agree that this is justified. Joonas, do you know to what extent this is also true for Latvian or even Lithuanian? Danke. —Justin (koavf)TCM 19:37, 1 August 2024 (UTC)Reply
Lithuanian barely has any influences from Baltic German, if at all. Are you asking whether this distinction exists the same way in Latvian? I'm not extremely familiar with Latvian, but I believe both of these languages have been influenced the same way by Baltic German, as the history is the same. A quick look at [[Category:Latvian terms derived from German]] as well makes me believe that is the case. Joonas07 (talk) 20:21, 1 August 2024 (UTC)Reply
Ja, that was my question. I figured that the influence wouldn't be as strong due to the Polish–Lithuanian Commonwealth. —Justin (koavf)TCM 20:23, 1 August 2024 (UTC)Reply
vernacular dialectal form of a mixture of High and Low German with a clearly recognisable regional flavour – this is fiction. It is either Standard High German with Baltic characteristics in vocabulary (e.g. Burkane, but admittedly they suffice for whole dictionaries which maintain borrowings from Low German and the local Baltic language) or it is Low German, with Baltic-influenced accent. I also speak German with Slavic twang due to speaking Russian, doesn’t mean I have created a new dialect or creole. Most commonly it is High German like “Austrian German” is High German, or just German. w:de:Baltisches Deutsch knows that even around 1600 High German “setzte sich durch” prevailed over Middle Low German – which seems exaggerated to me, but perhaps only by half a century, and Middle Low German ends in 1650 precisely at the point of being supplanted by High German for cultivated and literary purposes –, and then around the first half of the 19th century the academic upper class was oriented towards “a trim Standard German”. But Baltic Germans were only upper class, so there is no third language for even diglossia to fit in.
Baltic German should be no more than a label of German and occasionally Low German for any traces of it remaining; a distinction between (High) German and the Baltic German dialect spoken [in Estonia, Latvia, in St. Petersburg or the Baltics in general] is incorrect, as it was not present for speakers. It was like Euro-English in Brussels: behind it in Brussels there is Flemish and French, and in Reval (now Tallinn) and Dorpat (now Tartu) Estonian and, on another level, Russian. Fay Freak (talk) 20:35, 1 August 2024 (UTC)Reply
I would argue the Baltic German varieties developed enough from their High or Low German origins to warrant an etymology-only code. There is a noticeable difference between you speaking German in a Russian accent, and German settlers in Estonia and Latvia speaking a variety of their language for hundreds of years. I definitely disagree that the distinction wasn't present for the speakers. Besides, that isn't even that important, as the distinction is present in target languages. Joonas07 (talk) 20:55, 1 August 2024 (UTC)Reply
German settlers … variety of their language There you have it, their language of the mainland.
The distinction is not present in the remaining sources either. It must be in use and not merely by declaration in target languages. Its texts look like Standard German texts with peculiar words we of course seek. E.g. the sentences quoted in Wörterschatz der deutschen Sprache Livlands – there aren’t actual dialect dictionaries. And all quoted on w:de:Baltisches Deutsch.
Of course only a single word suffices for a Baltic German's speech to be marked and ridiculed further west, which they themselves did not expect, since they only knew one German, Standard German, not a diglossic situation of local Standard German plus dialect as is now known from Switzerland and Arabic countries, hence the quoted Harry Siegmund from Liepāja writes about his stay in Königsberg, because of the sensitive nationalist climate in Germany: “Ich schwieg auch, weil ich fürchtete, mit meiner baltischen Sprechweise als Fremder aufzufallen und ihnen in jeder Hinsicht unterlegen zu sein.” – “I was silent for fear of raising attention as a foreigner due to my Baltic mode of speech and being outgunned in every way.” It was a mode of speech. This is the situation in its last 150–250 years. Going further back, the variance is within the standard variance of all Early New High German and Middle Low German. For a 16th-century text it is highly problematic to e.g. claim it specifically Swabian or Category:Alemannic German language instead of Early New High German, which was just developing as a standard. And then in the Baltics you don’t even have a solid basis of untarnished dialect speakers because the peasants spoke Estonian and Latvian, and “German” were those who worked in administration and churches and their language—occasionally also a Russian, an Englishman, or a Swede, your target language sources may generalize it—, quite different also e.g. from the Volga German situation, which were homogenous German societies with little if any Russian or Turkic etc. encroachment until Sovietization.
This also does not mean though we can’t have “Baltic German” as an etymology-only language. As I implied with Austrian German, we can have a code; I probably would have added it myself if I had cared enough about the variety. For your purposes you should know it is still German however. Reminds me a bit of the pedants amongst Hungarian editors who liked to be sure whether a Hungarian word is borrowed “from German”, “from Austrian German” or “from Bavarian”. Linguistic works vary in the declaration. There is no actual idea behind such questions. Fay Freak (talk) 22:28, 1 August 2024 (UTC)Reply
Yeah, I didn't mean it in the sense that it has developed into a language of its own right, rather that the variety of Standard German that is Baltic German has developed far enough to be notable. Didn't quite understand what you're getting at in the second half of your third paragraph. Re: Hungarian, it doesn't hurt to be exact. I don't know what you mean by "there is no idea behind such questions". For Estonian, it is often significant whether a word was borrowed from German or Baltic German. Joonas07 (talk) 23:30, 1 August 2024 (UTC)Reply
This significance I understand. It might be from the historical German there or it might have intruded into the standard from present Germany, or even its predecessor Reich. Similarly one judges whether a word entered Ethiosemitic from Egyptian Arabic or Yemenite or Ḥijāzi usage, but all under one Dachsprache. I feared that you tried to introduce a distinction that is impossible to make out, exaggerating diglossia. The Hungarian etymology statements are more fanciful than reliable in this respect. Fay Freak (talk) 23:52, 1 August 2024 (UTC)Reply
What is your preferred code for Baltic German then? de-BAT? de-BLT? There seem to be no codes for geographic regions comparable to ISO 3166. Region code means something different. Fay Freak (talk) 22:37, 1 August 2024 (UTC)Reply
That's a good question. Does it have to be in the format <language code>-<REGION CODE> to be an etymology-only code? Joonas07 (talk) 23:32, 1 August 2024 (UTC)Reply
@Joonas07: There is no rule, compare the list of etymology-only languages in WT:LOL/E, so I can only inquire about preferences. Fay Freak (talk) 23:52, 1 August 2024 (UTC)Reply
I don't really know. de-bal maybe? ger-bal? The list you linked seems to indeed have various different formats (btw, I really enjoy the abbreviation WT:LOL): some are just three-letter codes, but I don't know if there's an intuitive one for Baltic German that's not already in use. Some start with gsw-, which I gather is for High German varieties? So there seem to be some conventions. I'm open to suggestions. Joonas07 (talk) 00:23, 2 August 2024 (UTC)Reply
@Joonas07: de-cle For Curonia, Livonia and Estonia, because they called their storage-chambers by Proto-Slavic *klětь and we store information here. I will go to sleep now, before implementing it. Fay Freak (talk) 00:43, 2 August 2024 (UTC)Reply
Let's just do de-bal. That's analogous with other language varieties not from a specific country. Joonas07 (talk) 00:57, 2 August 2024 (UTC)Reply
  Done, @Joonas07. Fay Freak (talk) 11:25, 2 August 2024 (UTC)Reply
@Joonas07: I have added the online dictionary of Baltic German for you as a reference template, {{R:de:BSS}}. Most of the dictionaries and whole sentences quoted from Baltic German therein are in Standard German. Schiller-Lübben is for Middle Low German. Fay Freak (talk) 23:04, 1 August 2024 (UTC)Reply
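
As a rough illustration of what the new etymology-only code changes in an Estonian etymology section: the sketch below reuses the `<term>` placeholder from the original request, and the category it would produce is an assumption, not something checked against the live modules.

```wikitext
<!-- Before: workaround described in the request; categorizes the entry as derived from German -->
Baltic {{der|et|de|<term>}}

<!-- After: the etymology-only code de-bal agreed on above -->
{{der|et|de-bal|<term>}}
```

Since de-bal is etymology-only, it would presumably be used only inside etymology templates like this one, while Baltic German words themselves would stay under German (or Low German) with a label, as suggested earlier in the thread.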

AWB access

Hello, I would like to request access to the AutoWikiBrowser tool. I have been contributing significantly by adding entries in Old Tupi and Guaraní, and I often need to correct some inaccuracies in the entries of these languages. Furthermore, the creation of Old Tupi entries only really started to take off last year; we are in a somewhat unstable phase where some quotation templates are occasionally renamed. For what it's worth, I already have access to AutoWikiBrowser on enwiki. Thank you, RodRabelo7 (talk) 04:48, 2 August 2024 (UTC)Reply

This seems uncontroversial, based on edits such as this. I'm not familiar with the language, but your work seems reasonable to me. Please ping me if no one else grants access in a week. I'll try to check in on this thread to see if there are any other comments. —Justin (koavf)TCM 04:53, 2 August 2024 (UTC)Reply
Just in case, a decent and modern Old Tupi grammar in English is Ferraz Gerardi's A Role and Reference Grammar Description of Tupinambá. RodRabelo7 (talk) 04:56, 2 August 2024 (UTC)Reply
Obrigado. —Justin (koavf)TCM 05:04, 2 August 2024 (UTC)Reply
@Koavf: pinging, as requested. RodRabelo7 (talk) 20:29, 9 August 2024 (UTC)Reply
https://en.wiktionary.org/w/index.php?title=Wiktionary%3AAutoWikiBrowser%2FCheckPageJSON&diff=80991102&oldid=80906256 Obrigado for your service. —Justin (koavf)TCM 20:33, 9 August 2024 (UTC)Reply

Hyphenation for Row-Splitting versus a Word that Might or Might Not Normally have Hyphenation

Dear Wiktionary: If a word could normatively be interpreted as either needing hyphenation or not needing hyphenation, and it is hyphenated by a row-splitting hyphenation, how do I take a verbatim quote of that sentence for a Wiktionary citation? This actually comes up A LOT for me, because formal Wade-Giles includes hyphenation, while informal Wade-Giles and postal romanization do not include hyphenation, so many words "could go either way". What I did in this case: diff on the Zichang page was make a context-based decision (i.e. this sentence did not fall out of a coconut tree; in the context of the book and the other usage of the word in a different entry of the dictionary, it appears that the authors might likely have intended that this hyphen is more than just a row-splitting hyphenation). But I also want to imagine what could be unburdened by what has been before (that is, the author may have intended non-hyphenation for this specific instance, even if the publisher did hyphenate for the row-split, and even if the same word was hyphenated elsewhere, and even if other similarly situated words in the book are hyphenated). Thanks for any guidance. Yours Truly, --Geographyinitiative (talk) 11:26, 2 August 2024 (UTC) (Modified)Reply

I'm not sure how familiar you are with CSS and HTML, but have you by chance seen these web design solutions?
I think these kinds of solutions will work for what you're going for, which may involve inserting raw HTML/CSS rather than a template or other wikitext. —Justin (koavf)TCM 11:41, 2 August 2024 (UTC)Reply
Thanks-- Okay, I'm looking at this, but does this coding allow me to signal to the reader that, within the context of the published book, there is an ambiguity as to whether the hyphen is merely a row-splitting hyphen or actually a part of the word proper (i.e. the hyphen would have been included if the word were not on the edge of a row)???--Geographyinitiative (talk) 11:47, 2 August 2024 (UTC)Reply
I think your solution of an HTML comment is probably the best you can do. —Justin (koavf)TCM 11:52, 2 August 2024 (UTC)Reply
If the surrounding context makes the intended usage clear (for example, if the same document has examples within a single line of the same word/proper noun spelled with or without a hyphen, or of analogously formed words or names) it seems fine to follow that. In cases where that can't be determined, I would say it should be considered whether these specific quotations are really essential.--Urszag (talk) 12:20, 2 August 2024 (UTC)Reply
If it’s significant (for example, because a term has both hyphenated and non-hyphenated forms), I indicate this as “roly[-]poly” in a quotation. You can also use the template {{quote-gloss}}, which results in “roly[-]poly”. — Sgconlaw (talk) 15:26, 2 August 2024 (UTC)Reply
As to Sgconlaw's comment, I am not sure if the hyphen is a gloss on the quote, and I don't want to misuse quote-gloss, though I see how this could be good.
Concerning Urszag's comment (that I have usually agreed with) that "In cases where that can't be determined, I would say it should be considered whether these specific quotations are really essential." I have to admit that I have followed that line of thinking before. However, I have later come to feel that I really don't want to cause a bias in my citations by just blatantly ignoring a category of ambiguous situations in English. So I really want to embrace the citations as I come to them. There should be a normative way to deal with this category of scenario besides "fuck it". This is not a lowly or vulgar usage of English; this is a category of ambiguity that is baked into English, and I believe Wiktionary should have a way to confront the situation head-on and properly cite them as what they are. In the above quote the word "An-ting" is not the essential word, but instead the rare word "Tzu-ch'ang", which is super rare because it is a Wade-Giles name derived from a communist-only Chinese original (Taiwan did not use it), so Tzu-ch'ang is pretty rare, and the book is pretty authoritative. So I want to deal with "An-ting" in the "right" way that fully acknowledges the ambiguity rather than do my grab ass horseshit of writing something in the html. I came up with that shit ages ago as a work-around; now, I want to fucking do it beautifully and make it clear to the reader of the quote what the fuck is happening, and unambiguously tell the reader that there is a potential ambiguity. --Geographyinitiative (talk) 22:35, 2 August 2024 (UTC)Reply
@Geographyinitiative, Sgconlaw: The purpose of {{quote-gloss}} is to contain text not present in the original text, so An{{quote-gloss|-}}ting would mean "there was no hyphen in the original, but there was supposed to be" — probably not your goal. The OED does something like this: An-ting [variant reading Anting]. Although I prefer something more explicit like this: An-ting [or Anting, if the hyphen is a line-breaking hyphen] Ioaxxere (talk) 01:53, 8 August 2024 (UTC)Reply
@Ioaxxere: ah, true. In that case I’d go with the first option I suggested which is to indicate the hyphen as “[-]”. — Sgconlaw (talk) 05:08, 8 August 2024 (UTC)Reply
Following Ioaxxere's comment, from now on I plan to explore the possibility of making these kinds of edits: diff. It's so murky. --Geographyinitiative (talk) 07:07, 8 August 2024 (UTC)Reply

Unless someone comes up with a better solution, I'm going to leave this quote (diff) as is. I'm going to eventually take this topic to Grease Pit to see if a real solution can be created for this kind of ambiguous situation. However, right now, for this quote, I don't think this quote is a good "model case" for the larger problem because I really feel that the context of the book itself more heavily favors the hyphenated form of "An-ting" than unhyphenated "Anting". But I'll keep this case in mind and come back to it later; please ping me if you have more help/input/advice on the topic generally. --Geographyinitiative (talk) 23:56, 2 August 2024 (UTC)Reply

@Ioaxxere: I thought OED indicates variant readings when there are multiple versions of the same work, and some use one form of a term and some use another form. That’s how I interpreted it anyway. — Sgconlaw (talk) 11:21, 8 August 2024 (UTC)Reply
I do agree that it would be nice to have a standardized solution. I've come across this issue more than once and it can be annoying when it's a rare word and I'm trying to figure out whether the hyphenated form is more common or not. Andrew Sheedy (talk) 05:00, 3 August 2024 (UTC)Reply
Yeah guys, please keep me in mind if you come up with a good solution for this. I will keep Sgconlaw's use of quote-gloss in mind. But I really want to give readers of a quote the full picture on the quote and not either (a) ignore the potential ambiguity, (b) just opt not to use the quote, or (c) use the quote anyway without fully acknowledging potential ambiguity in some way that the reader can see without misusing quote-gloss (in my opinion) or using a work-around or similar, or relying on my personal assessment of what the author meant to pick one over the other. --Geographyinitiative (talk) 10:22, 3 August 2024 (UTC)Reply
Bit late here, but I follow OED in using [-] in these kinds of cases. @Geographyinitiative This, that and the other (talk) 10:14, 8 August 2024 (UTC)Reply
@This, that and the other Is that right? Is there a paper about this? I'd like to learn about the finesse behind when they use [-] and use it the same way they use it. It's very bizarre looking to me, so I want to be 100% clear what I'm doing if I follow that method- (1) EXACTLY what specific situations is it used in? (2) EXACTLY how is it formatted? (3) Do other dictionaries deal with this issue in a similar manner? (4) Is there any clear policy-level guidance on this issue anywhere in Wiktionary? If not, why not? Should it be created? --Geographyinitiative (talk) 11:02, 8 August 2024 (UTC)Reply
@Geographyinitiative Upon closer inspection it seems I may be wrong; OED appears to use the tilde (~) for this purpose instead. But I definitely picked up the habit of using [-] from somewhere - it's not my own invention! This, that and the other (talk) 00:56, 12 August 2024 (UTC)Reply
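
For reference, the notations floated in this thread for a hyphen that falls at a line break could be written in a quotation roughly as follows. This is only a sketch: none of these is an agreed convention, and the wording of the bracketed note and of the hidden comment is illustrative.

```wikitext
<!-- 1. Bracketed hyphen (Sgconlaw; This, that and the other) -->
An[-]ting

<!-- 2. {{quote-gloss}}: per Ioaxxere, this marks text absent from the original, so it is probably not the right fit here -->
An{{quote-gloss|-}}ting

<!-- 3. Explicit bracketed note (Ioaxxere) -->
An-ting [or Anting, if the hyphen is a line-breaking hyphen]

<!-- 4. Hidden HTML comment (the existing workaround; invisible to readers) -->
An-ting<!-- hyphen falls at a line break in the source -->
```

Options 1 and 3 keep the ambiguity visible to the reader, which is the stated goal above; option 4 records it only for editors who open the wikitext.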

List and topic categories again (how many types, and how to name them)

I notice CAT:en:Waterfalls says it's for "names of specific waterfalls, not merely terms related to waterfalls, [nor] types of waterfalls." But even before I added to it, almost all its contents were related/type terms (and I could add more: byfall, catadupe, maybe spray bow, foambow, plunge pool, stickle, huck).
I could solve this by changing "Waterfalls" to a "related-to" category; in this case, that wouldn't even cause other languages much hassle, as other languages barely use it. However... I think it is reasonable to have a category for specific Falls too, like we have for cities. But what could it be called?
We use "CAT:en:NAME" for set categories ("terms for seasons, not merely terms related to seasons. It may contain [...] types of seasons [or...] names of specific seasons"), related-to categories ("This is a "related-to" category. It should contain terms directly related to winter"), and name lists alike. In our schema, the category for terms related to waterfalls or which are types of waterfalls, and the category for names of specific waterfalls, should both be "CAT:en:Waterfalls" AFAICT.
And because type isn't predictable from name, some people (reasonably!) think e.g. Category:en:Cities is named like a set category and put "capital city", eperopolis et al in it (and where else should these go?), while other people think it's named like a related-to category, or (yes) a name category... so, like many categories, its contents are a mix.
A solution would be to specify the purpose in the name: ":en:set:Seasons", ":en:topic:Winter", ":en:names:Cities"... but this highlights another issue: does it make sense that :en:Winter can include wintery, but :en:Seasons says it shouldn't contain seasonal? Maybe not! (And is it unmaintainable, anyway? It seems like in practice the more fine-grained distinctions we assert, the less well people maintain them.)
Should we merge "sets" into "related-to" categories, so "CAT:en:Seasons" could contain summer and seasonal? (In theory, set categories could just be ====Hyponyms==== sections of entries like [[season]], not needing to be categories at all.) - -sche (discuss) 21:24, 3 August 2024 (UTC)Reply

@-sche: I would be in favour of merging the two types of categories, as I don't really think the distinction is easy to maintain. Alternatively, if it is felt that in some cases it is appropriate to have a "name" category, maybe the default should be a related-to category (for example, "Category:Cities") and the "name" category should be a subcategory called "Category:Names of cities". — Sgconlaw (talk) 23:57, 7 August 2024 (UTC)Reply
@-sche: I prefer a naming scheme that makes the purpose clear, so you might have "types of waterfalls", "waterfalls", and "particular waterfalls" for the three kinds of categories. Ioaxxere (talk) 01:36, 8 August 2024 (UTC)Reply
@Ioaxxere: I don’t feel it’s necessary to distinguish between “Category:Waterfalls” and “Category:Types of waterfalls”. I’m somewhat concerned that if we have distinctions which are too fine we are just going to get editors dumping everything in “Category:Waterfalls”. — Sgconlaw (talk) 11:23, 8 August 2024 (UTC)Reply

Something I found neat in our PIE entries is the feature in WT:AINE allowing the splitting of reconstructed PIE terms by morpheme with hyphens in the alt parameter of links in Derived and Related terms. Not only does it allow more derivation transparency, but also you can square-bracket link the individual morphemes involved so less familiar visitors can be taken to the compositional morphemes to learn more about them.

I would like to amend WT:Reconstructed terms to allow this practice to be used on other proto-language pages, not just PIE (and not on non-proto-language entries).

The amendments to WT:Reconstructed terms#Entries would be something like this, derived from language at WT:AINE:

Separating hyphens can be used in the displayed form of links in Derived terms and Related terms sections of proto-language pages to clarify the formation, as long as it is not used in the page name itself.

Ceso femmuin mbolgaig mbung, mellohi! (投稿) 08:54, 4 August 2024 (UTC)Reply

Strong support, have been doing this for non-proto reconstructions in, e.g., Prakrit and Ashokan Prakrit. It will be nice as a frequent reader of Proto-Indo-European entries as well, though I understand that this is obviously not always possible when there are factors like sandhi in play. Svartava (talk) 09:24, 4 August 2024 (UTC)Reply
Please don't do this for lower-branched Uralic Proto-languages. I think this is not helpful for agglutinative languages overall.
Not sure I like it for PIE, either, but it is kind of a tradition in IE linguistics, so I guess. For languages where this isn't done in literature - not sure it's helpful. Thadh (talk) 09:51, 4 August 2024 (UTC)Reply
As Thadh points out, this is something that needs to be decided on a language-to-language basis. If Proto-North Caucasian feels this works best for them, godspeed. What I am opposed to is doing so on Proto-Italic and Proto-Celtic entries, as you have been doing. Those need to go to a vote, because the status quo is not to have hyphens. --{{victar|talk}} 17:23, 4 August 2024 (UTC)
...which is why I posted here in the first place, to narrow down the boundaries of such a vote? — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 18:29, 4 August 2024 (UTC)Reply
Since when have we needed votes for a content issue like this? Theknightwho (talk) 19:21, 4 August 2024 (UTC)Reply
I don't see any policy prohibiting morpheme hyphens elsewhere... — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 19:29, 4 August 2024 (UTC)Reply

Vote has been drafted


@Svartava, Thadh I have started a vote at Wiktionary:Votes/2024-08/Allow hyphens in link displays for Indo-European proto-languages. Feel free to discuss or ask for amendments. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 19:20, 4 August 2024 (UTC)Reply

This is a silly vote. As you pointed out, there is no policy prohibiting hyphens in entry links, let alone alternatives to links. Again, it is up to communities to decide what conventions they use. If you want to change status quo conventions for Proto-Italic, start a vote on that, like at Wiktionary:Requests_for_deletion/Reconstruction#Proto-Italic_terms_with_only_one_descendant. --{{victar|talk}} 21:55, 4 August 2024 (UTC)
@Victar You are the one who suggested a vote. Pick a lane. Theknightwho (talk) 01:56, 6 August 2024 (UTC)Reply
I suggested a vote specific to Proto-Italic or Proto-Celtic. --{{victar|talk}} 03:08, 6 August 2024 (UTC)Reply

Latin months: nouns or proper nouns? Capitalized or uncapitalized?


Another Latin "proper noun" question. Currently, there seems to be no standardization in how we format entries for Latin month names. Aprīlis (April) only has a capitalized entry, and is marked as an Adjective and Noun. Maius is marked as an Adjective and Proper noun; there is a stub at maius noting it is an Alternative letter-case form. On the other hand, iānuārius is used as the main entry (Adjective and Noun) while Iānuārius is marked as an Alternative letter-case form. Contributing further to the mess, Category:la:Months includes multiple variants of some names such as Jānuārius.

What should the main entries be, what POS should be used, and how much information should be included in the alternative case form entries? In English, the POS of months is treated as "Proper noun". Urszag (talk) 11:10, 4 August 2024 (UTC)Reply

Do any Latin dictionaries indicate when something is a proper noun? (In English, one hurdle to consulting other dictionaries about whether some class of word is a common noun or proper noun has been that many lazily have just one 'noun' category into which everything goes.) I seem to recall the fact that Russian month names are listed as uncapitalized common nouns being the result of a discussion where Russian editors argued for that based on how Russian references/speakers treated them.
Do you have a sense of whether modern editions of Latin texts usually capitalize month names, the way they usually capitalize personal and place names? Poking around Google Books, it looks to me like "modern" Latin texts (actually, everything that turns up, from texts written in Latin in the 1500s and 1600s to recent editions of ancient Roman works) almost always capitalize month names, which suggests the capitalized forms should be the main entries. - -sche (discuss) 15:52, 4 August 2024 (UTC)
I'm not familiar with any Latin dictionary that indicates proper nouns. Typically they just mark nouns or proper nouns by providing the gender (m, f, n); DMLBS also makes some use of the label "sb." (substantive) for both nouns and proper nouns. In my experience, capitalization is the usual editorial convention.--Urszag (talk) 17:26, 4 August 2024 (UTC)Reply
Since it seems (by capitalization) that Latin dictionaries treat months as capitalized proper nouns, I would argue we should do the same. Likewise the adjectives should be capitalized. Benwing2 (talk) 18:39, 4 August 2024 (UTC)Reply
Capitalisation doesn't mark proper nouns: Several dictionaries also capitalise other adjectives like Homēricus (Homeric), Rōmānus (Roman) and common nouns like Rōmānus m (Roman (person)).
As for months and spellings: It's also a matter of attestation. Are both forms, like Februārius and februārius, always attested?
Likewise for months and POS: Are both mēnsis Februārius/februārius (or something like Kalendae Februariae/februariae, Nonae Februariae/februariae, Idus Februariae/februariae) and simply Februārius/februārius m always attested? --16:18, 6 August 2024 (UTC)
I don't really understand the second part of this comment. Ancient texts don't use capitalization, so there is no relevant ancient attestation distinguishing the two. Pretty much every modern edition I've seen (or modern Latin works, such as "Lingua Latina Per Se Illustrata") follows the convention of capitalizing the names of Latin months. This isn't restricted to English editors either: you can see "Augustus" capitalized in French texts such as the Gaffiot dictionary. I did see some lowercase examples of Latin month names on Google Books (e.g. "mensis augustus") so they are also attested, but I'm confident that uppercase is currently the more usual convention.--Urszag (talk) 15:02, 7 August 2024 (UTC)Reply
Another example of capitalization: "Datum Romae, apud S. Petrum, die XIX mensis Martii, in sollemnitate Sancti Ioseph, anno MMXVIII, Pontificatus Nostri sexto" in Pope Francis's Gaudete et exsultate (2018).--Urszag (talk) 15:25, 7 August 2024 (UTC)Reply
I found an older discussion from when month names were moved to lowercase versions: Wiktionary:Tea_room/2015/June#Latin_month_names. It looks like EncycloPetey based this on (some edition of?) "Josip Lučić Spisi Dubrovačke Kancelarije, a series of legal documents in Latin from Ragusa in the late 13th century". I'm not convinced yet that the cited text is representative of medieval usage as a whole, or that medieval usage should be relevant compared to the typical usage of more recent centuries, but I wanted to link to that discussion for greater context. I have already started moving the names (back) to capitalized versions based on the input from -sche and Benwing2.--Urszag (talk) 15:38, 7 August 2024 (UTC)Reply
The publication in question was the source of citations, used because it was the easiest at hand, and because the text had both capital and lowercase lettering. A search of other medieval records containing dates should be able to furnish additional citations, as long as the scribe wrote out month names rather than numbers. At the time of the earlier discussion, the Latin months were treated as adjectives because the available citations in both classical and medieval Latin demonstrated use as adjectives. Modern dictionaries and Modern Latin do use capitalized forms, but Augustus is not a good example, since it specifically derives from the name of a person. Capitalization of months like october and november would be stronger evidence for capitalization, but as I say, evidence at the time suggested the practice of capitalizing month words was a modern practice. --EncycloPetey (talk) 16:30, 7 August 2024 (UTC)Reply
Thank you for the clarification; so this is a compilation which is being cited as showing multiple independent examples of medieval usage? I guess it seems to me that the first question to be resolved (before getting into the question of what typical medieval usage was) would be whether capitalization on Wiktionary should be based on modern capitalization practices (e.g. "Datum Romae, Laterani, die XV mensis Octobris, in memoria sanctae Teresiae a Iesu, anno MMXXIII, Pontificatus Nostri undecimo", Est utique fiducia/C'Est La Confiance, 2023) or on medieval capitalization practices. I think that in general, we follow modern practices for spelling Latin words in entry titles; e.g. the use of "ae" and "oe" rather than æ, ę, œ, although I guess it is often difficult to distinguish between Classical conventions and modern conventions.--Urszag (talk) 17:09, 7 August 2024 (UTC)Reply

Our treatment of MIA reconstructions


@Pulimaiyi, Kutchkutch, Svartava (feel free to ping others, no idea who is interested in this stuff these days): There are many terms that are only attested across several New Indo-Aryan languages but not at any earlier stages of Indo-Aryan. Sources like Turner's {{R:CDIAL}} reconstruct ancestral forms for such cognate sets, but due to phonological degradation (e.g. consonant cluster assimilation) the reconstructions can only go back to Proto-Middle Indo-Aryan rather than a language we clearly know how to deal with like Proto-Indo-Aryan or Proto-Sanskrit.

For the past couple years our strategy has been to call these reconstructions Proto-Ashokan Prakrit, which is a language we made up and not a label that is really used in any literature (0 hits on Google). We settled on Ashokan Prakrit since it is likely the ancestor of all New Indo-Aryan languages (including "Dardic") and we didn't have a later node that unifies NIA subfamilies, since e.g. we used to treat Prakrit and Apabhramsha as collections of languages.

Now that we have codes for unified Prakrit and unified Apabhramsha, I think we should move any Proto-Ashokan Prakrit terms without "Dardic" descendants (e.g. *𑀟𑀼𑀓𑁆𑀓𑀭 (*ḍukkara, pig)) to Proto-Prakrit. Proto-Prakrit is a term used in scholarly literature on IA historical linguistics, including by Turner. Also, this way we are not overclaiming the age of the word.

One edge case to consider is that often, a term may be constrained to non-Dardic NIA but also happen to have a descendant in Kashmiri; an example is *𑀝𑁄𑀓𑁆𑀓 (*ṭokka, basket). Kashmiri is the "Dardic" IA language that is most in contact with plains Indo-Aryan (particularly Punjabi). I think this should also be called Proto-Prakrit but we can debate this. Ideally, we reserve Proto-Ashokan Prakrit for any NIA terms with non-Kashmiri Dardic cognates. —AryamanA (मुझसे बात करेंयोगदान) 20:40, 4 August 2024 (UTC)
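
The rule of thumb proposed above can be sketched in a few lines of Python. This is only an illustration of the proposal as worded in this thread, not an agreed policy: the function name and the `DARDIC` set are hypothetical, and the set is a small, incomplete sample of languages referred to as Dardic in this discussion.

```python
# Hypothetical, incomplete sample of "Dardic" languages, for illustration only.
DARDIC = {"Kashmiri", "Shina", "Khowar", "Torwali"}


def reconstruction_stage(descendant_languages):
    """Pick a label for a reconstruction from the languages of its descendants.

    Proposed rule: reserve Proto-Ashokan Prakrit for terms with non-Kashmiri
    Dardic cognates; everything else (including terms whose only Dardic
    descendant is Kashmiri) is treated as Proto-Prakrit.
    """
    dardic_descendants = set(descendant_languages) & DARDIC
    if dardic_descendants - {"Kashmiri"}:
        return "Proto-Ashokan Prakrit"
    return "Proto-Prakrit"


if __name__ == "__main__":
    print(reconstruction_stage(["Hindi", "Punjabi"]))
    print(reconstruction_stage(["Hindi", "Punjabi", "Kashmiri"]))
    print(reconstruction_stage(["Hindi", "Shina"]))
```

The Kashmiri exception is the debatable part of the proposal; dropping `- {"Kashmiri"}` from the condition would model the stricter reading where any Dardic descendant pushes the term back to the earlier stage.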

I agree. Some related followup Qs:
1. How do we want to handle cases like *dākka, which Turner reconstructs here with both a long vowel and consonant cluster (which is generally considered invalid in Middle Indo-Aryan). It appears that Turner is reconstructing Old Indo-Aryan. In this case, do we want to say that the descendant is Sanskrit *डाक्क (ḍākka), Prakrit *𑀟𑀸𑀓𑁆𑀓 (*ḍākka), Ashokan Prakrit *𑀟𑀸𑀓𑁆𑀓 (*ḍākka), or Prakrit *𑀟𑀓𑁆𑀓 (*ḍakka)?
2. Is Proto-Prakrit a separate language or just a shorthand for referring to reconstructed Prakrit? I haven't seen any Proto-Ashokan Prakrit language in Wiktionary, so I'm guessing what you're referring to is reconstructed Ashokan Prakrit, right? Dragonoid76 (talk) 18:57, 5 August 2024 (UTC)Reply
One more question—what are the cases where it makes sense to reconstruct "Sanskrit", as opposed to "Proto-Prakrit" or "Proto-Ashokan Prakrit"? Can we make (or does it already exist?) a clear decision on these cases? For example:
Dragonoid76 (talk) 20:00, 5 August 2024 (UTC)Reply
I also agree. For Dardic descendants, and also Pali descendants of Turner reconstructions, we might want a Proto-Middle Indo-Aryan, but that way the age of any word will obviously be implied to be greater than it would be if it was called "Proto-Prakrit". I'm also open to Sanskrit reconstructions, which do seem better suited in some cases like *ध्वजदण्ड (dhvajadaṇḍa), *तिथिवार (tithivāra), etc., and which reconstruction fits better can easily be decided on a case-by-case basis (due to the low number of MIA editors). I would also like to point out that, despite being less frequent, early MIA like Pali does show both a long vowel and consonant cluster, and even some Prakrit words do that, so I don't think it would be very problematic to have Proto-Prakrit reconstructions having both a long vowel and consonant cluster. Svartava (talk) 04:55, 6 August 2024 (UTC)
@AryamanA: Hello! It's great to see you active again. As a matter of an incredible coincidence, @Svartava and I have been, for the past few weeks, discussing on Discord about having a Proto-Prakrit code. Having a Proto-Prakrit code is surely less problematic than taking Turner reconstructions (which were intended by Turner to be Sanskrit) and showing them as Ashokan Prakrit, a practice unique to Wiktionary. Moreover, in Ashokan reconstructions, we spell out the geminated stops (case in point: *𑀝𑁄𑀓𑁆𑀓 (*ṭokka)) but we know that gemination was not reflected in spelling in the edicts of Ashoka. So we have to either change these reconstructions to Proto Prakrit or render them in the Latin script. Also, Ashokan needs to be set as the ancestor of Dardic (it's not, for now).
@Dragonoid76: To address your queries: a long vowel followed by a geminated consonant cluster is uncommon, but not invalid in MIA, as cases like dātta definitely exist. As for Prakrit entries in the reconstruction namespace vs Proto-Prakrit as a separate code, I am of the opinion that since Prakrit has been merged, we might as well use Prakrit reconstructions. As for your next question of how to decide between Ashokan vs Pkt reconstruction vs Sanskrit, as Aryaman said, if it has non Kashmiri Dardic reflexes, it will be an Ashokan reconstruction. As of now, inc-ash is not set to be the ancestor of inc-dar-pro but that can be fixed. I believe it should be, because Shahbazgarhi Ashokan shows many features which can be said to be the ancestor of the corresponding features in Dardic. Deciding between Sanskrit and Ashokan can be much more challenging, given Ashokan contains sounds like /ṣ/, /ś/ and non simplified consonant clusters. So ciṣṭa might well be early MIA. One rule of thumb I'd use is, compounds where the components are discernable as Sanskrit words are Sanskrit, such as *bhaginī-putra -- 𝘗𝘶𝘭𝘪𝘮𝘢𝘪𝘺𝘪(𝘵𝘢𝘭𝘬) 05:18, 6 August 2024 (UTC)Reply
@AryamanA: Moving a few entries from reconstructed early MIA Ashokan Prakrit to reconstructed middle MIA Proto-Prakrit seems to be uncontroversial since that was the original proposal.
@Dragonoid76, Pulimaiyi: Regarding, Is Proto-Prakrit a separate language or just a shorthand for referring to reconstructed Prakrit?
I agree with
since Prakrit has been merged, we might as well use Prakrit reconstructions
rather than creating a new code for Proto-Prakrit. This is because creating a new code for Proto-Prakrit would mean that we would have to decide whether it is an ancestor, descendant or contemporaneous with the merged Prakrit language. Furthermore, Prakrit reconstructions are usually one-off for special cases unlike protolanguages such as Proto-Indo-Iranian. Protolanguages such as Proto-Indo-Iranian are entirely reconstructed while Middle Indo-Aryan is a mixture of attested and reconstructed terms.
Would the script continue to be Brahmi? … we have to either change these reconstructions to Proto Prakrit or render them in the Latin script
When Proto-Indo-Aryan reconstructions were moved to Ashokan Prakrit reconstructions, it was a delight to see them in Brahmi script instead of Latin script. Then, Victar started this discussion WT:Beer_parlour/2021/March#Reconstructions_in_Latin_script
Victar: I'd like get a discussion going about adding a guideline to WT:PROTO that states that all reconstructions should be in Latin script. Most already are, but here's a list of the ones that buck that standard…Ashokan Prakrit … Sanskrit
Mahāgaja: Devanagari seems perfectly natural to me
Victar: If we're going by academia, reconstructions will always usually be in Latin script, which does also go for Sanskrit and Avestan. Seeing RC:Sanskrit/लुट्टति is rather weird to my eyes
I agree with Fay Freak’s comment. However, I also see what Victar meant. Academia in the English language will probably not consider Wiktionary’s reconstructions seriously if they are not in the Latin script.
At Talk:बद्ध, AryamanA said
It is not useful to reconstruct with the idiosyncrasies of Ashokan Brahmi being applied, in comparative linguistics we care about the phonology not orthography.
If the idiosyncrasies of Brahmi are not to be applied to reconstructed Brahmi, and if we care about the phonology not orthography, then that might suggest that the Latin script might be used for reconstructions if the Latin script better represents the phonology. However, it could be argued that even the Latin script has idiosyncrasies of its own.
Question 1: @Pulimaiyi, Svartava: If middle MIA reconstructions continue to be in Brahmi, would the anusvara be used for homorganic nasal consonants, or would they be written as the Brahmi equivalents of ङ् ञ् ण् न् म्? The middle MIA convention is to use the anusvara. RC:Ashokan Prakrit/𑀟𑀗𑁆𑀓 uses ङ्, while RC:Ashokan Prakrit/𑀫𑀡𑀺𑀕𑀁𑀞𑀺 uses the anusvara.
As for … how to decide between Ashokan vs Prakrit reconstruction vs Sanskrit, … if it has non Kashmiri Dardic reflexes, it will be an Ashokan reconstruction … given Ashokan contains sounds like /ṣ/, /ś/ and non simplified consonant clusters … ciṣṭa might well be early MIA
What this means is that there will be reconstructions at three stages:
OIA (Sanskrit)
Early MIA (Ashokan Prakrit)
and Middle MIA (Prakrit)
By analogy with RC:Sanskrit/चिष्ट, RC:Ashokan Prakrit/𑀧𑀝𑁆𑀞𑀸𑀦 might be moved to RC:Ashokan Prakrit/𑀧𑀱𑁆𑀝𑀸𑀦 especially since there is a Kashmiri descendant K. paṭhān m. (see Reconstruction_talk:Ashokan_Prakrit/𑀧𑀝𑁆𑀞𑀸𑀦#*paṣṭāna?). However, RC:Ashokan Prakrit/𑀙𑁄𑀝𑁆𑀝 has a Kashmiri descendant, but it does not resemble early MIA.
Question 2: @Pulimaiyi, Svartava: With such a scheme shouldn’t we use the ====Reconstruction notes==== section to explain why a particular stage was chosen for a reconstruction rather than another stage (in addition to other details)?
For example, when I look at
RC:Sanskrit/ध्वजदण्ड
RC:Sanskrit/तिथिवार
RC:Sanskrit/उन्नग्न
RC:Sanskrit/स्यालभार्या
it always takes me a few minutes to justify why these are being reconstructed as OIA (Sanskrit) rather than middle MIA because of
Special:Permalink/65062470#बुभुक्ष्
Pulimaiyi: Sanskrit reconstructions are very rare in wiktionary and are generally not favoured by wiktionary's convention … Sanskrit reconstructions are not favoured by wiktionary's convention because of the lack of reliable reconstruction sources to base it on.
See also:
Reconstruction talk:Sanskrit/ध्वजदण्ड
We already have RC:Sanskrit/तिथिवार, which is why I even thought of creating this reconstruction. Or else, I'd have simply added {{inh|hi|sa||*ध्वजदण्ड}}, without linking it.
[[User_talk:Inqilābī#Status_of_{{R:CDIAL}}_reconstructions]]
Kutchkutch: Do you have an opinion on whether RC:Sanskrit/उन्नग्न should be modified to a Prakrit form or remain as [it] appear[s] in {{R:CDIAL}}?
CDIAL Introduction:
Many of the headwords, like so much of classical Sanskrit vocabulary, are in reality Middle Indo-Aryan clothed, for the convenience of presentation, in an earlier phonetic dress
Inqilābī: No idea, but it might be the case that Turner reconstructs both OIA and MIA terms.
Talk:सलहज
PUC: Wow, the phonetic erosion was rather strong in there! No?
AryamanA: Yep. It's syālabhāryā > sālahāyya > sallahayya > salhaj
At one point I was deciding whether RC:Sanskrit/तिथिवार should be moved to Ashokan Prakrit and then decided not to. If the ====Reconstruction notes==== section explicitly explains why that particular stage was chosen (in addition to other details), then that would clear up the confusion.
At RC:Sanskrit/युट् despite saying,
Turner posits that all forms of this root may have originated from *युट्ट which was a MIA replacement for युक्त
it seems that the justification for having RC:Sanskrit/युट् as Sanskrit rather than middle MIA is that we agreed not to have middle MIA roots at Talk:घोट#𑀖𑀼𑀝𑁆𑀝𑁆-_(ghuṭṭ-). However, early MIA CAT:Ashokan Prakrit roots are permissible according to the following statement in that discussion:
Ashokan Prakrit roots are tolerated because *we* consider the unattested terms in Turner's dictionary to be Ashokan Prakrit
One rule of thumb I'd use is, compounds where the components are discernable as Sanskrit words are Sanskrit, such as *bhaginī-putra
The components of RC:Ashokan Prakrit/𑀫𑀡𑀺𑀕𑀁𑀞𑀺 are discernable as Sanskrit, but I placed it in MIA rather than OIA (Sanskrit). *bhaginī-putra differs from RC:Ashokan Prakrit/𑀫𑀡𑀺𑀕𑀁𑀞𑀺 because it has the Kashmiri descendant K. bĕnathᵃr m..
The relationship between reconstructed MIA and Dardic languages has been discussed several times such as at
Reconstruction talk:Ashokan Prakrit/𑀕𑀼𑀧𑁆𑀨𑀸
The existence of a Dardic cognate could suggest that this word existed in late Old Indo-Aryan/early MIA: this is precisely why initially a code for "Proto MIA" was proposed so that Pali and Dardic could be included; but that idea did not garner much support and we had to settle for Ashokan Prakrit instead, which albeit quite pervasive, unfortunately does not extend to Pali and Dardic.
Special:Diff/73057977 at RC:Ashokan Prakrit/𑀕𑀸𑀟𑁆𑀟
Any other way to deal with Dardic terms cognate with Ashokan prakrit without having to reconstruct Sanskrit?
Special:Diff/73407835 at گاڑے#Torwali
Apparently there are more Dardic terms than just Kashmiri corresponding to CDIAL 4116 *gāḍḍa 'cart'
Kashmiri is the "Dardic" IA language that is most in-contact with plains Indo-Aryan (particularly Punjabi)
Although Kashmiri is the most spoken Dardic language, the other Dardic languages are also in contact with “plains Indo-Aryan”, which might explain گاڑے#Torwali. RC:Sanskrit/चिष्ट has the Shina descendant چٹھ#Shina. Also, the CDIAL Introduction derives the Khowar term ātΛpik from reconstructed MIA:
Khowar ātΛpik `to have high fever' must rest either upon a late MIA. *ātapp- (newly formed compound with ā from tappaï) or upon MIA. *āttapp- with analogical -tt- (after type ā-tt- < ā-tr-, etc.). The head-word ātapyatē under which the Khowar word appears is thus in reality a Middle Indo-Aryan word in Old Indo-Aryan form.
What is probably meant by “Punjabi” here is “Punjabic languages” such as Pahari-Potwari and Hindko in addition to the standardised Majhi Punjabi.
Urdu as a lingua franca is also in contact with Dardic languages to a significant extent.
Pashto is another lingua franca that is in contact with Dardic languages in Khyber-Pakhtunkhwa and Afghanistan. Although Pashto is an Iranian language, Pashto borrows from Urdu and Punjabic languages including Lahnda/Saraiki. Perhaps it is too much of a stretch for a Dardic language in Khyber-Pakhtunkhwa or Afghanistan to have a “plains Indo-Aryan” term through Pashto. For example,
RC:Ashokan Prakrit/𑀕𑀸𑀟𑁆𑀟 → ګاډی#Pashto → گاڑے#Torwali
(See CAT:Pashto borrowed terms) Perhaps there is a possibility with RC:Ashokan Prakrit/𑀧𑀝𑁆𑀞𑀸𑀦 that a Dardic language acquired the term first and then it spread to “plains Indo-Aryan”.
Kutchkutch (talk) 16:00, 6 August 2024 (UTC)Reply

etymology sections and a lack of standardization on detail


We have basically zero standards on the level of detail one should put in an etymology section. Some entries only list the direct ancestor regardless of whether it's derived from another language (DeJulio), others list the ancestors of a word all the way to, say, Latin (like dictionary or this Malay term for June), and others still go all the way back to PIE or similar. That's not getting into entries like admiral, orange or pizza, which start to look like run-on paragraphs with stuff like cognates and miscellaneous etymological detail.

I do recognize a pattern of more common or popular words having the larger etymology sections, but that's not really the "problem" here, and the longer sections are all pretty much on topic even if they get rambly. For one, we aren't Wikipedia, and these long paragraphs are a bit unwieldy for the average reader (read: an eyesore). I probably wouldn't have broached this topic this time last year, on the merit that having the full etymology on the same page is quite useful, which was probably the original intent. However, with the introduction of the etymon template among other technical revolutions on the site this year, there are now much better ways (imo) to present the info to the average reader. Another argument for reducing these large sections is synchronicity: I've encountered plenty of cases where one entry is missing details provided by another, or where one has an error that the other has fixed.


Now, it might sound like I'm advocating for the "only list the direct ancestor" approach, but honestly my main gripe is with the presentation of the info. I've suggested on the Discord that if entries are going to provide an exhaustive etymology, it doesn't need to be presented in paragraph form, particularly given that it's mostly just three- to five-word statements, like this (showing how it could be presented instead of paragraphs):

- Word A from language A

- Word B from language A

- Word C from Language B

- Word D from Proto Language Akaibu (talk) 06:39, 6 August 2024 (UTC)Reply

I mentioned this on Discord: with the {{etymon}} template, I don't think it'll hit widespread usage until it's easier to use than the basic etymology templates like {{der}}, {{bor+}}, {{inh}}, etc. etc. Having to learn/use IDs and the whole system is daunting for the average editor. I do agree though that our etymologies do need cleanup in terms of what to display. A lot of times I'll just show the initial borrowing and put "ultimately from" for entries like Hawaiian ʻApekanikana or Yoruba Alibéníà. AG202 (talk) 17:08, 6 August 2024 (UTC)Reply
@AG202: The goal of {{etymon}} is to connect entries like puzzle pieces, so I think the main problem currently is that very few entries are using it. In the future it will hopefully be saving massive amounts of time on stuff like categorization, finding derived terms, and writing out long etymological chains by hand. Ioaxxere (talk) 19:56, 6 August 2024 (UTC)Reply
I'm in favor of increased usage of etymon to reduce the problem of different etymology sections not being in sync with each other, but I agree that in its current form the template is not simple enough to be easily used (e.g. the ID system is cumbersome, and the conditions for when not to use "from" are not intuitive).--Urszag (talk) 20:25, 6 August 2024 (UTC)Reply
We've previously discussed - and seemingly agreed upon - how to make the syntax of {{etymon}} more intuitive. Due to the unfortunate choice of title I cannot link the thread directly, so here is the URL in plaintext: https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour/2024/June#{{etymon}}
Incorporating Benwing's last suggestion, we'd have something like:
{{ety|en#X|clever#Y|-ly#Z}} “[cleverly is] from clever + -ly”
{{ety|en#X|enm:charitee#Y}} “[charity is] from Middle English charitee”
{{ety|ru#X|de:montieren#Y|-овать#Z}} “[монтировать is] from German montieren + Russian -овать”
The X, Y, Z following the hashtags are IDs, and various additional parameters can be added like |inh=1 |bor=1 |blend=1 |backformation=1
If the syntax were like that, I'd actually be happy to use it as a general-purpose etymology template. Nicodene (talk) 23:48, 6 August 2024 (UTC)
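
To make the proposed syntax concrete, here is a minimal Python sketch of how such a call could be split into its parts. It models only the three examples above and is not how {{etymon}} or any existing module works: the function name `parse_ety` and the returned dictionary layout are hypothetical, and it assumes an etymon with no `lang:` prefix is in the entry's own language (as with -ly and -овать above).

```python
def parse_ety(call):
    """Parse a proposed {{ety|...}} call into language, IDs, etyma and flags."""
    call = call.strip()
    if not (call.startswith("{{ety|") and call.endswith("}}")):
        raise ValueError("not an {{ety|...}} call")

    # Drop the braces, split on pipes, and discard the template name.
    args = call[2:-2].split("|")[1:]

    flags = {}        # named parameters such as inh=1 or bor=1
    positional = []   # "lang#ID" first, then one argument per etymon
    for arg in args:
        if "=" in arg:
            key, value = arg.split("=", 1)
            flags[key] = value
        else:
            positional.append(arg)

    # First positional argument: the entry's language code and its ID.
    entry_lang, _, entry_id = positional[0].partition("#")

    etyma = []
    for arg in positional[1:]:
        body, _, etymon_id = arg.partition("#")
        if ":" in body:
            lang, term = body.split(":", 1)
        else:
            # Assumption: no prefix means the entry's own language.
            lang, term = entry_lang, body
        etyma.append({"lang": lang, "term": term, "id": etymon_id})

    return {"lang": entry_lang, "id": entry_id, "etyma": etyma, "flags": flags}


if __name__ == "__main__":
    print(parse_ety("{{ety|ru#X|de:montieren#Y|-овать#Z}}"))
```

A real implementation would also have to cope with terms that themselves contain `:`, `#` or `=`, which this sketch ignores.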

Reminder! Vote closing soon to fill vacancies of the first U4C

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Dear all,

The voting period for the Universal Code of Conduct Coordinating Committee (U4C) is closing soon. It is open through 10 August 2024. Read the information on the voting page on Meta-wiki to learn more about voting and voter eligibility. If you are eligible to vote and have not voted in this special election, it is important that you vote now.

Why should you vote? The U4C is a global group dedicated to providing an equitable and consistent implementation of the UCoC. Community input into the committee membership is critical to the success of the UCoC.

Please share this message with members of your community so they can participate as well.

In cooperation with the U4C,

-- Keegan (WMF) (talk) 15:30, 6 August 2024 (UTC)Reply

Micronations inclusion criteria

Micronations are not explicitly mentioned in Wiktionary:Criteria for inclusion#Place names, yet we have at least seven pages for micronations on Wikt. Seeing as they do count as place-names, I am asking here for input on whether or not micronations should be allowed to have their own entries/just be subject to the same criteria as any entry. FWIW, I am of the opinion that they should be allowed to have entries, and, for clarity, be added to the aforementioned policy link as legal scholars tend to classify them as political entities, which are already allowed entries on Wikt. Would appreciate any feedback or comments, including any opposition to this proposal! Kindest regards, LunaEatsTuna (talk) 03:30, 7 August 2024 (UTC)Reply

I've argued on the Discord that they should be counted, because they are names of places, and could be seen as already being included in CFI.
My reasoning for this is as follows:
1. We include "[h]uman settlements: cities, towns, villages, etc."
2. Micronations are human settlements, in the sense that they have/had people who live in them. (We also list ghost towns with 0 people in them, so people actually living in them isn't a concern).
3. As such, micronations are implicitly included in CFI.
Regardless of whether you think my reasoning is sound, I do feel that they should be included, as they can achieve the same level of being talked-about as the towns in Arizona, or even more, in some cases (such as Sealand). CitationsFreak (talk) 04:00, 7 August 2024 (UTC)Reply
I firmly believe that they are not included under our current criteria for Wiktionary:Criteria for inclusion#Place names. Looking at the list found at w:List of micronations, I would be hard-pressed to say that our policy states that we should include all of them. Most of them have no people living in them, and some don't even have an actual territory. A resort, a farm, a bank, two sculptures, straight-up fraud, and more should not be included by default as purported micronations. While I don't necessarily support our current policy that includes ghost towns & unincorporated communities with no people living in them, those at least receive recognition from an actual state and can be found on official government documents.
A lot of micronations are essentially "I made this up". Some should fall under WT:COMPANY. For example, I simply do not think that the Principality of Snake Hill, from a "family in New South Wales who were unable to afford their taxes seceded from Australia", should be included by default here. Some micronations are just online communities, and I don't think we'd want to open the floodgates to the name of just any online community that declares itself a micronation. A number of them claim territory that they don't even live on. It just rings as unserious, frankly, and our place names policy is broad enough as is. They don't rise to the level that an actual unrecognized state like Somaliland does.
That being said, I would support a policy to explicitly include notable micronations such as Sealand, but I'm not yet sure what the notability criteria should be. But for now, I'd say that they fall under this policy: "Most manmade structures, including buildings, airports, ports, bridges, canals, dams, tunnels, individual roads and streets, as well as gardens, parks, and beaches may only be attested through figurative use.", if even that. Or they could go into the Appendix. AG202 (talk) 04:28, 7 August 2024 (UTC)Reply
I would like to point out this line from the rationale of the CFI place names vote: "[T]he categories are left open-ended to allow more of our existing entries." This means that a specific type of place not being explicitly spelled out does not mean that it falls outside the criteria.
Also, the regular CFI criterion protects us from having to deal with every little obscure micronation made up by a ten-year-old in their bedroom. I would say that any micronation that is mentioned in three+ independent sources over a period of one year should be included. If enough people talk about, say, Melchizedek, then I'd say it's notable enough for us.
(Plus, is a fake nation really more similar to an airport or street or anything else mentioned in that sentence than a nation?) CitationsFreak (talk) 04:47, 7 August 2024 (UTC)Reply
"Plus, is a fake nation really more similar to an airport or street or anything else mentioned in that sentence than a nation?" Yes? I'm almost certain major and even some minor airports have more notability and usage than the vast majority of the micronations listed. Let alone actual nations and sovereign states. Like I said some of them are literally a singular building. Also, honestly, our CFI criterion doesn't protect us, considering all we need are 3 Usenet comments, or at this point simply 3 tweets. AG202 (talk) 05:02, 7 August 2024 (UTC)Reply
I meant that in terms of function. A micronation acts like a nation, with its own government and rulers and flag and so on. This is unlike an airport or a street, which doesn't.
Also, like I said, I am using the CFI standard, "use in durably archived media, conveying meaning, in at least three independent instances spanning at least a year". If these conditions are met when people are talking about a building, why shouldn't it be in Wikt? CitationsFreak (talk) 05:08, 7 August 2024 (UTC)Reply
Because they are explicitly excluded by WT:CFI#Place names, unless they have figurative usage, which is exactly my point. If we included buildings and such by default, I wouldn't be replying here, but that's not the case. We can't simply have someone redress a building or company or farm or something similar as a "micronation", get 3 independent usages, and then bam, we include it by default. That just does not align with how I'd expect our policies to be read. And looking at the list from WP, based on the references they have, I would expect the vast majority, if not all, of them to pass if we include them by default. AG202 (talk) 05:15, 7 August 2024 (UTC)Reply
I would totally expect a reader to look up Sealand, but not the name of any other sea fort, since it is famous. (However, I wouldn't expect a reader to look up Bob's Principality of North-East Main Street.)
(In the Discord, I had also suggested "a new rule for microstates, that says something like 'Ignore all references to the founding of the state[, since they are not independent]'". What do y'all think?) CitationsFreak (talk) 05:33, 7 August 2024 (UTC)Reply
Well, a micronation is a territory around a building plus a government. So we include them by comparison to, or their partial identity with human settlements, neighbourhoods and countries, as we even include fictional countries. This does not exclude that some shall not be included also according to our inclusion criteria because they are more similar to constructed languages, for instance.
We should be more concerned with violation of WT:BRAND by their artificialities. Some are organized like a cult, a club or criminal organization, though we include 'ndrangheta, Hamas, Islamic Revolutionary Guard Corps and the Unification Church, or what: I think about Reichsbürger here, whose constructs aren’t considered micronations however. Somewhere it does go too far. We won’t agree on their being noted in references per se supporting their inclusion, though notability is important, since just for clarity and not being confused with Wikipedia Wiktionary editors will avoid mentioning notability in the CFI, which they fear not to even understand in the same way as you if they don’t edit Wikipedia.
We can make RFDs for any reason later if the current inclusion situation gets out of hand; I don’t see a benefit of a theoretical community agreement on inclusion criteria specific to micronations. It is right, necessary and sufficient that we have discussed it; this will help us later to find out what goes too far. Fay Freak (talk) 13:15, 7 August 2024 (UTC)Reply
Micronations aren't all one kind (Liberland denotes a specific area, Obsidia is a movable rock; some are oft-mentioned, some scarcely-mentioned), so IMO we shouldn't add blanket acceptance of all micronations to CFI. But if enough people use a term like Liberland to refer to a given area, I don't see an obvious dividing line between that and other coinages for specific (or nebulous!) regions (which may not have administrative significance or population) that we don't bat an eye at: the Triangle, the Golden Strip, Trójmiasto, Mariana Trench, not to mention terms where sovereignty is disputed, like Northern Cyprus, Judea and Samaria, or Donetsk People's Republic. If there are cites to support it, I don't see a reason not to include Liberland: but that doesn't mean we should define these as real nations; I might lead with Liberland being a name for a particular area (used by people who claim it's a nation), and likewise might merge the first two senses of Seborga and just mention that the town is claimed to be a micronation.
It's true anyone can make up a micronation and we could be flooded, but we can RFD things where needed (if we don't blanket-include them), and AFAICT people could already coin and flood us with coinages for non-micronation regions: if people start calling an arbitrary U-shaped snake of land from Hamburg down to Hannover and east to Berlin and north up to Waren "HaHaBeWare" (or something as self-promotional as some micronations, like "Rachel's Backyard"), not asserting it to be a micronation but just saying "this is a name for this region, a la the Golden Strip", I don't currently see on what basis we wouldn't include that... (Also, while Obsidia, which I mentioned above, is less a placename and more like Ishango bone or Einang Stone, we seem to be deciding at RFD to keep such "names of specific individual stones and bones", so maybe Obsidia is fine too? I don't know; I'm more sceptical of it, and we don't currently have other stone-names I checked like Stone of Scone, but I'm curious why Ishango bone would get a pass and not Obsidia... maybe we want to reconsider including Ishango bone?) I am, as always, liable to change my mind as I hear more arguments... - -sche (discuss) 21:36, 7 August 2024 (UTC)Reply

Billion: a thousand millions or a million millions

Garner still uses the regular plural in these types of definition. Thus, for trillion he states that in Great Britain, it traditionally means a million million millions.

When it comes to defining nominal meanings as different from numerals, should the wording not reflect this? JMGN (talk) 16:13, 8 August 2024 (UTC)Reply

la Luynes

(Notifying PUC, Jberkel, Nicodene, AG202, Benwing2): There has been a conflict on what to do with the headword line (pinging the article creator @Olybrius). My understanding is that the article "la" seems to be always used with the name of this river, and it is not capitalised; but I don't think we should change the headword line. What should we do? Are there other names in the same situation? This is not like the situation of La Défense where "La" is lexicalised as part of the name and is always capitalised (however there are also some websites that perhaps by mistake have left it uncapitalised.) --kc_kennylau (talk) 19:29, 8 August 2024 (UTC)Reply

@Kc kennylau This is very common with French rivers as well as other entities, e.g. most countries (les États-Unis, la France, but just Israël). We don't have a general policy on how to handle this; in English, there is now a param |the=1 for cases like this, which displays "the" in the headword (e.g. the White House; but not always used, cf. the Castro, a well-known district in San Francisco). In German, {{de-noun}} also has special support for this. I'm not sure about other languages. Benwing2 (talk) 19:42, 8 August 2024 (UTC)Reply
For that matter, most (all?) rivers in English use the as well. Benwing2 (talk) 19:42, 8 August 2024 (UTC)Reply
I do feel that we should maybe add articles in the headword for things like la Barbade or l’Alabama. It makes it clearer for learners, especially since not every region or country uses an article. It's brought up quite often in French-learning spaces. AG202 (talk) 20:58, 8 August 2024 (UTC)Reply
I agree. I don't know if it's necessary for rivers because AFAIK all rivers take an article, but for countries and regions it varies from term to term and is very useful to include. That's why it's included in English and German, for example. Benwing2 (talk) 21:17, 8 August 2024 (UTC)Reply
I also agree. CitationsFreak (talk) 03:55, 9 August 2024 (UTC)Reply
I don't see the point of adding the article; knowing the gender is enough. See Nil, Rhône, Meuse, Rhin, Danube, etc. PUC 19:46, 8 August 2024 (UTC)Reply
This is my view as well.
For Luynes, if the concern is that a reader may not know to use la (as opposed to *les), that can be clarified in a usage note. Nicodene (talk) 21:32, 8 August 2024 (UTC)Reply
Does your belief apply only to rivers, or also to countries and regions (see above)? If the latter, my concern is that these usage notes would need to be added to every country and region, and would be more compactly conveyed in the headword (following the example of English and German, among others). Benwing2 (talk) 21:35, 8 August 2024 (UTC)Reply
Personally I'd only add usage notes when something deviates from the pattern. Of the countries mentioned so far that's just Israël. Nicodene (talk) 22:41, 8 August 2024 (UTC)Reply
I'd just find it easier to include the definite article in the headword for learners, since it's not like it's particularly common to find them with the indefinite article. And then for the prepositions used, we definitely have to include usage notes or usexes (like fr.wikt) after having seen this page for countries and this page for U.S. states, which is what I've done at pages like Alabama and Barbade. AG202 (talk) 04:37, 9 August 2024 (UTC)Reply
It seems we're tackling several topics at once, but getting back to the topic of rivers specifically, the indefinite article is also in use: "Pour une Seine plus propre". Therefore, not only is it not useful to add the article in the headword, it's also potentially misleading. We don't feel the need to mention that Thames is used with an article, so why should it be any different for French names? We're a dictionary, not a grammar book. PUC 12:01, 9 August 2024 (UTC)Reply
To be clear, I don't think rivers need an article displayed. However, I will say that in English, with United States, you can definitely say "(for) a cleaner United States" as well; it's just not particularly common, and it's generally understood as a possibility, so I don't think it's misleading to show the definite article. AG202 (talk) 13:34, 9 August 2024 (UTC)Reply
What reason would there be to follow this approach that would not just as well justify adding articles to all French nouns? Nicodene (talk) 21:34, 10 August 2024 (UTC)Reply
Because not all countries/regions use an article. It's a class of words that has its own special rules/usages. AG202 (talk) 00:09, 11 August 2024 (UTC)Reply
Neither do all nouns. Nicodene (talk) 00:36, 11 August 2024 (UTC)Reply
I think you are missing the point. There are semantic reasons why some common nouns take articles and some don't, but there are no such reasons for proper nouns referring to countries and regions; it's essentially arbitrary. Benwing2 (talk) 00:38, 11 August 2024 (UTC)Reply
It's rather that I don't see why the “default” state for a class of words should need to be marked every time, as opposed to just the exceptions. Nicodene (talk) 03:50, 11 August 2024 (UTC)Reply
Because from a French learner's perspective, while it's clear that every noun has an article, it's not necessarily assumed with a country name considering how other languages handle countries. I've seen it happen so many times in French-learning spaces, where folks are confused as to which countries use an article, what gender they are, what prepositions they use, etc. etc. AG202 (talk) 01:19, 12 August 2024 (UTC)Reply
Then mark the nouns and proper nouns that deviate from the general pattern of taking articles? I don't see the problem at all. Nicodene (talk) 01:25, 12 August 2024 (UTC)Reply
How are readers of this dictionary (which is an English dictionary, intended for English speakers, whose countries don't normally come with articles) going to magically know these Wiktionary-specific conventions? Benwing2 (talk) 01:28, 12 August 2024 (UTC)Reply
I'd be very surprised if there exists a single dictionary of French with entries/headwords like le chat, la femme, l'homme, la France, Israël, janvier, le zèbre. Nicodene (talk) 01:34, 12 August 2024 (UTC)Reply
Most if not all bilingual dictionaries leave out lots of pertinent info; that doesn't mean we need to do the same (and User:AG202 and I have already said there is a material difference between 'le chat' and 'la France', which you seem to be willfully ignoring). Benwing2 (talk) 02:23, 12 August 2024 (UTC)Reply
Also it's not like several monolingual & bilingual French sources don't list them either: see: French Wikipedia, the French government, the Canadian government, The International Labour Organization, the UN's term database, and the EU, in addition to Quebec's government showing it using the "visiter" examples. I don't know why we can't do the same, especially since the project is aimed at English speakers. AG202 (talk) 02:33, 12 August 2024 (UTC)Reply
If the existence of an exception to a rule means that everything that does follow that rule needs to be marked as such, then you should also, logically, do the same with all French nouns. Nicodene (talk) 02:35, 12 August 2024 (UTC)Reply

Bot rights

We really should have a policy for removal of bot rights from accounts that have become inactive for a reasonable period. We could say a bot is temporarily inactive after 2 years and permanently after 3 years. For example, NanshuBot and Websterbot have not edited since 2003, and TheCheatBot has made no contributions since 2008. There must be a notice to the bot owner prior to removal of rights. Any bot removed due to temporary inactivity must be restorable at the request of the owner. However, if the rights are permanently taken away after a longer period, it would require another vote for their reinstatement. Let me know what you think. — Fenakhay (حيطي · مساهماتي) 02:45, 9 August 2024 (UTC)Reply

@Fenakhay Sounds good to me. 2 years sounds good for a temporary revocation but for a permanent revocation maybe 5 years; 3 years seems maybe too close to 2 years. I would add that if a bot owner requests that the bot rights be restored, this doesn't reset the clock; if they ask for a restoration but don't do anything with their bot, then the bot is still subject to permanent revocation after the relevant period from the last edit performed by the bot. Benwing2 (talk) 04:14, 9 August 2024 (UTC)Reply
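As a rough illustration of how the thresholds discussed above would play out, here is a small Python sketch. It assumes the 2-year and 5-year figures from the reply above (the original post suggested 3 years for the permanent stage) and measures inactivity from the bot's last edit, so that a mere restoration request does not reset the clock. The function names and the dates in the usage note are hypothetical placeholders, not data from any real bot account.

```python
from datetime import date

TEMPORARY_AFTER_YEARS = 2   # figure suggested in this thread
PERMANENT_AFTER_YEARS = 5   # reply's suggestion; the original post said 3


def full_years_between(earlier: date, later: date) -> int:
    """Number of complete calendar years elapsed from earlier to later."""
    years = later.year - earlier.year
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


def bot_status(last_bot_edit: date, today: date,
               permanent_after: int = PERMANENT_AFTER_YEARS) -> str:
    """Classify a bot by the time since its last edit.

    Only the last edit matters: a restoration request without any
    subsequent bot edits does not change last_bot_edit.
    """
    idle = full_years_between(last_bot_edit, today)
    if idle >= permanent_after:
        return "permanent"   # reinstatement would need a new vote
    if idle >= TEMPORARY_AFTER_YEARS:
        return "temporary"   # restorable at the owner's request
    return "active"
```

With a placeholder last edit of 1 January 2003 and a check date of 9 August 2024, bot_status returns "permanent"; a bot that last edited on 10 August 2022 would still count as "active" on 9 August 2024, one day short of two full years.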
  Support Vininn126 (talk) 13:35, 9 August 2024 (UTC)Reply
  Support BABRtalk 07:06, 10 August 2024 (UTC)Reply
Knowing which accounts are bots is surely of great value to researchers and others who are studying patterns of contribution to this wiki. Given the special importance of the bot flag as a way of distinguishing non-human contributors to our entries, I'd rather deal with the account compromise risk by indefinitely blocking inactive block accounts rather than taking away the bot group. This, that and the other (talk) 11:41, 11 August 2024 (UTC)Reply
Should this say "inactive bot accounts" rather than "inactive block accounts", @This, that and the other? LeadingTheLifeOfRiley (talk) 22:36, 11 August 2024 (UTC)Reply
Yes! This, that and the other (talk) 00:48, 12 August 2024 (UTC)Reply
I agree with TTO here. When a bot hasn't edited for a long time, it doesn't suddenly turn into a human being. So if the only goal is to prevent the bot account from getting compromised, we might as well use a block. Ioaxxere (talk) 06:03, 15 August 2024 (UTC)Reply
Yes, it confused me that neither Fenakhay nor Benwing2 actually gave a rationale for the proposal, so I had to read between the lines and assume it was to minimise the risk of account compromise, in which case blocking is the more appropriate solution in this context.
If it's needed to make my position clearer, I   Oppose the proposal as put. This, that and the other (talk) 01:28, 17 August 2024 (UTC)Reply
  Support @Benwing2's suggestions. — Sgconlaw (talk) 20:44, 13 August 2024 (UTC)Reply
  Support removal after 5 years' inactivity. The bot operators could be dead now, but we don't know for sure. DonnanZ (talk) 08:56, 16 August 2024 (UTC)Reply
Hmm. This is tricky. I'm weakly inclined to   Oppose as written and   Support TTO's alternative idea to block inactive bots, because I appreciate that declaring inactive bots to have become human (!) — or, to have ceased being bots — is probably unhelpful... but I think the real issue is that we're using the bot flag to mean two different things at the same time, and that becomes a problem in situations like this where only one of the two things is true. We mean both "this account is [present tense] authorized to operate as a bot" and "this account is a bot". Perhaps what we really need is to have the devs add a new user group for "inactive or unauthorized bot" (which has no special rights), to which inactive, or recently-active and blocked unauthorized, bots can be switched...? But even then, blocking such bots seems advisable, so I am persuaded that blocking is a better way to accomplish the goals of "continue to indicate which edits came from a bot" and "prevent the accounts from making edits". - -sche (discuss) 04:33, 17 August 2024 (UTC)Reply
A comparison between bot activity and non-bot activity by the bot account holder may be useful. There may be cases where the bot account holder is still active, but not using their bot. DonnanZ (talk) 12:32, 18 August 2024 (UTC)Reply

Glottonym tweaks: Franco-Provençal, Venetian → Francoprovençal, Venetan

These changes would bring Wiktionary in line with the naming conventions of modern English scholarship, as found in for instance the Oxford Guide to the Romance languages (2016).

Context:

  • Francoprovençal has been the name used in French scholarship since the 1970s. Removing the older hyphen lessened the misleading impression that the language is some sort of secondary blend of French and Provençal (Occitan). There is also an element of typographical convenience.
  • Veneto has always been the name used in Italian scholarship, if I'm not mistaken, with Veneziano predominantly or exclusively reserved for the varieties spoken in Venice and environs, as opposed to the rest of the Venetan domain (Ve1, Ve3‒7).

Nicodene (talk) 22:05, 9 August 2024 (UTC)Reply

Support, the Venetan proposal in particular has been a long-awaited change, and given that a part of modern Anglophone scholarship handles this sensibly, we have little reason to stay behind. Catonif (talk) 22:15, 9 August 2024 (UTC)Reply
  Support. Never heard of Venetan but if this is the accepted term, so be it. Benwing2 (talk) 07:40, 10 August 2024 (UTC)Reply
Thoughts, @Apisite, IvanScrooge98, Samubert96, Sartma, Ultimateria, Urszag, Word dewd544?
(Active users who speak Venet[i]an or have contributed to its entries.)
Nicodene (talk) 20:52, 13 August 2024 (UTC)Reply
Thanks for pinging me. I am pretty indifferent to the hyphen question for Francoprovençal, while I am not fully convinced about Venetan; after all, Venetia is the anglicized name for the region of Veneto (if the linguistic reasoning is to distinguish the specific dialect of Venice from the language as a whole). But if Venetan is now most common in English-language professional literature, then I don’t think there is much to debate. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 21:21, 13 August 2024 (UTC)Reply
The region's name occurs ~15 times more often in English as Veneto than Venetia, according to a Google search for “region of ____” (119000 results versus 7960). The latter occurs generally in historical as opposed to modern contexts.
Also at the moment we have no (reasonable) way to indicate a term used in Venice proper, as opposed to, say, Padua. A dialect label like Venetian would be identical to the name we currently use for the overall language (contra, as mentioned, the name used in linguistics). Nicodene (talk) 22:05, 13 August 2024 (UTC)Reply
Yeah, as I said, I get the reasoning. The thing is Venetian, despite being most commonly a word for stuff from Venice specifically, is not a strictly technical term like Venetan is, which is what strikes me as a bit off given that this project is not directed at linguists but rather at the general public. And we could still label entries from the dialect of Venice as Venice, Venice dialect, Venice Venetian or something along those lines. But, again, it doesn’t mean I strongly oppose changing Venetian to Venetan. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 22:19, 13 August 2024 (UTC)Reply
The general public in Italy would be surprised to hear the dialect of, say, Padua described as veneziano. E.g. on Italian Wiki Dialetto padovano redirects to this page, where veneziano is mentioned solely as an external entity: “le parlate dei centri più importanti…sono state influenzate dal veneziano”.
So this is more about the general public of English-speaking countries, which isn't aware that such a language exists, as opposed to a local variety of (Standard) Italian. Nicodene (talk) 23:00, 13 August 2024 (UTC)Reply
Fair enough. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 23:09, 13 August 2024 (UTC)Reply
How do you pronounce "Venetan"? Benwing2 (talk) 23:20, 13 August 2024 (UTC)Reply
For me it's /ˈvɛnətən/ < /ˈvɛnətəʊ/ (≈Italian /ˈvɛneto/) + /-ən/. Nicodene (talk) 23:31, 13 August 2024 (UTC)Reply
@Benwing2: I would rather pronounce the term as /ˈvɛneɪtʌn/. --Apisite (talk) 10:49, 14 August 2024 (UTC)Reply
  Support If we are not going to have separate h2 for the main dialect groups of the Venetan language, then we must go for Venetan. As @Nicodene said, Venetian is the dialect of Venetan spoken in and around Venice. For instance, Paduans, Vicentines and Trevisans speak Paduan, Vicentine and Trevisan respectively, not Venetian. — Sartma 𒁾𒁉𒊭 𒌑𒊑𒀉𒁲 15:27, 15 August 2024 (UTC)Reply
@Benwing2 Shall we go ahead, then? Nicodene (talk) 18:00, 22 August 2024 (UTC)Reply

Synthesised audio files (again)

Hello, I'm still new here so not sure if I'm posting in the right place. But WT:TEA seems to be about individual words and my concern here is wider. There is a previous discussion of synthesised audio files at Wiktionary:Beer_parlour/2024/June#synthesized_audio_files but as far as I can see it was archived before reaching a firm conclusion, and I'm not sure what I'm supposed to do on encountering a batch of low quality synthesised audio files. They were added by a user whose only contributions seem to have been on 2 days in July, so I'm not sure if they're still active or are a known user who likes to contribute under different usernames. I'm sure somebody ought to have a word with the uploader, but that would best be done by someone with more experience than me.

The audio at fucking Nora is very unnatural, especially in intonation, and the one at paucilingual is of something else entirely. I suppose I could boldly revert all the additions, but I'm not sure whether the files themselves should be deleted or how to initiate that process. Moreover, the previous discussion seems to have no firm consensus on whether all synthesised audio files should be removed, or only the ones obviously of poor quality, and some of this batch seem somewhat reasonable (although all are obviously synthetic).

The previous discussion did seem to be inching towards developing some kind of process that an editor can follow when they encounter such files, but it doesn't look like a final consensus was reached on that either, or that it was written up on an appropriate help page. So while I'm flagging this particular batch up now, I think it would be helpful for there to be guidance available on what I'm supposed to do in future. LeadingTheLifeOfRiley (talk) 22:32, 11 August 2024 (UTC)Reply

@LeadingTheLifeOfRiley I personally think synthesized audio should not exist on Wiktionary, because it's not nearly good enough (even the best TTS) at equalling a native speaker's pronunciation. This is also probably why there's a request that only native speakers should record audio, since non-natives might make "mistakes". In that case there's definitely no doubt that programs will make mistakes, so I think synthetic audio should not be put on entries. Kiril kovachev (talkcontribs) 18:09, 19 August 2024 (UTC)Reply

Beautifying English etymology sections (2)

By my count, Wiktionary:Beer_parlour/2023/September#Beautify_etymology_sections resulted in consensus that:

  • English etymologies should start with "From" (or similar) rather than just a link.
  • English etymologies should end with a period.

Thus, for example, unhoaxable, currently {{prefix|en|un|hoaxable}}, is converted to: From {{prefix|en|un|hoaxable}}. These changes will be going forward in a week's time for English only unless there are concerns that need to be addressed. (Notifying @Benwing2) Ioaxxere (talk) 23:25, 11 August 2024 (UTC)Reply

My only comment on the matter is to use the + templates that support adding that, and to make sure that we don't get accidental "from" repetition such as "from Derived from English example" and such. Akaibu (talk) 01:02, 12 August 2024 (UTC)Reply
Trust me that I know what I'm doing. As for + templates, there isn't currently consensus for adding them everywhere, so it's done on a per-language basis. Benwing2 (talk) 01:24, 12 August 2024 (UTC)Reply
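The two agreed changes, together with the repetition concern raised above, can be sketched in a few lines of Python. This is only a hedged illustration of the rule, not the bot code that will actually be run: the function name is hypothetical, it only looks at the very start and end of the section text, and it assumes that templates whose names end in "+" (such as {{bor+}}) already supply their own lead-in wording.

```python
def beautify_etymology(text: str) -> str:
    """Prepend "From " and append a period where they are missing (sketch)."""
    text = text.strip()
    if not text:
        return text

    add_from = False
    if text.startswith("{{"):
        # Template name is whatever precedes the first "|" or "}}".
        name = text[2:].split("|", 1)[0].split("}}", 1)[0].strip()
        # Assumption: "+" templates already print "Borrowed from" etc.,
        # so adding "From" would produce the repetition warned about above.
        add_from = not name.endswith("+")
    elif text.startswith("[["):
        # Section starts with a bare link.
        add_from = True

    if add_from:
        text = "From " + text
    if not text.endswith((".", "!", "?")):
        text += "."
    return text
```

Applied to {{prefix|en|un|hoaxable}}, this gives From {{prefix|en|un|hoaxable}}. with a closing period, while a section that already reads "From ... ." is left unchanged.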

Orthography guidelines for Venet(i)an

Currently there are no formalised guidelines on what orthography scheme to lemmatise Venetan terms in, which has led to at least three different orthographies being used conflictingly at the moment. I have written a concise guideline list at WT:About Venetian/sandbox (the definition section will of course have to be updated if the Venetian → Venetan renaming goes through). I'm not familiar with the bureaucracy needed to get that out of the sandbox and make it official; is this BP post enough if it receives enough support? Catonif (talk) 10:11, 12 August 2024 (UTC)Reply

@Catonif: When there are no active editors and you are the only one to edit the language, you can just impose a standard first and change it later if people appear that have anything to say about it. If there are others that have an opinion, you can ping them on an About: page and discuss it there. I don't think a BP discussion is needed unless you personally want input. Thadh (talk) 12:31, 12 August 2024 (UTC)Reply
Right, thank you. @Sartma maybe you have input? Catonif (talk) 17:08, 12 August 2024 (UTC)Reply
I’ll chime in. I have made occasional edits to Venetian entries and I also thought some consistency was needed, thanks for working this out. Your proposal is very similar to the standardization I was thinking of, even though I don’t vibe with a couple of things: always marking ⟨è ò⟩ but not ⟨é ó⟩, and using ⟨qu⟩ rather than ⟨cu⟩. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 21:39, 13 August 2024 (UTC)Reply
Hi @IvanScrooge98, thank you for the input! About the accents, the different opinions by the three modern competing standards are
  1. only on /ɛ ɔ/, according to Grafia Veneta Unitaria (1995), which I followed
  2. only on /e o/ according to Brunelli (2012)
  3. on neither, according to Grafia Veneta Internazionale Moderna (2017).
What are you proposing, on both /ɛ ɔ/ and /e o/? If so that personally seems unnecessarily a bit cluttered. FWIW, vec.wikt also follows GVU 1995 in this regard.
About etymological Q I have no strong feelings; yet again from GVU (translated from the Italian): "While recognising that phonetically there is no difference from cu + vowel, by the principle of conforming, as far as possible, to Italian spelling habits, one will write aqua []. Nor, however evident the identity, will there be a levelling in every case with q (aqua, quor, squola) or with c (àcua, cuòr, scuòla)." Admittedly, both Brunelli and GVIM seem to later ditch this principle and opt for cu always, and I'm ready to do so as well for the sake of scientific consistency, but my ultimate goal is to balance out and find the middle point of different forces, two contrasting ones being what the standard guidelines prescribe and what is actual everyday practice. My question is: how familiar would cu spellings be for the average speaker (or rather, writer/reader) of Venetan? If it looks relatively natural then I can agree on switching to cu. If on the other hand it looks too unnatural and "artificial", or even straight-up wrong, then I'd rather keep the qu, which as far as I understand is still by far the commonest. Catonif (talk) 10:22, 14 August 2024 (UTC)
I’d go for accent on neither regarding ⟨e⟩ and ⟨o⟩. I think we should treat them like other vowels.
As you said, my main point about ⟨cu⟩ is consistency. In any case, other attested orthographies are normally included under “alternative forms”, and so we can have, e.g. aqua, àqua, àcua and the like to point to the main entry acua, under which they would be listed together. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:35, 14 August 2024 (UTC)Reply
The thing is, as apparently admitted by the proponents themselves, qu is only more common due to the influence of Italian orthography, where the u in cu normally represents a full vowel and not a semivowel. This distinction does not exist in Venetian the same way it doesn’t in Spanish, for example, and I think we should proceed accordingly. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:35, 14 August 2024 (UTC)Reply
@IvanScrooge98 Alright, I can bend towards no accent for the mid vowels on paroxytone (piane) words. This has the disadvantage of making the orthography ambiguous, making IPA sections necessary, but brings the scheme closer to both everyday practice and GVIM and arguably decreases visual clutter. As for ⟨cu⟩ I'm still unsure how natural it looks for speakers, but whatever. I'll make these two changes to the page and then officialise it into mainspace in a couple days from now if no further input is given. Catonif (talk) 12:46, 14 August 2024 (UTC)Reply
Thanks again for working on this! [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 13:14, 14 August 2024 (UTC)Reply
@Catonif: Native speakers of Venetan dialects use any sort of spelling. Very few native speakers are even aware of the various spelling proposals, so they tend to apply what they learned in school for Italian. I have no issues taking a bold approach here and going with what we prefer. My preference would be to use <cu> all the time. As for vowels, I would always indicate them in the headword, since some words in different dialects are distinguished only by their openness (I say dòcia, but 5 km from where I live they say dócia). I wouldn't write any accent in the entry name; there's too much vocalic variation depending on the dialect to fix a certain accent in the entry name. — Sartma 𒁾𒁉𒊭 𒌑𒊑𒀉𒁲 15:19, 14 August 2024 (UTC)
Hello @Sartma! Been a while. About accents in the headword, I'd prefer to avoid it as much as possible given sometimes it would be also in the entry name (for oxytone and proparoxytone terms) and sometimes it would only be in the headword, which although clear to us I believe would bring confusion to readers. Hence, I'd just omit accents on all paroxytone terms and leave dócia~dòcia shenanigans to the pronunciation section. Catonif (talk) 09:25, 16 August 2024 (UTC)Reply
@Catonif: Been a while indeed. I had to take a break from Wiktionary's toxic environment. You're right, we should use the pronunciation section for those cases. 👍 — Sartma 𒁾𒁉𒊭 𒌑𒊑𒀉𒁲 11:16, 16 August 2024 (UTC)Reply
Alright, done. The guidelines are now official. Catonif (talk) 11:05, 19 August 2024 (UTC)Reply
You didn’t change them as agreed though! XD
I edited the page. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 11:23, 19 August 2024 (UTC)Reply
Derp! I don't know how I could forget. 🤦‍♂️ Thank you for updating them. Catonif (talk) 11:31, 19 August 2024 (UTC)Reply

Character info box redesign

I've redesigned the box generated by {{character info}} to create a better mobile experience by using space more efficiently. You can see how it looks here. Unfortunately I'm not familiar with the inner workings of Module:character info so I may need help in that respect. Ioaxxere (talk) 06:10, 13 August 2024 (UTC)Reply

If the only differences are in CSS, we can just integrate it into Template:character info/style.css. body.mw-mf can be used to detect the mobile version. — SURJECTION / T / C / L / 19:15, 16 August 2024 (UTC)Reply
@Surjection: I meant that the desktop version and the mobile version of the new design have identical HTML, but they each have a very different HTML to the current design so the module would have to be changed. Ioaxxere (talk) 20:30, 17 August 2024 (UTC)Reply
@Ioaxxere This is just my opinion, but the desktop one seems a bit too small in the new version. Comparing the page 🝬 that you use as an example, the current one puts the Unicode/HTML entity in a line above the name of the character, which keeps it out of the way; yours, on the other hand, puts it in line with the name, which seems to take up too much space, considering the box has also been shrunk.
I also like the liberal use of space employed by the current design on the bottom, which takes up another row to display the Unicode codepoints for the previous and next characters (which I also prefer), and correspondingly, there's a gap underneath the name of the character block, which makes it feel less cramped.
If there isn't a problem with the space usage in your opinion on the desktop version, would it be possible to keep it quite similar to how it is already in terms of spaciousness? I enjoy the way that it doesn't feel constrained right now and has all the space it needs to show all the information, whereas the new design seems too small for me (since it would easily have room to grow if I viewed it on any page as it is).
What do you think? Kiril kovachev (talkcontribs) 18:18, 19 August 2024 (UTC)Reply
@Kiril kovachev: That's a good point, so I'll see if I can make the new design use a similar amount of space on desktop to the current design. But there is some information that I think is unnecessary, like the code points of the neighbouring characters which I think are always one less or one more than the current page's code point. Ioaxxere (talk) 19:26, 19 August 2024 (UTC)Reply
@Ioaxxere Yeah, you might be right about that, it is technically not very useful. Maybe you can disregard that part, since it's just my arbitrary preference, but I guess I've gotten used to how it is at the minute. But I don't want to impede your change if you'd rather remove it. Kiril kovachev (talkcontribs) 22:38, 19 August 2024 (UTC)Reply

Applying ux to English entries (2)

Last month in this discussion, @JeffDoozan told me that his bot followed some strict rules when applying {{ux}}. I would like to gain consensus for the following bot job for English:

  • Apply {{ux}} even when the usage example contains a wikilink (like at attire#Noun) or multiple bolded items.
  • Apply {{ux}} even when the usage example doesn't start with an uppercase letter, like at proper subset#Noun or protusible#Adjective.
  • Apply {{ux}} even when the usage example is a phrase (rather than a full sentence), like at puffing#Noun.
  • Apply {{co}} when the usage example contains two words, not including a leading "a" or "the", like at rabbit-proof#Adjective.

This algorithm isn't perfect and may result in some usage examples being misclassified, but it is still a big improvement over not using templates at all. Ioaxxere (talk) 19:21, 13 August 2024 (UTC)Reply
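
The four rules above can be read as a small decision procedure. What follows is only a rough sketch of that reading, not JeffDoozan's actual bot code: the helper names (`strip_markup`, `choose_template`, `wrap`) are hypothetical, and the assumption is that wikilinks and bold markup are ignored when counting words.

```python
import re

# Leading articles that do not count toward the two-word test (per the proposal).
LEADING_ARTICLES = {"a", "the"}


def strip_markup(text: str) -> str:
    """Unwrap [[target|label]] / [[target]] wikilinks and drop bold/italic quotes."""
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)
    return text.replace("'''", "").replace("''", "")


def choose_template(example: str) -> str:
    """Return "co" for two-word examples (after a leading article), else "ux".

    Wikilinks, multiple bolded items, lowercase starts and sentence
    fragments no longer block templating; they all fall through to "ux".
    """
    words = strip_markup(example).split()
    if words and words[0].lower() in LEADING_ARTICLES:
        words = words[1:]
    return "co" if len(words) == 2 else "ux"


def wrap(example: str, lang: str = "en") -> str:
    """Wrap a plain-text usage example in the chosen template."""
    return "{{%s|%s|%s}}" % (choose_template(example), lang, example)
```

Under this sketch a three-word collocation is classified as a usage example, which is exactly the kind of misclassification the proposal accepts as a trade-off.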

I have no problem with that. I suspect there are plenty of three-word collocations though. Andrew Sheedy (talk) 03:50, 18 August 2024 (UTC)Reply
@Ioaxxere This is because there are some usage examples that are entered as plain text, without using a template, right? I support these rules in that case. Kiril kovachev (talkcontribs) 18:01, 19 August 2024 (UTC)Reply
@Kiril kovachev: Yes, exactly. Ioaxxere (talk) 19:26, 19 August 2024 (UTC)Reply

planning to standardize names of categories like Category:Semantic loans from English

For historical reasons, we have umbrella categories like Category:Semantic loans from English that are missing the normal "by language" terminology. Only some such etymology categories are this way, cf. Category:Semantic loans from English vs. Category:Pseudo-loans from English by language. I am planning on renaming these to conform to standard umbrella category terminology, e.g. Category:Semantic loans from English -> Category:Semantic loans from English by language. This specifically applies to:

  • Phono-semantic matchings from LANG
  • Semantic loans from LANG
  • Terms borrowed from LANG
  • Terms calqued from LANG
  • Terms derived from LANG
  • Terms inherited from LANG
  • Terms partially calqued from LANG
  • Transliterations of LANG terms

Benwing2 (talk) 03:16, 15 August 2024 (UTC)Reply
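
For illustration only, the rename rule could be sketched as below. This is not the actual category-tree module code; the function name `new_umbrella_name` is hypothetical, and it assumes that each listed pattern simply gains a trailing " by language" (including "Transliterations of LANG terms").

```python
import re

# Umbrella-category name patterns from the list above; ".+" stands for LANG.
PATTERNS = [
    r"Phono-semantic matchings from .+",
    r"Semantic loans from .+",
    r"Terms borrowed from .+",
    r"Terms calqued from .+",
    r"Terms derived from .+",
    r"Terms inherited from .+",
    r"Terms partially calqued from .+",
    r"Transliterations of .+ terms",
]
PREFIX = "Category:"
SUFFIX = " by language"


def new_umbrella_name(title: str) -> str:
    """Return the proposed title, or the title unchanged if no rule applies."""
    name = title[len(PREFIX):] if title.startswith(PREFIX) else title
    if name.endswith(SUFFIX):
        return title  # already follows the standard terminology
    if any(re.fullmatch(pattern, name) for pattern in PATTERNS):
        return title + SUFFIX
    return title
```

Because the patterns are matched against the whole name and are case-sensitive, per-language categories such as "Category:English terms borrowed from French" are left alone.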

Sounds good to me. Ioaxxere (talk) 06:05, 15 August 2024 (UTC)Reply
Sounds fine. Vininn126 (talk) 10:45, 16 August 2024 (UTC)Reply
  Support for consistency. — excarnateSojourner (ta·co) 03:59, 20 August 2024 (UTC)Reply

Wardian

We have Wardian case (and Wardian cases), but no entry at Wardian. Should we? What should such an entry say?

Are there other, similar examples? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:09, 15 August 2024 (UTC)Reply

{{def-uncertain}} for graphemes

Could we add an option to this template that a reading is uncertain, for extinct scripts that are not completely deciphered? This came up for me with the Linear B glyphs, many of which we know are syllabic letters rather than logograms (Unicode even separates them into different blocks), but there is uncertainty or dispute over which syllable they transcribe. Saying their "definition" is uncertain is a weird way of putting that; normal parlance is "reading". We might also want "reading" to trigger different categories (or perhaps rename the existing categories "terms with uncertain meaning or reading").

For the logographic Linear B glyphs, the current wording is fine IMO, but in other scripts "reading" might be appropriate to logograms as well. There may also be glyphs that have both phonographic and logographic uses, in which case we might want "reading" in one section and "meaning" in the other.

There are cases in other scripts where this might apply to words written in an opaque phonographic script (e.g. Akkadograms) -- we might know the meaning but not the reading, or vice versa. Possibly in some cases we'd want to say that both the reading and meaning are uncertain, rather than a binary choice. kwami (talk) 22:22, 15 August 2024 (UTC)Reply

"der" template for phonological influence from substrate

Hey all, apologies if this has been asked before (cursory archive search suggests it hasn't), just checking in to clarify whether or not the "der" template can be used for languages that influenced a term, but which aren't ancestors of said term. Sanskrit नड (naḍa, reed), of Indo-European origin, is thought to have become retroflexed via substrate influence, most likely Dravidian. User:Djkcel says that Dravidian should be marked with a "der" template, but I'm not so sure about this, and the der template page makes no mention of this case, whether to confirm or deny. The Dravidian hasn't quite "hybridized" with the Sanskrit term (which would make "der" more clearly suitable), but it has exerted influence on it. Does anyone have any clue on this? User:Mahagaja, User:Vininn126. Thanks. Agamemenon (talk) 23:19, 16 August 2024 (UTC)Reply

I certainly see "influence" being a case for using der in quite a few entries such as English arbour, Spanish barrueco, Bulgarian агро- (agro-), German Runde, Dutch automaat, Sardinian piaghere, etc. I don't know that there is a template for designating "influence" from another language but perhaps that would be better than der... DJ K-Çel (contribs ~ talk) 00:14, 17 August 2024 (UTC)Reply
Since we even categorize loan meanings, any substitution, as “derivations” from foreign language, yes. See also the graph of foreignisms. (No templates for loan renditions and loan creations yet, as far as I know.) Fay Freak (talk) 01:16, 17 August 2024 (UTC)Reply
{{der}} has often been used in such cases. Vininn126 (talk) 07:16, 17 August 2024 (UTC)Reply

Category/template for terms that are not derived from another language, i.e. derived internally

We currently have no categorization/template for terms that come from within a given language and are not derived or borrowed from another, such as compounds (jackhammer), clippings (motherfuck), and affixation (frotteurism), among other methods of internal word formation. Even if the aforementioned templates were given whatever is needed to denote said internal derivation, my observations tell me that a dedicated template would likely be needed, as some etymologies don't slot neatly into an existing etymology template and are usually just given either the mention or link template. Akaibu (talk) 22:41, 17 August 2024 (UTC)

What do you mean? jackhammer is a compound term, so it's categorised as a compound, which is implied to be derived internally. Theknightwho (talk) 22:56, 17 August 2024 (UTC)Reply
@Theknightwho cases like booze, trevally and squeegee. Akaibu (talk) 23:14, 17 August 2024 (UTC)Reply
Okay, those are alterations - I don’t know if they form a coherent class of terms, though. Theknightwho (talk) 23:19, 17 August 2024 (UTC)Reply
@Theknightwho there's more than just alterations, peruse the top few hundred examples of https://petscan.wmcloud.org/?psid=29112298 for more cases Akaibu (talk) 23:49, 17 August 2024 (UTC)Reply
Isn't {{Template:from}} meant for generalized within-language derivations? Although it seems it may not actually function any differently from 'mention' at the moment, and as mentioned, more specific templates like the affix template should be used if appropriate. See e.g. puny, which was brought up when the etymon template was being introduced as an example of this type of derivation.--Urszag (talk) 00:55, 18 August 2024 (UTC)Reply
I wasn't actually aware this existed. Theknightwho (talk) 01:15, 18 August 2024 (UTC)Reply
That template currently doesn't categorize; if it can do so and be used for such internal derivations, that would satisfy me. Akaibu (talk) 01:58, 18 August 2024 (UTC)

Reconstructions and scripts

I was informed by @Victar that reconstruction entries on Wiktionary are rendered in Latin script corresponding to their vocalisation while only attested words are rendered in the language's original script, which is how Old Persian entries are already handled on Wiktionary.

However @Mellohi! changed the /k/, /w/ and /y/ in the Gaulish reconstructions to /c/, /u/ and /i/ so as to reflect the forms they are attested in Latin, and they argue that reconstructed words should be spelt like the other attested words in the language.

I have also come across a Hittite reconstruction using cuneiform in the form of Hittite *𒊭𒀀𒆪𒉿𒀭, which is even more problematic than regular reconstructions because of the large number of phonetic values per sign and the large number of signs corresponding to a single phonetic value in cuneiform, due to which even attested words in the many languages using cuneiform did not have fixed spellings in the script.

Can a fixed rule be established for all reconstructions in all languages? Personally, I argue for the phonetic-based use of the Latin script because reconstructions in the original scripts are not always feasible or predictable due to lack of standardisation in pre-modern scripts. Antiquistik (talk) 12:55, 18 August 2024 (UTC)Reply

That's not feasible in cases where the script form is reconstructable, even if the pronunciation is not; e.g. I don't think our Ancient Greek reconstructions should be converted to the Latin script. Theknightwho (talk) 13:30, 18 August 2024 (UTC)Reply
That's fair. In this case, there will need to be criteria established to decide which languages' reconstructions should be in Latin script and which should be in their native scripts. Antiquistik (talk) 13:49, 18 August 2024 (UTC)Reply
@Antiquistik I agree in principle, though there may be cases where one or other is warranted in the same language, depending on what the evidence for the reconstruction is. Theknightwho (talk) 10:17, 19 August 2024 (UTC)Reply
@Theknightwho This is fair as well, though I suppose a bit more tricky. I think it needs to be discussed more thoroughly. Antiquistik (talk) 11:21, 19 August 2024 (UTC)Reply
To quote what I wrote on your talk page:
"It's true, most academic reconstructions are written in Latin script, for clarity, and that's especially the case for languages like cuneiform Old Persian, which are orthographically unpredictable and difficult to read. To the point of Primitive Irish, I can't find any author that reconstructs it in Ogam,{{R:sga:McCone:1986|page=245}} and there is probably a good argument to reconstruct it in Latin script as well, and I would support that if brought up in a discussion."
I would absolutely support moving RC:Hittite/𒊭𒀀𒆪𒉿𒀭 to RC:Hittite/šākuwan. Hittite cuneiform has dozens of alternative characters and logograms. --{{victar|talk}} 20:03, 18 August 2024 (UTC)Reply
I would Support requiring reconstructions for languages in scripts that are incompatible with the Latin alphabet to be converted to it, whether by transliteration, transcription, or spelling out the phonetics. Cuneiform is a partially logographic script, so the correspondence between spelling and phonetics is pretty loose in many cases. As you progress toward a more-or-less phonemic alphabet or abugida, it's more of a gray area. I wouldn't object to keeping reconstructions in such a script if there are other reasons to do so. In cases such as Gaulish, which is already in the Latin script, minor adjustments for compatibility with attested spellings aren't that big a deal. Chuck Entz (talk) 20:38, 18 August 2024 (UTC)
Editors naturally make the correct choices. Clarity is the key criterion here. Middle Persian had multiple shit scripts only so it is reconstructed in Romanization, Old Persian is also not transparent enough. Akkadian does not have the problem since we lemmatize at Latin script per language-specific decision, while for Sumerian I can imagine some to prefer cuneiform, but the Sumerian internet community is not at Wiktionary yet.
People working with a language often think the script directly instead of what it stands for; one reads with visual memory. For Old South Arabian it is likely that one would use their script and forgo Romanization, since knowledge of many vowel values is wanting and acquaintance with the script is expected if one does anything at all with the language, and in a similar vein I made Punic reconstructions to avert entryism (an interesting context to apply this term, yes), also motivated by the idea that no one should enter ancient Romanizations (and Grecizations, hell why is this word unattested) outside of an appendix (as they would go out of hand and contain more textual corruption than truth). Fay Freak (talk) 21:28, 18 August 2024 (UTC)
"Grecizations" is probably unattested because the usual word ("Hellenizations") is easier to say. Andrew Sheedy (talk) 22:31, 18 August 2024 (UTC)Reply
@Fay Freak This would align with @Chuck Entz's comment that terms in logographic scripts would be more inaccurate to reconstruct compared to those recorded in more phonemic ones. For example, I would not necessarily support moving the Gothic and Prakrit reconstructions to the Latin script either, primarily because their representations in their respective scripts are certain to be accurate to how they would have been historically written (although I would not necessarily oppose such a move either if it was seriously proposed on Wiktionary).
Now, there is also the question of the extent to which various scripts have been used for the concerned languages as well. Given that this discussion started with concerns regarding Gaulish reconstructions, I think it is fair to take into consideration the fact that Gaulish was a primarily oral language which did not extensively use writing. I approve of primarily using the Latin script to render Pali on Wiktionary for the similar reason that Pali never had any one single primary script, and I think Wikipedia, though an independent project from Wiktionary, is also correct in primarily using the Latin script for rendering Sanskrit.
All this is to say, in addition to the question of the accuracy of reconstructions in scripts like cuneiform which had very loose spelling conventions which I have already addressed elsewhere in this discussion, there are also many more layers of nuance to take into account when choosing which reconstructions to Romanise and which ones to represent in scripts their languages are attested in. Antiquistik (talk) 11:46, 19 August 2024 (UTC)Reply
@Victar: It is also extremely difficult to find authors that write Hittite in cuneiform, but that doesn't mean we shouldn't do it. I think our reconstructions should be given in the script the language used, unless of course that is not possible. Thadh (talk) 10:37, 19 August 2024 (UTC)Reply
That certainly wouldn't be possible for the vast majority and even in cases where some (proto-)Hittite person was speaking a (proto-)Hittite word, the odds are very good that person was illiterate anyway, so it does make sense to me to try to normalize all these into a Latin script, as someone who does not know about Ancient Near East languages or the contemporary scholarship on them. —Justin (koavf)TCM 10:40, 19 August 2024 (UTC)Reply
I don't think the literacy of past speakers is relevant to reconstructions. Theknightwho (talk) 11:22, 19 August 2024 (UTC)Reply
@Theknightwho I would say that it depends on the specific language. It's fair to opt for Romanisation for a primarily oral language which barely used writing.
Though, when it comes to Hittite, the issue is instead about the reliability of reconstruction in the language's script, given that Hittite was written in cuneiform. And cuneiform being a mixed logographic-syllabographic script meant that it had very loose spelling conventions, not to mention that in cuneiform a single character could have several different phonetic values and a single phonetic value could be represented by several different signs. Antiquistik (talk) 11:54, 19 August 2024 (UTC)Reply
@Antiquistik To clarify: the important issue here is the reliability of reconstructions. In the case of Hittite, your (and Victar's) argument has been that we cannot reliably reconstruct terms in cuneiform, and therefore we shouldn't have reconstructions in it; for Gaulish, the fact that it had no particular literary tradition also prevents us from being able to reliably reconstruct an authentic representation, because no such representation ever existed in the first place. In such cases, we use a normalised Latin script to represent morphemes (though I admit the distinction becomes confusing for languages that have actually been attested using the Latin script, such as Gaulish, even if it was only done on an ad hoc basis). However, at no point does the literacy of the majority of speakers play any part in this: it's true that "the odds are very good that person was illiterate anyway", but that was true for Ancient Greek, Gothic and Prakrit, too. The fact is that it doesn't matter, for our purposes. Theknightwho (talk) 12:07, 19 August 2024 (UTC)Reply
@Theknightwho My bad, I forgot to clarify myself. Indeed, I don't think literacy of the speakers is a factor regarding whether or not reconstruction entries should be in scripts that were used to write the languages they are from.
Ability to create a reliable/accurate reconstruction in the script, and presence or lack of a literary tradition in a particular script or several specific scripts, should dictate whether or not to Romanise reconstruction entries. If either one of those is missing (e.g. the 1st one for Hittite; the 2nd one for Gaulish; both 1st and 2nd one for oral-only languages), then the reconstructions should be Romanised. Antiquistik (talk) 12:19, 19 August 2024 (UTC)Reply
@Antiquistik Agreed. Theknightwho (talk) 12:23, 19 August 2024 (UTC)Reply
@Thadh The problem is that there was not one single fixed way of spelling words in scripts like cuneiform, or even Egyptian and Anatolian hieroglyphs and Linear A and B, for that matter. There were multiple ways in which words could be written even if using a very limited set of characters, and, in the case of cuneiform, a single character could have several different phonetic values and a single phonetic value could be represented by several different signs. This makes any process of reconstructing using cuneiform and similarly-functioning scripts extremely unreliable in terms of accuracy. Antiquistik (talk) 11:25, 19 August 2024 (UTC)Reply
@Antiquistik: We often normalise the entries for such languages anyway, using one 'standard' spelling with multiple attested variants. Don't see how that would be problematic for the reconstructions.
As for @Koavf's point: If we're using arbitrary signs for recording a spoken language, might as well use a native script. What I said wasn't about "Proto-Hittite", it was about actual, recorded Hittite - scholarly consensus isn't always the best thing for us to follow. In the case of orthography, it's not. You're not arguing for reconstructing Middle English using Canadian syllabics just because it had a messy orthography - seems a bit disingenuous to force Latin unto those other languages, no? Thadh (talk) 14:31, 19 August 2024 (UTC)Reply
@Thadh What you are proposing is feasible for abjads, abugidas and alphabets, but not with logographic and/or syllabographic scripts like cuneiform and the like.
Hittite cuneiform alone had four signs for /ša/, one sign for /sak/, eight signs for /ku/, and four signs for /wa/.
And there were no rules regarding which signs to give precedence in cuneiform, which simply makes it too uncertain how to normalise an unattested term in a language using this script. Antiquistik (talk) 15:18, 19 August 2024 (UTC)
@Antiquistik: I know how Hittite works. I also know these signs have the same value and were used interchangeably, so we can actually decide ourselves which one to give a priority. We can, for instance, say that ša1 is from now on the 'prioritised' form, with the others being added based on attestation - there, problem solved, all reconstructions now use ša1. That is exactly how we would treat any other attested language.
For logographic scripts, yes, that won't work, but most languages have a segmental alternative. The only deciphered language I can think of that doesn't is Sinitic, and even there we can often just use the modern sign for e.g. classical Chinese. Thadh (talk) 15:26, 19 August 2024 (UTC)
I don't know that it's disingenuous: it's just the most convenient manner to write these reconstructed forms. It may well be the case that the literature uses cuneiform to write them, but it could also easily be the case that they use Latinized forms. There would be nothing wrong in principle with Cherokee or Sinhala script or whatever, but I find it highly unlikely that is what the sources use. —Justin (koavf)TCM 20:59, 19 August 2024 (UTC)Reply
  Support having reconstructed entries for poorly attested languages with weird scripts in Latin script. Making reconstructed Primitive Irish entries in Ogam is a pain in the ass. —Caoimhin ceallach (talk) 20:46, 19 August 2024 (UTC)Reply
@Caoimhin ceallach: This online keyboard easily solves the Ogham issue. I think it might even be how those written-in-ogham titles even exist in the first place. But I don't mind changing reconstructed Primitive Irish to romanized entry names; albeit scholars generally romanize Primitive Irish in all-caps. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 21:44, 20 August 2024 (UTC)Reply

Arbitrary break: recap

So we have the following categories of related issues:

  • Primitive Irish: Requires this online keyboard to type in conveniently, and its attested inscriptions consist of virtually exclusively personal names.
  • Gaulish: Written in Latin script natively, the "Romanized" reconstruction pages are inconsistent with attested native Gaulish orthography. E.g. the reconstruction pages spell /k/ with the letter K when the actual Gauls spelled the sound with the letter C, and the reconstructions tend to use -y- even though Gauls never spelled /j/ like that (they used the letter I).
  • Cuneiform and Persian scripts: general nightmare to handle digitally, Romanized reconstruction entries are preferred for these to avoid hassles.

Ceso femmuin mbolgaig mbung, mellohi! (投稿) 21:44, 20 August 2024 (UTC)Reply
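
As a rough illustration of the Gaulish point, the respelling amounts to a three-letter substitution. The sketch below assumes a plain K → C, W → U, Y → I mapping and nothing more; `respell` is a hypothetical helper, and the input in the usage note is a made-up string, not an actual reconstruction.

```python
# Letter substitutions described for Gaulish reconstruction titles:
# K -> C, W -> U, Y -> I (upper and lower case).
GAULISH_RESPELLING = str.maketrans({
    "k": "c", "w": "u", "y": "i",
    "K": "C", "W": "U", "Y": "I",
})


def respell(form: str) -> str:
    """Respell a romanized form; other characters (e.g. a leading *) are kept."""
    return form.translate(GAULISH_RESPELLING)
```

For example, `respell("*kawyo")` gives `"*cauio"`. The mapping is one-way: once respelled, u and i no longer distinguish vowels from the glides /w/ and /j/, so the phonemic value would have to be recorded elsewhere, such as in a pronunciation section.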

Pinging the other participants in this discussion: @Caoimhin ceallach, @Chuck Entz, @Fay Freak, @Koavf, @Thadh @theknightwho, @Victar. Antiquistik (talk) 07:49, 23 August 2024 (UTC)Reply
1 Could you add that to Wiktionary:About_Primitive_Irish? I think all hacks like that should be shared to help less savvy editors like me.
2 I agree with using C, I, U, also because that's what Delamarre's {{R:cel:DLG}} does, although he doesn't mark unattested forms as reconstructed, which we should definitely not copy.
3 Yes. —Caoimhin ceallach (talk) 13:12, 23 August 2024 (UTC)Reply
@mellohi!, Caoimhin ceallach Is it accurate to say that Gaulish was natively written in the Latin script? It appears to have used Greek and Greek-derived scripts like the Lugano script before the Roman conquest, and Latin epigraphically under Roman rule, but it seems to have otherwise been a primarily oral language with no literary form.
I must also note that Wiktionary sometimes uses its own spelling conventions, especially for reconstructions (see the Old Median reconstructions by @Victar compared to how they are presented in academic literature). Antiquistik (talk) 08:26, 24 August 2024 (UTC)Reply
I disagree on the cuneiform front. Linear B is also not great for writing in either, but that doesn't mean we shouldn't use it. Just normalise some phonetic spelling, and then list attested spellings, just like any well-covered language is (e.g. Old East Slavic). Thadh (talk) 17:49, 23 August 2024 (UTC)Reply
Just to be clear, these are guidelines for only reconstructions. Attested forms of Old Persian and Hittite cuneiform, ogham Primitive Irish, etc., should still be written in their attested scripts. --{{victar|talk}} 19:18, 23 August 2024 (UTC)Reply

St and St. abbreviations of Saint

I would like to discuss the idea of sorting St and St. under Saint in place name categories in particular. This idea, though radical, is not as daft as it sounds, as Oxford and Collins do this in their printed dictionaries. That fact can't be proved online, you need to own, or refer to, the actual volumes to find out.

There has been a long discussion in the Grease Pit (Wiktionary:Grease_pit/2024/August#What_to_do_with_St?) over the treatment of the abbreviations, which are currently mixed up with all other entries beginning with St-. This is hardly satisfactory, and User:Theknightwho has been the obstacle to change. I did prepare an example with St Neots of proposed sorting under Saint using a sortkey, but TKW saw fit to revert it.

This user has become far too used to getting his or her own way since becoming an admin, and needs reining in by more senior admins. I am non-admin, and don't want adminship, but this leaves me open to being downtrodden, despite over 255,000 edits since 2013. DonnanZ (talk) 15:31, 18 August 2024 (UTC)Reply

For context: English entries starting with "St" and "St." are not "mixed up": they're simply sorted according to the method that was decided last year in this thread, where we agreed that English sortkeys should ignore spaces and punctuation, in accordance with how most English dictionaries seem to do it. I reverted Donnanz's attempt to change all the sortkeys to manual sort=St., which put them out-of-step with all other English entries, and Donnanz has spent the last few days acting like this is about my personal preference versus his, for some bizarre reason, despite me and @Urszag explaining numerous times that he can't just ignore established consensus; I'm still not sure if Donnanz actually understands that, going by what he's just commented above. In fact, my personal preference would have been to not ignore spaces and punctuation in the first place, as I've told him several times, which would have avoided them being "mixed up" (as he calls it), but that's not what we decided.
That all being said, I would oppose any change to sortkeys to treat "St" and "St." as "Saint", because it makes automatic sorting impossible when it appears in the middle of terms, as it's impossible to distinguish from the abbreviation for "street": compare Bury St Edmunds (saint) with Bow St. Runner (street). Theknightwho (talk) 16:21, 18 August 2024 (UTC)Reply
It would not be necessary to apply a sortkey to those examples, only to entries beginning with St or St. Thus the sortkey would need to be selective, and only applied manually, not automatically, to those that need it. With the limited numbers of entries this is not insurmountable. DonnanZ (talk) 16:55, 18 August 2024 (UTC)Reply
If "St" and "St." are supposed to be sorted like "Saint", then Stoke St Gregory should sort before Stokes Bay and stokesia, because Stoke Saint Gregory would be sorted before them. By default, Stoke St Gregory is sorted after them, which means we're being inconsistent if we only sort "St" and "St." like "Saint" at the start of a term. Here's an alphabetically-sorted column template which demonstrates it: Theknightwho (talk) 17:13, 18 August 2024 (UTC)Reply
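To make the inconsistency described above concrete, here is a minimal Python sketch. It is not Wiktionary's actual sort-key code (which lives in Lua modules); the function name `sort_key` and its `expand_saint` parameter are hypothetical, and the only assumption taken from the thread is that English sort keys ignore case, spaces and punctuation.

```python
def sort_key(term, expand_saint="never"):
    """Hypothetical sort key: lowercase, ignoring spaces and punctuation.

    expand_saint controls where a word "St"/"St." is read as "Saint":
    "never" (current behaviour), "initial" (only as the first word),
    or "everywhere".
    """
    words = []
    for position, word in enumerate(term.split()):
        is_st = word.rstrip(".").lower() == "st"
        if is_st and (
            expand_saint == "everywhere"
            or (expand_saint == "initial" and position == 0)
        ):
            word = "Saint"
        words.append(word)
    # Drop everything that is not a letter or digit, then lowercase.
    return "".join(ch for ch in "".join(words).lower() if ch.isalnum())


terms = ["Stoke St Gregory", "Stokes Bay", "stokesia"]

default_order = sorted(terms, key=sort_key)
initial_only = sorted(terms, key=lambda t: sort_key(t, "initial"))
everywhere = sorted(terms, key=lambda t: sort_key(t, "everywhere"))

print(default_order)  # Stoke St Gregory comes last
print(initial_only)   # unchanged: "St" is not the first word
print(everywhere)     # Stoke St Gregory comes first
```

Under these assumptions, expanding "St" only at the start of a term leaves Stoke St Gregory after Stokes Bay and stokesia, whereas a consistent "St = Saint" rule would move it in front of both.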
Some of those would never appear in a place category. I don't think we should be looking for inconsistencies in sorting five or six characters into an entry. Stokes Bay is always going to sort before Stokesley, Stoke sub Hamdon, Stoke Trister and Stoke Works, and Stokes County would only appear in a US list, not an English one. There is no mixing of the two category lists. DonnanZ (talk) 18:37, 18 August 2024 (UTC)Reply
Okay, so you're advocating for a sorting system that isn't even internally consistent. Let me change that oppose to a strong oppose. There are plenty of other examples where this happens, so taking issue with the specific one I gave completely misses the point; never mind the fact that you're advocating for sorting place name categories in a special way that adds an annoying maintenance burden to ensure they're sorted consistently, and doesn't really make sense: either we do it everywhere, or we don't do it at all. Theknightwho (talk) 18:46, 18 August 2024 (UTC)Reply
I need to hear from other editors. DonnanZ (talk) 18:58, 18 August 2024 (UTC)Reply
Also you're not even correct about them not being mixed anyway: Stokesley and Stoke St Gregory both appear in Category:en:Places in England, so they would appear in the wrong order. Theknightwho (talk) 18:52, 18 August 2024 (UTC)Reply
It used to be, or may still be, the bibliographical standard to treat abbreviations at the beginnings of terms (not sure about what happens when they occur in the middle) as if spelled out in full for sorting purposes. Thus St and St. were treated as if spelled as Saint, and Mc and M‘ (e.g., in McDonald) as Mac. That being said, I can see how this is not obvious to the average user who may expect to see all the Saints grouped together in one lot, and Sts in another. It may also cause difficulty in sorting as Theknightwho pointed out, though I wonder if there is a technical way to say "sort St and St. as Saint". At this point I remain undecided. — Sgconlaw (talk) 18:55, 18 August 2024 (UTC)Reply
@Sgconlaw Not without solving the "street" issue. It's inherently ambiguous, so it would always have to be manual; see Bow St. Runner. Theknightwho (talk) 18:57, 18 August 2024 (UTC)Reply
@Theknightwho: it can't be that common for there to be entries with Street abbreviated as St. I would imagine the use of St to mean Saint is much more prevalent in dictionary entries. What if we defaulted St and St. to mean Saint, and provided a parameter to override it manually? (This is, of course, on the assumption that there is consensus that St and St. should be treated like Saint, and perhaps the Mac situation should be treated that way as well, which has yet to be decided.) — Sgconlaw (talk) 20:38, 18 August 2024 (UTC)Reply
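The "default to Saint, override by hand" idea floated above could look roughly like the following Python sketch. It is purely illustrative: the function `sort_key` and the `st_means` parameter are invented names, not an existing Wiktionary template parameter or module function, and the sketch assumes the current rule of ignoring case, spaces and punctuation.

```python
def sort_key(term, st_means="Saint"):
    """Hypothetical sort key that reads the word "St"/"St." as st_means.

    st_means defaults to "Saint"; an editor would pass "Street" (or "St"
    to leave the abbreviation alone) for entries where the default is wrong.
    """
    words = [st_means if word.rstrip(".") == "St" else word
             for word in term.split()]
    # Ignore case, spaces and punctuation, as English sort keys do today.
    return "".join(ch for ch in "".join(words).lower() if ch.isalnum())


print(sort_key("Bury St Edmunds"))                    # burysaintedmunds
print(sort_key("Bow St. Runner"))                     # bowsaintrunner (wrong reading)
print(sort_key("Bow St. Runner", st_means="Street"))  # bowstreetrunner
```

The second call shows the ambiguity raised earlier in the thread: nothing in the string itself distinguishes the two readings, so the override has to be supplied manually for each "street" entry.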
@Sgconlaw: Guess who created Bow St. Runner? It was TKW, today. DonnanZ (talk) 20:53, 18 August 2024 (UTC)Reply
@Sgconlaw I don't really see what benefit this extra maintenance work brings, as it's adding additional (ongoing) work for no clear purpose; the point of sorting terms in categories is to make them findable, while this achieves the opposite by defying user expectations, especially if we only apply it to some abbreviations and not others. FWIW, the OED Online sorts "St. Elmo's fire" under "St", not "Saint"; it's the same for all other saint terms. For instance, "stag", "St. Agatha's letters", "stage cloth": the precise same method we currently use. Theknightwho (talk) 21:01, 18 August 2024 (UTC)Reply
US place categories are not mixed with English place categories, and this applies to all countries. DonnanZ (talk) 19:29, 18 August 2024 (UTC)Reply
@Donnanz: It seems like if we do this, we should sort every other abbreviation under its expanded form for consistency, which will be problematic. Ioaxxere (talk) 05:27, 19 August 2024 (UTC)Reply

FWIW, although my personal intuition would be to sort things as spelled ('respecting' spaces), thus "sack, saint, Saint Bernard, sap, St, St Andrew's cross, St Elmo's fire, stab, stand, stellar", the few dictionaries I've managed to find "St Whatever" terms in don't match either my intuition or each other: The Webster's New College Dictionary, Third Edition sorts "standpoint, St. Andrew's cross, standstill, ..., stellular, St. Elmo's fire, stem"; Webster's New Universal Unabridged Dictionary, Second Edition has all the "Saint" and "St." terms as run-ins under "saint", alphabetized there as "Saint Bernard, ..., Saint Valentine's Day, St.-Agnes's-flower, St. Andrew's cross"; and ... unhelpfully, those are the only two I've managed to find "St" terms in.
I am not inclined to sort "St Whatever" terms under "saint Whatever" here: I think more people would look for "st..." terms under "st..." than would look for "st..." terms under "sa...". - -sche (discuss) 05:58, 19 August 2024 (UTC)Reply
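For comparison, the "sort as spelled, respecting spaces" ordering described above can be reproduced with a very small Python sketch. The function name is hypothetical, and the sketch assumes plain character-by-character comparison of the lowercased term, in which a space sorts before any letter; it is not how Wiktionary categories are currently sorted.

```python
def key_respecting_spaces(term):
    """Hypothetical sort key: lowercase only, keeping spaces and punctuation."""
    return term.lower()


terms = [
    "stellar", "St Elmo's fire", "sap", "stand", "Saint Bernard",
    "St", "sack", "stab", "saint", "St Andrew's cross",
]

for term in sorted(terms, key=key_respecting_spaces):
    print(term)
```

With this key all the "St ..." terms land together, directly after "St" and before "stab", because the space after "St" compares lower than any following letter.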

@-sche The OED follows the same method as The Webster's New College Dictionary, Third Edition, though I don’t have a print copy to hand. Theknightwho (talk) 09:09, 19 August 2024 (UTC)Reply
@-sche: (edit conflict) Yes, that's the unknown factor. Where do users expect to find them, under Saint or St? Currently we have alphabetical sorting like Staffordshire Moorlands, St Agnes, Stagsden. My attempt to group all saint entries together in an orderly fashion, where St Agnes would be followed by St Albans etc., was thwarted by TKW, who has thwarted me at every turn. I would be much happier if we could do that, so we can discuss that here too. DonnanZ (talk) 09:30, 19 August 2024 (UTC)Reply
I expect to find "St. Foo" and "St Foo" at "Saint Foo", not after "Solisbury" and before "Stanford" or whatever. —Justin (koavf)TCM 10:38, 19 August 2024 (UTC)Reply
I'm not sure where I would expect to find them (whether at Sa or St), but I would certainly expect to find all "St" terms together. At least, that's what would be most useful. Andrew Sheedy (talk) 17:57, 19 August 2024 (UTC)Reply
If we do that, then we should stop ignoring spaces in sorting altogether, because it would be a really bad idea to take spaces into account only in this one case, because it's inconsistent. Pinging @DCDuring @J3133 @RichardW57 @Vriullop @Benwing2 who participated in the last discussion about this. Theknightwho (talk) 18:16, 19 August 2024 (UTC)Reply
My understanding is that systems that face a broad user population have lots of special cases or very general architectures because users have diverse, complex, and seemingly contradictory needs. If we have one "special case", we will probably have others. What would be a way to accommodate them? DCDuring (talk) 19:45, 19 August 2024 (UTC)Reply
I have a problem with User:Theknightwho reverting my experiments. Get off my back, they will be reverted when I have studied them. DonnanZ (talk) 19:18, 19 August 2024 (UTC)Reply
Testing proved |sort=Staa| works well. I expect I will be told "No, we can't do that." DonnanZ (talk) 22:51, 19 August 2024 (UTC)Reply
@Donnanz Why would we sort St Georges-super-Ely as sort=Staa, as you just tested? That isn't consistent with anything that's been proposed. Theknightwho (talk) 22:55, 19 August 2024 (UTC)Reply
It's being proposed now, as another option. DonnanZ (talk) 23:07, 19 August 2024 (UTC)Reply
@Donnanz Okay, so I repeat the question: why would we sort St Georges-super-Ely as sort=Staa, as you just tested? Theknightwho (talk) 23:10, 19 August 2024 (UTC)Reply
As Andrew Sheedy said above: "I would certainly expect to find all "St" terms together." DonnanZ (talk) 23:17, 19 August 2024 (UTC)Reply
@Donnanz Yeah, but crudely shoving them in a random place, inconsistently from everything else, is not the way to achieve that. In Category:English lemmas, you'd be inexplicably placing them after sta. Theknightwho (talk) 23:21, 19 August 2024 (UTC)Reply
This is a kludge, which is inadvisable because it is not obvious to other editors why this particular sort key has been used. If there is consensus that St should be sorted as if spelled as Saint, then a proper technological solution should be developed. The focus of the discussion ought to be on determining what the consensus is. — Sgconlaw (talk) 23:25, 19 August 2024 (UTC)Reply
Precisely. I think there's probably consensus for not ignoring spaces, but I don't think there's consensus for treating "St" and "St." as "Saint". Theknightwho (talk) 23:27, 19 August 2024 (UTC)Reply
I don't want random sorting, I would have to do more testing there, my intention is to segregate the saints from other St- places, placing them at the beginning of the St- entries. But judging by your past actions, you will disagree with that. Goodnight. DonnanZ (talk) 23:32, 19 August 2024 (UTC)Reply
@Donnanz I've told you at least 5 times what my personal view is, but you obviously haven't listened. Theknightwho (talk) 23:40, 19 August 2024 (UTC)Reply
@Donnanz Why do you keep adding sort=Staa to various entries with the edit summary "test"? What is being tested here? Theknightwho (talk) 11:56, 20 August 2024 (UTC)Reply
I wanted to see whether random sorting occurs. No, it doesn't. In the Welsh list St Asaph, St Davids and St Georges-super-Ely appear in proper alphabetical order between Square and Compass and Stackpole at the moment. Those edits can be reverted once you have checked them. An alternative would be using sort=Stz, which in theory would sort them after the other St- entries. At least I am trying to find a solution to the problem, your personal view seems to stop you from looking for one. DonnanZ (talk) 12:34, 20 August 2024 (UTC)Reply
@Donnanz As @Sgconlaw pointed out, any manual sorting like that is a kludge, which we wouldn't want to use. As several users have said already, we shouldn't be carving out a special exception just for these, because the same issues apply to all kinds of other entries as well: all the entries starting "N.", "S.", "E." or "W.", any with "U.S." and so on. The problem is that you're not looking at the bigger picture, and don't seem to understand that your personal preference for these particular entries does not justify being inconsistent in how we sort things overall. It's not complicated.
You (still!) don't seem to grasp that you personally not liking something does not automatically make it a problem that needs to be solved: given that the sorting you don't like is the method used by the OED, it's clearly not nonsense; it just isn't what you'd prefer. If there is consensus for taking spaces into account when sorting (which would group "St." terms together), then we can do that, but if there isn't, then we won't. Again: this is a very simple concept that someone with your level of experience should understand by now, but you keep reverting back to the same arguments time and again and ignore everything that doesn't align with your view, which is not becoming of an editor with over 200,000 edits. You also constantly make things personal, which is not on. Instead of telling me what I think, absorb what I actually say. Theknightwho (talk) 12:54, 20 August 2024 (UTC)Reply
I noted Sgconlaw's comment, and the aversion to sortkeys. However, no one has come up with a better solution, let alone an automatic solution, AFAIK. So this issue may never be resolved. And being bossed about is a definite turn-off. I am not trying to make things personal, just make an observation. What about personal attacks on me? I have to accept them, it seems. It applies both ways. DonnanZ (talk) 13:32, 20 August 2024 (UTC)Reply
@Donnanz I just gave you a general solution in my last comment, and it's far from the first time I've mentioned it. What personal attacks on you are you referring to? Theknightwho (talk) 13:35, 20 August 2024 (UTC)Reply
Referring to personal attacks on me was a general comment, but they can occur. But a "general solution", are you referring to taking spaces into account when sorting? In general sorting, probably not. I created White Ball along with alt form Whiteball earlier; the latter won't be sorted. But White Ball is sorted before Whitechapel, which is fine, and Whitechapel is followed by White City, as can be expected. However special cases, such as here, can occur. We have to get our heads around those somehow. DonnanZ (talk) 14:51, 20 August 2024 (UTC)Reply
@Theknightwho: I just found some odd sorting for "the" though, The Charltons, Theddingworth, Theddlethorpe All Saints, Theddlethorpe St Helen, The Gorge, Themelthorpe, The Stukeleys. DonnanZ (talk) 20:14, 20 August 2024 (UTC)Reply
@Donnanz It's the same logic used for "white" above. If you remove the spaces, you can see it: "thecharltons", "theddingworth", "theddlethorpeallsaints", "theddlethorpesthelen", "thegorge", "themelthorpe", "thestukeleys" etc. The same thing happens with "red dog", "red drum", "rede", "redeem", "red ensign", "red hat", and so on. Theknightwho (talk) 20:39, 20 August 2024 (UTC)Reply
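The logic described above can be checked with a short Python sketch. This is an illustration only, not the Lua code Wiktionary actually runs; `sort_key` is a hypothetical name, and the sketch assumes the key is simply the lowercased term with everything except letters and digits removed.

```python
def sort_key(term):
    """Hypothetical sort key: lowercase, dropping spaces and punctuation."""
    return "".join(ch for ch in term.lower() if ch.isalnum())


places = [
    "The Stukeleys", "Themelthorpe", "The Gorge", "Theddlethorpe St Helen",
    "Theddlethorpe All Saints", "Theddingworth", "The Charltons",
]
words = ["redeem", "red hat", "rede", "red drum", "red ensign", "red dog"]

# Print each key next to the term it was derived from, in sorted order.
for term in sorted(places, key=sort_key) + sorted(words, key=sort_key):
    print(f"{sort_key(term):<26}{term}")
```

Sorting by these keys gives the same orders quoted in the thread: The Charltons, Theddingworth, Theddlethorpe All Saints, Theddlethorpe St Helen, The Gorge, Themelthorpe, The Stukeleys, and red dog, red drum, rede, redeem, red ensign, red hat.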
Yeah, it's odd-looking, rather than wrong sorting. I see Wikipedia does the same (List of United Kingdom locations: The-Thh). We have to live with it, though "Charltons, The" has occurred to me. OK. DonnanZ (talk) 20:55, 20 August 2024 (UTC)Reply
@Donnanz The problem is that the solution would require us to add spaces back in to compounds (e.g. "Whiteball" → "White Ball"), since it would need to know "redeye" should be "red eye", but "redye" should not be "red ye", so it would be a monumental effort to do it all manually. Theknightwho (talk) 21:58, 20 August 2024 (UTC)Reply
OK. I see there is a Derived terms section for the, but I don't bother with cataloguing any. There must be thousands of entries with that word. It occurs in a surprising number of place names; the road on two sides of Twickenham Green is named simply as "The Green". DonnanZ (talk) 22:45, 20 August 2024 (UTC)Reply

Coming soon: A new sub-referencing feature – try it!


Hello. For many years, community members have requested an easy way to re-use references with different details. Now, a MediaWiki solution is coming: The new sub-referencing feature will work for wikitext and Visual Editor and will enhance the existing reference system. You can continue to use different ways of referencing, but you will probably encounter sub-references in articles written by other users. More information on the project page.

We want your feedback to make sure this feature works well for you:

Wikimedia Deutschland’s Technical Wishes team is planning to bring this feature to Wikimedia wikis later this year. We will reach out to creators/maintainers of tools and templates related to references beforehand.

Please help us spread the message. --Johannes Richter (WMDE) (talk) 10:36, 19 August 2024 (UTC)Reply


AWB whitelist request


Request to be added to the AWB whitelist to clean up [[Category:wikipedia with redundant first parameter]] -saph668 (usertalkcontribs) 13:48, 19 August 2024 (UTC)Reply


What do y'all think about placement of possibly related terms? My thinking is to put them under See also and mark them (via qualifier) as possibly related (or probably related, when the likelihood is high); this notion reserves the Related terms section for only terms known with very high certainty to be related. But I will put them under Related terms (with the same qualifier) if most people prefer that. No big deal, as it doesn't come up very often. Just bouncing it off the beer parlour wall. If a consensus arises (or if one already did, in some talk namespace or other), it could be notated at Wiktionary:Related terms. Thanks all. Quercus solaris (talk) 01:22, 21 August 2024 (UTC)Reply

@Quercus solaris I do it this way too, reserving the "Related terms" section only for words that are 100% related, otherwise it basically spreads the false information that two words "are" related rather than just maybe being related. The qualifier idea is good though. Kiril kovachev (talkcontribs) 16:54, 21 August 2024 (UTC)Reply
My personal preference would be to have them under "Related terms" and note the uncertainty in a qualifier. But I don't really add related terms, so I'd rather leave the decision to those who do. Andrew Sheedy (talk) 18:54, 21 August 2024 (UTC)Reply

Removing information from Azerbaijani articles written in Abjad alphabet


Hello,

  1. I have been adding Azerbaijani words written in the Azerbaijani Abjad (Perso-Arabic alphabet) to Wiktionary. However, at some point, @Əkrəm Cəfər wrote to me on the page and asked if these words were Ottoman Turkish. I proved that they were not, since I used a lot of literature and dictionaries of the Azerbaijani language. Then I noticed that he had simply deleted the information from these pages and linked to the Latin versions of these words. I was offended by this, since most Azerbaijanis write in the Abjad script. This person is a citizen of the Azerbaijani Republic, which promotes and uses the Latin alphabet (I am not against this, and am even for it, but the Abjad alphabet is also part of this language), but Azerbaijanis are currently an indigenous people not only in the Azerbaijani Republic but also (if we take into account the subethnic groups) in Georgia, Iran, Iraq, Russia and Afghanistan. Azerbaijanis from these countries did not accept the Latin alphabet as the main script; for example, in Russia the official script for Azerbaijani is Cyrillic. That's not the point. The point is that they revert all my edits on the page. This raises the next question: why do templates like Template:az-arabic-noun for the Abjad alphabet exist if some users delete them and replace them with a link to the Latin version?
  2. The second question is related to the article müvəllidülma. @Fenakhay simply deleted the word written in Abjad and renamed the page to müvəllidülma. I had also written that the word is formed from one Arabic root, one Arabic affix and one Persian word, but he replaced the etymology with Ottoman Turkish. (Given that this Ottoman word is not in Wiktionary, this is not an argument, but still.) Maybe he has some evidence? Or did he do this because he considers the Ottoman language more "prestigious" than Azerbaijani?
  3. The third question is related to the constant rollbacks of information from articles written in the Abjad alphabet. I constantly run into the objection that "this word does not exist in modern Azerbaijani". This is due to the fact that the ancestor of the Azerbaijani language is not defined in Wiktionary, or rather it is defined as Old Anatolian Turkish, but this is too ancient an ancestor. For comparison, for Turkish the ancestor is given as Ottoman Turkish and then Old Anatolian Turkish, which is logical. But it turns out that modern Azerbaijani has no ancestor for the period from the 15th century to the beginning of the 20th century (according to various sources, modern Azerbaijani can be said to begin in 1922-1923, when the USSR occupied Azerbaijan, or in 1928-1939, when the USSR switched the Azerbaijani language to the Latin alphabet). Historically, however, the ancestor of Azerbaijani was considered to be Ajami Turkish ("Turkish of Persia"; it was the language of the Qajars, Afshars, Qizilbash etc., and is also the ancestor of Qashqayi and possibly of Khalaji etc.). It is known under different names, but this is the most common, since most often it was simply called Turkî (Turkish, Turc, Turk). I could write the Azerbaijani articles written in the Abjad alphabet under this language so as not to encounter restrictions, but as I understand it, this is not possible at the moment.

Please help me with this issue, since I have a lot of literature and I want to create pages indicating these words, but I encounter restrictions from other users. Sebirkhan (talk) 12:19, 22 August 2024 (UTC)Reply

well, well, well… since i'm mentioned here, and i feel that i'm being blamed in a disastrously manipulative way, i, with all my humility, consider myself righteous to say a few words in order to defend myself. jimmy mcgill ahh intro
first of all, i'd like to talk about my promotion of the supremacy of the almighty and the glorious of all the writing systems. as seen here and in the discussion in their talk page, the user does not hesitate to appeal to manipulative fallacies, such as the classic ones, fabricating an enemy, blaming others, and self-victimizing themselves, as seen here, here, and here: …he considers the Ottoman language more "prestigious" than Azerbaijani?. even the simple act of creating a discussion in this page is a very nice example of that, since they just couldn't manage to provide an argument and tried to end the convo asap in their user talk page and just want to continue agitating.
azerbaijani has been contributed for in wiktionary for years, in all 3 writing systems — latin, cyrillic and arabic (aka perso-arabic or abjad), and nobody denies nor asks for the opposite of that. however, handling three scripts for a language can never be an easy task, since there are quite a few options for how to approach to this situation. as far, the widely applied solution is keeping the main entries in latin-script pages, providing an {{az-variant}} to make the access to the spellings in other scripts easier, and use the {{spelling of}} template with the adequate script code, which links to the main entry, spelled in latin. i have already provided arguments about why this makes sense and why we should continue doing things this way (in the reply to this message). i humbly think and believe that, north azeri (the one written in latin script) is the only one that's regulated, and the only one that's recognized as an official language. south azerbaijani, on the other hand, (written in arabic script) is not regulated and doesn't have a widely accepted standard orthography. in addition, notable per my consideration, the latin-script azerbaijani is more accessible for people on the internet (and in the real life in general, in the world outside iran), is more well-documented, and has an overwhelmingly better online support than the arabic-script one. apple not having an arabic-script keyboard for azerbaijani is just a simple instance for this. people would immediately think of north azerbaijani if we don't explicitly mention "southern", even though the macrolanguage includes both. these, i believe, could be the reasons why the latin script was selected to keep the main azerbaijani entries. we've been doing this for years, and it is the de-facto solution for creating azerbaijani entries on wikt. this is the reason why i cleaned the entry in the first place. unfortunately, this caused an edit war, which we're trying to solve. 
by the way, they also suggested having duplicate entries for each script, which i object as it'd be inconsistent and hard to maintain.
and about that ottoman thing… the orthography of the azerbaijani language was quite similar to the one of the ottoman turkish language until both languages switched to the latin script, at approximately same times —the end of 1920s—. and that one dated orthography differs significantly from the current persian-style orthography and the underrated varliq standard. that's why i thought it could be an ottoman turkish word, accidentally input as azerbaijani. then they provided some dictionaries that are older than my grandma, and that's why i suggested them being marked as {{lb|az|dated}}. we just forgor this due to this silly thing we've been kept busy with, such a tragicomedy
in conclusion, i don't and cannot ever have any objections for azerbaijani being written using different scripts, unlike they try and manipulate as if i would. i am just a man of keeping things tidy and clean, appropriately formatted. that's all. it's such a shame that i've been wasting more than 2 hours to write this. i just don't understand why they just keep insisting on their nationalist views, while not being able to provide reasonable arguments. i remember that i overcame this when i was 14, like a year ago or smth. it shouldn't be that hard and we shouldn’t be struggling with such shenanigans.
p.s. i see no problem with discussing the legitimacy of the status quo approach to azerbaijani entries, but i'd prefer reasonable arguments, instead of that our armenian friend will be grateful bs.
p.s. 2. i feel like there could be better reasons for why the latin script is (and should be) perceived as the main one, so, if you have any ideas, or you think that i'm wrong, i'd be thankful if you just threw them below. tia :D əkrəm. 14:47, 22 August 2024 (UTC)Reply
You are confusing the concepts again, I can't speak or write in a language called "South Azerbaijani". My grandfather was from the Ganjabasar region, which now is the east of the modern Azerbaijan Republic. I have never spoken to a person who speaks South Azerbaijani and have never read a book written in this language. How does the alphabet relate to this or that dialect? The problem is that you are deleting information from pages where the word is written in the Azerbaijani Abjad alphabet. I am talking about information that is missing from the page of the word written in the Latin alphabet. Why do you use the template you mentioned in relation to the Abjad alphabet with a link to the Latin alphabet, and not vice versa? Sebirkhan (talk) 15:14, 22 August 2024 (UTC)Reply
well, the term south azerbaijani is just an alias i used to indicate the modern azerbaijani language, if that helps. if you're NOT talking about the modern language, well… obviously, {{lb|az|<dated|archaic|obsolete>}} is, in my humblest-to-god opinion, our only choice.
about that deletion, thing… JUST GO AND FUCKING READ WHAT THE FUCK I TOLD YOU, OKAY? umm, wait, actually, this is not okay. i wouldn't want to use such a language. all right, what about this: i kindly ask you to read my arguments and not to act like they never happened. please. i'm not going to write the same thing twice, just waitin' till you open your eyes and start seeing things you've never seen before.
oh and btw please use an autocorrection tool like grammarly or smth before you post your reply here, tia. əkrəm. 21:49, 22 August 2024 (UTC)Reply
@Sebirkhan You didn't prove anything, all of the dictionaries you provided are from the early 1900s and late 1800s, despite the fact that there was an Azerbaijani orthography reform in the 1980s. None of the dictionaries you provided proved those spellings are still recognized in the modern Iranian-Azerbaijani alphabet.

Secondly, You need to calm down and Assume good faith, it is absolutely disrespectful to accuse people you disagree with of having evil motives. For the record, Akram and Fenakhay were simply enforcing a long-standing wiktionary policy to consolidate all language information in one place. If an entry is repeated in multiple places then when one version of the entry is updated, all the others will become outdated. In those instances, it can be years before someone notices one entry is outdated. — BABRtalk 15:27, 22 August 2024 (UTC)Reply
Sorry, but I think you need to read my entire text to understand that the problem is a little broader than you thought. I cannot use Abjad because they keep deleting it. The ancestor of the Turkish language is given as Ottoman Turkish, which was used until the 1920s. This completely solves the problem in the case of the Turkish language. At the same time, there is no solution to this problem for the Azerbaijani language: its ancestor is given in Wiktionary as Old Anatolian Turkish, which was used until the 14th century at the latest (and which is also the ancestor of Ottoman Turkish, Ajami Turkish and Turcomani). Where is the ancestor of the Azerbaijani language in the period from the 15th century to the 1920s? Where can I write these words so that users of the Latin script do not delete information from the article and replace it with a template referring to the Latin spelling? Sebirkhan (talk) 15:38, 22 August 2024 (UTC)Reply
Please, someone, create the language category for Ajami Turkish (https://www.wikidata.org/wiki/Q110812703) and make it the ancestor of the Azerbaijani language. It would look like this: Azerbaijani comes from Ajami Turkish, which comes from Old Anatolian Turkish.
I do not know how to do it in wiktionary. Sebirkhan (talk) 17:08, 23 August 2024 (UTC)Reply
I'm surprised this hasn't been added yet. Nicodene (talk) 20:44, 23 August 2024 (UTC)Reply
The Wikipedia article Q110812703 was authored from late 2022, in the whole lot of languages. Guess Azerbaijanis organize well and their DMN made new projects during the pandemic. But they have missed what we have done all the time on English Wiktionary. I added Classical Azerbaijani two years earlier. Nobody outside global Azerbaijan could have expected such a preference for another term, even Azerbaijani Turkologists in the West, which seem content with this label. Unlike Classical Persian it has not been added as an ancestor of Azerbaijani for no discernible reason later, after our language data modules have been rewritten and reorganized multiple times and only acquired the option of us setting an L2 language to have an ancestor in an etymology-only variety of itself.
Either way editors should be aware what they do here. They compartmentalize Southern Azerbaijani incorrectly if the target audience of this dictionary necessarily reads Latin characters. It matters less then what speakers mostly use. Anyone seeking out en.wiktionary.org can get through Arabic Azerbaijani forms redirecting to Latin spellings, there is little grounds for animosity. Fay Freak (talk) 04:48, 24 August 2024 (UTC)
Hello, so can you please add Ajem-Turkic (aka Ajami Turkish, Ajami Turkic) as an ancestor? As for the term Classical Azerbaijani (which is listed as a variety of modern Azerbaijani in Wiktionary), I will try to explain why it is not entirely appropriate (and, in general, why the word "Azerbaijani" is not appropriate for the ancestor of the Azerbaijani language). Ajem Turkic is the ancestor of several languages, but whether these are separate languages or dialects is a question on which there is no clear consensus; I mean Qashqai, Afshari, Iraqi Turcoman, Sonqori and Qizilbash. In The Turkic varieties of Iran, Christiane Bulut says (page 406) that the written language for these languages was Ajam Turkic from the 16th century on. It is a good term. It is also a good term to use because it does not require each of these languages to have an ancestor called "classical Qashqai" (or old Qashqai), "classical Sonqori", etc., especially considering that their vocabulary is identical with a few exceptions. It is obvious that they all descended from one ancestor, so the only question now is what to call this ancestor. The above-mentioned languages/dialects have no relation to the region called Azerbaijan, and their speakers have never called their language Azerbaijani. 178.46.58.85 11:49, 24 August 2024 (UTC)
m["trk-ajm"] = {
"Ajami Turkish",
110812703,
"trk-ogz",
"fa-Arab",
ancestors = "trk-oat",
entry_name = {["fa-Arab"] = "ar-entryname"},
}
192.71.227.211 15:20, 24 August 2024 (UTC)

Sign up for the language community meeting on August 30th, 15:00 UTC

Hi all,

The next language community meeting is scheduled in a few weeks—on August 30th at 15:00 UTC. If you're interested in joining, you can sign up on this wiki page.

This participant-driven meeting will focus on sharing language-specific updates related to various projects, discussing technical issues related to language wikis, and working together to find possible solutions. For example, in the last meeting, topics included the Language Converter, the state of language research, updates on the Incubator conversations, and technical challenges around external links not working with special characters on Bengali sites.

Do you have any ideas for topics to share technical updates or discuss challenges? Please add agenda items to the document here and reach out to ssethi(__AT__)wikimedia.org. We look forward to your participation!

MediaWiki message delivery (talk) 23:20, 22 August 2024 (UTC)

User:ColumbaBushBot

Hi everyone - I recently started a vote, Wiktionary:Votes/bt-2024-08/User:ColumbaBushBot_for_bot_status, for bulk-renaming Assyrian Neo-Aramaic inflection templates, i.e. Category:Assyrian_Neo-Aramaic_inflection-table_templates.

Here's some examples of changes it could be used for

Anyhoo - I invite everyone in the community to discuss and share their thoughts. ColumbaBush (talk) 07:03, 23 August 2024 (UTC)

Blocked 1 week

Hello, I would like to know why my (@Sebirkhan) account was blocked. It says "Re-adding previously deleted entries", but I just tried to create a new page: Ajami Turkish. 178.46.58.85 20:11, 23 August 2024 (UTC)

@178.46.58.85 JSYK the proper way to request an unblock is to use {{unblock}} on your talk page. — BABRtalk 20:36, 23 August 2024 (UTC)
thank you 194.87.107.107 20:59, 23 August 2024 (UTC)
You created Ajami Turkish and then Fenakhay moved it to Ajami Turkic 35 minutes later. Then an hour later you created Ajami Turkish again, Fenakhay deleted it, and a few minutes later you created it again, and he deleted it again. I think Fenakhay moved it because in English, Turkish refers to the Oghuz language of Turkey, and Turkic refers to other related languages that are in the Turkic family. — Eru·tuon 20:49, 23 August 2024 (UTC)
But why does the Turkish entry say that Turkish is a synonym of Turkic? As an Azerbaijani Turk, I can say that the word Turkic is used for things like Turkic runes, Turkic tribes and other ancient things (it is also common to all Turkic nations), but in the case of Azerbaijani we use "Turkish", for example Azeri Turkish or Azerbaijani Turkish (see w:Azerbaijani_language). So it is not an incorrect word but a synonym, and I was blocked because I used a synonym? 194.87.107.107 20:58, 23 August 2024 (UTC)
Ajami Turkish is not attested in English; it is only your protologism. In Turkic linguistics, the variety is known as Ajem-Turkic or Ajami Turkic, which is why I moved it to the latter form. Recreating a protologism three times is disruptive and block-worthy no matter what. — Fenakhay (حيطي · مساهماتي) 21:01, 23 August 2024 (UTC)
As far as I know, "Ajami Turkic" is a protologism of H. Boeschoten; I did not find any other sources that do not refer to him. Can you share links to the term you mentioned above from at least three different authors, so that we know for sure that this term is preferable?
Also, Ajami Turkish is a translation of the original term "Turkî Ajami". 178.46.58.85 21:13, 23 August 2024 (UTC)

New Wiktionary logo: request for feedback (replacing 维 with 維)

Hi, I've previously proposed making this change from 维 to 維 and received positive support and suggestions. User:Cypp0847 has now kindly created an SVG, on which I would like to seek your opinion. Here is the original (current version) and here is the newly created one.

 
Proposed new Wiktionary logo

I think something like this looks good. If anyone has any thoughts on, for example, font selection, stroke thickness, or anything else, I'd love to hear them. I was thinking a thinner font might look clearer when displayed at small sizes, but would like to get y'all's thoughts on it. It might also be better to have this as a separate file on Commons in the meantime. In any case, looking forward to seeing this come to fruition! Thanks, ChromeGames (talk) 23:55, 23 August 2024 (UTC)

sounds good to me — nd381 (talk) 04:12, 24 August 2024 (UTC)
sounds boring to me Zebres rouges (talk) 08:39, 24 August 2024 (UTC)
Thanks for tagging me in this thread. I was able to find two font types that can display 維, but they are either a bit heavier (the one now shown) or much thinner. I'd be grateful if anyone could propose a better font so that I can improve on it. Thanks. Cypp0847 (talk) 12:40, 24 August 2024 (UTC)
If you can find a version with thinner strokes, that would be great, but I'm happy enough with it either way. Andrew Sheedy (talk) 15:24, 24 August 2024 (UTC)