Wiktionary:Beer parlour/2021/October

Definitions of Letters

As words of a particular language, many letters have definitions such as "the second letter of the Welsh alphabet". (The Welsh entries themselves are not quite so bad, as they also spell out the letter and give its predecessor and successor.) Such definitions are intrinsically unstable, for letters may be inserted in an alphabet. For example, the letter 'j' has been added to the Welsh alphabet since I was a child, and as a result of different sources we now have the opening definition "the fourteenth letter of the Welsh alphabet" for both J and L! As a result of the deletion of letters, both Ll and N are defined as 'the 14th letter of the Spanish alphabet'. --RichardW57m (talk) 11:11, 1 October 2021 (UTC)[reply]

I therefore feel that it would be appropriate to change definitions of one-character letters from "the nth letter of the WW alphabet" to "the letter of the WW alphabet used as the header word of this entry", and add "It is the nth letter of the WW alphabet" to the "Trivia" section of the entry. History may cause the Trivia section to expand. Multi-character letters would be handled by analogy. As boldly making this change might be considered vandalism, what do people feel about this proposed change? Does it need a vote? --RichardW57m (talk) 11:11, 1 October 2021 (UTC)[reply]
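The instability described above can be illustrated with a minimal sketch that derives the ordinal from an ordered list of the alphabet instead of hard-coding it in each entry. The function name `letter_position` and the two list names are assumptions made for illustration, not existing Wiktionary modules; the lists give the usual Welsh order with and without 'j'.

```python
# Hypothetical sketch: compute "nth letter" from an ordered alphabet list,
# so that adding or dropping a letter shifts every later position at once.
WELSH_WITHOUT_J = [
    "a", "b", "c", "ch", "d", "dd", "e", "f", "ff", "g", "ng", "h", "i",
    "l", "ll", "m", "n", "o", "p", "ph", "r", "rh", "s", "t", "th", "u", "w", "y",
]
# The same order with 'j' inserted after 'i'.
WELSH_WITH_J = WELSH_WITHOUT_J[:13] + ["j"] + WELSH_WITHOUT_J[13:]


def letter_position(letter, alphabet):
    """Return the 1-based position of a (possibly multi-character) letter."""
    return alphabet.index(letter.lower()) + 1


print(letter_position("L", WELSH_WITHOUT_J))  # 14
print(letter_position("J", WELSH_WITH_J))     # 14
print(letter_position("L", WELSH_WITH_J))     # 15
```

With one shared list per language, J and L could not both end up as "the fourteenth letter", whichever source the list follows.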

Should we be documenting the use of letters in non-additive numbering systems, such as 'Section 5(c)'? The most significant feature of such systems is that some letters are not used in such lists. I can see an argument that such documentation belongs to a grammar, rather than a lexicon.--RichardW57m (talk) 11:11, 1 October 2021 (UTC)[reply]

I feel like this discussion will be pointless if the vote about letters entries passes. Thadh (talk) 11:23, 1 October 2021 (UTC)[reply]

@Thadh: How so? Are you assuming that all the letter entries of a language can be squeezed into a single table? --RichardW57m (talk) 12:30, 1 October 2021 (UTC)[reply]

@RichardW57m: Not necessarily in a table, but they probably won't look the same way they do now, so it doesn't make much sense to discuss the way they look in entries before we know where the vote's heading. Thadh (talk) 13:47, 1 October 2021 (UTC)[reply]

Rhyming categories for Middle Chinese

I think all the data for Middle Chinese rhymes are already there. Those data were sourced from rhyme dictionaries in the first place. Is there a plan for actually implementing Middle Chinese rhyming categories? This may even be a fairly good case for automation. --Frigoris (talk) 16:53, 1 October 2021 (UTC)[reply]

HSK lists of Mandarin words update

Currently, Wiktionary has Appendix:HSK list of Mandarin words accumulating all the vocabulary of the old (pre-2010) HSK test. Recently, the exam was reformed, and the lists of words and characters were published. See this pdf for official specifications. Thus, I propose to update the appendix.

I made drafts of the new HSK word lists:

HSK Beginner (levels 1-3): all three levels

HSK Intermediate (levels 4-6): level 4, level 5, level 6

HSK Advanced (levels 7-9): a-h, j-s, sh-zh

The words are OCRed from the paper, and then converted into traditional characters with some manual corrections. I think some proofreading is still needed.

The following problems arise here:

What should be done with the old appendix?
How should the new appendix be divided? The current version of the HSK has 9 levels grouped in 3 ranks. The high levels (7-9) are not delimited, but they contain roughly as many words as all the preceding levels combined (5636 vs 5456). Note that it's computationally heavy to have a huge number of words in Template:zh-l on a single page.
There is a category tied to the old word lists, see Category:Mandarin by difficulty level. You may want to reorganize it.
Many words in the HSK can be considered SoPs, and some of them were previously deleted on that ground (see the red links on my drafts).
Many words in the HSK have optional erhua. How should they be listed in the new Appendix?
I think everyone would agree on inclusion of traditional forms of the words, but what should be done about the variant pronunciations (Taiwanese or colloquial Mainland) not listed in the official HSK paper? Should they also be included? --YousuhrNaym (talk) 23:51, 3 October 2021 (UTC)[reply]
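The question of how to divide the appendix can be explored with a small sketch that cuts a word list into subpages of bounded size. The limit of 1000 entries per page and the name `split_into_pages` are assumptions for illustration only; a workable limit would depend on how heavy Template:zh-l proves to be in practice.

```python
def split_into_pages(words, max_per_page=1000):
    """Split a word list into consecutive chunks of at most max_per_page items."""
    if max_per_page < 1:
        raise ValueError("max_per_page must be at least 1")
    return [words[i:i + max_per_page] for i in range(0, len(words), max_per_page)]


# Placeholder entries standing in for the 5636 advanced-level words.
advanced = [f"word{i}" for i in range(5636)]
pages = split_into_pages(advanced)
print(len(pages), len(pages[-1]))  # prints: 6 636
```

Splitting by count rather than by initial letter keeps page sizes even, at the cost of less memorable page names than "a-h, j-s, sh-zh".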

Let's talk about the Desktop Improvements

Hello!

Have you noticed that some wikis have a different desktop interface? Are you curious about the next steps? Maybe you have questions or ideas regarding the design or technical matters?

Join an online meeting with the team working on the Desktop Improvements! It will take place on October 12th, 16:00 UTC on Zoom. It will last an hour. Click here to join.

Agenda

Update on the recent developments
Sticky header - presentation of the demo version
Questions and answers, discussion

Format

The meeting will not be recorded or streamed. Notes will be taken in a Google Docs file. The presentation part (first two points in the agenda) will be given in English.

We can answer questions asked in English, French, Polish, and Spanish. If you would like to ask questions in advance, add them on the talk page or send them to sgrabarczuk@wikimedia.org.

Olga Vasileva (the team manager) will be hosting this meeting.

Invitation link

Join online
Meeting ID: 829 3670 1376
Dial by your location

We hope to see you! SGrabarczuk (WMF) 15:09, 4 October 2021 (UTC)[reply]

Unifying the transliteration of ʾalef and ʿayin in Semitic languages

Dear Wiktionary Semitists, I'd like to bring to your attention the current lack of consistency in how ʾalef and ʿayin are transliterated across Semitic languages. Have a look at the following pages and compare transliterations, for example:

The inconsistency is both inter- and intra-linguistic. It is quite confusing, and since it's basically just a stylistic question, I'd like to start a discussion on whether we should unify to the more traditional (but not user friendly, since they're small and difficult to tell apart) /ʾ/ and /ʿ/ or the more modern (and much more user friendly) /ʔ/ and /ʕ/. Opinions? Thoughts? Let's discuss! — This unsigned comment was added by Sartma (talk • contribs) at 12:22, 5 October 2021 (UTC).[reply]

For Amharic, ʾ and ʿ are the ones in use and since these aren't contrastive, I would like to keep following that practice. I don't have a strong opinion on other Semitic languages though, but /ʔ/ and /ʕ/ do seem more user-friendly in languages where that distinction is relevant. Thadh (talk) 19:30, 5 October 2021 (UTC)[reply]

I'd rather consistency between languages that frequently appear together, like the Ge'ez-script languages or Arabic topolects. I don't see any reason why there should be consistency between all Semitic languages, which only appear next to each other on protolanguage entries. —Μετάknowledge^{discuss/deeds} 20:32, 5 October 2021 (UTC)[reply]

In my own handwritten notes I find I'm using the IPA symbols as just clearer. We don't have to use pure IPA in transcriptions, but the traditional little curly apostrophes, barely readable in a printed book, become impossible in a computer typeface. The IPA symbols magnify them and make them readable. If you're going to use š rather than sh in transcriptions, you're half way to pure phonetic symbols. The apostrophes are appropriate for semi-technical formats like maps and history books, but for a more linguistic purpose, use clear, readable, unambiguous symbols. --Hiztegilari (talk) 20:58, 5 October 2021 (UTC)[reply]

I support the IPA symbols except for the Gəʿəz-script languages, in which field the half rings seem uncontested, and where, as mentioned, the distinction is also less contrastive. For Akkadian I don’t know. Fay Freak (talk) 22:57, 5 October 2021 (UTC)[reply]

I have seen that in the Routledge volume The Semitic Languages, most authors use ʔ and ʕ even when they use conventional non-IPA symbols otherwise, e.g. ʔǝgziʔ-ä sämay yä-ṣnǝʕ mängǝśt-ǝyä (Butts, chapter "Gǝʕǝz"). I don't know if this is a general trend, but consistently using ʔ and ʕ in place of ʾ and ʿ is nothing unseen. –Austronesier (talk) 10:33, 6 October 2021 (UTC)[reply]

@Austronesier: True, I remember these fashionable books. They owe it to their character as general overviews, while Wiktionary’s mission is to document the individual languages in detail, as one does when dealing with a narrow selection of languages. I framed the field as Ethiopian studies (Äthiopistik). While this is a niche subject (Orchideenfach) and I do not know the people who nowadays study it, I have little doubt that the bulk of the field would be dismayed to see a deviation from the transcription system which we currently apply automatically, and which the Wikipedia article on Geʽez script follows and presents without any question or glance at an alternative (so nobody seeks an article like Romanization of Arabic for Ethiopian Semitic), and Ethiopists would rather refrain from any change to it. Fay Freak (talk) 16:05, 6 October 2021 (UTC)[reply]

@Fay Freak: Good point. I can confirm from my very own experience that editors of such overview volumes set standards which contributors wouldn't normally follow in more specialized works: e.g. I was urged to change the name of a language to make it conform with the ISO standard (in that special case a real abomination). Since you say that the Gəʿəz transliteration in that book was adjusted to an in-volume standard that is otherwise uncommon, I agree we shouldn't really follow it. –Austronesier (talk) 16:28, 6 October 2021 (UTC)[reply]

I've just picked Al-Jallad as an example: in the Routledge volume, he uses ʔ and ʕ in the Safaitic chapter; but in his Safaitic grammar (Brill), he naturally uses ʾ and ʿ. –Austronesier (talk) 16:46, 6 October 2021 (UTC)[reply]

@Austronesier, Fay Freak, Thadh, Metaknowledge Ok, it looks like the majority is ok with using different signs depending on the language. But what about those languages that don't seem to have a standard at the moment? Like Aramaic (in its various varieties), Hebrew, Arabic and its topolects? To be honest, despite much preferring ʔ and ʕ, I'm more than happy to unify everything to ʾ and ʿ. In the end, there's no real "tradition" that uses ʔ and ʕ, these are just the more "modern" style. To me it's really strange to see Standard Arabic using ʾ/ʿ and other Arabic topolects using ʔ/ʕ, for example. There's no reason why it should be like this. What shall we do? Sartma (talk) 15:16, 7 October 2021 (UTC)[reply]

ʔ and ʕ—the easier if standardization is less relevant. I made the exception only for Ethiosemitic—which is separated by a mere, anyway; I think it will vex you not if we have ʾ and ʿ for Ethiosemitic and ʔ and ʕ elsewhere. Fay Freak (talk) 15:31, 7 October 2021 (UTC)[reply]

Arabic needs input from a great deal more people than will see and interact with this; we'd want a dedicated discussion at Wiktionary talk:About Arabic. As for Aramaic, it will never be completely unified, because some of the modern neo-Aramaic varieties have romanisation traditions that emerged independently from scholarly usage, and should be left as they are. For the long-extinct Aramaic varieties, we can do as we like, and though ʾ and ʿ are the closest we have to a standard for them, I would be happy to switch them over to ʔ and ʕ — although that could be putting the cart before the horse, in that most of the entries don't have romanisation at all and the scheme isn't completely settled anywhere. —Μετάknowledge^{discuss/deeds} 17:37, 7 October 2021 (UTC)[reply]
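If a per-language policy along these lines were adopted, the conversion itself would be mechanical. The sketch below is one possible illustration, not an existing transliteration module: the function name `unify_translit` and the set of language codes that keep the half rings are assumptions, chosen to mirror the suggestion of ʾ and ʿ for Ethiosemitic and ʔ and ʕ elsewhere.

```python
# Map the traditional half rings to their IPA-style counterparts:
# U+02BE (ʾ) -> U+0294 (ʔ), U+02BF (ʿ) -> U+0295 (ʕ).
HALF_RING_TO_IPA = str.maketrans({"\u02be": "\u0294", "\u02bf": "\u0295"})

# Assumption: languages that keep the half rings (here Ge'ez, Amharic, Tigrinya).
KEEP_HALF_RINGS = {"gez", "am", "ti"}


def unify_translit(text, lang):
    """Return text with half rings replaced, unless lang keeps them."""
    if lang in KEEP_HALF_RINGS:
        return text
    return text.translate(HALF_RING_TO_IPA)


print(unify_translit("\u02bfayn", "ar"))   # ʕayn
print(unify_translit("\u02bfayn", "gez"))  # ʿayn (unchanged)
```

Keeping the exceptions in one set makes the policy easy to revise once the dedicated discussions for Arabic and Aramaic reach a conclusion.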

Request for new language family and proto-language codes: North Halmahera / Proto-North Halmahera

User:Alexlin01 and I (or better, mostly Alexlin01 who has been active as IP in the past) have started to add lemmas from languages of the North Halmahera family, together with etymologies from reconstructed proto-forms. There is an existing corpus of 180 proto-forms available, and we might carefully add more reconstructions based on regular sound correspondences.

The North Halmahera languages are part of the proposed West Papuan macrofamily which has the code [paa-wpa] in WT. While West Papuan is still tentative and only based on resemblance sets, North Halmahera is universally accepted, since it is as self-evident as e.g. the Slavic languages. Therefore, we request a code for North Halmahera and Proto-North Halmahera. North Halmahera would be under [paa-wpa] (West Papuan), and include the following languages:

Galela [gbi]
Gamkonora [gak]
Ibu [ibu]
Kao [kax]
Laba [lau]
Loloda [loa]
Modole [mqo]
Pagu [pgu]
Sahu [saj]
Tabaru [tby]
Ternate [tft]
Tidore [tvo]
Tobelo [tlb]
Tugutil [tuj]
Waioli [wli]
West Makian [mqs]

Currently, they are under [paa-wpa] (West Papuan) or the generic [paa] (Papuan). ‑Austronesier (talk) 07:35, 6 October 2021 (UTC)[reply]

Hi! Also, from these, Ibu is already extinct. Alexlin01 (talk) 14:34, 6 October 2021 (UTC)[reply]

@Austronesier Created paa-nha and paa-nha-pro. DTLHS (talk) 03:10, 8 October 2021 (UTC)[reply]

@DTLHS Great, many thanks! –Austronesier (talk) 08:43, 8 October 2021 (UTC)[reply]

Inconsistent treatment of Arabic words in Persianate languages

(Notifying AryamanA, Atitarev, Benwing2, Smettems, Kutchkutch, Bhagadatta, Msasag, Svartava2, Getsnoopy): @Allahverdi Verdizade

There is an inconsistency in the treatment of Arabic words in Persianate languages.

In South Asian languages, the proximal donor is given as Persian.
In Turkic languages (especially Turkish and Azeri), the proximal donor is given as Arabic.

For example, Hindi किताब (kitāb) is given as coming from Classical Persian (kitāb), while Azerbaijani kitab or Uzbek kitob is given as ("ultimately") coming from Arabic كِتَاب (kitāb) with no mention of Persian.

Could this be resolved one way or another? I suppose it's a bit iffier for Anatolian Turkish given that the Ottomans had direct contact with Arabic-speaking subject populations, but for Azeri Turkish or the Central Asian languages it should be the same situation as with South Asian languages, i.e. these words entered the language through the means of a Persianate literati class who used both Persian and Arabic, but whose primary language of writing was the former.

My understanding is that there is evidence of Persian mediation for both South Asian and Turkic languages, e.g. Hindi फ़ुर्सत (fursat) meaning "spare time" or Turkish macera meaning "adventure".--Tibidibi (talk) 13:32, 6 October 2021 (UTC)[reply]

Also ping @Vox Sciurorum, @Fay Freak.--Tibidibi (talk) 13:50, 6 October 2021 (UTC)[reply]

I mark Ottoman Turkish and Turkish terms as derived from Arabic unless I have evidence that one was borrowed from Persian. If the word has been in Turkic languages from before the 13th century or so I may assume it was borrowed from Persian. Nineteenth century borrowings I assume were directly from Arabic, if not Ottoman coinages based on Arabic grammar. If there are any phonological or temporal guidelines to use, let me know. Vox Sciurorum (talk) 13:53, 6 October 2021 (UTC)[reply]

@Vox Sciurorum I think there is a stronger justification for having Ottoman terms be derived directly from Arabic because Persian was neither the language of the Ottoman administration nor that of any significant part of the population. For the Turkic languages east of the Ottoman-Safavid border, and for all South Asian languages, the influence of Persian as a prestige language was much more direct.--Tibidibi (talk) 14:10, 6 October 2021 (UTC)[reply]

What Squirrels Voice said.

Also, I see zero value in clogging up the etymology of Arabic derivatives with an extra piece of information, which is hardly provable anyway if it came in through Persian or directly via bookish contexts. Allahverdi Verdizade (talk) 14:15, 6 October 2021 (UTC)[reply]

The Seljuk dynasty that invaded Anatolia after their victory in the Battle of Manzikert was a Persianate society. While Ottoman Turkish was not Persian, the language was replete with loanwords from Persian covering cultural and administrative terminology, while Arabic was the donor for many religious terms. Some of the Persian loanwords the Seljuks brought with them to Anatolia came from Arabic. It is IMO truly impossible to decide whether the proximate source of Ottoman Turkish فلسفه was the (identically spelled) Persian term, or, directly, Arabic فلسفة. The choice not to mention Persian as a possible donor is then merely a choice for the sake of convenience, not a matter of principle. --Lambiam 17:25, 10 October 2021 (UTC)[reply]

The distribution of Persian has a cohesive epicentre while Arabic has been scattered all around the world. Have you heard of Uzbeki Arabic? Now Samarqand clearly was a hotspot of Arabic communication; from there Arabic-speaking tradesmen in low concentration reached Uyghuristan, in the vicinity of which Arabs learned words like خُتُو (ḵutū), on the entry of which I included a quote where Samarqand occurs as a casual station of Arabic rulers; I don’t think one has to imagine the mediation of communication by Persian, contact was generally Arabic language to Turkic language, this regard is most parsimonious. Fitting this picture, Persian words use to reach Mongolian but via Tibetan (!). For Anatolian Turkish it is only most prominent and most obvious, to a Westerner, that contact with Arabic was there, because Arabs were Ottoman subjects (but so they were Kipchak and Turkmen subjects before …). Fay Freak (talk) 15:38, 6 October 2021 (UTC)[reply]

@Fay Freak: Arab speakers in Khorasan are a small minority because the colonists there assimilated quickly. From The Cambridge History of Iran, Volume 4, page 602:

Alongside both the early dialects and dari, which had spread everywhere with a greater or lesser degree of local variation, Arabic had also taken root in Iran. It was of course the everyday language of the Arab immigrants: certain towns such as Dinavar, Zanjan, Nihavand, Kashan, Qum and Nishapur had a considerable Arab population and Arab tribes had also settled in Khurasan. However, these Arab elements were more or less rapidly assimilated: in the middle of the 2nd/8th century the majority of the Arabs in the army of Abu Muslim spoke dari.

In fact, the Islamic conquest led to the expansion of Persian and its replacement of local Eastern Iranian languages like Sogdian.

Since major urban centers such as Bukhara and Samarqand were clearly predominantly Persophone by the period when the region was becoming increasingly linguistically Turkic, I don't see any justification for claiming that most Arabic loans in e.g. Uzbek are directly from the small community of native Arabic speakers instead of reflecting Arabic's position as a prestige language upheld by a primarily Persophone literati elite.

Chagatai, the direct literary ancestor of Uzbek, was marked by extensive Persian influence (to the point that some texts have virtually no Turkic content words) and became a literary language explicitly on the model of Persian in Timurid and Shaybanid courts, both of which retained Persian as the chief bureaucratic language. I understand that Chagatai has little additional Arabic influence beyond what is already systemically found in Persian. Tibidibi (talk) 16:16, 6 October 2021 (UTC)[reply]

The point was that there had been a constant latent presence of Arabic, not only as traces in Persian. Be the communities more or less native or be they acquainted with it due to trade or war or education. Arabic was never eradicated and the influx was continuously renewed. While in India this latent presence lacked, Arabic was really remote and for the educated. Oddly of course Persian scholars wrote Arabic – for Samarqand I think of Najib ad-Din Samarqandi – while Indians wrote Persian, does this tell us something for the question of the thread? So in the former borrowings could be more from Arabic due to some familiarity. Fay Freak (talk) 16:30, 6 October 2021 (UTC)[reply]

If you actually read about Central Asian Arabic, you'll see that they bear signs of having close ties to dialects in Arab countries, which allows us to reconstruct migration events. This is clearly inconsistent with a "constant latent presence" of actual speakers (as opposed to scholars and clerics, who could only influence the language on a literary or religious level). For Indian and Central Asian Turkic languages, there is no reason not to assume a Persian intermediary unless specific evidence is brought to bear for a given word; for Turkish and Azerbaijani, I don't think it's generally knowable. —Μετάknowledge^{discuss/deeds} 03:30, 8 October 2021 (UTC)[reply]

@Metaknowledge Why do you think it's unknowable for Azerbaijani? I'm not really sure what the major difference would be between Azerbaijani and Chagatai vis-a-vis their relationship to Arabic/Arabs and Persian/Persians. Tibidibi (talk) 14:09, 10 October 2021 (UTC)[reply]

Because the West Oghuz tribes have actually been geographically adjacent to Arabs since around 1000 AD. Allahverdi Verdizade (talk) 10:53, 13 October 2021 (UTC)[reply]

Romanization pages for Mandarin and Cantonese - possible update task for a bot?

Currently, the various romanization pages for Mandarin Pinyin and Cantonese Jyutping are in a poor state. I presume that, due to the quantity and ancillary nature of such entries, many lack updated content with common characters, and the presentation of the relevant characters is inconsistent. Some examples:

For 烹, the pinyin entry pēng shows characters such as 硷 and 軽, which are simplified or variant forms but the linked traditional forms do not show this pronunciation. In the case of 軽, this character is more commonly recognised as Japanese Shinjitai since the regularly observed Chinese forms are 輕 and 轻.
paang1 does not show 烹 at all
xiǎn shows in list items 5 崄 and 6 嶮 which are the simplified and traditional version of the same character, while lower down item 23 lists 猃, 獫 together.
Also in xiǎn, item 17 濁 is shown but the simplified form is not included.

This seems to be a good target for a bot to update the entries if it is able to take all the existing pinyin and Jyutping pronunciations for all characters and to update the entries systematically, while also standardising the presentation of simplified and variant character forms. A good example to reference is shí which has a good number of entries (however I'm not sure if it includes all) and most entries list the traditional and simplified forms together. This entry does however list item 2 as "実, 实, 實, 寔", which is a bizarre ¿alphabetical? order of Shinjitai, simplified, traditional and variant characters. As for item ordering, it might seem like it is ordered by radical and stroke - this might be something that needs consideration for standardisation of the romanisation entries.

Would anybody be able to take on this task?

I can try to build such a bot, but I have not built bots before, and I believe it requires scraping the pronunciations off all the existing entries, which will be an arduous task in itself, even if done with automation.

Zywxn (talk) 17:14, 6 October 2021 (UTC)[reply]
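Once the pronunciations have been collected, the core of such a bot would be inverting a character-to-readings table into a reading-to-characters index. The sketch below assumes the scraping has already been done and uses a few hand-typed sample entries; the function name `build_reading_index` and both input dictionaries are hypothetical, and sorting by code point is only a placeholder for whatever ordering (for example radical and stroke) is eventually agreed on.

```python
from collections import defaultdict


def build_reading_index(char_readings, simplified_of):
    """Invert {character: [readings]} into {reading: [character groups]}.

    Each group lists the traditional form first, followed by its simplified
    form when one is known and differs from the traditional form.
    """
    index = defaultdict(list)
    for char, readings in char_readings.items():
        simp = simplified_of.get(char)
        group = char if simp in (None, char) else f"{char}, {simp}"
        for reading in readings:
            index[reading].append(group)
    # Placeholder ordering: plain code-point sort within each reading.
    return {reading: sorted(groups) for reading, groups in index.items()}


# Hand-typed sample data for illustration only.
sample_readings = {"烹": ["pēng"], "實": ["shí"], "石": ["shí", "dàn"]}
sample_simplified = {"實": "实"}

for reading, groups in build_reading_index(sample_readings, sample_simplified).items():
    print(reading, groups)
```

Generating every romanization page from one such index would make the traditional and simplified pairing uniform across pages, which is the inconsistency noted in the examples above.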

User TheNicodene - revert war to hide unresolved abuse

The user is trying to obstruct my efforts to bring attention to and address the abuse they've perpetrated against me, by deleting and archiving the discussion at Talk:formaticus. They're trying to hide the abuse and break the existing links in other discussions. The issue is not resolved and cannot be archived until it is. I request this user be blocked if they continue the edit war. Brutal Russian (talk) 05:31, 7 October 2021 (UTC)[reply]

I did not 'hide' the discussion; that is a flat-out lie which can be disproved by clicking the link. I placed the discussion in an archive and added a link at the top of the talk page; doing so with discussions over 75000 bytes, in order to free up space for new discussions, is standard Wiki practice. The discussion has not even been replied to for four months now. Nor did archiving it 'break links', which is another flat-out lie. Talk:formaticus functions exactly as it always did.

See here for a write up of only some of the insults this user has thrown at me over several months, for which he has even been temporarily blocked. I have no idea why he is suddenly acting up again after a merciful three-month hiatus. The Nicodene (talk) 05:53, 7 October 2021 (UTC)[reply]

Macedonian: standard, non-standard, misspelling

@Chuck Entz, Erutuon, Metaknowledge Since I am now creating entries for non-lemma forms of verbs, I would like to discuss some issues relating to the treatment of non-standard and misspelled words. We scratched the surface with User:Erutuon in August, but there are quite a lot of problems to be addressed:

Currently, my entries are formatted as follows:

коригира - standard word, lemma

Assigned to: verbs, lemmas (I am omitting less relevant categories)

корегира - misspelled word, lemma: "misspelling" in the headword line, {{misspelling of}} in the definition

Assigned to: misspellings, non-lemmas

коригиран - standard word, non-lemma: "participle" in the headword line, {{infl of}} in the definition

Assigned to: participles, non-lemmas

корегиран - misspelled word, non-lemma: "misspelling" in the headword line, {{infl of}} in the definition

Assigned to: participles, misspellings, non-lemmas

очерупа - nonstandard word, lemma: "verb" in the headword line, {{lb|mk|nonstandard}} in the definition

Assigned to: verbs, non-standard terms, lemmas

очерупан - nonstandard word, non-lemma: "participle" in the headword line, {{infl of}} in the definition

Assigned to: participles, non-lemmas

The problems are as follows:

It is also possible to treat корегиран as a misspelling of коригиран, i.e. to link two non-lemma forms to each other, rather than defining each as an inflected form of a lemma. I have always tended to opt for the second solution, including with categories other than participles.
Putting "misspelling" in the headword line of a misspelled verb lemma prevents it from being assigned to "verbs", but putting "misspelling" in the headword line of a misspelled participle (non-lemma form) of a verb does not prevent it from being assigned to "participles", because the parameter "part" inside {{infl of}} seems to populate that category.
"misspelling" does not distinguish between misspelled lemmas and misspelled non-lemmas.
Non-lemma forms of non-standard words are not labelled in any way to indicate that they are non-standard, because if I write {{lb|mk|nonstandard}}, they will get categorized as non-standard terms, which is wrong (they are not terms but non-lemmas), whereas if I write {{lb|mk|nonstandard forms}}, that will technically be correct, except that this label is used elsewhere for non-standard forms of standard words (comparable to English "goed", a non-standard preterite of the standard "go").

Further complications:

Participles have their own inflection, e.g. "коригираниот", which is the definite form. I do not want this to link back to the verb коригира; it is more appropriate for it to be defined as {{infl of|mk|коригиран||def|m|s}}. It will then be assigned to participle forms, with the help of the headword line {{head|mk|participle forms}}. However, if the inflected participle is misspelled as "корегираниот", it would be defined as {{infl of|mk|корегиран||def|m|s}} and the headword line would be {{head|mk|misspelling}}. Consequently, there would be nothing to assign "корегираниот" to participle forms. This would be a second inconsistency, in addition to the aforementioned one ("misspelling" suppresses the category "verbs" but not the category "participles") Martin123xyz (talk) 11:48, 7 October 2021 (UTC)[reply]

Ideal solution:

Redefine the category system to have the following:

lemmas
non-lemma forms
misspelled lemmas
misspelled non-lemma forms
non-lemma forms of misspelled lemmas
non-standard lemmas
non-standard non-lemma forms
non-lemma forms of non-standard lemmas

Each of these would contain subcategories for "noun", "verb", "adjective" instead of "lemma", e.g. "misspelled nouns", "misspelled noun forms", "forms of misspelled nouns", etc. There would be separate headers for each, e.g. {{head|mk|noun}}, {{head|mk|misspelled noun}}, {{head|mk|form of misspelled noun}} (with abbreviations for easier typing).
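The naming pattern in the examples above can be made explicit with a short sketch. It is an illustration of the proposal only, not an existing template or module: the function name `category_name` and its parameters are hypothetical, and plurals are formed naively by appending "s".

```python
def category_name(pos, is_form=False, status=None, status_on_lemma=False):
    """Build a category name following the proposed scheme.

    pos             -- part of speech in the singular, e.g. "noun" or "verb"
    is_form         -- True for a non-lemma form
    status          -- None, "misspelled" or "non-standard"
    status_on_lemma -- True when the form itself is regular but its lemma
                       carries the status (e.g. a form of a misspelled noun)
    """
    if status is None:
        return f"{pos} forms" if is_form else f"{pos}s"
    if not is_form:
        return f"{status} {pos}s"
    if status_on_lemma:
        return f"forms of {status} {pos}s"
    return f"{status} {pos} forms"


print(category_name("noun", status="misspelled"))                  # misspelled nouns
print(category_name("noun", is_form=True, status="misspelled"))    # misspelled noun forms
print(category_name("noun", is_form=True, status="misspelled",
                    status_on_lemma=True))                         # forms of misspelled nouns
```

Writing the scheme as a function makes it clear that the distinctions are independent axes (part of speech, lemma or form, status, and where the status attaches), which is why the full list of categories grows so quickly.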

For dealing with non-lemma forms of non-lemma forms, like the declined forms of Macedonian participles, we would need the following:

participles < verb forms
misspelled participles < misspelled verb forms
participles of misspelled verbs < non-lemma forms of misspelled verbs
non-standard participles < nonstandard verb forms
participles of non-standard verbs < non-lemma forms of non-standard verbs
participle forms
forms of misspelled participles
forms of participles of misspelled verbs
forms of non-standard participles
forms of participles of non-standard verbs

This is in my opinion the maximal categorization that we arrive at when we take into account all the relevant factors that my creating Macedonian entries has brought to the fore so far. Any other system, including the current one, seems to me to be bound to blur at least one of the empirically established distinctions highlighted above.

I am assuming that no one will be happy to implement such a categorization system, but the overview I have provided above should still be helpful for keeping track of what exactly the current system obscures and coming up with improvements addressing individual problems only. Needless to say, the distinctions that I have presented will also apply to many other languages.

Pending improvements, I would like to ask if the way I format the six types of entries listed at the start of this post is appropriate for the time being, or is there something I could do better, or even should, according to Wiktionary policies. Martin123xyz (talk) 11:48, 7 October 2021 (UTC)[reply]

In my opinion, a misspelt noun or verb is still a noun or verb, and should be categorised as such. Converting the header line of a lemma to the header line of a misspelling is Visigothism, even if committed by @Equinox, and in English loses the mentions of inflections that one could otherwise find by searching. {{misspelling of}} provides the appropriate information and categorisation. --RichardW57 (talk) 02:52, 8 October 2021 (UTC)[reply]

When adding "misspelling" to the header line in addition to using {{misspelling of}}, I was complying with the instructions provided at Wiktionary:Misspellings. However, your suggestion resolves the two inconsistencies I referred to above. Martin123xyz (talk) 07:03, 8 October 2021 (UTC)[reply]

My thought on reading that is 'Quo Warranto?'. I don't know whether to amend Wiktionary:Misspellings, tag it as unadopted or simply request its deletion. Can anyone justify not treating misspelt English verbs as verbs? One problem is that a manual maintenance action needed for verbs will not happen simply because misspelt verbs are not listed as verbs. --RichardW57 (talk) 08:03, 8 October 2021 (UTC)[reply]

Requesting its deletion without providing new instructions would not be helpful. As long as there are some instructions, at least a certain degree of consistency between different users' contributions is ensured. And if you leave it as it is, more users will find it, assume that it is an official policy which enjoys the consensus of the community, and continue to adhere to it. Either way, the instructions for contributors regarding things like "misspellings" need to be significantly expanded - currently they are simplistic, in addition to being biased in favour of English entries. I am considering writing a user guide for Macedonian contributions, except that so many things are unregulated or poorly regulated on the English Wiktionary as a whole that I would need to make my own arbitrary decisions or keep asking here about every point. Martin123xyz (talk) 10:04, 8 October 2021 (UTC)[reply]

'Term' covers both lemma and non-lemma. --RichardW57 (talk) 02:52, 8 October 2021 (UTC)[reply]

Full information about a non-lemma should be given under the lemma; one would not wish to repeat the multiple meanings of a lemma for its inflected forms. Accordingly, it should suffice to record that something is the inflected form of a non-standard term by recording the non-standardhood at the parent term itself. --RichardW57 (talk) 02:52, 8 October 2021 (UTC)[reply]

Thank you for the input. Martin123xyz (talk) 07:03, 8 October 2021 (UTC)[reply]

I have noticed a further problem: not only is "nonstandard form" ambiguous between "inflected form of a nonstandard lemma" and "non-standard form of a standard lemma", it can also be understood as "nonstandard equivalent/variant of a standard lemma" (on the analogy of "alternative form of"). I had used it in this sense at допринесува recently. Regrettably, {{nonstandard form of}} does not address this three-way ambiguity. Martin123xyz (talk) 14:00, 8 October 2021 (UTC)[reply]

I just created a page for витруелен (vitruelen), using {{head|mk|misspelling}} and {{misspelling of|mk|виртуелен}}, and the entry appears in Category:Macedonian non-lemma forms and Category:Macedonian misspellings, which is wrong, because the word is a misspelled lemma, not a non-lemma form. Maybe we need to use {{head|mk|misspelled lemma}} instead, and put those entries in Category:Macedonian misspelled lemmas? Gorec (talk) 14:47, 8 October 2021 (UTC)[reply]

The argument for using misspelling as a part of speech actually argues for splitting the lemma categories into misspelt and 'correctly' spelt lemmas. I'd rather add a parameter to {{mk-noun}} and {{en-verb}} etc. I'm waiting for an old hand to weigh in. --RichardW57 (talk) 16:48, 8 October 2021 (UTC)[reply]

Arbitration

As I've suggested before, we should establish an arbitration committee (much like the one Wikipedia has) to settle entrenched disputes among users. The finer details can be discussed later, but in general, is there any considerable support for this proposal? Imetsia (talk) 19:14, 8 October 2021 (UTC)[reply]

There is from my part! Of course we hope to not have any disputes at all, but as the previous year has shown, they are inevitable in a project of our size. Thadh (talk) 19:36, 8 October 2021 (UTC)[reply]

Just as seatbelts and airbags have led to more automobile accidents, creating an arbitration committee is guaranteed to lead to more intransigence. Participants in such disputes are all fairly confident that they are in the right and that their PoV will be the prevailing one, with only minor concessions to the other side. Also, there will be less avoidance of potentially controversial edits and other changes because one's point of view will be perceived as more likely to prevail. DCDuring (talk) 20:23, 8 October 2021 (UTC)[reply]

I think I should clarify: I don't know how WP's arbitration works, but my idea was similar to what Vox Sciurorum proposes below. I think we ought to have some system where unaffiliated admins can resolve ongoing disputes. Thadh (talk) 10:36, 11 October 2021 (UTC)[reply]

ArbCom over at Wikipedia has not been a roaring success. It is very important that we recognise that the way their judicial system works is not ideal, it is simply how things happened to play out. Their ArbCom has three distinct purposes: policy, block appeals, and conflict resolution. There is no reason that one body should decide on all three, nor is this necessarily a good thing. As it stands, Wiktionary is much more democratic than Wikipedia, and we handle more policy through votes. I think this should remain the case. So the question is then whether block disputes (not just appeals, which are usually spurious, but where admins are actually in disagreement) and conflict resolution could be handled better than they are now, and at what cost. I think we could do better, so this idea has some merit — but we would also create a venue for the bickering that already distracts from the actual work of editing, and this has been a major effect of Wikipedia's ArbCom. —Μετάknowledge^{discuss/deeds} 20:39, 8 October 2021 (UTC)[reply]

I volunteer as head arbitrator Roger the Rodger (talk) 23:14, 8 October 2021 (UTC)[reply]
- If I ever have a head that needs to be arbitrated, I'll know who to call... Chuck Entz (talk) 00:02, 9 October 2021 (UTC)[reply]

Support because this would prevent long endless disputes like the recent one ({{inh+}} & {{bor+}}). Svartava2 (talk) 06:08, 9 October 2021 (UTC)[reply]

I agree with Μετα that WP:ArbCom is not as functional as one might wish, and with DCD that the laudable intention of avoiding arbitrariness in arbitration has led to rule codification paving the road to ~~hell~~ endless wikibickering. We should be careful what we wish for. A dispute over a deep disagreement can be held in an amicable way; what made recent disputes unpleasant were the sometimes implied, often straightforward accusations of bad faith cast at the other side. Perhaps an etiquette committee might do some good. --Lambiam 16:57, 10 October 2021 (UTC)[reply]

I don't like the idea. I know I'm a bit of a handful but it's not "I don't want to be officially reprimanded" (I don't care if I'm officially reprimanded, that's fine), it's more, as Meta suggests above, I think that creating a special little judicial system-in-system does more to foster bullshit than it does to fix actual project issues. Equinox ◑ 17:22, 10 October 2021 (UTC)[reply]

It would be useful to have a way to resolve disputes where neither of two contradictory and strongly-held positions has supermajority support. I doubt a formal arbitration committee is the way. Maybe we can find a less formal way to have senior administrators cut the knot in cases like derivation wording without having every vote appealed to them. Vox Sciurorum (talk) 18:44, 10 October 2021 (UTC)[reply]

Say the proposal is instead to create "Wiktionary:Requests for Arbitration," where users can make their case, and well-established editors can vote in support of one disputant or another. I'd imagine this would be very similar to how we run RFD - no committees, formal procedures, rules of evidence, etc. And by the end of one month, we count the number of votes and act according to what the majority decides. Is this a "less formal way" that you'd support? (Really, this question goes to all users in this discussion who don't like the idea of forming an ArbCom). Imetsia (talk) 23:23, 13 October 2021 (UTC)[reply]

@Vox Sciurorum, Metaknowledge? — This unsigned comment was added by Imetsia (talk • contribs).

The problem is that this doesn't differ much from a simple vote... I really do think we ought to restrict the solving of such disputes to the (uninvolved) administrators. Thadh (talk) 21:43, 15 October 2021 (UTC)[reply]

This solution introduces so many new problems that it more than counterbalances the ones it solves. I think that instead of throwing half-baked ideas at the wall and seeing what sticks, it's worth asking what you really want and how to achieve that. If what you want is to know whether you're allowed to use {{bor+}}, then I would say that you're going about it the wrong way — a Supreme Court shouldn't be making policy. —Μετάknowledge^{discuss/deeds} 22:14, 15 October 2021 (UTC)[reply]

The + templates situation would have been something an arbitration committee could have helped solve. However, it is a moot case at this point, and I wouldn't use a proposed ArbCom to continue to litigate it. For a more current issue, I'd point to the Brutal Russian versus TheNicodene complaints, even though I have no personal stake in that issue and am very unfamiliar with the fact pattern. Again, a board of well-established users voting in his favor/opposition is one possible avenue to put this issue to rest once and for all. Indeed, I think it is the best way to resolve the two above issues declaratively. Such conflict-resolution is squarely in the province of a judicial branch, whose sole purpose it is to interpret policy and settle disputes among litigants. But ultimately, I also understand the objections (though I still think the benefits outweigh the detriments), and I won't continue to pursue the creation of an arbitration committee in spite of myself. Imetsia (talk) 23:44, 15 October 2021 (UTC)[reply]

I share concerns that establishing a bureaucratic structure here with formal committees probably wouldn't help in the way proponents are hoping. I worry about the risk of "borrowing trouble", as a wiser fellow expressed to me a while back. ‑‑ Eiríkr Útlendi │^{Tala við mig} 21:39, 13 October 2021 (UTC)[reply]

With the number of people actively in this community, an arbitration committee would feel like a sitcom or Alice in Wonderland trial, where there's an argument and someone puts on a wig and a fine bit of farce is had that satisfies nothing. The English Wikipedia ArbCom works in part because the Committee is not tangled up in all the issues that reach them; I can't see that happening here. Referring our issues to the English Wikipedia ArbCom might work.--Prosfilaes (talk) 23:40, 13 October 2021 (UTC)[reply]

I am deeply reticent to refer any EN Wiktionary concerns to the EN Wikipedia ArbCom. Our organizational cultures and norms are very different. We've had various issues arise because Wikipedia editors engage here, based on Wikipedia norms, requiring much cleanup and coordination. I can't imagine that issues referred to the WP ArbCom would be handled with any ease. ‑‑ Eiríkr Útlendi │^{Tala við mig} 02:48, 14 October 2021 (UTC)[reply]

Like Prosfilaes, I don't think we have a big enough active editor base to have an Arbcom. I like the suggestion that if there's an intractable issue where neither position can get supermajority support, or it's unclear what the status quo is (since votes are structured as changes to the status quo) but we have to do something, we should have a majority vote. It isn't without issues, but...it's an idea. I don't know if Wikipedia's Arbcom would be keen to accept cases from us, since they have a workload as it is, and they (or we) also might often feel they lacked the relevant expertise to judge things like disputes over what template wordings are best for a dictionary. For intractable disputes over blocks, we could ask global sysops to weigh in. - -sche (discuss) 01:33, 14 October 2021 (UTC)[reply]

Global sysops are just as bad as outsourcing to Wikipedia. In my experience, they generally neither know nor care about Wiktionary, and would probably be annoyed at the very suggestion of foisting another local task on them. —Μετάknowledge^{discuss/deeds} 18:00, 14 October 2021 (UTC)[reply]

Wiktionary:Etymology

This page survived RFD, but many users pointed out the need for a cleanup. Modernization/expansion from experienced editors is welcome. (Discussion here, to be archived at Wiktionary talk:Etymology.) Ultimateria (talk) 00:02, 10 October 2021 (UTC)[reply]

Wording of RFD banner

I propose that we change the banner message generated by {{rfd}} as follows:

Current text:

This entry has been nominated for deletion

Please see that page for discussion and justifications. Feel free to edit this entry as normal, though do not remove the {{rfd}} until the debate has finished.

Proposed new text:

This entry has been nominated for deletion

Please see that page for discussion and justifications. While voting is in progress, please do not edit this entry in a way that may alter or make unclear the apparent intention of votes already cast. Do not remove the {{rfd}} template until the debate has finished.

What do you think? Mihia (talk) 21:02, 10 October 2021 (UTC)[reply]

I noticed that someone put a noun sense under the verb sense of push and shove, which seemed like a good idea but made the voting less clear. None Shall Revert (talk) 06:56, 11 October 2021 (UTC)[reply]

Also wiki things are not supposed to be "votes" None Shall Revert (talk) 06:58, 11 October 2021 (UTC)[reply]

It does happen from time to time. I have observed several cases where fundamental changes have been made to the whole basis of an entry while voting is in progress, and moreover people sometimes do not even bother to mention that they have done this at the RFD discussion. So an entry is listed at RFD, people vote "Delete" let's say, and then the entry is completely changed or rewritten, or redirected maybe, with no notice, leaving the status of the pre-existing votes totally unclear. I definitely do not agree that we should simply say "Feel free to edit this entry as normal" on the RFD banner -- it's just a question of exactly what we do say. Rather than my suggestion above, we could say "please mention any substantial changes at the RFD discussion", but this still leaves the problem of what should be done with pre-existing votes that may no longer be applicable. Mihia (talk) 08:12, 11 October 2021 (UTC)[reply]

Alternative suggestion (a bit more permissive):

This entry has been nominated for deletion

Please see that page for discussion and justifications. You may continue to edit this entry while the discussion proceeds, but please mention significant edits at the RFD discussion and ensure that the intention of votes already cast is not made unclear. Do not remove the {{rfd}} template until the debate has finished.

Mihia (talk) 08:22, 13 October 2021 (UTC)[reply]

I like the last one. Ultimateria (talk) 17:16, 13 October 2021 (UTC)[reply]

I like this wording better than the first proposal. - -sche (discuss) 01:35, 14 October 2021 (UTC)[reply]

Likewise, I support this last wording. Imetsia (talk) 17:06, 14 October 2021 (UTC)[reply]

OK, I have implemented the second suggestion. Mihia (talk) 17:07, 14 October 2021 (UTC)[reply]

Proposal for new parameter in linking templates: "alternative script"

I suggest a new parameter for linking templates which will input alternative (non-lemma) script forms within parentheses. This is already partly done for Korean and Vietnamese:

{{ko-l|한국|韓國}} > 한국 (韓國, han'guk)
{{vi-l|Việt Nam|越南}} > Việt Nam (越南)

But these language-specific templates are not ideal because they lack most key functions (e.g. part of speech, literal meaning, suppression of transliteration) and cannot be integrated with other templates such as {{alter}}, {{syn}}, {{bor}}, etc.

An "alternative script" parameter would be useful for various languages:

In the case of Korean, especially formal or academic language, there is a very large number of Chinese-derived homophones. An example is 연기 (yeon'gi), whose entry currently features nine not uncommon and completely unrelated words:

연기 (演技, yeon'gi, “acting”), 연기 (煙氣, yeon'gi, “smoke”), 연기 (延期, yeon'gi, “postponement”), 연기 (緣起, yeon'gi, “dependent origination”), 연기 (年記, yeon'gi, “date of composition recorded on an artwork”), 연기 (年期, yeon'gi, “certain number of years”), etc.

A fully integrated "alternative script" parameter would allow far easier disambiguation of these. To a lesser extent, this is also true of Vietnamese.

Many languages are written in multiple scripts. On Wiktionary, one script is usually chosen as the lemma script, with the result that forms in the other script are neglected. For instance, the majority of Azerbaijani speakers live in Iran and primarily use the Arabic script, which has also been the script for most of Azerbaijani history. But this fact is neglected because all Azerbaijani lemmas are in the Republic's Turkish-based Latin script. The integration of an "alternative script" parameter would allow for a more equitable coverage of such languages in etymology or descendant sections, in translation charts, etc. Example:

current {{m|az|Azərbaycan}} Azərbaycan > new {{m|az|Azərbaycan|altscr=آذربایجان}} Azərbaycan (آذربایجان)

current {{m|ks|کٲشُر}} کٲشُر (kạ̄śur) > new {{m|ks|کٲشُر|altscr=कॉशुर}} کٲشُر (कॉशुर, kạ̄śur)

Thoughts?--Tibidibi (talk) 07:11, 11 October 2021 (UTC)[reply]
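To make the proposed display concrete, here is a minimal Python sketch of the rendering rule implied by the examples above: the alternative script comes first inside the parentheses, then the transliteration, then the gloss. This is only an illustration; the function name `format_link` and its keyword arguments are hypothetical and are not the actual linking-template or module code, though `altscr` is named after the parameter proposed above.

```python
def format_link(term, altscr=None, tr=None, gloss=None):
    """Render a link in the style of the examples in this proposal.

    Hypothetical helper: annotations are collected in the order
    alternative script, transliteration, gloss, and wrapped in a
    single pair of parentheses after the term.
    """
    parts = []
    if altscr:
        parts.append(altscr)
    if tr:
        parts.append(tr)
    if gloss:
        # Glosses are shown in curly double quotes, as in the examples above.
        parts.append("“" + gloss + "”")
    if not parts:
        # Nothing to annotate: show the bare term.
        return term
    return term + " (" + ", ".join(parts) + ")"


# The Korean homophones become distinguishable at a glance.
print(format_link("연기", altscr="演技", tr="yeon'gi", gloss="acting"))
print(format_link("연기", altscr="煙氣", tr="yeon'gi", gloss="smoke"))
```

Keeping the alternative script as just one more optional annotation means that existing calls without it would render exactly as they do now, which is presumably what would make the parameter safe to add to the general linking templates.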

I've found a similar need in Pali, where there are multiple scripts in use, and I anticipate a similar need for Sanskrit. The solution for Pali is documented by a full set of examples for {{pi-link}}, which generalises {{link}}. One complication there is that some Pali writing systems are ambiguous and that the Roman script is one of the major writing systems, so we end up with transliterations and Roman script equivalent sometimes having to be different. Generally we want to link to the Roman script equivalent, but sometimes it is not easily available, e.g. in inflection tables, which commonly link to the entries in the tables. Sanskrit has a similar but different complication. The Bengali script writing system is ambiguous, and Devanagari is the 'lemma' script. (Don't like the term, as we treat the equivalents in the other scripts as alternative forms, thus also lemmas.) For Pali I've built specialised forms of some linking templates on the standard templates, such as {{pi-alternative form of}} on {{alternative form of}}. I've independently encoded {{pi-nr-inflection of}}, which I ought to convert to build on the standard template using common generalisation logic. --RichardW57 (talk) 12:11, 11 October 2021 (UTC)[reply]

Note that my scheme treats the form in the alternative script as the primary input. --RichardW57 (talk) 12:11, 11 October 2021 (UTC)[reply]

Korean is an unusual case, where the hidden parameter to the conversion is meaning rather than pronunciation. --RichardW57 (talk) 12:11, 11 October 2021 (UTC)[reply]

@Tibidibi: It's a yes for me. Maybe with the possibility of adding a description before the alternative script, like they do in Serbo-Croatian entries (for example: dom#Noun_28). Sartma (talk) 08:27, 12 October 2021 (UTC)[reply]

Splitting Hebrew roots?

There are a bunch of homonymous Hebrew roots that mean completely different things but just so happen to look the same and there doesn't seem to be a way to distinguish between them. חילוני, התחיל וחלל don't really share a root, right?--The cool numel (talk) 08:47, 12 October 2021 (UTC)[reply]

I don’t see how the root of חילוני (khiloni, “secular”) can be ח־ל־ל, while that of חילון (khilún, “secularization”) is ח־ל־ן. I guess this is a typo. If we had pages for these roots, we could document several unrelated meanings like we do for other homonymous terms, such as fluke. --Lambiam 04:30, 13 October 2021 (UTC)[reply]

@Lambiam: I'm pretty sure the root of חילון is ח־ל־ן, as it's derived from חילוני which is in turn just the root ח־ל־ל with the pattern קִטְלוֹנִי (like צבעוני). The thing I'm talking about is splitting categories like Category:Hebrew terms belonging to the root ח־ל־ן by meaning. --The cool numel (talk) 09:57, 13 October 2021 (UTC)[reply]

So I take it then the root is the inflectional root, not the etymological root. Doesn’t that make splitting categories by meaning much less interesting? IMO such splitting would best be done by creating subcategories of homonymous roots according to their different core senses, but deciding what these core senses are and recategorizing terms with homonymous roots accordingly will mean a lot of work for a very small bunch of active Hebrew editors. --Lambiam 11:44, 13 October 2021 (UTC)[reply]

Adding DRAE links to all Spanish lemmas

There are currently ~18,500 lemmas with links to DRAE. There are an additional ~27,000 Spanish lemmas that do not currently have a DRAE link but do have a corresponding DRAE entry.

I can run a bot to add a "Further reading" section with a link to {{R:DRAE}} to the entries missing DRAE links. Would this be desirable or just annoying clutter? JeffDoozan (talk) 17:02, 13 October 2021 (UTC)[reply]

If you can match the entries accurately I don't see why it would be a problem. I routinely add them manually. – Jberkel 17:10, 13 October 2021 (UTC)[reply]

Huh, I expected more pages to have an entry. I think it's helpful! As I expand Spanish entries I could use it to filter out a set of "core" Spanish words to work on. Ultimateria (talk) 17:14, 13 October 2021 (UTC)[reply]

Only if the bot checks that the target of the link is a real definition. Today I saw several French entries where people added {{R:TLFi}} but the web site has no definition. Vox Sciurorum (talk) 18:26, 13 October 2021 (UTC)[reply]

Yes, it does. JeffDoozan (talk) 18:39, 13 October 2021 (UTC)[reply]
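For anyone curious what such an edit involves, here is a rough Python sketch of the per-page step, under stated assumptions. It is not JeffDoozan's bot: the function name `add_drae_link`, the level-3 heading, and the `has_drae_entry` flag (standing in for the check that DRAE really has a definition for the term) are all hypothetical.

```python
import re

# Matches a "Further reading" heading at any level, e.g. ===Further reading===
HEADING = re.compile(r"^(=+)\s*Further reading\s*\1[ \t]*$", re.MULTILINE)


def add_drae_link(spanish_section, has_drae_entry):
    """Return the Spanish section text with a {{R:DRAE}} link added.

    Hypothetical sketch: the section is left untouched if it already
    links to DRAE or if DRAE has no real entry for the term.
    """
    if "{{R:DRAE" in spanish_section or not has_drae_entry:
        return spanish_section

    match = HEADING.search(spanish_section)
    if match:
        # Reuse the existing section: add the link as its first item.
        end = match.end()
        return spanish_section[:end] + "\n* {{R:DRAE}}" + spanish_section[end:]

    # Otherwise append a new section (assumed here to be level 3).
    body = spanish_section.rstrip("\n")
    return body + "\n\n===Further reading===\n* {{R:DRAE}}\n"


page = "==Spanish==\n\n===Noun===\n{{es-noun|m}}\n\n# [[dog]]\n"
print(add_drae_link(page, has_drae_entry=True))
```

Running the function a second time on its own output changes nothing, so a repeated bot run would not add duplicate links.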

Did the bot run on all forms? I added one earlier manually: Special:Diff/62116035/64262193 – Jberkel 19:47, 17 October 2021 (UTC)[reply]

Also, could you adapt it to work with {{R:TLFi}}? – Jberkel 08:24, 18 October 2021 (UTC)[reply]

The bot did not run on all forms, only on pages with entries containing a lemma. The page you edited was skipped because it previously contained only a verb form. If anyone is interested, I could generate a list of pages where the DRAE has a lemma but we have only a form.

I'll see what I can do with {{R:TLFi}} but I can't promise anything. JeffDoozan (talk) 15:19, 18 October 2021 (UTC)[reply]

Yes, such a list would be useful, thanks! Especially the adjectives often exist only as participles, presumably autogenerated at some point. – Jberkel 20:53, 19 October 2021 (UTC)[reply]

Here's a list of the 2,935 pages where we have Spanish forms that have corresponding DRAE lemmata. JeffDoozan (talk) 18:39, 20 October 2021 (UTC)[reply]

@Jberkel: ~10,000 TLFi links are being added right now, it should be complete within a day. Here's a list of the 2,743 entries where we have forms but TLFi has a lemma. JeffDoozan (talk) 01:12, 29 October 2021 (UTC)[reply]

This is very much appreciated and I whole-heartedly endorse this. I work on Spanish here and I add this to all entries I make. —Justin (koavf)❤T☮C☺M☯ 06:23, 27 October 2021 (UTC)[reply]

The phrasebook is in dire need of rules.

(Not referring to the CFI, that's another topic.) Coming from languages that are both gendered and have polite forms, the translation boxes in most phrasebook entries are a mess. It's completely random whether:

...only the polite, only the familiar or both versions are present.
...these polite/familiar forms are qualified as such, whether this qualification comes before or after the entry and whether this qualification is called polite/familiar or formal/informal.
...plural phrases are present.
...all these forms are consistently present both in their male as well as their female forms (if applicable) and how those forms are annotated.
...what the order of all these forms is.

My suggestions:

Decide whether to call it polite/familiar or formal/informal and then apply this consistently. See the inconsistencies in are you allergic to any medications
Split the translation box into two distinct ones in most articles (where applicable), one for familiar, one for polite forms. Languages that don't have this feature could either be automatically completed using a bot that copies over entries between the boxes or alternatively they could be barred from one of the boxes (maybe by introducing a new {{trans-top}} that only accepts languages with politeness distinctions).
- If the above point doesn't happen, at least define a consistent scheme. Should the qualifier come before or after? Should entries without qualifiers in languages with politeness distinctions be allowed? What should come first?
Disallow plural translations.
Decide whether gender should be expressed using the gender parameter of {{t}} or using {{qualifier}}, then apply this consistently. See the inconsistencies between e.g. are you religious and are you single.

--Fytcha (talk) 02:36, 14 October 2021 (UTC)[reply]

I agree with all of this. But it's worth noting that in many languages, politeness and formality are not the same thing. In Korean, you can be politely informal and non-politely formal. Tibidibi (talk) 04:40, 14 October 2021 (UTC)[reply]

In that case, as I don't think it is within the scope of a phrasebook to give impolite phrases (except perhaps for phrases that are explicitly/obviously impolite), I would suggest that we stick with formal and informal and avoid any distinctions between politeness and impoliteness. Andrew Sheedy (talk) 05:53, 14 October 2021 (UTC)[reply]

I also agree with all the above, with the caveat that some languages, like Korean, have both a formal/informal and polite/familiar distinction. As you say, we can choose the most relevant one (I would probably keep polite/familiar for Korean too, since formal/informal is a distinction more pertinent to more restricted scenarios, but I guess Korean editors will make the call on that. Just a note: non-polite means "familiar" and doesn't mean impolite.). Sartma (talk) 09:12, 14 October 2021 (UTC)[reply]

From how I read it, you two are in disaccord on whether formal/informal or polite/familiar should be the primary divider of the translation boxes. What is your opinion on this @Tibidibi? If a normal phrase like how are you had only two boxes, would you want them to be formal/informal or polite/familiar? Note that we could always provide all four combinations of (formal+polite), (informal+polite), (formal+impolite), (informal+impolite) by the use of the appropriate {{qualifier}}s; the question is merely which distinction is more useful and semantically more sensible. Fytcha (talk) 16:44, 10 November 2021 (UTC)[reply]

@Chuck Entz Can I change Wiktionary:Phrasebook and the entries accordingly or is a formal vote necessary? It's a bit of a radical change so I don't want to do it unilaterally; OTOH people seem to not care much about the phrasebook and this proposal. --Fytcha (talk) 14:37, 2 November 2021 (UTC)[reply]

Adding my two cents here for later:

I think the best option would be to reduce and POTENTIALLY go down to no distinctions, with any and all listed as qualifiers. However, I could also see the use of having a section for formal/informal/polite. As to the plural/nonplural - it could be useful to know when you're talking to multiple people, but I think that is something that could be listed as qualifiers instead of having its own translation bar - we could potentially go on exponentially in that direction - a bar for polite single, a bar for polite informal, a bar for plural single, etc. It seems like it'd be easier to handle to have separate bars for formality and distinguish number there. Vininn126 (talk) 10:09, 9 January 2022 (UTC)[reply]

@Vininn126: I've followed this up with Wiktionary:Votes/2022-01/New phrasebook regulations. The three distinct advantages of splitting up translations into multiple boxes are 1. it is much easier to spot holes 2. it is possible to {{t-needed}} those holes 3. it makes the whole thing a lot more navigable and humanly readable and at the same time less bloated (no need for the endless repetition of {{q|formal}} etc.). — Fytcha〈 T | L | C 〉 12:10, 9 January 2022 (UTC)[reply]

Major opportunity for us to step in for word of the year

Heads up that OED are slipping. It's our time to strike. —Justin (koavf)❤T☮C☺M☯ 16:36, 14 October 2021 (UTC)[reply]

"...observing that 'worms are all over the place' and 'everybody loves a good worm.'" Well, I'm sold. Ultimateria (talk) 16:52, 14 October 2021 (UTC)[reply]

In a way it would be funnier with the computing sense of worm (something like a virus), since I can imagine somebody really out of touch thinking this was a "new" hi-tech word of the 21st century! Equinox ◑ 10:13, 15 October 2021 (UTC)[reply]

New SOP policy idea

I propose adding a new SOP test at WT:Idioms that survived RFD. It would have a caption like "Terms whose parts are substitutable, but with which only a few variations greatly predominate. For instance, the word "air" in air resistance can be switched out for "wind," "snow," "water," "fluid," and others; but "air resistance" is the only widely used and attested form." (A better writer could improve some of the wording). Accordingly, I would name the test WT:AIR RESISTANCE/WT:AIR, although there are probably other entries to which this logic has been applied in RFD discussions. (Talk:idle threat comes to mind). There are also the ongoing discussions about rumor has it and puré de batata.

As a community, this is a justification that has previously won the day, so it makes sense to codify it. In addition, all of our SOP policies are essentially advisory and open to great interpretation (there are no bright-line rules), and I don't think this test would depart from that tradition. Lastly, this policy would finally bring us one step closer to a more fleshed-out approach to handling set phrases and common collocations. Thoughts? Imetsia (talk) 17:36, 14 October 2021 (UTC)[reply]

Your idea sounds great, I like it. The reason why I'd advocate for the inclusion of articles such as air resistance isn't because they're so indecipherable (let's be honest, you really can guess what it means based on the parts) but because:

It is the canonical collocation to express this idea. There might be other SOPs that convey the same meaning but this one is the one that's actually used.
The article serves many other purposes other than just explaining the idea, such as providing translations, coordinate terms, hyponyms etc.

Your proposal shifts the focus of SOP discussions a bit away from the question "Can its meaning be guessed based on the parts?" to "Is it the principal (i.e. most widespread) collocation to express this concept?", which is a change I welcome with open arms. Fytcha (talk) 18:06, 14 October 2021 (UTC)[reply]

Support. This seems like a good idea. We need some way of including collocations and fixed expressions, anyway. Andrew Sheedy (talk) 19:36, 14 October 2021 (UTC)[reply]

I now agree that we should have a firm basis for including entries for strong set phrases -- combinations that are explicable as SoP, but in practice overwhelmingly predominate over other possible ways of saying the same thing by word substitution of synonyms (however we can best define this idea). While we are looking at this policy area, I also believe that we should have a firm basis for including SoP phrases that are particularly hard to understand from the parts if one does not already know which of many possible meanings to combine together -- another argument that is often made at RFD. Mihia (talk) 21:36, 14 October 2021 (UTC)[reply]

On the second suggestion, I think we'd have to firmly pin down whether there is enough of a multitude of "possible meanings to combine together" for a term to not be SOP. This seems quite hard to establish clearly through policy. Talk:amico per convenienza comes to mind. On the first try, it passed RFD because of just this justification, though the vote was later overturned. (To me, the argument that it was SOP was a slam dunk, and it shocked me that so many users initially disagreed). So I do not disagree with the idea in principle, but we would have to adjust the dials just right to ensure we are neither over- nor under-inclusive. Is there really an administrable standard we can come up with to achieve just this result? Imetsia (talk) 22:01, 14 October 2021 (UTC)[reply]

I think both ideas are equally hard to precisely codify because there will always be an element of subjectivity. I think we just have to accept this, and establish the broad policy and let borderline or argued cases go to RFD. I think that examples of phrases that have passed RFD on the stated grounds, as we have done with other cases, are very helpful. FTR, a recent one that was undeleted on the second ground is track meet. Mihia (talk) 10:08, 15 October 2021 (UTC)[reply]

I oppose including common collocations because they are common collocations. We can use {{ux}} to illustrate the more common uses. Vox Sciurorum (talk) 23:30, 14 October 2021 (UTC)[reply]

I oppose the subjectivity of the idea. Although “idiomaticity” is close friends with commonness.

The real question should be technical utility, with cross-language perspectives (which most who want to have a say on a term don’t have, naturally since our language knowledges are limited by our origins in particular language communities). And it wasn’t even about the utility of the term alone in the case of air resistance, but people apparently wanted it as a model for other types of resistance (so we do not have to create them but look in this entry how to construct them, very remarkable). But you are unable to form a reasonable rule or guideline from this example. A particulari ad universale non valet consequentia. (Case law is bad and a meaningless Anglo-fetish.) Fay Freak (talk) 00:37, 15 October 2021 (UTC)[reply]

The guideline I've formulated seems quite reasonable to me. Why do you disagree? It's readily administrable and provides a good general principle that can be applied not mechanically, but by using sound judgment and discretion. Just like every other example on WT:Idioms that survived RFD, this is not a hard and fast rule, and it includes an element of subjectivity. Editors constantly disagree about the application of SOP policies; some are more permissive on the issue of term inclusion, and others are more conservative. This is not an exception to that rule. It fits in perfectly with every other advisory rule we've ever put forward about idiomaticity and SOP-ness. Imetsia (talk) 15:55, 15 October 2021 (UTC)[reply]

I don't support individual entries for mere common collocations. I think we can find a conceptual division, albeit slightly grey and subjective, between common collocation and strong set phrase. Mihia (talk) 10:12, 15 October 2021 (UTC)[reply]

The idea may have merit if we can formulate a solid objective criterion, but I cannot resist pointing out that air resistance is a poor example. The term denotes a physical force, expressible in the unit newton. In general, designers try to minimize air resistance. The term wind resistance as commonly used (pace M–W) is an entirely different species, the ability to stand up to wind damage,^[1] a highly desirable property (except for the sets of disaster flicks such as Twister). --Lambiam 10:18, 15 October 2021 (UTC) Addition: In English the first component of such a compound can be the subject or the object of the action. In French you can see the distinction in the preposition used: résistance de l'air versus résistance au vent. --Lambiam 10:39, 15 October 2021 (UTC)[reply]

As solid and objective a criterion as possible, yes, but it will never be mechanically objective, such that anyone can apply a rule and will always come up with the same answer. If we had only mechanically objective CFI criteria then we would never need RFD discussions. Mihia (talk) 13:14, 15 October 2021 (UTC)[reply]

I agree with Mihia's comment right above. In addition, is air resistance really as poor an example as you argue? M-W, as you point out, has a definition much more in line with that of "air resistance." Even if you claim it's not the most used meaning, you must accept that it is a meaning. And what for the other substitutes like "snow," "water," and "fluid?" (I haven't checked these on my own, but maybe you can make a case for your position based on these). Imetsia (talk) 15:55, 15 October 2021 (UTC)[reply]

“Fluid resistance” is a more general term than “air resistance”. It is the resistance experienced by a body in motion, relative to a surrounding fluid. Usually the fluid is air, but when something else, the term “air resistance” is not appropriate. “Snow resistance”, “water resistance” and “wind resistance” generally refer to the ability to resist, or protect against, the intrusion or harmful effects of said phenomena or substances; having good wind resistance means the same as being windproof. --Lambiam 10:44, 16 October 2021 (UTC)[reply]

@Lambiam: OK, I agree now that snow and water resistance do not fall under the same family of meanings as "air resistance." But I don't agree when it comes to wind resistance. "Wind resistance" definitely does have a similar meaning which is used quite commonly ([2], [3], [4] just for starters). According to the wiki article you linked, there's also "wave resistance," under the same family of meaning. So what would you think about the proposed policy if we switched "wind, snow, water, and fluid" with simply "wind, fluid, and wave?" Imetsia (talk) 18:29, 16 October 2021 (UTC)[reply]

Can I just point out that I think that people here are talking about potentially two different things. The first is whether words with different meanings can be substituted to create a parallel phrase, for example "air resistance" changed to "wave resistance", and the second is whether synonyms can be substituted to produce an equally idiomatic way of saying the same thing, e.g. per Fytcha's comment "It is the canonical collocation to express this idea. There might be other SOPs that convey the same meaning but this one is the one that's actually used." (my emphasis). Mihia (talk) 19:34, 16 October 2021 (UTC)[reply]

The aim of my comment regarding the example “air resistance” was to point out that it is not a felicitous example to illustrate the proposed test, and equally infelicitous to serve as its name. A better example might be the collocation disaster preparedness; while its synonyms catastrophe preparedness and disaster readiness have been used, it is clearly^[5] the winner of the “canonical collocation” (con)test. --Lambiam 20:24, 16 October 2021 (UTC)[reply]

I wonder whether we can come up with something a bit punchier than "disaster preparedness". What about "human rights"? Mihia (talk) 21:30, 16 October 2021 (UTC)[reply]

Would that be an example that fits your second criterion (synonym substitutability) but not the first (parallel phrases)? For synonym substitutability, I'm guessing we have, e.g., "people rights," "mankind rights," and similar. But are there any parallel phrases? The only ones I can think of are either [ADJ]+rights or [possessive]+rights, which don't really fit. Imetsia (talk) 23:40, 16 October 2021 (UTC)[reply]

"human rights" is supposed to be an example of something that would pass the first test, the clear predominance of one way to say something over other candidates involving word substitutions, such as those you mention. A parallel phrase would be animal rights. Despite our defining this as, essentially, the rights of animals, it again passes the first test because of its overwhelming predominance over e.g. "creature entitlements" or whatever. "human rights" and "animal rights" are examples of what I would call strong set phrases explicable as SoP. Mihia (talk) 08:15, 17 October 2021 (UTC)[reply]

Could we then just have two tests, one for parallel phrases and the other for synonym substitutability? "Air resistance" seems like a good candidate for the parallel-phrases test, while "human rights" passes the synonym-substitutability test. (The parallel phrase you mention is one for which we have an entry, so I don't think it's a great example -- isn't the idea to list parallel phrases that wouldn't be entryworthy, thus showing that the original term is in fact a set phrase?) Honestly, "air resistance" has the synonym, as argued above, of "wind resistance." So it could also work for the second case. But if we really want a more shining example of synonym substitutability, I suppose we could just include both of the tests. What do you think? Imetsia (talk) 16:18, 17 October 2021 (UTC)[reply]

My desire would be a rule to explicitly allow strong set phrases / fixed expressions even if explicable as SoP (and also, on a separate point, a rule to explicitly allow SoP combinations that are particularly hard to understand from the parts). Of course, the problem is how best to define "set phrase" or "fixed expression" (or, at least, the sort that we would want to include). The idea that "It is the canonical collocation to express this idea", aka (more or less) non-synonym-substitutable, will apply in some cases, perhaps not all. I am less clear how much additional help the "no parallel phrases" test would be. My suggestion, if we want a basis to include entries that we presently wouldn't, or that would presently be of unclear eligibility, is to compile as big a list of these entries as possible, so that we can check whether the proposed test(s) are adequate (and at the same time verify that the tests do not allow entries that we would not want to include, of course). A good source of these entries would probably be previous RFD discussions. Another possibility would be simply to say that we "allow strong set phrases or fixed expressions even if SoP" and, where disputed, let it be debated case-by-case what these are. Mihia (talk) 17:53, 17 October 2021 (UTC)[reply]

@Mihia: Could you lay out the full text of your proposed WT:HUMAN RIGHTS test? I'd like to start an informal vote below about both of them (since the discussion seems to have stalled at this point), so I'd like the full text. Imetsia (talk) 20:44, 19 October 2021 (UTC)[reply]

@Imetsia: Actually, of the two I would probably in the end choose WT:ANIMAL RIGHTS as perhaps slightly less susceptible to objections that it is not wholly explicable as SoP. Unfortunately I don't have a full proposal at the moment except the one that I mentioned, namely "allow strong set phrases or fixed expressions even if explicable as sum-of-parts", which I think may not fly as people may reasonably ask "how do we tell what is a strong set phrase or fixed expression"? That is the difficult part. I am of the opinion, as I alluded to above, that before making a concrete votable proposal the wording should be tested against as many actual examples as possible, which might be obtained from the imagination or from failed (or even passed) RFD candidates, to ensure not only that desired phrases pass but also that undesired ones fail. Compiling such a list is something that I have had on an "eventual to do" list, but haven't got round to yet. Mihia (talk) 16:31, 20 October 2021 (UTC)[reply]

I have a comment about the practical implementation of this idea, if it should go ahead. At WT:CFI it says "An expression is idiomatic if its full meaning cannot be easily derived from the meaning of its separate components [...] See Wiktionary:Idioms that survived RFD for other examples." We cannot therefore just plonk a "set phrase" test at Wiktionary:Idioms that survived RFD, as initially suggested, since quite likely the meaning of a set phrase can be easily derived from the meaning of its separate components. Mihia (talk) 17:27, 15 October 2021 (UTC)[reply]

In fact, the same could be said about some other tests at Wiktionary:Idioms that survived RFD, such as the "tennis player" test. It seems that this problem is a pre-existing slight muddle of the wording in these sections. Mihia (talk) 17:35, 15 October 2021 (UTC)[reply]

Support on my end. AG202 (talk) 21:47, 15 October 2021 (UTC)[reply]

Some dictionaries include a separate section of common collocations involving some term in their entry for that term. For examples, see the online Cambridge Dictionary and Collins. I think this would be a good alternative for us too. --Lambiam 11:43, 16 October 2021 (UTC)[reply]

A separate section would be counterproductive. Using {{ux}} does the job better; remember that there’s the |t= parameter!— thus: {{uxi|en|sick burn|t=a particularly cutting insult}}. ·~ dictátor·mundꟾ 23:08, 16 October 2021 (UTC)[reply]

It is not clear to me why you expect this to be counterproductive. It seems a better alternative, at least to me, than introducing another vague exception to the non-SOP rule. The |t= parameter is explicitly intended for English translations of usage examples on foreign entries, not for glossing. While one or two {{ux}}es – which per policy should be grammatically complete sentences – will generally suffice to demonstrate usage of a term, I can easily imagine a handful of associated common collocations. --Lambiam 10:16, 17 October 2021 (UTC)[reply]

@Lambiam: The parameter t= is used to translate Early Modern English and dialectal English quotes whose mutual intelligibility with Modern Standard English is low. Youth slang and other suchlike jargons also do depart from Modern Standard English, and hence have I no qualms about any misuse of the parameter. Sociolinguistically, any non-Standard variety is ‘foreign’, or else I would not have been blocked for using non-Standard English to write definitions. ·~ dictátor·mundꟾ 15:04, 17 October 2021 (UTC)[reply]

Name the test WT:Nature lover. ·~ dictátor·mundꟾ 23:08, 16 October 2021 (UTC)[reply]

Out of interest, how would you define the test, or criterion, that would allow us to keep "nature[-]lover" against the arguments that it is SoP? Mihia (talk) 08:40, 17 October 2021 (UTC)[reply]

nature lover is a collocation, unlike SoPs like wine lover and nature person. ·~ dictátor·mundꟾ 15:04, 17 October 2021 (UTC)[reply]

I fear that "is a collocation" will be far too permissive for our purposes. Mihia (talk) 17:24, 17 October 2021 (UTC)[reply]

What if I told you that idiomaticity is exclusively essentiated by comparative grounds? Like there is name names, this is easily parsed as the sum of its parts, but if you look at its German translation Ross und Reiter nennen you are soothed. An ἰδίωμα (idíōma) is there by its being ἴδιος (ídios) in contradistinction to other, more hands down ἰδιώματα (idiṓmata) (there is no peculiarity without a general mass to other from). Because judging by commonness within a community runs into the sorites paradox, too obviously and frequently. “Collocation” is just a rephrasing of the same commonness idea. Fay Freak (talk) 19:27, 17 October 2021 (UTC)[reply]

Support per dictátor·mundꟾ. Collocations should be allowed where the collocation itself is substantially more likely to be used than any substitution, where the collocation is a commonly used rhyming pair of terms or alliterative pair of terms, or where at least one term in the collocation has multiple common meanings, but the usage in the collocation overwhelmingly intends one of those meanings (particularly where it is not the most common meaning of the term). bd2412 T 02:04, 18 October 2021 (UTC)[reply]

Without any further stipulation, the test "where the collocation itself is substantially more likely to be used than any substitution" would apparently allow "white cat" as substantially more likely than e.g. "ivory feline", or "chair leg" as substantially more likely than e.g. "stool limb", while I personally would not want to include either of those. I'm sure examples such as these abound. This is why we need to carefully check exactly what the letter of the rule would and would not allow. Mihia (talk) 21:00, 20 October 2021 (UTC)[reply]

I would have zero problems including chair leg as an entry. As for color-noun combinations, they are obvious enough that we might as well append a rule saying no "color noun" terms unless the meaning is something other than a common noun of that name and of that color. bd2412 T 04:55, 11 November 2021 (UTC)[reply]

@Mihia: Is white cat really so much more common than black cat or white noise? Fytcha (talk) 10:56, 11 November 2021 (UTC)[reply]

Not necessarily, but I don't see the connection to the topic. We include black cat and white noise because those have idiomatic meanings. The same doesn't apply to white cat (as far as we know; if it did then we would include it, no problem). Mihia (talk) 18:23, 12 November 2021 (UTC)[reply]

Voting to elect members to the Movement Charter drafting committee is now open (October 12 - 24)

Voting to elect members to the Movement Charter drafting committee is now open. In total, 70 Wikimedians are running for 7 seats in these elections.

Voting is open from October 12 to October 24, 2021.

We are piloting a voting advice application for this election. It helps show which candidates hold positions similar to the choices entered.

According to the set up process, the committee will initially consist of 15 members in total: 7 members elected in this process, 6 members selected by Wikimedia affiliates, and 2 members appointed by the Wikimedia Foundation. Up to 3 additional members may be appointed by the committee, and steps may be taken to replace members as needed.

More details and the voting link are on Meta.

Please feel free to let me know if you have any questions about this process.

Xeno (WMF) (talk) 01:47, 15 October 2021 (UTC) (Movement Strategy & Governance Team, Wikimedia Foundation)[reply]

(Disclosure: I'm a candidate.) The election closes in 17 hours (at 12:00 UTC). The referenced Charter to be drafted is basically going to be a constitution for Wikimedia, binding on all our supporting organizations, and outlining the formation of new governance structures. --Yair rand (talk) 19:13, 24 October 2021 (UTC)[reply]

Wiktionary:Votes/2021-10/Standardising wording for showing cognates

I recently created this vote, for consistency and standardisation. Looking for feedback, concerns, comments, etc. Svartava2 (talk) 16:49, 16 October 2021 (UTC)[reply]

The nuisance of a lot of edits in my watch list may exceed the small benefit. Other than that, I understand the proposal to be replacing all instances of the five strings before {{cog}} with a single one of them, and leaving all other uses of {{cog}} alone. Thus typos like "cognate witth" and alternate wording like "from the same origin as ..." would be untouched. I suggest leaving "include", "with", and "compare" alone and replacing "to" and "of" with "with". Which is not on the list of options. Note that include implies additional unlisted cognates. It is not correct for a bot to replace include with anything else. Vox Sciurorum (talk) 17:02, 16 October 2021 (UTC)[reply]
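The narrower replacement suggested above can be sketched as follows. This is only a minimal illustration in Python, not the bot's actual code: the function name `normalise_cognate_wording` is hypothetical, and the choice to skip "a cognate of" (where "cognate" is a noun) is an assumption added here, not part of the proposal.

```python
import re

# Matches "Cognate to" / "cognate of" only when directly followed by {{cog|...}}.
# Assumption: "a cognate of" is skipped, because "cognate" is a noun there and
# "a cognate with" would be ungrammatical.
_COGNATE_PREP = re.compile(r"(?<!\b[Aa] )\b([Cc]ognate) (?:to|of)(?= \{\{cog\|)")


def normalise_cognate_wording(wikitext: str) -> str:
    """Rewrite 'cognate to' and 'cognate of' as 'cognate with' before {{cog}}.

    'Cognates include', 'Compare', typos and any other wording are left untouched.
    """
    return _COGNATE_PREP.sub(r"\1 with", wikitext)
```

Because the pattern requires the exact strings "to" or "of" followed by {{cog|, wording such as "Cognates include", "Compare" or "from the same origin as" would pass through unchanged.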

@Svartava2, a better formulation of the vote might just be to standardize cognates like this: (1) require "Cognate with" for full cognates, (2) allow "Cognates include" when multiple cognates exist, and (3) allow "Compare" for "non-full cognates" (i.e., per Richard, terms that "are semantically similar or etymologically related"). If the vote were so phrased, it would have the typical consensus-building problems of any omnibus vote. Different users would like and dislike different parts of the proposal, and few will embrace it in full, leading then to a mixed opposition that ultimately tanks the vote. To avoid this, I do agree with Vox's solution above: a simple vote to replace "'to' and 'of' with 'with.'" Imetsia (talk) 15:59, 17 October 2021 (UTC)[reply]

@Imetsia: I don't understand (1) above. It contradicts (2). 'Compare' is appropriate for when the relationship is unclear or a parallel formation can be seen. --RichardW57 (talk) 16:46, 17 October 2021 (UTC)[reply]

OK, let me rephrase it: (1) require "Cognate with" when only one full cognate exists, (2) allow "Cognates include" when there are multiple, and (3) allow "Compare" "when the relationship is unclear or a parallel formation can be seen." I also like your wording better, so thanks for the suggestion! Imetsia (talk) 17:12, 17 October 2021 (UTC)[reply]

@Imetsia: Would you allow 'cognate with' if there were descendants of the cognate given? --RichardW57 (talk) 19:18, 17 October 2021 (UTC)[reply]

@RichardW57: I don't really understand your question. Maybe an example could help? Imetsia (talk) 19:22, 17 October 2021 (UTC)[reply]

@Imetsia: Suppose all that we knew of the cognates of Greek θεός were Latin fānum and the latter's English borrowing fane. Would you allow us to describe the Greek word as 'Cognate with Latin fānum', or would we have to write 'Cognates include Latin fānum'?

I would allow "Cognate with." So I guess even my revised suggestion is too imprecise. I mean to say that one must use "Cognate with" rather than of or to if they wish only to include one cognate (even if others may exist). If, however, one wants to include multiple cognates they can use "Cognate with X and Y" if X and Y are the only cognates that exist; or use "Cognates include X and Y" if X and Y are only two of the many cognates that exist. Imetsia (talk) 20:07, 17 October 2021 (UTC)[reply]

Scratch that. I don't see the reason to make that distinction. I guess we could just deprecate "Cognates include." Imetsia (talk) 20:09, 17 October 2021 (UTC)[reply]

@Imetsia: 'Cognates include' does suggest that there is no need to list them all. --RichardW57 (talk) 21:45, 17 October 2021 (UTC)[reply]

Sure, but there isn't really an urgent need for that to be suggested. It's already understood that we shouldn't list out every single cognate in every case. And simply using "Cognate with" neither implies that the list is exhaustive nor that it's only a subset of the possible cognates. It implies nothing in this respect. Imetsia (talk) 22:48, 17 October 2021 (UTC)[reply]

Different phrasings can mean different things: When I say "Cognates include", I imply there are more cognates. When I say "Cognate with", I imply that there aren't, or that those aren't known. When I say "Compare" I usually mean that the words aren't full cognates, but are either semantically similar or otherwise etymologically related. I would like to keep this freedom to choose the most accurate and nuanced wording. Thadh (talk) 22:44, 16 October 2021 (UTC)[reply]
@Vox Sciurorum, Thadh: "Compare" is often used just before {{cog}}, for true cognates also sometimes (eg. ਗੁਝਾ). My understanding is that "cognate with" or "cognates include" doesn't really imply that only that many cognates are there unless there is an "and". For example, Cognate with LANG term, LANG2 term2, LANG3, term3 or Cognate with LANG term, LANG2 term2, LANG3, term3 and Cognates include LANG term, LANG2 term2, and LANG3, term3 or Cognates include LANG term, LANG2 term2, and LANG3, term3. Another example - Marathi थुंकणे (thuṅkṇe); it says "cognate with" but doesn't include the Urdu cognate given at 𑀣𑀼𑀓𑁆𑀓𑀇. Svartava2 (talk) 03:45, 17 October 2021 (UTC)[reply]
Everybody has their own stylistic choices, and I respect their choice even if I wouldn't necessarily make it myself. And AFAIK {{cog}} may be used for partial cognates, like Saterland Frisian Bäidenstied and German Kinderzeit (only the second part of the compound is etymologically related), but maybe I misunderstand when {{cog}} must be used? Thadh (talk) 09:05, 17 October 2021 (UTC)[reply]

'Cognates include' would be odd for a complete list, even if it doesn't preclude it. While I appreciate that there is now push back against 'Wiktionary is not a paper dictionary', as space on mobile phones is limited, 'cognate with' also invites padding with a complete list of cognates, or a complete list of cognates of a particular type. Note that we use 'cognate' in a wider sense than some other dictionaries, by including words related by borrowing. --RichardW57 (talk) 14:08, 17 October 2021 (UTC)[reply]
@Thadh I believe {{cog}} is only to be used when the terms derive from a single term in the source language, like Pali sarīra and Prakrit 𑀲𑀭𑀻𑀭, both from Sanskrit शरीर. As for Saterland Frisian Bäidenstied, {{noncog}} would be more appropriate; using {{cog}} there is like {{cog}} for Prakrit 𑀅𑀡𑀼𑀕𑀘𑁆𑀙𑀇 (aṇugacchaï) and Pali avagacchati (where the prefix is different and w/o prefix Prakrit 𑀕𑀘𑁆𑀙𑀇 (gacchaï) is a true cognate of Pali gacchati). Svartava2 (talk) 13:28, 17 October 2021 (UTC)[reply]
@Svartava2: {{noncog}} is a bit strong for partially cognate words; plain {{mention}} would be better. --RichardW57 (talk) 16:46, 17 October 2021 (UTC)[reply]
{{m}} doesn't load a hyperlinked language name, though. But I do think we may want to be more lax with the usage of {{cog}}, because using {{ncog}} for false cognates is not uncommon. Thadh (talk) 17:08, 17 October 2021 (UTC)[reply]

@Thadh: {{m+}} does. — This unsigned comment was added by RichardW57 (talk • contribs).
@RichardW57: It doesn't link the language name. Thadh (talk) 19:45, 17 October 2021 (UTC)[reply]

@RichardW57 There is always a chance of any cognate list being incomplete; but do we always use "cognates include […]"? Do you really think that the 42,100 (approx.) uses of "Cognate with" are 100% complete with not even a single cognate missing? No, I don't think so. "Cognate with" ≠ "Cognate [only] with". Re: "we use 'cognate' in a wider sense" - some editors do, while some don't. I don't. In case of a borrowing, I prefer showing other (borrowed) words in other languages from the same etymon as "Compare {{ncog|LANG|term}}", for example, diff (initially which said "cognate" added by Kutchkutch). Svartava2 (talk) 15:41, 17 October 2021 (UTC)[reply]
'Cognates include' tells the user and other readers that the list is not intended to be complete. (Quoting from a dozen Zhuang dialects does not seem useful, unless we're looking at a recent borrowing.) 'Cognate with' does not reveal the author's intention. --RichardW57 (talk) 16:46, 17 October 2021 (UTC)[reply]
There appear to be grammatical issues to handle in any automated processing. While it seems safe to replace 'Cognate to' with the classier 'Cognate with', merging of 'cognate of' runs the problem that 'cognate' here is a noun. 'Cognates include' actually includes a verb, and there may therefore be grammatical issues as well as a loss of connotation. Would the bot know not to change quotations? --RichardW57 (talk) 14:08, 17 October 2021 (UTC)[reply]
well yes you're right, that would also need some attention. I think we could deal with this with the help of some list like user:benwing2/pra-sc. Svartava2 (talk) 16:00, 17 October 2021 (UTC)[reply]
Someone please delete this shitty, nonsensical vote! It does not help us in any wise. (@Metaknowledge) ·~ dictátor·mundꟾ 15:31, 17 October 2021 (UTC)[reply]
It isn't nonsensical; per Imetsia: “A discussion on whether to incorporate some other text in the {{cog}} template by default is a better place to start.” The vote may be hurried a bit, so I removed its starting date (for now); let's do some more discussion regarding this. Svartava2 (talk) 15:53, 17 October 2021 (UTC)[reply]

Bot to generate Spanish forms

I'm playing around with a bot to generate Spanish forms and I wanted to solicit some feedback concerning the "best" way to declare a form of. To start with, it'll just be generating forms of nouns and adjectives.

Below is a list of the templates/parameters I would propose for the given situations. Given that these will be bot generated, I'm preferring templates that may generate the most helpful categories or other meta data without regard for how unwieldy their parameters may be.

Plural of a masculine/feminine adjective (verde -> verdes)

head: {{head|es|adjective form|g=m-p|g2=f-p}}

gloss: {{adj form of|es|verde||p}} -> plural of verde

Masculine plural of adjective (rojo -> rojos)

head: {{head|es|adjective form|g=m-p}}

gloss: {{adj form of|es|rojo||m|p}} -> masculine plural of rojo

Feminine of adjective (rojo -> roja)

head: {{head|es|adjective form|g=f}}

gloss: {{adj form of|es|rojo||f}} -> feminine of rojo

Feminine plural adjective (rojo -> rojas)

head: {{head|es|adjective form|g=f-p}}

gloss: {{adj form of|es|rojo||f|p}} -> feminine plural of rojo

Plural of a masculine/feminine noun (dentista -> dentistas)

head: {{head|es|noun form|g=m-p|g2=f-p}}

gloss: {{noun form of|es|dentista||p}} -> plural of dentista

Plural of a masculine noun (doctor -> doctores)

head: {{head|es|noun form|g=m-p}}

gloss: {{noun form of|es|doctor||p}} -> plural of doctor

Feminine equivalent of a masculine noun (doctor -> doctora)

head: {{es-noun|f}}

gloss: {{female equivalent of|es|doctor}} -> female equivalent of doctor

Plural of a feminine equivalent of a masculine noun (doctora -> doctoras)

head: {{head|es|noun form|g=f-p}}

gloss: {{noun form of|es|doctora||p}} -> plural of doctora

Plural of a masculine noun (naranjo -> naranjos)

head: {{head|es|noun form|g=m-p}}

gloss: {{noun form of|es|naranjo||p}} -> plural of naranjo

Plural of a feminine noun (manzana -> manzanas)

head: {{head|es|noun form|g=f-p}}

gloss: {{noun form of|es|manzana||p}} -> plural of manzana

Are there other cases I should consider or anything else anyone would like to see in a bot generated form entry (etymology, IPA, etc)?

Note: Some of the default head/gloss lines have been edited to reflect the suggestions below. I decided to keep the gender/plural declarations in the headword definition because most entries already have them and they would be difficult to add later but easy to remove.
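The head/gloss pairs listed above could be produced by something along these lines. This is a minimal sketch in Python, not the bot's real code: the function name `form_entry` and its parameters are hypothetical, and it covers only the non-lemma cases that use {{head}} with {{adj form of}} or {{noun form of}}. The female-equivalent case, which is treated as a lemma with {{es-noun|f}}, is left out.

```python
from __future__ import annotations


def form_entry(lemma: str, pos: str, genders: list[str], tags: list[str]) -> dict[str, str]:
    """Build the headword and gloss lines for a Spanish non-lemma form.

    lemma   -- the lemma the form belongs to, e.g. "rojo"
    pos     -- "adjective" or "noun"
    genders -- gender/number codes for the headword line, e.g. ["m-p", "f-p"]
    tags    -- inflection tags for the gloss line, e.g. ["f", "p"]
    """
    # The first gender goes in |g=, later ones in |g2=, |g3=, ...
    g_params = ""
    for i, gender in enumerate(genders, start=1):
        key = "g" if i == 1 else f"g{i}"
        g_params += f"|{key}={gender}"

    head = "{{head|es|" + pos + " form" + g_params + "}}"

    # {{adj form of}} for adjectives, {{noun form of}} for nouns.
    short = "adj" if pos == "adjective" else "noun"
    gloss = "{{" + short + " form of|es|" + lemma + "||" + "|".join(tags) + "}}"

    return {"head": head, "gloss": gloss}
```

For example, `form_entry("verde", "adjective", ["m-p", "f-p"], ["p"])` yields the head and gloss lines given for verdes in the list above.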

JeffDoozan (talk) 22:08, 16 October 2021 (UTC)[reply]

Was there a decision on whether {{es-IPA}} is safe enough to add to all entries? If so, the bot should add pronunciation. On the bigger issue, I don't think all Spanish forms need their own pages. The list should be made by a human including common words but not rare words. Vox Sciurorum (talk) 22:47, 16 October 2021 (UTC)[reply]

Don't worry, I'm not going off on an anti-red-link campaign. I think there are tasks that are better suited to a bot than a human and generating forms seems like one of them. I'm open to input on how to apply this: perhaps only generating forms for lemmas that are DRAE attested that don't contain an obsolete/disused/antiquated qualifier. Additionally, the bot could monitor a page where humans could add lemmas that they deem form-worthy and the bot can save them the labor of creating them manually. JeffDoozan (talk) 18:04, 17 October 2021 (UTC)[reply]

@Vox Sciurorum: I asked this question recently and unfortunately, es-IPA is not 100% foolproof yet. I can get a citation if you need. —Justin (koavf)❤T☮C☺M☯ 06:53, 27 October 2021 (UTC)[reply]

See what the convention is for Portuguese. One may not be better than the other, but two similar languages that often have identically spelled cognates should use the same wording. Vox Sciurorum (talk) 13:01, 17 October 2021 (UTC)[reply]

I've seen all of the variations that I posted above without an obvious consensus, so I thought it would be good to brainstorm to see if there are any nuances I've missed. JeffDoozan (talk) 18:04, 17 October 2021 (UTC)[reply]

No preference on templates for verde et al. I don't think e.g. "masculine plural of hombre" makes sense; I'd say don't mention the gender of nouns in the definition line except for female equivalents. doctora isn't exactly a noun form; we currently treat female equivalents as lemmas (as with alternative forms), and I agree with that format. On verdes et al, I'm weakly against including the gender/number in the headword line.
Something I would love to see you do with this bot is find instances of bluelinks with missing parts of speech. E.g. if a page has an adjective and a masculine noun, the plural of the noun will often exist while omitting the adjective form of the same spelling. It happens a lot with verb forms and nouns in -a, -e, -o.
As for es-IPA, I think it's ready for anything but modern borrowings and especially long words. I'm not 100% sure when secondary stress is used, but I suspect it's present with multisyllabic prefixes, compounds, and some long words. BTW, thanks for adding the DRAE links! Ultimateria (talk) 17:11, 17 October 2021 (UTC)[reply]
Thank you for the feedback, especially regarding doctora, which I've adjusted above.

Finding missing parts of speech in bluelinks is one of the motivations for writing this, as even frequently used forms can go unnoticed for a long time, like the missing adjective form alegres, which this bot will handle easily. Another part is detecting orphaned forms that reference lemmas that have been removed. JeffDoozan (talk) 18:04, 17 October 2021 (UTC)[reply]
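The bluelink check described above might look roughly like this. It is a sketch under stated assumptions, in Python: the helper names are hypothetical, the page layout is assumed to use a level-2 ==Spanish== header with part-of-speech headers at level 3 or deeper, and only three parts of speech are recognised.

```python
import re


def spanish_pos_headers(wikitext: str) -> set:
    """Return the part-of-speech headers found inside the ==Spanish== section."""
    # Capture everything from ==Spanish== up to the next level-2 header or end of page.
    section = re.search(
        r"^==Spanish==\s*$(.*?)(?=^==[^=]|\Z)", wikitext, re.MULTILINE | re.DOTALL
    )
    if section is None:
        return set()
    # Level 3+ headers, so entries nested under "Etymology 1" are found as well.
    return set(
        re.findall(
            r"^={3,}\s*(Noun|Adjective|Verb)\s*={3,}\s*$",
            section.group(1),
            re.MULTILINE,
        )
    )


def missing_pos(wikitext: str, expected: set) -> set:
    """Return the expected parts of speech that the Spanish section lacks."""
    return set(expected) - spanish_pos_headers(wikitext)
```

A form page that exists but has only a Noun header would then be reported as missing its Adjective section whenever the lemma page has both.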

The stress on -mente adverbs isn't coded into {{es-IPA}}; the stress on normalmente, for example, goes on the "mal" syllable, just like normal. QuickPhyxa (talk) 22:25, 18 October 2021 (UTC)[reply]

bot

hi please creat a bot to creat plural of English names I creat some Amirh123 (talk) 12:03, 18 October 2021 (UTC)[reply]

@Amirh123: you can't even write a complete sentence- why are you creating entries? You were blocked for this three years ago. If you keep doing it, the next block may be permanent. Chuck Entz (talk) 12:27, 18 October 2021 (UTC)[reply]

Effect of Apple’s iCloud Private Relay

Hello!

Communities typically block edits from IP addresses that obscure individual users. Usually, it has been about “open proxies” or virtual private networks (VPNs). Now, there's a new Apple service called iCloud Private Relay that obscures the IP addresses of desktop and mobile users of the Safari browser.

As Apple users adopt this new service, we estimate that 3-5% of all logged-in and logged-out editors will have these obscured IP addresses. If we don't change anything, they may be blocked within a few months. This wiki may be severely affected. A lot of people editing here use the Safari browser. Next, other browser providers may do the same. The problem will grow.

The communities, users with advanced MediaWiki permissions, the Wikimedia Foundation and others need to work together. We all need to decide how the security of the wikis can be maintained. At the same time, the pathways to editing should remain open for all good-faith participants.

Read more on Apple iCloud Private Relay. Answer the questions and comment on its talk page.

Thank you!

SGrabarczuk (WMF) (talk) 21:34, 18 October 2021 (UTC)[reply]

As use increases, the admins may want to stop disabling account creation when blocking an IP address. Vox Sciurorum (talk) 20:45, 19 October 2021 (UTC)[reply]

Over-eager abusefilter rule?

I was trying to add a question at the information desk and tripped alarms. I was expressing surprise the equivalent to #redirect took so much work here, and I unwisely muttered a fnord and got a rude surprise:

A brief description of the abuse rule which your action matched is: Bad redirect

by saying, in the middle of the text:

Does that mean you create a redirect article that *isn't* a #redirect (heaven forfend!), but ...

Now of course here I've used &num; rather than '#'. I don't want to add more entries to my permanent record.

Still, isn't the mention of !#!redirect *anywhere* in text kinda too broad a rule?

63 Bad redirect Disallow, Tag Enabled 04:27, 4 October 2019 by Erutuon (talk | contribs) Private

I see User:Erutuon has been away for some 2 weeks. Anyone else want to take a look? (And is searching for "interface-admin" in Wiktionary:Administrators a reasonable approximation to the set of possible 'someone's?) Shenme (talk) 06:28, 19 October 2021 (UTC)[reply]

I looked through the hits on the filter and they generally seem to be accidents or generally bad edits. There is some amount of spam where people try to redirect to other websites, which won't actually do anything, but maybe that is what Erutuon was trying to avoid. Maybe the filter should be updated to only prevent new and unregistered users from typing #REDIRECT elsewhere on the page. - TheDaveRoss 12:41, 19 October 2021 (UTC)[reply]

Erutuon only edited it. I wrote it. The problem it addressed was vandals adding a redirect to an existing page to effectively make it go away. It needs to be fixed so it only detects functioning redirects, but it addresses a real problem. I'd rather not get rid of it. Chuck Entz (talk) 13:57, 19 October 2021 (UTC)[reply]

I edited it again to make it a little less eager. The OP's attempted edit on the Information Desk would no longer trigger it, but all of the other edit filter hits that I looked at still would. — Eru·tuon 20:36, 19 October 2021 (UTC)[reply]
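The distinction drawn above, between a functioning redirect and a mere mention of #redirect in running text, can be sketched in Python. This is only an illustration, not the actual rule of filter 63 (which is private and written in the AbuseFilter rule language). It assumes that a redirect only functions when the directive opens the page text and is followed by a wikilink; the name is_functioning_redirect is hypothetical.

```python
import re

# Assumption: a redirect only "functions" when the directive is the first
# thing in the page text and is followed by a [[wikilink]] target.
FUNCTIONING_REDIRECT = re.compile(
    r"\A\s*#redirect\s*:?\s*\[\[[^\[\]]+\]\]",
    re.IGNORECASE,
)

def is_functioning_redirect(wikitext: str) -> bool:
    """Return True only if the text would act as a redirect, not merely mention one."""
    return FUNCTIONING_REDIRECT.match(wikitext) is not None

# A page replaced by a redirect is detected...
print(is_functioning_redirect("#REDIRECT [[foo]]"))
# ...but a comment that merely mentions the directive is not.
print(is_functioning_redirect(
    "Does that mean you create a redirect article that *isn't* a #redirect?"
))
```

A check along these lines would leave the Information Desk question alone while still catching an edit that blanks a page into a redirect.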

Thank you. I'll redirect my attention elsewhere. :-) Shenme (talk) 04:11, 21 October 2021 (UTC)[reply]

Automatically generating form-of entries?

Hi. Is there any way to automatically generate form-of entries, or does one have to manually go create the page and fill out the correct details for each inflection? The language I'm thinking of doing this for has some subtleties regarding accentuation, but I'll ignore that for now and just ask about e.g. Italian or French verb conjugations. 70.175.192.217 05:34, 20 October 2021 (UTC)[reply]

French and Italian inflected forms have been added by a bot: SemperBlottoBot. --Lambiam 09:35, 20 October 2021 (UTC)[reply]

Help needed for bor cleanup

I request that someone prepare a list of words in Indo-Aryan languages deriving from Sanskrit — that use only {{bor}} (i.e., no other specific templates). This will make it easier for me to substitute {{bor}} with {{lbor}} (or {{slbor}}, or in a few cases correcting to {{inh}}). A list will make it easier to do the cleanup, or else it is sore difficult to search through the entire list at CAT:X language terms borrowed from Sanskrit. Pinging @Benwing2, Erutuon. ·~ dictátor·mundꟾ 15:21, 21 October 2021 (UTC)[reply]

@Surjection: would it be possible for you to prepare such a list for me? Thanks for any help. ·~ dictátor·mundꟾ 15:21, 22 October 2021 (UTC)[reply]

@JeffDoozan, SemperBlotto: Could anyone of you help me prepare a list; maybe you could do it using a bot. Thanks for any help. ·~ dictátor·mundꟾ 17:31, 23 October 2021 (UTC)[reply]

@Inqilābī: I don't know if this is something I can generate, but there are a couple of things you could clarify to make this request easier to fulfill: which languages qualify as "Indo-Aryan languages deriving from Sanskrit" and exactly which other templates qualify as "no other specific templates"? If you want, say, a list of all Punjabi entries that have {{bor}} but not {{lbor}} anywhere in the entry, I can make that for you pretty easily. If, however, you're looking for a list of 10+ different languages, considering only templates that appear in the Etymology section, and excluding 10+ templates, that's significantly more work. JeffDoozan (talk) 17:47, 23 October 2021 (UTC)[reply]

@JeffDoozan: Thanks for your interest! Yes, I should clarify things. So, I would like a list of words of all Indo-Aryan languages that are categorized as only borrowed (using {{bor}}) from Sanskrit. However, you can of course choose to make separate lists language-wise, i.e., separate lists for Bengali, Hindi, Punjabi, etc. Entries that already use the specific {{lbor}} and {{slbor}} are not to be included. And also, Sanskrit includes any specific chronolect of Sanskrit as well— Classical, New, etc. Hope this helps. ·~ dictátor·mundꟾ 18:05, 23 October 2021 (UTC)[reply]

@Inqilābī: If you can give me a list of all of the language codes you want searched ("bn" for Bengali, "hi" for Hindi), I can make you a list of all entries that include {{bor}} but not {{lbor}} or {{slbor}}. JeffDoozan (talk) 18:15, 23 October 2021 (UTC)[reply]

@JeffDoozan: Here: as, bn, or, bho, inc-oas, awa, mag, bra, gu, hi, ur, kfr, ks, kok, mr, ne, pa, sd, inc-ogu, omr, pi, si .

Besides Category:Terms borrowed from Sanskrit, also do check other categories like Category:Terms borrowed from Classical Sanskrit, Category:Terms borrowed from New Sanskrit, etc. for any possible usages of {{bor}}. ·~ dictátor·mundꟾ 21:00, 23 October 2021 (UTC)[reply]

Here's your list. It turns out that there are no entries whatsoever that use {{bor|??|sa}} and also {{slbor|??|sa}} or {{lbor|??|sa}}. If you look at the stats for Assamese (as), you'll see it has 176 uses of {{bor}} and 12 uses of {{lbor}}, but there is no intersection of the pages containing {{bor}} and pages containing {{lbor}}. — This unsigned comment was added by JeffDoozan (talk • contribs).
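The kind of check discussed in this thread (entries that use {{bor}} from Sanskrit but neither {{lbor}} nor {{slbor}}) can be sketched roughly as follows. This is not the script that produced the list. It is a minimal Python illustration that assumes the page wikitext is already available as a string, that only the short template names are used (no aliases), and that only the plain `sa` source code is of interest; template_re and bor_only_languages are hypothetical names.

```python
import re

def template_re(name):
    """Build a pattern for {{name|<lang>|sa...}} that captures the language code."""
    return re.compile(
        r"\{\{\s*" + re.escape(name) + r"\s*\|\s*([^|{}]+?)\s*\|\s*sa\s*[|}]"
    )

BOR = template_re("bor")
LEARNED = [template_re("lbor"), template_re("slbor")]

def bor_only_languages(wikitext, lang_codes):
    """Return the language codes that use {{bor|xx|sa}} but no learned-borrowing template."""
    wanted = set(lang_codes)
    bor_langs = {m.group(1) for m in BOR.finditer(wikitext)} & wanted
    learned_langs = set()
    for pattern in LEARNED:
        learned_langs |= {m.group(1) for m in pattern.finditer(wikitext)}
    return sorted(bor_langs - learned_langs)

page = "==Bengali==\nFrom {{bor|bn|sa|कर्म}}.\n==Hindi==\n{{lbor|hi|sa|कर्म}}"
print(bor_only_languages(page, ["bn", "hi", "pa"]))
```

Run over every page in a database dump, a function like this would yield per-language lists of entries still to be reviewed. Chronolect codes other than `sa` would need additional patterns.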

@JeffDoozan: Thank you so much!! But I see you omitted some languages: inc-oas, awa, mag, bra, kfr, kok, inc-ogu, omr. ·~ dictátor·mundꟾ 14:43, 24 October 2021 (UTC)[reply]

@Inqilābī: They show 0 results because they don't contain any entries with a {{bor|??|sa}}. JeffDoozan (talk) 14:48, 24 October 2021 (UTC)[reply]

I spoke too soon: some of them do include {{bor}}. I think I know what the problem is; I'll get you a new list. JeffDoozan (talk) 14:52, 24 October 2021 (UTC)[reply]

Fixed JeffDoozan (talk) 15:00, 24 October 2021 (UTC)[reply]

@JeffDoozan: Sorry to bother again: would it be possible to make the list an automated one, similar to Category:etyl cleanup, so that upon fixing one entry, it goes off the list? I would ideally want that, otherwise the cleanup is going to be very strenuous for me. ·~ dictátor·mundꟾ 15:31, 24 October 2021 (UTC)[reply]

@Inqilābī: The list is generated from the Wikimedia database dump that is generated twice a month, so you won't get live updates, but I'll try to remember to keep it updated for you and maybe automate it someday. Ping me if it's ever more than 4 weeks old and I'll refresh it for you. JeffDoozan (talk) 16:54, 24 October 2021 (UTC)[reply]

Using Template:head to populate Category:English N-letter words

In Wiktionary:Beer_parlour/2021/June#Categorization_bot, User:Suzukaze-c stated that {{head}} could be used to easily populate Category:English three-letter words (which is currently sparse). I would like to propose that {{head}} be used to populate the categories for English one-letter, two-letter, and three-letter words. (I lack the permissions to implement this myself.) The only explicit counterargument mentioned in the previous discussion (though I'm open to more) is that it is difficult to browse or search through a very long list of categories at the bottom of a page. If consensus is against populating these categories, then I'll happily create RFDs for them. - excarnateSojourner (talk|contrib) 03:13, 22 October 2021 (UTC)[reply]

When it comes to implementing, it should be noted that N-letter words must "have meaning(s) beyond their component letters that are neither names nor abbreviations", so the part of speech and possibly capitalization of a term will have to be examined to determine whether it fits the categories' criteria. - excarnateSojourner (talk|contrib) 03:19, 22 October 2021 (UTC)[reply]

Talk to the Community Tech


Hello!

We, the team working on the Community Wishlist Survey, would like to invite you to an online meeting with us. It will begin on 27 October (Wednesday) at 14:30 UTC on Zoom, and will last an hour. Click here to join.

Agenda

Become a Community Wishlist Survey Ambassador. Help us spread the word about the CWS in your community.
Update on the disambiguation and the real-time preview wishes
Questions and answers

Format

The meeting will not be recorded or streamed. Notes without attribution will be taken and published on Meta-Wiki. The presentation (all points in the agenda except for the questions and answers) will be given in English.

We can answer questions asked in English, French, Polish, Spanish, German, and Italian. If you would like to ask questions in advance, add them on the Community Wishlist Survey talk page or send to sgrabarczuk@wikimedia.org.

Natalia Rodriguez (the Community Tech manager) will be hosting this meeting.

Invitation link

Join online
Meeting ID: 83847343544
Dial by your location

We hope to see you! SGrabarczuk (WMF) (talk) 23:00, 22 October 2021 (UTC)[reply]

Old Prussian Macrons

Looking through Category:Old Prussian lemmas, I noticed two things: all the terms with macrons I've looked at so far are at the spellings with the macron, and all links to Old Prussian entries with macrons I've seen so far are redlinks. It turns out that Module:languages/data3/p is explicitly set to strip macrons.

The obvious question: do we remove the macron-stripping parameter from the module, or do we move all the macron forms to macronless ones and add a head parameter with the macron?

I would also mention that there are a lot of non-lemmas in Category:Old Prussian lemmas, which shows that the headwords have had very little attention since before we made the lemma/non-lemma distinction. It's not uncommon to see one or two edits by humans in 2006 or 2007 and nothing else but bot edits in the edit histories. Chuck Entz (talk) 00:17, 23 October 2021 (UTC)[reply]

Thank you for bringing this up. The Old Prussian entries on Wiktionary could definitely benefit from cleanup and consistency.

Many Old Prussian words are only attested in non-lemma form, e.g. only the accusative plural tūsimtons (“thousands”) is known. We could go one of two ways: we could either attempt to reconstruct the original form (which would probably belong in the Reconstruction: namespace, similar to how some Gothic and Latin terms are there), or we could stick to only describing the attested forms. I personally wouldn't want to venture far into reconstruction (since I'm not a linguist), although if the work has already been done somewhere else and is simple to incorporate, then I wouldn't mind including it.

Another issue is orthography. As with a lot of old languages, every text had its own unique way of spelling things. That's not a huge problem in itself, since we can just choose one main form arbitrarily and make the other forms alternate. What requires more thought is the fact that a lot of words have an actual orthography, influenced by German, and a reconstructed orthography, based on Balto-Slavic phonology. Macrons are one part of this, but not the entire thing. Macrons were actually used in the Enchiridion, but I'm not sure if any other texts used them. Another idea was that stress is indicated by the presence of doubled consonants preceding a vowel, but I digress.

For example, smoy (“person”) is attested in the Elbing vocabulary, but the reconstructed form is zmūi. We probably want to make sure terms are in the original form, or at least be aware of this nuance. For smoy, we already have the original form. But, for instance, I believe our entry ēizwa ("wound") is actually a reconstruction of the original eyswo.

(Actually, I'm not sure if either ēizwa or zmūi are attested anywhere, although looking at Kortlandt's version of the Enchiridion, I'm not seeing them.)

Something to definitely keep in mind is that the dictionary at https://wirdeins.twanksta.org/ is for a reconstructed, revived version of Prussian. For example, if you type in "telephone", you'll get "telepōns" as a result. There actually is a way to tell which words are real and which aren't. You have to click the head word, and then you'll see "telepōns <32> masc [Telephon MK]". "MK" are the initials of the person who wanted to revive Prussian. Any words with "MK" in them, or certain other initials corresponding to people involved in the project perhaps (but I've only ever seen MK), are completely out of scope for Wiktionary. Whereas, if you type in "person", you'll get "zmōi <64> masc [Smoy E 187]", where "E" indicates that the word appears in Elbing, and "Smoy" is the original attested form, while zmōi is their reconstruction.

Any other editors with interest in Old Prussian or Balto-Slavic historical linguistics in general may want to take a look at this discussion. 70.175.192.217 11:01, 23 October 2021 (UTC)[reply]

Macrons: As they are part of one regular spelling, wouldn't they belong in the title? Compare:

Latin, Greek, Germanic (OHG, MHG, MLG, Anglo-Saxon): The macron isn't regularly used in writing, but is used in some dictionaries to indicate vowel length. That's why titles don't have macrons.
Baltic (Lithuanian, Latvian): Macrons are used in writing and hence are also used in titles.

Reconstructions: Indeed, reconstructions belong in the reconstruction namespace.

--Myrelia (talk) 16:27, 24 October 2021 (UTC)[reply]

[Edited for brevity 70.175.192.217 20:29, 24 October 2021 (UTC)] Lithuanian and Latvian do use macrons/ogoneks in their standard orthography to indicate vowel length, but I think it's worth noting that like Greek (etc.) they also have special diacritical marks only used in dictionaries to indicate pitch accent that we don't include.[reply]

I think it could be reasonable to include the macrons in titles, but only for words where it is attested (probably a subset of words from the Enchiridion), not for every form where some people have tried to guess where the stress would have been. 70.175.192.217 19:09, 24 October 2021 (UTC)[reply]

The problem I see is that we haven't been very consistent in marking where and in what form these are attested. There are no doubt a number where the original document had them marked as long, but there are the words for sodium and iodine (now in rfv) that show macrons, but can't possibly have been even attested- with or without macrons. If only one source has macrons, why are there so few macronless entry names in Category:Old Prussian lemmas?

At any rate, we can't stay with the status quo. Old Prussian muti is a good illustration: it's defined as "Alternative form of mūti", but there's no way to go to mūti: the template strips the macron and treats it as a self-link. In most cases, templates have redlinks to a non-existent macronless form rather than linking to an existing macron form. Simply put, you can't use a template to link to an Old Prussian entry with a macron.
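The effect described here can be illustrated with a small Python sketch. Wiktionary's language modules are written in Lua, so this is only an approximation of the behaviour, assuming the stripping amounts to removing the combining macron (U+0304) after Unicode decomposition; strip_macrons and link_target are hypothetical names.

```python
import unicodedata

COMBINING_MACRON = "\u0304"

def strip_macrons(text: str) -> str:
    """Remove macrons by decomposing, dropping U+0304, and recomposing."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(COMBINING_MACRON, ""))

def link_target(display: str, strip: bool = True) -> str:
    """Page title that a link template would point to for a given display form."""
    return strip_macrons(display) if strip else display

# With stripping on, a link written as "mūti" on the page "muti" points back
# at "muti" itself, so the entry "mūti" cannot be reached through a template.
print(link_target("mūti"))               # muti
print(link_target("mūti", strip=False))  # mūti
```

With stripping disabled, the link resolves to the macron spelling, which is where the existing entries currently live.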

As I said, there are only two solutions: disable macron-stripping (easy, but inconsistent with how we handle other languages) or add a {{head}} parameter with the macron form to all entries with a macron, then move them to the macronless spelling (time-consuming, but there are only a couple hundred entries at most). I'm tempted to just implement the second solution.

I would suspect that a lot of people who wrote Prussian entries just didn't do their due diligence. They might have looked up the words in that revived-Prussian dictionary or something similar and just added them, without checking in what form they were attested. Either of those solutions could work as a stopgap measure. Ideally we could go through all the words and figure out the form in which they were attested, but that would take an awful lot of effort. 70.175.192.217 20:29, 24 October 2021 (UTC)[reply]

"macrons in titles, but only for words where it is attested": yeah, that is going by actual spelling in the sources.

muti/mūti: If mūti is the actual spelling in a source, then the entry mūti should stay and the templates/modules be fixed.

"so few macronless entry names": Possibilities:

That the macron-source was the preferred source for the entries or has more content.
That the category contains much reconstructed or 'normalised' Prussian or constructed Neo-Prussian. diff gives a source: Vytautas Mažiulis, cp. Old Prussian language#Revitalization.

--Myrelia (talk) 21:34, 24 October 2021 (UTC)[reply]

Re the first possibility: It looks like the Enchiridion (3rd Catechism) is the most voluminous source: 132 pages of text, vs. ~6 pages each for the other two Catechisms, ~800 words in the Elbing vocabulary, and various fragmentary texts.

If anyone wants to look into this further, I think a good source is prusistika.flf.vu.lt, although it is in Lithuanian (with some German glosses). It is based on Mažiulis's dictionary and has very detailed entries, including various inflected forms, 'normalized'/'reconstructed' forms, etymology, and links to the original usages. E.g., for abasus ("cart") (which we currently list under abazzus): [6].

By the way, if we want to follow that dictionary, they seem to include macrons in headwords if it is attested: īmt, but exclude them if it is not: smoy (note that the entry does include the normalized form with the macron, prefixed by an asterisk to indicate it's a reconstruction: "*zmōi̯"). 70.175.192.217 22:44, 24 October 2021 (UTC)[reply]

I just went through five arbitrary Prussian lemma entries near the start of the alphabet, and all of them were in the normalized/reconstructed form:

abbaras vs. aboros
ābzdus vs. wobsdus
azzaran vs. assaran
cīziks vs. czilix
I couldn't even find any entry corresponding to ālaws, although there is starstis (which I found using the reverse-dictionary search from Lithuanian 'alavas'). An actually-existing Prussian cognate is alwis, but that means lead, not tin.

These are not cherrypicked. If we aim to only use attested forms as headwords, we have our work cut out for us. 70.175.192.217 00:03, 25 October 2021 (UTC)[reply]

I have my doubts about Category:prg:Chemical elements in general. Some of them would be straightforward enough for someone of that era to get right, but at least two are obviously modern inventions and there may be modern guesswork involved in the identification for others. We definitely should remove the atomic numbers and descriptive stuff- they can get that at the English entries, and it may be implying more precision than the sources merit. Chuck Entz (talk) 01:35, 25 October 2021 (UTC)[reply]

I've gone through all the entries and added |head= parameters. I did so because they do no harm as long as they match the entry title, and because it removes one obstacle to moving rather than changing the module. I won't mind if everybody decides not to move- I also checked all the headwords and added sortkeys, so if we don't move them, all the categories generated by {{head}} won't sort with all the macron forms at the end. Either way, I will have wasted a little bit of my time, but that's fine with me.

Re: muti/mūti: the way to deal with this in the macronless option is to have two noun sections next to each other: one with the macron in the headword and one without.

While going through the entries, there were a few entries that weren't nominative singular, but were formatted like a lemma. I'm guessing that these are only attested in the inflected forms. Another oddity is unds, which is defined as the masculine singular of undan. I've noticed that pretty much all of the accusative singulars end in -n, so someone may have assumed that undan is one of those, though it's said to be an alternative form of wundan which is in turn the gloss in the Elbing vocabulary for "Wasser".

I should also mention a "Usage notes" section at drūwis, which discusses standards for capitalization of the names of religions. I'm guessing this only applies to modern revived Old Prussian, since a dead language attested mostly in vocabulary lists doesn't really have usage in that sense. I would suggest that we remove it.

This does bring up the question of capitalization, however. All of the proper nouns seem to be capitalized and the common nouns seem to be lowercase. I really have doubts as to whether this was the practice in all the original manuscripts. Still, if the difference is due solely to the linguistic background of the mostly second-language speakers who wrote the words down, we might want to standardize capitalizations to avoid a random mess. Chuck Entz (talk) 01:11, 25 October 2021 (UTC)[reply]

I've removed macron-stripping from the module. If we decide we want to move the macron forms to macronless, it can be easily changed back. At least now the links work. Chuck Entz (talk) 04:36, 25 October 2021 (UTC)[reply]

That's great. BTW: diff caused the issue, and was changed without explanation or taking care of the entries. --Myrelia (talk) 09:33, 25 October 2021 (UTC)[reply]

Reference Placement

Where should inline references go? Normally the correct placement is after the information presented, but there are a number of awkward cases.

Information in the headword line. Some, such as @Inqilābī, interpret that footnote link location as being for the whole L3/L4 section. I interpret such a reference as being for the information on the headword line. For longer headword lines, this could be confusing, as for some languages there may be several items on the headword line, e.g. simple pasts and past participles for English.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
Senses. For less-documented languages, it may be permissible to cite a dictionary. How are dictionaries supposed to be cited? For some very simple instances, it may be possible to quote a dictionary (or thesaurus) that is in the public domain, though I suspect that may be dispreferred.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
General sources. There seems to be some debate on how to cite a dictionary entry for the whole of an L3/L4 entry. I would argue that the citations should be atomised to the individual parts of the Wiktionary entry, and that where there is more information to be found in the cited dictionary, it is more appropriately referenced under 'Further Reading'. Inqilābī prefers to put the footnote link on the headword line. Note that the <ref> tag has an attribute name to allow multiple links to the same footnote. --RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
Further quotations. Some of the links to the Pali Text Society dictionary are justified by its listing of locations where the words are used, which we should in the fullness of time add to Wiktionary. (I'd rather someone else did the blinking work - I haven't managed to reduce it to a simple mechanical task.) I think that hints for further extensions to the article should go under 'Further Reading'.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
Inflections. Or should we just use non-lemma entries to give sources? There may be some issues where forms don't independently meet the Criteria for Inclusion (CFI). Ideally, we could use footnotes in inflection tables, but for now that is largely an aspiration rather than a real capability.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]

Category:Proto-Chinookan language

This category seems very odd to me, and it was created by a bot. Question from a complete layman: is there really a "Proto-Chinookan language"? I see no evidence of that online. Could someone knowledgeable in this check it out for validity? Thanks. PseudoSkull (talk) 18:54, 25 October 2021 (UTC)[reply]

I'm not knowledgeable in this particular matter but I do find scientific attestation on the internet: [7], [8], [9]. Fytcha (talk) 19:01, 25 October 2021 (UTC)[reply]

It was added by @-sche: diff. DTLHS (talk) 19:30, 25 October 2021 (UTC)[reply]

Chinookan languages are one of the self-evident families in North America, and no one is in doubt about its validity. So there must have been a common proto-language. However, Proto-Chinookan has not yet been reconstructed except for person and tense/aspect-marking prefixes in two papers by Silverstein. Since Sapir, comparatists have been more eager to find evidence for the inclusion of Chinookan in the Penutian stock, rather than doing the necessary homework for Proto-Chinookan first. –Austronesier (talk) 10:09, 26 October 2021 (UTC)[reply]

Interesting; as Austronesier says, it's a 'real' protolanguage (the Chinookan languages are clearly descended from a common source, and a few references refer to it), but I no longer recall (years later) why I added it, since no entries make reference to it (maybe I was going to mention it in etymologies and got sidetracked). The references on it are sparse; no objection if someone wants to remove it. - -sche (discuss) 22:14, 14 November 2021 (UTC)[reply]

Vote: New page-protection level

Proposing a new protection level between autoconfirmed and template editor. The auto-confirmed right can be easily obtained: per the details at w:WP:AUTOC, it requires just 4 days + 10 edits. Disruption by auto-confirmed users exists and isn't that rare, as can be seen with [10], [11], [12] and many others. A new protection level would be stricter than auto-confirmed protection and would be used to prevent auto-confirmed disruption.

[Note: If both options pass, the one with greater SUPPORT:OPPOSE ratio will be implemented.]

Schedule:

Starts: 16:09, 26 October 2021 (UTC)
~~Ends: 16:09, 10 November 2021 (UTC)~~

Option 1: Extended confirmed user group

Creating a new right like w:WP:XC, named extendedconfirmed as on WP and auto-granted after 30 days + 500 edits. This would enable an extended-confirmed protection level.
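To make the two thresholds quoted in this proposal concrete, here is a minimal Python sketch of the eligibility check. It is not MediaWiki's actual promotion logic; it only assumes that a level is granted once both the account-age and the edit-count conditions are met, and the names THRESHOLDS and qualifies are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Thresholds quoted in the discussion: (minimum account age, minimum edit count).
THRESHOLDS = {
    "autoconfirmed": (timedelta(days=4), 10),
    "extendedconfirmed": (timedelta(days=30), 500),
}

def qualifies(level, registered, edit_count, now):
    """Return True if the account meets both the age and edit-count thresholds."""
    min_age, min_edits = THRESHOLDS[level]
    return (now - registered) >= min_age and edit_count >= min_edits

now = datetime(2021, 10, 26, 16, 9, tzinfo=timezone.utc)
newcomer = datetime(2021, 10, 20, tzinfo=timezone.utc)

print(qualifies("autoconfirmed", newcomer, 12, now))      # True
print(qualifies("extendedconfirmed", newcomer, 12, now))  # False
```

The example account, six days old with twelve edits, already passes the lower bar but is far from the higher one, which is the gap the proposal relies on.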

Support

Support, as proposer. Svartava2 (talk) 16:09, 26 October 2021 (UTC)[reply]
Support. It could be a good protection-level that admins place on heavily-vandalized pages. And Wikipedia has it too. We should also look into other user permission groups to condition voting at RFD, etc. as I had suggested at Wiktionary:Beer parlour/2021/March § Dentonius. But we can worry about that in a later discussion. Imetsia (talk) 17:02, 26 October 2021 (UTC)[reply]
Changed to abstain, now that we have the second option below, which I think is better. Imetsia (talk) 15:23, 27 October 2021 (UTC)[reply]
Support--Jusjih (talk) 21:44, 1 November 2021 (UTC)[reply]

Oppose

Oppose. Unsafe move. Trusted people who are prolific editors can be made admins or template editors, if need be. Also, it would be very problematic if Wonderfool is able to edit protected pages. ·~ dictátor·mundꟾ 14:57, 27 October 2021 (UTC)[reply]
Oppose Out-of-process vote. Not publicized as regular vote, etc. DCDuring (talk) 17:20, 27 October 2021 (UTC)[reply]
Oppose We have a process for votes. This pseudo-vote is not following our process. What else will Svartava2 do that ignores our established processes? This whole exercise reads to me as disrespect for the Wiktionary community.

By not following the process, this pseudo-vote has not been publicized, and does not last the indicated one month. Proper editor notification and proper timelines are even more important when it comes to our infrastructure, such as the permissions system.

I cannot support this kind of procedural abuse. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:49, 27 October 2021 (UTC)[reply]

Abstain

Abstain Seems redundant with the autopatroller capability. Both identify non-admins with a large number of good edits. Is it technically possible to set pages to be editable by admins and autopatrollers? Vox Sciurorum (talk) 17:18, 26 October 2021 (UTC)[reply]
@Vox Sciurorum No, currently it isn't possible to protect a page so that it can be edited only by autopatrollers and admins. Added the option for this now (see below). Svartava2 (talk) 17:54, 26 October 2021 (UTC)[reply]
Abstain: I think option 2 is better. P U C – 12:02, 27 October 2021 (UTC)[reply]
Abstain per PUC. Imetsia (talk) 15:23, 27 October 2021 (UTC)[reply]

Option 2: Autopatroller user group

Adding an option for protection which would allow only autopatrollers and admins to edit the protected page, which currently doesn't exist. Autopatrollers are generally trusted users, unlikely to intentionally disrupt pages. Related discussion: User talk:Bhagadatta#Lahore.

Support

Support, as proposer. Plus, this would make the autopatroller group a bit more useful. Svartava2 (talk) 17:54, 26 October 2021 (UTC)[reply]
Support; I think this new protection level would, for some pages, strike the right balance between too little and too much. P U C – 18:05, 26 October 2021 (UTC)[reply]
Support. Imetsia (talk) 19:22, 26 October 2021 (UTC)[reply]

~~# Support.~~ There are certain pages I can't edit despite having over 10,000 edits over 6 years, simply because they're targets for vandalism. Andrew Sheedy (talk) 04:43, 27 October 2021 (UTC)[reply]

I have stricken my vote, not because I do not support this, but because I interpreted this as an opinion poll, but now see that it has been presented as a vote. My support still stands, but I agree with those below who are saying that this needs a formal vote. Andrew Sheedy (talk) 02:57, 28 October 2021 (UTC)[reply]

Support -- 𝓑𝓱𝓪𝓰𝓪 𝓭𝓪𝓽𝓽𝓪^{(𝓽𝓪𝓵𝓴)} 13:09, 27 October 2021 (UTC)[reply]

~~# Support This seems like the right approach to me. DCDuring (talk) 14:21, 27 October 2021 (UTC)~~[reply]

Support--Jusjih (talk) 21:44, 1 November 2021 (UTC)[reply]
@Jusjih: This vote was cancelled on 28 October; see Wiktionary:Votes/2021-10/Autopatroller-level page protection. J3133 (talk) 10:57, 2 November 2021 (UTC)[reply]

Oppose

Oppose. There are also autopatrollers who are disruptive, controversial editors. Our current protection level is fine: WT:Autopatroller is not supposed to be a right. The proposer himself was recently made an autopatroller, and is eager to go on a vandalism spree. ·~ dictátor·mundꟾ 14:57, 27 October 2021 (UTC)[reply]
Why not propose removal of autopatroller right for such a (non)contributor? DCDuring (talk) 15:39, 27 October 2021 (UTC)[reply]

No protection level is entirely failproof: some administrators have also been controversial editors in the past. It's simply a question of cost-benefit analysis: will allowing more (experienced) editors to edit certain pages be worth the increased risk of controversial edits on those pages by said editors? On the whole, I'd say the answer is yes. The autopatroller status is not given lightly, and we can trust that autopatrollers are people who know what they're doing and should consequently be free to edit (almost) any page. If we then notice that even that protection level is not enough on one page or another, we could either increase the protection level for that page or revisit the question of the autopatroller status of the editor, as DCDuring suggests.

The other risk I see is that of an admin starting to use the new protection level too liberally, on pages who should really be editable by all auto-confirmed users, not only autopatrolled ones. But that appears unlikely to me, so again, I think the benefits outweigh the costs. P U C – 16:21, 27 October 2021 (UTC)[reply]
Inqilabi, when will you stop making nonsensical and illogical comments? I initially proposed only the extended confirmed level, but later Imetsia suggested the autopatroller one on Discord, and I agreed to include that; it has nothing to do with my recent whitelisting. I'm not a non-contributor. The 2 admins who played a role in whitelisting are sensible, unlike you. Just because of our recent controversies and disputes, please do not make every page our battleground, like you also did at diff. For your inappropriate comments, you should be blocked to be taught a lesson. Svartava2 (talk) 16:59, 27 October 2021 (UTC)

Even if this is a plot to gain access to a protected page, the admin who prescribed the meaning or etymology of the word is still going to be watching the page to protect it against dissenting views. This proposal belongs in a discussion among the admins. This vote is a waste of time. Only the admins will be deciding how to use any protection mechanism. As far as I know there is no existing voter-approved policy saying that some list of protection levels is exhaustive, so there is nothing to vote to amend. Vox Sciurorum (talk) 18:23, 27 October 2021 (UTC)

@Inqilābī: I see no reason to impugn the motives of another contributor based on speculation and personal feelings. There is no evidence to allege that Svartava has created this vote pretextually so he can "go on a vandalism spree." Argue the issue based on its merits rather than by attacking the person who created it. At this point, your ad hominems have become quite annoying, disruptive, and probably blockworthy. Imetsia (talk) 19:47, 27 October 2021 (UTC)[reply]
TBH, having watched some of their interactions from afar, I wouldn't say Svartava is beyond reproach either; so without condoning Inqilabi's ad hominems I can see where they're coming from. But yes, it's regrettable that personal feelings are getting in the way.

In general, Inqilabi and Svartava strike me both as overly enthusiastic teenagers, who have knowledge and sometimes good ideas (the idea discussed here being, in my opinion, one of them) but should learn to be more patient and accommodating. And that's me saying this. P U C – 23:00, 27 October 2021 (UTC)[reply]
You people are mistaken, I always try my best to keep the project from harm. This was my alert to the community about the danger posed by Svartava, the only person with whom I have a standoff, caused by Svartava’s impatience and ill will. ·~ dictátor·mundꟾ 10:33, 28 October 2021 (UTC)[reply]
Grow up, please. We had a much better position in each other's eyes before a few disputes ruined it. I was patient and good-willed as long as I accepted your dictatorship. When I disputed it, I was no longer so. My attempt to get a clean start with the account user:Svartava was in good faith; I had even informed a checkuser to ensure that I wasn't doing anything wrong. After that you thoroughly convinced yourself that I was a vandal, which I clearly wasn't. All I would like to say is that this isn't the right discussion to talk about this; you could start a new discussion on BP or my talk page. I humbly request you to keep any discussion free of unrelated and off-topic comments. Svartava2 (talk) 16:06, 28 October 2021 (UTC)
Oppose Out-of-process vote. Not publicized as regular vote, etc. DCDuring (talk) 17:21, 27 October 2021 (UTC)[reply]
Oppose We have a process for votes. This pseudo-vote is not following our process. What else will Svartava2 do that ignores our established processes? This whole exercise reads to me as disrespect for the Wiktionary community.

By not following the process, this pseudo-vote has not been publicized, and does not last the indicated one month. Proper editor notification and proper timelines are even more important when it comes to our infrastructure, such as the permissions system.

I cannot support this kind of procedural abuse. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:49, 27 October 2021 (UTC)[reply]

@Eirikr: If you saw this as a non-binding poll, as I do, what would be your opinion about the proposal? P U C – 23:15, 27 October 2021 (UTC)[reply]

@PUC: It says right there at the top that this is a vote. This has many of the trappings of a formal binding procedural vote, without following the actual procedure for such a vote. I cannot view this as just a straw poll.

That aside, even as a poll, I think this is poorly constructed. It's not clear why we need either option. It's not clear why we wouldn't just censure the users causing trouble. This seems like an ill-defined solution in search of a problem. ‑‑ Eiríkr Útlendi │^{Tala við mig} 00:20, 28 October 2021 (UTC)[reply]

Some may depend on a vote appearing in the vote box, which appears on the watchlist. Some valuable contributors may not have the time to waste on long-winded BP discussions. We often have votes when someone makes (or proposes to make) a change which some think is not widely supported and which has implications for multiple entries. If this isn't a vote, it should not purport, as in the heading, to be a vote. If it is a vote, we have a framework for such things: discussion first, possibly a poll, then a vote. I, for one, would really like to hear from a broader group of contributors, including those who are active patrollers, but I don't want to waste their time on whatever this is: proposal or poll or vote. DCDuring (talk) 01:03, 28 October 2021 (UTC)

Abstain

Decision

This is NOT A VALID VOTE. If we want a vote, we follow the rules, make a vote page etc. DCDuring (talk) 15:35, 27 October 2021 (UTC)[reply]

@DCDuring: This would of course lead to the implementation of the option which has consensus and is valid. Taking important decisions on BP is precedented. Voting at BP is also precedented; see wiktionary:Beer_parlour/2021/February#Splitting_WT:RFVN and the creation of WT:Requests for verification/CJK. Svartava2 (talk) 16:59, 27 October 2021 (UTC)

The better call would have been to present the idea and ask for input, instead of setting up a vote here. —Μετάknowledge^{discuss/deeds} 19:33, 27 October 2021 (UTC)[reply]

I don't think there's clear-enough guidance on when to create votes and when simple discussions will suffice. Sometimes votes are created when unnecessary (the plus-templates and Prakrit-lects votes are prime examples), while at other times no vote takes place when some editors argue it should. In this case, we're voting on whether to add (not change) something to the existing state of things on a matter that does not call for a policy decision (i.e. we don't need to edit anything in our guidelines for this addition to work). I'd say a BP vote is enough for most such cases. If only there were a body with the power of judicial review to ordain whether or not a vote is required on a case-by-case basis... Imetsia (talk) 19:47, 27 October 2021 (UTC)[reply]

@Imetsia, this is a proposed decision affecting our permissions structure, affecting what our userbase can and cannot do. Any change to the permissions system has, by its very nature, a very broad impact. This is not a trivial change. As a non-trivial change, this proposal should follow the established process for gaining consensus as broadly as possible. At present, that process is spelled out at Wiktionary:Voting policy. ‑‑ Eiríkr Útlendi │^{Tala við mig} 20:49, 27 October 2021 (UTC)

We've created a new role before without any formal vote: Wiktionary:Beer parlour/2018/November § Mover role. Admittedly the impact was more limited than that of the current proposal, but still. P U C – 23:15, 27 October 2021 (UTC)[reply]

Vote cancelled. Some people are opposing simply because this vote is on the beer parlour and not on a separate page. I will create a proper vote today of 20 days, which will have only the second option since it looks like this is the one preferred. Svartava2 (talk) 06:13, 28 October 2021 (UTC)[reply]
The proper vote is here. Svartava2 (talk) 16:06, 28 October 2021 (UTC)[reply]
This vote still makes no change in policy. It's like voting to bell the cat, aspirational but not productive. Vox Sciurorum (talk) 16:11, 28 October 2021 (UTC)[reply]
Now if that vote passes, somebody will implement that. Svartava2 (talk) 16:22, 28 October 2021 (UTC)[reply]
@Svartava2: I would suggest delaying the vote for a week or two. There's no rush. Let people emit ideas, objections here or on the vote talkpage, etc. P U C – 18:33, 28 October 2021 (UTC)[reply]
I actually thought the exact same thing before creating it―can/should a vote start right after its creation? But since this idea seems to have consensus, I started it anyway. In any case, I created this vote just for the people who were not accepting its legitimacy because it was on BP. Now that the vote has started and votes have been cast, I think it's best to keep it going. Svartava2 (talk) 19:57, 28 October 2021 (UTC)

How to deal with this case? (pitch-accent paradigms in Lithuanian)

I was editing the entry abatija (“abbey”), and ran into an issue. This noun has two alternative accent paradigms, abãtija (1) and abatijà (2). In total, there are 6 forms with 3 distinct pronunciations that would be written abatija:

abãtija - nominative, paradigm 1 (lemma)
abãtija - instrumental, paradigm 1
abãtija - vocative, paradigm 1
abatijà - nominative, paradigm 2 (lemma)
abatijà - instrumental, paradigm 2
abatìja - vocative, paradigm 2

Currently, I think all but the last are indicated properly, since they appear under a bold headword that lists the alternatives of "abãtija" or "abatijà". I'm not sure how to indicate the "abatìja" pronunciation correctly. Would it be proper to create a new ===Noun=== heading just for that pronunciation? These things are a bit of a mystery to me. 70.175.192.217 16:19, 26 October 2021 (UTC)[reply]

That depends on what you want to do with it. For example, if you want to give a pronunciation section for that form, it's a good idea to split the section into "Pronunciation 1" and "Pronunciation 2" (compare Afar awka, even though the issue is different there) or "Etymology 1" and "Etymology 2" (compare Finnish napsaa). If you don't, you could just ignore it and leave it to the reader to find the accentuation in the inflection section. Thadh (talk) 23:46, 27 October 2021 (UTC)[reply]

I'm thinking of making more navbox-style templates for topics. Thoughts?

In the style of {{table:colors}}, I'd like to add more templates that navigate between sequences or tightly-bound lists of entries. By “tightly-bound”, I mean fairly manageable and non-arbitrary sets of terms like states of the United States, rather than broad topics that would fill up a huge navbox like emotions. E.g. to start with, I'm thinking of times of day with the sequence being something like:

midnight
daytime
dawn/twilight
daybreak/sunrise
morning
midday/noon
lunchtime
afternoon
teatime
dinnertime
sunset/dusk
evening/twilight
suppertime
nighttime

Thoughts on the general idea of having more navboxes in See also sections or on the specific examples I've given here? —Justin (koavf)❤T☮C☺M☯ 03:23, 27 October 2021 (UTC)[reply]

I support this. Highly valuable for anyone learning English (and any other language, if you're able to implement it more broadly, which would be great). Andrew Sheedy (talk) 04:44, 27 October 2021 (UTC)[reply]

I support this too, and for other languages as well, per Andrew Sheedy. Plus it can help spot gaps in our coverage. P U C – 12:01, 27 October 2021 (UTC)[reply]

I'd support, per all the above. Imetsia (talk) 15:26, 27 October 2021 (UTC)[reply]

I support this. I've seen the color table many times and I've always enjoyed having it available. You can ping me when they're ready and I will translate them into my languages. Fytcha (talk) 22:45, 27 October 2021 (UTC)[reply]

I’d support for things that are clearly non-arbitrary sets like states of the USA, but oppose for the example given of times of day. The identification and semantic division of times of day is highly culturally dependent and so poorly suited for a one-size-fits-all list; individual languages should have individual, manually customized lists in such cases, not a common template. Shoehorning extremely culturally specific notions like ‘teatime’ into a universal list of times of day in all languages is a bad idea. This is also a major problem with the color table template, which, for many languages, provides an extremely misleading picture of how those languages actually conceptualize and divide up the color space. — Vorziblix (talk · contribs) 20:46, 29 October 2021 (UTC)[reply]

@Vorziblix: Do you like the idea of language-specific versions of that time-of-day template? I.e. not a translated table like the color one but just one-off templates? Do you think that would be useful? —Justin (koavf)❤T☮C☺M☯ 04:44, 30 October 2021 (UTC)[reply]

I think language-specific versions would be good (for example, the times of the day in Spanish or Portuguese would need to include madrugada), but it would also be good to indicate variations within a table for a given language. For instance, "teatime" could be put in parentheses or marked with a qualifier. Likewise, you could list two variations in a single box in the table: "dinnertime (US), suppertime (Canada)", for instance (obviously, adapted to cover all English-speaking countries). I do think it's valuable to have tables like these, even if some elements are variable. Andrew Sheedy (talk) 16:40, 30 October 2021 (UTC)[reply]

@Koavf: Yes, I’d support one-off language-specific lists, with items chosen as appropriate to each particular language. I do think lists are helpful in that form. (In fact, for the color-table template, I made a couple such one-off versions for languages where the original was wholly unsuitable, e.g. Template:table:colors/egy. Unfortunately there are many more languages that still use the common table despite its unsuitability for them.) I also agree with @Andrew Sheedy that including variations within a particular language is a good idea; in many cases we already do this, albeit quite messily (e.g. at Template:list:Gregorian calendar months/sh/Latn). — Vorziblix (talk · contribs) 13:49, 1 November 2021 (UTC)[reply]

Example

I made template:table:USA, template:table:USA/en, and inserted it into Indiana. I guess now that I'm thinking about it, the trick is getting alphabetical order for different languages... Seems tricky. Any thoughts? —Justin (koavf)❤T☮C☺M☯ 05:22, 30 October 2021 (UTC)[reply]

@Andrew Sheedy: @Imetsia: @Fytcha: @Vorziblix: For visibility. —Justin (koavf)❤T☮C☺M☯ 05:23, 30 October 2021 (UTC)[reply]

It seems like most of these navigational tables are of two kinds: those with a clear sequence (temporal: days of the week, months of the year, zodiac symbols; or spatial: solar system) and those with a more-or-less arbitrary sequence (card suits, chess pieces [these could start with the lowest value and move to the highest, but the table does the opposite now]). None of them are alphabetical. The only way to really arrange U.S. states like this is by admission to the Union, but that is not a very intuitive listing. —Justin (koavf)❤T☮C☺M☯ 05:57, 30 October 2021 (UTC)

Looks good. I'm not sure how you would automate an alphabetization for other languages, since templates aren't really my thing. Andrew Sheedy (talk) 16:32, 30 October 2021 (UTC)[reply]

I agree, looks good. To alphabetize them you’d probably need to rewrite the template in Lua. Unfortunately there’s no simple solution using just MediaWiki parser functions or the like (that I’m aware of, at least). — Vorziblix (talk · contribs) 13:49, 1 November 2021 (UTC)[reply]
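As a rough illustration of the alphabetization problem mentioned above, here is a minimal sketch in Python rather than Lua, purely to show the sorting logic. The function names and the sample data are hypothetical, and the collation key (casefold plus stripping combining marks) is only an approximation; a real module would presumably need per-language collation rules.

```python
import unicodedata


def sort_key(name):
    """Rough collation key: casefold, then drop combining marks (assumption:
    good enough for illustration, not a real per-language collation)."""
    decomposed = unicodedata.normalize("NFD", name.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def alphabetize(names_by_code):
    """Return the localized names ordered by the rough collation key."""
    return sorted(names_by_code.values(), key=sort_key)


# Hypothetical sample data: state codes mapped to their French names.
french = {
    "PA": "Pennsylvanie",
    "HI": "Hawaï",
    "FL": "Floride",
    "CA": "Californie",
}
print(alphabetize(french))
```

The point of the key function is that a plain code-point sort would place accented initials after "Z"; stripping the marks first avoids that, at the cost of ignoring languages where accented letters sort as separate letters.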

Heads up that I propagated this template to all entries per the above encouragement. Alphabetizing for other languages is still a big consideration that I'm not competent to bother with. I'll continue making more of these and post them for feedback. —Justin (koavf)❤T☮C☺M☯ 23:20, 14 November 2021 (UTC)[reply]

What is Wiktionary:Requested_entries_(Swiss_German)?

I thought Swiss German is treated as a part of Alemannic German, which already has Wiktionary:Requested_entries_(Alemannic_German). Is it a remnant from the time Swiss German was treated as a separate language? Fytcha (talk) 22:40, 27 October 2021 (UTC)[reply]

@Fytcha: Once merged into Wiktionary:Requested entries (Alemannic German), I think it can be speedily deleted. —Μετάknowledge^{discuss/deeds} 18:28, 29 October 2021 (UTC)[reply]

@Metaknowledge: Done. Fytcha (talk) 18:44, 29 October 2021 (UTC)[reply]

Template idea: `{{der?}}`

Having just done some maintenance in a language that I don't know, I was asking myself why there isn't a template exactly like {{der}} only that it places the entries in a hidden category (similar to how etyl does it) so that they can be checked and replaced with {{bor}}/{{inh}} by people that maintain that language. Fytcha (talk) 16:56, 29 October 2021 (UTC)[reply]

@Fytcha: We have {{etystub}} and {{rfe}} for that. In this case, I'm pretty sure it's an inheritance, but in other cases you could write something like "Ultimately from" and then {{etystub}}. Thadh (talk) 17:24, 29 October 2021 (UTC)[reply]

@Thadh: They both produce visible text in the article however. I'm not sure if I'm comfortable deploying this en masse. There are lots of articles using {{der}} as the first derivational step that need cleanup by somebody knowledgeable in the language. Fytcha (talk) 18:21, 29 October 2021 (UTC)[reply]

That is the problem with en-masse replacement of {{etyl}} by {{der}}... There's not much we can do about that except restoring {{etyl}} or adding these bulky templates. Thadh (talk) 18:38, 29 October 2021 (UTC)[reply]

A simple approach is to define {{der?}} in such a way that "{{der?|L1|L2|...}}" expands to "{{etyl|L2|L1}} {{m|L2|...}}". If the point is merely being able to find applications, just define {{der?}} as a synonym of {{derive}}, and use "What links here" + Show transclusions / Hide links on page Template:der?. --Lambiam 19:36, 29 October 2021 (UTC)[reply]
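The expansion described above can be sketched as a small string rewrite. This is only an illustration in Python of the argument reordering, not an actual template definition; the function name is hypothetical, and nested templates or named parameters are deliberately not handled.

```python
def expand_der_check(call):
    """Rewrite '{{der?|L1|L2|...}}' as '{{etyl|L2|L1}} {{m|L2|...}}'.

    Assumption: the call contains no nested templates, so splitting
    on '|' is safe.
    """
    text = call.strip()
    if not (text.startswith("{{der?|") and text.endswith("}}")):
        raise ValueError("not a {{der?}} call")
    parts = text[2:-2].split("|")
    if len(parts) < 3:
        raise ValueError("need at least two language codes")
    _name, l1, l2, *rest = parts
    etyl = "{{etyl|" + l2 + "|" + l1 + "}}"
    mention = "{{m|" + "|".join([l2, *rest]) + "}}"
    return etyl + " " + mention


print(expand_der_check("{{der?|en|fr|mot}}"))
```

Note that the two language codes swap places in the {{etyl}} call, which is the detail most easily got wrong when converting by hand.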

Someone who doesn't know how to view the documentation of templates might not understand the difference between {{der}} and {{der?}}. For translations, we have {{t}} vs {{t-check}}. Maybe something more like {{der-check}} or {{der-chk}}? This would also be useful for those working on {{etyl}} cleanup, so they don't have to choose between doing nothing and adding {{der}}. For that matter, it could be used to mass-replace all remaining instances of {{etyl}} in entries so we can finally deprecate the template for all languages. Chuck Entz (talk) 20:20, 29 October 2021 (UTC)[reply]

Good points, I agree that {{der-check}} would be a better name. Fytcha (talk) 20:27, 29 October 2021 (UTC)[reply]

Best way to nominate a list of pages for deletion

In playing around with the form creation/validation bot, I've identified 80 pages that contain only a Spanish form-of entry that either references a non-existent lemma/part of speech, or references a valid lemma that does not list the given form in its header. Here's the list. Can I just have the bot add a {{d}} tag to every page, or would that create a bunch of pings or otherwise complicate life for the mods? JeffDoozan (talk) 21:15, 29 October 2021 (UTC)
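The two checks described above can be sketched roughly as follows. This is not the bot's actual code; the function name, the data layout, and the sample entries (including the made-up form "hablaz") are hypothetical and only illustrate the logic.

```python
def find_orphan_forms(form_pages, lemma_forms):
    """Flag form pages whose lemma is missing or does not list the form.

    form_pages:  dict mapping a form page title to the lemma it references.
    lemma_forms: dict mapping each existing lemma to the set of forms it lists.
    """
    problems = []
    for form, lemma in sorted(form_pages.items()):
        if lemma not in lemma_forms:
            problems.append((form, "missing lemma"))
        elif form not in lemma_forms[lemma]:
            problems.append((form, "form not listed"))
    return problems


# Hypothetical sample data.
form_pages = {"írrita": "írrito", "hablo": "hablar", "hablaz": "hablar"}
lemma_forms = {"hablar": {"hablo", "hablas"}}
print(find_orphan_forms(form_pages, lemma_forms))
```

Keeping the reason alongside each title makes it easier to separate pages that merely need a lemma created from pages that are likely errors.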

@JeffDoozan Some of those should have entries like írrito for írrita, but otherwise I'd send them en masse to RFV or RFD respectively. I've seen that happen a few times. AG202 (talk) 03:29, 31 October 2021 (UTC)[reply]

No need for that, I've taken care of them all. I only created a few pages (including one stub, marucho); most were inflections of misspellings or entries that failed RFV/RFD. Thanks for posting it rather than tagging everything. Ultimateria (talk) 04:54, 1 November 2021 (UTC)[reply]

Deletion of Wiktionary:Wanted entries

I'm here to announce that this project has failed RFD and I'm in the process of exporting the links to the appropriate Category:Requested entries pages before deleting, and I'll store the remaining links in a userpage. You've probably already noticed the banner missing from Recent changes and your Watchlist; this is why. Ultimateria (talk) 01:53, 1 November 2021 (UTC)[reply]

Unfortunately WT:REE was already a very large page and this has made it much larger. Should we split it into 26 pages by first letter now? People add far more words than they define or remove. Equinox ◑ 21:06, 7 November 2021 (UTC)[reply]
