Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

Beer parlour archives
2002
December
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019


September 2019

Unified approach for Korean hanja entries

Using the entries at 주 (ju) and 수 (su) as examples:

  1. Should we have separate etymology sections for every hanja in hangeul entries? Many of these are only used as affixes rather than unbound morphemes, and some entries such as 이 (i) can be assigned to as many as 250 hanja.
  2. Would it be better to set a criterion, e.g. only create individual etymology sections at hangeul entries for basic hanja or for those that have entries in major Korean dictionaries?
  3. Where would Sino-Korean compounds be listed to prevent duplication of content? At the hangeul entries or the hanja entries? KevinUp (talk) 00:28, 1 September 2019 (UTC)
My answers are that:
  1. Too many separate etymology sections for every hanja in hangeul entries are redundant.
  2. Basic Hanja for educational use should suffice. Others are much more rarely used.
  3. Duplicating Sino-Korean compounds at the hangeul and the hanja entries is even more important if Chinese, Japanese, and/or Vietnamese share the same scripts in compounds, like 日本 (Japan).--Jusjih (talk) 23:34, 17 October 2019 (UTC)

Merge Middle Korean hanja and modern Korean hanja

Modern Korean dictionaries do not distinguish between Middle Korean hanja and modern Korean hanja. Using the entry at 顋#Korean as an example:

  1. Shall we merge Middle Korean hanja and modern Korean hanja under a unified Korean header using the format of ?
  2. Is the {{hanja form of}} template suitable for the definition line of such entries?

Note that hanja is used more frequently in Middle Korean literature compared to modern Korean literature, but readings are only available in modern Korean because they are not explicitly stated in Middle Korean literature.

Please state here if you oppose a unified approach for Korean hanja entries. KevinUp (talk) 00:28, 1 September 2019 (UTC)

Your example may not be a very good one to merge. I speak Korean only at a very basic level, so I advise asking native Korean speakers who know hanja.--Jusjih (talk) 23:37, 17 October 2019 (UTC)

Article layout revisited

Previous discussion: Wiktionary:Beer parlour/2018/November#confusing article layout, Wiktionary:Beer parlour/2016/November#Rethinking the approach to the presentation of senses

As of 2019, what are the community's thoughts on an approach similar to User:Wyang/zh-def?

I like the distinct background color which makes definitions easier to find. Some languages (not all) may benefit from a single "definitions" header.

Currently, Chinese Han character entries, which use a single "definitions" header, do not indicate whether a particular definition is a "noun", "verb", "particle", etc. and would benefit from proper categorization.

Comments are welcome. KevinUp (talk) 03:20, 1 September 2019 (UTC)

I support the layout 100%. I don't support putting everything on a page into 1 template. DTLHS (talk) 03:24, 1 September 2019 (UTC)
Putting everything on a page into one template - This would affect only the definitions. Other templates can still be used within this "definitions" template. KevinUp (talk) 07:10, 1 September 2019 (UTC)
I generally like the layout or at least an approach that is more beautiful and I also like having data structured in templates. I do not like expanding the width 100% (e.g. what happens with pictures or other media?) and having things collapsed--this is not accessible to users. —Justin (koavf)TCM 04:12, 1 September 2019 (UTC)
Yes, the collapsible approach is perhaps not that practical. Some of us might be looking for something specific and collapsing everything would cause some information to be hidden when CTRL+F is used. KevinUp (talk) 07:10, 1 September 2019 (UTC)
Not handy for search and basic display but also not useful for users who have scripts disabled or who use screen readers/text browsers or who have certain sensory motor issues that make tapping on a million links to display content on a page a real chore. —Justin (koavf)TCM 07:15, 1 September 2019 (UTC)
Well, we could apply visibility options such as "Show derived terms", "Show quotations" similar to what we currently have on the desktop site. KevinUp (talk) 08:02, 1 September 2019 (UTC)
Sure, but I am opposed to all of the collapsing content that we have now for the same reason. To be sure, entries like set or a are going to be long: that's the nature of those sorts of entries. Making things inaccessible with collapsing content (even for Finnish declensions that I am never going to look at, let alone understand) is just bad practice. JavaScript is great but it shouldn't be mandatory for interacting with basic text like this. —Justin (koavf)TCM 08:56, 1 September 2019 (UTC)
@Koavf: Is the collapsible content inaccessible without JavaScript? My impression is that it only disappears when the JavaScript code runs. — Eru·tuon 16:19, 1 September 2019 (UTC)
@Erutuon: Turn off scripts and everything is expanded by default (which is good). Non-script users will have no problem seeing this content. —Justin (koavf)TCM 17:27, 1 September 2019 (UTC)
@Koavf: Hmm, I thought you were saying that users without JavaScript wouldn't be able to see collapsible content; maybe I misread you. I think collapsible content is collapsed by default for new visitors. What if it were expanded? Then users who have difficulty with the buttons wouldn't have to click anything to see content, but would if they wanted to be able to scroll more quickly. Perhaps it would be optimal if various categories of content were shown or hidden based on which state would lead to less clicking, but I don't know how to get that information. — Eru·tuon 17:56, 1 September 2019 (UTC)
You did not: I was just sloppy. Basic functionality shouldn't be based on scripts unless it's really something dynamic. The site we have doesn't include interactive elements like a game or anything that really needs to change state in front of someone's eyes or based on his inputs: it's a reference work made up of text with some accompanying media. Scripts just to collapse things that are a mild nuisance to scroll past are just a bad idea. It's generally easier to hit "Page Down" or smash the space bar a couple of times (these don't require very fine motor skills) to go past something you don't care about than it is to tab over to the little arrow that will expand the box or click on it. Finding data would be difficult and informative but I would still be in favor of not hiding anything that is the actual content of the dictionary (but I'm fine with the option of allowing it to be collapsed based on user interaction or preferences--unfortunately, our "expand all declension tables" preferences don't stick around at the moment.) —Justin (koavf)TCM 18:21, 1 September 2019 (UTC)
I would also prefer for lists (not tables) to be expanded by default with an option to hide them if one wishes to do so. KevinUp (talk) 18:28, 1 September 2019 (UTC)
@Koavf: When you click the "show x" or "hide x" buttons in the "Visibility" menu in the sidebar, the resulting state is saved in your browser. It's not saved on a per-user basis though; do you mean that the state changes when you switch between browsers? — Eru·tuon 20:02, 1 September 2019 (UTC)
@Erutuon: No, using the same browser, it eventually goes away as a preference. It would be better if it were an actual user preference. —Justin (koavf)TCM 20:32, 1 September 2019 (UTC)
@Koavf: That sounds like a bug. The setting is saved in localStorage (source code in MediaWiki:Gadget-VisibilityToggles.js), so it shouldn't go away. I am not sure how to add it in Special:Preferences if that's what you mean. One difficulty with having a checkbox for each category of visibility toggle is that there isn't a set number of categories (synonyms, translations, inflection, derived terms, etc.); they are generated based on section headers or the contents of HTML tags in the parser output. (In MediaWiki:Gadget-defaultVisibilityToggles.js, the category is the first argument to window.VisibilityToggles.register.) — Eru·tuon 20:49, 1 September 2019 (UTC)
Disgusting. Hard no. --{{victar|talk}} 18:06, 1 September 2019 (UTC)
Could potentially go for something like this. It's hard to judge from a Chinese entry since I don't understand that language. We would also need to be careful about what we hide/collapse by default and what we don't (and possibly tie that into individual user settings). Oh yes, and I agree with whoever made a fuss about JavaScript-less users. It should remain readable in Lynx etc. (it doesn't have to be beautiful, as long as we show all the content to those clients rather than some unusable JS placeholder). Equinox 10:00, 5 September 2019 (UTC)
Additional point I just remembered: quite a large number of people are colour-blind (in one way or another) and it's hard to find sets of colours that will suit all those different colour-blindnesses. With graphs and charts, you can ameliorate this by using texture (red spots, blue stripes, green crosses), but with text you can't do a lot. So we shouldn't rely on colour alone to indicate anything: it should only be a bonus hint, and also needs to have strong contrast with the background. Equinox 11:55, 5 September 2019 (UTC)
I do like the look of this; I would like to see an expanded version which would demonstrate how other key parts of an entry would be handled (e.g. etymology, translations). While I don't like using a single uber-template for this sort of thing, the benefits of having the data in the entry organized in a machine-readable manner may outweigh the costs of such a method. - TheDaveRoss 11:58, 5 September 2019 (UTC)

Code for comparison

New code
{{zh-def
|n|[[sugar]]
|syn: 食糖
|ant: 鹽
|x1: {{zh-x|糖尿病|[[diabetes]]}}
|x2: {{zh-x|糖{tong4}水|[[sugar water]]|C}}
|-
|n|[[candy]]; [[sweets]]
|mw: m:塊-“piece”,c:嚿-“piece”
|syn: 糖果
|x1: {{zh-x|棒棒糖|lollipop|C}}
|x2: {{zh-x|糖 食 得 多 冇益。|Eating too much '''candy''' is unhealthy.|C}}
|-
|n|{{zh-alt-form|醣|[[saccharide]]}}
|lb: organic chemistry
|x1: {{zh-x|多糖|polysaccharide}}
}}
Current code
# [[sugar]]
#: {{zh-x|糖尿病|[[diabetes]]}}
#: {{zh-x|糖{tong4}水|[[sugar water]]|C}}
# [[candy]]; [[sweets]] {{zh-mw|m:塊|c:嚿}}
#: {{zh-x|棒棒糖{tong4-2}|lollipop|C}}
#: {{zh-x|糖 食 得 多 冇益。|Eating too much '''candy''' is unhealthy.|C}}
# {{lb|zh|organic chemistry}} {{zh-alt-form|醣|[[saccharide]]}}
#: {{zh-x|多糖|polysaccharide}}

====Synonyms====
* {{sense|sugar}} {{zh-l|食糖}}
* {{sense|candy}} {{zh-l|糖果}}

====Antonyms====
* {{sense|sugar}} {{zh-l|鹽}}

I copied the code from User:Wyang/zh-def#Code so that other users can comment on the approach rather than the appearance.

Some languages (not all) may benefit from such a structure. KevinUp (talk) 18:28, 1 September 2019 (UTC)

I have a feeling this styling can be done with CSS and JS, rather than having to put so much load on Lua modules. —AryamanA (मुझसे बात करेंयोगदान) 01:35, 11 September 2019 (UTC)
I love it! Even if without hide/show, if everything is shown, it is great. I like the little buttons: Synonyms, Example... sarri.greek (talk) 09:36, 12 September 2019 (UTC)
Please don't do this. It will be a maintenance and (editor) usability nightmare. Individual templates are easier to understand, composable and potentially cacheable. The proposed solution nests templates and has parameters inside parameters, with its own syntax. Also, I don't understand the point of "this would affect only the definitions" – definitions make up the bulk of the dictionary. Instead of moving the data into templates we should be looking at moving data to a Wikibase instance (in the long term). Jberkel 17:52, 17 September 2019 (UTC)

Moving forward

If this is going to go anywhere at all, I feel that we need to put some work into creating several hundred examples (with complex entries) of the proposed format: pages with multiple etymologies, pages with multiple pronunciations, pages with a single sense, inflected entries. Otherwise it's impossible to see the edge cases and the potential amount of effort it will take. DTLHS (talk) 17:20, 17 September 2019 (UTC)

Is there any way to do this without using a module to do the heavy lifting? If not we should test very large entries as well since we run into Lua errors frequently and this will potentially exacerbate that issue. - TheDaveRoss 17:25, 17 September 2019 (UTC)
We don't know until we have actual examples to work with. DTLHS (talk) 17:57, 17 September 2019 (UTC)
I think that the Lua memory issue has gotten out of control. I'll point this out to meta:Community Tech when the 2020 version of meta:Community Wishlist Survey 2019 is available. KevinUp (talk) 18:24, 17 September 2019 (UTC)
Overall, the comments regarding this proposed format are positive. The colors will need to be tweaked and collapsible content expanded by default. Anyway, the closest example we have for the appearance of entries using this format can be found at entries such as かん (kan) and とうきょう (Tōkyō). This is just an example of how entries might look in future if we decide to implement such an approach. KevinUp (talk) 18:24, 17 September 2019 (UTC)
There's definitely a long way to go before this actually gets implemented. We could perhaps test this out with Chinese Han character entries, which have already replaced the part-of-speech headers with a single definitions header. (I would like to see more precise categorization of Category:Mandarin nouns, Category:Cantonese nouns, etc.) KevinUp (talk) 18:24, 17 September 2019 (UTC)
I am not talking about changing anything in the mainspace. You should create examples in your own user space. And especially you need to create examples with more than just Japanese and Chinese entries. DTLHS (talk) 18:27, 17 September 2019 (UTC)

Requesting language code for Middle Japanese

Previous discussion at Wiktionary:Beer parlour/2018/February#Middle Japanese, https://en.wiktionary.org/wiki/User_talk:Poketalker#Template_%7B%7Bbor%7Cja%7Cltc%7D%7D

Category:Japanese language currently lacks an ancestor, Middle Japanese, which can be further broken down into:

  1. Early Middle Japanese (800 to 1200 AD)
  2. Late Middle Japanese (1200 to 1600 AD)

This is because there are no ISO language codes for Middle Japanese. Therefore I would like to propose three new language codes for:

  1. Middle Japanese - ja-mid
  2. Early Middle Japanese - ja-mid-ear
  3. Late Middle Japanese - ja-mid-lat

By having these language codes we are able to create categories such as:

  1. Category:Middle Japanese terms with quotations
  2. Category:Middle Japanese reference templates
  3. Category:Early Middle Japanese terms borrowed from Middle Chinese
  4. Category:Late Middle Japanese terms borrowed from Early Mandarin

Technical considerations

These three languages can be designated as etymology-only languages because Middle Japanese is already merged with modern Japanese based on current practices. KevinUp (talk) 03:20, 1 September 2019 (UTC)

The etymology language codes can be created, but unfortunately categories starting with an etymology language name aren't supported. That is, templates can't categorize into them and there aren't category boilerplate templates for them. For instance, {{der}} only accepts an etymology language code as its second parameter (the language from which a term is derived), not first. Changing this would at least allow for more specificity in etymologies.
Middle Japanese can't be treated as an ancestor of Japanese if it is an etymology language with Japanese as its parent. It doesn't make sense for a language to descend from a subvariety of itself. (That sort of relationship makes Module:family tree crash with a stack overflow, and it breaks the link to further-back ancestors. I tested this by making grc-koi, Koine Greek, an ancestor of grc, Ancient Greek, and previewing some pages. In ἐπί, Ancient Greek was not seen as a descendant of Proto-Indo-European anymore. I suppose this would be fixed by giving the etymology language an ancestor, though.) It would make sense for Modern Japanese (an etymology language) to descend from Middle Japanese, though without a category for Modern Japanese terms inherited from Middle Japanese, this relationship would only be used in family trees, if Module:family tree would display it. — Eru·tuon 08:03, 1 September 2019 (UTC)
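(Illustration of the stack overflow mentioned above. A recursive ancestor walk never terminates when the data contains a cycle such as grc → grc-koi → grc. The sketch below is Python, not the actual Lua in Module:family tree; the ANCESTORS table and the ancestor_chain function are hypothetical names, and only the codes grc and grc-koi come from the discussion. It walks the chain iteratively and reports the cycle instead of looping forever.)

```python
# Hypothetical ancestor table: code -> its single direct ancestor (or None).
# This reproduces the test setup described above, where Koine Greek (grc-koi),
# a variety of Ancient Greek (grc), was made an ancestor of Ancient Greek.
# The real language data lives in Lua modules and is structured differently.
ANCESTORS = {
    "grc": "grc-koi",   # the experimental setting from the test
    "grc-koi": "grc",   # a variety resolves back to its parent language
}


def ancestor_chain(code, ancestors):
    """Return the ancestors of `code`, nearest first.

    Raises ValueError if the data contains a cycle, instead of
    looping (or recursing) without end.
    """
    chain = []
    seen = {code}
    current = ancestors.get(code)
    while current is not None:
        if current in seen:
            raise ValueError(f"ancestor cycle: {code} reaches {current} again")
        seen.add(current)
        chain.append(current)
        current = ancestors.get(current)
    return chain
```

With the table above, ancestor_chain("grc", ANCESTORS) raises ValueError as soon as the walk returns to grc. This also shows why the link to further-back ancestors is lost: the walk never gets past the cycle.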
Thank you for looking into this. If we can't create categories for etymology-only languages, I think (1) Middle Japanese will have to be designated as a full language code with Old Japanese as its ancestor and Japanese as its descendant. As for (2) Early Middle Japanese and (3) Late Middle Japanese, these two languages can be set as etymology-only languages with Middle Japanese as their ancestor.
Meanwhile, Category #3 and #4 above can be replaced by Category:Japanese terms derived from Middle Chinese and Category:Japanese terms derived from Mandarin (derived instead of borrowed and not that specific). KevinUp (talk) 18:28, 1 September 2019 (UTC)
At the moment, Middle Japanese can only have scripts if it is given a full language code. In any case, if it were an etymology language, its scripts couldn't be used anywhere but in etymology templates.
With Middle Japanese as an etymology language, Module:etymology would currently allow Category:Japanese terms derived from Middle Japanese (not to express an inheritance relationship, but the situation where one language borrowed from a second language, which borrowed from a subvariety of the first language), but would not allow Category:Japanese terms inherited from Middle Japanese, because it resolves an etymology language to its parent before checking that the first language can inherit from the second. So "Japanese inherited from Middle Japanese" is resolved to "Japanese inherited from Japanese", which the module objects to. If Middle Japanese is not made a full language, two ideas: allowing a term in one language to be inherited from a subvariety of the language, or allowing etymology languages in both positions of the derivation relationship (Category:Modern Japanese terms inherited from Middle Japanese instead, which makes more sense than Category:Japanese terms inherited from Middle Japanese). — Eru·tuon 21:23, 1 September 2019 (UTC)
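(Illustration of the resolution step described above. This is a Python sketch, not the Lua code of Module:etymology; the tables and the function names resolve and check_inherited are hypothetical, and it assumes ja-mid has been set up as an etymology-only code whose parent is ja. It shows why "Japanese inherited from Middle Japanese" is rejected once the source is resolved to its parent.)

```python
# Hypothetical data: etymology-only codes mapped to their parent full language.
ETYMOLOGY_PARENT = {"ja-mid": "ja"}  # assumption: Middle Japanese is etymology-only

# Hypothetical ancestor lists for full languages.
ANCESTORS = {"ja": ["ojp"]}  # Japanese descends from Old Japanese


def resolve(code):
    """Map an etymology-only code to its parent full language."""
    return ETYMOLOGY_PARENT.get(code, code)


def check_inherited(dest, source):
    """Resolve `source` first, then test whether `dest` can inherit from it."""
    resolved = resolve(source)
    if resolved == dest:
        raise ValueError(
            f"{dest} cannot inherit from itself ({source} resolved to {resolved})"
        )
    if resolved not in ANCESTORS.get(dest, []):
        raise ValueError(f"{resolved} is not an ancestor of {dest}")
    return True
```

Here check_inherited("ja", "ojp") passes, while check_inherited("ja", "ja-mid") fails because ja-mid has already become ja by the time ancestry is checked. Either idea above (allowing inheritance from a subvariety, or allowing etymology languages in both positions) would amount to changing when or whether resolve is applied.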
I moved your comment below up here in case you haven't read my reply above. I think Middle Japanese will have to be made a full language, like how Middle Chinese is made a full language to avoid the complications you mentioned above. KevinUp (talk) 21:36, 1 September 2019 (UTC)
mid is the code for Mandaic, so mid-anything is not appropriate for a code for anything but a variety of Mandaic.--Prosfilaes (talk) 19:25, 1 September 2019 (UTC)
Thanks for pointing this out. I've changed the proposed language code to ja-mid instead. KevinUp (talk) 21:36, 1 September 2019 (UTC)
  • Various thoughts.
  1. Will we also create a code for Early Modern Japanese? Broadly speaking, "modern Japanese" can be dated from around the mid-to-late-1800s with the fall of the Edo Shogunate and the rise of the Meiji, the opening of the country and the influx of foreign words and concepts, the repurposing of existing words for new meanings, and the deliberate forging of new vocabulary in an attempt to modernize and standardize the language.
  2. Do we really need to make these new codes into full-fledged, separate and distinct languages, with their own entries and template infrastructure and the like? This seems like the wrong way to work around what seems to be a minor technical issue with the etym inheritance implementation.
I'll hazard a guess to say that most of the entries that we might put in the proposed new "language" headings for Early and Late Middle Japanese would be duplicating content from our modern Japanese entries. The main differences come down to things like sense development (such as ありがとう (arigatō) shifting from "in a manner difficult to exist" to "in a manner difficult to bear" to "welcome" and then the modern "thanks" sense), phonetic realization (such as /je/ and /we/ merging ultimately into /e/) and conjugation patterns (like the 下二段 (shimo nidan) lower bigrade conjugation pattern flattening out into the 下一段 (shimo ichidan) or modern lower monograde pattern). I feel much more comfortable trying to explain all of this in the context of "Japanese", rather than duplicating entry data across multiple different language headings, especially as the older senses and sometimes even conjugations are still used. I'd also like to point out that monolingual sources treat Middle Japanese as a matter of footnotes and formatting within entries for the modern language, rather than as a distinct entity.
‑‑ Eiríkr Útlendi │Tala við mig 23:03, 3 September 2019 (UTC)
  1. @Eirikr: Yes, I think it would be a good idea to create a code for Early Modern Japanese called ja-ear set as an etym-only language with Category:Japanese language as its ancestor. By doing so we can have categories such as Category:Chinese terms borrowed from Early Modern Japanese.
  2. The early and late varieties (Early Middle Japanese, Late Middle Japanese, Early Modern Japanese) will not be having their own entries and template infrastructure because these languages will only be used in the etymology section to display statements such as "From Early Modern Japanese X, from Late Middle Japanese Y", etc. to reflect sound or spelling changes.
As for Middle Japanese, it will be made a full language so that we can use the language code in templates and quotations within the Japanese section. I agree that some of the older senses and conjugations are still used in written Japanese so it is not necessary to create a separate entry for Middle Japanese. Middle Japanese can be merged into Japanese like how monolingual dictionaries treat the language. {{ja-see}} can be used to redirect entries with archaic spelling to the modern spelling. KevinUp (talk) 22:27, 4 September 2019 (UTC)
Okay, so the proposal is to have Middle Japanese as a full language but with no entries of its own? At the moment that means that Middle Japanese links would go to the Middle Japanese section, not the Japanese section as intended. Perhaps Module:links could be made to direct Middle Japanese links to the Japanese section. It would complicate linking in other modules because they couldn't rely on the section name being the canonical name anymore. — Eru·tuon 02:16, 5 September 2019 (UTC)
Yes, the plan is to have Middle Japanese as a full language with no entries of its own, similar to how Middle Chinese is unified with Chinese. The linking problem is an issue for languages that use such an approach. For example, all the hanzi entries in Category:Cantonese nouns link to TERM#Cantonese rather than the correct form TERM#Chinese. One way to overcome the linking issue for Middle Japanese is to periodically search for the following:
  1. {{l|ja-mid|TERM}} → convert to {{ja-l|TERM}} {{q|Middle Japanese}}
  2. {{m|ja-mid|TERM}} → convert to {{ja-mid-inline|TERM}} (new template similar to {{okm-inline}})
  3. {{cog|ja-mid|TERM}} → convert to {{cog|ja-mid|-}} {{ja-l|TERM}}
  4. {{desc|ja-mid|TERM}} → convert to {{desc|ja-mid|-}} {{ja-l|TERM}}
This is, of course, an inefficient way to deal with this issue, but it is not uncommon to have links that link to nowhere. For example, I often click on Middle French links that only have a French section. I wonder if there's a way to identify links that already have a page but lack an entry in the target language so that false positives can be identified. KevinUp (talk) 03:15, 5 September 2019 (UTC)
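(Illustration of the four conversions listed above, as a Python sketch that a bot operator might adapt. The function names are hypothetical, {{ja-mid-inline}} is the proposed template and does not exist yet, and the pattern deliberately handles only the simplest case: one term and no further parameters.)

```python
import re

# Matches {{l|ja-mid|TERM}}, {{m|ja-mid|TERM}}, {{cog|ja-mid|TERM}} and
# {{desc|ja-mid|TERM}} with exactly one term and no other parameters.
# Calls with extra parameters (e.g. |tr=) do not match and are left alone.
PATTERN = re.compile(r"\{\{(l|m|cog|desc)\|ja-mid\|([^|{}]+)\}\}")


def convert(match):
    """Build the replacement wikitext for one matched template call."""
    name, term = match.group(1), match.group(2)
    if term == "-":
        # Already converted ({{cog|ja-mid|-}}): keep it unchanged.
        return match.group(0)
    if name == "l":
        return "{{ja-l|" + term + "}} {{q|Middle Japanese}}"
    if name == "m":
        return "{{ja-mid-inline|" + term + "}}"
    # cog and desc keep the language label but move the link to {{ja-l}}.
    return "{{" + name + "|ja-mid|-}} {{ja-l|" + term + "}}"


def convert_links(wikitext):
    """Apply the four conversions to every matching call in `wikitext`."""
    return PATTERN.sub(convert, wikitext)
```

Running convert_links twice gives the same result as running it once, so a periodic run would not keep rewriting pages that were already cleaned up.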
Yeah, actually Jberkel's "wanted" lists check for that. For instance, quite a few of the links in the Serbo-Croatian list go to pages that already exist. So that's good, it won't be too hard to clean up the links. — Eru·tuon 03:30, 5 September 2019 (UTC)

Practical considerations

Pinging also @Dine2016, Eirikr, Poketalker, Suzukaze-c, TAKASUGI Shinji to inform them about this proposal.
  1. Currently, we have quotes from Nippo Jisho (日葡辞書) which are written in Latin script. Shall we add Latin as one of the scripts for Middle Japanese along with the Japanese script?
  2. Any thoughts on adding entries into Category:Japanese terms inherited from Middle Japanese after the language code is available? KevinUp (talk) 18:28, 1 September 2019 (UTC)
    This would include pretty much everything that is not a modern coinage or borrowing. I'm not sure about the utility / usefulness / use case for this category. See my comment above about keeping this within the context of "Japanese". ‑‑ Eiríkr Útlendi │Tala við mig 23:03, 3 September 2019 (UTC)
    Yes, this category would include all terms that existed in pre-modern literature. Perhaps some other category such as Category:Middle Japanese terms borrowed from Middle Chinese would be more useful. Lemmas can be put into this category if quotations of the Sino-Japanese term can be found in Middle Japanese. KevinUp (talk) 22:27, 4 September 2019 (UTC)
3. What shall we do with the following entries?
  1. かはす#Middle Japanese
  2. かはる#Middle Japanese
  3. かふ#Middle Japanese
  4. かへす#Middle Japanese
  5. かへる#Middle Japanese
  6. かめ#Middle Japanese
Shall these entries be merged into Japanese? KevinUp (talk) 21:54, 3 September 2019 (UTC)
@Poketalker When you have the time, please take a look at these entries and merge them with the modern forms. KevinUp (talk) 22:27, 4 September 2019 (UTC)

Creation of language codes

@Erutuon, Eirikr Shall the following language codes be created?

| Language name | Proposed code | Remarks | Ancestor | Status |
|---|---|---|---|---|
| Middle Japanese | ja-mid | Full language | Category:Old Japanese language | |
| Early Middle Japanese | ja-mid-ear | Etymology only | Category:Middle Japanese language | Done |
| Late Middle Japanese | ja-mid-lat | Etymology only | Category:Middle Japanese language | Done |
| Early Modern Japanese | ja-ear | Etymology only | Category:Japanese language | Done |

KevinUp (talk) 06:34, 25 September 2019 (UTC)

Update: I've created the language codes for ja-mid-ear, ja-mid-lat and ja-ear. KevinUp (talk) 19:13, 27 September 2019 (UTC)

Category:User_la-5

This category was created by a single user who grossly overestimates their skill in the Latin language - they haven't managed to even correctly write the description, although they refer to themselves in it in the singular. There is no legitimate need for this category any more than there is a need for Category:User_en-5. I propose that it be deleted. Brutal Russian (talk) 13:48, 3 September 2019 (UTC)

I missed that; that’s really gross. The custom one on the author’s, Aearthrise’s, user page is likewise horrifying. He does not even inflect … Fay Freak (talk) 00:31, 4 September 2019 (UTC)
LOL, though. Mélange a trois (talk) 21:55, 4 September 2019 (UTC)
Was going to suggest the same thing, also the Category:User la-N should be deleted. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 04:22, 5 September 2019 (UTC)
@Brutal Russian, @Fay Freak, @Mélange a trois, @Holodwig21: You are incorrect about there not being inflections, monsieur. "Iste usuarius potest contribuere cum cognoscentiā professionalis de linguā romanā; id est forma imperialis, ecclesiastica, et mediævalis" translates, in the tradition of spoken Medieval Latin, as "this user can contribute with the knowledge of a professional of the Roman language; i.e. the imperial, ecclesiastical, and medieval form". My word choice is freer than "classical-only". That said, we should delete the category User_la-5, since, as with en-5, there is no need for it. Aearthrise (talk) 17:17, 25 September 2019 (UTC)
@Brutal Russian: And the la-5 category that does exist here I never used; it was a copy of the one on wikipedia Aearthrise (talk) 17:20, 25 September 2019 (UTC)
@Brutal Russian: Now the la-5 category is better written. Aearthrise (talk) 02:21, 26 September 2019 (UTC)
@Aearthrise: While it's somewhat more gooder than the previous version, it still doesn't quite make sense, translating to "These users can contribute about as well as in the speech of the Latin language profession". Brutal Russian (talk) 06:51, 26 September 2019 (UTC)

User_la-x category templates

(Notifying Fay Freak, JohnC5, Benwing2, Lambiam): @Urszag

Are all written in broken Latin.

  • The verb "contribuere" is not used in the intended sense - nor any other single verb to my knowledge;
  • "usor" in all category descriptions should read "usuarius" as in the template;
  • la-0: Hic usuarius aut nullam aut paucam linguam intellegere potest - should read "Hic usuarius aut nihil aut pauca latine intellegere potest";
  • la-2: "media latinitas" means "medieval Latin" - rephrase as "medius gradus", "satis bene...potest";
  • la-3: "callidissima latinitas" means "most ingenious" or "extremely cunning Latin" - rephrase as "probe ac latine";
  • la-4: the whole phrasing smacks of translationese, should probably say "latine loquuntur pariter~similiter ac/tamquam sermone patrio".

Could someone kindly direct me to the templates that should be edited? In addition, as far as I understand it's important that the phrasing reflect one's active knowledge of a language, and not just passive understanding - is this correct? In that case I'm planning to change the phrasing to "latine scit et scribere potest". My thinking is that due to the general lack of active Latin users people might be rather judging their reading skill - there are more la-3 tagged users than it-3 and ru-3, which I find difficult to believe (it says "speaks fluently" in the Russian template). If anyone has further translation suggestions, they'd be very welcome. Brutal Russian (talk) 14:57, 3 September 2019 (UTC)

Template:User la-0 DTLHS (talk) 14:59, 3 September 2019 (UTC)
And Template:User la-1, Template:User la-2, Template:User la-3, Template:User la-4, and Template:User la. (: Maybe they should be renamed la-I through la-IV. :) The first sense of contribuo at Gaffiot is “to bring in one’s share”, which is similar in meaning to “to contribute”.  --Lambiam 20:13, 3 September 2019 (UTC)
Thank you both. You can get a sense of using a word similar in meaning if you substitute contribute in the English description for any of its synonyms, e.g. "this user can bestow in simple Latin". Brutal Russian (talk) 21:29, 3 September 2019 (UTC)
  • contribuere Hm, on the internet? The English contribute, the German beitragen etc. arguably hadn’t this sense either before Stallman invented it; though I see that this stretches the Latin meaning more (simply put, Latin contribuere does not mean “to contribute”; in German it is more zuweisen, zuschlagen). What do you suggest? We should ponder more how to do GNU propaganda in Latin. But maybe as a Discord user you aren’t much into it.
  • Yes, usor isn’t even a word, except as a scanno for uxor or ūsūrīs etc.; see also Talk:proprietor. Ridiculous.
  • Indeed, media latinitas is Middle Latin aka Medieval Latin.
  • Yeah, the superlative callidissima is off, and evidently translates Romance.
  • Yeah, they tried to translate modern linguistic categories (“native speaker”, “natively” – how would a Roman say?)

When you have invented true Latin formulations, you should not fail to get Meta Wiki and the rest to fix their bad Latin. They have similarly bad phrasings, though not the same. I do not understand where the data is saved on Meta Wiki, possibly it is in software (can somebody find the texts?), but you find the texts displayed via meta:Category:User la. Fay Freak (talk) 22:35, 3 September 2019 (UTC)

  • It's not on Meta Wiki. If I remember correctly, they set up an independent organization that oversees Babel. Chuck Entz (talk) 03:01, 5 September 2019 (UTC)
Vicipædia defends usor as Neo-Latin.  --Lambiam 10:45, 4 September 2019 (UTC)
  • @Fay Freak I don't know whether there is an established word for it (even if a medieval one), I just know that contribuere is not that word - neither could I properly define what exactly it means, I just know it doesn't mean the same as its English or Romance look-alikes. Whatever the GNU propagandists come up with should at the very least be usable intransitively, for instance conferre. Btw, could you elaborate on why Discord users aren't supposed to be into it? =P
  • I've never come across a proficient speaker saying "native speaker" or discussing how to say it - I myself wouldn't outright censure nātīvus locūtor/ōrātor, but I wouldn't endorse it either - I'm not even sure which of the several options S&H give for "native" is the best one. For the time being, I don't think there's a need to use an equivalent to the English expression - the phrasing I suggest in the initial edit looks fine to me.
  • @Lambiam As a fellow Discord Latinist who writes excellent Latin says - and we don't agree in everything, but here we definitely do - if that's what it says on Vicipaedia, it's definitely wrong. They've managed to get the name of the whole language wrong, for Jupiter's sake (it's not Latina any more than it's Italiana, Española, Française etc). I think it remains so broken because people who see that it is aren't the same people who know how to fix it, and the former despair before even trying (that's true for me at least). I wouldn't be surprised if that page is usor's first attestation xD Brutal Russian (talk) 15:55, 4 September 2019 (UTC)
@Lambiam @Benwing2 @JohnC5 @Fay Freak @Urszag @Brutal Russian It's not incorrect to use contribuere, though its classical meaning of "paying a public expense" or "joining territory" is different than later meanings. This is an example of a medieval Latin meaning of contribuere: Traité de Savone 17 November 1394 ...et exercitus ipsius domini ducis contra illos contra quos dictus dominus dux guerram faciet et habebit, hoc modo videlicet quod ipsum commune et Saône teneantur dare et contribuere ipsi domino duci balistarios centum tantum... ("...and the army of the warlord against those whom said warlord may make and will have wars, in this way let it be known that that community and Saône must give and contribute to that warlord only 100 bowmen..."); another example is a French-Latin dictionary from 1750 - "Cic. Contribuer, donner, fournir, donner, apporter, attribuer, assigner, former une tribu de, mettre en une tribu.; Contribuere injuriam injuria Sen. Rendre injure pour injure...", it's the same as the dictionary's definition of conferre. Don't let your classicist zealousness cloud your ability to understand that Latin is a long-lived language that did change. Aearthrise (talk) 03:33, 26 September 2019 (UTC)
@Aearthrise Using contribuere to mean conferre is classical, albeit peculiar in a figurative way, seeing how far removed this meaning is from the primary one. Using contribuere intransitively is, as far as I can see, neither classically nor medievally correct. You will have noticed that I've even suggested an English approximation of the kind of mistranslation this leads to with "This user can bestow in simple English", illustrating how dictionary definitions and synonyms aren't reliable guides to correct usage. If it had been medievally correct but classically incorrect, I'm sure you wouldn't have suggested adopting the medieval usage over the classical one any more than regularly adopting habeō factum to mean fēcī, for precisely the same reasons that are too obvious to need voicing.
The Seneca citation in the dictionary you linked has been dialed via a Chinese telephone - the actual text is "an alterum alteri [sc. beneficium injuriae] contribuere et nihil negotii habere, ut beneficium iniuria tollatur, beneficio iniuria", the relevant part of which the latest Loeb translates as "or ought I to combine the two into one". This is what appears to be the word's most basic meaning, and it has nothing to do either with the meaning we're discussing or the one the French (mis)translation purports - not to mention being transitive. I suggest you use a dictionary at least a hundred years less ancient than this - preferably the Oxford Latin Dictionary (available at libgen) - and always check the citations yourself. Perhaps then you will find that my judgement of what is and isn't correct Latin hasn't in fact been clouded, but yours clarified. Brutal Russian (talk) 06:29, 26 September 2019 (UTC)
@Brutal Russian I thank you Brutal Russian; I appreciate your laying out the problems of the verb's usage, and the revision of the sources related to it. Aearthrise (talk) 01:42, 28 September 2019 (UTC)

A category for images?

I think it would be useful to have a category which shows which entries (in each language) include images. However I'm not sure if there is a way that can be found to automatically categorise them rather than adding a category manually. Anyway, let's see what the reaction to this proposal is like first. DonnanZ (talk) 13:36, 5 September 2019 (UTC)

Maybe. Why is it useful? Who will use it, for what? Equinox 13:48, 5 September 2019 (UTC)
That's what I want to find out. I for one would make use of it, even to find entries that don't have images and would be better with them. And we seem to have categories for virtually everything else. DonnanZ (talk) 13:57, 5 September 2019 (UTC)
But how would you use a category of images to find entries that don't have images? I think there is no technology to say "list all entries NOT in a category". Equinox 14:11, 5 September 2019 (UTC)
Technology, no. One could notice an entry missing from the category, investigate and possibly rectify the omission if a suitable image can be found on Commons. And if one can't be found, creating one from your own work is possible if you know where to find and photograph something suitable; I have been doing that lately. DonnanZ (talk) 14:20, 5 September 2019 (UTC)
Actually, there is a method to search for entries that lack a certain template or entries that are not in a certain category. Try this: KevinUp (talk) 15:36, 5 September 2019 (UTC)
I am not aware of any way to do this (automatically, it could easily be done asynchronously by analyzing database dumps) unless we started putting all images in templates. DTLHS (talk) 17:42, 5 September 2019 (UTC)
A template attached to the image entry is what I am thinking of, as long as it would allow access to the image itself. DonnanZ (talk) 18:39, 5 September 2019 (UTC)
To create a category for images such as Category:English terms with images, a template for images such as {{image|en|File:Carrots with stems.jpg}} will be needed. KevinUp (talk) 22:25, 5 September 2019 (UTC)
Hmm, not what I intended. I would like that to be superseded by the pagename if possible, e.g. {{image|PAGENAME|File:Carrots with stems.jpg}}. DonnanZ (talk) 23:15, 5 September 2019 (UTC)
Then we need to use templates, in place of using Mediawiki code. And the templates can never keep up with the capabilities of Mediawiki to display images. If you think that one way to embed images is the only one used here then you are mistaken. Given that words for plants denote both the plant and the fruit it often makes sense to have a gallery showing the fruit maybe in different processing stages, the plant from the near in different seasons (bearing fruits, or when yet only blossoms), the plant from afar. Example: بَلاذُر(balāḏur) which meant the marking-nut before the New World explorations and now means cashew, so it has two, slightly different to German Neugewürz with its descendants. That’s what I expect from a dictionary to tell me what a plant name means.
We also use {{multiple images}}, or mostly only Sgconlaw and I use that, for example on moccasin, for different purposes – I use it to have horizontal grouping if there is space to the left but not enough content in the bottom to show one image under another. Turkish ispinoz / اسپنوز‎: Male and female finch.
This {{multiple images}} is problematic though because it doesn’t let me change relative image sizes so I have to pick roughly fitting ones :/.
It is probably better to only have the category system (if it has any use) but let editors categorize manually with {{cln}}. Fay Freak (talk) 00:14, 6 September 2019 (UTC)
The purpose of a category for entries that had images would be what? To review the appropriateness and adequacy of images? One could use search to find them in groups by using searchbox searches like 'incategory:"English nouns" incategory:"English lemmas" insource:/\[\[([fF]ile|[iI]mage)\:/'. One could add searches for galleries and other display generators.
I would have thought that the main problem is finding entries that might need images and also fit with one's topical interest or skills.
One could always use, say, Category:Requests for images in English entries to find a few entries that need images. Intersecting that category with a topical category would narrow the search. One could look for items in such a category that also had {{comcatlite}} or {{commons}} to find some that would be easy to fill. One could also exclude pages that already had images using '-insource:/\[\[([fF]ile|[iI]mage)\:/'.
A problem with a category like 'English entries without images' is that the category is so large as to not be usable in doing intersection searches in the search box. Another is that it would miss large numbers of definitions that were missing images because the English L2 section already had one definition with an image.
I would think that using {{rfi|en}} and adding "topic=" tags (or equivalent) would enable targeted searches (with or without categorization) for definitions that needed images. DCDuring (talk) 00:41, 6 September 2019 (UTC)
  • I must admit I forgot about {{rfi}}, which I have been able to comply with once or twice, but this isn't added in every case where images are desirable. DonnanZ (talk) 09:12, 6 September 2019 (UTC)
I was thinking of Category:English terms with images as mentioned by KevinUp (above). I feel Category:English terms without images would end up being far too large, and therefore a definite non-starter. DonnanZ (talk) 12:46, 6 September 2019 (UTC)
You can find pages with images using the following searches: insource:/(File|Image):.*(jpg|jpeg|png)/. This can be combined with incategory keyword. (A duplication of DCDuring's information above, oh well.) --Dan Polansky (talk) 10:39, 6 September 2019 (UTC)
There is so much that an active contributor can do with CirrusSearch. I'm reading up on regular expressions to do fancier things. But the "basics" are quite powerful. See CirrusSearch Help. DCDuring (talk) 13:52, 6 September 2019 (UTC)
All this is great to know - we should include it in a Help page like Help:Advanced Wiktionary skills. --Mélange a trois (talk) 21:33, 6 September 2019 (UTC)
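The search patterns quoted above can also be tried offline, along the lines of the database-dump analysis DTLHS mentions. What follows is only a minimal Python sketch under stated assumptions: `SAMPLE_PAGES` is a hypothetical stand-in for page wikitext read from a dump, `Example.jpg` is a made-up filename, and the pattern only looks for `[[File:...]]` / `[[Image:...]]` links, so images placed via galleries or {{multiple images}} are not detected.

```python
import re

# Matches the start of a [[File:...]] or [[Image:...]] link, accepting an
# upper- or lower-case first letter, like the insource: pattern in the thread.
IMAGE_LINK = re.compile(r"\[\[\s*(?:[fF]ile|[iI]mage)\s*:")


def has_image(wikitext: str) -> bool:
    """Return True if the wikitext contains at least one file/image link."""
    return IMAGE_LINK.search(wikitext) is not None


def split_by_image(pages: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split page titles into (with_image, without_image), each sorted."""
    with_image = sorted(t for t, text in pages.items() if has_image(text))
    without_image = sorted(t for t, text in pages.items() if not has_image(text))
    return with_image, without_image


# Hypothetical sample standing in for pages read from a database dump.
SAMPLE_PAGES = {
    "carrot": "==English==\n[[File:Carrots with stems.jpg|thumb|Carrots]]\n# A vegetable.",
    "idea": "==English==\n# A thought.",
    "moccasin": "==English==\n[[image:Example.jpg|thumb]]\n# A shoe.",
}
```

Calling `split_by_image(SAMPLE_PAGES)` separates the titles whose text contains a file link from those that do not. A real run would also need to split each page by language section, since the check here is per page rather than per L2 section or per definition.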

Isekiri or Itsekiri?

Discussion moved from WT:Tea room/2019/September#Isekiri or Itsekiri?.

Module:languages/data3/i gives “Isekiri” as the main name for language code its. But “Itsekiri” is much more common.  --Lambiam 03:20, 7 September 2019 (UTC)

This is about the site as a whole, so I moved it from the Tea room, which is for discussing specific entries.
As for the topic itself, I notice that Wikipedia has w:Isekiri language as a redirect to w:Itsekiri language. Chuck Entz (talk) 03:56, 7 September 2019 (UTC)
I might add that its' spelling can lead to some head-scratching when used without quotes in sentences... ;p Chuck Entz (talk) 04:01, 7 September 2019 (UTC)
I support the change. @-sche, in case you want to weigh in. —Μετάknowledgediscuss/deeds 01:16, 8 September 2019 (UTC)
A look at Glottolog makes it seem like the 't'-less form has become more common in more recent works that are specifically about the language; however, in broader literature both exist, and in an Ngram Itsekiri is much more common, as Lambiam says. So, support. - -sche (discuss) 02:28, 15 September 2019 (UTC)

Replacing de-sysop votes with confirmation votes

Please read and comment on this proposal here: Wiktionary:Votes/2019-09/Replacing de-sysop votes with confirmation votes. —Μετάknowledgediscuss/deeds 03:56, 8 September 2019 (UTC)

Wikipedia Moss Project

Over at Wikipedia, some of you may be aware, there is a project called Moss which seeks to eliminate a pet hate of mine, tyops. A useful byproduct of this project is that it finds potential missing Wiktionary words. Below are around 100 such entries. The number at the beginning represents the number of occurrences in en.wikipedia. Enjoy and happy editing! --Mélange a trois (talk) 17:27, 8 September 2019 (UTC)

If only WP were valid attestation. This doesn't belong here. We could put it in WT:REE or a subpage thereof, an appendix, or a userpage [sic]. DCDuring (talk) 23:40, 8 September 2019 (UTC)
Technical, moth anatomy
Technical, AUS and NZ, ecology
Science fiction. Primarily Babylon 5 universe, but some other usage.
It would be cute/fun/useful? to have some automated bot-o-matic thingy that would connect that project with this one, along the lines of Wiktionary:Wanted entries. But don't go mad because a lot of them will be rubbish words. Equinox 23:59, 8 September 2019 (UTC)
Unsurprisingly, it's much easier to create long lists of "things that could be words" than to actually create entries. DTLHS (talk) 00:26, 9 September 2019 (UTC)
Very true, but to me that suggests that we just need a way to (permanently? or for some period of years) strike non-words from the record. I've noticed BTW that sometimes the very same word appears twice in WT:REE in successive years, perhaps because somebody forgot they'd added it before. Equinox 00:27, 9 September 2019 (UTC)
How about a list of failed requests, with explanation of why? Chuck Entz (talk) 01:50, 9 September 2019 (UTC)
Yes yes but once again we're suffering from not having any structure to our entries, just big lists and bullets and indents. If we have a list of "words to avoid" who's to say anyone will look at it? If we had a form where you filled in WORD + DEFINITION + SOURCES then we could validate it right off. Equinox 02:05, 9 September 2019 (UTC)
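The structured request form Equinox describes, combined with Chuck Entz's list of failed requests, could look roughly like the sketch below. Everything in it is hypothetical: the `EntryRequest` fields, the `validate` function and the `rejected` mapping are names made up for illustration, not an existing Wiktionary tool or data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryRequest:
    """One requested entry: WORD + DEFINITION + SOURCES (hypothetical form)."""
    word: str
    definition: str
    sources: tuple[str, ...]


def validate(request: EntryRequest, rejected: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the request passes.

    `rejected` maps lower-cased, previously failed words to the reason they
    failed (an assumed format for a 'list of failed requests').
    """
    problems = []
    word = request.word.strip()
    if not word:
        problems.append("missing word")
    if not request.definition.strip():
        problems.append("missing definition")
    if not any(source.strip() for source in request.sources):
        problems.append("missing sources")
    if word.lower() in rejected:
        problems.append(f"previously rejected: {rejected[word.lower()]}")
    return problems
```

A request with all three fields filled in and no match in `rejected` yields an empty list; a repeat of a word that already failed is flagged together with the recorded reason, which would also catch the same word being re-added in successive years.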

Inconsistent use of qualifiers in translations

I noticed e.g. here in the Swedish translations, that qualifiers sometimes come before the terms. TranslationAdder.js always inserts them after the term. I found no guidance in EL.

I would like this to be more consistent so that my input filler script can pick up the qualifiers consistently. I suggest we agree here and then instruct a bot to do the job of moving them right. WDYT?--So9q (talk) 08:39, 11 September 2019 (UTC)

I think that in this specific instance {{sense}} is more appropriate – which IMO should always come before the term. The use of {{sense}} in translations should of course be rare, because there ought to be a single sense per list, but occasionally a target language will have distinctions that are not present in English. I see though that other examples that I can think of also use {{qualifier}}; e.g. at sister we have “Turkish: abla (tr) (elder), ...”, while I feel “Turkish: (elder sister): abla (tr), ...” would be better.  --Lambiam 10:32, 11 September 2019 (UTC)
Another example where qualifier comes first (see German translations). There are also semicolons as separators instead of commas. What a mess.--So9q (talk) 13:45, 11 September 2019 (UTC)
You will not find any consistency here. Translation sections are a free for all. Trying to clean up translations has driven several users off of the project in frustration. DTLHS (talk) 15:27, 11 September 2019 (UTC)
Thanks for the warning. I just found Wiktionary:Translations and it contains nothing about qualifiers or senses to my surprise.
Two questions come to mind:
  • Will we need a vote or discussion about where to put the qualifier or can we agree that always putting it after the term is correct?
  • Can we agree to always use commas between terms and not e.g. semicolons, colons or full stops?--So9q (talk) 13:44, 13 September 2019 (UTC)
You would need a vote since most of the people who add translations probably aren't reading this. DTLHS (talk) 15:26, 13 September 2019 (UTC)
I went ahead creating a vote, see Wiktionary:Votes/pl-2019-09/Qualifiers after terms in translation section.--So9q (talk) 14:10, 26 September 2019 (UTC)
Shouldn’t it technically be distinguishable by whether it comes before or after commata?
It can also be more complicated. For stubble “short stalks left in a field after harvest” I could for some Slavic languages discern a distinction between the stubble itself and a field of stubble, which latter might likewise be sought and are more commonly used in the respective languages, and I added them inside of the qualifiers. Guess one needs AI for the translation tables.
I support more markup though, so at least those who know can do better. @DTLHS I am myself shocked how many wrong langcodes or uses of {{t}} outside of translation tables by me you had to correct. I believe it is facilitated by the different source code formatting, the fact that in the translation tables the language names are in plain text. If the name was fetched like with {{m+}} the former error would be impossible and the latter less likely because it arises from copying terms to reuse them and deleting the plain text language name but forgetting to replace the template name. The language name is fetched somewhere anyway for the section link, unlike with {{t-simple}}.
(On the other hand one adds more language names than there are codes. Sometimes Christian Palestinian Aramaic, Jewish Palestinian Aramaic, Jewish Babylonian Aramaic under “Aramaic”, example in bed) Fay Freak (talk) 16:12, 11 September 2019 (UTC)
I'm no tech maven, but further automation of the translation tables is likely to have a dramatic negative effect on entry load times for some of our larger entries, especially those with highly polysemous English terms. [[a]] and [[water]] come to mind, but there are others. Maybe someone with good tech foo can come up with Summer-of-Code projects that would help with these. DCDuring (talk) 19:05, 11 September 2019 (UTC)
Please keep the discussion on topic about the use of qualifiers in the translation section.--So9q (talk) 13:44, 13 September 2019 (UTC)
My understanding is that it has always been preferred that qualifiers be after the translations, but the Translation Adder used to insert them before the translation (I think because that was "easier" to code). I recall that there was a short thread somewhere which led to that behavior of the Translation Adder being fixed (it may have been Ruakh who fixed it). - -sche (discuss) 02:38, 15 September 2019 (UTC)
And I recently saw a user (forgotten the name) using either qual or gloss to give a literal English rendering of the linked foreign term! Equinox 02:42, 15 September 2019 (UTC)
Oh, I've seen (and done!) that in a few places. I think there are a few places where it's useful, like devil's beating his wife. - -sche (discuss) 02:47, 15 September 2019 (UTC)
FYI my ImprovedTranslationAdder.js now supports adding literal translations the correct way e.g: {{t+|af|jakkals trou met wolf se vrou|lit=jackal is marrying wolf's wife}}
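The bot job proposed at the top of this thread (moving a qualifier that precedes a translation so that it follows it) could be sketched as below. This is a rough Python illustration under assumptions, not an existing bot: it assumes qualifiers are written with {{qualifier}}, {{qual}} or {{q}}, that translations use {{t}}, {{t+}}, {{tt}} or {{tt+}}, and that terms are separated by a colon, comma or semicolon. The Swedish sample line is made up for illustration.

```python
import re

# A qualifier template without nested templates, e.g. {{q|colloquial}}.
QUALIFIER = r"\{\{(?:qualifier|qual|q)\|[^{}]*\}\}"
# A translation template without nested templates, e.g. {{t+|sv|hoj}}.
TERM = r"\{\{tt?\+?\|[^{}]*\}\}"

# separator, qualifier, term; skipped when the term is already followed by a
# qualifier, because then it is unclear which term the leading one belongs to.
LEADING_QUALIFIER = re.compile(
    rf"([:,;]\s*)({QUALIFIER})\s*({TERM})(?!\s*{QUALIFIER})"
)


def move_qualifiers_after_terms(line: str) -> str:
    """Rewrite 'sep {{q|x}} {{t|..}}' as 'sep {{t|..}} {{q|x}}' in one line."""
    return LEADING_QUALIFIER.sub(r"\1\3 \2", line)


# Made-up example line; the qualifier in front of the first term is moved.
SAMPLE = "* Swedish: {{q|colloquial}} {{t+|sv|hoj}}, {{t|sv|cykel}} {{q|formal}}"
```

Requiring a separator in front of the qualifier is a deliberate choice: a qualifier that sits between two terms with no comma is left alone, as are the cases Lambiam raises where {{sense}} before the term is arguably the better markup. Those would still need a human decision.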

Should we include non-native audio pronunciations?

I came across a French user's audio file for the English word bicycle and removed it on sight as being unhelpful for English dictionary users, as well as nominating multiple files by the same user for deletion here. Two users including the uploader are arguing to keep the files, and I wanted some Wiktionary editors to weigh in. Maybe I'm wrong in wanting them deleted from Commons since I don't know their policy, but can we agree that there's no place for these in Wiktionary? Ultimateria (talk) 15:11, 11 September 2019 (UTC)

I also want to know if @Derbeth and @0x010C will choose to stop importing such recordings. Ultimateria (talk) 15:24, 11 September 2019 (UTC)
They might be kept, provided that they are sufficiently marked that they are not added automatically by bots. Who knows what one can use them for, maybe to illustrate articles about learning languages. They should not be included in the dictionary. There are enough native accents, it’s only noise. Fay Freak (talk) 16:18, 11 September 2019 (UTC)
User:Fay Freak brings up a possible tangential use of them but that would really be more appropriate at b: or v: (likely at b:fr: and v:fr, specifically) for interactive media teaching someone to speak English. I mean, there are already so many accents and lects of English let alone all of the hundreds of millions (billions?) of non-natives who have such wildly varying accents. I really don't see the value of this here since the goal is to show how a word is used by the community that uses that word. If a word gets adopted into another language, then use that pronunciation for that entry. —Justin (koavf)TCM 17:13, 11 September 2019 (UTC)
I am having a similar problem in trying to delete an incorrect Armenian pronunciation here. Maybe we should block the bots for importing incorrect pronunciations, until their owners learn to maintain blacklists. --Vahag (talk) 18:16, 11 September 2019 (UTC)
Yeah that’s what I mean: They can be kept on Commons as for example usable to illustrate language learning, or erroneous pronunciation, as on Wikiversity, but the files have to be marked in one way or the other so the bot can distinguish. Fay Freak (talk) 18:26, 11 September 2019 (UTC)
Some imports (including the bicycle one) are from Lingua Libre via Commons. Lingua Libre collects all sorts of audio, including non-native pronunciation. The recording metadata has a reference to the speaker (example), which includes language levels, so a bot could only import recordings made by native speakers. – Jberkel 18:35, 11 September 2019 (UTC)
I think not, since the majority of users are likely to want to know the standard (Inner Circle) versions. At least we should ensure we have those before we try for anything more exotic. Equinox 18:21, 11 September 2019 (UTC)
I think we should stop trying to treat Commons names as meaning anything. Just because they aren't useful for us doesn't mean anything for Commons.--Prosfilaes (talk) 00:58, 12 September 2019 (UTC)
I use LinguaLibre to record pronunciations, and admittedly I got a couple of them wrong. For one I even recorded a fart sound, just to see if anyone actually listens to them (they did, and quite quickly removed it from the site). As for non-native pronunciation, it seems obvious that it is not as good as, for example, my beautiful voice. Also, I'd love to hear each word spoken by 10 different accents - Liverpudlian, Scottish, South African, Deep South etc. Hmm, I wonder which word has the most audio pronunciations? I'm sure one of the more geeky users can find out. --Mélange a trois (talk) 09:46, 13 September 2019 (UTC)
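Jberkel's suggestion above, that a bot import only recordings made by native speakers, might be sketched like this. The data layout is an assumption made for illustration: the `Recording` class, its field names and the level value `"native"` are hypothetical and are not taken from the actual Lingua Libre or Commons metadata format.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """Hypothetical view of one audio file plus its speaker's language levels."""
    filename: str
    word: str
    language: str  # language code of the recorded word
    speaker_levels: dict[str, str] = field(default_factory=dict)


def native_only(recordings: list[Recording]) -> list[Recording]:
    """Keep recordings whose speaker is marked native in the word's language."""
    return [
        r for r in recordings
        if r.speaker_levels.get(r.language) == "native"
    ]
```

Recordings whose speaker has no level recorded for the language are dropped along with the non-native ones, which errs on the side of not importing. A blacklist of known-bad files, as Vahag suggests, would be a separate check.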

On the decline of Urban Dictionary

https://www.wired.com/story/urban-dictionary-20-years/

Not sure how much applies to this online crowd-sourced dictionary effort but it's worth thinking thru some of the problems with UD's methods and ensuring that we don't suffer the same fate. —Justin (koavf)TCM 20:42, 11 September 2019 (UTC)

"Where Oxford and Merriam-Webster erected walls around language, essentially controlling what words and expressions society deemed acceptable," really? I find very little value in this article and I don't think the author knows much about lexicography. I guess it points out (indirectly) that there is more to a dictionary than a 2D line between "descriptivism" and "prescriptivism": there are also "dictionaries" that simply invent vocabulary out of the ether ("inventionism"). DTLHS (talk) 20:53, 11 September 2019 (UTC)
Yeeaaahhhh (long drawn-out expression of dubiousness) ... The article author doesn't understand lexicography, and clearly doesn't distinguish between a list of memes, and a dictionary. UD is great if you want some idea of the current zeitgeist for a particular term, but it's useless as, well, a dictionary.
Ah, well. ‑‑ Eiríkr Útlendi │Tala við mig 21:11, 11 September 2019 (UTC)
I would guess the main reason for decline in sites with "user-generated content" is that now there are a lot more users and many of them are young children. A bit like what Eternal September did to Usenet. Equinox 21:19, 11 September 2019 (UTC)
It’s not like we couldn’t need a lot more users. Which decline? All runs fine until one tries to suppress information for politics, or for special rights. Fay Freak (talk) 21:56, 11 September 2019 (UTC)
"Which decline": well, politics aside, you don't think that Pewdiepie screaming and swearing his way through Minecraft, and the typical crop of YouTube comments, are a bit more inane than the quiet intelligent bloggers of the early 2000s? Equinox 22:04, 11 September 2019 (UTC)
No, I think that this is what is left after the legal trammels have grown ever heftier. Regulation everywhere, only some big players can calculate with it, and children who can’t care. If you are a blogger you will possibly drown in cease-and-desist letters because your privacy notice misses some trifle. It’s how at some time there were many producers of CPUs, now there are two – the laws have provided loopholes to eliminate competition, and the desire to become competitive. And in compliance with modern identity politics everyone is triggered by everything and tolerant towards addictions and degenerations thus tame, on top of coming out of schools more damaged than educated, which has always left majorities inane. Education always depended on incalculable details, and the cult of equality has stifled it. Everyone goes to school but learns nothing; everyone converses in cramped networks but may not tread on anyone’s toes; everyone may work but idlers stand at the door to have a share of it. That’s how you nurture the ugly. Fay Freak (talk) 23:41, 11 September 2019 (UTC)
  • FWIW my wife is a teacher and a damn good one, and I disagree with your characterization of "education" with such a broad brush. ‑‑ Eiríkr Útlendi │Tala við mig 21:15, 16 September 2019 (UTC)
  • This comment reeks of a desire to ignore the facts in favour of shoehorning one's ideological agenda in even when it's in opposition to the facts or not even wrong. The real reason for Pewdiepie being vapid when compared to bloggers is that the internet has become mainstream, with the inevitable results; this "mainstreaming" happens to all forms of new media. A decent portion of his audience is literally children; what do you expect? There's still plenty of intelligent content on the internet if you know where to look; it's just that you'd rather pontificate towards a brick wall than have to actually bother to engage with anything. Hazarasp (parlement · werkis) 10:24, 24 September 2019 (UTC)
Are you saying we need more users? I can't parse that. DTLHS (talk) 22:09, 11 September 2019 (UTC)
To answer a question that was not directed at me (pardon, Equinox): we need quality and quantity. A lot of one does not make the project that we want to make. —Justin (koavf)TCM 22:35, 11 September 2019 (UTC)
So they complain about what? That Urban Dictionary depicts harsh reality and does not censor enough or that it has too much joke content and does not censor enough?
Nothing to applaud there. The internet should have dumps, and there should be lawless zones. Urban Dictionary still often has the definition you need that could not pass elsewhere. Fay Freak (talk) 21:56, 11 September 2019 (UTC)
I'm pleased about the fact that the author didn't think Wiktionary significant enough to be worth a mention in his article. --Mélange a trois (talk) 09:40, 13 September 2019 (UTC)
I doubt the author knows what it is lol —AryamanA (मुझसे बात करेंयोगदान) 21:09, 15 September 2019 (UTC)

Clean up of templates for derived and related terms

Hi, I read the previous discussion at Wiktionary:Beer_parlour/2018/November#Titles_of_morphological_relations_templates more or less in its entirety. I suggest we do a thorough clean up of these templates and only keep the one(s) we all agreed on keeping and using.

To be more precise we currently have a whole host of templates in use in our main space:

  • {{col3}} and related ones
  • {{der-top}} and related ones
  • {{rel-top}} and related ones
  • {{Template:User:Donnanz/der3-u}} found here.

I would very much prefer to only have one template left when we are done if possible. All terms that should appear alike should be inserted using the same template with a few different parameters for e.g. title, number of columns, etc.

As an aside I got interested in this topic when browsing on my mobile with the default skin Vector on entries with a lot of derived terms (like rock) and where they were not collapsed by default (see picture of the rendering using the default mobile frontend skin minerva). Terrible to scroll this and apparently no way to collapse. WDYT?--So9q (talk) 09:05, 12 September 2019 (UTC)

Is that really Vector? It looks like the mobile site, which uses Minerva. You can check with mw.config.get("skin") in JavaScript. (Special:Preferences only controls the skin used in the desktop site.) The mobile site just doesn't run many of the collapsibility scripts, only NavFrame. I wouldn't feel confident working on it myself because I never use it. — Eru·tuon 04:37, 13 September 2019 (UTC)
Oh, you are probably right. I would really like a menu on mobile for easily changing skin. Do you know how I could inject JS or HTML to do that?--So9q (talk) 06:19, 13 September 2019 (UTC)
You can change the skin by adding the query string ?useskin=whatever to the URL (for instance, https://en.m.wiktionary.org/wiki/aer?useskin=vector), but other skins aren't designed for mobile so the page just looks a lot like the desktop site. — Eru·tuon 08:37, 13 September 2019 (UTC)
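For anyone scripting this, the `?useskin=` trick can be applied to arbitrary page URLs with a small helper. The following is a minimal Python sketch; `with_skin` is a hypothetical helper name, and it only rewrites the URL, it does not check that the skin exists.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skin(url: str, skin: str) -> str:
    """Return `url` with its useskin query parameter set to `skin`."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous useskin value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "useskin"
    ]
    query.append(("useskin", skin))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `with_skin("https://en.m.wiktionary.org/wiki/aer", "vector")` gives the URL quoted above, and an existing `useskin` value is replaced rather than duplicated.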
FYI I went ahead and submitted a request for deletion: .--So9q (talk) 22:16, 14 September 2019 (UTC)

Why do we mark race commonalities of English-language surnames?

Pretty much exactly that. It seems a bit strange, given the abstractness and variance of race. Starbeam2 (talk) 01:36, 13 September 2019 (UTC)

I assume you are referring to things like "Aggarwal is most common among Asian/Pacific Islander (94.32%) individuals."? That information was added by a particular user and I have seen other users support removing it, but no action has been taken. DTLHS (talk) 01:37, 13 September 2019 (UTC)
Don't know about surnames but I have wondered about names like Shaniqua, which are not seen outside of black American communities. Well that one has a usage note. Given names are usually chosen by a parent, who belongs to such-and-such a culture. Surnames are a bit different... Equinox 02:13, 13 September 2019 (UTC)
Sure but surnames are also very culture-bound usually. —Justin (koavf)TCM 02:22, 13 September 2019 (UTC)
So what can possibly be more "culture-bound" than given names that only black Americans use, like Shaniqua? I would wager money that no black Briton has that name. It's absolutely part of the culture. Equinox 02:27, 13 September 2019 (UTC)
No one is arguing otherwise. Certainly, the name Shaniqua is an African-American one. Not sure what your point is. Both "Moishe" and "Cantor" are Jewish names. Similarly, "Rodrigo" and "Hernandez" are both Hispanic. —Justin (koavf)TCM 02:29, 13 September 2019 (UTC)
My point is that your immediate parents usually choose your given name, but your surname is usually left alone, and persists for a long time. Saying "Shaniqua is a black name" is an observation about how black Americans tend to name their kids; saying "Goldstein is a Jewish name" is quite another matter: maybe my great-great-grandfather was the last Jew in the family. Equinox 02:32, 13 September 2019 (UTC)
For every "Goldstein" who has a very tenuous connection to the Jewish people, there are 10,000 "Jacob"s who have no relationship to the Jewish people. Personal names are much more likely to not be culture-bound/associated than surnames. —Justin (koavf)TCM 18:59, 14 September 2019 (UTC)
Because it tells you about how English language surnames are distributed, at least in the US. For all the problems with race, it's still a good proxy for ethnic groups in the US.--Prosfilaes (talk) 02:33, 13 September 2019 (UTC)
There's more than one ethnic group per race. Starbeam2 (talk) 19:41, 17 September 2019 (UTC)
I think it's useful having this information. It provides at least vague information about where the name came from, and can be useful to fiction writers who might be trying to find a name that suits a particular demographic. Andrew Sheedy (talk) 02:55, 13 September 2019 (UTC)
I understand the need for demonstrating association, but it does rub me the wrong way how obsessively it shows up. I admit surnames should mention their connotations on the page, but only if it's A) especially prominent and B) not repeating a general rule. Rule A is for names like Poindexter, which is associated with nerdy people, and Rule B means to exclude Steinberg for Jewish stereotypical names, since it demonstrates the -berg suffix used in stereotypical formations of "[Ashkenazic] Jewish names". Also, the US Census doesn't perfectly reflect how race is seen in the US: as Middle Eastern North African people (MENA), many Hispanics, many Portuguese-descended people, many Latinos, Sephardic Jews, Romanis, Ashkenazic Jews, Armenians, and Kartvelians are considered "White" on legal papers despite it not socially being the case for many of them, especially the first 6 groups. Nonetheless, I don't plan on touching those parts of the pages at the moment. Starbeam2 (talk) 18:47, 14 September 2019 (UTC)
You do realize that the race/ethnicity questions in the census are mostly self-reported. It's quite possible to object to the default categories, but I believe "other" is an option. See infobox for question at w:Race and ethnicity in the United States Census#21st century. DCDuring (talk) 22:14, 15 September 2019 (UTC)
I'm aware, but "other" doesn't always elucidate things, and race is basically decided by society at large not the individual person. Starbeam2 (talk) 19:41, 17 September 2019 (UTC)
I find it useful for researching the etymology. --Vahag (talk) 04:40, 13 September 2019 (UTC)
We do add the etymology often, or at least we try to. Starbeam2 (talk) 19:41, 17 September 2019 (UTC)
I don't think these statistics really belong in a dictionary. —AryamanA (मुझसे बात करेंयोगदान) 21:04, 15 September 2019 (UTC)
I decided to include that information when I added the surnames. I decided to do so for two reasons: I had the information and it seemed lexically relevant. I concur that there are problems with the stats (as has been mentioned above) and that the relevance to a dictionary is not inarguable. I would not strenuously object to their removal if people felt that they don't belong, I would even be willing to remove them myself if that is the verdict. - TheDaveRoss 22:47, 15 September 2019 (UTC)

Requesting AWB/JWB rights

Hi, I would like to semi-automate tedious editing tasks with JWB. I need an administrator to add my name to the list of approved users.

I promise to be careful and responsible as always in my use of this tool. Thanks in advance.--So9q (talk) 06:03, 14 September 2019 (UTC)

dialectal

Where does the information for the label "dialectal" come from? When it means "of or relating to a dialect", the dialect should be added as well as the source, for example in unlight --Backinstadiums (talk) 15:52, 15 September 2019 (UTC)

Our glossary, to which the label (dialectal) links, gives two meanings:
  1. Of or relating to a dialect.
  2. Not linguistically standard.
The latter sense need not be tied to any specific identifiable dialect; it could also be slang or a colloquialism. It may be unfortunate that the label combines these two senses, especially as we also have the label (nonstandard). Many of the Turkish terms labelled “dialectal” can more properly be called “regionalisms”, but the regions in which the terms are current do usually not correspond to a well-defined and named geographic subdivision. Compare the distributions of faucet vs. spigot in the US, where the latter (in the sense of “faucet”) also does not keep to a well-defined border but spills over from Philly into the Midland dialect region.[1]  --Lambiam 16:38, 15 September 2019 (UTC)
"Such a dialect should be added as well as the source" is a counsel of perfection (sense 3). We rarely have very specific information. But it is useful to know that a given definition may not not be generally understood in all places a language is spoken. DCDuring (talk) 18:21, 15 September 2019 (UTC)
@DCDuring: It all depends on the definition of "dialect" to begin with, but if an editor knows a term is dialectal, they must at least know some dialect/region or, for further investigation, add where the info comes from --Backinstadiums (talk) 18:59, 15 September 2019 (UTC)
False. The editor might know that it is regional but be unsure about the region. Also if he writes one region it looks like the term is only from this region. Fay Freak (talk) 19:34, 15 September 2019 (UTC)
What Fay Freak said. DCDuring (talk) 22:11, 15 September 2019 (UTC)
For entries to specify dialects rather than using "dialectal" is a good ideal/goal, but it will take a long time before all the entries currently labelled only as "dialectal" can be labelled more specifically. - -sche (discuss) 19:21, 15 September 2019 (UTC)
This is not possible as a principle. If a term is used only in certain villages and an author uses such a term you do not know which village it points to or whether it is picked up elsewhere. Such can only be solved with dialectological atlantes which are based on surveys and are thus topically restricted. Fay Freak (talk) 19:34, 15 September 2019 (UTC)

I want to add Church Slavonic terms

  Input needed
This discussion needs further input in order to be successfully closed. Please take a look!

I want to add Church Slavonic terms (not Old Church Slavonic), may I? Which code should I use? —This unsigned comment was added by ПростаРечь (talkcontribs) at 06:51, 16 September 2019 (UTC).

w:Church Slavonic and the ISO 639-2 standard says that it uses the same code as Old Church Slavonic, cu.--Prosfilaes (talk) 06:54, 16 September 2019 (UTC)
As I said in my talk page, our convention is to use "Old Church Slavonic" L2 header, which usually just corresponds to "церковнославянский" in Russian sources. It would be wrong to use the language code "cu" and have a header anything but "Old Church Slavonic". The "cu" will add to "Old Church Slavonic" categories, not "Church Slavonic". --Anatoli T. (обсудить/вклад) 10:19, 16 September 2019 (UTC)
@ПростаРечь: I am not well-versed in varieties of Church Slavonic, we may want to have a split, an example ПростаРечь has used is блꙋдити (bluditi) (Church Slavonic language) vs блѫдити (blǫditi) (Old Church Slavonic). ПростаРечь is eager to contribute in "New Church Slavonic" (or simply Church Slavonic) for which we don't have a code and infrastructure. @CodeCat, -sche What do you think? Is the split merited? --Anatoli T. (обсудить/вклад) 11:42, 16 September 2019 (UTC)
In a google you may now find блꙋдити only with diacritic блꙋди́ти ПростаРечь (talk) 11:56, 16 September 2019 (UTC) I only want to add translations for words from the Ostrog Bible.
@ПростаРечь: We haven't been adding stress marks in Old East Slavic or Old Church Slavonic terms, as these are hard or impossible to confirm with certainty and completeness but these may only be verified in this form with stress marks. I've mentioned a possible split in Wiktionary:Requests_for_moves,_mergers_and_splits#Church_Slavonic_from_Old_Church_Slavonic, not sure if it's merited and/or will happen. You can probably do a better analysis of differences. --Anatoli T. (обсудить/вклад) 12:04, 16 September 2019 (UTC)
@Atitarev: I don't want to add stress marks, I only give an example of блꙋдити existence.
@ПростаРечь: Thanks. Острожская Библия (Ostrog Bible) is one of sources for anyone wanting to have a look for an assessment. --Anatoli T. (обсудить/вклад) 12:16, 16 September 2019 (UTC)
@Atitarev:Anyone may also see it in the original ПростаРечь (talk) 12:23, 16 September 2019 (UTC)
(edit conflict) @ПростаРечь: Question. Are you sure, it's not "Old East Slavic" (древнерусский) or old New Russian, rather than "Church Slavonic"? By this time (1581), the Russian language has fully formed and it seems like a mixture of Russian with Old Church Slavonic or just a very ecclesiastic form of older Russian (ru)? When I get accustomed to the fonts, I can actually read and understand it as a Russian text, perhaps with a bit more ease than modern English speakers can read Shakespeare. Sorry, I just can't dedicate too much time at the moment but we need to assess what language it is. (Notifying Benwing2, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Canonicalization): Does anyone want to assess the language of the Ostrog Bible? Is it cu, ru or something completely new? --Anatoli T. (обсудить/вклад) 12:35, 16 September 2019 (UTC)
@Atitarev: The Ostrog Bible doesn't have a polnoglasie "full vocalisation" (Old East Slavic feature) ПростаРечь (talk) 13:21, 16 September 2019 (UTC)
It might be worth considering changing our convention so that the canonical name of cu is Church Slavonic, which we can then divide as needed into dialects such as Old Church Slavonic (= Old Bulgarian?), Serbian Church Slavonic, Russian Church Slavonic, Middle Bulgarian, Bosnian Church Slavonic, Croatian Church Slavonic, and whatever other varieties editors deem desirable. —Mahāgaja · talk 12:26, 16 September 2019 (UTC)
After (edit conflict). Yes, this particular one would be the Russian Church Slavonic, especially for 16th century. It's just too Russian grammatically, even though there are differences. --Anatoli T. (обсудить/вклад) 12:35, 16 September 2019 (UTC)
But the syntax is not what is visible in the dictionary. And this can be seen as Medieval Latin, where the Latin was also too German, too Spanish, too French grammatically, but yet never was Spanish or French. If the endings are like in the Old Church Slavonic original and Old Church Slavonic is still the model intended by writers then this speaks for unity. Also, if we don’t know where to draw the line this also speaks for a more flexible approach with labels. блꙋдити (bluditi) can be added with {{spelling of}} or {{form of}}. Fay Freak (talk) 12:46, 16 September 2019 (UTC)
The main problem with that approach is that pretty much all of our etymologies use cu to mean Old Church Slavonic- are we going to have to add qualifiers to all of them?. Chuck Entz (talk) 12:45, 16 September 2019 (UTC)

May I use "from the Ostrog Bible" label for a while? ПростаРечь (talk) 14:11, 16 September 2019 (UTC)

@ПростаРечь:. Yes, please do for forms not used in other forms of (currently) "Old Church Slavonic" (i.e. words or forms that are specific to this variety and you know it). Please don't use any other language header for now, just "cu", as language codes go with language names and categories. We need to create a new language at Wiktionary to avoid a mess. I think we're dealing with "Church Slavonic" here with the Russian specifics. Technically, it's not a very big deal, I think, just need an agreement. Are you OK to continue using "cu" and "Old Church Slavonic" and a label for a while? We need the community to wake up from slumber!
Please also keep all the discussion here. I don't want to make the decision myself and I'm not so great at creating a new language structure.
I agree with Mahāgaja that we need separate varieties. "cu-r" ("Russian Church Slavonic") seems like a good candidate. Some linguists may cringe but people should realise that what they call using "Church Slavonic" have very distinct flavours on many levels.
Do you agree with creation of a new language code cu-r with a new L2 header "Russian Church Slavonic"? If yes, will start a mini-vote below? --Anatoli T. (обсудить/вклад) 11:27, 17 September 2019 (UTC)
@Atitarev: Russian Church Slavonic (or Russian Synodal recension) is the language of books since the second half of the 17th century, in my opinion. The Ostrog Bible published in Ostroh (Grand Duchy of Lithuania) in 1581 (the 16th century). I would prefer more politically neutral naming unit, e.g. Old East Church Slavonic (or simply Church Slavonic / Middle Church Slavonic for a while) or something like that. ПростаРечь (talk) 11:49, 17 September 2019 (UTC)
@ПростаРечь: Old East Church Slavonic sounds good and it's accurate. I disagree it should be generic, it's distinct from South or West Slavic. We can go with cru code. Starting the vote now. --Anatoli T. (обсудить/вклад) 12:06, 17 September 2019 (UTC)
Just to clarify, I am not suggesting creating a new L2 language code. I am suggesting treating all the varieties mentioned as dialects of cu, which would be renamed "Church Slavonic". Then L2 would read ==Church Slavonic==, and definition lines would include labels like {{lb|cu|Russian Church Slavonic}} (or whatever name we decide on) which would then categorize into a CAT:Russian Church Slavonic (not CAT:Russian Church Slavonic language), which would itself be a subcat of CAT:Church Slavonic language. —Mahāgaja · talk 12:12, 17 September 2019 (UTC)
@Mahagaja: Rather than opposing, can you rewrite the vote, check with User:ПростаРечь and get the ball rolling + revote? I am basically OK with this too. We only have a few people talking. We should be able to agree on something. --Anatoli T. (обсудить/вклад) 12:21, 17 September 2019 (UTC)
@Vorziblix please, don't divide terms from the Ostrog Bible in Ruthenian, Old Russian and so on, otherwise, we risk having an edit war because there are many reliable sources, that contradict each other. Such a situation we also have with some Old Dutch texts (e.g. Wachtendonck Psalms, it is hard to determine whether a text actually was written in Old Dutch or in other western Low German dialects). I really want to use a collective name. You may offer such a term. Note: Ivan Fyodorov was the first known Russian printer in the Grand Duchy of Moscow and the Polish-Lithuanian Commonwealth. ПростаРечь (talk) 08:42, 21 September 2019 (UTC)
@ПростаРечь: Apologies. While I agree that calling these recensions Ruthenian/Russian/etc. is not ideal, I do think it’s preferable to use some name that’s seen actual use in academic work. As far as recensions of Middle Church Slavonic go, most papers I’ve looked through seem to agree that there are three or four recensions, viz. a ‘Bulgarian’ recension, a ‘Serbian’ recension, and the East Slavic recensions, which some papers divide into ‘Ruthenian’ and ‘Muscovite’ and some label with a single blanket term, usually ‘Russian’ or ‘Ruthenian’. (See for example Robert Mathiesen (1984), The Church Slavonic Question: An Overview (IX-XX Centuries).) Other terms that occasionally show up include ‘Rusian Church Slavonic’ (with one ‘s’) and ‘East Church Slavonic’. My problem with ‘Old East Church Slavonic’ is mostly the word ‘Old’, which makes it very confusing given that it’s a variant of Middle Church Slavonic and not either OCS or OES, and the fact that it doesn’t seem like anyone else has ever used this term. But perhaps there’s no good solution here. Feel free to change my labels as long as they’re consistent. — Vorziblix (talk · contribs) 12:06, 21 September 2019 (UTC)
@Vorziblix: I would simply use "East Church Slavonic". If there is no objection. ПростаРечь (talk) 12:44, 21 September 2019 (UTC)

Create Old East Church Slavonic with language code cru - a mini-vote

Support
  1.   Support We have a lot of material in this language and it's distinct. --Anatoli T. (обсудить/вклад) 12:06, 17 September 2019 (UTC)
Oppose
  1.   Oppose as mentioned above. I would treat Russian Church Slavonic as a dialect of Church Slavonic, not as a separate language. (And our ad-hoc language codes always begin with an official ISO code, so if we do create a new code for RCS, it should be something like sla-cru or sla-rcs, not just cru, since that is already a deprecated code for a variety of the w:Karu language.) —Mahāgaja · talk 12:12, 17 September 2019 (UTC)
  2.   Oppose Apart from being invented it is needlessly clumsy. Fay Freak (talk) 13:37, 17 September 2019 (UTC)
  3.   Oppose Not ISO compliant. There is already a language with this code as well. —Rua (mew) 17:15, 19 September 2019 (UTC)
  4.   Oppose --{{victar|talk}} 07:57, 25 September 2019 (UTC)
Abstain
  1.   Abstain Bad code, as pointed out by Mahagaja, and unfortunately ad-hoc name. Apart from that I wouldn’t object to keeping the Church Slavonic recensions distinct. — Vorziblix (talk · contribs) 21:48, 17 September 2019 (UTC)

Change canonical name of cu from "Old Church Slavonic" to "Church Slavonic" and create dialect tags and categories for the various recensions of Church Slavonic - a mini-vote

Support
  1.   Support Mahāgaja · talk 12:32, 17 September 2019 (UTC)
  2.   Support This will work as well. --Anatoli T. (обсудить/вклад) 12:39, 17 September 2019 (UTC)
  3.   Support I support, but typographical conventions from this page in such a case should be revised. ПростаРечь (talk) 13:04, 17 September 2019 (UTC)
  4.   Support. Very practical. I assume the recensions and Old Church Slavonic will have their codes like grc-aeo for Aeolic Greek, and we have extra module data for {{lb}}. We have also recently removed “Old Latin”, and I suppose that “Classical Syriac” should also be “Syriac” sooner or later if we respect that people used it in the 19th century or even use it, in so far as Latin is now “used”, as a literary language (we had codes for “Syriac” and “Classical Syriac” but people just got confused and added stuff for the latter under the former). Fay Freak (talk) 13:37, 17 September 2019 (UTC)
  5.   Support, with the proviso that main lemmas should be at the OCS spellings wherever possible, with post-OCS forms entered as alt-form entries. Also note that we already have some later Church Slavonic entries such as телѧ (telę) that should be updated if this succeeds; more can be found by running a search for "later Church Slavonic". — Vorziblix (talk · contribs) 21:48, 17 September 2019 (UTC)
  6.   Support but it's unclear for me how it affects Etymology/Descendants section, because there are 3 cases of Church Slavonic: OCS, NCS0/NCS1 (NCS without/with specified recension).
    • Before: (OCS) "{{cog/desc|cu|WORD}}", (NCS0) "Church Slavonic {{m/l|cu|WORD}}", (NCS1) "Russian Church Slavonic {{m/l|cu|WORD}}"
    • After1: (OCS) "Old {{cog/desc|cu|WORD}}", (NCS0) "[New] {{cog/desc|cu|WORD}}", (NCS1) "Russian {{cog/desc|cu|WORD}}"
    • After2: (OCS) "[Old] {{cog/desc|cu|WORD}}", (NCS0) "New {{cog/desc|cu|WORD}}", (NCS1) "Russian {{cog/desc|cu|WORD}}"
    • After3: (OCS) "Old {{cog/desc|cu|WORD}}", (NCS0) "New {{cog/desc|cu|WORD}}", (NCS1) "Russian {{cog/desc|cu|WORD}}"
    • ("Old" is required to avoid confusion if "New" is default, and vice versa). —Игорь Тълкачь (talk) 02:06, 21 September 2019 (UTC)
Oppose
  1.   Oppose Too great a risk of confusion. We can't count on every editor to label terms appropriately, which reduces the value of Wiktionary for those wanting to distinguish genuine old forms from later inventions. OCS was, at least in its original Bulgarian-Macedonian form, a very close reflection of the local language, which makes it valuable for historical linguistics. The later recensions, especially modern Russian forms, are not particularly useful for that purpose. If we do decide to merge the two, the model of Latin should be followed, with genuinely old forms unmarked and later inventions marked. —Rua (mew) 17:25, 19 September 2019 (UTC)
    We could not count on editors not adding non-old terms as Old Church Slavonic already. Now we want to make it more current for the first time. Fay Freak (talk) 10:48, 20 September 2019 (UTC)
    @Rua Can you offer any solution rather than opposing both options, so should this language (as e.g. in the Ostrog Bible) be ignored and not allowed to have entries? --Anatoli T. (обсудить/вклад) 09:05, 21 September 2019 (UTC)
    I offered the solution of following the Latin model, where "old" is considered the default and later forms/recensions get a context label. —Rua (mew) 10:47, 21 September 2019 (UTC)
    @Rua Basically, as it is now, right? Do you have any concerns at how entries are added now by User:ПростаРечь from the newer Old Russian recensions? --Anatoli T. (обсудить/вклад) 11:06, 21 September 2019 (UTC)
    More or less, but I think "Russian recension" is a better label and will probably be more widely understood, since it's the term used in general studies of CS. —Rua (mew) 15:44, 21 September 2019 (UTC)
    “Russian recension” without further qualification is usually taken to refer to the post-1650s standardized Synodal recension, which should IMO be kept distinct from the earlier (Middle Church Slavonic) forms. (The distinction remains significant even today in that the Old Believers never accepted the Synodal recension and still use the older forms.) — Vorziblix (talk · contribs) 21:27, 23 September 2019 (UTC)
    @Rua But we don’t call later recensions of Latin names like “Medieval Old Latin”. If we keep the L2 named “Old Church Slavonic” the recensions are “Russian Old Church Slavonic” and so on – the labels contradict the header. Also unlike between antiquity and the Middle Ages where history, the Dark Ages create a patent gap, it is not so clear where Old Church Slavonic ends, it slowly degrades. The difference between the vernacular and the literary language was never that great. Still in 18th century Russia one thought the Church Slavonic to be some kind of “High Russian”. And one can never be sure if something is “only late” or rather later authors have preserved something old since we don’t have complete dictionaries of the ancient lect like with Latin. Fay Freak (talk) 14:05, 21 September 2019 (UTC)
    That's a naming issue more than anything. The problem I have is with forms missing yers, or worse, non-native outcomes of certain phonemes that clearly give a local colour to certain words. OCS is by default assumed to be an early form of Bulgarian-Macedonian, and therefore people will use that in historical assessments. If we have words with clearly Russian developments like ъ > o, ǫ > u and ę > ja then it does a disservice to the people using Wiktionary for historical linguistics, because they cannot tell that it's not Bulgarian-Macedonian in origin. We could require that everything be labelled and nothing left to chance, but that's hugely messy when OCS is at its core Bulgarian-Macedonian and the other recensions are basically mixed languages combining true OCS forms with the local language. —Rua (mew) 15:44, 21 September 2019 (UTC)
  2.   Oppose --{{victar|talk}} 07:59, 25 September 2019 (UTC)
Abstain
  1.   Abstain unless we figure out how to map existing uses of cu to cu-old or whatever code is chosen for it. For example, generally when I have entered references to Old Church Slavonic, I use {{cog|cu|...}} or {{bor|cu|...}}/{{der|cu|...}} whereas if I need to enter a reference to e.g. Russian Church Slavonic, I say "Russian Church Slavonic {{m|cu|...}}". If this convention is generally adhered to, we can (maybe) replace all uses of cu in cog/bor/der with cu-old. My concern is that editors have been entering OCS terms using the cu code since that's what its canonical name is, and this info will be lost if we switch the name to just Church Slavonic. Benwing2 (talk) 04:16, 18 September 2019 (UTC)
    @Benwing2 Switching the header by bot and then adding the label “Old Church Slavonic” via {{tlb}}? There aren’t many possibilities thinkable. Some entries might already be not Old Church Slavonic but another Church Slavonic, but by that execution the entries do not become any more wrong. Fay Freak (talk) 10:48, 20 September 2019 (UTC)
    @Fay Freak You're still thinking in terms of entries. The main problem is in etymologies. Right now, {{cog|cu}} displays "Old Church Slavonic", and people have been adding various templates to the etymologies with the "cu" code for many years with the expectation that the display would be exactly that. The second the change is made to the module, all of those etymologies are going to say "Church Slavonic", including those that don't link to any entry. If we don't change all those etymologies to say that it's Old Church Slavonic they're referring to, an etymologically-important distinction will be lost.
    A few of those etymologies may be already referring to later Church Slavonic, but I would imagine that to be very rare. Switching the name would mean that the rate of deceptive naming would switch from almost all okay with a few rare exceptions to almost all deceptive with a few rare exceptions. OCS is a very important language in etymologies, so we need to come up with a solution before going through with this change. Chuck Entz (talk) 21:10, 20 September 2019 (UTC)
    I have not mentioned what I imagine to be then: under the assumptions of existing usage you have made and which I share, one would change these usages to an etymology-language code, cu-ocs or OCS I deem most likely. Then analogously what I have said about entries: Some stated Old Church Slavonic words might already be not Old Church Slavonic but another Church Slavonic, but by that execution the statements do not become any more wrong. It would not switch to “deceptive” in any case. I imagine that for entries the “Old” part goes to {{tlb}} and in links in etymology and descendant sections the code “cu” is replaced with the new code for Old Church Slavonic: Apparently one first creates etymology-only code, then switches cu in etymology and descendant sections to it, then renames L2 sections by removing “Old ”.
    In etymology sections there are by the way probably comparatively few – though from a Slavic viewpoint I can only stress the importance of Old Church Slavonic –, for example I get only 110 hits for the search terms "Cognate to" "Old Church", i. e. any mainspace page that contains the wording “Cognate to” and "Old Church", which does not even comprise only pages which do mention Old Church Slavonic words in the sections in question but also Old Church Slavonic pages. Whereas in Proto-Slavic entries there is virtually always Old Church Slavonic meant; editors have been wary enough to use formattings like Russian Church Slavonic {{m|cu|...}}, and other editors usually have not added any Church Slavonic term at all.
    I don’t see what I could have missed: If 1. we change cu to return “Church Slavonic” instead of the former “Old Church Slavonic” but in etymology and descendant sections change the occurrences of cu that return a name to cu-ocs before and 2. in entry pages we change the L2 headers to have “Church Slavonic” instead of the former “Old Church Slavonic” but in the same bot edit put “Old Church Slavonic” into {{tlb}} unless there are contrary labels (as those ПростаРечь has now deployed), then we do not change any statements. Fay Freak (talk) 01:04, 21 September 2019 (UTC)
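The two text substitutions in the plan above can be sketched roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, cu-ocs is merely the candidate etymology-only code floated in this thread, and only {{cog}} and {{desc}} (where the language code is the first parameter) are handled. Inserting {{tlb}} is left out because its exact placement within an entry is not specified here.

```javascript
// Hypothetical sketch of the proposed bot edit; not an existing script.
// ASSUMPTION: "cu-ocs" is only the candidate code mentioned in the discussion.
const OCS_CODE = "cu-ocs";

// Step 1: in etymology and descendant sections, point {{cog|cu|...}} and
// {{desc|cu|...}} at the etymology-only code so they keep displaying
// "Old Church Slavonic". Plain links such as {{m|cu|...}} are left alone.
function retargetEtymologyTemplates(wikitext) {
  return wikitext.replace(/\{\{(cog|desc)\|cu\|/g, "{{$1|" + OCS_CODE + "|");
}

// Step 2: rename the level-2 language header in entry pages.
// [ \t]* is used instead of \s* so that line breaks after the header survive.
function renameL2Header(wikitext) {
  return wikitext.replace(
    /^==[ \t]*Old Church Slavonic[ \t]*==[ \t]*$/gm,
    "==Church Slavonic=="
  );
}

console.log(retargetEtymologyTemplates("Cognate to {{cog|cu|вода}}."));
console.log(renameL2Header("==Old Church Slavonic==\n\n===Noun==="));
```

Running step 1 before step 2, as proposed, means no etymology displays the bare name "Church Slavonic" in between.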

Old Church Slavonic

(from the Ostrog Bible)

Declension of Swedish uncountable nouns

Do we have a template for that? I tried looking in Category:Swedish_noun_inflection-table_templates but found none that fit. I need it on tull. Thanks in advance.--So9q (talk) 16:37, 16 September 2019 (UTC)

Is tull not countable in the sense of “custom house”? The Swedish Wiktionary lists a plural, not only for the sense tullstation but also for the sense avgift som betalas när vara förs över gräns. Can’t you say tullarna är höga (for which “customs duties” may be a better translation than “toll”, which is more like vägtull )?  --Lambiam 20:35, 16 September 2019 (UTC)
There are two declension templates for uncountable nouns:
 --Lambiam 20:43, 16 September 2019 (UTC)
Yeah, you are right. Thanks for the help. As an aside I asked Gamren to create a new ACCEL template for Swedish like he did for Danish.--So9q (talk) 20:54, 16 September 2019 (UTC)
@Lambiam: Any idea why they are named irregular? --Lundgren8 (t · c) 21:18, 16 October 2019 (UTC)
I think it is a misnomer; the only “irregularity” I see is the absence of plural forms, so sv-noun-unc-[c|n] should have been enough.  --Lambiam 22:04, 16 October 2019 (UTC)
@Lambiam: I agree, I also find it odd that the default ending is -n/-t rather than -en/-et and that the definite form has to be manually written for all uncountable nouns that don’t end in a vowel (which has got to be a majority), e.g. mjölk or arbetslöshet. It should be the other way around in my opinion, so that sv-noun-unc-c yields what’s now in mjölk, and sv-noun-unc-c|ondsk yields what’s now in ondska and then some special case for those on -s like majs (which is now fully manually typed out). (Don’t worry, I’ll bring this up on the appropriate talk page as well.) --Lundgren8 (t · c) 22:15, 16 October 2019 (UTC)

Unhide request entries

  • I am of the opinion that categories for requests for various things like translations, etymologies, definitions et cetera should not be in “hidden categories”. Now only editors who have opted in in their settings see from the mainspace that there are categories like Requests for etymologies in Russian entries. If they were displayed then users who don’t know about them but are inclined to solve them could be led into much-needed participation.
  • A related issue is that the etymology request entries however are cluttered. Category:Requests for etymologies in Latin entries counts 3,550, but the bulk is names and one does not see the bigger fish to fry. Since it is likely that one is interested to solve appellative nouns but not proper nouns and on the other hand people who are interested in proper nouns likely want to solve personal names, demonyms, settlement names, hydronyms, and the like asunder, and these are special fields with special sources and dynamics, I propose that we add a parameter to {{rfe}} / {{rfelite}} to sunder the requests into subcategories at least thus far. Fay Freak (talk) 12:55, 17 September 2019 (UTC)
I'm not quite sure what you're driving at. For example click on "edit" for trading post, it has 10 hidden categories, at the bottom "This page is a member of ten hidden categories" - click on that and they are all revealed. DonnanZ (talk) 19:58, 17 September 2019 (UTC)
@Donnanz: The categories as visible under the page, not the editing window. I have a line “Hidden categories” where I find request categories, if a page is in such a category, under every page because I have set it up in my preferences but if people don’t set it they don’t see these categories. I have argued to unhide the requests to optimize user attraction. Fay Freak (talk) 20:18, 17 September 2019 (UTC)
@Fay Freak: No, I don't think that is necessary. Looking in the translations section for trading post one can see a few red-linked entries, so it's obvious there is no entry, as well as languages marked "please add this translation if you can". It's worth mentioning that in some cases blue links can be false, appearing in one language but no entry exists in another language spelt the same. DonnanZ (talk) 20:39, 17 September 2019 (UTC)
@Donnanz: I mean that people could follow the category links to find more pages where translations are needed. Fay Freak (talk) 20:40, 17 September 2019 (UTC)
You can always add {{t-needed|+ code}} for any missing language, which will generate a request. DonnanZ (talk) 21:06, 17 September 2019 (UTC)
@Donnanz: That’s not what I am about. It’s that people should find the category to find the requests. Now it’s hidden. Fay Freak (talk) 03:17, 18 September 2019 (UTC)
I would like to see a category for images (Wiktionary:Beer_parlour/2019/September#A_category_for_images?), but I don't think it's going to happen. DonnanZ (talk) 12:45, 18 September 2019 (UTC)

First attestations in the etymology section

I'm interested to know what other editors think of the following format:

  1. Special:Diff/45606179/54157995
  2. Special:Diff/52498442/52718625
  3. Special:Diff/52513026/53352340

Imagine the entry for England#English with the following under the etymology section:

 

Attested in The Canterbury Tales, 14th century, as Middle English Engelond.
Attested in A Looking Glass for London, 1594, as Early Modern English England.
Also attested in The History of England, 1754-1761 as Modern English England.

 

According to Wiktionary:Etymology, etymologies should be brief and include a simple list of previous forms.

I would prefer to see "From Middle lang term1, from Old lang term2, from Proto-lang term3." rather than "First attested in work W, 15?? as term1. Also attested in work X, 16?? as term2. Also attested in work Y, 16?? as term3 ..."

Shouldn't those statements be added as quotations instead? My impression of "first attestation" is that it implies that the word is a newly coined word that first appeared in that particular written work. For example, we have English words first attested in Chaucer and Category:English terms first attested in Shakespeare.

So what shall we do about these etymologies? Move them to the Citations namespace? Continue to add multiple first attestations? KevinUp (talk) 16:27, 17 September 2019 (UTC)

Yes probably they should be added as quotations. But I wouldn't just move them to the citations namespace (unless you actually have quotations) where they will be forgotten about. DTLHS (talk) 16:31, 17 September 2019 (UTC)
The contributor would seem to be following and extending our common practice of burying definitions in favor of alternative forms, pronunciations, and lists of cognates, just to mention what can appear in each L3 section above the definitions. DCDuring (talk) 17:12, 17 September 2019 (UTC)
I don't fault B2V22BHARAT for this formatting, since there was precedent in Korean entries. Ideally, the Middle Korean entries should actually be created, and it should be moved there, IMO. —Suzukaze-c 19:28, 17 September 2019 (UTC)
Yeah, as I recall, this was the previous format and both of us converted it into this. I think the "also attested" parameter was misused because it was originally meant for terms that had slightly different spellings. I think it would be better to add quotations at the Middle Korean entry and indicate only "from Middle Korean X" in the etymology section for a less cluttered appearance. KevinUp (talk) 19:53, 17 September 2019 (UTC)
@Atitarev, Metaknowledge, TAKASUGI Shinji Any comments? I would suggest these two options:
  1. Create proper entries for the attested form in Middle Korean.
  2. Move these statements to the citations page. Suzukaze-c has created some such as Citations:잡다. More can be found here. KevinUp (talk) 02:24, 18 September 2019 (UTC)
    I support making Middle Korean entries — it's always better to document extinct languages rather than write etymologies as if they don't deserve entries. —Μετάknowledgediscuss/deeds 03:50, 18 September 2019 (UTC)

The hidden category Category:Korean etymologies with first attestations that need to be moved to Middle Korean entries has been created for cleanup purposes. I propose we use the following format for the etymology of native Korean words from now on:

  • Generic format:
> {{ko-etym-native}} From {{inh|ko|okm|-}} {{okm-inline|TERM|Yale-Romanization}}
Output Of native Korean origin. From Middle Korean TERM (Yale: Yale).
  • Examples:
Using 잡다 (japda) as an example:
> {{ko-etym-native}} From {{inh|ko|okm|-}} {{okm-inline|잡다|capta}}
Output Of native Korean origin. From Middle Korean 잡다 (Yale: capta).
Using 짧다 (jjalda) as an example:
> {{ko-etym-native}} From [[Modern Korean]] {{m|ko|져르다}}, from {{inh|ko|okm|-}} {{okm-inline|뎌르다|tyeluta}}, {{okm-inline|댜르다|tyaluta}}.
Output Of native Korean origin. From Modern Korean 져르다 (jyeoreuda), from Middle Korean 뎌르다 (Yale: tyeluta), 댜르다 (Yale: tyaluta).

The reason for using {{okm-inline}} is because Middle Korean uses Yale romanization which is different from Revised Romanization used by South Korea for modern Korean. For example, Middle Korean 뎌르다 is tyeluta not dyeoreuda; Middle Korean 잡다 is capta not japda.

And of course, terms such as Modern Korean 져르다 (jyeoreuda), Middle Korean 뎌르다 (Yale: tyeluta), 댜르다 (Yale: tyaluta) deserve their own entries with quotations, not mere mentions in the etymology section.

If anyone is opposed to the usage of this format please state here. KevinUp (talk) 11:20, 19 September 2019 (UTC)
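For anyone who wants to generate this wikitext in bulk, here is a minimal Python sketch of the proposed format. The function name and its arguments are hypothetical; only the template names ({{ko-etym-native}}, {{inh}}, {{okm-inline}}, {{m}}) come from the proposal above, and the Yale romanizations have to be supplied by hand.

```python
def native_korean_etymology(middle_forms, modern_form=None):
    """Build etymology wikitext in the proposed format.

    middle_forms: list of (hangul, yale) pairs for the Middle Korean forms.
    modern_form: optional intermediate Modern Korean spelling.
    (Hypothetical helper; not an existing module or template.)
    """
    if not middle_forms:
        raise ValueError("at least one Middle Korean form is required")
    # One {{okm-inline}} call per attested Middle Korean spelling.
    inline = ", ".join(
        "{{okm-inline|%s|%s}}" % (hangul, yale) for hangul, yale in middle_forms
    )
    text = "{{ko-etym-native}} From "
    if modern_form is not None:
        text += "[[Modern Korean]] {{m|ko|%s}}, from " % modern_form
    text += "{{inh|ko|okm|-}} " + inline
    return text
```

Calling it with [("잡다", "capta")] should reproduce the 잡다 example, and passing modern_form="져르다" with two Middle Korean forms should reproduce the 짧다 example (without the trailing full stop).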

I understand what you're trying to do. However, I don't understand how reconstructed words (or consonants), namely the Proto-Indo-European words, which have no record and are based only on ideas, can be justified in favor of Latin and Greek cognates, which have actual records.
To be specific, I fully agree on the Latin (cor, cordis) ---> heart (Modern English) shift, because there is an actual record of cor, cordis in Latin, but I'm not convinced of the kerd --> heart part, since 'kerd' is merely a reconstructed word, with no record that your ancestors used it.
More examples: quod (Latin) --> what, centum (Latin) --> hundred --> OK.
Kwod (reconstructed) --> what, Kemtom (reconstructed) --> hundred --> I'm not convinced.
Sincerely, B2V22BHARAT (talk) 13:49, 19 September 2019 (UTC)
If you can present to people the actual usage of Kwod and Kemtom (or at least K*-) in other languages, such as German, Portuguese, Spanish, French, etc., then I think people including myself will be more easily convinced. B2V22BHARAT (talk) 14:07, 19 September 2019 (UTC)
For example, like this: *kerd-
Proto-Indo-European root meaning "heart."
It forms all or part of: accord; cardiac; cardio-; concord; core; cordial; courage; credence; credible; credit; credo; credulous; creed; discord; grant; heart; incroyable; megalocardia; miscreant; myocardium; pericarditis; pericardium; quarry (n.1) "what is hunted;" record; recreant; tachycardia.
It is the hypothetical source of/evidence for its existence is provided by: Greek kardia, Latin cor, Armenian sirt, Old Irish cride, Welsh craidd, Hittite kir, Lithuanian širdis, Russian serdce, Old English heorte, German Herz, Gothic hairto, "heart;" Breton kreiz "middle;" Old Church Slavonic sreda "middle."
I don't know why Greek, Hittite and Breton language are chosen as representation of Proto-Indo European language, but at least in this presentation I can somewhat understand *kerd- sense. Sincerely, B2V22BHARAT (talk) 02:20, 20 September 2019 (UTC)
@KevinUp: I think that using {{okm-inline}} is unnecessary, and that {{defdate}} ([2]) is more effort (with less detail!) than using {{ko-etym-native}} as it has been used. —Suzukaze-c 08:42, 25 September 2019 (UTC)
@Suzukaze-c: I think {{okm-inline}} is necessary because of the differences in romanization. Middle Korean 뎌르다 is tyeluta not dyeoreuda; Middle Korean 잡다 is capta not japda. We don't want other editors, especially new editors to end up correcting these.
As for {{defdate}}, it can be omitted if there is only one spelling (optional not compulsory). It is much more useful when there are spelling changes across different time periods. Compare edit1 and edit2. KevinUp (talk) 09:04, 25 September 2019 (UTC)

Images in non-English terms

Hi, I searched the archives here and WT:EL and found no information about how to handle images for non-English terms.

My question is whether it is a good idea to include images on a page for every language of a term. E.g. the article bolt has 3 images on the English tab. Would it be a good idea to copy those to be shown also on the Danish, Old English and Norwegian tabs?--So9q (talk) 19:22, 17 September 2019 (UTC)

I don't think it's a good idea to use the same image in every language entry for the same meaning. That is a bit boring. Other suitable images are often available on Wikimedia Commons. DonnanZ (talk) 20:55, 17 September 2019 (UTC)
What makes sense for users who use tabbed languages does not make sense for those not using that gadget. Sometimes you can't both have your cake and eat it. DCDuring (talk) 23:15, 17 September 2019 (UTC)
Yes at least for things that don't usually have a term in other languages (like Finnish kalakukko, a loaf of bread with a fish baked into it — though this example is spoiled by the fact that we seem to count it as an English word too). Equinox 14:52, 21 September 2019 (UTC)
It is a good thing to add varying images – but boring and wasting bandwidth to have the same –, there are so many unused images, and many things can benefit from multiple images, and if all do not fit on the English page we can at least have multiple across languages. Look at оман / oman, elecampane in all Slavic languages, as an example. It would be very silly to repeat the same image, innit. Effectively it’s one dictionary entry for one word, having inflections and pronunciations for multiple languages. Fay Freak (talk) 15:13, 21 September 2019 (UTC)
I don't think wasting bandwidth is a real consideration: including the same image (it's only a link!) in multiple entries is only a few dozen bytes for the markup. We aren't actually making copies of image files. Equinox 15:39, 21 September 2019 (UTC)
Ultimately this is yet another thing that could, in theory, be solved by separating "meanings" from "renderings" (a bit like HTML vs. CSS, heh): if there is a general concept "an apple" and 3,000 languages have words for it, then the image really belongs to the concept and not to the words. I know it ain't that simple but this issue will keep coming up, and those OmegaWiki people seem to have realised it. Equinox 15:41, 21 September 2019 (UTC)
That was thought for the case when one accesses a foreign language entry defined as “X” with an image and then one clicks on the definition only to find the same image in the English entry. This would load the image twice assuming it is not cached. Fay Freak (talk) 16:20, 21 September 2019 (UTC)
But it would be cached since you just came from a page on the same domain containing the same image. Equinox 18:17, 21 September 2019 (UTC)

Three Questions on Hebrew Entries

Just a few questions: 1) Why do we mark words with begadkefat letters? Aren't those words 100% predictable? Maybe we should mark irregular pronunciations/stressings instead? 2) Why do latinizations even for monosyllabic words have accents? 3) Do we capitalize proper noun latinizations? This one applies to more than just Hebrew. Starbeam2 (talk) 19:46, 17 September 2019 (UTC)

1) Probably because for beginners it is not yet so predictable, or it might be relevant if a Hebrew word is mentioned in an etymology section of another language and the reader cannot be expected to know about it. Then the transcriptions include even this detail so they can just be copied. 2) They shouldn’t. Or perhaps they weren’t monosyllables because of lost schwas. 3) Opinions vary. Fay Freak (talk) 20:05, 17 September 2019 (UTC)
1) I mean, all of them have the same six letters each time. Even if I could not read said letters, I could probably recognize the individual shapes. 2) Aye aye, guess I'll fix it. 3) I honestly think it should be the case, as latinization is the only time capitalization is required. Furthermore, I have one more question: 4) If I don't know the stress, but I know the pronunciation of a Hebrew word, can I make a latinization without stress and mark it as such? Starbeam2 (talk) 19:40, 18 September 2019 (UTC)

Policy for Tungusic Entries

Hey all - I've been editing the Tungusic section of Wiktionary for a little while now, and I'm finding it extremely frustrating to add entries correctly or consistently due to the lack of coherence among experts in how they write things in their papers. And even more frustrating is the fact that currently, I have to convert these Latin-script texts into Cyrillic, when many of the languages do not have a clear, defined orthography or protocol for converting between Latin and Cyrillic. What does one do about this? It seems as if each expert has their own system for representing sounds - some use IPA-based transcription, and some use one that represents the underlying vowel harmony rather than the actual phonetic realisation - both have their merits, in my opinion. And due to the patchy documentation available online, it's extremely difficult, and in some cases impossible, to determine how several different systems represent the same word. Then there's the transcription into Cyrillic, which poorly represents the sounds and is not standardised, which I feel leaves a huge possibility of inaccuracy in entries - something which I feel very uncomfortable about. I want to be comfortable that my entries represent exactly what is presented in linguistic journals, which I do not feel is currently possible.

To amend this, I believe that we should use the Latin-script orthographies used in the journals, even though there are several in use; then decide as a community on the Cyrillic standard to be used across all of the Tungusic languages, which accurately represents the information contained in the Latin orthographies, before transliterating the Latin entries as Cyrillic ones. The vast majority of the Tungusic languages do not have a standard form, or any form at all that is widely utilised - the exceptions being the likes of Evenki and Manchu. The Evenki orthography however, is plagued with all the difficulties of the other languages. Manchu, in my opinion, is highly regular and standardised across all circles of experts and their literature; and thus, does not need adjustment. I'd rather, personally make use of several accurate orthographies, than use one without a standard, that I have to convert from Latin myself, and that is full of inconsistencies and inaccuracies.

Then there is also the question of dialects - Tungusic is made up of many dialect continuums, and I feel that these dialects should be represented accurately, distinctly, and clearly, which is currently not the case. We as a community need to decide on the dialect categories for each of the languages and do our best to label each entry with them. This, in my opinion, is a major part of accurately representing the languages as they are spoken/were spoken.

Please do give me your thoughts on this - I'd love to see this resolved so we can increase the quality of our coverage of these absolutely fascinating languages. TheSilverWolf98 (talk) 00:42, 18 September 2019 (UTC)

Since I've not had any replies, I've created a page that lists Oroch words extracted from numerous academic journals, just to illustrate the variation in the ways linguists represent this language. And how little overlap there is between papers in terms of content. Oroch Wordlist. The case is similar to the one presented here for all Tungusic languages except Manchu. TheSilverWolf98 (talk) 01:54, 20 September 2019 (UTC)
The issue of using Latin vs. some more "native" script is a fraught one. I personally favor using Latin transcription for languages without a standardized native script. This includes cases like Moroccan Arabic and maybe Egyptian Arabic, where (especially in the former case) the Arabic script cannot accurately represent the sounds of the language without extra diacritics and such that (in practice) are never used. Benwing2 (talk) 16:26, 20 September 2019 (UTC)
BTW if we do use Latin transcription, I'd much prefer that we pick one of the academic systems in use (probably whichever one is most common or well-documented) and convert all other representations into that one. Otherwise it will be total chaos for users trying to actually read the entries. Benwing2 (talk) 16:28, 20 September 2019 (UTC)
The method of transcription I see most often (probably because there are many articles by him made available online) is the one used by J A Alonso de la Fuente, though Peter Piispanen, Alexander Vovin, Sergei Starostin, and others use their own transcriptions. Due to the lack of overlap between the papers (in that they all present different items of vocabulary), it is difficult to see how to convert one to another. I personally am a fan of Alonso de la Fuente's transcription system, as it very clearly displays vowel harmony, and makes use of some simple diacritic sets. If you visit my Oroch Vocabulary page, which I linked above, you can see many examples of his transcriptions. Of course, I'd still like others to offer up their ideas on this. TheSilverWolf98 (talk) 01:06, 21 September 2019 (UTC)
I don't think that a unified Tungusic orthography is possible or desirable. Most of these languages have had attempts at literary standards with varying results in usage and official recognition. I would certainly prefer to base ourselves on the literary corpora, scant as they are, over journal articles.
I suggest using the orthography of the dictionary you're primarily basing your entries on. If you don't even have a dictionary, this doesn't bode well for the entries. If you just want to use the language for linking cognates and listing descendants, this can be handled ad hoc by giving only the transcription. The same goes for dialect partition. Crom daba (talk) 00:24, 24 September 2019 (UTC)

Automatically replacing "Foolang {{m|bar|...}}" with "{{cog|bar|...}}"

Hi. I have written and run a script to automatically replace "{{etyl|FOO|...}} {{m|BAR|...}}" and "Foolang {{m|BAR|...}}" (and similar variants) with {{cog|FOO|...}}. Basically, it looks for expressions of this sort preceded by "Cognate with/of/to" or "Cognates include" or "Compare with/to". It is smart enough to handle chains of terms of the sort "Compare Low German {{m|nds|dick}}, Dutch {{m|nl|dik}}, English {{m|en|thick}}, and Danish {{m|da|tyk}}". It is also smart enough to handle etymology languages. When running over the 20190901 dump, it finds 30,692 replaceable cases on 16,441 pages. However, it also finds 1,733 cases where it can't do the replacement due to an unrecognized language name, a language name not agreeing with the code, etc. Some of these cases have to be handled manually, but some can be automated. For example:

  • There are 506 cases of the form "Danish and Norwegian {{m|da|...}}" or "Spanish and Catalan {{m|es|...}}" or similar. How should we handle these?
    1. replace with e.g. "{{cog|es|TERM}}, {{cog|ca|TERM}}" (which duplicates the term — although that isn't necessarily bad — but includes links to both-language variants of the term);
    2. replace with e.g. "{{cog|es|-}} and {{cog|ca|TERM}}" (which preserves the same appearance except with properly linked language names, but doesn't allow for a link to the term in the first language);
    3. replace with e.g. "{{cog|es|TERM|lang2=ca}}" (which requires changes to the implementation of {{cog}} that I could make; we'd have to decide how to display this, e.g. maybe the first language could display as a language name but link to the term);
    4. leave as is.
  • There are 60 cases of language name "Hindustani" followed by the Hindi and Urdu forms, e.g. on پل we have "Hindustani {{m|hi|फूल}} / {{m|ur|پھول|tr=phūl}}". How should we handle this? Maybe replace with e.g. "{{cog|hi|फूल}} / {{cog|ur|پھول|tr=phūl}}"?
  • There are 35 cases of language name "Mooring North Frisian" along with language code frr (North Frisian). There's no etymology language "Mooring North Frisian", maybe we should create this?

Benwing2 (talk) 04:05, 19 September 2019 (UTC)
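To make the kind of replacement described above concrete, here is a rough Python sketch. It is not the script that was actually run: the name table, the regexes and the function name are assumptions for illustration, and it only handles the plain "Foolang {{m|BAR|...}}" shape after one of the trigger phrases, not {{etyl}} or etymology languages.

```python
import re

# Hypothetical, tiny stand-in for the real language data modules.
CODE_TO_NAME = {"nds": "Low German", "nl": "Dutch", "en": "English", "da": "Danish"}

# Trigger phrases that must precede the chain of cognates.
TRIGGER = re.compile(
    r"\b(?:Cognates? (?:with|of|to|include)|Compare(?: (?:with|to))?)\s"
)
# Known language names, longest first so "Low German" wins over shorter names.
NAMES = "|".join(sorted(map(re.escape, CODE_TO_NAME.values()), key=len, reverse=True))
# A language name followed by {{m|code|...}}.
TERM = re.compile(r"\b(%s) \{\{m\|([^|{}]+)\|([^{}]*)\}\}" % NAMES)


def to_cog(text):
    """Return (new_text, warnings) for one etymology sentence."""
    m = TRIGGER.search(text)
    if m is None:
        return text, []
    head, tail = text[:m.end()], text[m.end():]
    warnings = []

    def repl(match):
        name, code, rest = match.groups()
        if CODE_TO_NAME.get(code) != name:
            # Name and code disagree: leave for manual review.
            warnings.append("name %r does not agree with code %r" % (name, code))
            return match.group(0)
        return "{{cog|%s|%s}}" % (code, rest)

    # Everything after the trigger phrase is treated as the chain (a simplification).
    return head + TERM.sub(repl, tail), warnings
```

Cases like "Danish and Norwegian {{m|da|...}}" or "Hindustani" fall outside the TERM pattern and are simply left untouched here, which corresponds to option 4 ("leave as is") until one of the other options is agreed on.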

Generally, I am in   Support of the replacement of "Foolang {{m|bar|...}}" with "{{cog|bar|...}}" if this occurs after the keywords "Compare ...", "Cognate of ...", "Cognates include ...". There will also be some cases of "Compare unrelated ..." that will need {{noncog}} instead. KevinUp (talk) 08:24, 19 September 2019 (UTC)
A few more things:
  • Some users are incorrectly using {{etyl}} instead of {{cog|-}}. I found some (55 entries) by searching for:
The ones that have "Cognate to {{etyl|lang|-}} [[term]]" can be automatically replaced by {{cog|lang|term}} while the rest that have "Cognate to {{etyl|lang1|lang2}}" will need to be hand-checked.
  • This one is totally unrelated, but there are a lot of entries using "[[term]]" instead of {{l|lang|term}} under "Synonyms", "Derived terms", "Related terms", etc. I suppose this can be automatically done by bot? KevinUp (talk) 08:24, 19 September 2019 (UTC)
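The first case above (the automatically replaceable one) could be handled with a single regular expression. The following is only a sketch under assumptions: the pattern name and function are made up, and it deliberately skips anything where the second {{etyl}} parameter is not "-" or where the raw link is piped or has a section anchor, since those need hand-checking.

```python
import re

# "Cognate to/with/of {{etyl|lang|-}} [[term]]" -> "Cognate to/with/of {{cog|lang|term}}"
ETYL_RAW = re.compile(
    r"(Cognate (?:to|with|of) )\{\{etyl\|([^|{}]+)\|-\}\} \[\[([^\[\]|#]+)\]\]"
)


def etyl_to_cog(text):
    """Convert the safe '{{etyl|lang|-}} [[term]]' pattern; leave everything else alone."""
    return ETYL_RAW.sub(r"\1{{cog|\2|\3}}", text)
```

For example, "Cognate to {{etyl|de|-}} [[Haus]]." would become "Cognate to {{cog|de|Haus}}.", while "Cognate to {{etyl|la|en}} [[x]]" stays as it is.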
@KevinUp I have a script to do this. I ran it before on certain languages, mostly languages with non-Latin scripts. It's safer to do that because you can check the script of the link to make sure it's correct, which helps weed out raw links to English terms. Currently it has the properties of certain languages hardcoded in it (the full name and language code, ranges of script characters, and how to strip accents from the link to see whether a two-part raw link can be converted to a one-part templated link), but I'm pretty sure I can get this info from the languages modules. Let me see if I can resurrect the script and get it working on all languages. Benwing2 (talk) 02:52, 20 September 2019 (UTC)
BTW any languages you know of that are particular offenders? Benwing2 (talk) 02:54, 20 September 2019 (UTC)
For non-Latin script languages, raw links are mostly present in recently created entries (2018-2019). As for Latin-script languages, I've come across raw links for Spanish and Italian in older entries. KevinUp (talk) 08:01, 20 September 2019 (UTC)
I just realized that the main offenders are actually English entries. For example, I recently fixed historical method#English which had raw links since 2008 when {{l|en}} was not yet used for semantic terms.
Also, I just came across this Finnish entry which had a lot of raw compound links. I've converted it to {{der4|fi}}, so you might want to run the bot on Finnish. KevinUp (talk) 16:58, 20 September 2019 (UTC)
@Benwing2: Mooring North Frisian is a dialect of North Frisian. Personally, I'd just replace it with "Mooring {{cog|frr|...}}", just as we might write "Australian {{cog|en|...}}". —Mahāgaja · talk 09:26, 20 September 2019 (UTC)
There is also the following layout: Compare [[w:Hebrew language|Hebrew]] {{m|he|רשם|tr=rasham}}. Fay Freak (talk) 10:41, 20 September 2019 (UTC)
@KevinUp I resurrected and cleaned up my script. I ran it for about 20 languages and it replaced 82,574 raw links on 50,400 pages. I then expanded it to 88 languages and reran it, and it replaced another 39,732 raw links on 15,704 pages. I then did a postprocessing run that should have gotten all or nearly all of the false positives (it found about 800 potential false positives, which I checked by hand and fixed as necessary). With a bit more work I could probably get it working on all 7,000+ languages but I think I'm reaching the point of diminishing returns. Note that I purposely didn't do English (because I'm not sure whether it's universally agreed to replace raw English links with templated links), also Chinese, Japanese, and Korean (because I'm not sure whether {{l}} is appropriate for them or if there are language-specific variants that should be used instead). If you have the answer to my uncertainty for any of these four languages, please let me know. Benwing2 (talk) 06:12, 22 September 2019 (UTC)
BTW most of the false positives were due to badly formatted entries in Finnish, Esperanto or Icelandic; not sure why these languages in particular were offenders. Benwing2 (talk) 06:17, 22 September 2019 (UTC)
@Benwing2: Brilliant work done. Thank you for running the script. For English entries, I think this 2016 vote indicates that most editors are in favor of converting raw links to {{l}}. Perhaps you can run the script to convert raw English links under "alternative forms" and "see also" to {{alter}} and {{l}} respectively, because these are the main offenders (Remember to exclude Thesaurus links and Category links from the script).
  • For Chinese, raw links under "Synonyms, Antonyms, Related terms, See also" use {{zh-l}} while those under "Derived terms" use {{zh-der|term1|term2|term3|...}}.
  • For Japanese, all raw links will need to be fixed by hand because the kana or transliteration must be provided. The correct format can be either {{ja-l|term}} or {{m|ja|term|tr=}} or {{ja-r|kanji mixed-script|kana}}.
Because all of these will need to be manually fixed, can you list out the affected entries on a separate page?
Thank you very much for dealing with this matter. KevinUp (talk) 20:58, 22 September 2019 (UTC)
@KevinUp Thanks for the info! I will implement it. Meanwhile, I wrote a script to replace raw descendant links with {{desc}}. There was a partial run by User:MewBot previously to do this but it seems to have missed some spots. My bot found 37,978 replaceable cases on 19,760 pages. I haven't done the final saving run yet. The logic is approximately as follows:
  1. Look for things like * → Lithuanian: {{l|lt|Akvilė}} or ** Occitan: [[agla]] or some mixture of raw and templated links.
  2. Replace the raw links with templated links, based on the language code of a templated link on the same line, or (if there are no such templated links) on the language code corresponding to the language name, if the name can be uniquely mapped to a regular or etymology language based on canonical names.
  3. Replace the combination of name + first templated link with {{desc}} if the language name and template code match (either exactly in the case of regular languages, or by matching the regular-language parent in the case of etymology languages). If the character → occurs before the language name, remove it and add |bor=1 to {{desc}}.
  4. There's also a rule that operates before all others to special-case Serbo-Croatian, which typically lists both Cyrillic and Latin versions and needs to use {{desc|...|sclb=1}}.
It issued 6,796 warnings, of which 3,053 were due to an unrecognized language name; 1,406 were due to a language name that was recognized but not the canonical name of any language; 2,084 were due to mismatch between language name and code; 75 were due to there being templated links with different language codes on the same line; 59 were due to a language name that was the canonical name of both a regular and an etymology language (oops); and a few other warnings occurred due to strangenesses in raw links (mismatching two-part links, non-convertible links like [[nouvelle#Noun|nouvelle]], [[w:Scipionyx|Scipionyx]] or [[#Etymology 3|renna]]).
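For readers trying to follow the logic, steps 1 to 3 and two of the warning types can be sketched in Python roughly as below. This is an illustration, not the bot's code: the name table, regexes and function name are assumptions, etymology languages and the Serbo-Croatian special case (step 4) are left out, and only simple one-part raw links are converted.

```python
import re

# Hypothetical stand-in for canonical-name lookup in the language modules.
NAME_TO_CODE = {"Lithuanian": "lt", "Occitan": "oc", "French": "fr"}

# Step 1: bullets, optional borrowing arrow, language name, colon, the links.
LINE = re.compile(r"^(\*+) (→ )?([^:{}\[\]]+): (.*)$")
RAW_LINK = re.compile(r"\[\[([^\[\]|#:]+)\]\]")
TEMPLATED = re.compile(r"\{\{l\|([^|{}]+)\|([^{}]*)\}\}")


def convert_descendant_line(line):
    """Return (new_line, warning); warning is None when nothing needs review."""
    m = LINE.match(line)
    if m is None:
        return line, None
    stars, arrow, name, rest = m.groups()
    # Step 2: take the code from a templated link on the line, else from the name.
    found = TEMPLATED.search(rest)
    code = found.group(1) if found else NAME_TO_CODE.get(name)
    if code is None:
        return line, "unrecognized language name: " + name
    rest = RAW_LINK.sub(lambda r: "{{l|%s|%s}}" % (code, r.group(1)), rest)
    # Step 3: fold name + first link into {{desc}} only if name and code agree.
    if NAME_TO_CODE.get(name) != code:
        kept = "%s %s%s: %s" % (stars, arrow or "", name, rest)
        return kept, "mismatch: %s != %s" % (name, code)
    first = TEMPLATED.search(rest)
    if first is None or first.start() != 0:
        return line, None
    bor = "|bor=1" if arrow else ""
    desc = "{{desc|%s|%s%s}}" % (code, first.group(2), bor)
    return "%s %s%s" % (stars, desc, rest[first.end():]), None
```

With this sketch, "* → Lithuanian: {{l|lt|Akvilė}}" becomes "* {{desc|lt|Akvilė|bor=1}}" and "** Occitan: [[agla]]" becomes "** {{desc|oc|agla}}", while a line such as "* French: {{l|en|foo}}" is kept and reported as a name/code mismatch.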
The top list of unrecognized language names is:
 241 Church Slavonic
 123 East Slavic
 118 Written Tibetan
 109 Northern
 105 Beijing
  63 Middle Latin
  62 Written Burmese
  61 Khalkha
  61 Eastern
  58 Eastern Yugur
  50 Orkhon
  50 Classical
  48 Guangzhou
  39 Southern
  37 Shanghai
  37 Sgaw
  33 Palaung
  31 Mnong
  29 Gallo-Latin
  27 Taiwan
  27 Dalian
  26 Syriac
  25 Western Malayo-Polynesian
  22 Russian Church Slavonic
  22 Finnic
  21 Northern Ryukyuan
  20 Faeroese
The top list of observed non-canonical language names is:
 270 Romansh
 194 Nynorsk
 137 Azeri
  81 Old Frankish
  43 Cuman
  36 Khorezmian
  35 East Frisian
  30 Uighur
  25 Meadow Mari
  25 Hill Mari
  20 Komi
  19 Croatian
  18 Nancowry
  18 Mari
  15 Malaccan Creole Portuguese
  14 Modern Greek
  12 Odia
  12 Languedocien
  12 Gascon
  11 Nogay
  11 Kurripako
  10 Official Aramaic
The top list of mismatches between name and code is:
 124 Middle Chinese:ltc!=zh
 106 Old Chinese:och!=zh
  45 Middle French:frm!=fr
  44 Old French:fro!=fr
  42 Low German:nds!=nds-de
  40 Arabic:ar!=xng
  38 Middle English:enm!=en
  35 Mamluk-Kipchak:trk-mmk!=qwm
  27 Old Spanish:osp!=es
  26 Chinese:zh!=xng
  25 French:fr!=en
  23 Solon:tuw-sol!=evn
  22 Norwegian:no!=nb
  22 Latin:la!=sh
  20 Spanish:es!=pt
  18 English:en!=enm
  17 Norwegian:no!=nn
  16 Portuguese:pt!=es
  16 Aromanian:rup!=ro
  15 West Frisian:fy!=ofs
  15 Old Portuguese:roa-opt!=pt
  14 Uyghur:ug!=xng
  14 Spanish:es!=en
  14 Galician:gl!=ga
We can probably special-case some of the most common unrecognized and non-canonical language names. In the mismatches, usually the language name is correct but sometimes it appears to be the other way around, e.g. nb and nn are more correct than "Norwegian", and some cases may need to be handled manually.
The mismatches involving language code xng may be script names; an example is Reconstruction:Proto-Mongolic/temexen. Can you take a look at that and let me know what's going on? It may be possible to handle these script names using {{desc|...|sclb=1}} similar to Serbo-Croatian above, but I don't understand what "Armenian" is doing in the list. Thanks! Benwing2 (talk) 23:35, 22 September 2019 (UTC)
@KevinUp I went through the top 99 cases of mismatches between lang name and code, and decided what to do, i.e. one of (a) go with the language name and change the code to match (the majority of cases); (b) go with the code and change the language name to match; (c) leave alone. One thing I did was change my script to "correct" lang code "zh" used for Old Chinese, Middle Chinese and Cantonese to the appropriate code for that language (och, ltc, yue). An example that has all three plus modern Mandarin is Reconstruction:Proto-Sino-Tibetan/(s/z)a-j. However, I see now this may be controversial, and the links to code "zh" may be intentional, because all these languages are merged under the "Chinese" header. If "zh" is correct, then unfortunately {{desc}} can't be used without additional support for cases like this. Can you comment? Thanks! Benwing2 (talk) 05:16, 23 September 2019 (UTC)
@Benwing2: Regarding unrecognized language names, I am able to propose the following fixes: KevinUp (talk) 22:42, 23 September 2019 (UTC)
These language names are problematic:
Regarding Reconstruction:Proto-Mongolic/temexen, I've edited the entry at Special:Diff/53606330/54274360 and updated the documentation at Wiktionary:About Proto-Mongolic#Descendants to reflect the language codes.
"Uyghur", "Arabic", "Armenian", "Chinese" are transliterations of Category:Middle Mongolian language using the respective scripts so I can propose the following changes:
  • ** Uyghur: {{l|xng| -> ** Uyghur: {{l|xng|sc=Mong|
  • ** Arabic: {{l|xng| -> ** Arabic: {{l|xng|sc=Arab|
  • ** Armenian: {{l|xcl| -> ** Middle Armenian: {{l|xng|sc=Armn|
  • ** Chinese: {{l|xng| -> ** Chinese: {{l|xng|sc=Hant|
Regarding Reconstruction:Proto-Sino-Tibetan/(s/z)a-j, the following languages encountered in Category:Proto-Sino-Tibetan lemmas can be removed because they are treated as the same language (unified Chinese) with different readings that are provided in the pronunciation section of Chinese entries:
  1. Modern Mandarin / Mandarin / Beijing / Sichuanese / {{desc|cmn}}
  2. Yue / Cantonese / Guangzhou {{desc|yue}}
  3. Min / Coastal Min / Inland Min / {{desc|zhx-min-pro}}
  4. Min Bei / {{desc|mnp}}
  5. Min Dong / {{desc|cdo}}
  6. Min Nan / Xiamen {{desc|nan}}
  7. Hokkien / {{desc|nan-hok}}
  8. Teochew / {{desc|zhx-teo}}
  9. Hakka / {{desc|hak}}
  10. Wu / {{desc|wuu}}
  11. Shanghai / Shanghainese / {{desc|wuu-sha}}
As for Middle Chinese, Old Chinese, Mandarin and Cantonese, the following convention has to be applied to all entries because the entries for [[TERM#Old Chinese]], [[TERM#Mandarin]], etc. do not exist.
  1. Middle Chinese -> {{desc|ltc|-}} {{ltc-l|term}}
  2. Old Chinese -> {{desc|och|-}} {{och-l|term}}
  3. Mandarin -> {{desc|cmn|-}} {{zh-l|term}}
  4. Cantonese -> {{desc|yue|-}} {{l|zh|term}}
The templates on the right will provide the transliteration if it is available. This edit is a proposed solution for Proto-Sino-Tibetan entries.
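The convention listed above can be sketched as a small lookup that builds the replacement wikitext. The codes and link templates are the ones given in the list; the function name and the placeholder term are hypothetical.

```python
# Lect name -> (code for {{desc}}, link template prefix), as in the list above.
CHINESE_CONVENTION = {
    "Middle Chinese": ("ltc", "ltc-l"),
    "Old Chinese": ("och", "och-l"),
    "Mandarin": ("cmn", "zh-l"),
    "Cantonese": ("yue", "l|zh"),
}


def chinese_descendant(lect: str, term: str) -> str:
    """Build '{{desc|CODE|-}} {{LINK|term}}' for one of the four lects."""
    code, link = CHINESE_CONVENTION[lect]
    return "{{desc|%s|-}} {{%s|%s}}" % (code, link, term)


# TERM is a placeholder for the Chinese headword.
print(chinese_descendant("Old Chinese", "TERM"))
print(chinese_descendant("Cantonese", "TERM"))
```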
There's a lot of work to be done, but I'm glad you're able to assist with this. Thank you very much for writing the scripts and running the bot. KevinUp (talk) 22:42, 23 September 2019 (UTC)
@KevinUp Thanks for your detailed post! Maybe you can help review my proposed changes to canonicalize non-canonical language names. I went through all languages that occurred more than (I think) 3 times, and came up with the following:
non_canonical_to_canonical_names = {
  "Romansh": "Romansch",
  "Nynorsk": "Norwegian Nynorsk",
  # Nynorsk: more specific than Norwegian
  "Azeri": "Azerbaijani",
  "Old Frankish": "Frankish",
  "Cuman": "Kipchak", # is this correct?
  "Khorezmian": "Khwarezmian",
  "East Frisian": "Saterland Frisian",
  "Uighur": "Uyghur",
  "Meadow Mari": "Eastern Mari",
  "Hill Mari": "Western Mari",
  "Komi": "Komi-Zyrian",
  # Croatian: ? map to Serbo-Croatian?
  # Nancowry: more specific than Central Nicobarese
  "Mari": "Eastern Mari",
  "Malaccan Creole Portuguese": "Kristang",
  "Modern Greek": "Greek",
  "Odia": "Oriya",
  # Languedocien: more specific than Occitan
  # Gascon: more specific than Occitan
  "Nogay": "Nogai",
  "Kurripako": "Curripaco",
  # Official Aramaic: ? more specific than Aramaic?
  "Southern Altay": "Southern Altai",
  "Ludic": "Ludian",
  # Sorani: ? map to Central Kurdish?
  "Sinhala": "Sinhalese",
  "Car": "Car Nicobarese",
  # Serbian: ? map to Serbo-Croatian?
  "Kurmanji": "Northern Kurdish",
  # Chakavian: more specific than Serbo-Croatian
  # Valencian: more specific than Catalan
  # Logudorese Sardinian: more specific than Sardinian
  # Campidanese: more specific than Sardinian
  "Awakatek": "Aguacateca",
  # Auvergnat: more specific than Occitan
  "Yukuna": "Yucuna",
  "West Greenlandic Pidgin": "Greenlandic Pidgin",
  # Walser: more specific than Alemannic German
  # Swiss German: more specific than German
  "Papiamento": "Papiamentu",
  "Low Saxon": "Low German",
  # Kinyarwanda: ? more specific than Rwanda-Rundi?
  # Kajkavian: more specific than Serbo-Croatian
  "Izhorian": "Ingrian",
  # Flemish: ? more specific than Dutch?
  "Belarussian": "Belarusian",
  "Sipakapa": "Sipakapense",
  # Ripuarian: ? more specific than Central Franconian?
  # Nuorese: more specific than Sardinian
  # Moselle Franconian: ? more specific than Central Franconian?
  # Logudorese: more specific than Sardinian
  "Inupiaq": "Inupiak",
  # Frisian: not same as West Frisian
  "Abkhazian": "Abkhaz",
  "Tangkhul": "Tangkhul Naga",
  # Siglitun: ? more specific than Inuktitut?
  "Salako": "Kendayan",
  "Proto-Sami": "Proto-Samic",
  # Poitevin: more specific than French
  "Old Uighur": "Old Uyghur",
  # Nunatsiavummiut: ? more specific than Inuktitut?
  "Khamnigan": "Khamnigan Mongol",
  # Inuinnaqtun: ? more specific than Inuktitut?
  "Ilokano": "Ilocano",
  "High German": "German",
  # Erzgebirgisch: more specific than East Central German
  # Bontok: not same as Central Bontoc
  "Bikol": "Bikol Central",
  "Balochi": "Baluchi",
  # Amuzgo: not same as Guerrero Amuzgo
}
Since you seem familiar with lots of languages, maybe you could review this? Some cases were obvious to me, e.g. Gascon, Languedocien, Auvergnat, etc. cannot be mapped to "Occitan" because they're more specific dialects even though they're listed as other names of Occitan. Conversely, Amuzgo probably cannot be mapped to Guerrero Amuzgo because it's less specific, even though "Amuzgo" is listed as another name of "Guerrero Amuzgo" and the Guerrero Amuzgo language code was consistently used; I imagine the use of the Guerrero Amuzgo code could be a mistake and would need manual checking. Other cases are less obvious; e.g. even though I'm familiar with the situation re. Serbo-Croatian, Serbian and Croatian, it's not completely obvious that mapping "Serbian" and "Croatian" to "Serbo-Croatian" is correct. Benwing2 (talk) 00:47, 24 September 2019 (UTC)
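One way to read the distinction drawn above is as a three-way lookup: rename when the name is a true synonym, flag for review when it is more or less specific than the canonical name, and otherwise leave it alone. The sketch below uses a short excerpt of the table; the NEEDS_REVIEW set and the function name are hypothetical and are not part of the actual bot script.

```python
from typing import Optional

# Excerpt of the mapping above; the full table would be used in practice.
NON_CANONICAL_TO_CANONICAL = {
    "Azeri": "Azerbaijani",
    "Uighur": "Uyghur",
    "Modern Greek": "Greek",
}

# Names described above as more specific (dialects of Occitan) or less
# specific (Amuzgo) than a canonical name; these need a human decision.
NEEDS_REVIEW = {"Gascon", "Languedocien", "Auvergnat", "Amuzgo"}


def canonicalize(name: str) -> Optional[str]:
    """Return the canonical name, or None when manual review is needed."""
    if name in NEEDS_REVIEW:
        return None
    # Unknown names are assumed to be canonical already and pass through.
    return NON_CANONICAL_TO_CANONICAL.get(name, name)


for sample in ("Azeri", "Gascon", "French"):
    print(sample, "->", canonicalize(sample))
```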
Also take a look at my changes to Reconstruction:Proto-Mongolic/temexen. You can use |sclb=1 with {{desc}} to have it list the script name instead of the language name. Note that Mongolian, Arabic and Armenian scripts (among others) can be autodetected very accurately, so you don't need to explicitly specify them. This doesn't work for Chinese characters; if you don't specify the script, it displays as "Unspecified". When you use |sc=Hant, you get "Traditional Han" instead of "Chinese"; hopefully this is OK. Also, what's with the Middle Armenian entry? Is this a transcription in Armenian characters or an actual borrowing into Middle Armenian? In the former case it should presumably be tagged as language xng; in the latter case it should be tagged with |bor=1 to indicate that it's a borrowing, which displays an arrow (→) before it. Benwing2 (talk) 01:04, 24 September 2019 (UTC)
I implemented your suggestions for the unrecognized languages but from your change to Reconstruction:Proto-Sino-Tibetan/(s/z)a-j it looks like Old Chinese, Middle Chinese, etc. need to be done by hand. I have a fast way of doing this; basically, (1) I dump all the pages I want to work on to a file, (2) I save a copy of the original file, (3) I then hand-edit the file, (4) I use a script to push the changes. It needs the original copy to make sure that no one else changed the page in the meantime. This allows you to work much more quickly, do search-and-replace operations across all pages, etc. This is probably a bit like AWB or JWB although it might be faster; not sure. I can try to make these subs myself but you might have to do them as you're more familiar with these languages. If so, I'll save the dumped pages in my userspace and let you edit them. Benwing2 (talk) 02:24, 24 September 2019 (UTC)
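The dump, hand-edit and push steps described above can be sketched as follows. This is only an illustration of the conflict check, not the actual script: plain dicts stand in for the wiki and for the two dump files, and all names are hypothetical.

```python
def push_changes(live, original, edited):
    """Push hand-edited pages, skipping any page changed since the dump.

    live:     dict of title -> current text on the wiki (stand-in for the site)
    original: dict of title -> text saved at dump time
    edited:   dict of title -> hand-edited text
    Returns (saved, skipped) lists of page titles.
    """
    saved, skipped = [], []
    for title, new_text in edited.items():
        if new_text == original[title]:
            continue  # page was not touched by hand, nothing to push
        if live.get(title) != original[title]:
            skipped.append(title)  # someone else edited it in the meantime
            continue
        live[title] = new_text
        saved.append(title)
    return saved, skipped


live = {"A": "old A", "B": "B edited by someone else", "C": "old C"}
original = {"A": "old A", "B": "old B", "C": "old C"}
edited = {"A": "new A", "B": "new B", "C": "old C"}
print(push_changes(live, original, edited))
```

Keeping the untouched copy of the dump is what makes the comparison possible: a page is only overwritten when the live text still equals the text that was dumped.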
@Benwing2: It took me a while, but here are my recommendations to canonicalize non-canonical language names: KevinUp (talk) 19:25, 24 September 2019 (UTC)
  1. "Romansh": "Romansch"
    Proceed to replace with {{desc|rm}}
  2. "Nynorsk": "Norwegian Nynorsk"
    Proceed to replace with {{desc|nn}}
  3. "Azeri": "Azerbaijani"
    Proceed to replace with {{desc|az}}
  4. "Old Frankish": "Frankish"
    Proceed to replace with {{desc|frk}}
  5. "Cuman": "Kipchak"
    Proceed to replace with {{desc|qwm}}
  6. "Khorezmian": "Khwarezmian"
    Proceed to replace with {{desc|xco}}
  7. "East Frisian": "Saterland Frisian"
    Proceed to replace with {{desc|stq}}
  8. "Uighur": "Uyghur"
    Proceed to replace with {{desc|ug}}
  9. "Meadow Mari": "Eastern Mari"
    Proceed to replace with {{desc|chm}}
  10. "Hill Mari": "Western Mari"
    Proceed to replace with {{desc|mrj}}
  11. "Komi": "Komi-Zyrian"
    Proceed to replace with {{desc|kpv}}
  12. "Croatian": ? map to Serbo-Croatian?
    Map to {{desc|sh}} and add {{q|[[:Category:Croatian Serbo-Croatian|Croatian]]}} at the end of the line.
  13. "Nancowry": more specific than Central Nicobarese
    Do not replace. Central Nicobarese appears to be a language group with three varieties including Nancowry.
  14. "Mari": "Eastern Mari"
    Do not replace. May refer to Eastern Mari or Western Mari.
  15. "Malaccan Creole Portuguese": "Kristang"
    Proceed to replace with {{desc|mcm}}
  16. "Modern Greek": "Greek"
    Proceed to replace with {{desc|el}}
  17. "Odia": "Oriya"
    Proceed to replace with {{desc|or}}
  18. "Languedocien": more specific than Occitan
    Map to {{desc|oc}} and add {{q|[[:Category:Languedocian Occitan|Languedocian]]}} at the end of the line.
  19. "Gascon": more specific than Occitan
    Map to {{desc|oc}} and add {{q|[[:Category:Gascon Occitan|Gascon]]}} at the end of the line.
  20. "Nogay": "Nogai"
    Proceed to replace with {{desc|nog}}
  21. "Kurripako": "Curripaco"
    Proceed to replace with {{desc|kpc}}
  22. "Official Aramaic": ? more specific than Aramaic?
    Map to {{desc|arc-imp}} (Category:Imperial Aramaic)
  23. "Southern Altay": "Southern Altai"
    Proceed to replace with {{desc|alt}}
  24. "Ludic": "Ludian"
    Proceed to replace with {{desc|lud}}
  25. "Sorani": ? map to Central Kurdish?
    Proceed to replace with {{desc|ckb}}. Sorani is the endonym of the language.
  26. "Sinhala": "Sinhalese"
    Proceed to replace with {{desc|si}}
  27. "Car": "Car Nicobarese"
    Proceed to replace with {{desc|caq}}
  28. "Serbian": ? map to Serbo-Croatian?
    Map to {{desc|sh}} and add {{q|[[:Category:Serbian Serbo-Croatian|Serbian]]}} at the end of the line.
  29. "Kurmanji": "Northern Kurdish"
    Proceed to replace with {{desc|kmr}}. Kurmanji is the endonym of the language.
  30. "Chakavian": more specific than Serbo-Croatian
    Map to {{desc|sh}} and add {{q|[[:Category:Chakavian Serbo-Croatian|Chakavian]]}} at the end of the line.
  31. "Valencian": more specific than Catalan
    Map to {{desc|ca}} and add {{q|[[:Category:Valencian Catalan|Valencian]]}} at the end of the line.
  32. "Logudorese Sardinian": more specific than Sardinian
    Map to [[Logudorese]] {{desc|sc}}
  33. "Campidanese": more specific than Sardinian
    Map to [[Campidanese]] {{desc|sc}}
  34. "Awakatek": "Aguacateca"
    Proceed to replace with {{desc|agu}}
  35. "Auvergnat": more specific than Occitan
    Map to {{desc|oc}} and add {{q|[[:Category:Auvergnese Occitan|Auvergnese]]}} at the end of the line.
  36. "Yukuna": "Yucuna"
    Proceed to replace with {{desc|ycn}}
  37. "West Greenlandic Pidgin": "Greenlandic Pidgin"
    Proceed to replace with {{desc|crp-gep}}
  38. "Walser": more specific than Alemannic German
    Map to {{desc|gsw}} (Category:Alemannic German language) and add {{q|[[:Category:Walser German|Walser]]}} at the end of the line.
  39. "Swiss German": more specific than German
    Map to {{desc|gsw}} (Category:Alemannic German language) and add {{q|[[:Category:Switzerland German|Switzerland]]}} at the end of the line.
  40. "Papiamento": "Papiamentu"
    Proceed to replace with {{desc|pap}}
  41. "Low Saxon": "Low German"
    Proceed to replace with {{desc|nds}}
  42. "Kinyarwanda": ? more specific than Rwanda-Rundi?
    Map to {{desc|rw}} and add {{q|[[:Category:Rwandan Rwanda-Rundi|Kinyarwanda]]}} at the end of the line.
  43. "Kajkavian": more specific than Serbo-Croatian
    Map to {{desc|sh}} and add {{q|[[:Category:Kajkavian Serbo-Croatian|Kajkavian]]}} at the end of the line.
  44. "Izhorian": "Ingrian"
    Proceed to replace with {{desc|izh}}
  45. "Flemish": ? more specific than Dutch?
    Map to {{desc|nl}} and add {{q|[[:Category:Belgian Dutch|Flemish]]}} at the end of the line.
  46. "Belarussian": "Belarusian"
    Proceed to replace with {{desc|be}}
  47. "Sipakapa": "Sipakapense"
    Proceed to replace with {{desc|qum}}
  48. "Ripuarian": ? more specific than Central Franconian?
    Map to {{desc|gmw-cfr}} and add {{q|[[:Category:Ripuarian Central Franconian|Ripuarian]]}} at the end of the line.
  49. "Nuorese": more specific than Sardinian
    Map to [[Nuorese]] {{desc|sc}}
  50. "Moselle Franconian": ? more specific than Central Franconian?
    Map to {{desc|gmw-cfr}} and add {{q|[[:Category:Moselle Central Franconian|Moselle]]}} at the end of the line.
  51. "Logudorese": more specific than Sardinian
    Map to [[Logudorese]] {{desc|sc}}
  52. "Inupiaq": "Inupiak"
    Proceed to replace with {{desc|ik}}
  53. "Frisian": not same as West Frisian
    Do not replace. Could be one of various languages in Category:Frisian languages.
  54. "Abkhazian": "Abkhaz"
    Proceed to replace with {{desc|ab}}
  55. "Tangkhul": "Tangkhul Naga"
    Proceed to replace with {{desc|nmf}}
  56. "Siglitun": ? more specific than Inuktitut?
    Do not replace. May require a language code. wikipedia:Siglitun indicates that this is a dialect of Inuvialuktun which is a sub-variety of Category:Inuktitut language.
  57. "Salako": "Kendayan"
    Proceed to replace with {{desc|knx}}
  58. "Proto-Sami": "Proto-Samic"
    Proceed to replace with {{desc|smi-pro}}
  59. "Poitevin": more specific than French
    Map to {{desc|roa-poi}} (Category:Poitevin-Saintongeais language)
  60. "Old Uighur": "Old Uyghur"
    Proceed to replace with {{desc|oui}}
  61. "Nunatsiavummiut": ? more specific than Inuktitut?
    Do not replace. May require a language code. wikipedia:Inuttitut indicates that Nunatsiavummiut is a dialect of Inuttitut which is a sub-variety of Category:Inuktitut language.
  62. "Khamnigan": "Khamnigan Mongol"
    Proceed to replace with {{desc|xgn-kha}}
  63. "Inuinnaqtun": ? more specific than Inuktitut?
    Do not replace. May require a language code. wikipedia:Inuinnaqtun indicates that this is a dialect of Inuvialuktun which is a sub-variety of Category:Inuktitut language.
  64. "Ilokano": "Ilocano"
    Proceed to replace with {{desc|ilo}}
  65. "High German": "German"
    Check individually. May refer to German ({{desc|de}}) or one of various High German languages.
  66. "Erzgebirgisch": more specific than East Central German
    Map to {{desc|gmw-ecg}} and add {{q|[[:Category:Erzgebirgisch East Central German|Erzgebirgisch]]}} (uncreated category) at the end of the line.
  67. "Bontok": not same as Central Bontoc
    Do not replace. Appears to be a language group (Bontoc language) with five dialects.
  68. "Bikol": "Bikol Central"
    Proceed to replace with {{desc|bcl}}
  69. "Balochi": "Baluchi"
    Proceed to replace with {{desc|bal}}
  70. "Amuzgo": not same as Guerrero Amuzgo
    Appears to be a language group with up to four varieties. May require a language code.
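For the "Map to ... and add {{q|...}}" items in the list above, the output line could be generated along these lines. The three entries are taken from the list; the function name and the placeholder term are hypothetical, and the sketch assumes the term is passed to {{desc}} as its second parameter.

```python
# Dialect label -> (parent language code, category name, displayed label),
# taken from items 19, 30 and 31 in the list above.
DIALECT_QUALIFIERS = {
    "Gascon": ("oc", "Gascon Occitan", "Gascon"),
    "Chakavian": ("sh", "Chakavian Serbo-Croatian", "Chakavian"),
    "Valencian": ("ca", "Valencian Catalan", "Valencian"),
}


def dialect_descendant(label: str, term: str) -> str:
    """Build a {{desc}} call for the parent language plus a {{q}} qualifier."""
    code, category, shown = DIALECT_QUALIFIERS[label]
    return "{{desc|%s|%s}} {{q|[[:Category:%s|%s]]}}" % (code, term, category, shown)


# TERM is a placeholder for the descendant word.
print(dialect_descendant("Gascon", "TERM"))
```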
Regarding Reconstruction:Proto-Mongolic/temexen, I've updated the Armenian transliteration as {{desc|xng||թաման|sc=Armn|sclb=1|tr=tʿaman}} (It's not an actual word in Middle Armenian, but a 13th century transliteration of Middle Mongolian using the Armenian script). I'm not sure why omitting sclb=1 gave an "Unspecified" error for the Armenian term.
Regarding Reconstruction:Proto-Sino-Tibetan/(s/z)a-j, I think I can track down Old Chinese and Middle Chinese descendants using insource:/\{\{desc\|och/ incategory:"Proto-Sino-Tibetan lemmas" after your bot has converted "Old Chinese" etc into {{desc|och}}. It will take some time for me to check and edit the IPA transliteration generated by {{och-l}} and {{ltc-l}}, so you don't have to dump the pages for me to edit as I can do it after you've run the script. KevinUp (talk) 19:25, 24 September 2019 (UTC)
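The same pattern as the insource: search above can be run offline over dumped page text. This is a rough sketch: the page titles and contents below are made up, and the incategory: filter is not modelled, so it assumes the input already contains only Proto-Sino-Tibetan lemma pages.

```python
import re

# Same pattern as in the insource: search, written as a Python regex.
OCH_DESC = re.compile(r"\{\{desc\|och")


def pages_with_och(pages):
    """pages: dict of title -> wikitext. Return sorted titles containing {{desc|och."""
    return sorted(title for title, text in pages.items() if OCH_DESC.search(text))


# Hypothetical page texts for illustration only.
sample_pages = {
    "Page 1": "* {{desc|och|-}} {{och-l|TERM}}",
    "Page 2": "* {{desc|ltc|-}} {{ltc-l|TERM}}",
}
print(pages_with_och(sample_pages))
```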
Rather than using {{q}}, it would be better to create etymology language codes for the dialects and put those codes in {{desc}}. Then the links in those dialects can be easily located. — Eru·tuon 19:57, 24 September 2019 (UTC)
When you're done looking at descendants I imagine most of the same considerations would apply to translation sections. DTLHS (talk) 20:44, 24 September 2019 (UTC)
@KevinUp Thank you very much! As for the Proto-Sino-Tibetan pages, many of them have manually-specified Baxter-Sagart and Zhengzhang transliterations following the Old Chinese, and sometimes have |ts= params for the Middle Chinese. An example with both is Reconstruction:Proto-Sino-Tibetan/nja-ŋ/k, where |ts= appears to contain Zhengzhang's version and |tr= appears to contain Baxter's version. Currently |ts= isn't supported in {{zh-l}} or variants. I think what I'll do is add support for |ts= to them, which will allow me to convert the links, and leave alone the manually-specified Old Chinese transliterations and such; you'll have to edit them afterwards. Benwing2 (talk) 01:03, 25 September 2019 (UTC)
@DTLHS Good point about translations, I'll look into them afterwards. Benwing2 (talk) 01:04, 25 September 2019 (UTC)
@Erutuon I agree with you about adding etymology languages for these variants. Benwing2 (talk) 01:05, 25 September 2019 (UTC)
@Benwing2: I've updated the entry for Reconstruction:Proto-Sino-Tibetan/nja-ŋ/k at Special:Diff/50520820/54285945. For Old Chinese, I've rearranged the order to display Zhengzhang's transliteration first using {{och-l}} followed by Baxter-Sagart's transliteration which is manually specified.
I'm not sure why |ts= and |tr= are used for Middle Chinese, but they can be replaced by {{ltc-l}} instead. (Different linguists have different reconstructions; other reconstructions can be viewed at the Chinese entry.)
So for Proto-Sino-Tibetan pages, the Middle Chinese and Old Chinese transliterations will need to be manually converted (some characters have multiple readings) while all the other lects (Mandarin, Cantonese, Min Nan, etc) can be removed. KevinUp (talk) 07:50, 25 September 2019 (UTC)
@KevinUp Thanks. I did a run last night to add {{desc}}, incorporating your various suggestions on mapping non-canonical language names. It changed about 35,000 entries on about 20,000 pages. It didn't make any changes involving adding {{q}} tags and such, other than "Written" in the case of Tibetan and Burmese; these await the discussion below on new etymology languages. I also didn't have it change Proto-Sino-Tibetan entries with Middle Chinese, Old Chinese or Cantonese, or remove the various modern lects from those pages. If you're sure about these changes, I'll do a separate run to fix up the Proto-Sino-Tibetan pages. Benwing2 (talk) 15:17, 25 September 2019 (UTC)
@KevinUp, Mahagaja Also, I notice that the naming of the Proto-Sino-Tibetan pages isn't consistent. For example, we have Reconstruction:Proto-Sino-Tibetan/g-tam ~ g-dam with a tilde, but Reconstruction:Proto-Sino-Tibetan/s-b/m-ruːl with a slash, and Reconstruction:Proto-Sino-Tibetan/na-(n/t) with both slash and parens. The pages Reconstruction:Proto-Sino-Tibetan/(s/r)-ma(ŋ/k) and Reconstruction:Proto-Sino-Tibetan/s/r-m(u/i/ja)l have the same prefix alternation, but one uses parens for it and the other doesn't. We should agree on a standard and rename the pages appropriately. Benwing2 (talk) 15:22, 25 September 2019 (UTC)
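To get an overview before agreeing on a standard, the page titles could be grouped by the alternation notation they use. The sketch below is one hypothetical way to classify them; the style labels are made up for illustration, and the titles are the ones quoted above.

```python
def alternation_styles(title: str) -> set:
    """Report which alternation notations a reconstruction title uses."""
    # Drop the namespace and language prefix so its slash is not counted.
    root = title.rsplit("Proto-Sino-Tibetan/", 1)[-1]
    styles = set()
    depth = 0
    for ch in root:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif ch == "~":
            styles.add("tilde")
        elif ch == "/":
            styles.add("slash in parens" if depth else "bare slash")
    return styles


for t in (
    "Reconstruction:Proto-Sino-Tibetan/g-tam ~ g-dam",
    "Reconstruction:Proto-Sino-Tibetan/s-b/m-ruːl",
    "Reconstruction:Proto-Sino-Tibetan/(s/r)-ma(ŋ/k)",
    "Reconstruction:Proto-Sino-Tibetan/s/r-m(u/i/ja)l",
):
    print(t, sorted(alternation_styles(t)))
```

The last two titles come out differently even though they share the same prefix alternation, which is the inconsistency described above.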
I think most of those sit-pro reconstruction pages are really User:Wyang's babies; unfortunately he seems to have left the project in a huff again. —Mahāgaja · talk 15:55, 25 September 2019 (UTC)

Implementing the ISO 639-6 code for the Hachijō language, hhjm, into Wiktionary

How would we go about implementing the ISO 639-6 code for the Hachijō language, hhjm, into Wiktionary?

We don't use four-letter codes; it would be something like und-hjm. DTLHS (talk) 02:17, 20 September 2019 (UTC)
According to the Wikipedia article, it's generally considered a dialect of Japanese. If we need something besides dialect marking, it'd be ja-hjm or something. I'd prefer if we used something consistent with w:IETF language tags; that is ja-hachijo (6-8 letter extension name that I'd be happy to try and register officially.)--Prosfilaes (talk) 12:42, 20 September 2019 (UTC)
@Prosfilaes: Sorry for the late reply. Okinawan is also generally considered a dialect despite not being very mutually intelligible. Yet we have most if not all the Ryukyuan languages in Wiktionary. One of those codes would work too I think. MiguelX413 (talk) 21:43, 22 September 2019 (UTC)
To judge from the Wikipedia article, it doesn't seem like a good idea to give it a code that marks it as a dialect of Japanese, nor of any of the Ryukyuan languages. I'd recommend jpx-hcj instead, using the code for the Japonic language family rather than the code for Japanese specifically. —Mahāgaja · talk 09:03, 27 September 2019 (UTC)

Creating new entries with no definition

We have the rfdef template that allows us to add a sense line with no actual definition, so that <s>Kiwima</s> another user who knows the meaning can fill it in. (This is perennially abused by Wonderfool for phrases he has encountered in sports journalism.)

I have a pretty large list of "good" words that are definitely CFI-attestable but whose meaning I can't work out, usually because it's very specialised (particle physics etc.) though some might be regionalisms or what not. I'm tempted to start creating entries for these, since some info can be given (part of speech, pronunciation, 3 citations, etc.) even without a definition — and perhaps users are more likely to work on them than they would be with a big list of red links.

Would people approve of this or object? Equinox 14:46, 21 September 2019 (UTC)

  Support Of course you can do as much as you know, and you cannot be blamed for being lazy if defining the term requires special knowledge or a clear mind you do not have. If you leave requests, it is inviting (especially if the request categories won’t be hidden). And English coverage is probably at the point where definitions become harder for that reason. Fay Freak (talk) 15:00, 21 September 2019 (UTC)
  Support I reserve the right to change my mind if this is somehow abused. I have done this a few times, but usually because I wanted to look something up, because the citations I found didn't support the first definition that I had added, or because I was interrupted before finishing. DCDuring (talk) 19:04, 21 September 2019 (UTC)
And then there are all those entries which require "a clear mind [I may] not have", at least at the time. DCDuring (talk) 18:07, 23 September 2019 (UTC)
I sometimes create Finnish entries with rfdef, mostly for the purpose of me filling them in later. I remember one of the first larger projects I took part in was filling in a few hundred Finnish rfdefs to empty that category, so this is certainly something that has been done before. — surjection?〉 17:34, 23 September 2019 (UTC)
Okay, I am going to do this with some of my lists to keep the size down. I've just created panspot, slickspot, saccharoid (noun), hydropump, exotomous, brachial (noun), eschatologism. 2+ citations for each. Equinox 12:02, 24 September 2019 (UTC)
Why not put in some external links to help whoever might follow up: {{R:Century 1911}} (esp. for older terms), {{R:OneLook}}, {{pedia}}, {{comcatlite}}, any others that seem appropriate. DCDuring (talk) 13:36, 24 September 2019 (UTC)

Getting rid of dialects from "other names" in Module:languages/data2, etc.

I would like to either delete dialects from the "other names" section of languages (e.g. other name "Italian Walser" for language "Alemannic German") or move them to a separate "dialects" section. IMO, "other names" should only contain synonyms for the language (e.g. Farsi = Persian, High German = German, Slovenian = Slovene, Serbo-Croat = Serbo-Croatian, Daco-Romanian = Romanian, etc.). Otherwise, certain bot jobs get much harder. Any objections? Benwing2 (talk) 08:49, 22 September 2019 (UTC)

A dialects field seems a better approach than just deleting dialect names. There may be hairy issues, like of which language a given dialect in a dialect continuum is a dialect. But there is no hard rule that the dialects of different languages form disjoint sets. (The current coding also allows a language to have several ancestors, like for Saramaccan). For flexibility, I can envisage a scheme in which dialects are treated on the same footing as languages, with their own codes, which then would require a field status, with values like "language" and "dialect" but also allowing a later extension to "language family". And for recording synchronic relationships there should then be fields like parents and members, where the latter could replace dialects.  --Lambiam 10:49, 22 September 2019 (UTC)
Setting aside possible future schemes, if we just add a field now it should probably be lects rather than dialects, as lots of the included varieties aren’t (strictly speaking) dialects. E.g. for Serbo-Croatian the various national divisions are standardized registers based on one and the same dialect. — Vorziblix (talk · contribs) 19:28, 22 September 2019 (UTC)
Yes, those should go into a new field in the language data. The reason why they shouldn’t be removed is that sometimes it is hard to find how a language is named on Wiktionary because the spelling of the language name varies. Having synonymous or in this case often used pars pro toto names there helps to find the language category. Fay Freak (talk) 13:05, 22 September 2019 (UTC)
I agree with Fay Freak that it's useful to have dialect names somewhere in the language data modules. That allows you to search in Category:Language data modules to figure out which language code or language name Wiktionary uses. If possible, the dialect names should be given codes in Module:etymology languages/data and moved there. Then they can have their own otherNames fields. For instance, in the otherNames of Alemannic German, Walser German, Walserdeutsch, and Wallisertiitsch probably refer to the same dialect, so if an etymology language were created, one of them would be the canonical name (probably Walser German) and the others would be in its otherNames field.
Unfortunately not everyone understands how Module:etymology languages/data works, or knows that they can use the search box in Category:Language data modules (and the search results aren't very user-friendly anyway, especially if you don't understand Lua code), so a language-searching gadget might be nice. This is a start, but it only searches codes and canonical names of regular languages in submodules of Module:languages. It should also be able to search otherNames and the names of etymology languages. — Eru·tuon 19:59, 22 September 2019 (UTC)
Dialect names are also kept in Module:hy:Dialects and similar. We should synchronize all these lists. --Vahag (talk) 08:35, 23 September 2019 (UTC)
OK, I'm thinking of splitting otherNames into two new fields aliases (true alternative names) and dialects, and updating the Lua code accordingly so that code that currently looks at otherNames is changed to look at both aliases and otherNames. (There are only a few spots that access the field directly and I'm pretty sure I can find them all.) The advantage of using a new field is that it's clear when a given language has been audited. How does this sound? Benwing2 (talk) 01:38, 24 September 2019 (UTC)
Maybe I should use lects instead of dialects, as User:Vorziblix proposes. Benwing2 (talk) 01:40, 24 September 2019 (UTC)
I guess I'd also be in favor of lects, because there may be sociolects or chronolects or other odd things represented. — Eru·tuon 02:28, 24 September 2019 (UTC)
If (dia)lects were moved to a new field, it should be done by those familiar with those languages, not just some mass migration. There's a lot of gray that needs navigating, like when a dialect name is a common stand-in for the standard name. --{{victar|talk}} 05:56, 24 September 2019 (UTC)
I just came across some languages such as Category:Inuktitut language with lects that are incorrectly placed under "Other names". I would support splitting otherNames into aliases and lects, and of course, this has to be done manually, not done by mass migration to prevent errors. KevinUp (talk) 19:37, 24 September 2019 (UTC)
@Victar I agree that there are gray areas. My plan was to move stuff where it's clear (e.g. Uighur is clearly an alias of Uyghur; Auvergnat, Gascon, Languedocien are clearly dialects of Occitan, not aliases) and leave the remainder alone. This can be done bit-by-bit. Benwing2 (talk) 02:37, 25 September 2019 (UTC)
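The proposed split can be illustrated with a few simplified records. This is a sketch in Python rather than the Lua of the real data modules; the record layout, the canonicalName key and the helper names are assumptions made for illustration, while aliases, lects and otherNames are the field names discussed above.

```python
# Hypothetical, simplified language records (the real data lives in Lua modules).
LANGUAGES = {
    # Audited: otherNames has been split into aliases and lects.
    "ug": {"canonicalName": "Uyghur", "aliases": ["Uighur"]},
    "oc": {"canonicalName": "Occitan", "aliases": [],
           "lects": ["Auvergnat", "Gascon", "Languedocien"]},
    # Not yet audited: still uses the old otherNames field.
    "de": {"canonicalName": "German", "otherNames": ["High German"]},
}


def find_code(name):
    """Match canonical names and synonyms; lects are deliberately not matched."""
    for code, data in LANGUAGES.items():
        names = [data["canonicalName"]]
        names += data.get("aliases", [])
        names += data.get("otherNames", [])
        if name in names:
            return code
    return None


def is_audited(code):
    """A record counts as audited once it no longer has an otherNames field."""
    return "otherNames" not in LANGUAGES[code]


print(find_code("Uighur"), find_code("Gascon"), is_audited("de"))
```

Looking at both aliases and otherNames lets audited and unaudited records coexist while the migration proceeds bit by bit.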

Etymologies: categorization vs redundancy

In my opinion, etymologies should be as succinct as possible while still giving enough information to be useful. For instance: this etymology of French bedeau has so much information it could be confusing to a reader. This one ignores the term's Germanic origins, which are interesting and useful to me at least. And this one conveys the right amount of information in my opinion; however, its categorization is not exhaustive.

A compromise could be to have etymology categorization without it being stated explicitly in the written etymology. That still doesn't completely solve the "how much etymology" dilemma (do we want to still categorize Cebuano sin as being from Old High German? or Icelandic skunkur from Proto-Algonquian?), but it'd be easier to write concise etymologies without sacrificing categories.

I have a (very) rudimentary outline of what a categorization template might look like here: User:Julia/etycat. Let me know if this is something that people would be on board with, or if anyone has other/better ideas about how we can improve our etymologies. Julia 15:07, 22 September 2019 (UTC)

Might comment more on this later; I probably prefer etymologies which are decently detailed (e.g. Middle English/High German/Low German/Dutch etc. should at least go back to Proto-Germanic), though I ultimately don't care too much about how it's done as long as categories are kept. One solution would be to have a template which allows users to "expand" the etymology (thus showing the more distant etymology of the respective word) in the manner of Template:grc-IPA (can possibly explain in further depth if you or others don't understand/misunderstand what I mean). Hazarasp (parlement · werkis) 10:46, 24 September 2019 (UTC)
Certainly, the longer of the two can be cleaned up a bit, and the cognates can be removed, but I do not think we are doing the etymology full justice, as the form of the word comes from *bidilaz but the sense comes from *budilaz, so it is really a merged term. The idea of a categorisation template is great, but I fear that in the end that would just add more work and upkeep. Having the ability to kill 2 birds with one stone (adding the etymology AND categorising the term at the same time) is a major plus. Leasnam (talk) 17:48, 27 September 2019 (UTC)
My preference would be to have an expandable etym with as much information as possible. You don't always know what aspect of the etymology will be of interest to the reader (so there would be disagreement about where to draw the line), and it would be annoying to have to click through four different pages to get the entire history of the word. Andrew Sheedy (talk) 20:50, 27 September 2019 (UTC)
@Andrew Sheedy, that is a very good point too. We do this already on many etymology sections. Leasnam (talk) 22:21, 27 September 2019 (UTC)

Synophone reference in a Wiktionary definition.

Homophones, synophones, synographs, & homographs for disambiguation or clarification of unexpected, unexplained, or just subtle, aural or written ambiguity (perhaps treated similarly to rhymes or anagrams).

E.g.: kefir (fermented drink) vs. keffer (black beetle, or derogatory racial epithet à la "N-word"). When pronounced carefully the two differ, but they are often mispronounced the same.

Is there a standard format for including references to such potential confusions, for disambiguation, when adding a definition?

Wikidity (talk) 21:14, 23 September 2019 (UTC)

One approach, seen at complementary and principal, is to use the Usage notes section for drawing attention to common confusion.  --Lambiam 15:46, 24 September 2019 (UTC)
There’s also some information about how we handle these at Wiktionary:Entry_layout#Pronunciation; see the fifth bullet point about homophones there. Unrelatedly, the beetle is chafer in English (as against e.g. Dutch kever), is not necessarily black, and doesn’t have any etymological relation with kaffir (the religious/racial epithet). — Vorziblix (talk · contribs) 16:24, 24 September 2019 (UTC)

New etymology languages

@Erutuon, KevinUp User:Erutuon suggested creating etymology languages for varieties that are listed at least somewhat frequently in Descendants sections. Following are my suggestions for etymology languages for these variants:

  1. Serbian, Croatian: sh-ser, sh-cro (Are we sure we want to do this? I can see this being abused.)
  2. Chakavian, Kajkavian, Torlakian: sh-cha, sh-kaj, sh-tor
  3. Provençal, Auvergnat, Gascon, Languedocien, Limousin, Vivaro-Alpine, Judeo-Occitan: oc-pro, oc-auv, oc-gas, oc-lan, oc-lim, oc-viv, oc-jud
  4. Nancowry, Camorta, Katchal (per Wikipedia these are three separate languages): ncb-nan, ncb-cam, ncb-kat
  5. Valencian: ca-val
  6. Logudorese, Campidanese: sc-src or just src, sc-sro or just sro (the shorter forms are ISO 639-3 codes)
    • Nuorese: sc-nuo (this is a conservative variety of Logudorese)
  7. Walser German: gsw-wae or just wae (the latter is an ISO 639-3 code)
  8. Swiss German: User:KevinUp suggested just mapping this to Alemannic German. Are we sure we want to do that?
    My mistake. It ought to be Category:Switzerland Alemannic German instead. KevinUp (talk) 17:27, 25 September 2019 (UTC)
  9. Kinyarwanda, Kirundi/Rundi: rw-kin or just kin, rw-run or just run (the shorter forms are ISO 639-3 codes)
  10. Flemish: nl-fle (or nl-vla?) Is this a linguistic (rather than geographic) entity at all?
    • Glottolog subdivides Flemish: Antverpian (the dialect of the city of Antwerp), French Flemish, West Flemish, East Flemish and Limburgish. We could potentially assign codes nl-ant (Antverpian), nl-fre (French Flemish), nl-eas (East Flemish); the other two already have codes.
  11. Ripuarian, Moselle Franconian: gmw-cfr-rip, gmw-cfr-mos (or are gmw-rip/gmw-mos better?); note also that Luxembourgish (code lb) is a subvariety of Moselle Franconian.
  12. Siglitun, Natsilingmiutut (also Netsilik, Natsilik, Nattilik, Netsilingmiut, Natsilingmiutut, Nattilingmiutut, Nattiliŋmiutut), Inuinnaqtun: These are dialects of Inuvialuktun, which appears to be a language grouped under the Inuktitut macro-language. I don't think I have enough knowledge here to do justice to Inuit varieties.
  13. Thuringian, Upper Saxon German, Erzgebirgisch, Lusatian-New Marchian (or Lausitzisch-Neumärkisch): I feel on uncertain ground here, but we could assign codes gmw-ecg-thu, gmw-ecg-upp, gmw-ecg-erz, gmw-ecg-lus (or shorter forms gmw-thu, gmw-upp, gmw-erz, gmw-lus).

Benwing2 (talk) 02:35, 25 September 2019 (UTC)

  • For Kajkavian, there's the ISO 639-3 code kjv. — Eru·tuon 02:58, 25 September 2019 (UTC)
    Yeah, but the analogy sh-cha, sh-kaj, sh-tor is good.
  • Serbian, Croatian: Better not. The mere presence of these as langcodes again reinforces the impression that they are different languages, an impression which the political climate has inculcated and which reflects borders only 30 years old. If I really want to distinguish, I write it out. Additionally, assignments to only one of the countries are often tentative. I can have a hard time saying which region of Germany a term is used in, and we wouldn’t create codes for “Hesse German”, “Lower Saxony German”, etc. (although these are separate states). While okay in labels, for linking there is less room for uncertainty.
Agreed that having Serbian and Croatian is probably a bad idea. I’m going through some occurrences of these in Descendants and Etymology sections, and almost all are just being used as a synonym for Serbo-Croatian, i.e. they aren’t even characteristic of one national standard or another. Even in the rare cases where we are dealing with a word characteristic of one of these standards, it’s almost never exclusive to that standard — all the more so when we factor in the other two standards (Bosnian and Montenegrin). In sum, the vanishingly small number of cases when these might come in handy as etymology languages is IMO far outweighed by their potential for abuse, as is evidenced by the fact that most preexisting occurrences fell under the latter rather than the former. — Vorziblix (talk · contribs) 15:57, 25 September 2019 (UTC)
Swiss German: Confusing, there is also a “Swiss High German”, only used in but a single entry, English kepi. Apparently to mean Swiss Standard German, while there is “Austrian German” to mean Austrian Standard German.
11–13: Can’t imagine people using them systematically and the dialect areas are uncertain in the borders and vary between maps but okay. Don’t know though why you left out Silesian and High Prussian. There is extensive literature in both, not only on both. If we want a dialectologist to come we need to have all them codes. Fay Freak (talk) 04:55, 25 September 2019 (UTC)
  • Regarding Flemish, the term is used in two different senses, which is confusing. Colloquially, it is used for the variant of standard Dutch spoken in Belgium. It is also the name of one of the Dutch dialect groups, which is geographically confined to the historical County of Flanders, an area that today includes French Flanders (a tiny part of France) and Zeelandic Flanders (in the Netherlands). It can be split further into West Flemish and East Flemish. The dialect spoken in Bruges is West Flemish, while that in Ghent is East Flemish. Another Dutch dialect group spoken on both sides of the Dutch–Belgian border is Brabantian; Antverpian is in the Brabantian group. If native Dutch speakers in Brussels speak standard Dutch, they speak Flemish in the colloquial sense, but they are not Flemish speakers in the dialect sense: their dialect is Brabantian. For more, see Dutch dialects. A variegated picture is offered by File:Languages Benelux.PNG.  --Lambiam 05:57, 25 September 2019 (UTC)
@Fay Freak, Lambiam Thanks for your comments. I agree about Serbian and Croatian. I didn't intentionally leave out Silesian or High Prussian, I just looked up w:East Central German and copied the four subdivisions listed in the box on the right. I see that the article itself divides things a bit differently and does include Silesian and High Prussian. Based on both of your comments, I'm going to skip etymology languages for Dutch dialects, East Central German dialects and Inuktitut dialects as well as "Swiss German" as I don't have the background to do them justice. Should I also skip adding entries for Ripuarian and Moselle Franconian? I'm not sure if this entry was being referred to. Benwing2 (talk) 15:08, 25 September 2019 (UTC)
@Benwing2 I've compiled your suggestions into the following table. I made some amendments by using 6 characters instead of 9 characters for the proposed language codes. Of these, I find the Dutch varieties to be most inconsistent. It seems odd for East Flemish to be made an etymology language whereas West Flemish is a full language. KevinUp (talk) 17:27, 25 September 2019 (UTC)
| Regional variant | Proposed code | Status | Parent language | Language code | Remarks |
|---|---|---|---|---|---|
| Serbian | sh-ser | ... | Serbo-Croatian | sh | Not recommended. |
| Croatian | sh-cro | ... | Serbo-Croatian | sh | Not recommended. |
| Chakavian | sh-cha | Done | Serbo-Croatian | sh | |
| Kajkavian | sh-kaj / kjv | Done | Serbo-Croatian | sh | ISO 639-3 code: kjv |
| Torlakian | sh-tor | Done | Serbo-Croatian | sh | |
| Provençal | oc-pro / prv | Done | Occitan | oc | |
| Auvergnat | oc-auv | Done | Occitan | oc | Auvergnat or Auvergnese? |
| Gascon | oc-gas | Done | Occitan | oc | |
| Languedocian | oc-lan | Done | Occitan | oc | |
| Limousin | oc-lim | Done | Occitan | oc | |
| Vivaro-Alpine | oc-viv | Done | Occitan | oc | |
| Aranese | oc-ara | Done | Occitan | oc | |
| Judeo-Occitan | oc-jud | Done | Occitan | oc | Also known as Shuadit |
| Nancowry | ncb-nan | Done | Central Nicobarese | ncb | |
| Camorta | ncb-cam | Done | Central Nicobarese | ncb | |
| Katchal | ncb-kat | Done | Central Nicobarese | ncb | |
| Valencian | ca-val | Done | Catalan | ca | |
| Logudorese | sc-src | Done | Sardinian | sc | |
| Campidanese | sc-sro | Done | Sardinian | sc | |
| Nuorese | sc-nuo | Done | Sardinian | sc | Conservative variety of Logudorese |
| Walser | gsw-wal | ... | Alemannic German | gsw | |
| Switzerland Alemannic German | gsw-swi | ... | Alemannic German | gsw | Not to be confused with Switzerland German |
| Kinyarwanda | rw-kin | Done | Rwanda-Rundi | rw | |
| Kirundi or Rundi | rw-run | Done | Rwanda-Rundi | rw | |
| Flemish | nl-vla | ... | Dutch | nl | Appears to be a dialect continuum. Same as Belgian Dutch? |
| Antwerpian | nl-ant | ... | Dutch | nl | |
| French Flemish | nl-fre | ... | Dutch | nl | Appears to be a dialect of West Flemish. |
| East Flemish | nl-eas | ... | Dutch | nl | |
| Brabantian | nl-bra | ... | Dutch | nl | |
| Ripuarian | gmw-rip | ... | Central Franconian | gmw-cfr | |
| Moselle | gmw-mos | ... | Central Franconian | gmw-cfr | |
| Siglitun | ikt-sig | ... | Inuvialuktun | ikt | |
| Natsilingmiutut | ikt-nat | ... | Inuvialuktun | ikt | |
| Inuinnaqtun | ikt-inu | ... | Inuvialuktun | ikt | |
| Thuringian | gmw-thu | ... | East Central German | gmw-ecg | |
| Upper Saxon | gmw-upp | ... | East Central German | gmw-ecg | |
| Erzgebirgisch | gmw-erz | ... | East Central German | gmw-ecg | |
| Lusatian-New Marchian | gmw-lus | ... | East Central German | gmw-ecg | |
| Silesian German | gmw-sil | ... | East Central German | gmw-ecg | |
| High Prussian | gmw-hig | ... | East Central German | gmw-ecg | |

I've added the three dialects of Inuvialuktun as well as Silesian and High Prussian to the list above for your consideration. KevinUp (talk) 17:27, 25 September 2019 (UTC)

Including codes for the topolects Belgian Dutch and Surinamese Dutch should not be problematic – although I’m not sure where such codes would be used. For the former, just steer clear of fle, vla and vls. What about nl-blg (not bel to avoid any confusion with Belarusian) and nl-sur (not srn to avoid confusion with Sranan Tongo)?  --Lambiam 18:25, 25 September 2019 (UTC)
There is some heated debate on whether West Flemish is a language or a dialect over on Wikipedia at Talk:West Flemish. I have no opinion or expertise on the matter, except that I know it does not fit Weinreich’s definition: a shprakh iz a dialekt mit an armey un flot.  --Lambiam 18:49, 25 September 2019 (UTC)
@KevinUp Thanks! Hopefully if no one objects I can implement them pretty soon. Note that Silesian should be called Silesian German because there's another Silesian (which, depending on your viewpoint, is a dialect of Polish or a distinct language). Benwing2 (talk) 02:28, 26 September 2019 (UTC)
@KevinUp I went ahead and created etymology languages for the Occitan varieties. I included Aranese, which is a standardized subvariety of Gascon, and I used "Shuadit" instead of "Judeo-Occitan" since that is the name used in Wikipedia and Wikidata. I notice that there are already etymology languages for Guernsey Norman and Jersey Norman, which use the codes roa-jer and roa-grn (when I would expect more like nrf-jer and nrf-grn, since nrf is the code for Norman). I originally followed this convention and used codes like roa-gas for Gascon, roa-pro for Provençal, etc., but then switched to oc-gas, oc-pro etc. as in the above table. I think it's more logical to name etymology languages after their immediate parent non-etymology language instead of after the overarching family, unless there's some uncertainty as to the correct parent (which there doesn't seem to be here). Certainly it is more clear, just examining the code, what it might correspond to (e.g. roa-pro could potentially correspond to any Romance language starting with Pro...), and it reduces the chance of clashes. Furthermore, it makes it easier to construct links, which currently must use the non-etymology-language parent (given the code roa-jer for Jersey, it's far from obvious that you should use code nrf for links to Jersey entries, but much more obvious if the code is nrf-jer). For this reason, I created alias codes nrf-jer = roa-jer, and nrf-grn = roa-grn. If we agree on the principle I've enumerated, we can eventually obsolete the old codes for Jersey and Guernsey Norman, but that is a task for another day. Benwing2 (talk) 07:58, 27 September 2019 (UTC)
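The naming principle in the post above can be illustrated with a small sketch. The table layout and the names `PARENT`, `ALIASES`, `link_code`, and `prefix_guess` are hypothetical and are not how Wiktionary's modules store this data; only the codes themselves are taken from the discussion.

```python
# Hypothetical data: etymology-language code -> code of its parent full language.
PARENT = {
    "oc-gas": "oc",    # Gascon, named after its parent Occitan
    "oc-pro": "oc",    # Provençal, named after its parent Occitan
    "roa-jer": "nrf",  # Jersey Norman, named after the Romance family (older convention)
    "roa-grn": "nrf",  # Guernsey Norman, same older convention
}

# Alias codes mentioned in the discussion: new-style name -> existing code.
ALIASES = {"nrf-jer": "roa-jer", "nrf-grn": "roa-grn"}


def link_code(code: str) -> str:
    """Return the full-language code to use when linking to entries."""
    canonical = ALIASES.get(code, code)
    return PARENT[canonical]


def prefix_guess(code: str) -> str:
    """Guess the parent from the code alone by dropping the last hyphenated part."""
    return code.rsplit("-", 1)[0]


for code in ("oc-gas", "nrf-jer", "roa-jer"):
    # The third column shows whether the guess from the code matches the real parent.
    print(code, link_code(code), prefix_guess(code) == link_code(code))
```

With parent-based names such as oc-gas or nrf-jer the guess agrees with the lookup, whereas for the family-based roa-jer it does not, which is the readability argument made above.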
@Benwing2: Thanks for creating the language codes for the Occitan varieties. I noticed that Category:Occitan language has two separate codes for Category:Provençal (prv and oc-pro). Any idea what's going on?
Yes, using nrf-xxx for the Norman varieties would be more consistent compared to using roa-xxx. Thanks for creating the aliases. KevinUp (talk) 19:48, 27 September 2019 (UTC)
@Lingo Bingo Dingo Hi. Benwing2 is creating etymology-only language codes for several languages to be used in the "descendants" section. We're having some trouble with "Flemish" and its variants. Are you familiar with it?
  1. Is Flemish the same as Category:Belgian Dutch?
  2. Are we missing a language family known as Low Franconian languages for Dutch, Flemish and its variants, which is more precise compared to Category:West Germanic languages?
  3. Would it be relevant to create an etymology language code for Flemish for the purpose of listing descendants? It seems that Flemish refers to a dialect continuum, rather than a single language.
  4. Any comments on French Flemish, East Flemish and Brabantian? Does any of these three languages deserve to be upgraded to a full language, like Category:Limburgish language and Category:West Flemish language? KevinUp (talk) 19:48, 27 September 2019 (UTC)
I'm not very knowledgeable about Dutch dialects, so I'll ping some other Dutch contributors @Rua, Morgengave, Mnemosientje, DrJos, Curious, Mofvanes, Voltaigne. In general, I'm not sure whether dialectal etymology language codes are very useful for modern Dutch; though country-specific codes and a code for Brabantian may be useful. But really, you'd have to ask the people who work with etymologies of terms borrowed from these varieties of Dutch.
  1. As noted above, this depends on usage, in everyday language it would mean "Belgian Dutch", but in a linguistic context I'd interpret "Flemish" as meaning "East and West Flemish". It is probably confusing to use it as a synonym of "Belgian Dutch" here.
  2. Wouldn't such a category basically include Dutch, West Flemish, Zealandic, Afrikaans and perhaps Jersey Dutch? It wouldn't seem very useful to me.
  3. No opinion, though this might be a rather messy solution if it is intended to cover both East and West Flemish. But such a code for "Belgian Dutch" may be useful.
←₰-→ Lingo Bingo Dingo (talk) 08:26, 28 September 2019 (UTC)
Thanks for the reply. I think the terms listed under descendants as "Flemish" will have to be checked to determine whether these refer to "West Flemish" or some other variety. KevinUp (talk) 09:52, 28 September 2019 (UTC)
@KevinUp How do you see the tree of children under "Category:Occitan language"? I found it once and saw the Provençal duplication but I can't find it now. There was an existing entry for Provençal causing the duplication; I tried to merge it with the new one but can't verify that this fixed things. I also changed the language codes for the various Occitan dialects to correspond to the retired ISO 639-3 codes that used to be used for them; hopefully this is OK. If anyone was using the old codes, it will trigger an error, and we can correct them. Benwing2 (talk) 20:09, 28 September 2019 (UTC)
@KevinUp I added Valencian, Chakavian/Kajkavian/Torlakian, the three Central Nicobarese varieties, Kinyarwanda/Kirundi, and the Sardinian varieties. I didn't add any of the Germanic or Inuktitut varieties as there are issues still to work out. Benwing2 (talk) 20:46, 28 September 2019 (UTC)
@Benwing2: Thanks for fixing the duplicate language codes for Provençal. I don't know why the language tree at Category:Occitan language was temporarily disabled 12 hours ago but I think the situation is resolved now.
Thanks for creating the etymology codes for these languages. I've updated the codes you used to the table above so that others may refer to it. As for the Germanic and Inuktitut varieties I suppose those will have to wait for another day. KevinUp (talk) 07:53, 29 September 2019 (UTC)
Can we make the Occitan language codes easier? oc-lnc? Usually dialects are just the first 3 letters, unless you need to disambiguate. --{{victar|talk}} 08:31, 4 October 2019 (UTC)
As someone who works in Occitan, I went ahead and changed them, and they are now in line with KevinUp's original suggestions. --{{victar|talk}} 08:39, 4 October 2019 (UTC)

Including language name in translation templates

User:DTLHS suggested I look into translation tables after Descendants. Currently we have translation table entries formatted like this:

  • French: {{t|fr|foo}}, {{t|fr|bar}}

This seems wasteful as the language name is duplicated. I'm thinking of creating a new template {{tr}}, so that this can be replaced with this:

  • {{tr|fr|foo|bar}}

This template will automatically display the language name at the beginning, similar to {{desc}}. It will also allow multiple translations to be specified; the extra numbered params in {{t}}, which refer to genders, will be moved to |g= or |g1= (for the first entry; comma-separated if there's more than one), |g2= (for the second entry), etc. I don't think this will add to the Lua load as we already have to load the language tables to retrieve the properties of the language in question. {{tr+}} and {{tr-check}} can be created to mirror {{t+}} and {{t-check}}. Thoughts? Benwing2 (talk) 02:51, 26 September 2019 (UTC)
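As a rough illustration of the proposed format change, here is a minimal sketch that rewrites one translation line into a single {{tr}} call. It assumes that every translation on the line is a plain {{t}} call with positional parameters only; the function name `to_tr` is hypothetical, and {{tr}} itself is only a proposal at this point.

```python
import re

# Matches a plain {{t|...}} call; {{t+}} and {{t-check}} are deliberately left out.
T_CALL = re.compile(r"\{\{t\|([^{}]*)\}\}")


def to_tr(line: str) -> str:
    """Rewrite 'French: {{t|fr|foo}}, {{t|fr|bar}}' as '{{tr|fr|foo|bar}}'."""
    lang = None
    terms = []
    genders = []
    for index, match in enumerate(T_CALL.finditer(line), start=1):
        params = match.group(1).split("|")
        if any("=" in p for p in params):
            raise ValueError("named parameters are not handled in this sketch")
        if len(params) < 2:
            raise ValueError("expected a language code and a term")
        code, term, *gender_params = params
        if lang is None:
            lang = code
        elif code != lang:
            raise ValueError("mixed language codes on one line")
        terms.append(term)
        if gender_params:
            # Genders move to g1=, g2=, ... and are comma-separated if there are several.
            genders.append(f"g{index}=" + ",".join(gender_params))
    if lang is None:
        raise ValueError("no {{t}} call found")
    return "{{tr|" + "|".join([lang, *terms, *genders]) + "}}"


print(to_tr("French: {{t|fr|foo}}, {{t|fr|bar}}"))
print(to_tr("German: {{t|de|Haus|n}}, {{t|de|Heim}}"))
```

The language name before the colon is dropped because the proposed template would display it automatically.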

There are many things that initialize translation lines that aren't language names. Like "Roman" or "Cyrillic". DTLHS (talk) 03:12, 26 September 2019 (UTC)
Those are script names. {{desc}} handles this using the |sclb=1 param, which says to display the script name in place of the language name; I'd implement the same. Benwing2 (talk) 03:39, 26 September 2019 (UTC)
Some translation lines have both {{t}} and {{t+}}, e.g. “Roman: nȁdūt (sh), flatulentan” found at flatulent. Won’t mixing {{tr}} and {{tr+}} duplicate the language name? Perhaps encode this as {{tr|sh|nȁdūt|+|flatulentan}}? Template {{label}} is an example of a template processing its parameters sequentially but not entirely independently, so this should be possible. The issue also arises with lines that have both {{t-check}} and {{t+check}}, e.g. “Esperanto: (please verify) kabano (eo) (1,2), (please verify) kajuto (on a ship)” found at hut.  --Lambiam 10:53, 26 September 2019 (UTC)
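The sequential encoding suggested above could be processed roughly as follows. This is only a sketch of the idea: the function name `parse_tr_params` is hypothetical, and it assumes that a bare `+` parameter marks the term immediately before it as having an entry on the foreign-language Wiktionary.

```python
def parse_tr_params(params):
    """Split ['sh', 'nȁdūt', '+', 'flatulentan'] into a language code and flagged terms."""
    if not params:
        raise ValueError("expected at least a language code")
    lang, *rest = params
    terms = []  # list of (term, has_foreign_entry) pairs
    for param in rest:
        if param == "+":
            if not terms:
                raise ValueError("'+' must follow a term")
            term, _ = terms[-1]
            terms[-1] = (term, True)  # flag the preceding term, like {{t+}}
        else:
            terms.append((param, False))  # default, like {{t}}
    return lang, terms


lang, terms = parse_tr_params("sh|nȁdūt|+|flatulentan".split("|"))
print(lang, terms)
```

Each parameter is read in order and may modify the entry before it, which is the "sequential but not entirely independent" processing described for {{label}}.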
I'm quite happy with the format of our translation sections as it is now. I like that the translations are atomic and that you can add a literal translation and other data (e.g. a qualifier) per translation. Furthermore the TranslationAdder Gadget would need to be changed to support such a change.--So9q (talk) 13:51, 26 September 2019 (UTC)
@Lambiam That's a good point. The alternatives are either to use a format like you mentioned (or alternatively something like "{{tr|sh|+nȁdūt|flatulentan}}"), or use the format "{{tr+|sh|nȁdūt}}, {{t|sh|flatulentan}}" similarly to what is currently done with {{desc}} and {{l}}. Benwing2 (talk) 14:26, 26 September 2019 (UTC)
@So9q If we were to combine the translations as I suggested, literal translations and qualifiers per translation would be supported with |lit=/|lit1= for the first one, |lit2= for the second one, etc. and |q=/|q1= for the first one, |q2= for the second one, etc. I'm aware that the TranslationAdder Gadget would need fixing; if the format change is agreed upon, I (perhaps with User:Erutuon's help) could make the change. Benwing2 (talk) 14:31, 26 September 2019 (UTC)
Nice to hear you are committed :) (I made my own improved version of TA FYI). I now see this as a possible improvement because the template would prevent people from adding weird stuff to translation sections like qualifiers before the term etc.
As an aside, some translations currently also have <!-- HTML comments and hidden translations, and we should decide whether we want to do something about them. @erutuon could you make me a list of articles with <!-- in the translation sections, so we have a statistic of how many there are?--So9q (talk) 14:45, 26 September 2019 (UTC)
@So9q: Here's the list of mainspace titles in which a Translations section contains a HTML comment. (Edit: Here's a version with the comments shown, if you want to search for translation templates inside of them.) — Eru·tuon 17:57, 26 September 2019 (UTC)

Tools for copying translations

Swedish Wiktionary's entry 'europeisk' had 16 translations, but Ukrainian (uk) was not among them, so I went via English Wiktionary's entry European and found it to be європе́йський. While I was there, I also added some other translations (be, bg, cs, el, eo, et, gl, hu, sk, sl, sq, sr, tr), increasing the number from 16 to 30. This is something I've been doing several times a week for the last couple of years. Sometimes en.wiktionary is missing a translation, which I can add.

Perhaps I should have copied the whole section from {{trans-top}} to {{trans-bottom}}? And not only to sv.wiktionary but also to a couple of other languages? Or a network of bots could do that and also discover any inconsistencies. This sounds, of course, exactly like the old interwiki link bots that were running on Wikipedia less than a decade ago. So, should we do what Wikipedia did and move the entire translations sections into Wikidata? Or are there any other tools (bots? toolserver tools?) that can assist in spreading translations?

(Of course, having done this manually already, I fully understand the difficulties in determining whether the translation sections are for the same sense of the word. For a word like 'European' (adjective, pertaining to Europe) this is the case.) --LA2 (talk) 13:28, 26 September 2019 (UTC)

Don't blindly copy translations if you have no idea about the language in question. No bulk-copying, please. Will you be able to tell if a word is used in the wrong form? E.g. feminine, instead of masculine (lemma), or a noun instead of an adjective? --Anatoli T. (обсудить/вклад) 13:39, 26 September 2019 (UTC)
Bad idea, if you do that the dictionary becomes like any translation memory. Bots don’t see the nuances, don’t see if a term becomes slightly different across languages. Fay Freak (talk) 13:50, 26 September 2019 (UTC)
I also cannot support this. Please revert your changes and add only translations you are sure of, one by one.--So9q (talk) 13:54, 26 September 2019 (UTC)
That's why I only added translations to 14 European languages using Latin or Cyrillic script, where I can determine what is reasonable and not, and refrained from adding translations for various Asian languages. But still, it was a lot of copy-and-paste that could have been made easier by a tool. If you see any translations that you know are wrong in the entry European, you can remove them at any time. If you see any that are wrong in the Swedish Wiktionary, you are of course welcome to discuss this in the Swedish Wiktionary. --LA2 (talk) 15:12, 26 September 2019 (UTC)
A tool, but not a bot (you said “network of bots”), since humans have to assess whether it is safe to copy. Fay Freak (talk) 20:51, 26 September 2019 (UTC)
The answers I get here indicate a lower level of maturity than I had hoped for. I conclude that no such tools exist today. Well, fine. --LA2 (talk) 21:42, 26 September 2019 (UTC)
Are you giving up on us? 😃 Actually I like the idea of a tool to do this as I have done similar copying myself. I suppose we could start with all words having only one sense in the translation section of both wiktionaries. Could you help me write some pseudo code for this?
customTargetWikt = en
customSourceWikt = sv
languages = [list of the 14 languages you feel confident with]
articles = [European, Caucasian]
for article in articles do
 look up the translation section in customTargetWikt
 check that there is only one translation section
 split it into an array by line
 look up the translation section in customSourceWikt
 check that there is only one translation section
 split it into an array by line
 compare the two arrays and show the form
end

form
 give the user a checkbox for each line that could be copied

submit
 for line in checked boxes do
  add the line to the customTargetWikt section
 end
--So9q (talk) 04:18, 27 September 2019 (UTC)
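The "compare the two" step of the pseudocode above might look something like the following sketch. It works on lists of wikitext lines that have already been fetched, so no network access is involved. It assumes that the first parameter of the first template on each line is the language code, which is a heuristic and will not hold for every line; the names `codes_by_line` and `candidate_lines` are hypothetical.

```python
import re

# Heuristic: the first template whose first parameter looks like a language code.
LANG_CODE = re.compile(r"\{\{[^|{}]+\|([a-z]{2,3}(?:-[a-z]+)*)\|")


def codes_by_line(lines):
    """Map each language code to the first line in which it appears."""
    result = {}
    for line in lines:
        match = LANG_CODE.search(line)
        if match:
            result.setdefault(match.group(1), line)
    return result


def candidate_lines(source_lines, target_lines, allowed):
    """Return source lines for allowed languages that the target section lacks."""
    source = codes_by_line(source_lines)
    target = codes_by_line(target_lines)
    return [source[code] for code in sorted(source)
            if code in allowed and code not in target]


source = [
    "* French: {{t+|fr|européen}}",
    "* Japanese: {{t|ja|ヨーロッパの}}",
    "* Ukrainian: {{t|uk|європе́йський}}",
]
target = ["* French: {{t+|fr|européen}}"]
for line in candidate_lines(source, target, allowed={"fr", "uk"}):
    print(line)  # each printed line would become one checkbox in the form
```

The function only proposes lines; a human still has to confirm each one, and the copied line would normally need converting to the target wiki's own template and language-name conventions.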
Another consideration beside the ones mentioned above is that copying without attribution is technically a violation of Wikimedia's Creative Commons license, though normally it's too minor to make any practical difference. Systematic, mass copying might be pushing the boundaries a bit, though.
To do it properly you would need to mention that it's copied in your edit summary and the page you got it from. That way it's possible to trace the edit history back to the original contributor. Chuck Entz (talk) 04:51, 27 September 2019 (UTC)

Vandalism on Thai entries

This week, I found that some IPv6 users have vandalized Thai entries. They usually add misinformation and delete parts of the entries' content. Sometimes, they also vandalize entries in other languages that use the Thai script. Please keep an eye on them. --Octahedron80 (talk) 03:09, 28 September 2019 (UTC)

Example: 2601:8A:4103:1440:FD60:C954:C3A0:5058 (talk) 2601:4A:C480:2F60:9CF1:F103:E82B:2B65 (talk) 2601:4A:C480:2F60:F0DB:15A:FF76:9DE4 (talk)

Dispute resolution and reporting bad behaviour

I've consulted Help:Dispute_resolution.
It is very short, and quite lacking.
Wikipedia:Dispute resolution is quite good and thorough. The options are clear, and it goes through all the possibilities. Here on Wiktionary, however...

First of all, no one uses discussion pages (why are they called "discussion" here, but "talk" on Wikipedia, BTW?). What's the point of them being there, if they aren't used? Also, this means that you don't have a clue as to if/what discussion has happened, in regards to an entry, unless you comb through the Beer Parlour (and maybe some other places?)
...and how/why is anyone supposed to even be aware of those options? Not only does this make discussion needlessly convoluted, but also essentially hidden (to any who do not make the effort to make themselves educated about the system)
...which is completely contrary, to the whole idea of Wikimedia's sites. Granted Wiktionary is a somewhat different thing to Wikipedia, but I fail to see how that could possibly explain this, or any other reason for why this is a better way of doing things, here.

Help:Dispute_resolution is very brief. Brief to the point of barely saying anything. Also, concerning the line "talk to a friendly Administrator"... How would one even begin to do so? Who should you choose? On Wikipedia, you have noticeboards to bring things up on, that are checked by admins. That system is good and simple, and works perfectly well. Why isn't it like that, here?
...and that line is only about someone who keeps pestering an editor. What about the great multitude of other, bad/destructive/disruptive behaviours, none of which are mentioned or addressed? Does this mean that you are fine with (and thereby, in effect, encourage) any bad behaviour, outside of someone pestering another? Vandalism, edit warring, abuse of power... these are all perfectly acceptable things to do? After all, there are (apparently) no consequences.

You have nothing like Wikipedia's Dispute Resolution Noticeboard, which is there to mediate a discussion when a normal talk page discussion is found not to work. An excellent tool, that performs its function quite well.

Also, nothing here that is the equivalent of Wikipedia:Dispute_resolution#Resolving_user_conduct_disputes, Wikipedia:Edit_warring, Wikipedia:Civility#Dealing_with_incivility or anything like that...--213.113.49.180 20:53, 28 September 2019 (UTC)

Yes, those are all correct observations. DTLHS (talk) 21:36, 28 September 2019 (UTC)
We have the occasional problematic editor, but nothing like on Wikipedia. Administrators tend to block trolls and edit warriors, and that is it. The more rules and defined processes you have for dispute resolution (like on Wikipedia), the more you invite wikilawyering and people trying experimentally to find the boundaries of acceptability. The disputes we have are about technical issues and the emotions do not run high (except for the occasional, rare, problematic editor). Given this actual state of affairs, I feel Help:Dispute resolution is adequate. For example, if you question how a term is defined, bring your concern to RFV. All administrators are friendly (if you are courteous and not trolling). I’ve never felt a need to contact one (in their capacity as administrator), but if the need arose I‘d pick one who is knowledgeable about the matter the dispute is about.  --Lambiam 00:28, 29 September 2019 (UTC)
"Administrators tend to block trolls and edit warriors, and that is it."
They can only do so, if they know about it ...and they can only know about it, if people report it. There is, however, no real way to do so.
"the more you invite wikilawyering"
As opposed to not having the faintest clue, as to how to go about things? No idea if what they do is okay or not? Wikilawyering is easy to deal with. Complete confusion isn't.
"and people trying experimentally to find the boundaries of acceptability."
If you have no boundaries, then people are guaranteed to breach them. Intentionally or not. Having boundaries lets good faith editors know, how to go about things. Those who would experiment around the boundaries, will do so either way.
"I’ve never felt a need to contact one (in their capacity as administrator), but if the need arose I‘d pick one who is knowledgeable about the matter the dispute is about."
...and how would you know, which one that would be? (if it's just if they know the language... you'll generally still end up with many different options) You, specifically, might, but not everyone does. Especially someone new, who would be completely clueless. If, on the other hand, there was a noticeboard for admin issues (maybe just the one, where Wikipedia would have many separate ones, due to the smaller number of people, but nevertheless a noticeboard), then it would be a simple matter. How/why would it be better to ask people to choose an admin, from a long list, without having the faintest clue?--213.113.49.180 03:02, 29 September 2019 (UTC)
Old discussions of words in the Tea Room, RF… rooms, and other such pages are generally archived to the talk pages of the words in question after they’ve been resolved, so checking the talk pages should show you the relevant previous discussions, if there were any. Talk pages of entries are also sometimes used to ping specific editors for help with the word in question (using e.g. the {{ping}} template). What they are not used for is community-wide ongoing discussions of particular entries; those should be brought to the Tea Room or one of the various RF… pages, as appropriate. The reasoning behind this is that our community is small enough that posts on entry talk pages usually go unnoticed, so unless you’re pinging some specific editor, it doesn’t make much sense to post on entry talk pages.
As for how anyone is supposed to be aware of the options, the Main Page prominently links to the Community Portal and Discussion rooms, where all of them are listed — not exactly ‘hidden’ IMO.
We don’t have noticeboards because there’s never been much of a need for them; all the main discussion rooms are checked by admins anyway, and problems usually get brought up there if not on individual user talk pages. For newcomers we have the Information desk specifically set aside for them to raise questions and concerns. More experienced users typically know which admins are working in their field (once again, the community is very small). — Vorziblix (talk · contribs) 02:15, 30 September 2019 (UTC)
"What they are not used for is community-wide ongoing discussions of particular entries"
Community-wide? What are you talking about? Who said anything about anything that is community-wide? Talk pages are never used for anything that is community-wide. You appear to be completely ignorant of why they exist, why they were made (in both MediaWiki, which Wiktionary and other Wikimedia projects use, and other Wiki systems), and how they are used. (in pretty much all other Wikis, aside from Wiktionary)
Talk pages are for discussing how to edit the specific article/entry they are connected to, with the others who edit that specific article/entry. No more, no less. There is nothing community wide about it, in any way, whatsoever ...nor is there, usually, any need for it to be.
For the rare occasions when a discussion needs to go wider than that, or if certain dispute resolution system is used (such as a discussion being taken to a Dispute Resolution Noticeboard), then a link to it is added there, with the wider discussion being done in the relevant place for it. (though there are certain efforts for widening things, where it still remains in the talk page, such as Wikipedia's system for "request for comment", to take a relatively common example)
"The reasoning behind this is that our community is small enough that posts on entry talk pages usually go unnoticed, so unless you’re pinging some specific editor, it doesn’t make much sense to post on entry talk pages."
If more than one person edits an entry, then all those editors would notice edits to the talk page. Hence, the talk page would not go unnoticed. If there are no other editors, then there cannot possibly be any disputes, regarding how to edit the entry, and thus no need for discussion. Given that, your argument doesn't really make any sense. Granted, questions and requests, rather than discussions on how to edit the entry, would face the problem you mention ...but that could be solved with a notice on the talk pages (including when you create one), pointing that out, and pointing people to where such things should be stated. It does not, however, apply to discussions. At all.
"As for how anyone is supposed to be aware of the options, the Main Page prominently links to the Community Portal and Discussion rooms, where all of them are listed — not exactly ‘hidden’ IMO."
First of all: Who ever visits, much less reads, the main page of any Wikimedia site? People can, if they seek answers, look at the bar to the left, but the main page? No.
Secondly: Why do you expect that everyone does a thorough search, and plenty of reading? Why have such a steep learning curve? I'm relatively good at these things (and have plenty of experience in looking up such things, from Wikipedia, where I used to edit regularly. Something I should see about getting back to ...but that's not relevant to this discussion), and have looked up and read the articles you mention ...and I still don't get it. At all.
"We don’t have noticeboards because there’s never been much of a need for them"
You don't have need for one. That doesn't mean there isn't a need for one. When instructed to talk to an admin ...and then be faced with a long list of people, having not the faintest clue, as to who to pick, and thus be stranded not knowing what to do... (whereas having an admin noticeboard, where you put stuff for admins, and any admin who knows they are appropriate to deal with the particular issue, can chime in, would have solved the issue, perfectly well ...without making things any more complicated or effortful, for the admins, in the least)
How is that a situation, where "there is no need"?
"all the main discussion rooms are checked by admins anyway"
So you are saying that the instructions, at Help:Dispute_resolution, are wrong, then?
"and problems usually get brought up there if not on individual user talk pages."
If it is brought up in an individual's talk page, that doesn't mean that it gets proper attention ...and according to all the info I have been able to find, this is not the proper place, for reporting bad behaviour. Admins are. If you're saying that's not correct, then you are stating that the information on Help:Dispute_resolution, Wiktionary:Beer_parlour, and other articles on Wiktionary, is fundamentally wrong.
"For newcomers we have the Information desk specifically set aside for them to raise questions and concerns."
If people have to ask about basic issues, that means that the information is not clear and/or easily accessible/"findable".
"More experienced users typically know which admins are working in their field (once again, the community is very small)."
...and those who aren't "more experienced", should be left to fend for themselves, having no clue, whatsoever?
Wikipedia has the guideline of "Please do not bite the newcomers", and generally talks about striving, in both rules and the desired behaviour of editors, to make things easier for the new/ignorant.
Does Wiktionary have the opposite view: Do everything for the experienced editors, and don't give a damn about newbies? Make it difficult to know how things work? Surely not?--213.113.49.34 04:34, 1 October 2019 (UTC)
Our experience with newbie contributions, especially in English, seems to be trending toward lower and lower net value. We have fairly high levels of coverage of terms, so new definitions offered by newbies are increasingly wrong in some way. We have more and more templatization and standardization, both documented and undocumented. These make it a bit difficult for a newbie to help without taking the time to learn.
I suppose that we are much less interested in process than WP is. Many of us are escapees from WP. I certainly find WP not worth the time and aggravation, largely because of the elaborate processes and what feels like consequent wikilawyering. DCDuring (talk) 05:07, 1 October 2019 (UTC)
"Our experience with newbie contributions, especially in English, seems to be trending toward lower and lower net value."
That is a sign that you're clearly going in the wrong direction.
"We have more and more templatization and standardization, both documented and undocumented."
You say that you are, and wish to be, simpler than Wikipedia ...but also that you are, and should be, far more complex than Wikipedia? Also, if you have standards and templates, what justification is there, for any of these to be undocumented?
"I suppose that we are much less interested in process than WP is."
Having less documentation and clear and efficient systems, does not mean that you get less process (in fact you get more). Just more confused process. Any less process you have, is purely due to having less people.
"Many of us are escapees from WP. I certainly find WP not worth the time and aggravation, largely because of the elaborate processes and what feels like consequent wikilawyering."
I fail to see anything all that elaborate, about how things work at Wikipedia ...and how/why does not having any rules, stop malicious behaviour? You probably get a lot less wikilawyering and other bad behaviours, but only because so few bother to edit Wiktionary. Especially those who just want to disrupt. (who are more prone to go to bigger places)--213.113.49.34 13:04, 1 October 2019 (UTC)
  • Do you have any specific complaints or do you just wish we had a process that was to your taste? DCDuring (talk) 05:09, 1 October 2019 (UTC)
"Do you have any specific complaints"
What exactly do you mean? (also: there are several specific complaints, in what I have written)
"or do you just wish we had a process that was to your taste?"
I deeply resent your suggestion. I see that Wikipedia's policy of "Assume good faith" (which I find, is a rule one should follow, not just on Wikipedia, but generally) is not shared by Wiktionary.
I am not asking for things to be to "my taste!" I am asking for things to make sense! To be intelligible. To be as good as they can.
If you think my arguments are invalid, if you think the current system (and the articles that describe/explain it) is better than what I suggest: You are free to argue against what I say. I always welcome honest and reasoned criticism. If anyone shows me that I am wrong/mistaken, I thank them, as they have not only taught me something, but also (far more importantly) rid me of a misconception.
However, I would ask you to refrain from making baseless ad hominems, or any other blatant and obvious fallacies.
As I said: I accept reasoned and honest criticism. (even if it is far from being civil ...except if I am in a place with rules against incivility, of course)
Criticism that isn't, however, isn't something I am willing to put up with. (and can anyone make a reasonable case, for why anyone should?)--213.113.49.34 13:04, 1 October 2019 (UTC)
  • This seems like an effort to generate something to complain to WMF about. DCDuring (talk) 13:44, 1 October 2019 (UTC)
I have no idea what you are referring to, by "WMF". The one single thing I can see, is that you are dismissing what I've said, out of hand, with mere baseless insults and ad hominems. I have done nothing to earn your insults (as you yourself implicitly admit, by making no arguments aside from ad hominems that you don't even bother to back up. Not that ad hominems are okay, even if the criticism in it has any basis, but still...), and even if I did, the way in which you make your accusations, still flies in the face of WT:Civility. Quite unlike any of the other replies, I've gotten here in the Beer parlour. Given how public this discussion room is, I would hope someone picks up on this fact, and acts accordingly. Either way, I see no reason to respond further, to anything you say, as long as you choose to behave in a clearly hostile/malicious manner, which only seeks to hurt/anger/dismiss, rather than resolve/explain/debate.--213.113.49.34 16:27, 1 October 2019 (UTC)
Ignoring all else which has been said, I agree that the Dispute Resolution page language is unhelpful and not particularly relevant to the current state of the project. I have started a draft for new language aiming to be clearer and more helpful to those not as familiar with the project at Help:Dispute resolution/update. - TheDaveRoss 15:08, 1 October 2019 (UTC)
Looks a lot better.
Solves the problems with the "contact an admin"-bit and a lack of a notice board(s) for issues, by making the Beer Parlour serve those functions instead, which I guess could work.
I notice, though that your draft instructs people to use talk pages, just as other wikis do. This would involve a fundamental change in policy. One I would approve of, as is obvious from what I've written, but given how that goes against how things have been done here, for so long (not to mention the comments about it, in this section), I'd say getting that through may take some doing. (on that note: When I created this topic, I didn't exactly expect to convince everyone, straight away. I expected to mainly get opposition. To start with, and maybe also at the end ...but even then, I might push some a little, plant a few seeds... "A journey of a thousand miles begins with a single step". Well, hopefully not a thousand miles) Also, as it is not, at least not yet, the policy of Wiktionary, I'm unsure if I should have made this reply on the talk page of the draft, or here, so... I figured the safest option was to comment here. For now, at least.
Furthermore, your list and description of discussion rooms fails to mention where one should bring up discussions of how exactly a given word should be defined. (see the other topic I made here. I'm not entirely confident that the response I got, was completely correct. The responses to my RFV haven't been that it's in the wrong place, but I'm not sure if that's because it is the right place, or because people let it slide, or something)
Still: As I said, it's a clear improvement, IMO. A good start.
...but given how everyone else has responded here, I'm not too hopeful about it getting implemented. (I hope I'm wrong about that) After all, it would appear to be completely counter to the opinions of everyone else who has responded to me, so far.--213.113.49.34 16:27, 1 October 2019 (UTC)
It should be based on "sticks and stones may break my bones but words will never hurt me". If somebody starts "doxing" (posting real-world info) then shut it down fast. Otherwise, who cares. Equinox 16:11, 1 October 2019 (UTC)
So bad actors should just be ignored? Be allowed to behave badly? Granted, it may be sensible for normal individuals to ignore bad actors, but if admins don't at least tell them off, in some way, if it's not marked as unacceptable behaviour, how is that not an implicit approval of the bad behaviour? (and also implicitly telling the victims, that you're fine with them being treated like that? That, if anything, they are the ones at fault?) ...and does that not make a complete lie, out of WT:Civility? Making it a rule that isn't worth the paper it isn't written on?
Furthermore, while bad actors can easily be ignored, in a place like here in the Beer Parlor, what about when editing a Wiktionary entry?
...oh, and I've always found the saying "sticks and stones may break my bones but words will never hurt me", to be a rather ridiculous saying. After all, physical injuries heal easily. Injuries to the heart, however... not that I'm hurt by foolish insults from some random guy on the internet. Why would I care what they say? ...but the principle still stands: Emotional pain is no less serious than physical pain. On the contrary, it can often be far more long lasting and harder (or, indeed, impossible) to heal.--213.113.49.34 16:41, 1 October 2019 (UTC)
I'm glad you find us ridiculous. How about you do some actual work of any kind here, then we will take you seriously, if you're lucky. Equinox 16:50, 1 October 2019 (UTC)
Sarcasm is not exactly civil. You have yet to show any evidence, that I have been malicious or unconstructive, in any way (not that you've even attempted to try), or at all uncivil (unless you count the instances, where I point out clear evidence of incivility, in others ...but if you can't do that...), nor have you made any attempt to rebut any of my arguments. (though I am, of course, not claiming that no one has. There are others who have engaged in discussion ...not counting DCDuring, as he/she has clearly chosen to abandon that, in favour of mere ad hominems and slander. Not that DCDuring's tone sounded particularly civil before then, though it wasn't enough to surpass the assumption of good faith ...other than in retrospect)
And then you baselessly, and very falsely, claim that I haven't done any work on Wiktionary.
There is my edit to なぎなた (which was apparently reverted, for no apparent reason), なた (which, to be fair, Eirikr replaced with having it take its info from , which is a clearly better solution ...but that was clearly spurred by my edit, and it was one, that did the same as mine did, just that it now automatically updates to reflect any future changes), 提げ物 (which was reverted, for no apparent reason ...though the current version does contain my change of meaning, if not my exact wording. Thus my edit was apparently deemed legitimate, and led to improvement), , 薙刀, 蜜柑 (even if you should choose to dismiss what got reverted, I still got some stuff through), 長刀, 余り, 全然, みかん, 眉尖刀, shield, 太刀, billhook, my creation of 鉈鎌 and 剣鉈 (I've been tempted to also create 大鉈 and 腰鉈, but at the moment, that would have to wait for the RFV on 鉈 to be settled first)... I had forgotten a bunch of those, until I looked them up. I'm kinda surprised at the length of that list. I think that covers all of my recent edits. (I've done a rare few edits before. Dunno what happened to those. Don't remember them)
As I told DCDuring: Make a proper response, and I'll happily reply, but I will not dignify any further responses from you, if they continue to be like this. It would be pointless.--85.229.232.82 09:03, 2 October 2019 (UTC)
I fail to see how an anonymous person posting lengthy arguments about how everything should be different, and dismissing various major editors on the site, is particularly constructive. We're not going to change everything, and by being so broad, you're just causing distraction instead of ultimately producing change.--Prosfilaes (talk) 09:58, 2 October 2019 (UTC)
"I fail to see how an anonymous person"
My being anonymous, is completely irrelevant. It's no more than an ad hominem.
"lengthy arguments"
Well, it's big issues. I do want to be concise, but... it's not really possible to cut these things down, and still keep the important bits. (also, I'll freely admit that I'm not that good at being concise)
"dismissing various major editors on the site"
You are talking about people who clearly dismissed me, out of hand, whilst blatantly going against WT:Civility and WP:Assume_good_faith. People who show that they are not, at all, interested in honest reasonable debate ...and, as I've stated clearly, I only dismiss any further comments they make if they continue to display such behaviour. As soon as they behave in a civil, honest, and rational manner, I'll be fine with responding. Anyone who behaves in such a manner, will never have any problems with me. Disagreements, sure, but not problems. (maybe a bit of problem, due to misunderstandings, but as long as they're civil, honest, and rational, those should be easy to clear up, so...)
"We're not going to change everything"
Not until/unless you are convinced, no ...and as the system has been this way, for a long time, you're (note: plural you) no doubt set in your ways and convincing you would take some doing. Probably won't happen overnight ...but the number of issues that have gotten changed, over time, due to it being brought up for debate, now and again, is astronomical.
Also, I assumed that bringing it up, would make you explain your reasons. (which is good, both for me to learn them, and for you to think about them more) Something that has only partially happened. I've mostly not gotten any further responses, beyond a first one, from editors, which makes things very superficial. You don't really get any proper understanding of things, that way, nor does any actual discussion get started, in any meaningful way.
"by being so broad, you're just causing distraction instead of ultimately producing change."
I don't agree about it being broad.
As for a distraction... All debate is distraction. If you want to create change, you have to distract. That's just how it is. That is no excuse for shutting down debate, at the first sign of disagreement. If you disagree, make counter-arguments. Present contrary evidence. (concerning the issue at hand! Not the people involved) When you've talked through all issues, go with the consensus. Even if you think it's wrong. (there have been some Wikipedia discussions that didn't end up with the result I wanted ...but as long as everything was done properly, I've always been fine with it) That's how things should go.
Dismissing a person's arguments, out of hand, for no other reason than a baseless assumption of bad faith or just strongly disagreeing/disliking, is not a valid approach.
An echo chamber, does not lead to progress. It leads to stagnation.
...and how could you be sure if this hasn't done anything to lead to change? Sure, change hasn't happened, straight away, but that's just the short-term result.--85.229.232.82 12:04, 2 October 2019 (UTC)
If it's not worth your time to be concise, it's not worth my time to respond.--Prosfilaes (talk) 13:19, 2 October 2019 (UTC)
I clearly stated that I try to be concise.--213.113.50.173 01:34, 3 October 2019 (UTC)
Everything about this discussion is bad, I vote that we end it now. - TheDaveRoss 13:30, 2 October 2019 (UTC)
What discussion? None of you, has bothered to actually try to discuss. "Two monologues do not make a dialogue." - Jeff Daly
It takes two to tango ...and so far, I'm the only one trying to "dance". (i.e. discuss ...and no, taking just one or two steps, doesn't even begin to be a dance) ...or can you demonstrate otherwise?--213.113.50.173 01:34, 3 October 2019 (UTC)

WT:RFC vs WT:RFV vs WT:RFD. What does WT:RFV do? Where can you get help with discussing the definition of a word?

Help:Dispute_resolution states that WT:RFC is for stuff like formatting, example sentences and trivia. WT:RFV is for things like content, definitions, synonyms, pronunciation and etymology ...and WT:RFD is for determining if the word exists or is relevant. (notable?) Thus I must conclude that if one seeks help with a discussion/dispute concerning the definition/meaning of a word, it should be posted on WT:RFV.

...but on WT:RFV, however, it states "This page is for disputing the existence of terms or senses. It is for requests for attestation of a term or a sense, leading to deletion of the term or a sense unless an editor proves that the disputed term or sense meets the attestation criterion as specified in Criteria for inclusion/.../", which is a direct and complete contradiction of what Help:Dispute_resolution says
...and leaves anyone seeking help with a discussion/dispute concerning the definition/meaning of a word, with nowhere to go.--213.113.49.180 21:05, 28 September 2019 (UTC)

For an initial discussion (without officially nominating a word as challenged) try WT:TR. Equinox 21:12, 28 September 2019 (UTC)
But what if you actually want the definition (but not the existence of the word) challenged?
And why isn't that (nor the other discussion rooms one can see, when you go there) mentioned, at Help:Dispute_resolution? (not to mention the very confusing contradiction, between Help:Dispute_resolution and WT:RFV)--213.113.49.180 22:20, 28 September 2019 (UTC)
WT:RFV is the right place if you would like to dispute that a term has a given definition. It says "[This page] is for requests for attestation of a term or a sense". Sense is another word for a definition. — Eru·tuon 22:51, 28 September 2019 (UTC)
The tea room is where you would discuss definitions for senses that you don't want removed, including challenges to their validity. If you want to challenge whether a definition matches usage (as a descriptive dictionary, we go by usage rather than authoritative references), you would use rfv. As for your other comments/questions: Wiktionary has far fewer active users and admins than Wikipedia, so we can't afford the kinds of bureaucracies and detailed procedures that Wikipedia has. Also, we're a dictionary, so there's a far smaller amount of allowable variation in content: definitions are concise, streamlined and more telegraphic in style. Your original version of the definition was a fairly decent short encyclopedia paragraph, but rather wordy for a good dictionary definition.
As for the nature of your interactions: the edit comment that says "please leave a message on my talk page" is automatically inserted by the rollback tool and can't be changed for an individual edit, so you shouldn't read too much into it. Please bear in mind that we have 6 million entries and 97 admins- both active and inactive (which explains why posting on talk pages isn't very effective). Of course, most of those pages aren't being edited, but we have well over a thousand new edits that need patrolling every day. Only a handful of the people who patrol those edits know enough Japanese to spot problems, and Eirikr is the only admin among those who regularly spends a substantial amount of time on it.
All of our patrollers are overworked, but it's especially bad with Japanese. Eirikr could have handled your case better, but it's a bit much to leave an outraged, screaming "ALL CAPS" message after less than a day. Not only that, it dramatically raises the stakes, and therefore the amount of time and attention needed for a proper response. Leaving a wall of text on someone's talk page after that and then characterizing a fairly detailed reply as a non-response because it didn't categorically and decisively refute every point doesn't strike me as very helpful, either.
The bottom line: regardless of whether you're right or wrong, the amount you've contributed so far vs. the amount of time spent by a knowledgeable editor responding to your demands leaves me wondering if you'll ever be worth the time taken away from improving our Japanese entries- right now, you're more of a time-sink than an asset. I hope I'm wrong, but it doesn't look too promising. 00:36, 29 September 2019 (UTC)
So it would appear that WT:RFV is right, then? Then the article should be rewritten, to be less confusing. Again, I quote "Overview: This page is for disputing the existence of terms or senses. (original emphasis) It is for requests for attestation of a term or a sense, leading to deletion of the term or a sense (emphasis mine) unless..../". In other words, it indicates that it is about removing a term or sense. Not modifying, adjusting, or changing it. Just deleting.
"Wiktionary has far fewer active users and admins than Wikipedia, so we can't afford the kinds of bureaucracies and detailed procedures that Wikipedia has."
That makes no sense. Sure, it might explain why you don't have as many boards, as they do, but it does not explain having none ...and it does nothing to explain the non-use of discussion/talk pages.
"Also, we're a dictionary, so there's a far smaller amount/.../"
Everything you mention hereafter, is completely outside of what I brought up. It is completely off-topic and irrelevant, to the issue at hand. Important, sure, but this isn't the place for it. You could have responded, where I have discussed it (or somewhere else more appropriate, and point me there), but certainly not in an unrelated topic, such as this. Therefore, I will not respond here. (do it at an appropriate place, and I'll be happy to)--213.113.49.180 02:50, 29 September 2019 (UTC)
When it comes to your comments on the bureaucracies/systems here, you can talk about that on the other section I made here. Not on this one.--213.113.49.180 03:07, 29 September 2019 (UTC)

Gemination

I think it is better that we showed the gemination in Latin words, using ◌ː in their narrow transcription, instead of showing the phonemes one after another. For example, for bucca, what is /ˈbuk.ka/ in its broad transcription can be transcribed phonetically as [ˈbʊkːa]. Pinging @Fay Freak & @Brutal Russian for this. —Lbdñk (talk) 11:15, 29 September 2019 (UTC)

But then where does the syllable division go? —Rua (mew) 11:56, 29 September 2019 (UTC)
@Rua: It remains in the broad transcription; to make up for its loss in narrow transcription, we could show the hyphenation for the word (eg. buc‧ca) in the pronunciation section, as is done in entries of modern European languages, where, I have seen, the syllable division is not much cared about. —Lbdñk (talk) 12:19, 29 September 2019 (UTC)
More work for no added benefit. No thanks. --{{victar|talk}} 16:31, 29 September 2019 (UTC)
Curiously enough, the current Italian transcription goes with the double spelling - bocca. Why does la-IPA have syllable division, anyway? I think it either should go altogether, or the double consonant be left spelt double. Brutal Russian (talk) 15:19, 29 September 2019 (UTC)
Though it seems my proposal would not be accepted, @Brutal Russian, are you for or against using ◌ː for gemination in Latin words? —Lbdñk (talk) 18:01, 30 September 2019 (UTC)
@Lbdñk: I'd be fine with it, but only if all syllable divisions in the IPA are dispensed with. Otherwise consistency demands geminates be also divided. I don't feel particularly in favour of either option, but dividing syllables in the IPA doesn't seem like a terribly common practice, as you say, especially when it's already done in the phonemic transcription. Brutal Russian (talk) 20:49, 30 September 2019 (UTC)
For phonetic transcription, I don't care whether geminates are transcribed like [t.t] or like [tː]. While omitting the syllable division mark in this context and not in others is a little inconsistent, it doesn't remove information from the transcription, because Latin geminates predictably contain a syllable division. (For comparison, the stress mark also unambiguously indicates a syllable division, and so an explicit syllable divider is not used next to it.) I am in favor of keeping the syllable division mark in other contexts, and in all contexts in Latin phonological transcriptions (I don't think Lbdñk was suggesting removing them in this context, since the original post in this thread mentioned using /ˈbuk.ka/ alongside [ˈbʊkːa], but I'm not sure whether Brutal Russian was talking about replacing /ˈbuk.ka/). Syllabification of consonants is relevant to Latin scansion: a syllable with a short vowel is heavy if it ends in a consonant, but light otherwise.--Urszag (talk) 22:46, 30 September 2019 (UTC)
@Urszag: I agree with you that using ◌ː is not goin' to make the syllable division ambiguous, and your point is so good. And obviously, we cannot do without syllabification owing to its phonological and metrical significance. So, by my proposal, syllabification would always be shown in broad transcription; in narrow transcription, only geminates would make the exception. Therefore, with your backing, I am thinking of starting a vote for representing geminates with ◌ː in narrow transcription. @Brutal Russian, I will need your backing too. —Lbdñk (talk) 18:02, 1 October 2019 (UTC)
@Urszag, Lbdñk: If one knows how to divide Latin words into syllables, nothing will make syllable division ambiguous to them since as you say, it's by and large predictable - two consonants make two syllables unless mūta cum liquidā. If one doesn't, there's nothing obvious about a geminate consonant being split - Southern Italian even has initial geminates. But forget that, I'm not suggesting we dispense with them in the phonological transcription - just in the phonetic one, if and since we're at it. I simply don't see for the sake of what we want to introduce the inconsistency, while on the other hand consistency is a reason unto itself. Is there a single other language on Wiktionary whose IPA features syllable division? Brutal Russian (talk) 23:10, 1 October 2019 (UTC)
(edit conflict) @Brutal Russian: Yes, Ancient Greek ({{grc-IPA}}) comes to mind immediately (though the module isn't completely accurate), as well as some others that I found looking through Category:Pronunciation templates: Arabic ({{ar-IPA}}), Catalan ({{ca-IPA}}), French ({{fr-IPA}}), Polish ({{pl-IPA}}), Sanskrit ({{sa-IPA}}). — Eru·tuon 23:21, 1 October 2019 (UTC)
@Erutuon: No-no-no, I mean the IPA phonetic transcription, not the IPA module. Brutal Russian (talk) 23:24, 1 October 2019 (UTC)
@Brutal Russian: Well, the Sanskrit template is the only one listed above that generates a phonetic transcription with syllable division. The others generate phonemic transcriptions. [Edit: I went through Category:Pronunciation templates and the only other example was {{mnc-IPA}}. So it's pretty rare. There were many more phonemic transcriptions with syllable division than phonetic.] — Eru·tuon 23:30, 1 October 2019 (UTC)
@Erutuon: I see, thanks. I don't know whether Manchu has double consonants, but the Sanskrit transcription - seemingly the only one with syllable divisions both in phonemic and phonetic - likewise spells them double: सत्त्व (sattva). Brutal Russian (talk) 12:09, 2 October 2019 (UTC)
@Brutal Russian: So coming back to the original topic, whatever steps be taken, I am against simply using phonemes one after another for geminates in narrow or phonetic transcription. Even {{sa-IPA}} is doing much better in this aspect in that, in words like सत्त्व (sattva), while transcribing digraphs (here त्त्) the first consonant is being shown with an unreleased stop, thereby making the transcription perfectly phonetic, while at the same time not getting rid of syllable break. Whereas in Latin the transcription is misleading: [k.k] is too scanty in phonetic transcription, and is acceptable only in broad or phonemic transcription. —Lbdñk (talk) 18:44, 2 October 2019 (UTC)
@Lbdñk: First, I don't think that the phonetic transcription that we give for Latin has to be narrow. In fact, I'm a little leery of the goal of giving a narrow transcription for Classical Latin pronunciation, because since it is reconstructed, there are always going to be gaps in our knowledge of the phonetics. It's not as problematic as trying to give a phonetic transcription of, say, Proto-Indo-European, but there are a few places where I think the current transcription for Latin is already too narrow—specifically, the transcription of [kʷ] as fronted [kᶣ] before front vowels (Allen's argument for this fronting is not implausible, but I think it's inconsistent to transcribe this detail while leaving out other plausible details of similar specificity, like the "sonus medius", or the probable existence of allophonic fronting of /k/ and /g/ before front vowels). Second, I don't really understand what's misleading about using [k.k]. This transcription doesn't imply that the initial [k.] is released; it doesn't specify that it is unreleased either, but that ambiguity is not a flaw in my opinion because we can only guess about whether Latin speakers ever used a distinct release for the first half of a geminate.--Urszag (talk) 23:34, 2 October 2019 (UTC)
Why isn’t there a sign for syllable break inside a segment, if syllable break is suprasegmental and can occur during a segment? We need some kind of combining character below geminates. It is ugly and distracting anyway to have that dot (U+002E FULL STOP) between syllables. It appears to be baggage of the typewriter age. In suprasegmental analysis one uses many signs below and above a line if one is able to, as particularly in handwriting. If you see the full-stop in published works denoting a syllable break it is not unrestrictedly a standard. Fay Freak (talk) 12:50, 2 October 2019 (UTC)
@Fay Freak: Why are you against using the dot for syllabification? —Lbdñk (talk) 18:44, 2 October 2019 (UTC)
@Lbdñk As you see, the syllable break can be inside a geminate consonant signified by Xː but we cannot put the dot inside it. It is suprasegmental anyhow! Hence there must be a mark below or above the line. Also, doesn’t one normally use a middle dot for this purpose? At least that’s what I have seen in many dictionaries though not necessarily in IPA. Fay Freak (talk) 19:06, 2 October 2019 (UTC)
@Fay Freak: I had already stated above my proposal for such problems as this. Repeating it again: syllabification would always be shown in phonemic transcription, so if we cannot show the syllable break inside a geminate in phonetic transcription, then there is no problem, as the phonemic transcription would be showing it anyway (/ˈbuk.ka/ versus [ˈbʊkːa]). —Lbdñk (talk) 19:34, 2 October 2019 (UTC)
@Lbdñk That is confounding, though not necessarily for me. I have seen people passing by and removing syllabification from phonemic transcriptions because of it being a feature of narrow transcription. Fay Freak (talk) 19:36, 2 October 2019 (UTC)
@Fay Freak: On the other hand, whenever I find English words being shown without syllable division (and the transcription used is mostly phonemic), I syllabify them. Be that as it may, we should go ahead and modify {{la-IPA}}, showing Latin geminates by ◌ː in phonetic transcription. —Lbdñk (talk) 20:02, 2 October 2019 (UTC)
@Fay Freak: The IPA's symbol for a syllable break is the period .; using the middle dot · is a non-IPA convention. — Eru·tuon 19:51, 2 October 2019 (UTC)
What speaks against putting the dot between ◌ː? Maybe that’s intended, because they shouldn’t have overlooked the fact that one can’t use the dot otherwise when there is a syllable break during a geminate consonant. (For kk is not synonymous to kː.) Fay Freak (talk) 20:06, 2 October 2019 (UTC)
You mean why not do ◌.ː? I don't recall seeing that done intentionally; ː is meant to be placed directly after the consonant or vowel that it modifies. (.ː looks like it represents a "long syllable break", which is nonsensical.) It's not just a problem with long consonants; we can't put syllable breaks in the middle of single consonants either, but English is supposed to have ambisyllabicity. — Eru·tuon 01:46, 3 October 2019 (UTC)
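The proposal discussed above (syllable breaks kept in the phonemic transcription, geminates written with ◌ː in the phonetic one, as in /ˈbuk.ka/ versus [ˈbʊkːa]) can be sketched roughly as follows. This is only an illustration in Python, not how {{la-IPA}} is implemented; the function name and the consonant inventory are assumptions made for the example, and each consonant is assumed to be a single character.

```python
import re

# Hypothetical consonant inventory for the sketch; a real module would need
# the full set of symbols it emits (and would have to handle diacritics).
CONSONANTS = "pbtdkɡgfsmnlrjw"

# A consonant, a syllable break, and the same consonant again, e.g. "k.k".
GEMINATE = re.compile("([" + CONSONANTS + r"])\.\1")

def mark_geminates(phonetic: str) -> str:
    """Rewrite C.C (identical consonants across a syllable break) as Cː.

    The syllable break inside the geminate is dropped, which is the
    trade-off accepted in the proposal: the phonemic transcription
    still shows it.
    """
    return GEMINATE.sub(lambda m: m.group(1) + "ː", phonetic)

print(mark_geminates("ˈbʊk.ka"))
```

Restricting the pattern to consonants keeps identical vowels in hiatus (for example "ko.o") from being merged into a long vowel.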

Capitalization of proper nouns in languages using scripts without a lower/uppercase distinction

Sorry for the long post. TL;DR below.
Focusing on Germanic initially - as that is the primary domain in which I edit - I would like to draw attention to the issue of how we represent proper nouns on Wiktionary. I have recently been working on Continental Germanic attestations of Germanic theonyms a bit, and I noticed we had donar at lowercase (per an earlier deleted edit, it was moved from the uppercase version to its current lowercase spelling in late 2010), whereas our entry at *Þunraz lists the OHG name with a capital among its descendants.

This reveals a tension in the way we handle ancient scripts on Wiktionary: on the one hand, capital letters as a means of distinguishing proper nouns is an early modern innovation and their use is an anachronism, so using only lowercase letters would be more true to the sources, which in the case of Germanic typically use some variant of the Carolingian minuscule, Gothic minuscule, Gothic alphabet or runic - all scripts lacking an upper/lowercase distinction. On the other hand, a lot of scholarly literature describing ancient and medieval texts and even some editions of those texts use initial capital letters to distinguish proper nouns from other nouns in ancient and medieval languages.

I have since created an entry for OHG wodan and wigidonar at the lowercase spelling, as my view is that staying as close as possible to the source script in the representation (and especially transliteration) of ancient and medieval words is desirable. This is inconsistent with how proper nouns are handled in Old Norse currently, which on Wiktionary consistently uses capitalization in entry titles (see Category:Old Norse proper nouns). One may contrast this to Gothic, where I have been using lowercase initial letters in the transliteration of proper nouns consistently (see Category:Gothic proper nouns).

It seems to me that forming some coherent policy or at least a stronger guideline is desirable within the Germanic languages and perhaps further afield too (see below). Do we take the Old Norse precedent, preferring to use initial capital letters in entry titles despite their absence in the source scripts, or the Gothic one, where I have consistently used lowercase transliterations? Or a compromise?

As a final point to consider I would like to adduce the case of proper nouns in non-Germanic languages, especially those using non-Latinate scripts (which pretty much all lack an upper/lowercase distinction, except Greek). The case of Latin, which until the early modern period similarly lacked an upper/lowercase distinction (using the same scripts as abovementioned Germanic languages), is a weird one, as it has continued to be in use into the modern period and thus experienced the shift from minuscule or majuscule-only orthography to an orthography with an upper/lowercase distinction while it was still actively being used and it thus has initial capitalization in its Category:Latin proper nouns. A small overview of non-Latinate languages:

  • For Ancient Greek, which gained an upper/lowercase distinction in running text sometime in the modern period as well (the same as Latin), we have initial capital letters.
  • For Mycenaean Greek we seem to consistently forgo capitalization of initial letters, e.g. 𐀀𐀖𐀛𐀰 (a-mi-ni-so), 𐀡𐀮𐀅𐀺𐀛 (po-se-da-wo-ni).
  • Less similar are Arabic and Hebrew, which both lack capitalization in how we transliterate them. See هُولَنْدَا(hūlandā) and הוֹלַנְד‎, for example.
  • Further afield, for ancient languages we have languages using cuneiform. The picture is mixed.
  • Another ancient language is Egyptian, which consistently forgoes initial capitalization, e.g. at mꜥnḏt.
  • Classical Mongolian similarly is transliterated without capital letters: ᠪᠠᠷᠠᠭᠤᠨ ᠲᠥ᠋ᠪᠡᠳ (baraɣun tö᠋bed)
  • Chinese seems to consistently transliterate with capital letters: 三十年戰爭; same goes for Japanese.

These examples suffice for now.

TL;DR:

  • The distinction between upper/lowercase to distinguish proper nouns from other nouns in Latinate scripts only arose during the early modern period, and its projection onto earlier scripts through its inclusion in transliterations is something of an anachronism.
  • There are currently Old Germanic entries which lemmatize proper nouns at their lowercase-initial spelling (which is true to the sources, which being Carolingian minuscule, Gothic minuscule or Runic all lack an upper/lowercase distinction).
  • There are currently also Old Germanic entries, especially in the realm of Old Norse, which do use capital letters for proper noun entries, which goes against the scripts used in the sources but which accords with how they are referred to in modern scholarly literature most of the time.
  • Finally, Gothic - using a non-Latinate script (not the Gothic minuscule, but the Gothic alphabet!) based on Greek uncials with some Latin and perhaps Runic influence - does not currently use capital letters in its transliterations of proper nouns.
  • As for non-Germanic:
    • In the cases of some modern languages such as Chinese and Japanese, which have elaborate systems of transliteration/romanization, we universally use capitalization of initial letters in proper nouns despite the lack of an upper/lowercase distinction in those languages which may justify it.
    • In other modern languages such as Hebrew and Arabic, we forgo capitalization of transliterations.
    • In the cases of ancient languages using non-Latinate scripts, the picture is mixed: Ancient Greek uses initial capitals, Mycenaean Greek does not. Akkadian and Sumerian present a mixed picture. For Egyptian and Classical Mongolian we consistently forgo capitalization.

Given that many modern languages with non-Latinate scripts have their own established ways of transliterating/romanizing which we should probably not tamper with (Arabic, Chinese, Hebrew, Japanese, etc.), a general Wiktionary-wide policy seems undesirable to me. However, for ancient and medieval languages specifically we are currently being very inconsistent - often within the same language - and we would probably benefit from considering the issue and deciding on some coherent approach. I would consider it a victory if we had a clear approach to Germanic languages - especially those written in Latinate scripts, such as Old Norse and OHG - but it would of course be great if we could clear up the inconsistency with the Cuneiform scripts and with Classical Mongolian (etc.) as well (and possibly others). Thoughts? — Mnemosientje (t · c) 14:07, 30 September 2019 (UTC)

When adding Middle Dutch entries, I've consistently used lowercase, and I think that should be applied in general to manuscript forms in other languages too. However, Old Norse in particular uses a normalised orthography that strongly differs from anything found in manuscripts, so it can be argued that not just the capitalisation is an anachronism, but the whole orthography is. Therefore, I don't think capitalisation can be seen separately from spelling normalisation. —Rua (mew) 14:24, 30 September 2019 (UTC)
I would support using upper case even if it's anachronistic, since it follows the general rule of the Latin script. While old manuscripts lack this distinction, recent work seems to follow the convention of using upper case letters. I've seen the Wulfila project use them despite Gothic lacking it. There is also the case of our not writing some Latin words all in upper case despite it being used for inscriptions. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 01:37, 1 October 2019 (UTC)
No worries, I can carry on the length on par.
Classical Mongolian transcription should probably use upper-case letters because of the Cyrillic equivalents, else the languages look more different than they are. I assume the same for for example Azerbaijani, that Azerbaijani Arabic spellings should have their transcriptions as if it were the Latin (Latin is the main script of Azerbaijani, but formerly Cyrillic).
On Ottoman Turkish and Old Anatolian Turkish: Having a 1:1 correspondence of Ottoman transcriptions is handy, it dovetails with the Turkish descendants and sometimes also etymon capitalization (say when the word is from Greek or Armenian).
But looking on پاسقالیه‎, I see that Modern Turkish uses both Turkish paskalya and Turkish Paskalya, and Turkish büyük paskalya and Turkish Büyük Paskalya so on, so technically should we have once {{ota-proper noun|tr=Paskalya}} and once {{ota-noun|tr=paskalya}} on the same page as well as the alternative forms twice because of the transcription?
No. Majuscule-writing is not decided by a word being a proper-noun: Names of recurring feasts each refer to a class of feast hence are never proper nouns. Note, English editors, that I see the inconsistency that Easter is a “noun” and Ramadan is a “proper noun”. What’s proper about “Ramadan” for it to be a proper noun? It seems wrong, also for English Eid al-Fitr, although you might not find a plural. Day of Potsdam is a proper noun because it refers to a certain festival in history. D-Day is a proper noun because it occurred once, but a noun in that transferred sense; Black Friday is a proper noun applied to several entities (like forenames are not unique) and is also a noun for the phenomenon. День Побе́ды (Denʹ Pobédy) is a proper noun for its original and a noun for its holiday. Who is gonna clean that up?
A problem to mention is also that Turkish and Azerbaijani differ in usage of demonyms. Turkish İtalyan but Azerbaijani italyan for ایتالیان‎‎. But gentilics are a special problem and even debatable for Latin, see Wiktionary:Tea room/2019/July § Boundaries of noun vs. proper noun in Latin, and use of capital vs. lowercase initial letters.
I reason that from this follows that we should never capitalize demonyms in Ottoman and Old Anatolian Turkish and never names of feasts. @Allahverdi Verdizade to wit. But proper nouns. Paskalya isn’t a proper noun.
It is interesting that current Old Church Slavonic proper nouns have no capitalization, but upper case versions hard-redirect as for example Исоусъ. Maybe it is more natural to capitalize entries in every Slavic language, because otherwise one starts from some terms in upper case and then goes over to lower case at an inconspicuous threshold, see Russian Иисус (Iisus). I find it hard to conceive that there is a diachronic barrier between Slavic languages where it makes click and suddenly I have to use only lowercase, and others will find it hard too, I predict – much easier it is to say “it’s done like one does in the modern languages”. Proto-Slavic proper nouns have uppercase again, and that seems comfy.
For Arabic, a part is based on the fact there is auto-transcription of vocalized text, which is also on some wishlists for Hebrew script. It would be booky if {{ar-proper noun}} only made upper-case transcriptions. Majuscules are not needed in the head-lines anyway. I do transcribe names in quote translations with upper-case letters, because the opposite in English text is very unnatural and it is distracting to have uncapitalized proper nouns if the language consistently employs majuscules in such kinds of words.
Don’t derive anything from the cuneiform spellings. These languages have been very wild and generally not edited by people who should work in cuneiform. And now further chaos ensues from that pernicious vote according to which Akkadian can be entered in Latin transliterations. Nobody will clean up and expand our Akkadian entries if not some exotic conspiracy.
And now from the more clear case let’s explain the Germanic: Since Slavic insinuates universal majuscule usage, let’s hold it like that at least in every Latin-script entry that is of the Germanic language family. Whether the other scripts should be treated thus I am in no position to assess. Fay Freak (talk) 02:03, 2 October 2019 (UTC)
This discussion is too long, I haven't read all of it. A few points from me. Generally, it's been agreed that capitalisations should not be used for transliterating languages where there is no such distinction BUT, as for Chinese and Japanese, the capitalisation applies to proper nouns and the start of the sentences based on specific policies for hanyu pinyin (Chinese) and standard Hepburn rōmaji (Japanese). Korean romaja followed the same but this can and probably should be disputed if no such policy really exists. I was one of the proponents of the capitalisation for Korean but I won't insist on keeping this policy. That said, Chinese (multiple romanisations) and Japanese capitalisation don't follow the English capitalisation, e.g. month names or names of weekdays are lower case, so are nationalities/language names (this is still being disputed!). It's also worth noting that capital letters may mean a completely different sound or represent a different letter (in the original script) in certain alternative transliterations, that's why it should definitely be avoided for Arabic, Hindi, etc. Most editors have agreed to use just the lower case, AFAIK. For languages with dual scripts, such as Kurdish or Mongolian, I think it's still correct to romanise the script with only lower case for the script, which doesn't distinguish between capital and small letters. --Anatoli T. (обсудить/вклад) 10:07, 2 October 2019 (UTC)

Coming back to the Germanic languages, I see that Fay Freak is of the opinion that majuscules should be used word-initially for proper nouns in Latin-script languages; Rua seems to favour the use of non-capitalized forms reflecting the manuscripts except for Old Norse and User:Holodwig21 again shares Fay Freak's opinion and seems to want to extend it to Gothic too.

I'm still on the fence: on the one hand, standardizing them all to use an upper/lowercase distinction is intuitive to the modern reader. On the other hand, as Rua mentions the manuscript forms just totally lack them, and it isn't certain that the degree of normalization found when ancient words are used in modern scholarly texts is necessarily the same degree that should be used in a descriptive dictionary such as ours.

I guess it is similar to the situation of scholars who create either a regular edition of a text (typically capitalizing proper nouns even in languages which had no case distinctions) or a diplomatic edition of a text (which seeks to stick as closely as possible to the manuscript sources). In the latter case, proper nouns are rarely capitalized when the edition reflects texts in scripts that lack capitalization, and I happen to see my work in chronicling old languages on Wiktionary as more analogous to that of creating a diplomatic edition than anything else: I want to represent the language as it is found in the sources (ad fontes!) as much as is reasonably possible.

As a last point, I guess I can say with certainty that for non-Latin ancient and medieval scripts which never in their history had an upper/lowercase distinction (such as the Greek and Cyrillic scripts developed later on) it makes no sense to me to use that distinction in transliteration; e.g. Runic and Gothic should not have capitalized transliterations (yes, Streitberg's edition of the Gothic Bible uses capitalization, but he is not providing a diplomatic edition nor is he compiling a descriptive dictionary). — Mnemosientje (t · c) 14:00, 13 October 2019 (UTC)

The consultation on partial and temporary Foundation bans just started

-- Kbrown (WMF) 17:14, 30 September 2019 (UTC)

@Kbrown (WMF): Your first two links are broken. - TheDaveRoss 18:02, 30 September 2019 (UTC)
Both links should have been to Wikipedia:Community response to the Wikimedia Foundation's ban of Fram/Official statements#Board statement.  --Lambiam 20:14, 30 September 2019 (UTC)
For readers unfamiliar with the background, read this “Special report” in today’s issue of the Wikipedia Signpost.  --Lambiam 20:38, 30 September 2019 (UTC)
Out of pure curiosity, how many of our users see "Wikimedia" as part of what we do, and not an external weird alien that sometimes deigns to dip its fingers into our hobby? Equinox 05:17, 1 October 2019 (UTC)
I, for one, do not care about WMF. --Vealhurl (talk) 16:31, 1 October 2019 (UTC)
Since I talked about fingers, I suppose one shouldn't bite the hand that feeds: they do provide server space. I can guarantee though that in the next six months or so there will be political incursions. I hope someone has a non-WMF backup in case we have to do the biggest fork ever. Equinox 16:34, 1 October 2019 (UTC)

October 2019

Cleaning up request templates

I went through and made a list of the request templates listed in WT:Templates with current language parameter, and found all the aliases and uses of them. Most of the aliases were obscure, hardly used and undocumented, so I went ahead and orphaned and deleted them. Some, however, need discussion:

| Aliased template | Canonical template | #Uses | Comments | Outcome |
| Template:tea room | Template:tea room | 100 (including the 69 uses of {{rft}}, hence only 31 uses of {{tea room}}) | Maybe replace with {{rft}} or {{tea}} (see also barely used template {{beer}}) | Kept as canonical. |
| Template:rft | Template:tea room | 69 | Maybe this should be canonical? | Deleted. |
| Template:tea | Template:tea room | (unused, but mentioned in Template:tea room) | Maybe this should be canonical? Although it's quite short for a not-very-used template. | Deleted. |
| Template:rfdef | Template:rfdef | 65117 (including the 18481 uses of {{defn}}, hence 46,636 uses under this name) | Canonical name. | Kept. |
| Template:defn | Template:rfdef | 18481 | Clearly not a good name, but has many uses; I propose orphaning and deprecating it, rather than deleting it. | Orphaned in favor of {{rfdef}}. |
| Template:request for etymology | Template:request for etymology | 34200 (including the 34194 uses of {{rfe}}, hence only 6 uses under the longer name) | Canonical name. I propose deleting this name in favor of {{rfe}}. | Deleted in favor of {{rfe}}. |
| Template:rfe | Template:request for etymology | 34194 | I propose making this the canonical and only name. | Kept. |
| Template:request for references | Template:request for references | 21 (including the 20 uses of {{rfr}}, hence only 1 use under the longer name) | Canonical name. I propose deleting this name in favor of {{rfr}}. | Deleted in favor of {{rfref}}. |
| Template:rfr | Template:request for references | 20 | I propose making this the canonical and only name. | Deleted in favor of {{rfref}}. |
| Template:rfv-etymology | Template:rfv-etymology | 681 (681 - 46 = 635 uses under this name) | This is more-used than {{rfv-etym}}, but I prefer the latter, shorter name and propose orphaning and eliminating this template in favor of {{rfv-etym}}. | Deleted in favor of {{rfv-etym}}. |
| Template:rfv-etym | Template:rfv-etymology | 46 | I propose making this the canonical and only name. | Kept. |
| Template:rfv-pronunciation | Template:rfv-pronunciation | 164 | This is more-used than {{rfv-pron}}, but I prefer the latter, shorter name and propose orphaning and eliminating this template in favor of {{rfv-pron}}. | Deleted in favor of {{rfv-pron}}. |
| Template:rfv-pron | Template:rfv-pronunciation | 6 | I propose making this the canonical and only name. | Kept. |
| Template:sense stub | Template:sense stub | 836 (including the 446 uses of {{rfgloss}}, hence 390 uses under this name) | Canonical name. I propose orphaning and eliminating this template in favor of {{rfgloss}}. | Deleted in favor of {{rfclarify}}. |
| Template:stub-gloss | Template:sense stub | (none; formerly 3) | I already orphaned this in favor of {{rfgloss}} and deleted it. | Deleted. |
| Template:rfgloss | Template:sense stub | 446 | I propose making this the canonical and only name. | Deleted in favor of {{rfclarify}}. |
| Template:gloss-stub | Template:sense stub | (none; formerly 122) | I already orphaned this in favor of {{rfgloss}} and deleted it. | Deleted. |

There are about 35 or 40 request templates, the vast majority of which begin with rf followed by an abbreviation or short word. I propose bringing the remainder under this scheme. There are two cases where the canonical name is long ({{request for etymology}} and {{request for references}}), but in both cases the long names are rarely used, and the shorter versions {{rfe}} and {{rfr}} are almost always found. I propose eliminating the long names in favor of the short names, consistent with the other request templates. Similarly, I propose eliminating {{sense stub}} (a misnomer in any case, as this template concerns glosses of foreign terms) in favor of {{rfgloss}}. I'm not quite sure what to do with {{tea room}} vs. {{rft}} vs. {{tea}}. Maybe {{rft}} should be canonical for consistency with the other request templates, and because it's the most used.

NOTES:

  1. My threshold for delete vs. deprecation is usually 1000 uses.
  2. In the table above, the counts for canonical names include uses under all aliases.

Benwing2 (talk) 01:37, 1 October 2019 (UTC)
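The arithmetic behind the two notes above can be sketched as follows, in Python. The counts are copied from the table; the dictionary layout, the function names, and the reading of the threshold (1000 uses or more means orphan and deprecate, fewer means delete outright) are assumptions made for illustration, not a description of any bot actually used.

```python
# Total uses per canonical template, including uses under all aliases
# (counts copied from the table above).
TOTAL_USES = {
    "rfdef": 65117,
    "request for etymology": 34200,
    "sense stub": 836,
}

# Uses recorded under each alias of a canonical template.
ALIAS_USES = {
    "rfdef": {"defn": 18481},
    "request for etymology": {"rfe": 34194},
    "sense stub": {"rfgloss": 446},
}

# Assumed reading of note 1: at or above this many uses, deprecate
# rather than delete.
DEPRECATION_THRESHOLD = 1000

def own_name_uses(canonical: str) -> int:
    """Uses under the canonical name itself: total minus all alias uses."""
    return TOTAL_USES[canonical] - sum(ALIAS_USES.get(canonical, {}).values())

def retirement_action(uses: int) -> str:
    """How a name that is being retired would be handled under the threshold."""
    return "deprecate" if uses >= DEPRECATION_THRESHOLD else "delete"

for name in TOTAL_USES:
    print(name, own_name_uses(name))
print("defn:", retirement_action(ALIAS_USES["rfdef"]["defn"]))
```

Under these assumptions {{defn}}, with 18481 uses, lands on the deprecation side, matching the proposal to orphan and deprecate it rather than delete it.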

This seems like a worthwhile effort and the specific proposals seem good. Could consideration be given to having the canonical templates default to a 'lite' display? Compare {{rfelite}} and {{rfe}}. Very few requests really warrant the warning function of the large request display boxes. Requiring more typing (longer alias or switch) to display the larger size seems appropriate to me. DCDuring (talk) 02:34, 1 October 2019 (UTC)
@DCDuring I agree with you that the big boxes are annoying. How should we proceed? One way is just to change the format of {{rfe}} and related templates that display a big box (e.g. {{rfp}}, {{rfap}}) to use the "lite" display by default, and take a |big=1 param to display a big box instead. This would change the way a lot of pages look though, so I'd want to make sure others are in agreement. Another possibility is to make the same change but also bot-add |big=1 everywhere so that the display doesn't change. Benwing2 (talk) 14:22, 1 October 2019 (UTC)
Whatever is acceptable. We are still "under construction", but the big boxes don't much help attract contributors at this point. The option of "big=1" will probably have its uses. DCDuring (talk) 14:32, 1 October 2019 (UTC)
I expect that box=1 will convey the intention more clearly.  --Lambiam 15:32, 1 October 2019 (UTC)
I also support lite boxes by default. Ultimateria (talk) 17:38, 1 October 2019 (UTC)
{{rfgloss}} is a bad name to distinguish from {{rfdef}}, because there is no reason why the latter couldn’t be called the former. {{rfclarify}} looks good. The alias {{defn}} for {{rfdef}} is gross.
Request for etymology templates: Note that the text in |2= in {{rfe}} has been displayed on pages so far. It often contains hypotheses or speculations which have been consciously made visible and are better than nothing until the etymology is cleared up by someone who knows better, often after years. Whatever you change, don’t change the visibility of that text, which editors have relied upon.
Tea room: Perhaps {{tea}} and {{tea sense}}
References: {{rfref}} better. You know as a programmer how common the clipping “ref” is, so this is catchy.
Request for verification of pronunciation: Not sure if {{rfv-pron}} is good, it’s a bit lewd, and maybe analogy makes {{rfvp}} (or {{rfv-ipa}}?) better. Apart from the issue that verification of pronunciation meets only limited resources so I wonder why anyone would want to use the template or what he would gain with it. Fay Freak (talk) 15:35, 1 October 2019 (UTC)
I agree with most of Benwing's suggestions, and with Fay Freak that {{rfref}} is better than {{rfr}}. "pron" seems fine to me; we use it in plenty of templates and modules already. Ultimateria (talk) 17:38, 1 October 2019 (UTC)
I think I'll go with {{rfref}} and {{rfclarify}} per Fay Freak, {{rfv-pron}} per Ultimateria and maybe {{rft}} for Tea Room links (but not sure about the last). I agree with Lambiam about |box=1. Benwing2 (talk) 00:46, 2 October 2019 (UTC)
I vote for {{tea room}}; {{rft}} confuses me because it looks like "request for tea", or for something else, like translation or transliteration. — Eru·tuon 01:29, 2 October 2019 (UTC)
@Erutuon There's also {{rft-sense}}; what would you call that? Benwing2 (talk) 01:45, 2 October 2019 (UTC)
The text for {{rft-sense}} says "Discuss this sense" so we could call it {{rfdiscuss}} or something. Note that there are only 17 uses so it could even be transformed into a param of some other template. Benwing2 (talk) 01:47, 2 October 2019 (UTC)
It isn’t a request for anything specific, but rather an invitation, and gets removed even if no contribution has taken place after some time, or no? Hence I suggested {{tea}} and {{tea sense}}. Maybe call it {{invitation for tea}} and {{invitation for tea-sense}}. Fay Freak (talk) 02:55, 2 October 2019 (UTC)
Maybe {{tea room}} and {{tea room sense}}? Benwing2 (talk) 04:50, 2 October 2019 (UTC)
None of the tea room templates get regularly or systematically removed by anyone. DTLHS (talk) 05:18, 2 October 2019 (UTC)
@Benwing2: Here are my suggestions (I wanted to submit these but forgot to do so):
  1. {{tea room}} - Keep as canonical. Rename rest.
  2. {{rfdef}} - Keep as canonical.
    {{defn}} is used primarily in Han character entries and can be converted to {{rfdef|lang}} without the sort key. {{defn|cmn|sort=}} and {{defn|yue|sort=}} can both be converted to {{rfdef|zh}} due to unified Chinese.
  3. {{request for etymology}} - Convert to {{rfe}}, as suggested.
  4. {{request for references}} - {{rfref}} is much better
  5. {{rfv-etymology}} - Convert to {{rfv-etym}}, as suggested.
  6. {{rfv-pronunciation}} - Convert to {{rfv-pron}}, as suggested.
  7. {{sense stub}} - Convert to {{rfgloss}}, as suggested. KevinUp (talk) 10:22, 2 October 2019 (UTC)
Updated thoughts:
  1. I've noticed that posting directly at the Beer Parlour and Tea Room attracts more traffic than using templates such as {{tea room}} or {{beer}}. Do we really need these templates? I'm not keen on seeing these boxes because they appear to be slightly intrusive and need to be removed from time to time.
  2. I found only 146 uses of {{defn}} in Latin script entries compared to 13600 uses of {{defn}} in Han script character entries. I noticed that WingerBot has already converted {{defn|cmn|sort=}} and {{defn|yue|sort=}} to {{rfdef|cmn|sort=}} and {{rfdef|yue|sort=}} (with the sort key), but both of these can actually be converted to {{rfdef|zh}} without the sort key because sorting is now done automatically via Module:zh-sortkey/data.
  3. Matter resolved.
  4. Matter resolved.
  5. Matter resolved.
  6. Matter resolved.
  7. {{rfclarify}} is indeed much better compared to {{rfgloss}}. KevinUp (talk) 10:22, 2 October 2019 (UTC)
@KevinUp I've already converted over half of the {{defn}} templates to {{rfdef}}. I'll let it finish now, and do a separate pass sometime later to clean up the sort keys and unify the Chinese entries, as you suggest. For which languages can the sort key be unilaterally removed? You mention cmn, yue, zh; what about ko, ja, vi, ...? Benwing2 (talk) 11:04, 2 October 2019 (UTC)
@Benwing2: The sort keys for cmn, yue, ja, ko, vi, zh for {{rfdef}} involving entries in Category:Han script characters can be removed. I've confirmed this in this edit. KevinUp (talk) 21:55, 2 October 2019 (UTC)
@KevinUp: Just to confirm, is it okay that {{rfdef}} doesn't add a radical–stroke sortkey for cmn, ja, ko, yue if |sort= isn't provided? Here are testcases for this in Special:ExpandTemplates. — Eru·tuon 22:21, 2 October 2019 (UTC)
Okay. Turns out I was wrong. Vietnamese and Chinese do not require the sort key, but Korean and Japanese require the sort key to work properly. cmn and yue also need the sort key, but they are to be converted into {{rfdef|zh}} which does not need the sort key. Thanks for pointing this out. KevinUp (talk) 22:40, 2 October 2019 (UTC)
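The rule as restated here could be sketched like this, in Python, assuming the merge of cmn and yue into zh is wanted and that only Han character entries are being processed. The function name, the regular expression, and the sample sort-key value are all hypothetical, and the pattern only recognises calls that have a language code and an optional |sort= parameter.

```python
import re

# Language codes as grouped in the post above.
MERGE_INTO_ZH = {"cmn", "yue"}   # converted to {{rfdef|zh}}, sort key dropped
DROP_SORT_KEY = {"vi", "zh"}     # sorting works without |sort=
# Every other code, including ja and ko, keeps whatever |sort= it had.

# {{defn|xx}} or {{rfdef|xx}}, optionally followed by |sort=...
REQUEST = re.compile(r"\{\{(?:defn|rfdef)\|([a-z-]+)(\|sort=[^|}]*)?\}\}")

def normalize_rfdef(wikitext: str) -> str:
    """Rewrite {{defn}}/{{rfdef}} calls according to the sort-key rule."""
    def rewrite(match: re.Match) -> str:
        lang = match.group(1)
        sort = match.group(2) or ""
        if lang in MERGE_INTO_ZH:
            return "{{rfdef|zh}}"
        if lang in DROP_SORT_KEY:
            return "{{rfdef|" + lang + "}}"
        return "{{rfdef|" + lang + sort + "}}"
    return REQUEST.sub(rewrite, wikitext)

# "人00" is a made-up sort key used only for illustration.
print(normalize_rfdef("{{defn|cmn|sort=人00}}"))
print(normalize_rfdef("{{rfdef|ja|sort=人00}}"))
```

A real pass would also have to check that the page is in Category:Han script characters and that the section header still agrees with the resulting language code.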
@KevinUp Are we sure we want to convert occurrences of cmn, yue and hak to zh? See for example , which has occurrences of {{rfdef}} in separate "Mandarin" and "Cantonese" sections. Once we convert, the section headers will disagree with the language code of {{rfdef}}. Following is a table of the occurrences of various languages in {{rfdef}} usages that were converted from {{defn}}:
8508 ja
7216 ko
6689 cmn
5579 vi
4729 yue
27 zh
1 hak
Benwing2 (talk) 02:36, 3 October 2019 (UTC)

Merging {{rfe}}, {{rfelite}}, {{etystub}}

These three templates do similar things and it's not obvious to me they deserve to be separate. Here's an example (from the Moksha синь (sinʹ) page):

1. Using {{rfe}}:

Cognates include Erzya сынь (synʹ), perhaps cognate with Northern Sami sii. (This etymology is missing. Please add to it, or discuss it at the Etymology scriptorium.)

2. Using {{rfe|inline=yes}}:

Cognates include Erzya сынь (synʹ), perhaps cognate with Northern Sami sii. (This etymology is missing. Please add to it, or discuss it at the Etymology scriptorium.)

3. Using {{rfelite}}:

Cognates include Erzya сынь (synʹ), perhaps cognate with Northern Sami sii. You can help Wiktionary by providing a proper etymology.

4. Using {{etystub}}:

Cognates include Erzya сынь (synʹ), perhaps cognate with Northern Sami sii. This etymology is incomplete. You can help Wiktionary by elaborating on the origins of this term.

User:DCDuring suggests that the default display box of {{rfe}} is too glaring, and I agree. I suggest that we synthesize the three wordings in some fashion and standardize on {{rfe}}, which is changed to display the text inline unless |box=1. Note that currently {{rfe}} has about 34,000 uses while {{rfelite}} and {{etystub}} have around 4,000 each, so standardizing on {{rfe}} will be the least disruptive (as well as the shortest name). Benwing2 (talk) 03:46, 2 October 2019 (UTC)

The wording for etystub is different because it is used where there is some etymological information, but arguably not a complete one. Also whatever template is used for an incomplete etymology, it should probably appear after the existing, incomplete etymology and a space appearing as follows:
Cognates include Erzya сынь (synʹ), perhaps cognate with Northern Sami sii.
This etymology is incomplete. You can help Wiktionary by elaborating on the origins of this term.
Otherwise the request is visually lost, IMO. DCDuring (talk) 04:32, 2 October 2019 (UTC)
@DCDuring I'm fine with putting the request on a separate line. But there's absolutely no consistency in the use of {{rfe}}, {{rfelite}} and {{etystub}}; people aren't observing the distinction between missing and incomplete etymologies. The Moksha page I reference above, for example, uses {{rfe}} with an incomplete etymology. I suggest that we choose some wording that works for both missing and incomplete etymologies, maybe like this:
This etymology is missing or incomplete. Please add to it, or discuss it at the Etymology scriptorium.
Benwing2 (talk) 04:39, 2 October 2019 (UTC)
It seems to me that there are differences that deserve to be recognized among missing etymology ({{rfe}}), incomplete etymology ({{etystub}}), and one that is being challenged ({{rfv-etym}}). I find that sometimes smaller tasks, like completing an etymology, fit my mood, sharpness, and energy level better than possibly larger tasks like providing a complete etymology. I also note that some users fail to remove these etymologies, even when the etymologies seem pretty good to me. Is it carelessness or are they requesting that someone else review it? I doubt that all such etymologies should go to the Etymology Scriptorium. It also seems like a task that requires a higher level of skill.
If some users use the wrong one for the situation, then someone seeing it can and should correct it. Some such needed corrections could be identified by regex searches.
I guess I am thinking that these tags should fit into a more refined workflow than has been customary, probably requiring more maintenance categories. The implementation would ideally not require less-frequent users to learn brand-new ways of doing things, but would allow frequent and skilled users to work more effectively. My perceptions may just be wrong about this, but I'd like to hear thoughts of others about workflow. DCDuring (talk) 05:09, 2 October 2019 (UTC)
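As a rough illustration of the regex search mentioned above, the following sketch flags an etymology section that carries an {{rfe}} or {{rfelite}} tag even though other text is already present, which may mean the incomplete-etymology tag was intended. This is only a heuristic under stated assumptions: the function name is hypothetical, the section text is assumed to have been extracted already, and any leftover non-whitespace text is treated as an existing etymology.

```python
import re

# Matches {{rfe|...}} or {{rfelite|...}} with simple (non-nested) parameters.
RFE_RE = re.compile(r"\{\{\s*(?:rfe|rfelite)\s*(?:\|[^{}]*)?\}\}")


def rfe_on_nonempty_etymology(section_text):
    """Return True if the section has an rfe tag plus other text.

    Heuristic only: any non-whitespace text left after removing the tag
    is taken to mean that some etymology is already present.
    """
    if not RFE_RE.search(section_text):
        return False
    remainder = RFE_RE.sub("", section_text)
    return bool(remainder.strip())


print(rfe_on_nonempty_etymology("Cognates include Erzya сынь. {{rfe|mdf}}"))
print(rfe_on_nonempty_etymology("{{rfe|mdf}}\n"))
```

A hit is a candidate for human review, not an automatic correction, since a short note next to the tag does not always amount to a partial etymology.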
@DCDuring I see your point, although I still have the feeling that the current usage of {{rfe}} and {{etystub}} is a total mess. If you want to maintain this distinction, we should definitely rename {{etystub}}, as its current name gives no clear indication that it's for partial etymologies (in fact to me, "stub" suggests there's basically nothing there). Perhaps {{rfe-partial}} would be a clearer name. Benwing2 (talk) 06:19, 2 October 2019 (UTC)
The main difference between {{rfe}} and the other related templates is that an additional line of text can be displayed after the language code (Particularly: “...”) but this is sometimes misused to include possible theories and incorrect information that are more suitable for discussion pages.
As with {{tea room}} and {{beer}}, I don't think {{rfe}} actually attracts user traffic to the Etymology scriptorium. I would prefer for the box format to be replaced by short inline sentences and the particularly: “...” statements completely hidden or moved to the talk page.
{{rfelite}} displays a short, clean sentence. I prefer this format as it can be added to almost any lemma entry that lacks an etymology header. Whenever I find {{rfe|lang}} without additional "particularly ..." statements, I would convert the template to {{rfelite|lang}} which is less intrusive. KevinUp (talk) 10:22, 2 October 2019 (UTC)
  1. {{rfe}} - Keep the name, but the current format needs to be revamped.
  2. {{rfe|inline=yes}} - Much better and less intrusive. Can be used without the |inline=yes parameter to replace the current boxed format. However, it is a bit wordy. I prefer the statement from {{rfelite}}.
  3. {{rfelite}} - Possibly delete after merging with {{rfe}}.
  4. {{etystub}} I would say keep. This template can be used to indicate that the information is incomplete, e.g. some intermediate ancestors were skipped. This template categorizes entries into Category:Requests for expansion of etymologies by language. KevinUp (talk) 10:22, 2 October 2019 (UTC)
@KevinUp If you want to keep {{etystub}}, what do you think about an alternative name like {{rfe-expand}}? It's 3 more letters to type but I think it much more clearly describes the intention (request for expansion of an existing etymology). Benwing2 (talk) 11:08, 2 October 2019 (UTC)
@Benwing2: Good idea. {{etystub}} was meant to be similar to {{sense stub}}. Using {{rfe-expand}} for categorization into "Category:Requests for expansion" is a fitting choice. KevinUp (talk) 21:55, 2 October 2019 (UTC)
Could we add a switch to rfe to request a review for an etymology without forcing it to go to WT:ES? Presumably the switch could put the L2 (or L3?) in a maintenance category. DCDuring (talk) 15:00, 2 October 2019 (UTC)
@KevinUp Another possible name is {{rfe-exp}}, which is the same length as {{etystub}} and is analogous to the existing {{rfexp}} ("request for expansion"). Thoughts? Benwing2 (talk) 06:36, 3 October 2019 (UTC)
@Benwing2: I think {{rfe-expand}} would be better as the canonical name because "exp" can be interpreted as experimental, experience, etc. Anyway, a shortcut can still be created to redirect {{rfe-exp}} to {{rfe-expand}}. KevinUp (talk) 19:34, 20 October 2019 (UTC)
@DCDuring Sure, we can add that switch. Although, I don't think there's a problem with mentioning the WT:ES by default as an option (not something forced), using text like this (duplicated above):
This etymology is missing. Please add to it, or discuss it at the Etymology scriptorium.
I think this text is succinct enough to be the replacement text for both {{rfe}} and {{rfelite}}. Benwing2 (talk) 06:39, 3 October 2019 (UTC)
OK, but it would be helpful if regulars at WT:ES chimed in. DCDuring (talk) 15:10, 3 October 2019 (UTC)
As a regular user of those templates, I don't have a problem with a complete merge. Canonicalization (talk) 19:29, 3 October 2019 (UTC)
@DCDuring, KevinUp, Canonicalization I switched {{rfe}} to display inline unless |box=1. If you specify |noes=1, it suppresses the reference to the Etymology scriptorium. You can see the various possibilities at User:Benwing2/test-rfe. If the display is acceptable, I will redirect {{rfelite}} to {{rfe}}. Benwing2 (talk) 18:33, 20 October 2019 (UTC)
The current display without the box is much better compared to the previous display. I would suggest for the default output of {{rfe|LANG}} to become "You can help Wiktionary by providing a proper etymology." if no additional parameters are specified.
I would also suggest for the displayed message to become "This etymology is missing. Please add to it, or discuss it at the Etymology scriptorium." if |es=1 is specified or "This etymology is missing or incomplete. Please discuss it at the talk page, particularly "..."." if |2= or |talk=1 is specified.
This is because some editors use talk pages to discuss word origins rather than the etymology scriptorium. I'm also good with the current display. Anyway, {{rfelite}} can be deprecated. KevinUp (talk) 19:34, 20 October 2019 (UTC)
@KevinUp Thanks for your comments. I'm fine with changing the wording but your suggestions seem a bit odd in that using |es=1 completely changes the wording in ways that aren't obviously related to the presence or absence of "Etymology scriptorium". Benwing2 (talk) 21:28, 20 October 2019 (UTC)
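For readers following the parameter discussion, here is a minimal sketch of the switching behaviour described in this thread: inline display unless |box=1, and no Etymology scriptorium pointer when |noes=1. It is not the actual template code. The function name render_rfe is hypothetical, the wording is assumed from the proposal quoted earlier in the thread, and the bracketed output is only a plain-text stand-in for the boxed layout.

```python
def render_rfe(box=False, noes=False):
    """Return the request text for a hypothetical {{rfe}} rendering.

    box=True  -> boxed display (modelled on |box=1)
    noes=True -> omit the Etymology scriptorium pointer (modelled on |noes=1)
    """
    # Wording assumed from the proposal in this thread.
    message = "This etymology is missing or incomplete. Please add to it"
    if noes:
        message += "."
    else:
        message += ", or discuss it at the Etymology scriptorium."
    if box:
        # Plain-text stand-in for the boxed layout.
        return "[ " + message + " ]"
    # Inline display, shown in parentheses after any existing text.
    return "(" + message + ")"


print(render_rfe())
print(render_rfe(noes=True))
print(render_rfe(box=True))
```

Keeping the two switches independent, as here, avoids the issue raised about |es=1, where one parameter would change wording unrelated to the scriptorium link.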

Templates for coined words

I couldn't find an existing template that expresses that a word was coined by a certain person. Especially for newer vocabulary (neologisms), this information may be well documented and worth presenting under the Etymology section (not just under a quotation). I'm thinking of implementing such a template - perhaps {{coinage}} or {{coined}} that would take

  1. the person who coined it
  2. (optionally) whether to add a Wikipedia link for that person (and the language to add it in)
  3. (optionally) the date of coinage, if known.

In addition, the template could then be used to categorize entries under the person who coined them, which could be particularly useful for some European languages (such as Finnish) where words have been coined in larger numbers by some people, such as Elias Lönnrot, who put together a very sizable Finnish-Swedish dictionary and coined a lot of the Finnish words himself, many of which are still used today.

My initial proposal is {{coinage|fi|Elias Lönnrot|w=en}} showing up as "Coined by Elias Lönnrot." (with a nodot and year/date option available as well) and adding the page into a category called "Category:Finnish words coined by Elias Lönnrot" (which also needs some kind of category structure). Any thoughts? — surjection?〉 13:29, 2 October 2019 (UTC)

I now have an initial version available in my userspace. Some small changes: the date parameter exists and is called "in", so {{coinage|und|coiner|in=year}}, and w now also accepts the article name if given in the format lang:Article name. — surjection?〉 15:11, 2 October 2019 (UTC)
See Wiktionary:Beer parlour/2017/December#Template for coinages. —Μετάknowledgediscuss/deeds 15:56, 2 October 2019 (UTC)
As well as Wiktionary:Tea room/2018/January § Netflix and chill. Canonicalization (talk) 16:01, 2 October 2019 (UTC)
I've taken a look at both - because of them, I changed the categorization structure a bit, allowing both "(language) coinages" and "(language) terms coined by (coiner)" and the latter to be disabled separately. — surjection?〉 16:15, 2 October 2019 (UTC)
Although I don't feel that the parameter name litecat is the best, so I'm open to suggestions. — surjection?〉 16:30, 2 October 2019 (UTC)
We'll need a strict definition in our glossary, copied over to the template documentation, and perhaps even have the word "Coined" in the template display link to the glossary entry. Obviously, first attestations are not necessarily coinages, but we also need to clarify the status of words like weeaboo (coined as a nonce word, but used with an unrelated meaning), flan (a slip of the tongue humorously repurposed as a coinage), Imogen (a misprint later accepted as a coinage), and cdesign proponentsist or medireview (misprints humorously repurposed as a coinage). —Μετάknowledgediscuss/deeds 18:42, 2 October 2019 (UTC)
I have added a link to Appendix:Glossary#coinage, but the entry needs to be defined. I feel the definition should take into account that the word was intentionally created (not just first written down) by a person or another entity in order to describe something definite. — surjection?〉 19:02, 2 October 2019 (UTC)
I support the creation of this template. I think it can be used to categorize entries in Category:English terms first attested in Shakespeare. My only concern is the categorization of entries into "Category:English words coined by author XX". Will this be done automatically or manually? I've seen coinages that were eventually deleted due to lack of widespread use. I think categorization for specific authors can be manually added to the module/template, rather than appear automatically, to prevent incorrect categories from popping up. KevinUp (talk) 21:55, 2 October 2019 (UTC)
Categorization for specific coiners is done automatically, but can be disabled with a certain parameter given to the template. I don't see how the "lack of widespread use" really affects anything here, as the creation of this template wouldn't somehow subvert WT:CFI. — surjection?〉 22:23, 2 October 2019 (UTC)
Just a note that most words first attested in Shakespeare are not thought to have been coined by him, so this would not change that category at all. The idea of having the module know a list of people to make categories for is an appealing way to avoid lots of categories with one entry, though. —Μετάknowledgediscuss/deeds 22:27, 2 October 2019 (UTC)
A more flexible approach is to occasionally check for such categories with only one entry by using a bot or the like. — surjection?〉 22:36, 2 October 2019 (UTC)
I will soon be releasing this as {{coinage}}, with {{coin}} as an alias. The final change is renaming |litecat= to something else. This of course doesn't mean the template can't have any further changes done to it, but they should from now on aim to be backwards compatible. — surjection?〉 09:31, 3 October 2019 (UTC)
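The behaviour proposed in this thread can be sketched roughly as follows: a display string plus two categories, with the per-coiner category individually switchable. This is an illustration under assumptions, not the template's implementation. The names render_coinage, LANGUAGE_NAMES and cat_coiner are hypothetical, the placement of the year after the coiner's name is a guess, and the category names follow the "(language) coinages" and "(language) terms coined by (coiner)" pattern mentioned above.

```python
# Hypothetical lookup; a real implementation would use the site's language data.
LANGUAGE_NAMES = {"fi": "Finnish", "en": "English"}


def render_coinage(lang, coiner, year=None, nodot=False, cat_coiner=True):
    """Return (display_text, categories) for a coinage statement."""
    text = "Coined by " + coiner
    if year is not None:
        text += " in " + str(year)  # assumed placement of the date
    if not nodot:
        text += "."
    language = LANGUAGE_NAMES[lang]
    categories = [language + " coinages"]
    if cat_coiner:
        # Per-coiner category; can be switched off for one-off coiners.
        categories.append(language + " terms coined by " + coiner)
    return text, categories


text, categories = render_coinage("fi", "Elias Lönnrot")
print(text)
print(categories)
```

Making the per-coiner category optional is one way to address the concern about many categories with a single entry; a list of known coiners inside the module would be an alternative.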

Further reading and References at L3 or L4 or L5

The heading level for ===Further reading=== and ===References=== appears to be inconsistent. Statistically, I found the following:

Further reading:

  1. 405 entries (0.24%) with "Further reading" at L5.
  2. 18,421 entries (10.77%) with "Further reading" at L4.
  3. 152,293 entries (89.00%) with "Further reading" at L3.

References:

  1. 1,572 entries (0.59%) with "References" at L5.
  2. 50,178 entries (18.77%) with "References" at L4 - 29,418 entries (11.00%) are from Han script characters.
  3. 215,636 entries (80.65%) with "References" at L3.

Are there official guidelines regarding the heading levels of these two headings? If not, can we start a vote or mini-vote regarding this matter? Previous discussion can be found here. KevinUp (talk) 21:55, 2 October 2019 (UTC)
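The percentages above can be reproduced from the raw entry counts with a short calculation. The sketch below uses only the figures quoted in this post; the function name shares is made up for the illustration.

```python
def shares(counts):
    """Return each count as a percentage of the total, rounded to 2 places."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 2) for level, n in counts.items()}


# Entry counts per heading level, as quoted above.
further_reading = {"L3": 152293, "L4": 18421, "L5": 405}
references = {"L3": 215636, "L4": 50178, "L5": 1572}

print(shares(further_reading))
print(shares(references))

# Share of all "References" entries that are Han script characters at L4.
han_l4 = 29418
print(round(100 * han_l4 / sum(references.values()), 2))
```

The same function can be rerun on fresh dump statistics if the counts are regenerated before a vote.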

Option 1 - Both headings at L3 by default
Option 2 - Both headings at L4 by default
Option 3 - Headings at L4 if multiple etymologies or POS exist
Comments
  • My only comment is that the ====References==== header under the "Translingual" section for Han script characters can be replaced by ===Further reading=== at L3 level as previously suggested by Justinrleung here. KevinUp (talk) 21:55, 2 October 2019 (UTC)
In general, I think that ===References=== should always be at L3 and always at the bottom of an entry. Individual references should be inlined using <ref>...</ref>.
I can see use cases for Further reading being at L4 or even L5, depending on the structure of the entry and what the "further reading" is intended to apply to. For instance, a multi-etym entry would have POSes at L4, and a Further reading section intended for that etym would thus also be at L4. A Further reading section intended for a specific POS would then be at L5. (This assumes that Further reading sections are allowed for specific POSes.)
‑‑ Eiríkr Útlendi │Tala við mig 22:04, 2 October 2019 (UTC)
I treat References as a per-entry section (L3) while Further reading is a per-term section (L4 if POS is at L3, L5 if POS is at L4). —Rua (mew) 14:59, 3 October 2019 (UTC)
Although I almost always put References at L3, I’d support having both be flexible, so that they can be used per-entry or per-term as needed. — Vorziblix (talk · contribs) 15:57, 3 October 2019 (UTC)
I agree with this, let's keep them flexible.--So9q (talk) 19:36, 7 October 2019 (UTC)
It depends on what the further reading or references section refers to, it seems, particularly if to support the etymology or maybe the pronunciations or representing further information on the senses. And then again there is no gain with unifying as everything is moved one level down if there are multiple etymologies as there are references sometimes for separate etymologies and sometimes for all. كپنك‎ started with references for all and then they have been separated. There was a discussion started by Rua (talk • contribs) on whether there should be etymology groupings but I can hardly find it.
And actually there is no real distinction between “Further reading” and “References”. It is just what is left over after in the past there have been a lot more headers. Yet I cannot wholly distinguish these two. Fay Freak (talk) 19:48, 7 October 2019 (UTC)
See Wiktionary:Entry_layout#References. "References" are for verifying specific claims such as a pronunciation or etymology with an outside source. "Further reading" is broader and directs you to reference works for more information (or, I guess, if you don't trust Wiktionary until you see the word is in a "real" dictionary). But this distinction is almost never carried out in practice. Ultimateria (talk) 16:17, 8 October 2019 (UTC)
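The per-entry and per-term convention described earlier in this thread (References always at L3, Further reading one level below the part-of-speech heading) can be written down as a small rule. This is only a sketch of that one convention, not an agreed policy; the function names are hypothetical.

```python
def heading_levels(pos_level):
    """Return heading levels under the per-entry / per-term convention.

    pos_level is the level of the part-of-speech heading: 3 in a
    single-etymology entry, 4 when the entry has several etymologies.
    """
    if pos_level not in (3, 4):
        raise ValueError("POS headings are expected at L3 or L4")
    return {
        "References": 3,                   # per-entry, always L3
        "Further reading": pos_level + 1,  # per-term, one below the POS
    }


def wikitext_heading(title, level):
    """Format a heading with the matching number of equals signs."""
    marks = "=" * level
    return marks + title + marks


for name, level in heading_levels(4).items():
    print(wikitext_heading(name, level))
```

Under the flexible approach that other participants prefer, both levels would instead be chosen by the editor according to what the section applies to.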

Eirikr's clear misbehaviour and breach of rules

Aside from Eirikr's behaviour, shown on User_talk:Eirikr#Nata...

When there is a dispute in regards to a matter, which is therefore being discussed, I would assume that everyone agrees, that people should concentrate on the discussion, and not make any edits, whilst the discussion is underway (unless it is to make edits, that a consensus approves of) Anything else would be quite chaotic, and disorderly, and would make a mockery of the discussion.

is currently under discussion. (at first at User_talk:Eirikr#Nata, which went nowhere, but now at Wiktionary:Requests_for_verification/Non-English#鉈) This discussion is ongoing and has not concluded. It is directly concerned with whether or not 鉈 can or should be defined as hatchet, machete, billhook, and/or froe.
Whilst this is happening, Eirikr added a translation of "鉈" to billhook. (in addition to the existing translation of 鉈鎌, which I put there)
If 鉈 can be defined as billhook (which is what the above mentioned RFV is about), then it is valid to include it as a translation. If, however, 鉈 can't/shouldn't be defined as billhook, then it is clearly not valid to include it as a translation.
Thus Eirikr's edit, in billhook, was an edit in regards to a matter that is currently under discussion. Hence I reverted it, whilst pointing out this issue, in my edit summary.
In answer, Eirikr reverted it back, with the justification that billhook isn't the entry under discussion.
This is a clear case of Wikilawyering. Using technicalities to justify your actions. Using the letter of the law, to pervert the spirit of the law. Going against the point/purpose of the rule ...as I pointed out in my further revert, of this clearly invalid edit ...which was answered with no justification, but just a revert and an indefinite protection, blocking me from further editing.

Now on Wikipedia, this would have had to go differently.
Eirikr made an edit, I reverted it ...and then, given Wikipedia rules, he would not have been allowed to re-revert, as that would constitute edit warring. (a concept that apparently doesn't exist here) He'd have to follow the process of the Wikipedia:BOLD, revert, discuss cycle. In other words, he made an edit (bold), got reverted ...and then would have had no choice but to discuss, to make his case. Because Wikipedia tries to make sure to avoid destructive squabbles, and keep things somewhat civilized and reasonable, preferring communication and clarification, to mere emotion and obstinacy.
...but as this is apparently a lawless anarchy (or rather more of a kratocracy, or the proverbial "law of the jungle"), where even the few rules that exist are ignored and broken, even by admins, and where no discussion is tolerated, it went the way it did.
Is this acceptable behaviour, on Wiktionary?--213.113.50.173 03:49, 3 October 2019 (UTC)

There is no such rule. Eirikr is an established Japanese editor. You are not. Maybe your edits will be respected one day, but right now they will be reverted. DTLHS (talk) 04:01, 3 October 2019 (UTC)
There is no such rule? Well no, there are no rules at all ...but is it sensible? Is it not the opposite of constructive?
Also, when you talk about who makes an edit, rather than about the edit itself... Someone who reverts to spouting ad hominem and argument from authority fallacies, thereby reveals that they have no case. That they have zero confidence, in being able to make their case, with honest or rational arguments.--213.113.50.173 04:19, 3 October 2019 (UTC)
Indeed, argumenta ad hominem show how wrong some people (including some WT admins) are. --2003:F8:13C7:59D1:1DCB:D847:CFD:6A02 06:46, 3 October 2019 (UTC)
Can you cool it down a bit? One relevant data point: wikidata:Q708852 unites billhook and . The question is really, whether a Japanese speaker, in describing a billhook, or translating an English text that uses the term, would be inclined to use the term “鉈”. A Japanese すり鉢 does not look anything like a Western mortar, but they serve the same function, so it is reasonable to offer one as a translation of the other.  --Lambiam 16:03, 3 October 2019 (UTC)
"Can you cool it down a bit?"
Everyone else is assuming bad faith of me, attacks me (not argues against or disagrees. That's fine. Attacks!), regularly make personal attacks etc etc. All the while refusing to even make any arguments, or to debate anything, at all. As is clear and obvious, to anyone who can see
...and you tell ME to cool down!?
You have no words to say to the others (who are clearly not "cool", though that is the least of their issues), but you tell ME off!?
That is clear proof, that you are an utterly disingenuous hypocrite.
"One relevant data point:"
No, that data point is not relevant, at all. How could/would it be?
It is circular (pointing to Wiki, to defend what is written on Wiki) and also does nothing to demonstrate that it is accurate usage.
"The question is really, whether a Japanese speaker, in describing a billhook, or translating an English text that uses the term, would be inclined to use the term “鉈”."
No it isn't.
First of all, how a Japanese speaker would translate an English word, is completely irrelevant, due to the fact that the Japanese speaker cannot be assumed to have a proper understanding of English. (not to mention the example I like to cite, of how EVERY English-Japanese dictionary in existence, translates "hip" as "尻")
A more sensible question would be if they would, when presented with a billhook (as in be handed a physical specimen, of the tool itself) and be shown how it is made and used, be inclined to use the term “鉈”.
They might, sure, but...
Would an English speaker, in describing a naginata (薙刀), or translating a Japanese text that uses the term, be inclined to use the term "weapon"?
Yes. Yes they would.
Does this mean that "weapon" is a valid term to add as a translation, for the entry "薙刀"?
No, obviously not. Just as using "鉈" to describe a billhook, doesn't mean that it is valid to include as a translation.
Hence your argument is clearly invalid, in multiple ways.
"A Japanese すり鉢 does not look anything like a Western mortar"
...
They look identical. (Japanese すり鉢, "Western" mortar and pestle. Yes that specific mortar is shaped a bit differently, but there is variation in "Western" mortars, including ones that have exactly the same shape. The one my parents have, for example. There is, of course, also variation in Japanese すり鉢. Yet another thing they have in common)
No sane and honest person would deny, that they look identical ...except for one tiny little detail, that many might miss: The grooves in the Japanese すり鉢, which differentiate it from normal 乳鉢.
In other words: a Japanese "すり鉢" does not translate as "mortar and pestle" (that would be "乳鉢"). It is a specific sub-type, of mortar and pestle.
Much the same as "柳刃包丁" doesn't translate as "knife", seeing as it's not merely a knife, but a specific type of Japanese kitchen knife. (with Japanese kitchen knives, being a specific sub-group, of kitchen knife, kitchen knives being a specific sub-group of knives. All of this being known and obvious to pretty much everyone. Not yanagiba, as most people aren't familiar with specific Japanese kitchen knives, but rather the groupings and how all that works)--85.228.52.161 08:16, 5 October 2019 (UTC)
You wrote: how EVERY English-Japanese dictionary in existence, translates "hip" as "尻" — I don’t know what your definition of every is but that’s clearly wrong: [3], [4]. — TAKASUGI Shinji (talk) 15:29, 8 October 2019 (UTC)
Okay, so you found two dictionaries that don't technically translate it as "尻", specifically ...but one still gives a translation, that points to the exact same part of the body, whilst the other is kinda close to accurate, translating it as "腰(回り)", and has a more in-depth explanation, that is kinda correct. You are technically correct. My statement is, apparently, not exactly correct. Thanks for the (nit-picky) correction ...but my point still stands. My argument is unaffected: Just about EVERY English-Japanese dictionary in existence, translates "hip" as (in Japanese) buttocks. (there is one confirmed exception. A thorough search, among net and book versions, might result in one more) Not that this is necessary, to point out that Japanese dictionaries are not infallible. Dictionaries are not infallible. You're supposed to go with usage, not what a dictionary dictates ...but Japanese dictionaries are especially prone to error, and this is an especially obvious example, of a huge and obvious error.--85.229.234.72 10:43, 10 October 2019 (UTC)
Actually, I shouldn't have given those answers. All of this is off-topic: The issue is Eirikr's behaviour. Not the facts or validity of any part of any entry, but purely how Eirikr handled things.--85.228.52.161 08:18, 5 October 2019 (UTC)
I do not see a breach of rules, but I do see a lot of fuss about a relatively unimportant issue. When there is a dispute about a sense of an entry, it is normally upon the person disputing the sense to raise the issue at Requests for verification. Definitions will rarely be perfect, and aiming at perfection may ultimately even defeat the purposes of a dictionary by making definitions incomprehensible.  --Lambiam 11:36, 5 October 2019 (UTC)
You don't see any problems with the fact that he regularly assumes bad faith (i.e. without there being any evidence of it, of any kind), edit wars, refuses to discuss, and makes edits on things, despite the fact that they are under active discussion? You don't think any of this breaks any of the rules? WT:Civility is just a joke, is it? The notion that one should be constructive, and try to act in ways that improve Wiktionary, is of no importance?--85.228.52.161 12:00, 5 October 2019 (UTC)
In this very response you are insinuating bad faith. And if you think you were civil while Eirikr was not, you have a different understanding of civility than me. If you get so upset when your precious contributions are reverted, then I suggest for your own sake that you find something else to do.  --Lambiam 12:35, 5 October 2019 (UTC)
To quote en:WP:ASSUME "Unless there is clear evidence to the contrary, assume that people who work on the project are trying to help it, not hurt it." (or, indeed, take a look at WT:Assume good faith)
Where there is clear and obvious evidence, I have pointed it out.
When you say I insinuated bad faith, whom are you saying I did so against? I would challenge you to show me any instance, where I have done so.
If you mean Eirikr, I have repeatedly and clearly stated that Eirikr has acted in bad faith, many times. Those were not insinuations.
If you mean against you... Again, I have not insinuated that you have bad faith. The statement "That is clear proof, that you are an utterly disingenuous hypocrite", is not an insinuation.
"And if you think you were civil while Eirikr was not, you have a different understanding of civility than me."
So you think assuming bad faith isn't uncivil? Reverting an edit, which, in its edit summary, contains a mention of where the editor has started a discussion about the subject and proceeding to ignore the discussion, isn't uncivil? (and edit warring?) Yes, it would appear that my understanding of civility, is very different from yours, and everyone else's on Wiktionary. Most people, in general, as well as Wikipedia as a whole, however...
"If you get so upset when your precious contributions are reverted"
Without any justification, discussion, or any form of a coherent process. Just reverted.
"then I suggest for your own sake that you find something else to do."
So you are saying that Wiktionary is hopeless? That what is said in its Main Page, is a complete lie? Everything in Help:Interacting_with_other_users is a lie? ...and when the Wikimedia Foundation states (emphasis mine) "The Wikimedia Foundation is the nonprofit that hosts Wikipedia and our other free knowledge projects. We want to make it easier for everyone to share what they know. To do this, we keep Wikipedia and Wikimedia sites fast, reliable, and available to all. We protect the values and policies that allow free knowledge to thrive. We build new features and tools to make it easy to read, edit, and share from the Wikimedia sites. Above all, we support the communities of volunteers around the world who edit, improve, and add knowledge across Wikimedia projects", that should not apply to Wiktionary?
Also, in WT:Civility, it says (again, emphasis mine):
"Most of the time, insults are used in the heat of the moment during a longer conflict. They are essentially a way to end the discussion. /.../ In other cases, the offender is doing it on purpose: either to distract the "opponent(s)" from the issue, or simply to drive them away from working on the article or even from the project, or to push them to commit an even greater breach in civility, which might result in ostracism or banning. In those cases, it is far less likely that the offender will have any regrets and apologize."
This sounds like a perfect description of the approach, that people have taken against me. (and, it would seem, any other newcomers as well)--85.228.53.143 08:55, 6 October 2019 (UTC)
What a waste of everyone's time. Canonicalization (talk) 21:11, 6 October 2019 (UTC)
Just within the past week or so, I've seen Eirikr say "I'm sorry, I was mistaken- you're right". You, on the other hand, seem to be only interested in winning your argument, and demolishing anyone who contradicts you. You've obviously gone through all of our pages on rules, policies and procedures, but only in search of ammunition, not to try to understand. You have this nasty habit of dismissing anything other people say that doesn't directly address the points you've made in the manner and the place that you deem such points should be addressed. A wiki is a community, not a set of rules. You need to pay attention to what people are trying to tell you, and stop wikilawyering. You don't have to agree with us, but you do have to listen. Chuck Entz (talk) 06:25, 10 October 2019 (UTC)
You have zero basis, for any of your claims/insinuations, which are purely assumptions of bad faith. As for your accusation of wikilawyering... That's laughable. How could anything I've said, even begin to qualify? Eirikr has engaged in it, certainly, but me?
"A wiki is a community", you say? Usually, that's true, but I've seen no evidence of that, here. Quite the contrary.
"You don't have to agree with us, but you do have to listen."
You've made no arguments, for me to listen to. Just accusations and bitching.
I've made arguments ...that none of you have listened to, addressed or acknowledged.
The degree of projection, in your comment is mind-boggling.--85.229.234.72 10:35, 10 October 2019 (UTC)
Which is, of course, itself an entirely content-free ad hominem argument- one of many. By the way, I probably won't have access to a computer until Tuesday, so don't think I'm ignoring you. Chuck Entz (talk) 04:47, 11 October 2019 (UTC)
"Which is, of course, itself an entirely content-free ad hominem argument- one of many." (making an accusatorial statement about someone, when that statement itself is an example of the same thing, from you, is never a good idea)
Also, that is a complete misunderstanding of what an ad hominem is. An ad hominem is saying that "Person A made argument X, person A is bad, therefore argument X is wrong". (i.e. the argument/evidence/position is deemed wrong, not because of anything to do with the argument/evidence/position, but purely because of the person stating it)
Saying "Person A made argument X, this counter-argument Y shows why it's wrong, and also person A is bad" (or if the order of the last two bits is flipped) is in no way an ad hominem. Nor is merely badmouthing or insulting someone, unless it is done for the purpose of dismissing that person.
At no point, have I made any argument here, that even approaches being an ad hominem. You may want to check en:Ad_hominem, especially en:Ad_hominem#Non-fallacious_types ...or this (specifically the bit from the timestamp in the link, and 15 seconds on) or this video from PBS.--213.113.51.51 19:03, 13 October 2019 (UTC)
──────────────────────────────────────────────────────────────────────────────────────────────────── Are you here to do something useful or only to argue with everyone in order to gain points for your tally of "arguments won on the Internet"? — surjection?〉 19:32, 13 October 2019 (UTC)

Well... Wiktionary needs some guys to do the dirty jobs for it, while it turns a blind eye and gives them the pleasure of abusing power to some extent in return. This is just how politics works in the world... ᾨδή (talk) 11:15, 5 October 2019 (UTC)

No it isn't. Not entirely. Yes, people with more power often have a bit more of an ability to get away with things, but usually no one is fully above the law. Able to get away with a bit more, be given more of a benefit of the doubt? Sure ...but able to blatantly go against it? No. There is usually some form of accountability. I can certainly say that none of this would/could ever happen on Wikipedia.--85.228.52.161 12:00, 5 October 2019 (UTC)
Also, you say "Wiktionary needs some guys to do the dirty jobs for it", but if they do it in a way that is more detrimental to Wiktionary, than it is positive...--85.228.53.143 08:55, 6 October 2019 (UTC)
***Sigh*** Canonicalization is right – what a waste of time and quite derivative (reminds me of a certain "cunning" user). --Robbie SWE (talk) 11:47, 10 October 2019 (UTC)
Derivative?--85.229.233.209 20:55, 11 October 2019 (UTC)

Adding |unc=1 and moving labels in {{desc}}

I'd like to propose (1) adding an |unc=1 parameter to {{desc}} to replace manually adding {{q|possibly}} at the end of entries, and (2) moving labels such as (calque) after the borrowing arrow. Please see below.

Current system

Using {{q|possibly}}:

  • English: example (possibly)
  • English: example (possibly)
  • English: example (calque) (possibly)
  • English: example (semantic loan) (possibly)
  • English: example (semi-learned) (possibly)
  • English: example (possibly)
Proposal #1 / Proposal #2 / Proposal #3

Using |unc=1:

  • English: example (possibly)
  • English: example (possibly)
  • English: example (possibly, calque)
  • English: example (possibly, semantic loan)
  • English: example (possibly, semi-learned)
  • English: example (possibly)

Another nice thing about having a parameter is that we could create a category for entries with uncertain descendants. What are people's thoughts? --{{victar|talk}} 04:26, 3 October 2019 (UTC)

@Victar I think this is a great idea. Of your possible formattings, I like #2 best as I think the superscript question mark should be fairly self-explanatory. I'm ok with #3 as well, but not so much #1, as abbreviations like "clq", "slb", etc. are fairly obscure. Benwing2 (talk) 06:23, 3 October 2019 (UTC)
@Benwing2: Glad to hear. The idea is inspired from mathematics, where we actually have Unicode ⩼, but it's rather hard to make out. I would say that most people using en.Wikt wouldn't even know what the c in example c is, which is why the tooltips are crucial. --{{victar|talk}} 06:38, 3 October 2019 (UTC)
I think it should be unk=1 to match the template {{unk}}. —Rua (mew) 15:31, 3 October 2019 (UTC)
I disagree. Uncertainty is not the same as something being unknown. —Μετάknowledgediscuss/deeds 16:08, 3 October 2019 (UTC)
Yeah, I thought about that too because I often find myself doing {{unk|lang|Uncertain}}, but "uncertain/unclear" is more appropriate language than "unknown" in all cases involving {{desc}}. One could also probably argue we need an {{unc}} template. --{{victar|talk}} 16:35, 3 October 2019 (UTC)
I like 2 and 3, although in 2 the > sign before inherited terms looks very ugly; I’d leave it out and just have the question mark there. — Vorziblix (talk · contribs) 15:52, 3 October 2019 (UTC)
> is what's used in linguistics to denote a direct inheritance, so it is the appropriate symbol to use. I think just using a question mark looks strange. Maybe they would look better in monospace font > → ⇒ vs. > → ⇒. --{{victar|talk}} 16:35, 3 October 2019 (UTC)
Ugh, Consolas doesn't support ⇒. --{{victar|talk}} 22:00, 3 October 2019 (UTC)
@Victar Is it necessary to have Consolas support for that arrow? Also, this is a larger can of worms but I've always thought that {{desc}} should take a list of descendants; that would obviate problems with text on the right ending up in the wrong place. Benwing2 (talk) 18:06, 6 October 2019 (UTC)
@Benwing2: Consolas is the default monospace font on many Windows machines, so using Consolas doesn't make all the arrows equal width, kinda defeating the point. But anyone can apply their own font using their common.css. You mean using multiple terms? Yeah... that's been suggested before (for {{cog}} too), but I think that would be a fervent battle. --{{victar|talk}} 03:43, 7 October 2019 (UTC)
I've implemented proposal #2. --{{victar|talk}} 19:41, 7 October 2019 (UTC)

Unique calque symbol

While I have people's attention, does anyone have any objections to giving calques a unique arrow? They're pretty different from other types of borrowings. Perhaps English: example? --{{victar|talk}} 18:26, 10 October 2019 (UTC)

I agree with using another symbol, but the two-way arrow doesn't really support the nature of a calque's evolution; it reminds me too much of twice-borrowed terms. I googled Unicode arrows and thought these might be interesting for showing indirectness: ⥤ ➾ ⥲ ⤏. Ultimateria (talk) 05:18, 15 October 2019 (UTC)
I beg to differ. could connote a borrowing (), that the returned () with a native construct. In math, and are called equivalence (biconditional) arrows. We could also just use =, but that would imply a cognate in my mind. --{{victar|talk}} 06:10, 15 October 2019 (UTC)
That's interesting. I never learned those logic symbols so I was only looking at the shape. I think the two-way arrow is fitting then. Ultimateria (talk) 07:24, 15 October 2019 (UTC)
I still like dotted or wavy arrows (↜⇜⇠ or <≈≈ ), though, since most of our readers won't have a math background. Chuck Entz (talk) 08:06, 15 October 2019 (UTC)
Many symbols in linguistics are based on mathematical ones, and we are a dictionary, after all. --{{victar|talk}} 08:20, 15 October 2019 (UTC)
To also note, is used in chemical formulas as an equilibrium symbol. --{{victar|talk}} 16:01, 15 October 2019 (UTC)
If we’re following math symbols, a more closely analogous one might be ⥲, in that it’s used to indicate isomorphism — isomorphism being, to my mind at least, much more like the relation between calques than logical equivalence is. Not sure if it renders clearly for most people, though. — Vorziblix (talk · contribs) 17:22, 15 October 2019 (UTC)

Cleaning up or deleting Template:jump

I recently came across {{jump}}. This is a weirdly named template with strange syntax and I'm not sure it's needed at all. It's used on about 700 pages (which BTW is below the 1000-use deletion vs. deprecation threshold that I've been using). The vast majority of uses appear to be in Icelandic entries. I'd like to see what people think about the following:

  1. First, is it needed at all? The vast majority of uses were put there long ago (e.g. about the 600th chronological usage, on page stafli, was put there in 2011). The stated purpose of the template is to assist in connecting synonyms/antonyms/etc. to definitions on long entries, but (1) this is better achieved by using {{syn}} etc. to place the synonym directly below the definition line, and (2) many (perhaps most) of the uses are on small pages, e.g. stafli, where three lines separate the cross-references between synonym and definition.
  2. Secondly, the name {{jump}} is fairly opaque. The purpose appears to be a cross-reference; hence I think it should be named something like {{xref}} or maybe {{seclink}}; or better yet, since it serves two purposes (cross references to a section like synonyms, and cross references from such a section back to the definition), split it into two templates {{xref-to}} and {{xref-from}}.
  3. Also, the syntax is very weird. On stafli, for example, the cross-reference from defn to synonym is formatted as {{jump|is|computing stack|s}}, while the cross-reference from synonym to definition is formatted as {{jump|s|is|computing stack}}: Same params, just different order. By convention, when there's a language code it should normally go in the first param and be mandatory, but here it (a) floats around, and (b) is optional; if nothing that looks like a valid language code is seen, English is assumed.
  4. Finally, the section codes are non-obvious: for example, s = synonyms, which is better named syn.
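
The floating, optional language code described in point 3 can be illustrated with a small Python sketch. This is not the template's actual implementation, which is not shown in this discussion; the function name `parse_jump`, the set of language codes, and the section-code table are assumptions made for the example (only `s` = synonyms is taken from the post).

```python
# Hypothetical data for illustration only; the template's real tables
# are not quoted in this discussion.
LANG_CODES = {"en", "is"}
SECTION_CODES = {"s": "Synonyms"}


def parse_jump(*params):
    """Guess the meaning of positional {{jump}} parameters from their order."""
    rest = list(params)
    # A leading section code means the link points from a section
    # (e.g. Synonyms) back to a definition.
    from_section = bool(rest) and rest[0] in SECTION_CODES
    section = rest.pop(0) if from_section else None
    # The language code is optional; English is assumed when it is absent.
    lang = rest.pop(0) if rest and rest[0] in LANG_CODES else "en"
    sense_id = rest.pop(0)
    if not from_section:
        # In the definition-to-section form the section code comes last.
        section = rest.pop(0)
    return {
        "lang": lang,
        "id": sense_id,
        "section": SECTION_CODES[section],
        "direction": "to definition" if from_section else "to section",
    }
```

Under these assumptions, `parse_jump("is", "computing stack", "s")` and `parse_jump("s", "is", "computing stack")` describe the same language, sense, and section and differ only in direction. The sketch also shows the weakness of the design: a sense identifier that happens to look like a language code would be misread.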

My instinct is to deprecate this with an abuse filter to prevent new entries that use it, and to eventually rewrite uses to use {{syn}} etc. under the definition. Alternatively, we could just remove these template calls entirely, since they don't really seem to be used the way they should be in most cases. As a last resort, change the syntax as described above.

Comments?

Benwing2 (talk) 17:50, 6 October 2019 (UTC)

Deprecate and delete. Canonicalization (talk) 22:21, 7 October 2019 (UTC)
I vaguely remember something about it being used at one time to provide an anchor for offwiki links to land on as a kludgy alternative to {{sense}}. If that's the case, you might need to search via Google to see if there are links to the anchors. Chuck Entz (talk) 04:19, 8 October 2019 (UTC)
I saw that template, it’s almost only used in Icelandic entries, and yes, delete. Those superscript links are more confusing than anything, and yes, the syntax is terrible. There must be few who ever click em, and if they do it is mostly without technical effect, at least when there aren’t many senses or the screen is large (the screens have become larger since!). Fay Freak (talk) 16:57, 8 October 2019 (UTC)
Delete. The few times I encountered it it was often pointing to invalid targets. – Jberkel 19:40, 8 October 2019 (UTC)

retrospective tense

What does "retrospective tense", referred to in the entry for 've, mean? --Backinstadiums (talk) 18:51, 6 October 2019 (UTC)

Ancient Greek in translation tables

Hi, I suggest we change to the language name we have in module:languages. Do we need a vote for this change?--So9q (talk) 19:20, 7 October 2019 (UTC)

To clarify, this means nesting Ancient Greek in translations as
* Greek:
*: Ancient Greek:
rather than the current
* Greek:
*: Ancient:
Eru·tuon 19:26, 7 October 2019 (UTC)
Funny, I have already written it like that and have seen people change such usages to Ancient:, and I still do not know why. I don’t see what is related in Module:languages or its data. I get that it has something to do with the translation adder adding Ancient:, but has there been a reason not to add nested names that the translation adder does not add? What I have also done is nest Modern Turkish: and Ottoman Turkish: under “Turkish”, and then come the endless Aramaic lects, as in bed. Fay Freak (talk) 19:35, 7 October 2019 (UTC)
The line grc: 'Greek/Ancient', in MediaWiki:Gadget-TranslationAdder-Data.js (part of the nesting variable) makes the TranslationAdder gadget use the nesting "Greek: Ancient:". That's what User:So9q's proposal would change. As far as I know, that's the only place where nestings are enforced automatically, though, as you noted, people have been imposing the same nesting for consistency's sake. — Eru·tuon 19:46, 7 October 2019 (UTC)
It would make sense if the gadget were flexible enough to use the name “Greek” when there is only modern Greek and “Modern Greek” when the same Greek is nested when there is also Ancient Greek, same for Turkish. Fay Freak (talk) 19:51, 7 October 2019 (UTC)
I support changing from Ancient to Ancient Greek in this circumstance. Benwing2 (talk) 03:28, 8 October 2019 (UTC)
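
The relation between a nesting entry such as 'Greek/Ancient' and the two wikitext lines quoted above can be sketched in Python. This is only an illustration of the mapping, not the TranslationAdder gadget's code (which is JavaScript and is not quoted here); the function name `nesting_lines` is an assumption made for the example.

```python
def nesting_lines(path):
    """Turn a nesting path such as 'Greek/Ancient' into translation-table lines."""
    lines = []
    for depth, name in enumerate(path.split("/")):
        # The top level is "* Name:"; each deeper level adds one ":" after "*".
        prefix = "*" + ":" * depth
        lines.append(f"{prefix} {name}:")
    return lines


# Current nesting versus the proposed one.
print("\n".join(nesting_lines("Greek/Ancient")))
print("\n".join(nesting_lines("Greek/Ancient Greek")))
```

On this reading, the proposal amounts to changing the part of the path after the slash from "Ancient" to "Ancient Greek".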

Jutish or Jutlandic?

jut is the iso code for a collection of dialects of Danish. Translation-adder defaults to Jutish but maybe Jutlandic is more correct? WDYT?--So9q (talk) 21:24, 7 October 2019 (UTC)

Both terms are found in the literature. Wikipedia uses Jutlandic, but according to Google Ngram Viewer Jutish is far more common.  --Lambiam 22:17, 7 October 2019 (UTC)
Thanks for taking the time to look.--So9q (talk) 22:15, 8 October 2019 (UTC)
There is also the erstwhile Jutlandish, which is today as rare as Jutlandic. Leasnam (talk) 00:06, 10 October 2019 (UTC)
I looked through a few search pages for "Jutish", and most of them are references to the historical people called the Jutes.__Gamren (talk) 20:01, 10 October 2019 (UTC)

Nest Jutish under Danish

WDYT?--So9q (talk) 06:31, 8 October 2019 (UTC)

Merging translations of poop and poo

I went ahead and did it without discussing first. I see now that there is a slight difference in meaning, namely that poop is also informal. WDYT?--So9q (talk) 09:33, 8 October 2019 (UTC)

That's fine, the difference is more geographic than semantic. One might even be an alternative form of the other. Ultimateria (talk) 16:08, 8 October 2019 (UTC)

I did good work

Just thought you'd like to know that I did some good work. I have gone through all the entries of Category:Spanish idioms and linked their component parts to them. It took 18 days to do. Obviously, it's not as awesome as So9q's Epic 2019 Poo/Poop Translation Merge, but you're welcome in any case. What have been your main wiki-achievements in the last 12 months? --Vealhurl (talk) 07:04, 9 October 2019 (UTC)

Wow, this WF flew under the radar. I didn't realise it was you. Achievements? Nearly finished with Chambers 1908. Down to about 5,000 entries to review (mostly short but annoying or obscure ones to deal with). w00t. Equinox 08:48, 9 October 2019 (UTC)
Achievements:- Added hundreds of words that nobody is ever going to look up - but it keeps me busy. Keep taking the tablets. SemperBlotto (talk) 09:46, 9 October 2019 (UTC)
I deleted a few hundred old IP talk pages. This doesn't actually accomplish anything of benefit to the project but it annoys me that they exist. - TheDaveRoss 12:19, 9 October 2019 (UTC)
User:Ashley Pomeroy would have words with you... Equinox 15:50, 13 October 2019 (UTC)
I suppose helping build up Category:en:Rail transportation to 686 entries (currently) is an achievement, not to mention adding to Category:nb:Rail transportation. DonnanZ (talk) 14:00, 9 October 2019 (UTC)
Same old same old. Up past 25,400 pages with {{taxlink}}, 15,700 with {{taxoninfl}}, and 12,400 with {{vern}}; just a wee bit short of the estimated 1.3 million described species. DCDuring (talk) 00:40, 10 October 2019 (UTC)
BTW, I'd like to thank all of those who have used these templates to build up the total that Wiktionary has. There are quite a few now. DCDuring (talk) 23:15, 10 October 2019 (UTC)
We love our Sisyphean labors, don't we. My big thing has been adding etymologies, derived terms, related terms, and further reading to many hundreds of Romance entries, then adding descendants to their Latin ancestors. Have I inspired Wonderfool to do something tedious with my interlinking mania? :o Ultimateria (talk) 16:01, 10 October 2019 (UTC)
Ult, you always inspire me. --Vealhurl (talk) 11:42, 11 October 2019 (UTC)
Creating and maintaining wanted entries lists. Nice to watch older lists slowly turning blue. – Jberkel 20:17, 10 October 2019 (UTC)

Template:blend categorization

Currently, {{blend}} adds the entry under "...terms borrowed from..." if the language of one of the two components is different, such as with romppu having {{blend|fi|ROM|korppu|lang1=en}}; the term is under "Finnish terms borrowed from English", as opposed to just "Finnish terms derived from English", even though a blend is a form of derivation and the word wasn't directly borrowed in this case. Is there a reason the categorization is as it is? — surjection?〉 10:12, 9 October 2019 (UTC)

Hmm, {{affix}}, {{compound}}, and the other etymology templates handled by Module:compound do this as well. It probably dates from this edit by Rua in May 2016. In March of this year {{blend}} followed suit with this edit by Benwing2. — Eru·tuon 18:36, 9 October 2019 (UTC)
I agree, this should be changed. If no one objects, I'll change it. Benwing2 (talk) 03:39, 10 October 2019 (UTC)
@Erutuon, Surjection Changed. Benwing2 (talk) 07:59, 15 October 2019 (UTC)
Thank you, the new categorization makes a lot more sense. — surjection?〉 08:41, 15 October 2019 (UTC)

Deletion of rel-top and related templates

Please see Wiktionary:Requests_for_deletion/Others#Deletion_of_rel-top,_der-top_and_related_templates and join the discussion.

As for deletion of templates with more than 1000 transclusions, is it then a better idea to deprecate them? In that case {{rel-top}} and {{rel-bottom}} are the only ones qualifying for that.--So9q (talk) 11:41, 9 October 2019 (UTC)

label "(rare)"

Why isn't the oft-added label "(rare)" added to Appendix:Glossary? --Backinstadiums (talk) 16:04, 9 October 2019 (UTC)

Don't think there's a reason (other than no one has bothered to do so). How exactly "rare" is defined in terms of lexicography, I don't know. Maybe a certain percentage threshold in a corpus? – Jberkel 09:00, 19 October 2019 (UTC)

Suppressing verb inflections

The auto-generated verb inflections for call one on one's shit (presently nominated for deletion) are haywire. Obviously this can be fixed by hand-typing "calls one on one's shit", "calling one on one's shit", "called one on one's shit", but to me including the inflections at all looks highly silly, like way too much information. What is the recommended way to suppress the inflections altogether while still using the correct template? Mihia (talk) 14:08, 10 October 2019 (UTC)

One simple way is to use {{head|en|verb}}. DCDuring (talk) 16:10, 10 October 2019 (UTC)
Thank you! Mihia (talk) 17:05, 10 October 2019 (UTC)

What is the standard of 当て字?

The use of kanji chosen primarily for their phonetic (narrow sense) "OR" semantic (broad sense) value to represent foreign or native Japanese words, or the kanji so used.
Ateji (当て字) – use of kanji for phonetic value (sound) "RATHER THAN" semantic value (meaning), such as 寿司 (すし, sushi, “sushi”). Opposite of jukujikun.

ᾨδή (talk) 11:42, 11 October 2019 (UTC)

Lack of macrons on Classical Nahuatl pages.

We should change the current practice of having no macrons in Classical Nahuatl page titles. I think it's just an inconvenience to leave out a part of the language's phonology like that on the site. --Jaydreams (talk) 17:46, 10 October 2019 (EDT)

I dunno, we omit vowel length markings in titles for several other languages. For instance, Latin (well, Classical Latin) has phonemic vowel length, but is usually written without macrons or breves, so we don't include macrons and breves in titles. Ancient Greek similarly. In both cases though, not all varieties of the languages had vowel length. I don't know if this is true of Classical Nahuatl. — Eru·tuon 21:53, 11 October 2019 (UTC)
While I'm pretty sure some varieties of Modern Nahuatl have lost vowel length, Classical Nahuatl was pretty centralized phonologically; you don't worry about dialects when writing for it. I am aware of the lack of length marking on Classical Latin, but it has a standardized script that explicitly did not mark length; Greek has η and ω that do. Nahuatl, meanwhile, has several divergent orthographies. The only one which marks both vowel length and the glottal stop seems to be the orthography shared between Frances Karttunen (An Analytical Dictionary of Nahuatl) and J. Richard Andrews (An Introduction to Classical Nahuatl). I think it would be best to just go for phonological accuracy with page titles, since there isn't any standardization to go off of. Also, maybe not as relevant, but Classical Nahuatl is phonologically pretty simple. Japanese and Hawaiian don't get that treatment when it comes to macrons, and Nahuatl definitely shouldn't either. --Jaydreams (talk) 20:12, 10 October 2019 (EDT)
[In regard to Ancient Greek, I was referring not to η (ē) and ω (ō), but to monophthongal α (a), ι (i), υ (u). Length is not typically marked for those except in some grammars or textbooks, and it was lost sometime in the Koine Greek period, so is not present in all varieties of what Wiktionary calls Ancient Greek.] — Eru·tuon 00:21, 12 October 2019 (UTC)
Why do you need it in the pagetitle, in the header it is enough. Also for rarer words it is not known, as in Latin, so it is better to omit it there, in addition to that it is harder to type if diacritics are present in the page title. I can well write macrons with my keyboard, but do you imagine that people always do that to look up words? No, people won’t look up words with macrons. There is nothing to improve here. By speaking a language without long vowels and using a script without long vowels Spaniards have permanently poisoned the tradition of Nahuatl. Perhaps one can fix that for the Modern dialects by using Devanāgarī or Kashmiri Arabic script to write Nahuatl as it deserves by breaking with the tradition but Classical Nahuatl has its order petrified. Fay Freak (talk) 00:27, 12 October 2019 (UTC)
  • Technical consideration and curious query: are there any contrastive pairs of words in Classical Nahuatl that would differ in spelling only by the presence or lack of macrons? That might be one factor in determining how important it is to include macrons in page titles. In Japanese, for instance, 三度 (sando, three times) and 参道 (sandō, pilgrim's path) are very different things. Likewise in Hawaiian, where we have lolo (brains; marrow) contrasting with lōlō (paralyzed, numb; crazy). ‑‑ Eiríkr Útlendi │Tala við mig 00:49, 12 October 2019 (UTC)
    By looking through {{head}}, {{nci-noun}}, {{nci-proper noun}}, and {{nci-phrase}} from the last dump, I found cua and cuā, patla and pātla, āhuatl and āhuātl, metztōntli and mētztōntli. — Eru·tuon 05:28, 12 October 2019 (UTC)
    There's also tōloa (bend) vs. toloa (swallow), piloa (hang) vs. pīloa (shorten), -pān (flag) vs. -pan (on top), māca (negative imperative) vs. maca (give) etc. Like Eiríkr Útlendi showed with the languages I mentioned, it's just too important a feature. Also, I don't agree with Fay Freak at all. People are going to care about having entries be actually accurate, and all you have to do is click on the See Also section to find similarly spelled words. Classical Nahuatl's "order" is not without macrons. As I said earlier, there's no standard orthography, so for main entries we should use the Karttunen form as it's the most accurate and most used. — User:Jaydreams 02:06, 12 October 2019 (UTC)
@Jaydreams IMO it could go either way. Latvian, for example, includes length marks in page titles, and various languages (e.g. Ancient Greek, Spanish, Portuguese) include accents in page titles. But many other languages don't. The general principle followed is to go by the standard orthography (although a small exception is made for Russian, which includes ё in page titles instead of е even though the diaeresis is normally omitted in standard writing; note that Russian page titles don't include stress, even though it's highly unpredictable and often lexically relevant and found in the header). In this case, my impression is that the original orthography used by the Spanish lacked macrons, so we should probably do the same. The "orthography shared between Frances Karttunen (An Analytical Dictionary of Nahuatl) and J. Richard Andrews (An Introduction to Classical Nahuatl)" that you mention is exactly parallel to the situation in Latin, where dictionaries and introductory books include macrons but the standard orthography doesn't. Note that we also don't include macrons in Old English entries even though nearly all critical editions these days do include them, again based on the argument from standard orthography. Benwing2 (talk) 15:12, 12 October 2019 (UTC)
  • I cannot agree with the suggestion to use an orthography invented centuries ago by non-English-speakers. We don't use the original Portuguese orthography from the late-1500s / early 1600s for Japanese romanization for similar reasons -- this is the English Wiktionary for one, and we have different ideas about spelling now than we would have had hundreds of years ago. Perhaps more so for orthographies coming from Spanish or Portuguese speakers, in light of sound changes that have occurred in both languages. Another aspect is that the Spaniards that invented the first Latin-alphabet orthography for Nahuatl were also likely not fluent Nahuatl speakers, and they may not have understood the importance of marking vowel length when inventing their orthography.
It also seems to me that the Classical Nahuatl situation is closer to Hawaiian or Japanese than it is to Latin with regard to macrons, as they appear to be contrastive in Nahuatl in a way they aren't in Latin (NB: I am no Latinist, however, and I could very well be wrong about vowel-length contrastiveness in Latin). Moreover, if what I've read above is correct, there isn't a standard orthography for Nahuatl, whereas there is a very long-standing orthography for Latin.
2p from the sidelines, in the hopes of being helpful. :) ‑‑ Eiríkr Útlendi │Tala við mig 18:13, 14 October 2019 (UTC)
Actually, macrons are very contrastive in Latin; compare e.g. legō (I choose) vs. lēgō (I dispatch), levis (light, quick) vs. lēvis (smooth), etc. Benwing2 (talk) 03:56, 15 October 2019 (UTC)
Aha, thank you for setting me straight on that. :) Considering then that we have the option now of determining an orthography for Classical Nahuatl and we are not saddled with millennia of convention, why would we not want to distinguish such a contrastive element? ‑‑ Eiríkr Útlendi │Tala við mig 19:24, 15 October 2019 (UTC)

Change {{cat}} to mean {{categorize}} not {{topics}}

{{topics}} has an overabundance of short forms and aliases, including {{top}}, {{topic}}, {{catlangcode}}, {{cat}}, {{C}} and {{c}}, the latter three weirdly named (cat/c = topics???). Meanwhile {{categorize}} has no short forms. Logically, {{cat}} especially should mean {{categorize}}, and there are fewer than 1000 current uses, so I'd like to bot-replace {{cat}} with {{top}} and then repurpose {{cat}} as a short form of {{categorize}}. Thoughts? Benwing2 (talk) 14:46, 12 October 2019 (UTC)

  Support. Rua (mew) 13:18, 13 October 2019 (UTC)
On a slightly separate note, do we really need a three-letter abbreviation for topic, already a short word? It seems to lend itself to confusion with concepts like "top of page". Equinox 13:36, 13 October 2019 (UTC)
@Equinox I agree, it's quite confusing. If others agree, we can deprecate this alias. Benwing2 (talk) 16:18, 13 October 2019 (UTC)
A common user of {{top}} reporting in, yeah we need it, I use it to add topics on definition lines like so:
  • #{{top|foo|Barring}} {{lb|foo|colloquial}} to [[bar]]
Expanding it to {{topics}} could crowd the line further Crom daba (talk) 20:16, 13 October 2019 (UTC)
Can we get more input and a consensus? If "crowding the line" is a problem then I feel that's more of a problem with our layout than with the choice of words. Templates and computer programming can be redesigned if needed. See for example the Python programming language which got rid of the famous {...} curly brackets and used indentation instead. Equinox 22:23, 13 October 2019 (UTC)
@Crom daba There's also {{topic}}, which is only two chars more than {{top}}. Benwing2 (talk) 22:47, 13 October 2019 (UTC)
BTW here's a table of all uses of {{topics}} and aliases:
Aliased template        Canonical template    #Uses
Template:topics         Template:topics       86729 (47644 not including the aliases)
Template:catlangcode    Template:topics       2365
Template:C              Template:topics       25455
Template:c              Template:topics       7946
Template:topic          Template:topics       481
Template:cat            Template:topics       597
Template:top            Template:topics       2241
I actually propose making {{topic}} be the canonical name, even though it's not so commonly used now. It's shorter and (to me) more logical than the plural {{topics}}. Benwing2 (talk) 23:32, 13 October 2019 (UTC)
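
The parenthetical figure in the table above (47644 uses not including the aliases) can be checked by subtracting the alias counts from the total. A minimal Python sketch, using only the numbers given in the table; the variable names are made up for the example.

```python
# Use counts copied from the table above.
alias_uses = {
    "catlangcode": 2365,
    "C": 25455,
    "c": 7946,
    "topic": 481,
    "cat": 597,
    "top": 2241,
}
total_uses = 86729  # all uses of {{topics}}, aliases included

# Uses that call {{topics}} under its own name.
direct_uses = total_uses - sum(alias_uses.values())
print(direct_uses)  # 47644
```

The six aliases together account for 39085 of the 86729 uses, and {{C}} alone accounts for most of those.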
I have deprecated {{cat}} as a shortcut for {{topics}} and redirected it instead to {{categorize}}. Now, the three categorization templates ({{categorize}}, {{catlangname}} and {{topics}}) each have three-letter short forms. I haven't done anything with {{top}} because there appears to be some objection from User:Crom daba to deprecating it (what do you think of using {{topic}} or {{C}} instead?). I actually think the canonical name of {{topics}} should be {{topic}}, with {{topics}} and {{T}} as the only aliases; all the other aliases are too confusing. Benwing2 (talk) 05:49, 17 October 2019 (UTC)
I would prefer keeping {{topics}} as the proper name. The template can take multiple topics as arguments after all, not just a single one. The name should hint at this usage to avoid people doing stuff like {{topic|xx|a}}{{topic|xx|b}}{{topic|xx|c}}. —Rua (mew) 20:47, 17 October 2019 (UTC)

Rename categories like Category:French learnedly borrowed terms to Category:French learned borrowings

IMO, "learnedly borrowed" sounds extremely awkward. Google shows only 8 hits, of which 6 are to Wiktionary. I want to create a {{semi-learned borrowing}} template, and "semi-learnedly borrowed" sounds even worse. Benwing2 (talk) 18:45, 12 October 2019 (UTC)

I support the renaming; "learned borrowings" sounds much better. — Eru·tuon 17:04, 13 October 2019 (UTC)
I support the renaming too (see Wiktionary:Requests for moves, mergers and splits § Template:semantic loan and Template:learned borrowing). Canonicalization (talk) 17:40, 13 October 2019 (UTC)
I support anything that doesn't have learnedly in it. NES do not use that word. Equinox 02:05, 14 October 2019 (UTC)
Done. Benwing2 (talk) 08:21, 15 October 2019 (UTC)

matchless

I just created this entry. At the bottom where the categories are there is this redlink: "English terms spelled with". Does anyone know what's going on here? Cheers. ---> Tooironic (talk) 04:51, 13 October 2019 (UTC)

@Tooironic: Yeah, the title has two invisible characters at the beginning, the zero-width no-break space or byte order mark (U+FEFF). It is not a standard character in English, so the "spelled with" category is added. There's already an entry for matchless without ZWNBSP characters.
Probably I should come up with an abuse filter to warn editors if there are invisible characters in the title, except where they're wanted, as in فارسی‌زبان (fârsi-zabân, Persian speaker). (The invisible character is after فارسی, though there it prevents the joining of the letters.) — Eru·tuon 06:06, 13 October 2019 (UTC)
Actually, for this character, only administrators can create the entries because the character is on the title blacklist, so a filter wouldn't be very useful. But tagging edits that add the character to wikitext might be helpful. — Eru·tuon 06:18, 13 October 2019 (UTC)
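
A check for invisible characters of the kind discussed above could be sketched as follows. This is a minimal Python illustration, not an abuse filter (filters use their own rule language); the function name `find_invisible` and the allow-list are assumptions made for the example, including the assumption that the wanted character in the Persian title is U+200C, the zero-width non-joiner.

```python
import unicodedata

# Hypothetical allow-list: U+200C (zero-width non-joiner) is assumed to be
# the wanted invisible character in titles like the Persian one above.
ALLOWED = {"\u200c"}


def find_invisible(title):
    """Return (index, character name) pairs for format characters in a title."""
    hits = []
    for i, ch in enumerate(title):
        # Unicode general category "Cf" covers format characters such as U+FEFF.
        if unicodedata.category(ch) == "Cf" and ch not in ALLOWED:
            hits.append((i, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return hits


# A title with two leading U+FEFF characters, as in the entry discussed above.
print(find_invisible("\ufeff\ufeffmatchless"))
print(find_invisible("matchless"))
```

The same test could in principle be applied to added wikitext rather than titles, which is closer to the edit-tagging idea in the last comment.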

Deprecate {{docparam}}

I don't really understand why we have {{docparam}} as well as {{para}}. They display the same; the only difference is that {{para}} supports specifying the value of the param, whereas {{docparam}} supports indicating whether the param is required, optional, etc. I propose adding an argument to {{para}} to support the required/optional/etc. use case and converting cases of {{docparam}}. There aren't very many uses of {{docparam}} anyway, only about 300-400. What I'm thinking of is adding a third numbered param to specify arbitrary info, and also adding boolean params |req= and |opt= for the common use cases of specifying required and optional params. Benwing2 (talk) 20:35, 13 October 2019 (UTC)

Can we rename one of {{para}} and {{param}} while we're at it? —Rua (mew) 20:37, 13 October 2019 (UTC)
@Rua Yes, we can rename {{param}} to something like {{pararef}} or {{paramref}}. It's used on < 100 pages in any case, while {{para}} is used on ~ 3000 pages. Benwing2 (talk) 21:55, 13 October 2019 (UTC)
@Rua I renamed {{param}} to {{paramref}} ({{pararef}} sounds like it could refer to paragraphs). I haven't deprecated {{param}}; I'll wait somewhat longer for comments on this. Benwing2 (talk) 00:12, 14 October 2019 (UTC)
The reason Template:docparam exists was the very large number of existing {{para}} calls on talk pages. I wanted a different formatting for use in template documentation and did not want to change any existing {{para}} calls. Only later was the template changed to actually call {{para}} instead of using its own formatting. At that point it became redundant. - dcljr (talk) 02:33, 14 October 2019 (UTC)
Deprecated and deleted {{docparam}}. Benwing2 (talk) 05:44, 17 October 2019 (UTC)
Deprecated and deleted {{param}} in favor of {{paramref}}. Benwing2 (talk) 05:58, 17 October 2019 (UTC)

Please avoid using template calls in section headings

I'd like to request that people here (and on other discussion pages, of course) try to avoid using template calls in section headings. It makes wikilinking to those sections from other pages unnecessarily difficult (for some users, anyway). For example, if I "naively" tried to link to the above section using any standard linking method, it wouldn't work:

  • normal wikilink, visible section title:
    [[Wiktionary:Beer parlour/2019/October#Deprecate {{docparam}}]]
    → [[Wiktionary:Beer parlour/2019/October#Deprecate Template:docparam]]
  • normal wikilink, actual section title:
    [[Wiktionary:Beer parlour/2019/October#Deprecate {{temp|docparam}}]]
    → [[Wiktionary:Beer parlour/2019/October#Deprecate {{docparam}}]]
  • external-style link, visible title:
    [https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour/2019/October#Deprecate_{{docparam}}]
    Template:docparam
  • external-style link, actual title:
    [https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour/2019/October#Deprecate_{{temp|docparam}}]
    {{docparam}}

None of these can be fixed by using nowiki tags, BTW. (Note that browsers do successfully follow the appropriate link shown in the Table of Contents at the top of this page, but trying to copy/paste that link location into your own wikilink results in the problems shown above.)

These URL-encoded versions do work:

but the encoding apparently needs to be done "manually" by the user (it does on my browser, anyway: FF 69.0.2), so it's not a practical solution for most users. (I'm not sure if this changed in a recent browser upgrade, because I know FF used to URL-encode text copied from the location bar, when necessary.)

Before I remembered to try the URL-encoding solution, I added a 'span' tag to allow for an alternate link target to that particular section (so I could link to it from a talk page), but the better solution is to simply avoid creating such section headings in the future. Thanks. - dcljr (talk) 06:15, 14 October 2019 (UTC)

Do you get the same problems with a manual link, e.g. Deprecate [[Template:docparam]] ? Benwing2 (talk) 06:33, 14 October 2019 (UTC)
No, because of the way MediaWiki handles such headings. Witness: Wiktionary:Beer parlour/2019/October#Cleaning up or deleting Template:jump (and the shorter version, when used on the same page: #Cleaning up or deleting Template:jump). Those links work fine for everyone. - dcljr (talk) 04:59, 15 October 2019 (UTC)
Sounds more like a problem that should be fixed, than a habit changed. --{{victar|talk}} 06:13, 15 October 2019 (UTC)
@Victar Funny… I see it the other way 'round. In any case, do you have a suggestion as to how it could be fixed? - dcljr (talk) 03:28, 19 October 2019 (UTC)
Sounds like the work of a sanitation filter. --{{victar|talk}} 04:23, 19 October 2019 (UTC)
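For anyone who wants to do the "manual" URL-encoding mentioned in this thread without typing the escapes by hand, here is a minimal Python sketch. It assumes the anchor is built from the visible heading text with spaces turned into underscores; the function name `section_url` is hypothetical, and the base URL is the one shown in the examples above. Treat it as a sketch under those assumptions, not as a description of what MediaWiki does internally.

```python
from urllib.parse import quote


def section_url(page, heading, base="https://en.wiktionary.org/wiki/"):
    """Build an external-style URL to a section, percent-encoding the anchor."""
    # Keep ':' and '/' readable in the page title; spaces become underscores.
    page_part = quote(page.replace(" ", "_"), safe="/:")
    # Encode everything in the anchor except unreserved characters,
    # so '{', '}' and '|' become %7B, %7D and %7C.
    anchor = quote(heading.replace(" ", "_"), safe="")
    return f"{base}{page_part}#{anchor}"


print(section_url("Wiktionary:Beer parlour/2019/October", "Deprecate {{docparam}}"))
```

The encoded result can be pasted into an external-style link, since it no longer contains literal braces for the wikitext parser to interpret as a template call.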

"Wikipedia has articles on" + listEdit

Can anyone think of a good reason why we should have a fake Wikipedia box at Orange County with links to eight Wikipedia articles? All of them are covered by Wikipedia's disambiguation page, which is now linked to by a real Wikipedia box that was added last year. I was tempted to just remove it, but it's been there for 13 years. Chuck Entz (talk) 09:57, 15 October 2019 (UTC)

I say delete it. Wikipedia should handle disambiguation and linking themselves.--So9q (talk) 10:36, 15 October 2019 (UTC)
Agreed, remove that. We could also adjust the {{wikipedia}} template to accommodate more links, right now it only handles two. - TheDaveRoss 12:03, 15 October 2019 (UTC)
Sometimes the WP dab page has so many items that the ones relevant to our entry are hard to find in the clutter. That said, I prefer the use of the in-line template for such links. DCDuring (talk) 15:46, 15 October 2019 (UTC)

Idiomaticity of Danish terms

I'm unsure how to label these 2:

What about idiomatic?--So9q (talk) 05:17, 16 October 2019 (UTC)

Idiomatic, just like middle of nowhere. Ultimateria (talk) 18:08, 16 October 2019 (UTC)
Please don't use the idiomatic tag, it's completely pointless. Canonicalization (talk) 19:26, 16 October 2019 (UTC)
Can you elaborate? --So9q (talk) 05:11, 18 October 2019 (UTC)
Using our typical definition of idiomatic, all phrases included in Wiktionary should be idiomatic – that is, not sum-of-parts (WT:SOP) – unless they're translation hubs (WT:THUB). You're probably thinking of some other definition of idiomatic for these phrases though. — Eru·tuon 05:32, 18 October 2019 (UTC)

Feedback wanted on Desktop Improvements project

07:15, 16 October 2019 (UTC)

Merge Template:unreferenced into Template:rfref

I just discovered {{unreferenced}}, which seems to serve exactly the same purpose as {{rfref}} but categorizes differently. It's used on maybe 10-15 pages; I propose rewriting {{unreferenced}} to {{rfref}} and deleting the former. Benwing2 (talk) 14:44, 16 October 2019 (UTC)

Merge. Canonicalization (talk) 21:50, 16 October 2019 (UTC)
Done, and deleted Category:Entries lacking sources, which was populated solely by that template (see Category:Requests for references by language). Benwing2 (talk) 19:29, 19 October 2019 (UTC)

Phrasal verbs

Can anyone explain to me how phrasal verbs work on English Wiktionary? First of all, I see that there is both Category:English verbs and Category:English phrasal verbs. However, most entries in the latter seem to use the header Verb. Should they not use Phrasal verb, or is this not used as a header? Does this mean that the phrasal verbs are only marked as such in the categorization, and this is done through the normal verb templates? I notice for example for blåsa upp that this word is included in both Category:Swedish verbs and Category:Swedish phrasal verbs, and I suspect this is done by the parameter particle= in the verb templates. Is this done correctly?

Second of all, what is the preferred way of linking to phrasal verbs from the main verb entry? I wrote it under Usage notes in styra upp, does this work, or should it be under Derived terms?

Third of all, since the parts are blue linked in the entry title, I suspect no Etymology section is needed? I.e. the one I used in styra upp is superfluous and might as well be removed? If not, is there a preferred etymology template to be used?

Thanks for taking the time to have a look at my questions, I’d really like to clean up among Swedish phrasal verbs. --Lundgren8 (t · c) 21:32, 16 October 2019 (UTC)

To answer just one of your questions: WT:POS gives a limited list of parts of speech allowed as headers. “Verb” is on the list; “Phrasal verb” is not. With the exception of “Prepositional phrase” and “Proper noun”, headers of the form “(attribute) (POS)” are even explicitly disallowed.  --Lambiam 22:19, 16 October 2019 (UTC)
It would seem that {{sv-verb}} works differently from {{en-verb}}, so questions that can't be answered in the documentation (see WT:ASV) should be directed to active Swedish admins (e.g., User:Mike, User:Robbie SWE) or other active contributors to Swedish entries, e.g., User:LA2. DCDuring (talk) 03:07, 17 October 2019 (UTC)

Past participles

I’ll post this as a separate question. I was wondering whether there is a policy on past participles. My question is whether they are to be treated as verb forms, adjectives or both. Where do we draw the line? I created e.g. blankspolad and uppstyrd today, where I defined it under adjective and used the declension templates there, but then added the verb form under a separate verb heading. It seems a bit tautological, but is this the way it’s done? --Lundgren8 (t · c) 21:51, 16 October 2019 (UTC)

An (imperfect) adjectivality test for English is whether the term can be used attributively, and whether it allows comparative and superlative, or more in general gradation (hardly, very, too, ...). You can say, “his time is come”, but not *“is it his come time already?” or *“his time is very come, but her time is even more come”. But you can say, “the disappointed candidate is open to a mandamus”, and “she was bitterly disappointed”. So the past participle “come” cannot fill the role of an adjective, but “disappointed” can; hence it is listed twice. (Aside: it would make sense to me to allow combining parts of speech in a heading. In many languages, most adjectives can also be used as adverbs, but they are typically only listed now as being adjectives. Why can’t we say ==Adjective/Adverb==?)  --Lambiam 22:43, 16 October 2019 (UTC)
The OED tends to have adjectives for "most" of them (haha, I haven't read most of the OED, but they do frequently add adjectives for what were only verb forms before). Equinox 05:53, 17 October 2019 (UTC)
Adding an adj sense with "the obvious meaning" to a verb pa.p is, at any rate, no stupider than creating entirely separate entries for noun plurals. Equinox 05:54, 17 October 2019 (UTC)

@Lundgren8 I do the same for Hungarian: adjective with declension and verb form. See akkreditált. Panda10 (talk) 18:20, 17 October 2019 (UTC)

Wikidata integration

Hi, I just visited our and the wikidata page for atomic clock. They are not linked. What would it take to link them? Has anybody worked on this? —This unsigned comment was added by So9q (talk • contribs) at 04:06, 18 October 2019 (UTC).

@So9q: You may want to take a look at WT:Wikidata. — justin(r)leung (t...) | c=› } 04:13, 18 October 2019 (UTC)
Thanks for the link. That page mostly talks about lexemes in wikidata. I saw now that Yurik in RU:WT added linking but only to the lexemes his bot extracted and imported there. Example. --So9q (talk) 05:08, 18 October 2019 (UTC)
I had previously linked several Wiktionary entries to their corresponding items in Wikidata but these were removed by bot. I had also attempted to link Wikidata items to corresponding pages in Wiktionary but this did not go well. It seems like both projects do not favor one another. Nevertheless, I have created Template:wikidatalite as a replacement of Template:wikidata based on the discussion here. KevinUp (talk) 09:12, 18 October 2019 (UTC)
So we can create a one-way link from here to Wikidata, which will look like “Q227467 on Wikidata”. But even if someone adds the lexeme atomic clock to the Wikidata Lexeme space, I don’t see how there would be a (possibly indirect) two-way link with Wikidata:Q227467.  --Lambiam 10:28, 18 October 2019 (UTC)
I'm happy to hear you are positive to this Lambiam. We were only discussing links to non-lexemes in the Q-namespace. I see no reason why a link to wiktionary articles in different languages for "atomic clock" could not be added to Wikidata:Q227467, but that is a discussion best done there.--So9q (talk) 18:12, 18 October 2019 (UTC)
I tried adding the link to our atomic clock and it was refused because of Notability. In the guidelines it is stated: "On Wiktionary, items for citation pages are not allowed. Main namespace is also excluded because interlanguage links are automatically provided by Cognate." source.--So9q (talk) 18:35, 18 October 2019 (UTC)
Thanks KevinUp for these templates. That was exactly what I meant. Do you know the rationale for removing one-way wikidata-links? Seems weird to me seeing that we have links to multiple other sources including non-WMF ones. I think the more linking the better.--So9q (talk) 18:12, 18 October 2019 (UTC)
The reason why Wikidata does not link to Wiktionary is because Wikidata items represent "concepts" rather than words. A single word can have multiple meanings, e.g. orange can refer to both the color and the fruit, so linking the orange fruit (Q13191 on Wikidata) to wikt:orange is not ideal.
If you're interested to link Wiktionary senses to the Wikidata Q-namespace, please use {{senseid}} instead. I don't think it is a good idea to use {{wikidatalite}} to link Wiktionary lemmas to Wikidata, because lemmas can have multiple senses whereas concepts tend to be more precise. I think {{wikidatalite}} can be used for scientific names and Unicode characters, but not words in general. KevinUp (talk) 08:42, 19 October 2019 (UTC)

Request for wikibase for Wiktionary

Hi, recently Jura proposed that we request a wikibase instance from WMF to integrate our cross language senses and other stuff into. If I understand correctly this avoids the licensing CC0 issue completely. WDYT--So9q (talk) 04:58, 18 October 2019 (UTC)

I am not sure what the CC0 issue is; who does not like which license for which project, and what license would be proposed for the new “wikibase” – whatever that means –, and what would be its remit? An ontological database of semantemes?  --Lambiam 10:44, 18 October 2019 (UTC)
The CC0 issue is that Wikidata is CC0 and that license is not compatible with Wiktionary's current licenses, so we can't just put Wiktionary data into Wikidata. A new Wikibase instance could have a compatible license so that data could be legally moved to that structure. - TheDaveRoss 12:45, 18 October 2019 (UTC)
And our license is not compatible how? Does it have to do with our citations, our external links? DCDuring (talk) 20:20, 18 October 2019 (UTC)
How is it that a new instance can ignore the existing CC-BY-SA 3.0 License? Doesn't it still have to have hyperlinks or URLs to Wiktionary for each element of information? DCDuring (talk) 20:32, 18 October 2019 (UTC)
The new instance would have a compatible default license (CC-BY-SA), and not CC0, as used on Wikidata. I'm in favor of Jura's proposal: It's not just the lexicographical data, we also have language and category data stored in Lua modules (with ugly split hacks to avoid memory issues). And we could do more radical UI improvements/experiments, when the data is no longer stored/tied to specific templates. – Jberkel 21:33, 18 October 2019 (UTC)
And the wikibase instance is to be automagically kept in sync with current Wiktionary? Will any syncing degrade performance for contributors or passive users? DCDuring (talk) 00:39, 19 October 2019 (UTC)
We would probably slowly move more and more data into the wikibase and generate entries partly from there. No need for synching in other words. This would also mean editing some things directly in the wikibase.--So9q (talk) 19:40, 21 October 2019 (UTC)

Place and given names in other languages

In light of discussions such as this and this, I realized that while in practice we have followed the principle that place names are acceptable in other languages too, while given names are so if they are used for people speaking that language specifically, while primarily not when talking about people of other nationalities. But is this actually codified on any existing policy? I couldn't find it in the CFI, for instance, and I'm fine with codifying the system as it is now. — surjection?〉 07:36, 18 October 2019 (UTC)

It seems like WT:CFI#Given and family names did not elaborate much on names that were borrowed or romanized from other languages. I don't think it is a good idea to consider all romanized forms of given names and surnames as English lemmas unless it is backed up by statistical evidence.
Currently, there are many false positives in Category:English surnames from Japanese and Category:Portuguese surnames from Japanese. If these are allowed, then we will be seeing lots of similar entries for Dutch, French, German, Italian, Portuguese, Spanish, Tagalog, etc. with the same spelling. KevinUp (talk) 08:56, 18 October 2019 (UTC)
I think we should consider policy and practice for personal names of non-historical figures and for toponyms separately. Toponyms are usually subject to translation; Turkish İstanbul (with a dotted ⟨İ⟩) becomes dotless English Istanbul and even Dutch Istanboel. These names have the same referent. Except for historical figures (English John Lackland, German Johann Ohneland, Azeri Torpaqsız İoann), a given name like John may be transliterated, but is not translated; “John F. Kennedy” does not become “Jean F. Kennedie” in French or “Иван Ф. Кеннедий” in Russian; it remains “John F. Kennedy” in French and German alike, and becomes “Τζον Φ. Κέννεντυ” in Greek and “Джон Ф. Кеннеди” in Russian. However, “John” has become a common given name or nickname for (e.g.) Dutch men, such as Dutch reality-TV tycoon John de Mol, which makes it reasonable to also list it as a Dutch given name. We can list transliterations of Language X names under the L2 of Language X, using “A transliteration of the name”. For toponyms I’d recommend to only list a name from a non-Anglophone country under an L2 of English if (next to attestability) it was originally not in Latin script, or has historical variant names in other languages using Latin script (like Dutch Parijs for the French capital).  --Lambiam 10:10, 18 October 2019 (UTC)
Well, I have suggested to not sort nomina propria under language headers altogether. People begin to see why. Fay Freak (talk) 14:36, 18 October 2019 (UTC)
I agree that separate consideration will be needed for personal names and toponyms. I don't have much issues with toponyms because toponyms tend to have nativized spelling, such as Istanbul losing the dotted İ as mentioned above. Moreover, the English entry of the toponym can still serve as a translation hub if it does not pass RFV/RFD.
However, different considerations will be needed for personal names, because they tend to preserve the same spelling among different languages that share the same script, as mentioned above. Proper guidelines will be needed to identify what qualifies as a borrowing and what qualifies as a transliteration. For personal names in English, statistical evidence of non-Anglophone names being used among citizens of Anglophone countries is a good indication of borrowing into English. See entries such as Nguyen, Tamura, Abdulla for example. KevinUp (talk) 14:54, 18 October 2019 (UTC)
If they are spelled the same or not is of little relevance. If a Pakistani moves to England his name will just continue being used, English or not. This is not a borrowing. There will never be borrowing. In the 7th generation there will not have been a borrowing, and if by family reunification he lets his whole inbred village follow there will still be no borrowing even if there are now thousand bearers of his name in England. It does not depend on how “English” his descendants are since we do not want to get into these nation questions which aren’t linguistic and names and languages are not as a principle bound to areas. Editors just have to bid farewell to the notion that usage is that whereby a word is assigned to a language. Fay Freak (talk) 15:06, 18 October 2019 (UTC)
inbred......? —Suzukaze-c 21:46, 18 October 2019 (UTC)
If there will never be borrowing, I wonder how names like John (Greek) and Isaac (Hebrew) got listed as English names. The number of names that can trace their history back to the Germanic substrate of English through Old English is pretty small. These continue; Liam is now the second most popular name in the US, and I'm pretty sure that's mainly among English speakers.
It doesn't matter how “English” his descendants are... then what? What names can be English, and how are you making that distinction? The only thing you actually said was that a Pakistani's name will never be English, with no reason I can see.--Prosfilaes (talk) 20:47, 19 October 2019 (UTC)
I explained the reason and you missed the point. Personal names and place names are not used like other words, they travel independently of languages and aren’t members in the sets of individual languages, albeit part of the phenomenon of language in general. Isaac isn’t an English name, nothing is an English name. When you say “Isaac is part of the English language” you use “language” in a different way than I imagine it, in a way that is not necessary and problematic here because of the Sorites paradox. People only ever come up with arbitrary criteria like Anglicized spelling or ”no Romanizations” to exclude names, and “Englishness” is also one, especially if tied to extralinguistic factors. I have suggested to restrict the assignment of personal and place names to languages altogether. I can very well forgo German sections of Isaac and Isaak, as well as Hermann and Heinrich, and whether pronunciations in each language should be listed then under a translingual header is also debatable, there are fields where a dictionary should not get into or it will stay a kludge. No 100 sections for Srebrenica. It is a problem traditional dictionaries did not have because they were restricted by language and personnel and could arbitrarily include but few personal names they deemed interesting. Now you won’t find criteria except “muh it has a macron so it is not English”, “only Asian migrants bear this name”. Nothing convincing people argue for the names, all start from wrong premises. Why not have one section for all languages, talking about frequencies of spellings by country or what data there is? Other pages will point to such a page saying things like “predominantly German spelling of”, “Persian encoding of”, usage notes with particular frequency information, etc. whatever, it’s the details. This can be introduced for some names to start. But calling every name ever used (durably …) in an English-language context English by whatever rubbery restriction is false. 
It wasn’t false when market demand and limited manpower regulated the inclusion of names in dictionaries but now that the whole world can spread every name to the whole world for free it is wrong, has been wrong to begin with, for the concept of a Wiki dictionary, but people were not farsighted enough to see it. This problem is peculiar to names as words in languages are naturally limited, names are not, are more arbitrary than the common nouns, with looser relation to any individual language. Fay Freak (talk) 22:16, 19 October 2019 (UTC)
A message which gets missed when you start talking about inbred Pakistanis.
On one hand, if you have a different model, then demo it in user space. Show us what you want to do.
On the other, throwing up our hands on doing what traditional dictionaries are doing because it's too hard is lazy and not helping our users. We need a section for Srebrenica in every language where people talk about Srebrenica and need to know how it's pronounced in their tongue. The line between names and other words in the language is more blurry, but not absolutely different. Isaac, as the name of Abraham's son, is as clearly English as any name for any other thing. Isaac as an English name may be more complex, but there is a tight association. Names spelled the same way are frequently pronounced differently in different languages. It's a mess and needs to be handled differently.--Prosfilaes (talk) 20:10, 20 October 2019 (UTC)
I think pronunciation needs to be taken into account too. Paris or Trois-Rivières should be listed as English words even though the spelling doesn't vary from the original, because they have well-established English pronunciations and are frequently used in English texts. Often, foreign place names have non-intuitive pronunciations because they aren't spelled like typical English words, so people often look to dictionaries to see how to pronounce them. I would argue that the same attestation rule should apply for toponyms as for regular words (perhaps excluding atlases and maps), since obscure places are unlikely to be ever mentioned outside of the dominant language in the area anyway. But I wouldn't be opposed to a stricter citation requirement. Andrew Sheedy (talk) 21:15, 19 October 2019 (UTC)
I think it should also be kept in mind that historically, people's last names were often modified when they came to English-speaking countries, because census-takers, their parish priests, etc. would often record their name phonetically rather than the way it would be spelled in the original language. There are a half dozen spellings of my mother's maiden name, but only one spelling in the original Polish. At that point, those last names are obviously English words, because they certainly aren't Polish anymore.
This no longer happens as much, just because of the nature of modern-day documentation (the exception being when transliteration is necessary), so it's a lot more difficult to pinpoint when an immigrant family's name becomes English. However, I would suggest that it occurs when there are people with that last name who no longer speak the language it is from, since they can't be considered to be code-switching, and because they will almost certainly pronounce it differently than in the source language. It will then be valuable to record how the name is typically pronounced in English. Especially with names like Nguyen, which have non-intuitive pronunciations, (often loosely) based on the original. Another indication is when names are stripped of their diacritics.
Category:Navajo surnames might be of consideration: they are all Anglicized. —Suzukaze-c 21:49, 18 October 2019 (UTC)
See Wiktionary:About given names and surnames#The language statement of a name. In my opinion given names and surnames should have slightly different criteria. Toponyms are another matter. --Makaokalani (talk) 12:04, 19 October 2019 (UTC)
What I would propose is that the reasoning there be honed and then included into WT:CFI. — surjection?〉 08:38, 22 October 2019 (UTC)

Cleaning up request categories

There was a vote a while ago, spearheaded by User:Daniel Carrero, to clean up the names of request categories so they more-or-less consistently begin with "Requests for X". This was very helpful and much better than the old names (e.g. Category:English entries needing definition or Category:English terms needing attention or Category:English requests for example sentences); now, you can use autocomplete to search easily for request categories. But there are still some inconsistencies. This became apparent to me when I went through and documented all the request templates I could locate; the results of this can be seen at Category:Request templates. Most of the categories have the form "Requests for X in LANG entries" but there are several that don't, e.g.:

What do people think about renaming some of these categories to be more consistent?

Oops, need signature. Pinging User:Daniel Carrero again. Benwing2 (talk) 19:25, 19 October 2019 (UTC)
Agree. What was it with unhiding them? There is little gain to have well-named request categories if no one finds em. Fay Freak (talk) 22:33, 19 October 2019 (UTC)
  Support Consistency is nice.--So9q (talk) 04:03, 20 October 2019 (UTC)

Category:Entries needing inflection by language (e.g. Category:Afrikaans entries needing inflection)

@Rua I notice these categories still exist, and are redundant to Category:Requests for inflections by language (e.g. Category:Requests for inflections in Albanian entries). They are populated by the undocumented |fNrequest= parameter to {{head}}. Anyone object to renaming the "Entries needing inflection" categories to "Requests for inflections ..." and then documenting the |fNrequest= param? Benwing2 (talk) 19:48, 19 October 2019 (UTC)

Template:term-context vs. Template:term-label

Both templates appear to do exactly the same thing except that {{term-context}} allows the language code in |lang= (but displays a deprecation warning for this) whereas {{term-label}} doesn't allow this. Consistent with {{context}} vs. {{label}}, I propose deprecating {{term-context}} in favor of {{term-label}}. Benwing2 (talk) 05:01, 20 October 2019 (UTC)

Upsilon with tilde

In the etymology of rheumatism Greek letter "ῦ" is automatically transliterated as "û". Wouldn't it be better using "ũ" instead? --188.76.241.115 11:02, 20 October 2019 (UTC)

The diacritic on ῦ is a Greek circumflex, so we transcribe it with a Latin circumflex. In some fonts it somewhat resembles a Latin circumflex (more precisely, an inverted breve), but in your browser's font and mine it looks like a tilde. We transliterate it according to its meaning and don't try to match the exact form (which would be impossible, because it varies). — Eru·tuon 17:22, 20 October 2019 (UTC)
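To make the "transliterate by meaning, not by shape" point concrete, here is a small Python sketch. It decomposes the Greek letter into base letter plus combining mark, maps the Greek circumflex (combining perispomeni, U+0342) to the Latin combining circumflex (U+0302), and recomposes. The two-entry mapping table and the function name are illustrative assumptions only; Wiktionary's real transliteration is done by its own modules and covers far more than this.

```python
import unicodedata

# Tiny illustrative mapping: one base letter and one diacritic only.
GREEK_TO_LATIN = {
    "\u03c5": "u",       # GREEK SMALL LETTER UPSILON -> Latin u
    "\u0342": "\u0302",  # COMBINING GREEK PERISPOMENI -> COMBINING CIRCUMFLEX ACCENT
}


def transliterate(text):
    """Map decomposed Greek characters to Latin ones, then recompose."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(GREEK_TO_LATIN.get(char, char) for char in decomposed)
    return unicodedata.normalize("NFC", mapped)


result = transliterate("\u1fe6")  # upsilon with perispomeni
print(result, f"U+{ord(result):04X}")
```

The mapping is keyed on the code point of the diacritic, so the output does not depend on whether a given font draws the perispomeni like a tilde or like an inverted breve.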

Split Template:1 into Template:uc, Template:ucl and Template:ucm

@-sche, Sgconlaw, msh210, DTLHS, Pious Eterino The horribly-named {{1}} creates a link to a lowercase term, with the display form being the uppercase form. It currently takes a language code in |lang=, defaulting to en, which is widely misused, mostly by being forgotten. In several Albanian entries, {{1}} by itself is misused to refer to the current pagename (which happens to be uppercase already), without properly specifying the language (and hence you get an #English link). Most uses are inside of other templates such as {{initialism of}}, but some are in lists or running text. I propose to split this into three templates:

  1. {{uc}} takes no lang code and directly generates a link for use inside another template, e.g. {{uc|foobar}} expands to [[foobar|Foobar]]. A longer example is found on NJQSAC (initialism of "New Jersey Quality Single Accountability Continuum"), which could be defined as
    # {{initialism of|en|[[New Jersey]] {{uc|quality}} {{uc|single}} {{uc|accountability}} {{uc|continuum}}}}
  2. {{ucl}} takes a lang code and generates a link like {{l}}; hence e.g. {{ucl|en|association}} is equivalent to {{l|en|association|Association}}.
  3. {{ucm}} takes a lang code and generates a link like {{m}}; hence e.g. {{ucm|en|association}} is equivalent to {{m|en|association|Association}}.

Thoughts? Benwing2 (talk) 04:03, 21 October 2019 (UTC)
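The three proposed expansions can be mimicked outside the wiki with a few lines of Python, which may help when checking what a given call should produce. This is only a string-level sketch of the proposal as described in the list above: the Python function names mirror the proposed template names, and "uppercase form" is assumed to mean uppercasing the first letter only, as in the foobar example.

```python
def ucfirst(term):
    """Uppercase the first character only, leaving the rest untouched."""
    return term[:1].upper() + term[1:]


def uc(term):
    """Mimic proposed {{uc|term}}: a bare wikilink with an uppercase display form."""
    return f"[[{term}|{ucfirst(term)}]]"


def ucl(lang, term):
    """Mimic proposed {{ucl|lang|term}}: the equivalent {{l}} call."""
    return "{{l|%s|%s|%s}}" % (lang, term, ucfirst(term))


def ucm(lang, term):
    """Mimic proposed {{ucm|lang|term}}: the equivalent {{m}} call."""
    return "{{m|%s|%s|%s}}" % (lang, term, ucfirst(term))


print(uc("foobar"))
print(ucl("en", "association"))
print(ucm("en", "association"))

# The NJQSAC definition line from the proposal, with the {{uc}} calls expanded.
words = ["quality", "single", "accountability", "continuum"]
print("# {{initialism of|en|[[New Jersey]] " + " ".join(uc(w) for w in words) + "}}")
```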

Most of the time I see {{1}} used, it's in imitation of neighboring uses of {{l}} by new users who aren't paying attention. That probably explains your Albanian examples. Chuck Entz (talk) 05:28, 21 October 2019 (UTC)
The template documentation says it must always be substed. So (contrary to the claim that it's "widely misused") we really don't know how it's usually used: presumably as intended, by substing, for editors' convenience instead of repeating typing a word. I see no problem with renaming it "uc" with a redirect from "1" so people can continue to use it (and with backward compatibility).​—msh210 (talk) 22:24, 21 October 2019 (UTC)

The issue with Westrobothnian and Scanian on Wiktionary

Previous discussions:

I am very skeptical regarding the inclusion of certain Swedish linguistic varieties on English Wiktionary, primarily Category:Scanian language and Category:Westrobothnian language. Westrobothnian has been discussed on Wiktionary before, e.g. here in a discussion which resulted in many entries being deleted, but since then @Knyȝt has added new entries, and for the last three years or so this hasn’t been discussed further.

My issues can be summarized in a few points:

  • The orthographies used in Westrobothnian and Scanian entries are not in any way established.
  • The orthography for Westrobothnian is inconsistent.
  • The entries generally do not cite any sources, and the sources used do not use the same orthography.
  • Entries have previously been deleted but readded.

1. The orthographies used in Westrobothnian and Scanian entries are not in any way established.

Westrobothnian is written by Knyȝt using an orthography of their own devising which is not found in any literature or elsewhere on the internet. Wiktionary is not the forum for original research or personal inventions (see also WP:FORUM), and we should not have entries in an orthography which is not found elsewhere.

Similarly, all the Scanian entries stem from a single source, a personal proposal for a Scanian orthography made in 2010 by a local enthusiast, Mikael Lucazin, and not used elsewhere. Surely this single proposal by one enthusiast can’t form the basis of Wiktionary inclusion? The Scanian orthography is very etymological, romantic, and archaistic; it uses etymological graphemes borrowed from Old Norse such as ⟨þ⟩ and ⟨ð⟩ and looks very peculiar overall. It looks very similar to the Focurc project on the internet, which presents a Scots dialect in a deliberately non-English-inspired orthography in order to highlight its uniqueness.

2. The orthography for Westrobothnian is inconsistent.

It is clear that Wiktionary is being used as a platform to launch a personal orthography, as the orthography is inconsistent and has changed over time.

The entry vâtn cites the definite form as vâtne, but here an earlier spelling vætnĕð is used, on Swedish Wiktionary vætne, and in the example sentence in spūt the spelling vattnä is used. The Westrobothnian project originally started on Swedish Wiktionary, where several entries were moved after the orthography changed, e.g. hähjänna to heþ hérna to heðhérnă, which on English Wiktionary corresponds to a fourth form: he + hjänna. Similarly, the Westrobothnian spellings used under garðr#Descendants are gål~gɑl, but gárþ here and gǫ́rð on Swedish Wiktionary. The entry auge has seven different spellings, with variations used in various example sentences and no sources for any of the forms or example sentences. The original orthography can be found on sv:Användare:Knyȝt#Vokaler.

Obviously a language can have many spelling variations and alternative forms; that’s not the issue. But it seems that Wiktionary is being used as a sandbox, with spellings changing over time as the personal orthography changes. Again, Wiktionary is not the platform for this; we are to use attested spellings found in reliable sources.

3. The entries generally do not cite any sources, and the sources used do not use the same orthography.

Another issue is the lack of sources. Sure, not every English entry contains a source, but I would argue it is more important when a variety is not as well attested or as easily double-checked. Many entries (such as frȯijen) contain a definition and information about the etymology and the pronunciation, but with no sources, and I don’t think there is a published etymological dictionary for Westrobothnian, which means that many of the etymology sections are original research. Where do all the example sentences with different orthographies under e.g. eye come from? What are the sources for the regional pronunciation in frööys? These questions are relevant for a majority of the entries.

Sometimes certain sources are used such as Svenskt dialektlexikon (1862–1867) by Johan Ernst Rietz, but these do not contain the same orthography used in the entries.

If there were established orthographies and dictionaries for Westrobothnian and Scanian, I would not be as concerned. We can compare them to other Nordic varieties like Elfdalian, Gutnish and Jämtlandic (Jamtish is by the way not used in published works), which have orthographies that are attested outside of Wiktionary as well as published dictionaries. These three varieties have more standardized orthographies than e.g. Scanian and Westrobothnian, which lack orthographies altogether. In my opinion this would be enough for inclusion, as one could use these published materials as sources. The problem with Scanian and Westrobothnian on Wiktionary is that, rather than using existing published materials, they are an attempt to launch a personal project, which should be done somewhere other than Wiktionary.

Variety | Dictionary | Orthography | Organization behind orthography
Elfdalian | Dictionary | Orthography | Råðdjärum
Gutnish | Dictionary | Orthography | Gutamålsgillet
Jämtlandic | Dictionary | Orthography | Jamtamot

4. Entries have previously been deleted but readded.

Westrobothnian has been up for one WT:RTD and one WT:RTV almost 3 years ago, which resulted in all entries being deleted, as the orthography was considered to be idiosyncratic and unattestable. This was ignored and entries were readded, and there are now 2761 Westrobothnian entries. I find it odd that the problems I bring up in this post have already been brought up before, resulting in action (@Korn, @Mahagaja), yet the problem was not resolved.

In summary, I don’t think that Wiktionary is the platform to launch a personal project such as for Westrobothnian or Scanian. I find the project impressive and it’s a good cause, but it can be done on a personal website, as the orthography cannot be attested outside of Wiktionary. --Lundgren8 (t · c) 17:25, 21 October 2019 (UTC)

Most of the things you have stated are false. Anyone who has actually read more than a handful of my articles (or any of the cited works) would know this. I find it very bothersome to have to respond to all these accusations you make out of either ignorance or inability to understand what happens on Wiktionary and how any random post in time doesn’t necessarily directly and literally relate to some other action you randomly come across somewhere else at some other point. Maybe I will go into each thing in detail if you persist with this nonsense. Also, I assume we can restrict this discussion to the version of Wiktionary we actually are on, since Swedish Wiktionary would have their own beer parlour or somesuch where I’m sure you can post as much as you want about whatever you find troublesome there. Now, back to English Wiktionary again:
Let me ask you, am I the author of the cited works, the orthographies whereof I actually use in the articles (you were reading an obsolete post on my talk page):
  • Rietz, Johan Ernst, Svenskt dialektlexikon: ordbok öfver svenska allmogespråket
  • Larsson, Evert; Söderström, Sven, Hössjömålet: ordbok över en sydvästerbottnisk dialekt
  • Stenberg, Pehr; Widmark, Gusten, Ordbok över Umemålet
  • Fältskytt, Gunnar, 2007, Ordbok över Lövångersmålet
  • Lindgren, Jonas Valfrid, Ordbok över Burträskmålet
  • Marklund, Thorsten, 1986, Skelleftemålet: grammatik och ordlista: för lekmän - av lekman
  • Nyström, Jan-Olov, 1993, Ordbok över lulemålet
  • Sandberg, Herny; Sandberg, Ingrid, ed., I åol leist: ordlista på kalixmål, sådant det talades på 1990-talet
I will not be your servant any further and link you a bunch of articles citing these sources. You can yourself simply type these titles in the search bar and find many of my articles easily. And I can assure you, most of my articles can be sourced through several of these works, but it seems a bit daft that I should have to type out the sources for every single little article I write, when it is the same sources again and again, and I am not coming up with some new strange information that cannot be derived therefrom. Easily, if you familiarise yourself even slightly with a couple of these works, you will start to recognise their orthographies as used in my articles, and can stop making up lies about it.
Also, here is the orthographic reference for Scanian: http://docplayer.se/10011690-M-lucazin-utkast-till-ortografi-over-skanska-spraket-med-morfologi-och-ordlista-mmx.html
Do I have to explain every single detail you mention, how you are reading something out of date or not actually checking what the facts are, or can you look around a bit yourself and think? — Knyȝt 19:04, 21 October 2019 (UTC)
Hi, I agree with Lundgren: we need at least one quotation for each article, or else it should be deleted. I welcome you to put the content somewhere else, e.g. on Wikibooks. See details in the relevant policy.--So9q (talk) 19:31, 21 October 2019 (UTC)
See Wiktionary:Criteria for inclusion/Well documented languages. They don't need quotations but they do need references. DTLHS (talk) 19:33, 21 October 2019 (UTC)