Wiktionary:Beer parlour/2019/June

Addition of {{rootsee}} to general entries

Discussion moved from Wiktionary:Tea room/2019/June.

@Ankitdimania has been applying {{rootsee}} to general entries such as exclamation in the "Derived terms" section. Just wanted to check if this is appropriate, as I thought {{rootsee}} was intended for use on entries concerning roots only, such as "Reconstruction:Proto-Indo-European/kelh₁-". — SGconlaw (talk) 19:07, 2 June 2019 (UTC)[reply]

Thank you SGconlaw for getting general feedback on this. My intention of using {{rootsee}} template is that all words associated with a root are clubbed in one place and then can be used to show etymologically related words with just one edit. Also, any addition/subtraction is dynamic, i.e. if I add a new word to the category, it will get reflected in all the places where related words with root is used. Ankitdimania (talk) 00:34, 3 June 2019 (UTC)[reply]
I'm also open to finding any better way to achieve this. Please help. Ankitdimania (talk) 00:34, 3 June 2019 (UTC)[reply]
I found one flaw in using rootsee though. The list is expanded by default, and as such, the page is long and difficult to comprehend. If the list could be unexpanded by default, or some other way, it would be easier to read the entry and then expand the list to find etymologically related words. Ankitdimania (talk) 00:37, 3 June 2019 (UTC)[reply]
I enjoy seeing this information, but I think it shouldn't be expanded by default. Same with cognates, I prefer a collapsed list. Ultimateria (talk) 02:32, 3 June 2019 (UTC)[reply]
Perhaps it makes sense for {{rootsee}} lists to be expanded in root entries? I don't know. — SGconlaw (talk) 03:53, 3 June 2019 (UTC)[reply]

Wiktionary:Votes/2016-07/Adding PIE root box. —Suzukaze-c 03:59, 3 June 2019 (UTC)[reply]

I'm glad somebody remembered this. This seems like an open-and-shut case. The removal should begin. DCDuring (talk) 08:21, 3 June 2019 (UTC)[reply]
So, for clarity's sake, the suggestion is that the previous vote on PIE root boxes suggests that PIE root descendants should similarly not be added to entries via {{rootsee}}? Because the vote didn't actually touch on the current issue. — SGconlaw (talk) 10:38, 3 June 2019 (UTC)[reply]
I would interpret it that way, though the wording was absurdly narrow. As worded, it would not forbid a yellow and red display of each PIE-derived related term at different random locations in the entry flashing at seizure-inducing intervals. DCDuring (talk) 12:03, 3 June 2019 (UTC)[reply]
Thank you for feedback on this. Also, if we find this information still relevant, I can update the {{rootsee}} template to have the box collapsed by default. Alternatively, we can just put links similar to "English terms derived from the PIE kelh₁" — Ankitdimania (talk) 17:42, 3 June 2019 (UTC)[reply]
Yes, we already have the category pages with precisely this information. Or is it just a matter of presenting the information directly in the entry? – Jberkel 05:06, 4 June 2019 (UTC)[reply]
I support the use of this template, as it's bound to give a more complete picture of all related terms. Moreover, it avoids duplication of related terms across entries. —Rua (mew) 19:10, 3 June 2019 (UTC)[reply]
I checked how to collapse the box, we can change the depth value in https://en.wiktionary.org/w/index.php?title=Template:rootsee&action=edit, to 0. The usage is documented in CategoryTree. I can test and make the update if it is acceptable. Ankitdimania (talk) 03:22, 4 June 2019 (UTC)[reply]
  • What is the point of presenting this information in lieu of lists curated by humans? Some of the items included are just silly, eg, blends that don't contain the root.
In any event the extensive information is available to anyone who cares at the entry for the PIE term. There is already a category link in the entries. If it is too exhausting for those few who are interested in PIE cognates, a link to the reconstructed PIE term could be inserted in an Etymology section. DCDuring (talk) 13:58, 4 June 2019 (UTC)
  • To give an example how this information is structurally better, I have a recent edit as an example vs pugnacious#Related_terms (here we can see it takes less effort, is quite compact and is dynamically updatable at other places like pugilism's related entries).
While other approaches give a bit of pain, e.g.
1.) Human curated list is not always up to date, or extensive. A related list is present in one place, but not in other places. Some words are added in a few entries but skipped in other related entries, etc.
2.) Also, Human curated list will require more manual effort to add a new word across all related pages.
3.) Category link in the entries are at the bottom of the page (sometimes after a long scroll through other language's entries, which is not intuitive). e.g. pen is interestingly related to feather and pinion, which is esoteric due to the long scroll on pen's entry. We can, though, add the category link in related entries itself and that would be preferable to me.
Link to reconstructed PIE terms in Etymology section is a good middle ground here. Another benefit here is that the etymology section would have to be a bit more detailed, which would be nice.
Also, I agree that the list is a bit intimidating and tedious, but listing weird entries such as blends or composites give beautiful insights into the relationship of words. e.g. Insights by Norman Lewis are an interesting read on this. I'm still in favor of listing the {{rootsee}}, just the list should be collapsed to give user an option to expand it if relevant to him/her. With a collapsed list, user can just skip the section and it's not intimidating anymore.
Please LMK how you would want to structure the page. Ankitdimania (talk) 19:25, 9 June 2019 (UTC)
By reverting to the human-curated material. What is the problem with simply having a link to the PIE root and having all the {{rootsee}} there or on subpages of there or hidden under the direct derivatives in each language? The romance of the "beautiful insights into the relationship of words" is of no appeal except to the amateur linguists. I am concerned that some of them apparently have no sense of responsibility for making Wiktionary useful to normal users and instead are using this project to indulge themselves. DCDuring (talk) 22:56, 9 June 2019 (UTC)
The related terms for calyx are: apocalypse, calyx and occult. Not very helpful to show that the entry is related to itself. – Jberkel 06:54, 12 June 2019 (UTC)[reply]
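To illustrate the last point, here is a minimal Python sketch of how a category-driven list could drop the entry it is shown on. It is not how {{rootsee}} is implemented; the function name `related_terms` and the plain list standing in for the category members are assumptions made only for this example.

```python
def related_terms(category_members, current_entry):
    """Return the category members to display on `current_entry`.

    `category_members` is a hypothetical stand-in for the pages in a
    "terms derived from the PIE root ..." category.
    """
    # Deduplicate, drop the entry being viewed, and sort for a stable display.
    return sorted(term for term in set(category_members) if term != current_entry)


if __name__ == "__main__":
    members = ["apocalypse", "calyx", "occult"]
    print(related_terms(members, "calyx"))
```

Because the list is computed from the category each time, adding a word to the category would update every entry that shows the list, which is the dynamic behaviour described earlier in the thread.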

Poll

For ease of reference, what follows is a list of editors who support and do not support the use of {{rootsee}} in ordinary entries. Please add your names to the poll after you have participated in the above discussion to your satisfaction. — SGconlaw (talk) 09:43, 10 June 2019 (UTC)[reply]

Support
Do not support
Abstain

A proposal for WikiJournals to become a new sister project

Over the last few years, the WikiJournal User Group has been building and testing a set of peer reviewed academic journals on a mediawiki platform. The main types of articles are:

  • Existing Wikipedia articles submitted for external review and feedback (example)
  • From-scratch articles that, after review, are imported to Wikipedia (example)
  • Original research articles that are not imported to Wikipedia (example)

Proposal: WikiJournals as a new sister project

From a Wikipedian point of view, this is a complementary system to Featured article review, but bridging the gap with external experts, implementing established scholarly practices, and generating citable, doi-linked publications.

Please take a look and support/oppose/comment! Evolution and evolvability (talk) 04:24, 3 June 2019 (UTC)[reply]

Request for rights

Hi there. I came here to request autopatrolled rights. I used to edit here as user:Diego Grez-Cañete, but no longer have access to that account. I also used to be an autopatroller and rollbacker, but lost these rights a long time ago. The autopatrolled right would allow me to create entries faster, as I am forbidden to create more than two or three entries per minute. My interest, atm, is to create entries for gentilicios of Chile. Nothing that can't be cited. Thanks in advance. --Cuatro Remos (talk) 19:01, 3 June 2019 (UTC)

  Done. If you have retired the User:Diego Grez-Cañete account, please update your current user account so that it does not redirect to it. Thanks. — SGconlaw (talk) 19:13, 3 June 2019 (UTC)[reply]
Thank you. Have done so. --Cuatro Remos (talk) 19:26, 3 June 2019 (UTC)[reply]

User Stephen G. Brown

User:Stephen G. Brown hasn't been active since the 10th of Feb this year - one of our most active long-time editors who contributed in a big number of languages and scripts. I was connected with him outside Wiktionary. He hasn't responded to any contacts. It makes me worry about his health. --Anatoli T. (обсудить/вклад) 04:22, 4 June 2019 (UTC)[reply]

I hope he's alright. I searched (briefly) for obituaries of people with that name and didn't spot any (except one from 2018, clearly not him). - -sche (discuss) 05:13, 4 June 2019 (UTC)[reply]
I always wondered why somebody with such an outstanding command of languages across multiple continents would waste his time here. Maybe he got a hobby. It's our loss. Equinox 06:56, 4 June 2019 (UTC)[reply]
Last WP contribution also February. DCDuring (talk) 14:07, 4 June 2019 (UTC)[reply]

{{ja-spellings}} doesn't work well with wago at kanji

As is shown by the vote, many editors do not support lemmatizing all wago at kana entries. This means that a large number of wago would be lemmatized at kanji, such as 戦う, and consequently require both {{ja-spellings}} and {{ja-kanjitab}}:

{{ja-spellings|たたかう|h=たたかふ|戦う|闘う}}
{{ja-kanjitab|たたか|yomi=k}}

This complicates the entry layout because floating elements are laid right to left, as shown at 敷居. It would be possible to make them stack vertically by using the floatright class, but as Eirikr explains, this causes other problems.

Moreover, the kanji spellings in {{ja-spellings}} are shown in a larger size than the kana spellings. This works fine if wago are lemmatized at kana and the reader wants to look up by kanji, but not if wago are lemmatized at kanji and the reader wants to look up by reading (kana).

Therefore {{ja-spellings}} doesn't work well with wago entries at kanji. Given that a lot of wago entries would be lemmatized at kanji, I would like to remove the template and propose the following scheme instead:

  1. Move the kanji spellings to {{ja-kanjitab}}. Extend {{ja-kanjitab}} to accept "alternative kanji spellings", like this:
    {{ja-kanjitab|たたか|yomi=k|alt=闘う}}
    
    Kanji in this term
    たたか
    Grade: 4
    kun’yomi
    Alternative spelling 闘う


    {{ja-kanjitab|たたか|yomi=k}} // followed by a {{ja-see}}
    
    Kanji in this term
    たたか
    Grade: S
    kun’yomi
    Alternative spelling
    鬭う (kyūjitai)


    {{ja-kanjitab|alt=然して}}
    
    Alternative spelling 然して
  2. Move the modern and historical kana spellings to {{ja-pron}}. This might not be feasible at the moment and we can keep them first in the headword templates, but in the long run we can modify {{ja-pron}} to accept both modern and historical kana spellings:
    見違ふ (literary form of 見違える; may be lemmatized at either みちがふ or みちがう)

What do you think of this approach?

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 16:15, 6 June 2019 (UTC)[reply]

(1) I still think that option 1 of the vote is conceptually nicer and simpler. But if we must, perhaps this is alright. Or we could revive the "Alternative spellings" header, keeping more in line with standard entry format (although we do have things like لباس#Persian).
(1, 2) As long as it doesn't get too confusing. —Suzukaze-c 19:05, 6 June 2019 (UTC)[reply]
@Suzukaze-c: Um..., actually I never liked headers. They're fine for Wikipedia, but using them in Wiktionary entries is like writing XML like
<key level="1">Note</key>

<key level="2">To</key>
<value>George</value>

<key level="2">From</key>
<value>John</value>

<key level="2">Heading</key>
<value>Reminder</value>

<key level="2">Body</key>
<value>Don't forget the meeting!</value>

for what's usually

<note>
    <to>George</to>
    <from>John</from>
    <heading>Reminder</heading>
    <body>Don't forget the meeting!</body>
</note>
--Dine2016 (talk) 16:17, 13 June 2019 (UTC)[reply]

CFI issue

I raised a question about inclusion of hyphenated compounds on the CFI talk page here, but now I look again at that page, there seems to be surprisingly little activity, so I am mentioning it here too just in case no one ever sees it in that place. Mihia (talk) 22:03, 6 June 2019 (UTC)[reply]

Vote: Language code into reference template names

FYI, I created Wiktionary:Votes/2019-06/Language code into reference template names.

Let's postpone the vote as much as discussion needs, if at all. --Dan Polansky (talk) 08:00, 7 June 2019 (UTC)[reply]

According to the deletion message, this was deleted per WT:RFDO. But where is the deletion discussion? There's nothing on the talk page and nothing among the pages that link to it either. —Rua (mew) 14:23, 9 June 2019 (UTC)[reply]

I assume MK deleted it during a deletion spree. Who cares though, right? --I learned some phrases (talk) 14:12, 10 June 2019 (UTC)[reply]

{{top3}} in descendants section (e.g. Proto-Slavic)

(Notifying Rua, Wikitiki89, Atitarev, Benwing2, Guldrelokk, Bezimenen, Jurischroeer, Greenismean2016, Chignon):

Is it useful or useless? Compare descendants with and without {{top3}}. In my view, the advantage is that it shortens long narrow lists by filling the empty space on the right side, thus making the entry easier to read. —Игорь Тълкачь (talk) 15:20, 9 June 2019 (UTC)

I think it looks ugly, especially when the columns don't line up with the three Slavic subgroups, which is often the case. We should take into account mobile users, for which the single list is better than the columns. Also, just from a customary point of view, single lists are most definitely the norm on Wiktionary, with columns only used in a few cases. —Rua (mew) 15:21, 9 June 2019 (UTC)[reply]
From what i see: 1) It's a problem of some browsers: correct in Google Chrome, incorrect in Firefox, Internet Explorer, unknown in Opera. 2) In the mobile version columns are single. 3) It's hard to count in the Main namespace, but in the Reconstruction namespace there are ~118 cases (e.g. *ćwíšah, *xātun): ~80 (Iranian), ~14 (Turkic), 9 (Germanic), 6 (Algonquian), 5 (Semitic), ~5 (other). Anyway it's just extrapolating from other sections (e.g. Translations, Derived terms, ...). —Игорь Тълкачь (talk) 15:27, 10 June 2019 (UTC)
Correct in Opera. —Игорь Тълкачь (talk) 22:37, 11 June 2019 (UTC)[reply]
Exception (incorrect in all 4 browsers): More brokenness in {{top3}}, {{mid3}}. —Игорь Тълкачь (talk) 15:17, 16 June 2019 (UTC)[reply]
On mobile, the list reverts to a single column. It would be nice if {{mid3}} actually worked to force breaks. @Erutuon --{{victar|talk}} 02:58, 12 June 2019 (UTC)[reply]
@Victar: Columns are overridden by .derivedterms, .term-list { -moz-column-count: 1 !important; -ms-column-count: 1 !important; -webkit-column-count: 1 !important; column-count: 1 !important; } in MediaWiki:Mobile.css. User:DTLHS added that rule in these edits. I'm not confident that mobile screens can always show three columns, so would rather not make a decision in the matter. — Eru·tuon 03:06, 12 June 2019 (UTC)[reply]
Yeah, that was added at my behest. I pinged you about {{mid3}} though. --{{victar|talk}} 03:08, 12 June 2019 (UTC)

I see little response here, so now i tried to count users who added/removed {{top3}} in Proto-Slavic (the list below is incomplete):

This discussion is not new, the earliest probably was in 2015/04/15, but it didn't get any objections. 4 years have passed and now i notice that Rua (2019/04/15) started removing {{top3}}. Such actions can lead to numerous edit conflicts, because {{top3}} is used in 1000+ Proto-Slavic entries. —Игорь Тълкачь (talk) 16:43, 16 June 2019 (UTC)[reply]

@Rua, Useigor: User talk:Wikitiki89/2018 § top3, mid3, bottom in Proto-Slavic entries. I think I support Rua's proposal to remove the template, because it's broken all too often for me. (I'm Chignon) Canonicalization (talk) 11:37, 17 June 2019 (UTC)[reply]
If i'm not mistaken, it's broken since using autobalancing (2017). There are solutions: revert edits (unlikely), create another template (possibly), fix via CSS (uncertain). —Игорь Тълкачь (talk) 18:00, 19 June 2019 (UTC)[reply]
@Canonicalization: There was no proposal to remove {{top3}} from all Proto-Slavic. If you start doing so en masse, as you suggest, I will revert your edits. That said, there may be some individual cases in entries where it shouldn't be used, i.e. when only found in 1/3 or 2/3 of branches (as Useigor exampled above). --{{victar|talk}} 19:34, 19 June 2019 (UTC)[reply]
Just realized who you are now. Congrats on the 50's username change. --{{victar|talk}} 19:36, 19 June 2019 (UTC)[reply]
@Victar: Please don't put words into my mouth: I never said I was going to do anything of the sort unilaterally. Anyway, I see @Useigor has been doing some work on fixing the template (I think?). Canonicalization (talk) 20:17, 19 June 2019 (UTC)[reply]
And you the same, citing proposals when none such were made. --{{victar|talk}} 20:24, 19 June 2019 (UTC)
Possible solutions:
  • First solution gets complicated, because it requires additional code to handle tag ol. It's easier to revive the old template, because columns without breaks are half-useless. Therefore i created {{xtop}} and soon i will start replacing incorrect cases with it.—Игорь Тълкачь (talk) 21:08, 13 July 2019 (UTC)[reply]
Solution (create another template): Template:acol (suitable for lists with 2-5 top elements).—Игорь Тълкачь (talk) 14:35, 12 July 2020 (UTC)[reply]
Now {{acol}} doesn't require {{bottom}}, because it uses CSS selector "+" to modify adjacent list.—Игорь Тълкачь (talk) 18:35, 15 July 2020 (UTC)[reply]
Solution (create another template): Template:topx (for any list, split is manual e.g. with {{mid}}/anything).—Игорь Тълкачь (talk) 16:13, 12 July 2020 (UTC)[reply]
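The complaint about autobalancing is that a column break can land inside a branch. As a rough illustration of the alternative (break only between top-level branches), here is a Python sketch. It is not the code behind {{top3}}, {{acol}} or {{topx}}; the `(label, children)` tuple format and both function names are assumptions for this example only.

```python
def subtree_size(node):
    """Count the lines a nested (label, children) node occupies."""
    _label, children = node
    return 1 + sum(subtree_size(child) for child in children)


def split_into_columns(branches, columns=3):
    """Assign whole top-level branches to columns, in order.

    A branch is never split: a column break can only fall between
    two top-level branches, so each subtree stays visually intact.
    """
    sizes = [subtree_size(branch) for branch in branches]
    target = sum(sizes) / columns  # ideal number of lines per column
    result = [[] for _ in range(columns)]
    col = 0
    filled = 0  # lines placed so far across all columns
    for branch, size in zip(branches, sizes):
        # Start the next column once more than half of this branch would
        # overshoot the cumulative target, unless we are in the last column.
        overshoots = filled + size / 2 > target * (col + 1)
        if result[col] and overshoots and col < columns - 1:
            col += 1
        result[col].append(branch)
        filled += size
    return result


if __name__ == "__main__":
    east = ("East Slavic", [("Russian", []), ("Ukrainian", []), ("Belarusian", [])])
    south = ("South Slavic", [("Bulgarian", []),
                              ("Serbo-Croatian", [("Cyrillic", []), ("Latin", [])]),
                              ("Slovene", [])])
    west = ("West Slavic", [("Czech", []), ("Polish", [])])
    for column in split_into_columns([east, south, west]):
        print([label for label, _ in column])
```

With three branches of uneven length, each branch ends up in its own column, which is the layout the old table-based {{top3}} gave. When a term survives in only one or two branches, some columns stay empty, which matches the cases mentioned above where columns may not be worth using.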

I came up with a programming solution. This bit should fix things in Internet Explorer:

@media screen and (min-width:0\0) and (min-resolution: +72dpi) {
    .derivedterms ul {
        -webkit-column-break-inside: avoid;
        break-inside: avoid;
    }
}

And I created {{mid}} to override auto-breaks and manually add breaks where you want, which is semi working...

--{{victar|talk}} 05:34, 26 June 2019 (UTC)[reply]

@Erutuon, did you want to try adding the above to common.css? It might also work for Firefox (I haven't tested it) but to enable it for FF you'll need to add something like @-moz-document url-prefix() --{{victar|talk}} 18:20, 26 June 2019 (UTC)
@Victar: Adding the following would make the CSS run in Firefox, where it improves the display, but not in Chrome, where it doesn't (from an answer to the StackOverflow question about "Targeting only Firefox with CSS"):
@-moz-document url-prefix() {
    .derivedterms ul {
        break-inside: avoid;
    }
}
I tested it in my common.css, so it should also work in MediaWiki:Common.css. Since the Firefox rule triggers an annoying error in the code editor that forces you to confirm "Yes, I do want to save the page", I've put the rules in a tiny gadget, MediaWiki:Gadget-column-hacks.css, which will be hidden and enabled by default. — Eru·tuon 00:18, 18 July 2019 (UTC)[reply]
Thanks, @Erutuon! P.S. I cited @-moz-document url-prefix() above as well. =) --{{victar|talk}} 04:43, 18 July 2019 (UTC)[reply]
Oh dang, I'll try to pay more attention next time. — Eru·tuon 04:55, 18 July 2019 (UTC)[reply]

Just so that it's known, I'm fine with whatever solutions you come up with, but I still disagree with the use of columns for hierarchical data, such as descendants. A tree is not just visually a tree, but also in terms of the HTML data structure. Both should be preserved. The current {{top3}} does not preserve the tree visually and is therefore undesirable, alongside the already-mentioned issues with column division, but it does at least preserve the HTML element tree intact. {{xtop}} preserves neither the visual tree nor the HTML one, so I consider it even worse. —Rua (mew) 20:15, 17 July 2019 (UTC)[reply]

1) So your main argument is "it looks ugly"? Well, i can say that narrow list + wide unused empty space are ugly. It's normal to use columns for long lists. You behave as if columns in descendants were forbidden. 2) Tree is visually preserved when it's divided correctly. Table {{top3}} was fulfilling this task but autobalancing {{top3}} fails in some cases. —Игорь Тълкачь (talk) 20:51, 19 July 2019 (UTC)[reply]

December 2019

I am stating hereby explicitly that I have not discerned any to any degree convincing argument that this {{top3}} is disrecommendable; only that edits having just the purpose of spreading it are makruh by reason of the controversy. The HTML looks okay to me, everything is preserved, also I just have compared pages with {{top3}} and without it in a text-based browser (lynx) and it does not look worse. Particularly that statement by @Rua “A tree is not just visually a tree, but also in terms of the HTML data structure” seems sophistic. The data structure is linear and not really from top to bottom. There is no aesthetic rule that trees should not grow from left to right. Fay Freak (talk) 23:59, 12 December 2019 (UTC)[reply]

Vote

Since the matter is disputed, I have created Wiktionary:Votes/2020-07/Adding topa to all Proto-Slavic descendants sections. —Rua (mew) 11:29, 25 July 2020 (UTC)[reply]

Proposal: Make Latin the primary script for Serbo-Croatian

@Ivan Štambuk, Crom daba, Vorziblix At the moment, we duplicate a huge amount of information by having the exact same entry at both the Latin and the Cyrillic spelling. For English, we eventually relented and made colour link to color. I think the same should be done for Serbo-Croatian: the Cyrillic spelling should be defined as an alternative spelling of the Latin spelling (or "Cyrillic spelling"), and all information that is already present on the Latin page, such as etymology and pronunciation, should be removed from the Cyrillic page. Descendants and translations should be given only in Latin script, so no more cumbersome nesting. The reason I think Latin should be the primary script is that it's used in all four countries, and appears to be favoured in everyday use even in those that use both. —Rua (mew) 18:29, 11 June 2019 (UTC)[reply]

LOL, never going to happen. --{{victar|talk}} 03:09, 12 June 2019 (UTC)[reply]
It might work, it's all up to the community. We're almost there with dual re-transliterations into the other side - Roman/Cyrillic and vice versa but more needs to be done.
  1. Cyrillic to Roman converts one-to-one but there are cases when Roman to Cyrillic need to be decided
  2. All inflection tables need to display both Roman and Cyrillic.
  3. Consider using new Serbo-Croatian language-specific templates like {{sh-l}} with automated conversions, compare with {{zh-l}}, e.g. 中國／中国 (Zhōngguó), which display traditional Chinese, simplified Chinese and transliterations with only traditional Chinese 中國 in the input.
  4. Cyrillic entries shouldn't be deleted, IMO but be converted to soft-redirects.
  5. We should also address how we display translations, there's a lot language-specific templates can do what {{t+}} or {{t}} can't, e.g.:
宮崎県(みやざきけん)は九州地方(きゅうしゅうちほう)の南東部(なんとうぶ)に位置(いち)する県(けん)
Miyazaki ken wa Kyūshū chihō no nantōbu ni ichi suru ken.
Miyazaki prefecture is situated in the south-east part of the Kyūshū region.
The above Japanese example doesn't have any Roman script. I can go on talking about Chinese, Thai, Korean, Khmer templates. --Anatoli T. (обсудить/вклад) 06:14, 12 June 2019 (UTC)
I think it's a good idea. Trying to keep two separate entries for every S-C lemma and nonlemma form synchronized is absurd. To Anatoli's points:
  1. Sure, there may be times when a manual Cyrillicization needs to override the automatic one. Ought to be trivial.
  2. Agreed.
  3. Agreed.
  4. Well, duh.
  5. See point 1 above.
It feels like a lot of work, but reducing unnecessary duplication will be worth it. —Mahāgaja · talk 15:02, 12 June 2019 (UTC)[reply]
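Point 1 can be made concrete with a small Python sketch. It is only an illustration, not the code of any existing Wiktionary module: the function names and the `manual` override parameter are assumptions. Cyrillic to Latin is a plain letter-by-letter lookup, while Latin to Cyrillic has to guess whether lj, nj and dž are digraphs, so some words need a manually supplied spelling.

```python
# Serbo-Croatian Cyrillic letters and their Latin equivalents (lowercase).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}
LAT_TO_CYR = {lat: cyr for cyr, lat in CYR_TO_LAT.items()}


def cyr_to_lat(text):
    """Convert Cyrillic to Latin; every letter has exactly one equivalent."""
    out = []
    for ch in text:
        lower = ch.lower()
        lat = CYR_TO_LAT.get(lower)
        if lat is None:
            out.append(ch)  # punctuation, spaces, etc. pass through
        else:
            # An uppercase Љ becomes title-case "Lj".
            out.append(lat.capitalize() if ch != lower else lat)
    return "".join(out)


def lat_to_cyr(text, manual=None):
    """Convert Latin to Cyrillic, treating lj/nj/dž as digraphs.

    `manual` is a hypothetical override for words where that guess is
    wrong, e.g. where n and j belong to different morphemes.
    """
    if manual is not None:
        return manual
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair.lower() in LAT_TO_CYR:
            cyr = LAT_TO_CYR[pair.lower()]
            out.append(cyr.upper() if pair[0].isupper() else cyr)
            i += 2
            continue
        ch = text[i]
        cyr = LAT_TO_CYR.get(ch.lower())
        if cyr is None:
            out.append(ch)
        else:
            out.append(cyr.upper() if ch.isupper() else cyr)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(cyr_to_lat("љубав"))
    print(lat_to_cyr("konj"))
    # The automatic guess merges n + j here; the override supplies инјекција.
    print(lat_to_cyr("injekcija"))
    print(lat_to_cyr("injekcija", manual="инјекција"))
```

Under this sketch, only the words whose automatic result is wrong would need the extra parameter, so the override stays the exception rather than the rule.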
I’m inclined to agree with this proposal; making duplicate entries for every term is frustrating, and maintaining them so they stay synchronized is next to impossible. Latin script predominates even in Serbia (at least outside of official contexts). (It’s worth noting, though, that Cyrillic is also "used in all four countries", though its usage share in each has been rapidly declining over the past two centuries.) Of course, we’d still have duplication between entries for ekavian/ijekavian variants, but a two-way duplication is a decided improvement over a four-way one. — Vorziblix (talk · contribs) 15:15, 12 June 2019 (UTC)[reply]
  Support (my main concern is duplication of content, so I'd be fine too with making the Cyrillic spellings the lemmas). @Victar, are you opposed to the proposal, or do you simply think it won't garner enough support? Canonicalization (talk) 16:08, 12 June 2019 (UTC)[reply]

@Fay Freak Canonicalization (talk) 17:41, 12 June 2019 (UTC)[reply]

It would be easier to create entries. And indeed the inflection tables need more stuff done automatically. All that saves time. It does not depend on new linking templates though those are possible. Changing existing Cyrillic templates to display the new order is however critical in so far as they are already out of sync. What happens if some noob has added additional information to the Cyrillic entry that is lacking in the Latin entry? Ivan Štambuk had some machine to detect whether entries are exact mirrors. But if there isn’t the parallelism and this is detected, I am afraid, a human must clean up and move because no machine can decide.
Or what’s with some kind of gadget that could convert a Latin entry into a Serbo-Croatian one, and what with a bot that applies changes done to one side only after some time to other?
At some point I have suggested on Wiktionary already – I’d need to search where – to have some template on Cyrillic pages (or the opposite) that fetch the content or whole language section of the other page (like {{desctree}}): so the scripts look treated equally and one finds all at every page but it isn’t twice the work. This looks the best to me. Ideally one could perhaps even write only one line: instead of putting ==Serbo-Croatian== one calls a template on that line that invokes a module. For why write {{spelling of|sh|Cyrillic|čutura}} plus header plus L3 and L2 and possibly altforms and inflection there too if you can get the whole with one line and then it even displays all there? Least work for editors, greatest gain for readers because they find all at every spelling and virtually synchronously. It would be fun to expand Serbo-Croatian on Wiktionary with such an architecture. That’s the utmost concentration.
Coding is needed for every alternative. Fay Freak (talk) 23:02, 12 June 2019 (UTC)[reply]
@Fay Freak: Perhaps labeled section transclusion (LST) could be used to grab the Serbo-Croatian section instead of a Lua function. That would save Lua memory, but maybe not much work otherwise; lots of entries would have to be edited (both the source and target of transclusion) and templates would still have to be made to display the right script on each page. — Eru·tuon 23:31, 14 June 2019 (UTC)[reply]
Not to forget the table templates. Those that I have created a month ago like the playing cards one have a based |sc= parameter whereby the display switches the script according to the script code. Others like the colours template just display all scripts. The list templates have subpages for all. This must be regularized for all Serbo-Croatian tables or list templates for the module to get it. Fay Freak (talk) 23:22, 12 June 2019 (UTC)[reply]

(in response to what has been posted so far) I'm not proposing to do it all in one go, or to make huge sweeping changes to our infrastructure for SC. The proposal is just to codify our intent to convert Cyrillic spellings to alternative forms of the Latin ones, and to remove the nested structure that is currently present in descendants and translations. This could be done on a page-by-page basis, whenever someone happens to come across it. As long as we know what direction we're going to be moving in on this. I don't really see the need for special-purpose templates for SC, let alone page-copying stuff like what Fay Freak proposes. In proposing Latin as the primary script, I meant that when we link to a SC term, we link only to the Latin script form, which in turn lists the Cyrillic script for those interested. —Rua (mew) 10:21, 13 June 2019 (UTC)

But it is preferable if one can treat the display equivalently. So one can give in a Cyrillic term without needing to follow a soft-redirect just to use the dictionary like a dictionary. Linking Latin and Cyrillic forms at the same time is the least problem. What you propose is also a decrease in usability – why convert Cyrillic spellings to alternative forms of the Latin ones if they are successfully parallel at the time being? Why force people to click on Latin links if we could also link both, and even easily via {{sh-l}}? “I don't really see the need.” Fay Freak (talk) 11:57, 13 June 2019 (UTC)[reply]
The same argument could be made in opposition to the proposal entirely, because we're removing definitions and etymologies from the Cyrillic pages. They are "successfully parallel at the time being" too, after all. Why treat them equally in some respects but not in others? —Rua (mew) 12:50, 14 June 2019 (UTC)[reply]
I don’t think the entries at the time being are even successfully parallel; in the absence of Ivan Štambuk’s watchful eye a noticeable number have drifted apart, and a good many lack a Cyrillic or Latin counterpart entirely. — Vorziblix (talk · contribs) 16:20, 14 June 2019 (UTC)[reply]
But I did not talk about the whole, but about those which are parallel (why—if); in fact I warned that the conversion is manual work for those that are not equal. If a Cyrillic page is made a soft redirect this means a usability decrease. Readers want Cyrillic main entries. But we want to repeat less and we want to save attention from the synchronization. So the idea is to have the whole at the Cyrillic pages but in an automated fashion. The page only as displayed will be copied, in the source code we won’t have to do anything but insert a template which fetches the one page from the other.
Oh, I see; sorry, I misinterpreted a bit. I agree that your proposal (automated fetching) would definitely be ideal if it’s technically workable. I must admit I have no idea whether or not that’s the case. If not, though, I’d still support alt-form-style conversion of entries as preferable to the current system. — Vorziblix (talk · contribs) 23:13, 14 June 2019 (UTC)[reply]
I don’t understand what “Why treat them equally in some respects but not in others?” is supposed to mean. I have said it already: Different interests: If we treat inequally only to save work, we only have to do it to the extent in which it saves work. What I have outlined is a milder measure, and it will be more agreeable. The hard fans of Cyrillic will say: That’s a great measure, you save work but it does not look like the Cyrillic script is inferior or something. Then I am confident that in the future it will never be in question “why we have done that”. It is best if the end user does not see the problems the creator of the application had. Now the end user can choose arbitrarily which alphabet he types in and the system does no pressure to use Latin. It would be sad to sacrifice the Cyrillic script for occasional danger of asynchronicity and some saved repetitions only because we have no module to get the most out of all. And if we have, it enlightens the attitude of any editor who is pro-Cyrillic. Fetching all through a template with a module is danker than creating alternative forms. Think about the editors we possibly lose because they have a personal dislike to be subjected to such a ranking of Latin. If it is done differently, the resistance will be less. Forever – I think it is the best possible solution. Think about our marketing claims: “In Wiktionary you can type in Latin and Cyrillic and it will be equal.” Fay Freak (talk) 18:32, 14 June 2019 (UTC)[reply]

"heading" label

Examples of the "heading" label at draw#verb:

  1. (heading) To move or develop something.
    1. To sketch; depict with lines; to produce a picture with pencil, crayon, chalk, etc. on paper, cardboard, etc.
    2. To deduce or infer.
    ...
    ...
  2. (heading) To exert or experience force.
    1. (transitive) To drag, pull.
    2. (intransitive) To pull; to exert strength in drawing anything; to have force to move anything by pulling.
    ...
    ...

To me, this "heading" label seems superfluous almost to the point of being confusing. I am tempted to remove it where I see it. What do other people think? Does anyone think the label is useful? Mihia (talk) 20:02, 15 June 2019 (UTC)[reply]

I agree it is unclear. It's an attempt to group related senses together; my suggestion would be either to provide a definition that is a gloss, or a non-gloss definition, as appropriate, like this:
  1. Senses meaning to move or develop something.
    1. To sketch; depict with lines; to produce a picture with pencil, crayon, chalk, etc. on paper, cardboard, etc.
    2. To deduce or infer.
    ...
    ...
  2. To exert or experience force.
    1. (transitive) To drag, pull.
    2. (intransitive) To pull; to exert strength in drawing anything; to have force to move anything by pulling.
    ...
    ...
SGconlaw (talk) 22:26, 15 June 2019 (UTC)[reply]
AFAICR, it was an invention of @-sche intended to allow grouping of senses where there is no single definition that the contributor can think of that could stand in that location. It is necessary to make it clear to a user that what is on such a line is NOT a definition, even though it is positioned where one would expect a definition. Usually someone comes up with some more helpful label or non-gloss definition than "(heading)", along the lines Sgconlaw suggests. MWOnline and other dictionaries have groups of subsenses that do not have a sense-level definition. It is an artifact of wikiformat ("#" and "##") that we cannot duplicate their numbering scheme. DCDuring (talk) 22:38, 15 June 2019 (UTC)[reply]
It is an empirical question whether italics, even with the good wording Sgconlaw uses, are a sufficient indication that the content of the definition line should not be read as a definition. Sadly we don't have reliable means of running an experiment. DCDuring (talk) 22:45, 15 June 2019 (UTC)[reply]
I find subsenses confusing, headings or not, but the consensus seems to be that they should get used more (a while ago: Wiktionary:Beer parlour/2015/May#ELE: explicitly ban nested subdefinitions/subsenses? Or allow in rare cases?). In any case, they should at least get mentioned in WT:EL. – Jberkel 23:28, 15 June 2019 (UTC)[reply]
I am not keen on the "Senses meaning ..." suggestion. If we are going to use this format, we should just make the heading line read as a broad definition, in my opinion. Mihia (talk) 00:45, 16 June 2019 (UTC)[reply]
(@DCDuring's comment above) you are probably thinking of times I've converted entries to use subsenses :) but I always use coherent gloss or non-gloss definitions for the "super-sense"; "heading" labels are not my doing and I remove them when I see them. - -sche (discuss) 01:06, 16 June 2019 (UTC)[reply]
@Mihia: I would say give a broad definition wherever possible, but in some cases you may find that a non-gloss definition beginning with “Senses meaning […]” may be more appropriate, so I wouldn’t rule it out. For example, in some entries it seems appropriate to use NGDs like “Nautical senses” or “Senses relating to animals”. — SGconlaw (talk) 02:40, 16 June 2019 (UTC)[reply]
More extreme measures may be required for this entry. The highest level groups seem to me to be too abstract. For this word MWOnline has no more than five definitions in any of their groups of definitions, many of which have no master definition. They have nearly 50 definitions, compared to our 39. If we want to extirpate this kind of definition structuring, User:ReidAA, active from early 2013 to late 2015 did (some of?) them and used "structuring" in his edit summaries, AFAICT. DCDuring (talk) 03:32, 16 June 2019 (UTC)[reply]
I think we should usually keep the subsenses / top-level senses, and only remove "{{lb|en|heading}}". - -sche (discuss) 04:22, 16 June 2019 (UTC)[reply]
That's what we should do in the ambulance, but when we get this particular patient to the hospital, we can't just send it home. DCDuring (talk) 05:06, 16 June 2019 (UTC)[reply]

Partial blocks deployment to Wiktionary

Hello Wiktionary contributors,

The Wikimedia Foundation Anti-Harassment Tools team is continuing to make improvements to Special:Block with the addition of the ability to set a partial block.

While no functionality will change for sitewide blocks, Special:Block will change to allow for the ability to block a named user account or IP address from:

  • Editing one or more specific page(s)
  • Editing all pages within one or more namespace(s)

Additionally, changes are being made to the design of the user interface for Special:Block to enable admins to set partial blocks.

Until now, partial blocks have only been deployed on Wikipedias. Since Wikipedia administrators found partial blocks useful and there are no serious known issues or bugs, our team is planning to introduce partial blocks into more Foundation wikis. We think it is important to find any bugs that might exist for Wikisource, Wiktionary, Commons, Wikidata, etc. that might not be on Wikipedias, so we are going to deploy to a few of these wikis next week with our software developers ready to respond to any issues that may arise.

Currently it is scheduled to SWAT deploy to English Wiktionary on Monday, June 17, 2019.

Let me know if you have any questions or thoughts about introducing partial blocks on Wiktionary. For the Anti-Harassment Tools team. SPoore (WMF) (talk) 22:21, 15 June 2019 (UTC)[reply]

We always welcome useful hand-me-downs.
Why is this specifically an Anti-Harassment matter? Is the idea that we can partially implement IBANs by not letting alleged harassers post on individual user talk pages and on Wiktionary discussion space? DCDuring (talk) 23:07, 15 June 2019 (UTC)
Yes, there are times when a full site block might not address the issue as well as other editing restrictions might. One of our working hypotheses is that some users are not given a full site block because it is too harsh. So, partial block is a more targeted option. This page lists some uses.
Additionally, partial blocks are being used to block IP contributors and vandals from one or a few pages to prevent collateral damage to other good users. Also, I can share documentation with you that shows how other wikis are changing their local block policy and writing help pages about setting a partial block. SPoore (WMF) (talk) 12:41, 17 June 2019 (UTC)
Partial blocks is now deployed. Let us know if you notice any issues or have questions.
Here is a description of the use of partial blocks. Also, here is a page that the Italian Wikipedia created about partial blocks. This wiki might want to update its policies accordingly with something similar. SPoore (WMF) (talk) 20:56, 17 June 2019 (UTC)
I put something informational on WT:Blocking policy#Partial blocks. Does it need a vote? DCDuring (talk) 21:34, 17 June 2019 (UTC)[reply]
I don't think we ought to use partial blocks for limiting interaction between people; if someone is harassing someone else to the point that I would block them from editing a particular talk page, I would want to block them from editing altogether. I think there may be potential use in the (rare) cases where otherwise reasonable editors get into revert wars over the content of a particular entry; it could be used to enforce a cooling-off period. Previously we have just protected the entry. Really I don't see much value in this tool here. - TheDaveRoss 12:16, 18 June 2019 (UTC)
While it's true that we're more interested in managing access to languages rather than individual entries, it does allow us to stop certain types of edit wars at a given entry without cutting off access to non-involved parties. There might be an abuse filter or two that we won't have to employ in a few special cases. Chuck Entz (talk) 13:37, 18 June 2019 (UTC)[reply]
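For anyone trying to picture the two kinds of restriction in the announcement (specific pages, or whole namespaces), here is a rough Python sketch of the decision being made. It is only an illustration, not MediaWiki's implementation or API: the `PartialBlock` class, `namespace_of` and `may_edit` are hypothetical names, and the namespace parsing is deliberately naive (it treats anything before the first colon as a namespace).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialBlock:
    """Hypothetical model of a partial block; not MediaWiki's data model."""
    user: str
    pages: frozenset = frozenset()       # exact page titles the user may not edit
    namespaces: frozenset = frozenset()  # namespace names the user may not edit


def namespace_of(title: str) -> str:
    """Naive assumption: the text before the first colon is the namespace.

    Titles without a colon are treated as main-namespace ("").
    """
    prefix, sep, _ = title.partition(":")
    return prefix if sep else ""


def may_edit(block: PartialBlock, user: str, title: str) -> bool:
    """Return True if `user` may edit `title` under this single block."""
    if user != block.user:
        return True   # the block only concerns the named account
    if title in block.pages:
        return False  # blocked from this specific page
    return namespace_of(title) not in block.namespaces


block = PartialBlock(
    user="ExampleUser",
    pages=frozenset({"draw"}),
    namespaces=frozenset({"User talk"}),
)
print(may_edit(block, "ExampleUser", "draw"))               # blocked page
print(may_edit(block, "ExampleUser", "User talk:Someone"))  # blocked namespace
print(may_edit(block, "ExampleUser", "water"))              # everything else
```

The point of the sketch is that the same account stays free to edit everything outside the listed pages and namespaces, which is what distinguishes this from a sitewide block or from protecting the entry for everyone.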

Languages that "use English"

 
Wikipedia has articles on: hak:Mauritius and nan:Mauritius

These two links seem to imply something about the nature of Hakka and Min Nan dialects. They are using the "English" spelling as the name of their page for that nation in their Wikipedias. So IS 'Mauritius' a Hakka word? Is it a Min Nan word? If not, is there ANY place on this website where we would link to hak:Mauritius and nan:Mauritius?

--Geographyinitiative (talk) 04:43, 18 June 2019 (UTC)[reply]

You're making the mistake of assuming that a Wikipedia in a language is necessarily an accurate reflection of that language. Chinese is a macrolanguage, which means that the dominant lect tends to be used for many topics rather than the people's native lects. In languages such as these without an extensive corpus of writings in every possible subject, it's often impossible to find an authentic native word for everything that requires an article- so Wikipedia editors tend to make stuff up or borrow it from other languages. Of course, that's not unlike the kind of borrowing that happens at some time in the history of every language, but in the case of Wikipedias, the words tend not to be used by actual speakers who aren't writing Wikipedia articles. Not only that, but sometimes authentic words do exist that Wikipedia editors don't know about- so you have made-up words taking the place of real ones. I can't tell you how many times we've had to revert people who add bad translations in languages they don't know, "borrowed" from wikipedias in those languages. Chuck Entz (talk) 05:57, 18 June 2019 (UTC)
That's a pain of many languages but it also reflects the lack of language policies, especially when there is no such thing with mostly spoken dialects. Even Vietnamese, which has a rather peculiar situation with foreign place names, has a native word for Mauritius, it's Mô-ri-xơ, which we want in the dictionary, even if they often "borrow" the English name for country names, e.g. "Mauritius" (which will still be pronounced "Mô-ri-xơ") and many others. I think it's best not to add the "borrowed" spelling. 毛里求斯 (Máolǐqiúsī) has the Min Nan form, even if Min Nan Wikipedia uses "Mauritius". --Anatoli T. (обсудить/вклад) 06:17, 18 June 2019 (UTC)[reply]
Why is it better to misrepresent the language as it is actually used by ignoring such forms as Mauritius? Should we delete the English entry of Côte d'Ivoire? I'm not a fan of actually using it, but it certainly is used in English.--Prosfilaes (talk) 16:59, 18 June 2019 (UTC)[reply]
Because, eg Hakka dialect may not have an established/approved/standard, etc. name for a small country like Mauritius but they can still have an article about it. It’s not exactly a borrowing but a missing term in a language or a dialect (or editors don’t know the word or don’t care as in the case of Min Nan or Vietnamese). Anatoli T. (обсудить/вклад) 21:48, 18 June 2019 (UTC)[reply]
Wikipedia is not a good source under CFI, but this seems to be an evasion. A language using a word from another language for a missing term (or term that's not known to the speakers) is exactly a borrowing. Côte d'Ivoire is not a good English word, with "ô" and "d'", but we record it because it is used.--Prosfilaes (talk) 05:15, 20 June 2019 (UTC)[reply]
As to the question of whether there's anywhere we would link to such pages: if there's anywhere we'd link to the Hakka or Min Nan Wikipedia article on a country if its name were spelled in Chinese characters (such as: I see we add such links to 中國, so I guess we'd add them to 毛里求斯), then I guess for this country the target of our links would be that Latin-character string, since that's where those Wikipedias put their entries on that country... even if we decided we should "alias" them like [[w:nan:Mauritius|毛里求斯]] (or to link to nan:毛里求斯, if that entry existed as a redirect to the entry where the content is)... - -sche (discuss) 16:46, 18 June 2019 (UTC)
@-sche: In the case of Hakka Wikipedia, linking to "Chûng-koet" is appropriate because Hakka Wikipedia is written mostly in Pha̍k-fa-sṳ (PFS) and the PFS transliteration of 中國/中国 (Zhōngguó) is "Chûng-koet", but "Mauritius" is not a transliteration of 毛里求斯 (Máolǐqiúsī) in any Chinese lect, nor is it a loanword. --Anatoli T. (обсудить/вклад) 04:34, 19 June 2019 (UTC)
If Hakka is written in PFS, then PFS is no longer a transliteration; it is a script, and writings in it should be recorded as such, no matter what other scripts might show.--Prosfilaes (talk) 05:15, 20 June 2019 (UTC)[reply]
The Bible along with psalms has been translated into PFS along with Chinese characters for Hakka. Wikipedia dialect editors like to write their articles in PFS and make up new words but we go by dictionaries and our CFI and I don't think "Mauritius" will be attested in text written in the Hakka dialect. --Anatoli T. (обсудить/вклад) 05:56, 20 June 2019 (UTC)[reply]

Species names - sum of parts?

What lexicographic information do binomial specific names have that isn't in their two parts? DTLHS (talk) 01:45, 19 June 2019 (UTC)[reply]

Does penelope have a meaning apart from Anas so that one can know what kind of dabbling duck Anas penelope is simply by knowing both the generic and the species name? It seems to me that species' names have no lexical value on their own. They have to be used in tandem with generic names to mean anything. For instance, townsendii does not function as an adjective describing Scapanus or Microtus. It doesn't actually tell me anything more specific about the vole or mole than Scapanus or Microtus do, unless I already know something about Townsend's vole/mole. Andrew Sheedy (talk) 02:17, 19 June 2019 (UTC)[reply]
Indeed, this is true to the extent that any taxonomic names are lexicographically relevant (some authorities would say they are not; we choose to include them). —Μετάknowledgediscuss/deeds 02:21, 19 June 2019 (UTC)[reply]
True. I don't see them as being vastly different than common names, however, and I would say that those are about as inclusion-worthy as fried egg. Andrew Sheedy (talk) 02:32, 19 June 2019 (UTC)[reply]
We have chosen not to include the proper names of individuals with rare exceptions. We, like some other standard 'unabridged' dictionaries, have chosen to include these proper names of taxonomic entities. We also have such entries as Fermat's little theorem, Fermat's Last Theorem. DCDuring (talk) 02:40, 19 June 2019 (UTC)[reply]
  • I do often wonder why DCDuring spends so much time making species names, and would like to tell him it is pretty stupid and that he should stop, but he argues much better than me and I'd probably get blocked again. --I learned some phrases (talk) 22:17, 23 June 2019 (UTC)[reply]
    Thanks for the compliment buried in your comment. Taxonomic entries are part of many languages (hence Translingual). Species names are useful to clarify what vernacular names in various languages and regions are actually referring to. Taxonomic entries are good places to have things like images and links to specialized external sources. DCDuring (talk) 22:57, 23 June 2019 (UTC)[reply]
Species names, like any proper noun, are not SOP- because they refer to specific entities. Sometimes they're descriptive enough to distinguish the entity from all others: for instance, Aristolochia californica is the only species of Aristolochia native to California, and it's not native anywhere else. Mostly, though, you can't identify a species from the literal meaning of the species name, alone: sometimes the name is inaccurate- Simmondsia chinensis is native to the southwestern US, not China- and sometimes the description isn't unique to the one species. For instance, there are a number of species of white water lily, but only one Nymphaea alba. Then there are species that are named after someone or something that has nothing to do with that species, and those that are completely arbitrary. If you need more proof, consider: I can take a species with the specific epithet "minutiflora" because it has tiny flowers and develop a cultivar with huge flowers, but that doesn't change its specific epithet to "grandiflora". A favorite example is Eriogonum inflatum, which gets its name from its odd-looking swollen, hollow stems. Someone published a description for Eriogonum inflatum var. deflatum based on specimens without that characteristic, but, sadly, the variety is apparently not taxonomically valid. Chuck Entz (talk) 03:49, 24 June 2019 (UTC)[reply]

Hyphenation

Discussion moved from Wiktionary talk:English entry guidelines#Hyphenation.

What are the guidelines regarding hyphenation data? I've just come across the one in cromulent which seems phonetic rather than orthographic --Backinstadiums (talk) 10:36, 21 June 2019 (UTC)[reply]

I don't think we have guidelines at the moment. I prefer hyphenation that is based on the etymology of the word rather than how it is pronounced, where this is feasible. I suggest raising the issue at the Beer Parlour for general discussion. — SGconlaw (talk) 11:12, 21 June 2019 (UTC)[reply]
@Sgconlaw: I do not know how to move this post; can you do it? --Backinstadiums (talk) 14:51, 21 June 2019 (UTC)[reply]
@Backinstadiums:   Done. — SGconlaw (talk) 17:53, 21 June 2019 (UTC)[reply]
Hyphenation should probably be based on references, or perhaps we could look for real-world examples. See also Wiktionary:Tea room/2019/April#Hyphenation_at_supercalifragilisticexpialidocious. - -sche (discuss) 18:20, 21 June 2019 (UTC)[reply]
Surely we can find sources with general rules for hyphenation rather than looking for specific hyphenated examples for each word. DTLHS (talk) 18:26, 21 June 2019 (UTC)
I looked at the hyphenation of words of the form “XVCulent” in OneLook dictionaries. Most of the time it is like XVC·u·lent: crap·u·lent; fec·u·lent; flat·u·lent; flor·u·lent; muc·u·lent; op·u·lent; poc·u·lent; strid·u·lent; tem·u·lent; vir·u·lent. But not always: frau·du·lent; lu·cu·lent; lu·tu·lent; pu·ru·lent; ro·ru·lent. In one case I saw disagreement: while the American Heritage has truc·u·lent, Merriam–Webster has tru·cu·lent.
I see no clear pattern. Etymology is clearly not a guiding principle here, otherwise we’d see, e.g., luc·u·lent and pur·u·lent.  --Lambiam 23:25, 21 June 2019 (UTC)[reply]
Note that we have "Wiktionary:Pronunciation#Hyphenation", which states: "British hyphenation more often considers word etymologies, whereas American English hyphenation more often follows syllabification". So far I've generally been hyphenating on the basis of etymology (unless the etymology is unclear or, for some reason, impractical to follow), with the caveat that a word should not be hyphenated in such a way as to leave a single letter at the start or end of a line (so, per·se·cut·ion rather than per·se·cu·tion, and not *e·squa·mul·ose – esqua·mul·ose to be used instead). I suppose, if there is consensus, we could provide both etymology-based and syllable-based hyphenation as alternatives. — SGconlaw (talk) 05:51, 22 June 2019 (UTC)[reply]
There really don't seem to be clear-cut rules for where to hyphenate in English, and as noted above there are discrepancies between en-GB and en-US, and sometimes even between dictionaries of the same national variety. The rule I learned (phonologically based for en-US) is generally to hyphenate after vowels, except that a stressed checked vowel should be followed by a consonant. That rule would explain crap·u·lent, fec·u·lent, flat·u·lent, muc·u·lent, op·u·lent, poc·u·lent, strid·u·lent, tem·u·lent, vir·u·lent as well as frau·du·lent since /ɔ(ː)/ is a free vowel, not a checked one. (I don't know how to pronounce luculent and lutulent, and intervocalic r is tricky in American English since we've lost most contrasts between checked and free vowels in that context.) At any rate, my personal intuition is for crom·u·lent. —Mahāgaja · talk 06:12, 22 June 2019 (UTC)[reply]
Perhaps we should consider whether this is worthwhile information to provide at all, given how the rules do not appear to be consistent from reference to reference (even within dictionaries of one dialect), and are surely inconsistent in real-world usage, which we would theoretically privilege, being descriptivist... - -sche (discuss) 17:45, 26 June 2019 (UTC)[reply]
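One part of this that can be stated mechanically is SGconlaw's caveat that a break should never leave a single letter at the start or end of a line. Below is a small Python sketch of just that caveat, nothing more: it takes a hyphenation written with "·" and merges a lone first or last letter into its neighbour. The function name is made up for illustration, and it says nothing about where the other breaks should go (etymology-based or syllable-based).

```python
SEP = "·"


def merge_single_letter_edges(hyphenation: str) -> str:
    """Drop break points that would strand one letter at either edge of the word."""
    parts = hyphenation.split(SEP)
    # A lone first letter is joined to the segment that follows it.
    if len(parts) > 1 and len(parts[0]) == 1:
        parts[1] = parts[0] + parts[1]
        parts = parts[1:]
    # A lone final letter is joined to the segment that precedes it.
    if len(parts) > 1 and len(parts[-1]) == 1:
        parts[-2] = parts[-2] + parts[-1]
        parts = parts[:-1]
    return SEP.join(parts)


print(merge_single_letter_edges("e·squa·mul·ose"))  # esqua·mul·ose
print(merge_single_letter_edges("per·se·cut·ion"))  # per·se·cut·ion
print(merge_single_letter_edges("crap·u·lent"))     # crap·u·lent
```

Note that an interior single-letter segment, as in crap·u·lent, is left alone, since neither break there strands one letter on a line by itself.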

Pitch in to help with FWOTD.

There are consistently not enough Foreign Word of the Day nominations ready for me to set them far in advance, but I will have less time to dedicate to Wiktionary in the coming months, and I often need other editors' help when it comes to languages I'm not comfortable with. I don't want to annoy people too much, so if you're willing for me to ping you with various requests related to the languages you know, please add your username at User:Metaknowledge/FWOTD help. —Μετάknowledgediscuss/deeds 22:28, 21 June 2019 (UTC)[reply]

Pinyin conventions

User:Geographyinitiative has been insisting on being inclusive in terms of different Pinyin conventions (as presented in various dictionaries and other sources), including but not limited to capitalization and hyphenation. However, having both pǔtōnghuà shuǐpíng cèshì and Pǔtōnghuà Shuǐpíng Cèshì at 普通話水平測試 or both huàshétiānzú and huàshé-tiānzú at 畫蛇添足 just looks unprofessional and confusing. I see a need to formulate a set of guidelines on Pinyin to ensure consistency across entries. (See User talk:Justinrleung#perspective on capitalization for the latest discussion on this.) — justin(r)leung (t...) | c=› } 03:59, 23 June 2019 (UTC)

I'm sorry! If it's too inconvenient forget it! --Geographyinitiative (talk) 04:01, 23 June 2019 (UTC)[reply]
@Geographyinitiative: At Wiktionary_talk:About_Chinese#Capitalisation_of_demonyms_and_language_names_-_a_mini-vote and elsewhere you expressed the view that ALL possible pinyin variations (pinyin, space, capitalisations, numbering), etc. should be included, which I find disturbing and unsustainable. User:Suzukaze-c seems to back you up. Do you still hold this view? --Anatoli T. (обсудить/вклад) 05:17, 24 June 2019 (UTC)[reply]
As for the question for standardisation, even if pinyin is just a tool and not a writing system, we need to set standards and conventions and stick to them. Adding hard-redirects to the agreed version is fine, IMO but the exposed/displayed pinyin should be consistent. It's impossible to include all possible (even attested) romanisations. --Anatoli T. (обсудить/вклад) 05:21, 24 June 2019 (UTC)[reply]
Yeah- everything. It's disturbing to me because it's the reality, and it's not in the dictionary. --Geographyinitiative (talk) 09:45, 24 June 2019 (UTC)[reply]
“It is a damn poor mind that can think of only one way to spell a word.” ― Andrew Jackson. Same with Chinese romanizations. Let all the forms that can be included be included (with appropriate notation telling us what the differences mean). It is a kind of far-fetched long term goal. No standard is better than the standard of "everything". --Geographyinitiative (talk) 10:09, 24 June 2019 (UTC)[reply]
@Geographyinitiative: You will have to learn to work cooperatively and stop pushing your point of view in actual edits when it's controversial and other Chinese editors disagree with you and you engage in edit-warring. Dictionaries don't work like this, with all possible standards and variations included. I already had to protect the page 吃飽/吃饱 (chībǎo) from your edits. I don't want to have to block you and I don't have time to read your endless ranting. --Anatoli T. (обсудить/вклад) 12:27, 24 June 2019 (UTC)
@Atitarev I don't know about dictionaries, but we do work like this. If there's a spelling that's attestably used, we do cite it, no matter what the standard or variation.--Prosfilaes (talk) 03:01, 28 June 2019 (UTC)[reply]
Pinyin isn't written Chinese. It's a way to transcribe written Chinese for people who don't know the characters or who don't know a particular reading. It also allows people to search for the character entries if they know the pronunciation. For the latter use, having both an uppercase and a lower case entry means that people only find the case form they search on. That means that you either have to make one case form a redirect to the other, or you have to make absolutely sure that when a character spelling gets added to one case form, it also gets added to the other- good luck with that. Capitalization of Pinyin is a matter of style, not of substance, so it's silly to get all wrapped up in it- I somehow doubt that you'll find separate entries for both uppercase and lowercase pinyin spellings in the same dictionary. Chuck Entz (talk) 14:00, 24 June 2019 (UTC)[reply]
Chinese written in Pinyin isn't written Chinese? That's transparently false. Transcription is writing.--Prosfilaes (talk) 03:01, 28 June 2019 (UTC)[reply]
How wonderful it would be for you if everybody stopped using their squigglies and used Latin letters instead! No, the transcription is NOT writing and any dictionary can choose the transcription/transliteration, even if it's based on an existing standard. --Anatoli T. (обсудить/вклад) 03:13, 28 June 2019 (UTC)[reply]

MANDARIN CHINESE PINYIN: PRONUNCIATION, ORTHOGRAPHY AND TONE by Sunny Ifeanyi Odinye --Backinstadiums (talk) 15:02, 24 June 2019 (UTC)[reply]

"Pinyin isn't written Chinese" Irrelevant, but okay fine. I have no opinion on the issue, nor do I need one.
"It's a way to transcribe written Chinese for people who don't know the characters or who don't know a particular reading." Sure, it has that function.
"For the latter use, having both an uppercase and a lower case entry means that people only find the case form they search on." Why? Add a 'see also' to the top of the page and problem solved.
"That means that you either have to make one case form a redirect to the other, or you have to make absolutely sure that when a character spelling gets added to one case form, it also gets added to the other- good luck with that." It would take a lot of work and there would be a lot of mistakes involved. That is the nature of all human activity.
"Capitalization of Pinyin is a matter of style, not of substance, so it's silly to get all wrapped up in it" I know you believe that. The bald assertion of it does not prove it to be accurate.
"I somehow doubt that you'll find separate entries for both uppercase and lowercase pinyin spellings in the same dictionary" Correct, because the other dictionaries are trying to bend the readers to their view of Hanyu Pinyin rather than be a dictionary like Wiktionary. --Geographyinitiative (talk) 21:11, 24 June 2019 (UTC)[reply]
Let us have greater respect for the variant and historical forms of Hanyu Pinyin. --Geographyinitiative (talk) 21:37, 24 June 2019 (UTC)[reply]
Let's follow agreed conventions and styles, let's focus on the language itself, not the tools to transliterate it. Let us not act unilaterally and let's stop shouting. If you can formulate votes, make a vote but don't force us to protect pages from your controversial edits or block you. --Anatoli T. (обсудить/вклад) 11:44, 26 June 2019 (UTC)[reply]
I find this a very problematic post. Geographyinitiative is not shouting; there's no uppercase there. If you mean something else, well, it's not clear what you mean. Making unclear complaints about their argument style and threatening to block someone is not consistent with discussion about the issue under hand.--Prosfilaes (talk) 03:07, 28 June 2019 (UTC)[reply]
You can discuss away, as long as you don't engage in controversial edits people have been opposing and it doesn't match the existing policies and practices, as long as you don't edit-war. He has been shouting. He knows what we are talking about. --Anatoli T. (обсудить/вклад) 03:13, 28 June 2019 (UTC)[reply]
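On the practical side of the search problem raised above (people only finding the case form they typed), here is a minimal Python sketch of a lookup key that treats capitalization, spacing and hyphenation variants of a Pinyin string as the same thing while keeping tone marks distinct. The function name `pinyin_key` is an assumption made for illustration; this is not how Wiktionary's search or any existing module works, and it does not settle which form should be displayed.

```python
import unicodedata


def pinyin_key(text: str) -> str:
    """Fold case and drop spaces and hyphens; tone marks are preserved."""
    folded = unicodedata.normalize("NFC", text.casefold())
    return "".join(ch for ch in folded if ch not in " -")


variants = [
    "pǔtōnghuà shuǐpíng cèshì",
    "Pǔtōnghuà Shuǐpíng Cèshì",
    "huàshétiānzú",
    "huàshé-tiānzú",
]

# Group the variants by key: each group could point at one agreed display form.
groups = {}
for v in variants:
    groups.setdefault(pinyin_key(v), []).append(v)

for key, members in groups.items():
    print(key, members)
```

With a key like this, the variants could be hard redirects to whichever form is agreed on, so the displayed Pinyin stays consistent while any typed variant still leads somewhere.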

Mysterious messages on Facebook

Some years ago (in April 2013), I created Facebook pages for Wiktionary and Wikisource, just for fun. Several people are co-administrators of these pages and can post messages. (But most often, nothing is posted.) The page for Wiktionary has 1293 followers and the one for Wikisource has 2928. These are not impressive numbers, but all is nice and good. Lately, however, an increasing number of people send personal messages to the Wiktionary page containing a single word. Apparently, they believe this is some look-up service. Who gave them that idea? How can we make it stop? Another co-admin and I have started to ask the people who send such messages, but we have received no useful responses so far. One person mentioned "a messenger app", but could not be more specific than that. --LA2 (talk) 16:28, 25 June 2019 (UTC)[reply]

You could make it stop by deleting the page. I object to having any official or semi-official presence on Facebook. DTLHS (talk) 16:44, 25 June 2019 (UTC)[reply]
In the past couple of years, we've gotten tons of accidental bad edits from mobile-network IPs in India, Pakistan, and some other countries. I think there's some kind of dictionary app that sends people to Wiktionary when they search for a word, and many people in these countries don't have the English skills to understand that they're at a third-party website and not in something internal to the app, or at some kind of service that comes with their mobile account. We have abuse filters that stop edits with nothing but x's (probably kids looking for porn) and page creations that are too short to be actual content. We also get a lot of bogus new-user-pages where people post social-media-style profiles as if they're on Facebook or something. Is there any kind of link to the Facebook page anywhere on Wiktionary? If so, that may be how they're getting there. Chuck Entz (talk) 02:41, 26 June 2019 (UTC)[reply]
There’s a link to the Facebook page in this thread, but perhaps you mean the other way around. The About page on Facebook has https://wiktionary.org, which redirects to https://www.wiktionary.org. The first link on that landing page is to the English Wiktionary.  --Lambiam 09:06, 26 June 2019 (UTC)[reply]
Right, better delete the Facebook presence for your and Wiktionary’s benefit. It is diametrically opposed to the GPL spirit, and any efforts put into Facebook are wasted. It only supports enslavement by the algorithm instead of responsible use of information. Fay Freak (talk) 14:45, 27 June 2019 (UTC)[reply]
Maybe deactivate the account instead of deleting the page, to prevent the "wiktionary.org" name from being taken by somebody else. – Jberkel 09:46, 29 June 2019 (UTC)[reply]
It would seem unnecessary to delete the page. Keep it going. BTW, Wonderfool used to control a Twitter account to promote the use of Wiktionary. It was very stimulating --I learned some phrases (talk) 06:59, 30 June 2019 (UTC)[reply]
I don't think FB has much in the way of anti-spam features. (They're happy to hide legit posts they don't like, though.) Maybe you could make some bot or API thing to delete single-word posts from your group, if they allow such things. Equinox 18:14, 30 June 2019 (UTC)[reply]

Kazakh transliteration update

I think we can now update to the latest Kazakh romanisation (2018) in Module:kk-translit and WT:KK TR. Calling on everyone involved so far @Vtgnoq7238rmqco, Metaknowledge, Rua. --Anatoli T. (обсудить/вклад) 01:19, 26 June 2019 (UTC)

I disagree. Our romanisation need not match the schemes used in Kazakhstan, which are 1) intended as a primary script rather than as a romanisation, 2) in a state of flux and poorly clarified, and 3) more flawed than the romanisations we use for Cyrillic-script languages. Vtgnoq7238rmqco, who actually works on Kazakh, commented elsewhere that they think this as well. —Μετάknowledgediscuss/deeds 04:42, 28 June 2019 (UTC)[reply]
@Metaknowledge: User:Vtgnoq7238rmqco disagreed because of some issues with the new script. Kazakhstan doesn't use either the old or the new romanisation, it's just a... well romanisation. One of the schemes was occasionally used in Kazakhstan and the new one is likely to be promoted and used and our users may want to be more familiar with it. --Anatoli T. (обсудить/вклад) 05:28, 28 June 2019 (UTC)[reply]
Those are just my points 2 and 3 in the diff by Vtgnoq7238rmqco you linked to. —Μετάknowledgediscuss/deeds 01:04, 29 June 2019 (UTC)[reply]

Requesting a way to automatically transclude a header on Reconstruction pages

I've started the task T226846 on Phabricator requesting a way to automatically add the notice at the top of every Reconstruction page, so we don't have to manually add {{reconstruction}}. This would probably involve Extension:PageNotice. Previous discussions on this topic: Wiktionary:Beer parlour/2017/September#Proposal: install mw:Extension:PageNotice, Wiktionary:Grease pit/2017/June#Citations at citations, Wiktionary:Grease pit/2018/September#{{reconstruction}}. — Eru·tuon 18:21, 28 June 2019 (UTC)[reply]

Additional Form of Romanization of Mandarin Chinese

In 2002, the government of the Republic of China (Taiwan) approved the usage of the so-called "Tongyong Pinyin" system. Until at least 2008, the system was used throughout the island of Taiwan. I was taught low/mid-level Mandarin Chinese with a book that used Tongyong Pinyin and Bopomofo. To the vast majority of Chinese people in mainland China and around the world, the Tongyong Pinyin romanization system is meaningless and obsolete. But to the people in southwestern Taiwan, this system can still be used in some contexts. I have seen Tongyong Pinyin on printed documents at the Immigration office in Taipei (northern Taiwan). Tongyong Pinyin is much more commonly used than Gwoyeu Romatzyh in my experience. There's no reason to ignore Gwoyeu Romatzyh or Tongyong Pinyin. For these reasons, I would like to find out how to add Tongyong Pinyin to the zh-pron box under Mandarin. If you can show me how to do it, I will do it myself. I can't claim to be very familiar with the system, but I have a reliable source for matching Hanyu Pinyin syllables to Tongyong Pinyin syllables: http://www.pinyin.info/romanization/tongyong/basic.html. If you don't trust me to do it, then I would ask you to add it yourself. Minority and historical perspectives should be included in an appropriate way. Any help would be appreciated. --Geographyinitiative (talk) 04:51, 29 June 2019 (UTC)[reply]

@Geographyinitiative: I'd support including Tongyong Pinyin. I've also had textbooks from my Chinese school that show Tongyong Pinyin alongside Zhuyin (and if I remember correctly, Hanyu Pinyin). This should be automated, though, so it will require some fiddling around with code. We can add this to our list of tasks. — justin(r)leung (t...) | c=› } 05:55, 29 June 2019 (UTC)[reply]
@Justinrleung: Can it be added only to monosyllabic entries and only in expanded mode, just like Wade-Giles? We had discussions about overcrowding of romanisations. Only the mainstream Hanyu Pinyin and Zhuyin should show by default. Gwoyeu Romatzyh is even less popular and known than Wade-Giles, so GR should also be hidden by default. Anatoli T. (обсудить/вклад) 07:03, 29 June 2019 (UTC)[reply]
@Atitarev: I would definitely have it only in expanded mode. That said, I don't think it'd be overcrowded to show them all, even in multisyllabic entries. It'd be nice to have Wade-Giles and Tongyong Pinyin in all entries. — justin(r)leung (t...) | c=› } 07:09, 29 June 2019 (UTC)[reply]
@Justinrleung: Many romanisations are so limited in their usage - in terms of time and territory. Anyway, if you add TP, please add WG as well. Anatoli T. (обсудить/вклад) 07:15, 29 June 2019 (UTC)[reply]
@Justinrleung: Yes, and thank you. I will add it there. But if you have a moment, I would like to do this immediately- it seems like a job of mindless copy-pasting for an idiot that I could do no problem. What page could I go to to add this? --Geographyinitiative (talk) 06:18, 29 June 2019 (UTC)[reply]
@Geographyinitiative: Sorry to crush your dreams of mindless copy-pasting, but ideally it should be automatically generated given the Hanyu Pinyin, just like Gwoyeu Romatzyh is automatically generated now. — justin(r)leung (t...) | c=› } 06:23, 29 June 2019 (UTC)[reply]
How can I do that? --Geographyinitiative (talk) 06:34, 29 June 2019 (UTC)[reply]
@Geographyinitiative: I don't think you know how to. There needs to be some code added to MOD:cmn-pron to make this happen. — justin(r)leung (t...) | c=› } 06:43, 29 June 2019 (UTC)[reply]
You're correct, I don't know how to do it. I will try to figure it out. --Geographyinitiative (talk) 07:01, 29 June 2019 (UTC)[reply]
Working on it. This is what I have so far:

function export.py_tongyong(text)
    -- Hanyu Pinyin initial -> Tongyong Pinyin initial ('' is the zero initial)
    local tongyong_initial = {
        ['b'] = 'b', ['p'] = 'p', ['m'] = 'm', ['f'] = 'f',
        ['d'] = 'd', ['t'] = 't', ['n'] = 'n', ['l'] = 'l',
        ['g'] = 'g', ['k'] = 'k', ['h'] = 'h',
        ['j'] = 'j', ['q'] = 'c', ['x'] = 's',
        ['z'] = 'z', ['c'] = 'c', ['s'] = 's', ['r'] = 'r',
        ['zh'] = 'jh', ['ch'] = 'ch', ['sh'] = 'sh',
        [''] = ''
    }

    -- Hanyu Pinyin final -> Tongyong Pinyin final, longest finals first
    local tongyong_final = {
        ['yuan'] = 'yuan', ['iang'] = 'iang', ['yang'] = 'yang', ['uang'] = 'uang', ['wang'] = 'wang', ['ying'] = 'ying', ['weng'] = 'wong', ['iong'] = 'yong', ['yong'] = 'yong',
        ['uai'] = 'uai', ['wai'] = 'wai', ['yai'] = 'yai', ['iao'] = 'iao', ['yao'] = 'yao', ['ian'] = 'ian', ['yan'] = 'yan', ['uan'] = 'uan', ['wan'] = 'wan', ['üan'] = 'yuan', ['ang'] = 'ang', ['yue'] = 'yue', ['wei'] = 'wei', ['you'] = 'you', ['yin'] = 'yin', ['wen'] = 'wun', ['yun'] = 'yun', ['eng'] = 'eng', ['ing'] = 'ing', ['ong'] = 'ong',
        ['yo'] = 'yo', ['ia'] = 'ia', ['ya'] = 'ya', ['ua'] = 'ua', ['wa'] = 'wa', ['ai'] = 'ai', ['ao'] = 'ao', ['an'] = 'an', ['ie'] = 'ie', ['ye'] = 'ye', ['uo'] = 'uo', ['wo'] = 'wo', ['ue'] = 'yue', ['üe'] = 'yue', ['ei'] = 'ei', ['ui'] = 'uei', ['ou'] = 'ou', ['iu'] = 'iou', ['en'] = 'en', ['in'] = 'in', ['un'] = 'un', ['ün'] = 'yun', ['yi'] = 'yi', ['wu'] = 'wu', ['yu'] = 'yu',
        ['a'] = 'a', ['e'] = 'e', ['o'] = 'o', ['i'] = 'i', ['u'] = 'u', ['ü'] = 'yu', ['ê'] = 'e', [''] = 'ih'
    }

    -- Erhua suffix
    local tongyong_er = {
        ['r'] = 'r', [''] = ''
    }

    -- Tone number -> tone mark (first tone is unmarked)
    local tongyong_tone = {
        ['1'] = '', ['2'] = 'ˊ', ['3'] = 'ˇ', ['4'] = 'ˋ', ['5'] = '˙', ['0'] = '˙'
    }
end

--Geographyinitiative (talk) 03:23, 30 June 2019 (UTC)[reply]

@Geographyinitiative: You should probably put it in a module (something like MOD:User:Geographyinitiative/tongyong) for your own testing (and link us to it to avoid clutter on this discussion). I'm not sure if this is the best way to handle the conversion, though, since Tongyong Pinyin is so similar to Hanyu Pinyin. I'll take a crack at this later. — justin(r)leung (t...) | c=› } 03:33, 30 June 2019 (UTC)[reply]
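Because the two systems share most spellings, the conversion could probably be done with a handful of rewrite rules rather than full lookup tables. The following is only a minimal sketch of that idea, not code from MOD:cmn-pron: the function name hanyu_to_tongyong and the apical table are hypothetical, the rule list reflects my understanding of the usual Hanyu-to-Tongyong differences (including feng → fong, which the draft tables above do not cover), and it should be checked against a syllable chart before use. It assumes one lowercase syllable at a time, with an optional trailing tone digit that is passed through unchanged.

```lua
-- Hypothetical helper, not part of MOD:cmn-pron.
-- Converts one lowercase Hanyu Pinyin syllable (optional trailing tone
-- digit) to Tongyong Pinyin spelling. Tone marks are NOT generated here.

-- Syllables whose "i" is the apical vowel, written "ih" in Tongyong Pinyin.
local apical = {
    zhi = true, chi = true, shi = true, ri = true,
    zi = true, ci = true, si = true,
}

local function hanyu_to_tongyong(syllable)
    -- Split off the tone digit so the rules only see the letters.
    local body, tone = syllable:match('^(.-)(%d?)$')

    -- Must run before q -> c, otherwise Hanyu "qi" and "ci" would collide.
    if apical[body] then
        body = body .. 'h'
    end

    -- ü is spelled "yu"; after j, q, x Hanyu Pinyin writes it as plain "u".
    body = body:gsub('^([jqx])u', '%1yu')
    body = body:gsub('ü', 'yu')
    body = body:gsub('^([jqx])iong$', '%1yong')

    -- Initials that differ between the two systems.
    body = body:gsub('^q', 'c')
    body = body:gsub('^x', 's')
    body = body:gsub('^zh', 'jh')

    -- Finals that differ between the two systems.
    if body == 'wen' then
        body = 'wun'
    end
    body = body:gsub('^([fw])eng$', '%1ong')
    body = body:gsub('iu$', 'iou')
    body = body:gsub('ui$', 'uei')

    return body .. tone
end
```

With these rules, hanyu_to_tongyong('zhong1') gives 'jhong1' and hanyu_to_tongyong('xue2') gives 'syue2'. Turning the tone digit into a tone mark, and handling erhua or multi-syllable input, are left out of this sketch.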