Wiktionary:Grease pit/2020/July

r:obastan -> R:obastan

Can someone with a bot please replace all instances of * {{r:obastan}} with * {{R:obastan}}? Allahverdi Verdizade (talk) 15:33, 1 July 2020 (UTC)

  • That template doesn't exist; you mean {{R:az:Obastan}}. But since there's a redirect from {{r:obastan}} to {{R:az:Obastan}}, why not leave them alone and allow the redirect to do its job? —Mahāgaja · talk 16:06, 1 July 2020 (UTC)
Sure, why not. Allahverdi Verdizade (talk) 16:16, 1 July 2020 (UTC)

Software change

The mw:New requirements for user signatures will begin on Monday, 6 July 2020. This is a change to MediaWiki software that will prevent editors from accidentally setting certain types of custom signatures, such as a custom signature that creates Special:LintErrors (such as <span>...<span> instead of <span>...</span>) or a signature that does not link to the local account.

Few editors will be affected. If you want to know whether your signature (or any individual editor's) is okay, you can check it at https://signatures.toolforge.org/check. You are not required to fix an invalid custom signature immediately. Starting Monday, editors will not be able to save new invalid signatures in Special:Preferences. Later, we will contact affected editors. Eventually, invalid custom signatures will stop working. There will be an announcement in m:Tech/News then. You can subscribe to m:Tech/News. You can also put mw:New requirements for user signatures on your watchlist.

If you have questions, then please ping me or ask questions at mw:Talk:New requirements for user signatures. Thanks, Whatamidoing (WMF) (talk) 03:47, 2 July 2020 (UTC)

WT:NORM

I'm not sure whether this is a Grease Pit or Beer Parlour matter - I am often baffled why WT:NORM is added to a page - sometimes I can track it down to spacing, but more often than not I am puzzled, like here. DonnanZ (talk) 15:12, 4 July 2020 (UTC)

This paragraph may offer a clue:

Some edits are tagged with "WT:NORM" by an experimental filter. This means that the wikitext of the page violates one of these rules. It does not mean that the edit is bad, though bad edits will sometimes trigger the filter. DonnanZ (talk) 15:45, 4 July 2020 (UTC)
There was a space at the end of a line. I wish I could click on the WT:NORM tag link and have the system generate an edit normalizing the page. Vox Sciurorum (talk) 16:12, 4 July 2020 (UTC)
That sounds like a useful idea, if it can be made to work. My 72-year-old eyes didn't pick that one up, thanks for finding it. DonnanZ (talk) 16:51, 4 July 2020 (UTC)
I forget what it's called now, but there used to be a bot which targeted pages tagged with WT:NORM. It may have stopped operating. In fact I think the bot was used before the tag was introduced. DonnanZ (talk) 12:36, 5 July 2020 (UTC)
Maybe you're thinking of my bot, User:ToilBot, though I added the tag before creating the bot script. I run the script on relatively small batches of pages because the edit logs end up being so long. I need to figure out how to cut up the log into bite-size pieces (a size that doesn't take too long to load). — Eru·tuon 04:41, 6 July 2020 (UTC)
User:TheDaveBot used to do that sort of thing a few years ago, but I believe it was before the tag was created. Chuck Entz (talk) 06:13, 6 July 2020 (UTC)
It must be TheDaveBot I'm thinking of. DonnanZ (talk) 09:27, 6 July 2020 (UTC)

w inside coinage doesn't work

If I write "{{coinage|en|{{w|Donald Trump}}}}" I get "Coined by [[w:Donald Trump|Donald Trump]]". Something is preventing substitution of the [[]] generated by {{w}}. Vox Sciurorum (talk) 16:10, 4 July 2020 (UTC)

{{coinage|en|w:Donald Trump}} doesn't work, even though {{l|en|w:Donald Trump}} does. So this seems like a bug in the former. —Rua (mew) 16:29, 4 July 2020 (UTC)
The template links to Wikipedia by default. So you can just unwrap Donald Trump out of {{w}}: {{coinage|en|Donald Trump}}. — Eru·tuon 18:46, 4 July 2020 (UTC)
Thanks. I should have RTFM a few more times. Vox Sciurorum (talk) 22:26, 4 July 2020 (UTC)

Kyrgyz declension template

The template {{ky-decl-noun}} is incapable of handling singulare and plurale tantum declensions. Could someone add that ability? İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 20:58, 5 July 2020 (UTC)

@Ilawa-Kataka See {{ky-decl-noun-sg}} and {{ky-decl-noun-pl}}. I rewrote these two as well as {{ky-decl-noun}} using a module; they no longer require any parameters. Benwing2 (talk) 01:37, 24 July 2020 (UTC)
@Benwing2: Thank you! @Ilawa-Kataka: I have applied {{ky-decl-noun-sg}} to some country names where appropriate. --Anatoli T. (обсудить/вклад) 02:13, 24 July 2020 (UTC)
@Benwing2, @Atitarev: Thank you both! İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 03:09, 24 July 2020 (UTC)
@Benwing2 The module cannot handle capital letters such as at the entry Ош. Can you fix that? İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 20:31, 25 July 2020 (UTC)
@Ilawa-Kataka Fixed. Benwing2 (talk) 21:01, 25 July 2020 (UTC)
@Benwing2 Thanks! Ош is uncountable by the way. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 21:10, 25 July 2020 (UTC)
@Benwing2 One more thing—could you add possessive suffix support? I think the best resource for that is https://www.researchgate.net/publication/313774172_Kyrgyz_Orthography_and_Morphotactics_with_Implementation_in_NUVE at page 7 (the key is at page 3). The forms with "з" in it are polite forms. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 13:31, 29 July 2020 (UTC)

Template:projectlink

This does not work with Wikipedia language codes not recognized in Wiktionary. For example, we treat azb under az, but there is a https://azb.wikipedia.org/ Wikipedia to which I would like to link. --Vahag (talk) 07:30, 6 July 2020 (UTC)

Template:quote-book

@Sgconlaw Currently, if an entry using Template:quote-book doesn't contain a year, it gets categorised into Category:Requests for date - which is good. It would be even better to categorise according to author, so the quote at abbreviature (#* {{quote-book|en|title=Via Pacis|author={{w|Jeremy Taylor}}| passage=This is an excellent '''abbreviature''' of the whole duty of a Christian.}}) would get categorised into Category:Requests for date/Jeremy Taylor. Obviously, I can't touch that template as I'm a total n00b, so I'm hoping someone else can. --Dada por viva (talk) 23:56, 6 July 2020 (UTC)

Special Characters in the Search Bar

How can I escape * in the search? I'm getting 60,000 results for "*le-".

I would put this question in the newbie forum, but it does not get any exposure there, and if I remember correctly, there is just no solution besides downloading and grepping a db-dump, so this is likely a vain feature request. 109.41.0.27 19:04, 8 July 2020 (UTC)

Are you trying to use the asterisk as a wildcard? Our searchbox doesn't accommodate that. —Mahāgaja · talk 20:20, 8 July 2020 (UTC)
@Mahagaja no I'm grepping for a reconstruction that is not indexed. If it's red-linked, I should try the what-links-here page. 109.41.3.82
OK, since reconstructions are in a separate namespace, and the asterisk isn't actually part of the page name in the Reconstruction namespace, you have to go to Special:Search, then click "Add namespaces…" to restrict your search to the Reconstruction namespace, and then search for "le". I just did so and the closest things I can find to what you're looking for are Reconstruction:Proto-Slavic/a le and Reconstruction:Proto-Samoyedic/lë. —Mahāgaja · talk 07:14, 10 July 2020 (UTC)
@Mahagaja thanks, I knew, theoretically, but I'm looking for a mention in the main namespace (asserting *le- > PIE *leg'-). I could just go through my browser history one by one, but even if that were successful, it would not resolve this request in general. To escape means adding an escape character to create an escape sequence. 109.41.2.63 18:24, 10 July 2020 (UTC)
If you made a red link to the page, you can use Special:WhatLinksHere to find the page it appears on (as you already mentioned yourself). I knew what you meant by "escape", but I don't think the search page has that function. —Mahāgaja · talk 20:15, 10 July 2020 (UTC)
If you're trying to search for a literal *, I think that's not supported as a word query (or whatever the term is) because only words are indexed, and *, and probably -, are not word characters. But you can use the insource feature with regex (insource:/\*le-/). insource searches with regex often don't finish, so you can do insource:"le" insource:/\*le-/ to speed it up. The insource with quotes does a word search in the wikitext, which is about as fast as a word query, and shortens the list of pages that have to be searched with regex. — Eru·tuon 22:00, 10 July 2020 (UTC)
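The two-stage query described above (a cheap word search that narrows the candidates, then the regex) can be imitated offline, for instance when grepping a dump. The following is only a minimal Python sketch of that idea; the `pages` sample data and the function name `find_literal` are made up for illustration, and the plain substring test is a rough stand-in for the indexed word search, not a description of how the search backend works.

```python
import re

# Hypothetical sample data: page title -> wikitext.
pages = {
    "leg": "From Proto-Indo-European *leg'-, extended from *le-.",
    "table": "A piece of furniture; see also tablet.",
    "lemma": "Unrelated to le- without the asterisk.",
}

# The backslash escapes the asterisk so that it matches literally
# instead of acting as a quantifier.
LITERAL_STAR = re.compile(r"\*le-")


def find_literal(pages, word="le", pattern=LITERAL_STAR):
    """Return titles whose text matches pattern, after a cheap prefilter."""
    hits = []
    for title, text in pages.items():
        # Stage 1: cheap substring check (rough stand-in for the word search).
        if word not in text:
            continue
        # Stage 2: the more expensive regex, run only on the survivors.
        if pattern.search(text):
            hits.append(title)
    return hits


print(find_literal(pages))
```

All three sample pages pass the prefilter, but only the one containing the asterisk followed by "le-" survives the regex.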

ѣ with a combining diaeresis when it should be read as "ё"

The archaic letter ѣ was used in Russian in pre-1918 orthography. It normally corresponds to е (je) in the modern spelling, but there are cases when it corresponds to the letter ё (jo), which native speakers normally or quite often replace with е (je) in running text. In pre-1918 orthography the letter ё (jo) was used even less, and ѣ was never written as ё (jo), except to show how the word was pronounced. We are using "ѣ" with a combining diaeresis in cases when "ѣ" should be read as "ё", and the combining diaeresis is added automatically where necessary; for example, the nominative plural of гнѣздо́ (gnězdó) is displayed as гнѣ̈зда (gně̈zda) (ѣ with a diaeresis) in the declension table. In cases where "ѣ" should be read as "ё" in the lemma, it has to be added explicitly, as in гнѣ̈здышко (gně̈zdyško), the pre-reform spelling of гнёздышко (gnjózdyško).

@Benwing2: Can we please remove the diaeresis from links, as with acute accents, so that гнѣ̈здышко (gně̈zdyško) is linked to гнѣздышко? I don't know which module is responsible for it.

I will update WT:RU TR so that users know about the usage of diaeresis over "ѣ", it's purely our Wiktionary convention. --Anatoli T. (обсудить/вклад) 00:23, 9 July 2020 (UTC)

@Atitarev Fixed. The module in question is Module:languages/data2. Benwing2 (talk) 00:54, 9 July 2020 (UTC)
@Benwing2: Thank you for the quick fix. This usage of the diaeresis made me think that maybe we should also use ея̈ (jejä) when it corresponds to modern её (jejó)? It was only one word where "я" = "ё", though, if I'm not mistaken and "её" is a more modern pronunciation but there were a lot of double pronunciations and confusions at the time when "ё" was first introduced. --Anatoli T. (обсудить/вклад) 01:05, 9 July 2020 (UTC)
@Atitarev There are a lot of similar cases, right? E.g. neut/fem pl бѣ́лыя instead of modern бе́лые. Benwing2 (talk) 01:13, 9 July 2020 (UTC)
@Benwing2: There are some cases but I didn't have this one in mind. бѣ́лыя could be read as it was written without any issues. It could be read as бѣ́лые or as it's spelled, very little difference with unstressed vowels. In fact, it was grammatically helpful but burdensome for learners. What I find troublesome is old "больша́го" for "большо́го". "больша́го" was not pronounced that way long before the reform, unless by clergy. (The introduction of letter "ё" was first mocked by many as (1) too colloquial or (2) in some cases labelled as "просторечие".) --Anatoli T. (обсудить/вклад) 02:02, 9 July 2020 (UTC)
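The request at the top of this thread, removing the diaeresis from links in the same way as acute accents, amounts to deriving the page title from the displayed form. The actual fix was made in Module:languages/data2 and is not reproduced here; the following is only a rough Python sketch of the idea, with a hypothetical function name (`make_entry_name`). It assumes the marks are stored as separate combining characters and deliberately leaves the precomposed letter ё (U+0451) untouched.

```python
COMBINING_ACUTE = "\u0301"      # stress mark
COMBINING_DIAERESIS = "\u0308"  # written over ѣ by the convention above


def make_entry_name(display_form):
    """Strip display-only combining marks to get the page title."""
    # Only free-standing combining marks are removed. The string is not
    # decomposed first, so precomposed ё (U+0451) keeps its dots.
    for mark in (COMBINING_ACUTE, COMBINING_DIAERESIS):
        display_form = display_form.replace(mark, "")
    return display_form


print(make_entry_name("гнѣ\u0308здышко"))
print(make_entry_name("гнѣздо\u0301"))
```

Decomposing the string before stripping would be a mistake in this sketch, because it would also turn ё into е.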

Alphabetical order in Ojibwe

(Please feel free to point me to the correct spot for this question if this is not the right one.)

I want to alphabetize lists that appear in Ojibwe pages - both namespace pages as well as categories - according to the (Latin) Ojibwe alphabet (a, aa, b, ch, d, e, g, h, ’, i, ii, j, k, m, n, o, oo, p, s, sh, t, w, y, z, zh).

Simply put, double vowels are considered as one letter, as are ch, sh and zh, and the glottal stop (apostrophe) " ’ " follows h. There are also several letters that don't appear, as well as Canadian syllabics, but I don't think these issues should pose a particular problem, as they won't alter the order internal to the existing letters.

I'm not certain of the technical changes that are necessary for this, nor the scope of the solution, but I see that lists in other languages (Swedish, Spanish, for example) are organized according to those languages' alphabets, so I thought I would ask.

Thanks in advance for any help you can offer. SteveGat (talk) 18:43, 9 July 2020 (UTC)

We can fix the alphabetical order in categories the same way we do for Welsh, by specifying a sort key at Module:languages/data2. For automatically sorted lists of Derived terms and Related terms, you would have to use {{col-u}} and sort the list manually. —Mahāgaja · talk 19:47, 9 July 2020 (UTC)
Manual sorting isn't necessary because column templates such as {{der3}} will use the language's sortkey to alphabetize the words. They can also use an arbitrary sortkey, generated by a function specified in Module:collation, to sort more accurately than categories. Currently Egyptian is the only language that makes use of this feature, using a function in Module:egy-utilities. So for instance for Ojibwe, the function could replace a with 1, aa with 2, b with 3, ch with 4, etc. (Actually I would use different replacement characters, but you get the idea.) Column templates can use an arbitrary sortkey because they don't display the sortkey anywhere, but categories can't because they display the first code point of the sortkey in the column headers. — Eru·tuon 22:50, 9 July 2020 (UTC)
Cool, I wasn't aware that {{der3}} and its allies now use the language's sortkey. —Mahāgaja · talk 06:08, 10 July 2020 (UTC)
Just coming back to this. I think I understand theoretically what is being suggested, but I'm certain I don't know how to do it. Could someone explain it to me with instructions I might be able to follow or, if it really is so simple, maybe just do it? SteveGat (talk) 13:49, 27 July 2020 (UTC)
@SteveGat: Unless you have template editor or admin rights, you can't edit the module anyway, but I can do it for you, sort of. I can make it so that in categories, words beginning with "aa" are sorted at the bottom of words starting with "a", but they'll still be in the "A" section, and likewise for "ii" at the bottom of the "I" section, "sh" at the bottom of the "S" section, and "zh" at the bottom of the "Z" section. Since "c" doesn't exist outside of the digraph "ch", no special sorting is necessary for it, but the section will still be labeled "C" in categories. As for the ʼ character, I can make it appear at the bottom of the "H" section. Is that acceptable? In lists generated with {{der3}}, {{rel3}} and the like, the alphabetization will be right. Incidentally, we should be using ʼ U+02BC modifier letter apostrophe, not ' U+0027 apostrophe or U+2019, right single quotation mark, for Ojibwe, since it's functioning as a letter and not as a punctuation mark. —Mahāgaja · talk 14:51, 27 July 2020 (UTC)
@Mahagaja: Thanks, that would be great. It's less elegant than having an "a" section and an "aa" section, but at least things will be in order. Will it also work within a word, so that, for example, "adwe" is before "adaam"? SteveGat (talk) 14:58, 27 July 2020 (UTC)
@SteveGat:, yes, that will work too. —Mahāgaja · talk 15:00, 27 July 2020 (UTC)
@SteveGat: OK, this is done, but it may take a couple of days for it to filter through and actually work in all the categories. Also, I have only sorted the "modifier letter apostrophe", which means any entries using the normal typewriter apostrophe or the curly quote will not sort properly. Those entries should be moved to entries using the modifier letter apostrophe. —Mahāgaja · talk 15:06, 27 July 2020 (UTC)
@Mahagaja: Great, thanks. I'll check back through the list of lemmas in a few days to see if any apostrophe problems pop up. It is not a common letter, and never appears word-initially, so troubleshooting should be straightforward. SteveGat (talk) 15:10, 27 July 2020 (UTC)
@SteveGat:. Great, let me know if you have any questions or issues. —Mahāgaja · talk 15:55, 27 July 2020 (UTC)
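The "arbitrary sortkey" idea mentioned earlier in this thread (replace each letter of the alphabet, digraphs included, with a single character whose code point reflects its rank) can be sketched as follows. This is an illustrative Python sketch only: the on-wiki implementation would be a Lua function, the names `ALPHABET`, `RANK` and `sort_key` are invented here, and the replacement characters are arbitrary. As noted above, such keys suit column templates but not categories, which display the first code point of the key.

```python
import re

# Ojibwe (Latin) alphabet in collation order, as listed at the top of this
# thread. The glottal stop is written as U+02BC modifier letter apostrophe.
ALPHABET = ["a", "aa", "b", "ch", "d", "e", "g", "h", "\u02bc", "i", "ii",
            "j", "k", "m", "n", "o", "oo", "p", "s", "sh", "t", "w", "y",
            "z", "zh"]

# Map each letter to one code point whose order mirrors the alphabet.
RANK = {letter: chr(ord("A") + i) for i, letter in enumerate(ALPHABET)}

# Try digraphs before single letters so "aa" is not read as "a" + "a".
TOKEN = re.compile(
    "|".join(sorted(map(re.escape, ALPHABET), key=len, reverse=True))
)


def sort_key(word):
    """Replace every alphabet letter with its rank character."""
    return TOKEN.sub(lambda m: RANK[m.group(0)], word.lower())


# Sample strings chosen to show the digraph behaviour.
words = ["aanakwad", "animosh", "zhooniyaa", "ziibi", "makwa"]
print(sorted(words))                # plain code-point order
print(sorted(words, key=sort_key))  # alphabet-aware order
```

With the key applied, words in "a" precede words in "aa", and words in "z" precede words in "zh", which plain code-point sorting gets the other way round.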

@SteveGat: Here's a list of Ojibwe entry names with straight apostrophes, as of July 20th. There weren't any with the curly apostrophe.

  • Eru·tuon 18:42, 27 July 2020 (UTC)

    @Erutuon: Thanks for the list. After Mahāgaja moved a few entries to modify curly apostrophes, I did the first row of your list (I didn't know how to do that without creating a redirect). It's pretty arduous, and raised a couple of questions.
    1) Why not treat regular apostrophes like curly ones on the back end? and
    2) How do we avoid people recreating the problem by creating new entries using the "straight" apostrophe? SteveGat (talk) 15:55, 29 July 2020 (UTC)
    @SteveGat: The software treats the modifier letter ʼ and the curly apostrophe differently, because they're used for different purposes. The apostrophe (regardless of whether we use the curly kind or the straight kind) is treated as a punctuation mark, while the modifier letter is treated as a letter, equivalent to "a", "b", "c", etc. As for making sure Wiktionarians use the right character in the future, ideally there should be a page WT:About Ojibwe where all of the conventions are outlined. But of course Ojibwe isn't a language that there are ever going to be dozens of editors working on! And don't worry about the redirects; only admins can move a page without creating a redirect. As long as Ojibwe is the only language that uses a certain spelling, keeping the redirect doesn't hurt anything. —Mahāgaja · talk 18:20, 29 July 2020 (UTC)
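The letter-versus-punctuation distinction described above is recorded in the Unicode general category of each character, which can be inspected with Python's standard `unicodedata` module. This is only a small sketch for checking which character an entry name contains; the `CANDIDATES` table and the helper name are illustrative, and nothing here shows how the wiki's own sorting uses these properties.

```python
import unicodedata

# The three look-alike characters discussed in this thread.
CANDIDATES = {
    "apostrophe": "\u0027",
    "right single quotation mark": "\u2019",
    "modifier letter apostrophe": "\u02bc",
}


def general_category(ch):
    """Return the two-letter Unicode general category of a character."""
    return unicodedata.category(ch)


for label, ch in CANDIDATES.items():
    # Categories starting with "L" are letters; those with "P" are punctuation.
    print(f"U+{ord(ch):04X} {label}: {general_category(ch)}")
```

Only the modifier letter apostrophe falls in a letter category (Lm); the other two are punctuation (Po and Pf).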

    Requested edit to Template:cite-meta

    For the benefit of Wikipedians who cross projects occasionally, it would be swell if {{cite-meta}} (on which {{cite web}}, {{cite book}}, &c. rely) could understand access-date, the normative form in the CS1 Wikipedia templates. Please change all occurrences of {{{accessdate}}} to {{{accessdate|{{{access-date}}}}}}. Psiĥedelisto (talk) 20:21, 10 July 2020 (UTC)

    Pinging some admins. @SemperBlotto, Equinox, Metaknowledge: can y'all do this, please? Psiĥedelisto (talk) 08:40, 25 July 2020 (UTC)
      Done. Suzukaze-c (talk) 23:07, 27 July 2020 (UTC)

    Adding and Using Pali Transliteration

    I have now got the Pali transliteration working for the 9 supported abugidas using module {{Module:pi-translit}}. However, it is not possible to always automatically determine whether a word is written in an abugida or an alphabet for the Thai and Lao scripts. Should I register an artificial script for these alphabets? How do I go about the appropriate registration processes? --RichardW57 (talk) 21:22, 11 July 2020 (UTC)

    For highly derived subsidiary script form පාචයන‍්ත් I have the gloss

    # {{pi-sc|Sinh|pācayant}}, {{inflection of|pi|පාචෙති||present|participle|tr={{l|pi|pāceti}}}}, {{inflection of|pi|පචති||causative|tr={{l|pi|pacati}}|t=to cook}}
    

    which renders, converting '#' to '*', as:

    Can I take advantage of automatic transliteration to make the transliteration a link, or would I have to create a new template, possibly backed up with a new module? An additional complication is that there are words whose transliteration is not the Latin script form of the word, e.g. ທັມມະ (damma, dharma), whose Latin script equivalent is dhamma, so I need to be able to not make it a link. --RichardW57 (talk) 21:22, 11 July 2020 (UTC)

    vi-etym-sino

    {{vi-etym-sino}} is very hard to adopt in other language projects (needs to be translated). Could you simplify it with module logic instead? --Octahedron80 (talk) 03:49, 15 July 2020 (UTC)

    I think that is a bad idea for a Roman script language. At 0.5MB per module used, such a conversion could make further pages run out of memory during rendering. Indenting the code (hiding white space within comments if need be) would make the code much more readable than it is now, thus making it easier to translate. --RichardW57 (talk) 12:11, 15 July 2020 (UTC)

    Module:R:ErtSz/data

    In trying to puzzle out why baba, an entry with only a few translations, had run out of memory, I stumbled on this near-singularity galactic gravity well of a data module: 12.35 MB of Lua memory in preview for this and a couple of much smaller modules. I'm not very good at either Hungarian or Lua, but it looks like Module:R:ErtSz is loading in a table of basically every word in the Hungarian language so it can find a single line using an alphabetic key.

    This leads to the obvious question: can't this be split up into sub-modules, based on the first letter of the key (like the Module:languages data modules)? Granted, the vowels are complicated a bit by diacritics that aren't reflected in ordering of the data, but that seems like a minor problem.

    Pinging @Erutuon, Adam78 who have been working on this. Chuck Entz (talk) 06:40, 15 July 2020 (UTC)

    @Chuck Entz, as far as I'm concerned, any improvement you (or anybody else) can possibly implement would be very welcome! (I'm not familiar with Lua myself.) Adam78 (talk) 13:19, 15 July 2020 (UTC)

    I'm sorry if my colorful description of the problem gave even the slightest impression of disapproval. My impression was that this was a sketchy first draft of the code, and that it might benefit from a suggestion to speed up the optimization.
    In patrolling CAT:E over the years, I've gotten fairly good at getting a rough idea of what code is probably doing, but actually writing the code requires mastery of the details- and I know better than to implement anything myself. Unless it's correcting an obvious typo that's causing thousands of module errors, I keep my hands off the coding part of modules.
    If I understand the code, the simplest implementation would be to generate the name of the data module that's called by adding the first letter of the term to "Module:R:ErtSz/data", and have all the data that starts with "b" in a module called "Module:R:ErtSz/datab". That's what they did with Module:languages/data when it became obvious that loading in data for all the language codes every time would be unsustainable (see Category:Language data modules). The main choice would be whether to have "Module:R:ErtSz/dataa", "Module:R:ErtSz/dataá", "Module:R:ErtSz/dataä", etc. or to strip the diacritics first so it can all be in "Module:R:ErtSz/dataa".
    If you think about it, this module is called with only one search term per entry, so it's best to avoid loading the data for every word in the language. Theoretically, you could go as far as having a separate module for every word, but that would be a ridiculous waste of time to set up. I think a module for each combination of the first one or two letters should be more than enough to solve the memory problems. If I can help in setting up the data modules, I'd be happy to pitch in. Thanks! Chuck Entz (talk) 15:01, 15 July 2020 (UTC)
    Splitting up the data by letter of the alphabet is a good idea, but I tried the less drastic measure of changing Module:R:ErtSz/data and Module:R:ErtSz/homonyms to search for the word in the data string and return its code or codes, without parsing the whole data string into a table. Module:R:ErtSz/data is something like 1.5 MB, but parsed into a Lua table it is much larger. This change removed the error in the old version of baba, which I tested in WT:SAND. The memory usage of {{R:ErtSz}} on its own is still about 4 MB, so it might still be worth splitting the module up. Probably a good idea to get it out of the way so I might write a bot script to do it. — Eru·tuon 18:52, 15 July 2020 (UTC)
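The sub-module naming scheme proposed above (append the first letter of the term to "Module:R:ErtSz/data", optionally stripping diacritics first) can be sketched like this. The sketch is in Python rather than Lua, the function name `shard_name` and its `strip_diacritics` flag are hypothetical, and it does not reflect how the modules were eventually changed.

```python
import unicodedata

BASE = "Module:R:ErtSz/data"  # data module named in this thread


def shard_name(term, strip_diacritics=True):
    """Name of the hypothetical sub-module that would hold `term`."""
    if not term:
        raise ValueError("term must not be empty")
    first = term[0].lower()
    if strip_diacritics:
        # NFD splits e.g. "á" into "a" plus a combining acute accent;
        # keeping only the first code point leaves the base letter.
        first = unicodedata.normalize("NFD", first)[0]
    return BASE + first


print(shard_name("baba"))
print(shard_name("\u00e1gy"))                          # shares the "a" shard
print(shard_name("\u00e1gy", strip_diacritics=False))  # gets its own shard
```

Stripping diacritics keeps the number of sub-modules small; keeping them gives smaller shards at the cost of more pages to maintain.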

    It sounds great. All I could do now was replace this template with the old-style version (which requires one to enter the unique dictionary ID manually) on page baba, so the error is solved here but of course it's bound to arise elsewhere until the matter is solved. Adam78 (talk) 16:10, 15 July 2020 (UTC)

    @Erutuon There are lots of module errors now. Benwing2 (talk) 05:06, 16 July 2020 (UTC)
    @Benwing2: Thanks! Should have tested the new version of Module:R:ErtSz/homonyms before saving. Fixed. — Eru·tuon 05:25, 16 July 2020 (UTC)

    @Erutuon, thank you! 🙏 Adam78 (talk) 15:16, 16 July 2020 (UTC)

    Category:French terms spelled with Â

    Seeing as the French changed the spelling again, and that the circonflexe has pretty much been taken off, we would be wise to make alternative spellings of the circumflexed words. I started with the creation of hopital (which, as it turns out, is wrong). To start with, I propose the creation of Category:French terms spelled with Â, Category:French terms spelled with Ê, Category:French terms spelled with Î, Category:French terms spelled with Ô, Category:French terms spelled with Û. OK, not all circumflexed words are affected (it seems the rules just apply on the I and the U), but the categories would be useful anyway. See this website, this one and the BBC for more information. --CasiObsoleto (talk) 08:49, 15 July 2020 (UTC)

    "Seeing as the French changed the spelling again": I don't know what you're talking about. The last spelling reform took place in 1990, and there has been much back-and-forth since, but that's all, afaik. And I see no value in having those categories. PUC – 15:12, 15 July 2020 (UTC)
    You're probably right. I should have read up more on the subject before making such a newbie post. --CasiObsoleto (talk) 17:51, 15 July 2020 (UTC)

    ‘post’ and ‘ante’ should be equivalent to ‘circa’ in Module:Quotations

    It's a bit hard to explain, but look, for example, at Module:Quotations/la/data and at the page coruscō. You’ll see that a ‘c.’ in the source code is treated as a special word that calls up a correctly formatted link to circa. Now look at the quotation from Juvenal in that entry. See how the module doesn’t treat ‘p.’ the same as ‘c.’? How would that be fixed? --Biolongvistul (talk) 09:27, 15 July 2020 (UTC)

    English Word Dump

    I know there are dumps that have a list of all words defined by Wiktionary in any and all languages. Is there a dump like that for only English words? It does not have to include the definitions, just the entry words would be perfect. —⁠This unsigned comment was added by 2600:1700:E40:B5E0:C8A9:602C:1A45:323C (talk) at 05:12, 16 July 2020 (UTC).

    I don't believe so. The latest dumps are here: [1]. I see no filename that would suggest "English only". Equinox 08:30, 16 July 2020 (UTC)
    It isn't too hard to process the entire dump to extract all the English words. It is a bit harder to extract only lemmas, only non-obsolete forms, etc. Would you want all the content? Just the (current?) definitions? Just the headwords? All of these things are relatively easy using any language that supports regular expressions (e.g., Perl, Python) once one understands the structure of our entries. DCDuring (talk) 18:20, 16 July 2020 (UTC)
    Perhaps we should create a little library of the main relevant regular expressions in each of the main flavors of regular expressions. DCDuring (talk) 18:25, 16 July 2020 (UTC)
    I have lists of each language's entry names from the latest dump (the titles of the pages that have a given language header), but the English one is 10.7 MB so it would need to be uploaded to a place off-wiki to be available for download. Any suggestions? I could start a Toolforge site for this if more people would be interested. — Eru·tuon 19:13, 16 July 2020 (UTC)
    Maybe we should refer such requests to WikiData. DCDuring (talk) 21:55, 16 July 2020 (UTC)
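The regular-expression approach suggested above, selecting the titles of pages that have a given language header, might look roughly like this in Python. It is a sketch under stated assumptions: language sections are taken to be level-2 headers such as `==English==`, the `pages` list is invented sample data, and reading titles and text out of the real dump file is not shown.

```python
import re


def has_language_section(wikitext, language="English"):
    """True if the text has a level-2 header for the given language."""
    # Anchored per line; a level-3 header such as ===English=== fails to
    # match because the language name must follow exactly two "=" signs.
    pattern = re.compile(
        r"^==\s*" + re.escape(language) + r"\s*==\s*$", re.MULTILINE
    )
    return pattern.search(wikitext) is not None


def titles_with_language(pages, language="English"):
    """Filter (title, wikitext) pairs down to titles with that section."""
    return [title for title, text in pages
            if has_language_section(text, language)]


# Hypothetical sample data standing in for pages read from a dump.
pages = [
    ("baba", "==Hungarian==\n===Noun===\nbaba\n"),
    ("number", "==English==\n===Noun===\nnumber\n\n==Swahili==\n"),
    ("sample", "==Translingual==\n===English===\n"),
]
print(titles_with_language(pages))
```

Restricting the result to lemmas or to non-obsolete forms would need further checks on the section contents, as noted above.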

    MediaWiki:Anontalkpagetext

    Please change him/her to them. translatewiki:MediaWiki:Anontalkpagetext/en was updated on 2018-12-14. Thanks. -- 05:46, 16 July 2020 (UTC)

      Done Equinox 08:27, 16 July 2020 (UTC)

    Template importation request: {{tq}}

    It's used for quoting things other people have said on talk pages, in a more visually distinctive way than putting quotation marks around them. I looked for some other template used for that purpose here and didn't find one. It's present and commonly used on English Wikipedia and on Commons, and probably other wikis. I assume one of those can be brought over here somehow, but I don't know how. (In particular, lots of templates on English Wikipedia use Lua, when I have no idea why most of them need it, which could complicate things.) PointyOintmentt & c 17:42, 17 July 2020 (UTC)

    What's the point? To my knowledge, nobody has ever wanted that template except you, someone who has not made any contributions in mainspace. —Μετάknowledgediscuss/deeds 18:04, 17 July 2020 (UTC)
    I don't know of anyone else requesting it either—I expect that would have come up in my searching, and I probably wouldn't have requested it myself if so (because either the template would already be here or I would've read the previous discussion and learned that it's diswanted and why). But that doesn't mean nobody else has ever wanted it; maybe they just didn't get around to asking. I first thought of requesting it a couple of weeks ago, and didn't do so until today. Or maybe they just didn't realize such a thing was available to ask for. Maybe now that it's been raised, other people will (state that they) want it too.
    And I may not be much of a dictionarist, but does that mean contributions in other ways (such as choosing words of the day, which I was about to start doing) are unwelcome? (You may also note that I have a reasonably long history of constructive editing on English Wikipedia, as well as a userpage there, which I just haven't gotten around to starting here—I'm guessing you looked at my contributions here at least in part because my signature is red?) PointyOintmentt & c 18:21, 17 July 2020 (UTC)
    I don't think I'd use it, but I wouldn't slap it with {{rfd}} either. DCDuring (talk) 21:01, 19 July 2020 (UTC)

    |alts= in {{desc}}

    I can see that this parameter links to the right language section (e.g. at number, see Swahili desc). But at sexy the alt form of the Spanish descendants isn't linking because it's called from the same page. Ultimateria (talk) 17:30, 18 July 2020 (UTC)

    @Ultimateria: Fixed. — Eru·tuon 18:06, 18 July 2020 (UTC)

    Possible bot work, looking for feedback

    I just edited an alternative spelling entry to ensure that it has the same subject categories as the other spelling. This seems uncontroversial to me. So 1.) is it a good idea to have a bot that automatically ensures that alternative spellings have the same categories and 2.) is there anyone who can actually do this work to make a bot that scans alternative spellings, as I am not smart enough to do it myself? —Justin (koavf)TCM 03:30, 19 July 2020 (UTC)

    I've actually removed categories from alt forms plenty of times. There are 70,000+ alternative forms and spellings in English alone; the potential for cluttering our categories with essentially duplicate pages to sift through is high enough for me to oppose these edits. Ultimateria (talk) 06:16, 19 July 2020 (UTC)
    I agree. Not only is this not uncontroversial, I think it's an actively bad idea, and it's most certainly against our usual practice. —Μετάknowledgediscuss/deeds 06:34, 19 July 2020 (UTC)
    I also agree that it's a bad idea. Only the primary entry should be in topic categories. —Mahāgaja · talk 08:22, 19 July 2020 (UTC)

    Okay, consensus sussed out, would you (@Ultimateria, Metaknowledge, Mahagaja) be in favor of a bot that does the opposite? —Justin (koavf)TCM 02:36, 30 July 2020 (UTC)

    I would. It's also worth manually going through and removing other info (I think etymologies are especially common), first checking that it's at the main entry. If someone creates it, I'll go through a list of entries whose only definition line has an alt form/spelling template and includes etymology or translation sections. I think topic cats are very likely to be at both entries, so I'm not concerned about removing those by bot. Ultimateria (talk) 02:54, 30 July 2020 (UTC)

    ; doesn't work

    Not even following the link from semicolon --Backinstadiums (talk) 19:32, 19 July 2020 (UTC)

    @Backinstadiums: It's a server bug that has been reported at phab:T238285. — Eru·tuon 19:50, 19 July 2020 (UTC)
    Meanwhile [2] --Backinstadiums (talk) 19:53, 19 July 2020 (UTC)
    I managed to sort of fix the link at semicolon by having it link to one of the other semicolon-like characters ("") on that page, which redirects to the correct page. It's a real amateurish kludge, but it gets where it needs to go. (Oddly enough, the Greek question mark (";") is the only one of those characters that gets the automatic redirect, and that's the only one that's semantically not a semicolon.) Apparently redirects work okay. Once you get to that page, you can edit it, but it goes to the main page instead of displaying the new version (your edit does show up in the edit history).
    Just to see if I could, I created a redirect at Category:English terms spelled with ﹔ that goes to Category:English terms spelled with ;, but that only works if you search for or link to it; the category link on the entry page goes to Category:English terms spelled with. I'm still tinkering with manually changing URLs for other pages to see what else I can get access to. Chuck Entz (talk) 20:58, 19 July 2020 (UTC)
    Here's Pages that link to ";" (you have to modify the URL for some other whatlinkshere query to get this). Also, I notice from a usage note that semicolon is most commonly used in Greek online instead of the dedicated Greek question mark, so that explains the automatic redirect. Chuck Entz (talk) 21:26, 19 July 2020 (UTC)
    It's not exactly a redirect; the Greek question mark is normalized to a semicolon by the MediaWiki backend, so it is replaced with a semicolon whenever anyone tries to insert it into a page or use it in a title. The character was added in version 1.1 in 1993 so Unicode probably decided on its normalization early on (though I'm not sure when they came up with normalization). As with other characters that are changed by normalization, the Greek semicolon can only be displayed by using an HTML character reference such as &#x37E; (;). — Eru·tuon 20:26, 20 July 2020 (UTC)
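
    The normalization described here can be reproduced outside MediaWiki. The sketch below assumes that NFC, as implemented in Python's unicodedata module, is a fair stand-in for what the backend does to titles; the function names are hypothetical.

```python
import unicodedata

GREEK_QUESTION_MARK = "\u037e"   # looks like a semicolon
SMALL_SEMICOLON = "\ufe54"       # one of the other semicolon-like characters


def normalized_title(title: str) -> str:
    """Apply NFC normalization (assumed to mirror the backend's behaviour)."""
    return unicodedata.normalize("NFC", title)


def char_reference(ch: str) -> str:
    """Hexadecimal HTML character reference for a single character."""
    return f"&#x{ord(ch):X};"


# The Greek question mark collapses to the ordinary semicolon ...
print(normalized_title(GREEK_QUESTION_MARK) == ";")
# ... while the small semicolon survives NFC unchanged.
print(normalized_title(SMALL_SEMICOLON) == SMALL_SEMICOLON)
# A character reference is the only way to keep the original code point.
print(char_reference(GREEK_QUESTION_MARK))
```

    This matches what was observed above: only the Greek question mark ends up at the semicolon page automatically, because it is the only one of these characters with a canonical (rather than compatibility) equivalence to the semicolon.
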
    On the talk page for semicolon, someone posted the suggestion to use en.wiktionary.org/w/index.php?title=;&redirect=no, which does work. Chuck Entz (talk) 21:33, 19 July 2020 (UTC)
    We should probably just make it an unsupported title. DTLHS (talk) 20:30, 20 July 2020 (UTC)

    normalizing the position of {{crh-latin-verb}}

    Can we have a bot move all instances of misplaced {{crh-latin-verb}} preceding the definition down, and create a ====Conjugation==== header? This should look like this. Allahverdi Verdizade (talk) 22:45, 19 July 2020 (UTC)

    @Allahverdi Verdizade I ran a bot to do this. Let me know if it missed anything or messed anything up. Benwing2 (talk) 04:09, 23 July 2020 (UTC)
    @Benwing2 Thank you! Allahverdi Verdizade (talk) 15:20, 23 July 2020 (UTC)

    Search-fu: searching for words ending in X

    It's easy enough to find words in language ABC that start in XYZ: just go to the category for terms in that language.

    • But what if I wanted to find all words in language ABC that end in XYZ? Any ideas on how to do that?

    Advanced topic: For example, say I wanted to find all Japanese terms where the romanization ends in -ita. Japanese hiragana is a syllabary, so if I tried searching by kana (syllabic letter), I'd have to find all terms ending in each of いた・きた・した・ちた・にた・ひた・みた・りた. Or, for Korean, hangul is alphabetic, but it's composed into set glyphs that comprise multiple individual jamo (letters). If I tried searching by jamo for all Korean words ending in -i, it'd be a similar mess (I don't have a Korean IME installed, so I'll forgo listing examples).

    • How would I find all words in language ABC, which uses a non-Latin, non-alphabetic script, where the romanization ends in XYZ?

    Curious, ‑‑ Eiríkr Útlendi │Tala við mig 03:58, 21 July 2020 (UTC)

    https://dixtosa.toolforge.org/ Chuck Entz (talk) 06:29, 21 July 2020 (UTC)
    Thank you, Chuck! Cool stuff.
    Unfortunately, it doesn't seem to work for romanizations. For example, if I search for ri in the category Korean_lemmas, I'd hope to see entries such as 머리 (meori) and 다리 (dari). Instead, I get nothing. I have to enter 리 as my search string. Testing also confirms that I cannot use uncomposed jamo for searching -- attempting to search for ㅣ (the lone jamo representing /i/) finds only the [[ㅣ]] page.
    Any other possible search-fu moves? :) ‑‑ Eiríkr Útlendi │Tala við mig 21:39, 21 July 2020 (UTC)
    I don't know of a way to search romanizations, but I could probably make a Toolforge site for it if I can motivate myself. The backend for the site would need a minimal Lua module infrastructure to generate the transliterations for the search index, which could be a little complicated. A search engine just for titles, which would allow you to find Japanese entry titles matching a regex [いきしちにひみり]た$, would be easier, because I already have a program to generate an index of entry titles for every language based on language headers. (Unfortunately the index currently doesn't list the Han-script entries for the various Chinese languages that are under the Chinese header.) The MediaWiki intitle:// search feature would allow suffix searches if the developers added support for $, which is puzzlingly missing.
    It would be an interesting project. Not sure what the name of the Toolforge site would be. Maybe wiktionary-entry-names. I worry about it not being more general, in case I come up with some other idea besides searching entry names and romanizations of entry names and want to include it. — Eru·tuon 08:04, 23 July 2020 (UTC)
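
    As a rough illustration of the two kinds of search discussed above, here is a minimal sketch. It assumes a list of entry titles (and, for the second function, title/romanization pairs) is already available; the function names and sample data are hypothetical, and the romanizations are supplied by hand rather than generated by any transliteration module.

```python
import re


def titles_ending_with(titles, pattern):
    """Return the titles whose end matches the given regex fragment."""
    rx = re.compile(f"(?:{pattern})$")
    return [t for t in titles if rx.search(t)]


def romanizations_ending_with(entries, suffix):
    """entries: iterable of (title, romanization) pairs."""
    return [title for title, roman in entries if roman.endswith(suffix)]


# Suffix search on titles: Japanese kana titles ending in -ita.
japanese_titles = ["きた", "した", "あした", "たべる", "いたい"]
print(titles_ending_with(japanese_titles, "[いきしちにひみり]た"))

# Suffix search on romanizations: Korean entries ending in -ri.
korean_entries = [("머리", "meori"), ("다리", "dari"), ("사람", "saram")]
print(romanizations_ending_with(korean_entries, "ri"))
```

    The title search only needs an index of entry names, which is why it is the easier of the two. The romanization search needs the transliteration to be computed for every entry first, which is the part that would require the Lua module infrastructure.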

    Language settings at menu

    I know it is not en.wikt-specific, but there is something wrong with the Language settings in the left-hand menu.

    • Choosing a language: even when the interface is set to English, we still get the endonym, e.g. Ελληνικά instead of Greek. Some scripts are incomprehensible...
      It would be better to list the languages alphabetically in the chosen language:
      #if set in English →     English - en / Greek - el
      #if set in Greek →     Αγγλικά - en / Ελληνικά - el
    • It would be nice to have an extra option that lists the code plus the name in the chosen language, e.g.
      el - Greek
      el - Ελληνικά

    Plus a) choose the languages you wish to view and b) view all languages Thank you ‑‑Sarri.greek  | 01:33, 22 July 2020 (UTC)

    Template:sisterprojects

    Template:sisterprojects edit request. Please replace Wikiversity's "Free learning tools" with "Free learning resources". See Wikipedia:Template:Wikipedia's sister projects and associated discussion. Thanks! -- Dave Braunschweig (talk) 18:57, 22 July 2020 (UTC)

      Done. Jberkel 22:06, 22 July 2020 (UTC)

    Specifying "chiefly countable"

    Is there any way, using the standard templates such as "en-noun" or "head", to specify that a noun is "chiefly countable", or words to that effect? I'm talking about the main heading, not an individual numbered sense. Mihia (talk) 20:00, 22 July 2020 (UTC)

    tlb (Template:term-label) is just like lb but covers the whole term instead of one sense. Equinox 20:52, 22 July 2020 (UTC)
    Thank you. Mihia (talk) 21:18, 22 July 2020 (UTC)

    "In other languages" sidebar - always expanded?

    Is there a way (in preferences?) to make "In other languages" on the left always expanded, so that I could see ALL interwiki links? --Anatoli T. (обсудить/вклад) 05:24, 23 July 2020 (UTC)

    @Atitarev: Go to Special:Preferences, "Appearance" tab, scroll to the bottom and uncheck the box "Use a compact language list, with languages relevant to you.". —Mahāgaja · talk 09:10, 23 July 2020 (UTC)
    @Mahagaja: Great, thank you! It makes it much easier to go to a specific interwiki link while editing. --Anatoli T. (обсудить/вклад) 09:19, 23 July 2020 (UTC)

    homonyms as a new category to Category:Terms by lexical property subcategories by language?

    I wonder if there is a way to have homonyms of a language collected in a category, namely those pages that have the "Etymology 2" string within the section of a given language. I suppose there must be a way to make the software populate categories, in the worst case by means of a bot (even though it'd have to be updated time and again like for anagrams) or perhaps, more ideally, by means of a template or maybe a {{cln}} category. I'm familiar with Hungarian entries and I know there are hundreds (if not thousands) of terms that could be included: it would be useful for language learners (e.g. to help them get used to different ways of parsing words). I'm pretty sure it would be useful and interesting for other language editions as well. Any ideas? Adam78 (talk) 19:43, 24 July 2020 (UTC)
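
    The check described above (an "Etymology 2" header inside a given language section) could be sketched as follows for a bot working from page wikitext. This is only an illustration under assumptions: the function name is hypothetical, language sections are taken to start at level-2 headers, and numbered etymology sections are taken to be level-3 headers.

```python
import re

# Level-2 headers such as ==Hungarian== (but not ===Noun===).
LANG_HEADER = re.compile(r"^==([^=\n]+)==[ \t]*$", re.MULTILINE)
# Numbered etymology headers such as ===Etymology 2===.
ETYM_N = re.compile(r"^===[ \t]*Etymology (\d+)[ \t]*===[ \t]*$", re.MULTILINE)


def languages_with_multiple_etymologies(wikitext: str) -> list:
    """Names of language sections that contain an Etymology 2 (or higher)."""
    headers = list(LANG_HEADER.finditer(wikitext))
    found = []
    for i, header in enumerate(headers):
        # A language section runs until the next level-2 header or page end.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(wikitext)
        body = wikitext[header.end():end]
        if any(int(n) >= 2 for n in ETYM_N.findall(body)):
            found.append(header.group(1).strip())
    return found
```

    A bot could use the returned language names to add whatever category is agreed on, and would need to be rerun periodically, as noted above for anagrams.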

    "Prank vandalism" abuse filter blocking an edit

    I am attempting to change dragon's beard candy to the following content:

    ==English==
    {{wikipedia}}
    [[File:Dragons beard candy.JPG|thumb|A container of '''dragon's beard candy'''.]]
    
    ===Etymology===
    {{calque|en|zh|-}} {{zh-l|龍鬚糖}}.
    
    ===Noun===
    {{en-noun|head=[[dragon]][['s]] [[beard]] [[candy]]|-}}
    
    # A traditional [[Chinese]] [[sweet]] similar to [[halva]] or [[cotton candy]] made from [[fine]] [[white sugar]], peanuts, [[desiccated]] [[coconut]], white sesame seeds, [[corn syrup]] and [[glutinous rice]] [[flour]].
    
    ====Translations====
    {{trans-top|traditional Chinese sweet}}
    * Chinese:
    *: Mandarin: {{t|cmn|龍鬚糖}}, {{t|cmn|龙须糖|tr=lóngxūtáng}}
    {{trans-mid}}
    * Finnish: {{t|fi|[[lohikäärmeen]] [[parta]]}}
    * Portuguese: {{t-needed|pt}}
    {{trans-bottom}}
    
    [[Category:en:Sweets]]
    

    Unfortunately, I am met with the message "Error: This action has been automatically identified as harmful, and therefore disallowed. If you believe your action was constructive, please inform an administrator of what you were trying to do. A brief description of the abuse rule which your action matched is: Prank vandalism".

    Can anyone provide any guidance or information? Additionally, the error message would probably be more helpful if it linked to the Grease Pit and to a page that explains edit filters or the specific one in more detail. Thanks for any help and please ping me. —The Editor's Apprentice (talk) 20:19, 25 July 2020 (UTC)

    @Chuck Entz You last touched this private filter. Vox Sciurorum (talk) 15:11, 27 July 2020 (UTC)
    I was able to make the edit without issue, perhaps because I'm an admin, though of course that shouldn't be a prerequisite for such a simple edit. —Mahāgaja · talk 15:26, 29 July 2020 (UTC)
    @Mahagaja: I did a bit of research for information about the abuse filter and have come upon some details. First, the specific filter which blocked my edit was abuse filter number 49. Since its visibility is set to "private", I am unable to view most of its details. The only ones that I can see are those listed in the Special:AbuseFilter table. These details show that the filter was last modified by Chuck Entz at 23:38, July 27, 2020, about 8 hours after being pinged by Vox Sciurorum, presumably in response to the ping and my post. They further show that the filter is currently disabled. This explains why you were able to make the edit without issue. Something that I personally find interesting is that the change Chuck Entz made is not recorded in the contributions log. I hope this information is helpful and/or interesting to you; it was for me. —The Editor's Apprentice (talk) 18:09, 31 July 2020 (UTC)
    The filter looks for a couple of specific words and variations thereof that people insert randomly into text as a prank. Sometimes a large block of text will coincidentally include words containing parts that match variations of those words in the correct order. This is extremely, extremely rare, but I disabled the filter until I can figure out how to keep it from happening at all. The practice I'm trying to prevent isn't common (though those who do it are trying very hard to be subtle and non-obvious), so it's not worth the disruption. Chuck Entz (talk) 18:24, 31 July 2020 (UTC)
    You could enable the filter but have it not block the edit, and check the list of filter matches every once in a while to see what needs to be reverted. Vox Sciurorum (talk) 14:10, 2 August 2020 (UTC)

    Transcription errors

    Even though I specified the transcription as outlined at Template:alter at өсүмдүк, the untranscribed text was still returned. I also tested it with Template:link and the same thing happened so it's a problem with a module somewhere. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 03:09, 27 July 2020 (UTC)

    kk has "override_translit = true" set in Module:languages/data2 which determines the behavior you are seeing. DTLHS (talk) 03:56, 27 July 2020 (UTC)
    I came up with a solution: the transliteration function in Module:ky-translit now returns nothing if the text is in Arabic script, so your manual transliteration is shown. — Eru·tuon 04:59, 27 July 2020 (UTC)
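
    The interaction between the override setting and the fix can be modelled roughly as below. This is a Python sketch of the logic only, not the actual Lua code: the function names, the Unicode ranges used to detect Arabic script, and the tiny sample table are all assumptions made for illustration.

```python
import re

# Assumed test for Arabic script: the basic Arabic and Arabic Supplement blocks.
ARABIC_SCRIPT = re.compile(r"[\u0600-\u06FF\u0750-\u077F]")

# Tiny illustrative table; NOT the real transliteration mapping.
SAMPLE_TABLE = {"а": "a", "т": "t"}


def automatic_translit(text, table):
    """Return None for Arabic-script text, else a character-by-character mapping."""
    if ARABIC_SCRIPT.search(text):
        return None
    return "".join(table.get(ch, ch) for ch in text)


def displayed_translit(text, manual, table, override_translit=True):
    """Pick the transliteration to show.

    With override_translit set, the automatic result replaces the manual one,
    unless the automatic function returned nothing.
    """
    auto = automatic_translit(text, table)
    if override_translit and auto is not None:
        return auto
    return manual if manual is not None else auto


# Cyrillic text: the automatic transliteration wins over the manual one.
print(displayed_translit("ата", "manual", SAMPLE_TABLE))
# Arabic-script text: nothing automatic, so the manual value is shown.
print(displayed_translit("اتا", "ata", SAMPLE_TABLE))
```

    Returning nothing from the automatic function is what lets the manual transliteration through without turning the override setting off for the whole language.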

    Verb and Noun templates for the Norwegian language

    Hi! Not too familiar with the discussions on Wiktionary so hope I am posting this in the right place. If not let me know.

    I want to request templates for Norwegian (specifically Bokmål) for verbs and nouns, and maybe even adjectives. Both Danish and Swedish have their own, but Norwegian has been left out! For example, if you look at the Danish entry "have" (which is both a verb and a noun), you will find templates for both the noun and the verb, as well as the "main" forms written out next to the base form. Norwegian entries only have these main forms written at the top, but never a template with all the forms. Certain verb forms, such as the non-finite forms, are never included at all in Norwegian entries; these could add at least 6 forms to a verb that Wiktionary readers would otherwise not know. For nouns, the genitive could be added, as it is never included in Norwegian entries either. The way it is now is very inconsistent. Some entries have a lot of information, others have very little, as there is no unifying template giving some kind of pointer as to how much information is required. Hope someone can help out :) Supevan (talk) 05:30, 29 July 2020 (UTC)

    For verbs, the template would need the following forms: Infinitive (active), infinitive passive, present indicative active, present indicative passive, past indicative active, past indicative passive, subjunctive, imperative, imperfective participle, perfective (masculine+feminine+neuter if applies) and perfective plural/definite.
    Looks like the Swedish Wiktionary already has several templates for Norwegian at sv:Mall:no-subst for nouns, verbs, articles and pronouns. These could be adapted; I have yet to figure out how templates work on MediaWiki. Kritixilithos (talk) 15:08, 2 August 2020 (UTC)
    Wow, you're right! They look pretty great, I hope someone could adapt them to English Wiktionary, I have no idea how that works, but would love to be able to use them. Supevan (talk) 08:30, 3 August 2020 (UTC)

    Disabling LQT without losing archives

    Does anyone know if there's any way to remove the old LiquidThreads system from a page without losing all of the old threads? --Yair rand (talk) 05:53, 29 July 2020 (UTC)

    (If not, is it possible to move the old LT page to an archive subpage and then 'reformat' the 'main' page?) - -sche (discuss) 18:25, 31 July 2020 (UTC)

    Feedback on Quiet Quentin with {{quote-text}} templates

     
    Example screenshot

    I made a modified version of QQ that uses {{quote-book}} and {{quote-journal}} instead of raw wikitext. I've accounted for all the edge cases I could find, but there are probably more out there. Is this something anybody wants?

    You can test it out by disabling QQ in your preferences and adding the following to your /common.js:

    importScript('User:Enoshd/QQ-test.js');
    importStylesheet('User:Enoshd/QQ-test.css');
    mw.loader.load(['jquery.ui']);
    

    —Enosh (talk) 14:49, 30 July 2020 (UTC)

    @Enoshd: I have tested out the code by following the instructions you provided and it has worked well for me. Thanks for doing the work to create what is, in my mind, an improved version of Quiet Quentin. If you are taking requests, there are a few capabilities that I would enjoy being added to the gadget. —The Editor's Apprentice (talk) 22:40, 30 July 2020 (UTC)
    @The Editor's Apprentice: Glad to hear and happy to take requests. —Enosh (talk) 07:50, 31 July 2020 (UTC)
    @Enoshd: Awesome. The first thing that comes to mind for me about Quiet Quentin (or, I guess, more accurately, your modified version of it) is that the height it opens to when a search is conducted is about the height of my screen, so by default the "more results" link is off my screen. The corresponding feature that I would like is either that the last set dimensions for the window are remembered from page to page, so that I only have to change the window size once, or that the default window height is reduced. Another has to do with the fact that the ends of quotations are currently appended with &nbsp;.... The documentation pages for Template:nb... and Template:... recommend that Template:nb... be used instead to aid in the differentiation of ellipses that are present in the original text and those which are added by editors. In the context of dating quotes, it looks like Google Books provides a standardized YYYY-MM-DD format for sources, so I would appreciate month and day information also being included in the quotation code generated. Thanks again. —The Editor's Apprentice (talk) 17:20, 31 July 2020 (UTC)