Brahmi Transliteration
@Kutchkutch, Erutuon, -sche: What was the rationale for forking Module:Brah-translit to yield Module:sa-Brah-translit rather than adding a few lines at the end to convert the transliteration to IAST for the languages that use it? I thought the rationale for having Module:translit-redirect was that a page would only need to load one Brahmi transliteration module. As an example of a less elegant mechanism, Module:pi-translit (as of this morning not yet hooked into the transliteration system) invokes Module:Brah-translit and Module:si-translit and then converts their outputs to IAST. I was planning, once Module:pi-translit is hooked in, to incorporate conversion code in the Brahmi and Sinhalese modules to convert transliteration of Pali and Sanskrit to IAST. --RichardW57m (talk) 12:21, 2 June 2021 (UTC)
- @RichardW57, RichardW57m: The rationale for forking Module:Brah-translit to yield Module:sa-Brah-translit was that Module:Brah-translit wasn't working for Sanskrit as is, and I'm not proficient enough at coding to "add a few lines at the end" of Module:Brah-translit. If you know how to modify Module:Brah-translit so that it can replace Module:sa-Brah-translit at Module:translit-redirect/data without changing the current arrangement for Sanskrit (or causing errors with the languages that currently use Module:Brah-translit), then please go ahead with your planning (notifying @SodhakSH). Kutchkutch (talk) 09:54, 3 June 2021 (UTC)
- I know nothing/very little about module coding to be able to do this. 🔥ಶಬ್ದಶೋಧಕ🔥 10:50, 3 June 2021 (UTC)
- I can make the changes. The hard bit will be extending the test cases, but I think I've worked out the key undocumented bit of the testcase utility Module:transliteration module testcases, namely that it will accept multiple languages as well as multiple scripts in a single call. Did you, @Kutchkutch, have any Sanskrit test cases in mind? I'm planning to use my Brahmi testcases from Module:pi-translit/testcases.
- @Kutchkutch, Erutuon, -sche: I've now made the changes, and while I was at it, added 6 more scripts to the list of those transliterated for Sanskrit: Burmese, Khmer, Lao, Sinhalese, Tai Tham and Thai. --RichardW57 (talk) 22:02, 5 June 2021 (UTC)
- I made heavy weather of extending Module:Brah-translit/testcases, which uses Module:transliteration module testcases. I ended up extending it to Module:per lang transliteration module testcases by allowing a different language in each test case and adding an option |func_with_link= to allow the user to provide a function that does all the formatting of the 'example' column in the outputs. That seems a lot of heavy work compared to the seemingly light coding in Module:si-translit/testcases, which also handles two different transliteration standards for one script. My question now is whether I should integrate the enhancements back into Module:transliteration module testcases, or just convert the testcase module to work like the Sinhalese one. I don't want the new support module to hang around for long. --RichardW57 (talk) 22:02, 5 June 2021 (UTC)
- @RichardW57: If the enhancements don't necessitate any changes in the testcase modules that currently use Module:transliteration module testcases, I'd say go ahead. — Eru·tuon 22:35, 5 June 2021 (UTC)
- @Erutuon: Incorporated, tested and documented. --RichardW57 (talk) 12:23, 6 June 2021 (UTC)
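The idea raised at the top of this thread, running one shared transliteration pass and then converting its output to IAST only for the languages that want it, can be sketched roughly as follows. This is Python rather than the Lua of Module:Brah-translit, and every name in it is hypothetical: `base_translit` is a stub, and the `IAST_LANGS` set and the two substitutions (ISO-style ṁ and r̥ to IAST ṃ and ṛ) are assumptions about what a shared module might emit, not a description of the real code.

```python
import unicodedata

# Hypothetical: language codes whose output should be converted to IAST.
IAST_LANGS = {"sa", "pi"}

# Assumed differences between an ISO 15919-style output and IAST.
ISO_TO_IAST = {
    "\u1e41": "\u1e43",   # m with dot above -> m with dot below (anusvara)
    "r\u0325": "\u1e5b",  # r + combining ring below -> r with dot below
}

def base_translit(text: str) -> str:
    """Stand-in for the shared transliteration pass (identity here)."""
    return text

def transliterate(text: str, lang: str) -> str:
    """Run the shared pass, then post-process only for IAST languages."""
    out = unicodedata.normalize("NFC", base_translit(text))
    if lang in IAST_LANGS:
        for source, target in ISO_TO_IAST.items():
            out = out.replace(source, target)
        # Recompose in case a replacement now combines with a following mark.
        out = unicodedata.normalize("NFC", out)
    return out
```

The point of the design is that the per-language step is a few lines after the common pass, so a page still loads only one module per script.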
Suffix Index
@Benwing2, Daniel Carrero, DCDuring, Dixtosa, Erutuon, Rua, Vriullop: https://dixtosa.toolforge.org has stopped working with the error "Connection failed: Unknown database 'enwiktionary_p'". Is there a way to fix this or a workaround? Kutchkutch (talk) 10:02, 3 June 2021 (UTC)
- I looked at the code for toolforge:dixtosa through my Toolforge account. Yeah, it looks like it can be fixed. The tool uses a Toolforge database and the database setup has changed as described here, breaking the old code. The new directions are at wikitech:Help:Toolforge/Database#PHP (using MySQLi). There is another change that I would make to the code, but User:Dixtosa would have to give me access. — Eru·tuon 18:18, 3 June 2021 (UTC)
- Added you as a maintainer. Dixtosa (talk) 21:39, 4 June 2021 (UTC)
- @Dixtosa, Kutchkutch: Thanks! After a bit of wrestling with PHP, I've gotten it working. Now it's also less vulnerable to SQL injection and you can search for words ending in % if you want. — Eru·tuon 01:24, 5 June 2021 (UTC)
- @Dixtosa, Erutuon: Thanks for fixing and improving the tool! Kutchkutch (talk) 09:09, 5 June 2021 (UTC)
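For readers wondering how a suffix search can be both injection-safe and able to match a literal %, here is a minimal sketch. The real tool is PHP using MySQLi against a Toolforge database replica; this version uses Python's sqlite3 purely for illustration, and the `page` table, `title` column and function names are assumptions, not the tool's schema.

```python
import sqlite3

def escape_like(text: str) -> str:
    """Escape LIKE wildcards so they match literally (backslash as escape)."""
    return (text.replace("\\", "\\\\")
                .replace("%", "\\%")
                .replace("_", "\\_"))

def words_ending_in(conn: sqlite3.Connection, suffix: str) -> list:
    """Return titles ending in `suffix`, treating it as plain text."""
    pattern = "%" + escape_like(suffix)
    # The user-supplied value is bound as a parameter, never spliced into SQL.
    rows = conn.execute(
        "SELECT title FROM page WHERE title LIKE ? ESCAPE '\\' ORDER BY title",
        (pattern,),
    )
    return [row[0] for row in rows]
```

Binding the pattern as a parameter addresses injection; escaping the wildcards is what lets a search for % or _ match those characters themselves.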
Looking up templates and their attributes
I see a template use that has an attribute whose meaning is not obvious to me from the name. How do I look up the attributes that a particular template supports and what they mean? - Dough34 (talk) 14:21, 3 June 2021 (UTC)
- We call them parameters. The template ought to have documentation explaining what each of its parameters does, but alas very many templates here are lacking such documentation. Sometimes if you click "Edit" and view the source code you can figure it out: parameters are always enclosed in triple brackets. However, sometimes it still isn't clear, and of course a whole lot of our templates are dependent upon Modules, so viewing the template's source code is unhelpful. In that case, there's often nothing for it but either (1) to see what entries use the template (using Special:WhatLinksHere) and see what that parameter does in the entries where it's used, or (2) to ask the author of the template (if they're still around) or someone else. Which template and which parameter are you wondering about? —Mahāgaja · talk 19:11, 3 June 2021 (UTC)
- The template plural of has a parameter nocat. How do I find the source code for plural of, or any other template? I'm looking at the page dead woman walking and clicked on the green undefined plural link. - Dough34 (talk) 19:34, 3 June 2021 (UTC)
- @Dough34: |nocat= is a very common parameter used in a large number of templates. Setting |nocat=1 prevents the template from adding the term to the category that it otherwise would add the entry to. However, {{plural of}} doesn't actually categorize entries it's added to, so in that case, |nocat= isn't doing anything at all, which is presumably why the documentation for {{plural of}} doesn't mention it. The way to find the source code for a template is to click the Edit button. —Mahāgaja · talk 19:51, 3 June 2021 (UTC)
- @Mahagaja: As the |nocat= parameter does not do anything, I requested it to not be added to plural forms automatically any more, which is yet to be done. J3133 (talk) 21:45, 4 June 2021 (UTC)
- I will ping @DTLHS, who has edited the module. J3133 (talk) 21:56, 4 June 2021 (UTC)
- I removed it myself (Special:Diff/62693241). J3133 (talk) 21:47, 7 June 2021 (UTC)
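Since parameters in a template's wikitext are written in triple braces, a quick way to list the ones a template reads directly is to scan its source for them. The following is a small sketch under that assumption; `template_parameters` is a made-up helper name, and it will miss anything handled inside a module that the template invokes.

```python
import re

# Matches {{{name}}} and {{{name|default}}}, capturing only the name.
PARAM_RE = re.compile(r"\{\{\{\s*([^{}|]+?)\s*(?=\||\}\}\})")

def template_parameters(source: str) -> list:
    """Return parameter names in order of first appearance, without repeats."""
    return list(dict.fromkeys(PARAM_RE.findall(source)))
```

For example, pasting the text shown under "Edit" for a template into `template_parameters` gives a list of names to look for in the documentation or in entries found via Special:WhatLinksHere.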
Edit request at MediaWiki:Newarticletext
Kindly change <center> to <div class="center"> and </center> to </div> to fix the obsolete tag. --Minorax (talk) 10:31, 5 June 2021 (UTC)
- Done! — Eru·tuon 20:39, 5 June 2021 (UTC)
- @Erutuon Can you do the same for MediaWiki:Noarticletext? Thanks in advance. --Minorax (talk) 11:11, 6 June 2021 (UTC)
- @Minorax: Yep, also done. And I think that's all the center tags in the MediaWiki namespace. — Eru·tuon 21:24, 6 June 2021 (UTC)
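For anyone doing the same clean-up on a batch of pages, the substitution requested above can be sketched as a pair of regular-expression replacements. This is an illustrative helper (the name `replace_center_tags` is invented here), not the way the MediaWiki messages were actually edited.

```python
import re

def replace_center_tags(markup: str) -> str:
    """Swap the obsolete <center> element for <div class="center">."""
    markup = re.sub(r"<center\s*>", '<div class="center">', markup, flags=re.IGNORECASE)
    return re.sub(r"</center\s*>", "</div>", markup, flags=re.IGNORECASE)
```

It only handles bare `<center>` tags without attributes, which is the case described in the request.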
Links from categories
Hello, happy Sunday from el.wikt. I see that words in your categories link precisely to their language section. Is this done with a module, or some other trick? Thank you ‑‑Sarri.greek ♫ | 05:26, 6 June 2021 (UTC)
- @Sarri.greek: Hello! What we do is insert the HTML code created by require "Module:utilities".catfix(language_object, optional_script_code) into category pages, and then run MediaWiki:Gadget-catfix.js to transform the category links. The gadget is installed here so that it is loaded efficiently by the server. The catfix HTML is inserted either by calling the module function in the module functions that generate category descriptions (for instance in many categories that are added by {{autocat}}) or by using {{catfix}} in a page. So it's a combination of a module and a JavaScript gadget. — Eru·tuon 06:16, 6 June 2021 (UTC)
- O! @Erutuon, thank you. You make it sound sooo easy :) I'll study it. ‑‑Sarri.greek ♫ | 06:18, 6 June 2021 (UTC)
- Ο :( ‑‑Sarri.greek ♫ | 06:29, 6 June 2021 (UTC)
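To make the end result concrete: the effect described above is that each category link points at the entry's language section rather than the top of the page. The sketch below shows only that final link shape, in Python; it is not the gadget's JavaScript, and `section_link` and the `/wiki/` path prefix are assumptions for illustration.

```python
from urllib.parse import quote

def section_link(title: str, language: str) -> str:
    """Build a link to the `language` section of the entry `title`."""
    page = quote(title.replace(" ", "_"))   # percent-encode the page title
    anchor = language.replace(" ", "_")     # section anchors use underscores
    return f"/wiki/{page}#{anchor}"
```

In the setup described in the reply, the module supplies the language (and optionally the script) as markup on the category page, and the gadget uses it to rewrite every listed link in this way.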
Automatic categorization of irregular plurals
Category:English irregular plurals is small considering Category:English nouns with irregular plurals has more than 14000 entries: it is not feasible for someone to categorize these manually. J3133 (talk) 19:43, 6 June 2021 (UTC)
Edit request: Module:bo-pron
Could

    if args["zeku"] or args["labrang"] then
        textHide = textHide .. "\n* [[w:Amdo Tibetan|Amdo]]"
        if args["zeku"] then
            textShow = textShow .. "\n* [[w:Zêkog County|Zêkog]]: " .. ipaFormat(args["zeku"], true)
            textHide = textHide .. "\n** (''[[w:Zêkog County|Zêkog]]'') " .. ipaFormat(args["zeku"])
        end
        if args["labrang"] then
            textShow = textShow .. "\n* [[w:Xiahe County|Bla-Brang]]: " .. ipaFormat(args["labrang"], true)
            textHide = textHide .. "\n** (''[[w:Xiahe County|Bla-Brang]]'') " .. ipaFormat(args["labrang"])
        end
    end

be changed to

    if args["zeku"] or args["labrang"] or args["arik"] then
        textHide = textHide .. "\n* [[w:Amdo Tibetan|Amdo]]"
        if args["zeku"] then
            textShow = textShow .. "\n* [[w:Zêkog County|Zêkog]]: " .. ipaFormat(args["zeku"], true)
            textHide = textHide .. "\n** (''[[w:Zêkog County|Zêkog]]'') " .. ipaFormat(args["zeku"])
        end
        if args["labrang"] then
            textShow = textShow .. "\n* [[w:Xiahe County|Bla-Brang]]: " .. ipaFormat(args["labrang"], true)
            textHide = textHide .. "\n** (''[[w:Xiahe County|Bla-Brang]]'') " .. ipaFormat(args["labrang"])
        end
        if args["arik"] then
            textShow = textShow .. "\n* [[w:Qilian County|Arik]]: " .. ipaFormat(args["arik"], true)
            textHide = textHide .. "\n** (''[[w:Qilian County|Arik]]'') " .. ipaFormat(args["arik"])
        end
    end

to add the Arik dialect of Amdo Tibetan? --沈澄心✉ 12:53, 9 June 2021 (UTC)
- @沈澄心: Done. — Fenakhay (تكلم معاي · ما ساهمت) 00:05, 10 June 2021 (UTC)
Do citations indicating the existence of a term or phrase that are for an inflected form go under the inflected form, or the base form?
(Sorry if this isn't the right area to bring this up in. I don't really bring this kind of matter up often, so I am not quite certain where it best belongs.)
I recently created entries for all bedlam breaks loose and all bedlam broke loose. Since all bedlam breaks loose is the base form, I put the citations under that form. However, the citations themselves are actually for all bedlam broke loose, not the base form.
Is it proper to put them under the base form, or is it preferred for them to be put under the inflected form? Tharthan (talk) 22:16, 9 June 2021 (UTC)
- It's my understanding that you put them under the base form unless there's a reason to put them elsewhere. (I believe the correct place to go for this answer is Wiktionary:Information_desk.) --Geographyinitiative (talk) 22:40, 9 June 2021 (UTC)
- I agree. If there's nothing unusual or unexpected about the inflected form, put the cite at the lemma form. But if it's a rare/dialectal/nonstandard inflected form whose very existence some people might doubt, put the cite at the inflected form. See gots for an example. —Mahāgaja · talk 23:16, 9 June 2021 (UTC)
Why do I view modules like wikitext
Have I clicked something wrong? At the moment I view module sources as wikitext. Does it happen to anybody else? ‑‑Sarri.greek ♫ | 08:01, 10 June 2021 (UTC)
- @Sarri.greek: It's happening for everyone I think. User:Justinrleung pointed it out to me in Discord. It's a JavaScript bug. I posted about it in phab:T284716. It looks really easy to fix (just change a variable name), so hopefully they'll fix it soon. — Eru·tuon 08:18, 10 June 2021 (UTC)
- Fixed! Thank you @Erutuon for taking care of this! ‑‑Sarri.greek ♫ | 10:49, 10 June 2021 (UTC)
Kharosthi font support
Is there a font for the Kharosthi script that at least renders the text in a sensible way? According to the Wikipedia support page, the recommended font is Segoe UI Historic, but that seems to be specific to Windows 10.
On my system the word 𐨠𐨂𐨬 (thuva) (wiki template for the text: {{m|pgd|𐨠𐨂𐨬|sc=Khar}}) displays as if it were "thavu" in the page (the vowel "u" as a little circle thingy on the bottom attaches to the hook-shaped letter for "V", rather than to the cross-shaped letter for "Th"), but it shows up fine in the editing area.
Is there a recommended font/setting/combo solution, preferably without switching to Windows 10? --Frigoris (talk) 17:16, 11 June 2021 (UTC)
- @Frigoris: There is a Noto font for it that you could install; search "Kharoshthi" in the Noto fonts page. (There might also be a full Noto fonts package for your operating system if you want to wipe out many non-displaying characters at once; Ubuntu has one for instance.) Then you might have to add some CSS to your common.css page to force this site to use the font: .Khar { font-family: 'Noto Sans Kharoshthi'; } — Eru·tuon 18:24, 11 June 2021 (UTC)
- @Erutuon, thank you! I think I have the Noto font installed, but I never knew you could do this CSS wizardry to apply the font everywhere. According to the Wikipedia page, it's not as good as Segoe, but let me see if it's better than the mojibake soup I'm seeing now... --Frigoris (talk) 09:57, 12 June 2021 (UTC)
Kharosthi transliteration with certain combining modifiers
Currently the transliteration module seems unable to deal with Kharosthi modifiers such as "bar above" (U+10A38) or "cauda" (U+10A39). The following texts generate zalgo text in transliteration:
It seems the Kharosthi combining modifiers are not transcribed but are just copied as-is into the Roman-alphabet transliteration, where they combine with whatever character precedes them. In the second case the "cauda" even combines with the closing parenthesis.
Unfortunately I don't know any Lua to improve the module. Could @AryamanA, Bhagadatta help? Thanks! --Frigoris (talk) 11:22, 14 June 2021 (UTC)
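One way to confirm the diagnosis above, that Kharosthi combining signs are passing through into the Latin output untouched, is to scan a transliteration for nonspacing marks from the Kharoshthi block. The following is a rough Python check, not part of the transliteration module; `leftover_marks` is a name invented for this sketch.

```python
import unicodedata

KHAROSHTHI_BLOCK = range(0x10A00, 0x10A60)  # U+10A00..U+10A5F

def leftover_marks(translit: str) -> list:
    """List (index, code point, name) for Kharoshthi marks left in `translit`."""
    found = []
    for index, char in enumerate(translit):
        is_mark = unicodedata.category(char) == "Mn"
        if is_mark and ord(char) in KHAROSHTHI_BLOCK:
            name = unicodedata.name(char, "<unnamed>")
            found.append((index, f"U+{ord(char):04X}", name))
    return found
```

An empty result means no Kharosthi modifier survived; anything else points at the exact position where a sign such as U+10A38 or U+10A39 was copied instead of transliterated.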
- Do you have a reference for how these words should be transliterated? Is working out what should be done, as opposed to how to do it, part of the problem? --RichardW57m (talk) 12:02, 14 June 2021 (UTC)
- @Frigoris: The cauda mark added to a character is transliterated with an acute accent. I have created entries with cauda in the spelling (e.g.: Niya Prakrit 𐨀𐨂𐨡𐨒𐨹 (udaǵa)) which currently use manual translit.
- If you review my antics at Module:Khar-translit, you'll find that I did try to address this issue a week ago. I added the combining cauda and added a combining acute accent for its transliterated counterpart - I did manage to get it to work, but soon I realized that the functionality of the virama (𐨿) started getting affected. It would no longer transliterate the virama and I found that there were translit errors in Niya and Gandhari entries plus possible module errors in pages that used Module:inc-ash/dial (because inc-ash also uses Kharoshthi). I had to roll back my own edits and reinstate a previous version of the page. I probably should have asked someone for help (e.g. @Erutuon) but it didn't cross my mind then. -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴) 13:26, 14 June 2021 (UTC)
- @RichardW57m: Does this Unicode document help? Proposal for Kharoṣṭhī script. The "bar above" transliterates as a macron above the corresponding consonant letter in the Roman alphabet, and the "cauda" transliterates as an acute accent for any consonant that is not s or ś; for these latter two it is transliterated as s̱ and ś̱, with macron below the letter. However there are complications:
- The conjunct consonant in 𐨐𐨿𐨮 = (in the sequence of bytes rather than display sequence) Khar. letter KA (U+10A10) + Kharosthi Virama (U+10A3F) + Khar. letter SSA (U+10A2E) -> transliteration kṣa. According to the doc, the Kharosthi "bar above" can modify this consonantal conjunct as a whole. So in that case, the transliteration should probably look like k͞ṣ = k + Unicode Combining Double Macron (U+035E) + ṣ. For the transliteration, at least this "letter + combining double macron + letter" sequence is what renders correctly on my system. Well, in summary, the sequence of Unicode characters
- Khar. letter KA (U+10A10) + Kharosthi Virama (U+10A3F) + Khar. letter SSA (U+10A2E) + Khar. sign BAR ABOVE (U+10A38)
- should transliterate to k͞ṣ followed by the correct vowel. This appears to be the only exceptional case with "bar above" based on my reading. A very complicated example is given on Page 4 of the Unicode document: there's a conjunct rj with "bar above", but since the "bar above" conceptually modifies the Khar. letter JA instead of the whole conjunct, in the translit the macron is only placed above j.
- Macron above ś or other letters with the acute accent (possibly caused by rendering a "cauda") - should it be like ś̄ or s̄́?
- On typographical grounds, I'd go for the latter, which is common enough to often render well. For ś with cauda, I'd be tempted to use ḉ or go Hungarian with a double acute. Test cases? What's current practice? --RichardW57m (talk) 12:47, 16 June 2021 (UTC)
- @RichardW57m: The problem is that ḉ suggests something to do with c, which is not the case. Presumably the transliteration should convey the sense "this is based on ś, with modification", I think. --Frigoris (talk) 14:04, 18 June 2021 (UTC)
- The writing ç is good old pre-IAST for ś. I haven't tracked down what ḉ is used for (and don't say a high-pitched syllabic sibilant).--RichardW57m (talk) 15:19, 18 June 2021 (UTC)
- The Khar. "double ring below" -- this seems to be transliterated just as "double ring below", always combined to the vowel letter in the translit. I truly don't know how they mean it when it lists its transcription just as double ring below...
- I don't see a problem. It just transliterates as U+035A COMBINING DOUBLE RING BELOW. In the example, it applies to an implicit vowel, which is par for the course. (One gets some rich combinations of marks in the eastern foothills of the Himalayas.)--RichardW57m (talk) 12:47, 16 June 2021 (UTC)
- @RichardW57m: Thanks, I never knew that one existed. --Frigoris (talk) 14:04, 18 June 2021 (UTC)
- Borrowed from Kharoshthi in the 1920's. Only added to Unicode at the same time (or thereabouts) as Kharoshthi. --RichardW57m (talk) 15:19, 18 June 2021 (UTC)
- The Khar. anusvara: In the doc it follows the IAST style and transliterates it as ṃ. Then it says the Khar. "dot below" combines to the Khar. character MA. This is obviously a conflict, which can be solved by using the ISO-style ṁ for the transliteration of the anusvara.
- There may not be a conflict. They may occur in different contexts. --RichardW57m (talk) 11:39, 15 June 2021 (UTC)
- Anyway, big thank you to all involved. Kharosthi seems the closest to an actual zalgo script among real-world scripts.. --Frigoris (talk) 15:28, 14 June 2021 (UTC)
- The document, combined with the Unicode Character Database, certainly looks as though it should be enough. However, it would be good to have far more testcases - adding them to Module:Khar-translit/testcases should just be a case of monkey see, monkey do. There are a lot more cases for Module:Brah-translit/testcases, and no serious complications were expected there. --RichardW57 (talk) 19:52, 14 June 2021 (UTC)
- After studying the USE overrides for the script, I now know how to code the transliteration of renderable words, using the elements consonant+nukta+vowels, and intend to do so tonight. (I include virama as a vowel.) As usual, visarga and anusvara can be dealt with afterwards. As Wiktionary pages are stored in form NFC, that simple element expands to consonant+cauda?+(ZWJ?+virama)?+dot_below?+double_ring_below?+bar_above?+vowel_complex?. Only one of 'ZWJ?+virama', 'dot_below' and 'vowel_complex' will be non-empty. This may be overkill - do we get nukta + virama? Have I spotted an earlier virama-related foul-up by Mark Davis? We need the ZWJ for the sequence 𐨫𐨿𐨩 transliterated as 'lý'. I've noticed Allon being published with kṣ̄ rather than k͞ṣ - have you actually seen the latter? --RichardW57m (talk) 15:19, 18 June 2021 (UTC)
- @Frigoris, AryamanA The Kharoshthi side of transliteration now appears to be working. (In particular, the opening paragraph here has been dezalgoed.) We haven't any torture tests for the typographic niceties. It can fail on a legitimate vowel sequence - up to two short vowels, a length mark and a double ring below. It will paste up a visual warning for an unsupported legitimate vowel sequence when one previews the affected page. The transliterations will need to be added if they're not already there. (It currently supports a, e, i, o, u, ā, ai, ī, au, ū, a͚ and u͚.) I haven't investigated the USE constraints on the combinations of short vowels, and I don't know what combinations occur. --RichardW57 (talk) 22:34, 18 June 2021 (UTC)
- @RichardW57: hey, that was amazing!
- As for anusvara vs "M with dot-below", are we sure it's not a conflict? I can imagine it being problematic for reverse transliteration at least, if that's going to get implemented. For ś with bar-above (macron), it seems the Gandhari.org transcription scheme simply tacks a combining macron upon ś. OTOH their transcription is different from the Unicode doc's in other ways, e.g. they use "d with combining-macron-below" for Khar. "D with cauda", while the doc suggests d with acute accent. --Frigoris (talk) 07:59, 19 June 2021 (UTC)
- @RichardW57:, also, I haven't seen the "double macron" over kṣ, but that was how I read the Unicode doc. My interpretation could be inaccurate. --Frigoris (talk) 08:08, 19 June 2021 (UTC)
- @Frigoris, AryamanA: It seems that it is time to start populating Appendix:Kharoshthi_script, which is currently a red link from Category:Kharoshthi script. I'm currently looking at the transliteration in Andrew Glass's 2000 MA thesis. I think we may also have some encoding issues. Is the conjunct V.HA distinct from the consonant VHA (new at Unicode 11.0)? They're not canonically equivalent. We have the conjunct at 𐨀𐨂𐨮𐨬𐨿𐨱 (uṣavha, “bull”). I think it's time to start cataloguing the consonant-nukta combinations and their transliterations. I've a feeling the 'h' of aspirates shouldn't take the accents marking the nuktas, but that issue may not arise in practice. --RichardW57 (talk) 10:06, 19 June 2021 (UTC)
- I read the proposal's remark about nuktas applying to a whole conjunct as referring to the Kharoshthi script, rather than the transliteration. It read to me as primarily rendering advice. Additionally:
- It may not be easy to apply a nukta to part of a conjunct rather than a whole conjunct.
- Some conjuncts were probably perceived as unitary characters.
- So far, in transliteration, I've only seen the diacritics on the final element of conjuncts. I'm still reading the MA thesis. --RichardW57 (talk) 10:06, 19 June 2021 (UTC)
- I think I may have to strip out my cunning conversion of <acute, macron> to <macron, acute>. The thesis seems to consistently write ɡ́̄ rather than ɡ̄́. --RichardW57 (talk) 10:06, 19 June 2021 (UTC).
- As precedent, note that IAST dropped the distinction between the syllabic consonant and the retroflex lateral. --RichardW57 (talk) 10:06, 19 June 2021 (UTC)
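The convention quoted earlier in this thread from the Unicode proposal (bar above becomes a macron; cauda becomes an acute accent, except on s and ś, where it becomes a macron below) can be sketched as a small mapping step. This is a Python illustration of that one rule, not the logic of Module:Khar-translit: `romanize_marks` is a made-up name, it takes an already romanized consonant, and it ignores conjuncts, the double ring below, the dot below and the open question above about the relative order of acute and macron.

```python
import unicodedata

BAR_ABOVE = "\U00010A38"   # KHAROSHTHI SIGN BAR ABOVE
CAUDA = "\U00010A39"       # KHAROSHTHI SIGN CAUDA

ACUTE = "\u0301"           # combining acute accent
MACRON = "\u0304"          # combining macron
MACRON_BELOW = "\u0331"    # combining macron below

SIBILANTS = {"s", "\u015b"}  # s and s-acute take macron below for cauda

def romanize_marks(consonant: str, marks: str) -> str:
    """Attach Latin diacritics for the Kharosthi `marks` to `consonant`."""
    base = unicodedata.normalize("NFC", consonant)
    result = base
    for mark in marks:
        if mark == CAUDA:
            result += MACRON_BELOW if base in SIBILANTS else ACUTE
        elif mark == BAR_ABOVE:
            result += MACRON
        else:
            raise ValueError(f"unsupported mark U+{ord(mark):04X}")
    # NFC matches how wiki pages are stored, as noted in the thread.
    return unicodedata.normalize("NFC", result)
```

With this rule, g plus cauda yields ǵ, matching the manual transliteration udaǵa cited above, and j plus bar above yields j with a macron.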
Pali Transliteration Issues
Now that we finally have automatic Pali transliteration available, I am now hitting problems because of the range of writing systems seen in the wild. Specifically, issues are arising because of the following features:
- There are two basic writing systems in the Thai and Lao scripts, one of which uses implicit vowels and the other which writes vowels explicitly. Occasionally a valid sequence of characters could be in either system, and the reading of the word depends upon which one it is.
- The Roman script is one of the scripts used for Pali, and it has spelling rules. For example, one cannot use niggahita (=anusvara) for a homorganic nasal, so if the non-Roman script does, the transliterated form is not a correct Roman script spelling.
- Not all the writing systems can make the distinctions normally made in the writing of Pali. This is most significantly true of most Lao script writing, but I strongly suspect it may also be true of some Tai Tham script writing systems. Consequently, in some regions, two words may be written and pronounced the same, even though the correctly spelt Roman script spellings of the two words are distinct. I presume that words that are written the same and pronounced the same should be transliterated the same.
Consequently, the automatic transliteration (which would be correct for another writing system), correct transliteration (dependent on knowing the writing system) and Roman script equivalent may all be different. This is rare, but happens. --RichardW57m (talk) 13:02, 14 June 2021 (UTC)
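The second point in the list above, that a transliteration with niggahita before a stop is not a valid Roman-script spelling, is the easiest of the three to illustrate. The sketch below derives a Roman-script equivalent from a transliteration by replacing ṃ with the nasal of the following stop's class. It is a Python illustration only; `roman_equivalent` is a hypothetical name, and the assumption that the rule applies uniformly before all five stop classes is mine, not something the templates discussed here are known to implement.

```python
import re
import unicodedata

NIGGAHITA = "\u1e43"  # m with dot below

# (first letters of the stops in a class, nasal of that class)
HOMORGANIC = [
    ("kg", "\u1e45"),            # velars -> n with dot above
    ("cj", "\u00f1"),            # palatals -> n with tilde
    ("\u1e6d\u1e0d", "\u1e47"),  # retroflexes -> n with dot below
    ("td", "n"),                 # dentals
    ("pb", "m"),                 # labials
]

def roman_equivalent(translit: str) -> str:
    """Replace niggahita before a stop with the homorganic nasal."""
    text = unicodedata.normalize("NFC", translit)
    for stops, nasal in HOMORGANIC:
        # Lookahead keeps the stop itself; aspirates start with the same letter.
        text = re.sub(NIGGAHITA + "(?=[" + stops + "])", nasal, text)
    return text
```

So a transliteration saṃgha would correspond to the Roman-script spelling saṅgha, while word-final niggahita and niggahita before y are left alone. This is exactly the kind of case where the transliteration and the Roman-script link target differ.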
I have been in the habit of using {{link}} and {{inflection of}} to simultaneously link to the word or stem both in the relevant non-Roman writing system and its equivalent in the IAST-based Roman script origin system. I have been using the transliteration output as the link. Now, if it were not for the issues above, setting the 'link_tr' property for Pali would give me exactly what I want. However, this would work incorrigibly badly when the correct transliteration and Roman script equivalent were different. Am I missing a better way of simultaneously linking to non-Roman and Roman script forms of a word or stem? Would someone (e.g. @Benwing2) be open to providing manual control of the link_tr behaviour? --RichardW57m (talk) 13:02, 14 June 2021 (UTC)
Now, I can get, and before automatic transliteration had been getting, the equivalent of manual control by passing the manual transliteration in as the result of {{link}}. However, this never matches the automatic transliteration, and, before I started implementing a workaround, the category of pages with mismatching manual and automatic transliterations had reached the size of 360. I have been experimentally reducing these reports by using the as-yet undocumented template {{pi-nr-inflection of}}, which has some fairly dirty behaviour so that it can invoke {{inflection of}}. (Essentially, it has parameters |tr= to override transliteration and |eqv= to override the equivalent form in Roman script.) If I follow this approach, I will need a clutch of substitutes for other members of cat:Form-of templates and even {{link}}. --RichardW57m (talk) 13:02, 14 June 2021 (UTC)
I have now created that clutch of replacements:
- {{pi-nr-inflection of}} for {{inflection of}}
- {{pi-link}} for {{link}}
- {{pi-mention}} for {{mention}}
- {{pi-alternative form of}} for {{alternative form of}}
- {{pi-alternative spelling of}} for {{alternative spelling of}}
- {{pi-combining form of}} for {{combining form of}}
- {{pi-form of}} for {{form of}}
- {{pi-misspelling of}} for {{misspelling of}}
So far, I have documented the first four, except for the internal support template {{pi-ml}}. --RichardW57 (talk) 01:26, 11 July 2021 (UTC)
Merriam-Webster Online template needs documentation, dating
Template:R:Merriam-Webster Online produces the text "(Please provide a date or year)". I cannot see, though, any way to add the date or year to uses of the template. The documentation only mentions entry=, url=, and nodot=. There also appears to be a 3=, which I guess is for including quotations from the MWO entry. The documentation needs to be updated to describe how this template actually works currently. Perhaps someone wiser than me could also figure out what is causing that "Please provide date" warning. Cnilep (talk) 01:04, 15 June 2021 (UTC)
Adding new pronunciation variants to Module:la-pronunc
Hi, I've been wanting to add some new traditional pronunciations to this module as well as make (see here) what's currently nebulously called "Vulgar" into a default concrete alternative pronunciation called "2nd CE Campanian" while also possibly having a specialized reconstructed proto-Romance (ideally just phonemic), but I'm coding-disabled. I know how to make the current Vulgar be displayed by default, but I'm not sure how to split an existing pronunciation into two, one of which would have to be enabled with the option vul=1 and automatically in the Reconstructed namespace, and the other would be default like current Classical. With Ecclesiastical, I just want to double, triple etc. the current automatic Ecclesiastical output - ideally under a drop-down menu like Ancient Greek (λόγος) and with the general title "Traditional". Starting at line 963 the module generates phonemic and phonetic transcriptions, and the Traditional ones need to output both in a new line, just like all three currently existing ones do.
Additionally, I'd like to be able to add automatic variant outputs to one variety, as with the different syllabification at petra or different pronunciations of y at Syrus without having to manually specify it (alternative syllabification currently requires manual specification as pet;ra). The second case does probably need qualifiers such as "older, hellenizing" instead of just being given inside the same line, but I'm not sure how to implement this either way. Since I'm not likely to suddenly evolve an understanding of Lua, I have to ask for some pointers if not outright for someone to implement it for me. Can I steal the code from somewhere? Which parts of the module do I need to copy-paste and what do I want to keep track of? Mentioning @Benwing, Urszag, J3133, Erutuon, JohnC5 as having edited the module recently. I figured here is more visible than at the module's discussion page. Brutal Russian (talk) 15:03, 18 June 2021 (UTC)
Request for reference template: Franklin Edgerton's Buddhist Hybrid Sanskrit Dictionary
I feel that a reference template for Franklin Edgerton's Buddhist Hybrid Sanskrit Dictionary would be valuable for Sanskrit entries and perhaps also for languages that borrow heavily from BHS, such as Chinese. The dictionary is available from U. Köln's Sanskrit Lexicon website, which also hosts the {{R:sa:MW}} dictionary, among others. Example PDF page output.
The BHS dictionary template would serve a purpose similar to that of {{R:sa:MW}}, by showing the headword and linking to the PDF server by page number. I can see several parameters in the MW template that would be useful for this, too, such as the ones for resolving page/column.
The problem is that I don't have the skills with wiki templates and their testing & maintenance, even if I presume there's a lot to re-use from the MW template. I appreciate your help! Thank you! --Frigoris (talk) 15:35, 19 June 2021 (UTC)
- Also, if I understand it correctly, the digital edition hosted by UKöln is licensed under Creative Commons by-nc-sa 3.0, according to this XML document.
Experience on mobile
Carried over from [[Wiktionary:Grease_pit/2021/March#Wiktionary:Information_desk/2021/March#Always_minimize_all_sections_(mobile_version)]]:
We have reader complaints that mobile pages are always expanded, which makes the pages hard to read. This becomes a serious problem on LongPages with a hundred entries, especially as the TOC does not appear in mobile view.
I have verified with mobile view that this is the case on WT. Section headings always appear expanded.
I also went to the other WikiMedia projects and their section headers are minimized (collapsed).
Does anyone know why the behavior differs and how to fix it? Do we need to call in help from meta:? 119.56.97.84 05:55, 30 March 2021 (UTC)
- I will go look for a meta developer to see if I can get some opinions. 119.56.103.124 16:58, 31 March 2021 (UTC)
- This is a global setting active on all Wiktionaries (but not on other Wikimedia projects). It was discussed in phab:T63447, and while I’m also embarrassed by this setting when I read Wiktionary on mobile (as I do a lot), I also understand the reasoning there. —Tacsipacsi (talk) 19:46, 1 April 2021 (UTC)
From [[Wiktionary:Beer_parlour/2021/April#collapsed/minimized_language_headers]]:
I am bringing this up again for our attention. On mobile pages, language headers are always expanded, which makes the pages very long to scroll through.
A 2014 change in Phabricator was the cause of Wiktionary headers always being expanded. This is different from other Wikimedia projects, which have collapsed headers that make it easy to go to the header you want. (Shared by someone from the tech community.)
I was not sure how much community agreement the change has.
One suggested solution is to set the headers to collapsed again.
A second proposal is to only collapse entries with more than, say, 5 language headers. Then, shorter pages with fewer than 5 headers will not be collapsed. This behavior is like the _TOC_ box, which only appears when there are about 5 or more language headers in desktop view.
I wonder how many of us write on mobile, but an increasing number of people use mobile view to visit Wiktionary. This means overly long pages which are difficult to read are driving away readers and potential contributors. So it is quite an important issue. 119.56.98.229 04:54, 4 April 2021 (UTC)
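The second proposal amounts to a threshold test on the number of language (level-2) headers. The Python sketch below only illustrates that rule on raw wikitext; it is not how MobileFrontend decides anything, and the function names and the default threshold of 5 are assumptions taken from the "say, 5" in the proposal.

```python
import re

# A level-2 header is a line of the form ==Name== (exactly two "=" at the start).
L2_HEADER = re.compile(r"^==[^=].*?==[ \t]*$", re.MULTILINE)


def count_language_headers(wikitext):
    """Count level-2 headers; level-3 and deeper headers are ignored."""
    return len(L2_HEADER.findall(wikitext))


def should_collapse(wikitext, threshold=5):
    """Collapse sections only when there are more than `threshold` language headers."""
    return count_language_headers(wikitext) > threshold
```

Under this rule a single-language entry stays expanded, while a page with many language sections starts out collapsed.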
- I was wondering how many pages and how many visitors are affected by this issue. We can look at the most visited pages in mobile view and imagine what they will look like on a phone/tablet. There should be a link to the most visited pages at Special:Statistics. 119.56.96.203 06:41, 8 April 2021 (UTC)
- The mobile pages have been bad on WMF sites for a long time. Many current readers and editors still use desktop for full functionality, but it is a reality that more people are accessing the web through mobile. The subpar mobile experience has become such an impediment to wikiwork that complaints are filed with User_talk:Jimbo Wales. 119.56.97.153 19:00, 19 April 2021 (UTC)
- Agreed that the mobile user experience is pretty awful. Sometimes pages load on mobile with all the L2 headers collapsed (user talk pages), which is preferable; sometimes they don't (our forum pages, like WT:TEA), and the page can quickly become unusably long. I don't understand why the behavior is different; it comes across as shoddy programming. ‑‑ Eiríkr Útlendi │Tala við mig 18:46, 20 April 2021 (UTC)
- @Eirikr: Someone from Phabricator, which is the WMF volunteer tech working group, complained that it is too much work on Wiktionary to open up the collapsed headers when there are only a few headers on a non-talk page. So all the headers are now expanded by default to meet that guy's requirement. 119.56.111.132 13:45, 20 June 2021 (UTC)
- meta:Tech/Archives/2021. See April. — This unsigned comment was added by 119.56.111.132 (talk).
- Huh. So I was right -- it is shoddy programming. What's worse, it's intentional. :(
- @Anon, thanks for the link. ‑‑ Eiríkr Útlendi │Tala við mig 19:08, 21 June 2021 (UTC)
This is on one of the biggest and oldest projects on Wikimedia.
On mobile site, the header sections which are usually collapsed on other wikis, always appear expanded on this wiki.
Our readers are complaining about this issue. Sections can be very long, and always-expanded sections are hard to read. Links to a section don't position correctly, because the link first goes to the section and then all sections expand, which undoes the positioning that happened just before.
Would like guidance on what is causing this, and what can be done to correct this behavior. 119.56.97.84 17:35, 31 March 2021 (UTC)
- Answer at wikt:Wiktionary:Grease pit/2021/March#Wiktionary:Information_desk/2021/March#Always_minimize_all_sections_(mobile_version) (I’d have appreciated it if you’d given the link, though). —Tacsipacsi (talk) 19:46, 1 April 2021 (UTC)
- @Tacsipacsi So it was a change on Phabricator. Thank you very much for getting the decision for us. And also for visiting our forum (sorry that I did not think to bring it back).
- For us, would need a look on how many pages this seriously affects (as a percentage), to decide if something needs to be done. I do note that it affected your reading wikt.
- Mobile view has been more neglected even though more people are reading on mobile. It is sad when many editors still stay with desktop view because mobile view is still difficult to use. 119.56.100.135 07:05, 2 April 2021 (UTC)
- If there is community consensus to change an existing configuration setting for a Wikimedia website, then please see Requesting wiki configuration changes for how to proceed (plus include a reference to phab:T63447). Thanks! --AKlapper (WMF) (talk) 12:17, 2 April 2021 (UTC)
From MobileFrontend user experience on Wiktionary is rough, T63447 on Phabricator:
Authored by MZMcBride, Feb 16 2014, 9:25 PM
Screenshot of https://en.m.wiktionary.org/wiki/wildcard on an iPhone, 2014-02-16
The MobileFrontend + Wiktionary user experience is pretty rough. Screenshot of https://en.m.wiktionary.org/wiki/wildcard on an iPhone attached. The page almost looks broken. There's only a single language for this entry yet it's collapsed. Owww.
Given that MobileFrontend is the default for mobile devices, I think this is fairly high priority.
I get that it's inconvenient to have an entry with only one section, and that section is collapsed.
However, that is only an inconvenience -- a mild annoyance.
When browsing to a large page, and all the sections are expanded, the page can be unusable.
Trading a mild annoyance for page unusability is a bad trade.
I don't have time to dig, and all the related Phabricator issues I can find at the moment appear to be closed:
- MobileFrontend user experience on Wiktionary is rough
- [EPIC Improve Wiktionary experience on mobile (placeholder)]
Does anyone know where we could discuss this with the back-end maintainers? Or could we fix this somehow without them? ‑‑ Eiríkr Útlendi │Tala við mig 19:17, 21 June 2021 (UTC)
Modi Transliteration
Is there any reason to keep Module:sa-Modi-translit, for Sanskrit, and Module:Modi-translit, for Old Marathi, separate? There don't seem to be any conflicts yet, and if any arise they can probably be handled by a simple language-sensitive tweak as with Brahmi. Pinging recorded editors - @SodhakSH, AryamanA, DerekWinters, Kutchkutch. Silence is consent. --RichardW57m (talk) 12:35, 22 June 2021 (UTC)
- @RichardW57, RichardW57m: The three languages that are currently defined as using the Modi script in Category:Modi script languages are Old Marathi, Sanskrit and (modern) Marathi. Since Old Marathi and Sanskrit are not modern languages, using the closest approximation to IAST using a single transliteration module is fine.
- For some reason User:DerekWinters / User:Smettems preferred transliterating the anusvara as ṁ instead of ṃ. Tulpule's dictionary of Old Marathi has attempted to differentiate instances of the anusvara as either homorganic nasal consonants or nasalisation of the preceding vowel. Perhaps this attempt at making such a differentiation could be indicated with the |ts= parameter.
- Although writing modern Marathi in the Modi script is rare and there is no coverage of it yet, the transliteration of modern Marathi in the Modi script may require a separate transliteration module to account for schwa-deletion and perhaps the phonological pronunciations of ज्ञ (dny) and ऋ (ru) depending on the outcomes of Wiktionary talk:About Hindi and Category talk:Konkani language.
- @Bhagadatta Are you able to corroborate the claim that the Modi script is/was sometimes used for Konkani, Kannada, Telugu, etc. as it says on Wikipedia? Kutchkutch (talk) 12:19, 30 June 2021 (UTC)
- @Kutchkutch: This news article talks about how experts are trying to transliterate Kannada documents written in Modi. As for Konkani, it is likely that Konkani speakers in modern day Goa and Karnataka once used the Modi script. -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴) 01:06, 4 July 2021 (UTC)
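The "simple language-sensitive tweak" mentioned at the top of this thread can be pictured as a shared base transliteration followed by a small per-language substitution table. The Python sketch below is illustrative only (the real modules are Lua); the table, the function name `finalize`, and the choice of which language receives which anusvara letter are assumptions made for the example, not a description of Module:Modi-translit or Module:sa-Modi-translit.

```python
M_DOT_ABOVE = "\u1E41"  # ṁ
M_DOT_BELOW = "\u1E43"  # ṃ

# Hypothetical per-language tweaks applied after the shared transliteration.
LANG_TWEAKS = {
    "sa": {M_DOT_ABOVE: M_DOT_BELOW},  # assumed: Sanskrit output uses ṃ
    "omr": {},                          # assumed: Old Marathi keeps the base output
}


def finalize(base_translit, lang):
    """Apply the language-specific substitutions to a base transliteration."""
    for old, new in LANG_TWEAKS.get(lang, {}).items():
        base_translit = base_translit.replace(old, new)
    return base_translit
```

A language with no entry in the table falls through unchanged, so adding modern Marathi later would not affect the existing languages until it needs rules of its own.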
the edit history at https://en.wiktionary.org/w/index.php?title=Talk:;&action=history shows edits up to 2014, but the talk page is unviewable now. it just returns "Bad title" and behaves as if viewing the main page. Sure we can use talk:semicolon but i wonder if this is a bug that can be easily fixed. thanks, —Soap— 15:19, 27 June 2021 (UTC)
There also seem to be problems with viewing the definition page ... I made an edit just now but can't get to it from my contribs page. —Soap— 15:23, 27 June 2021 (UTC)
- @Soap: See Phabricator (“Pages whose title ends with semicolon (;) are intermittently inaccessible (likely due to ATS)”). Our temporary kludge is using redirects from “﹔” (e.g., in {{punctuation}}). The other URL works (e.g., https://en.wiktionary.org/w/index.php?title=;). J3133 (talk) 15:44, 27 June 2021 (UTC)
Sorting in Pali Categories
Do we have control of the basic sorting rules in categories? Where do the basic sorting rules come from? They don't look like ICU/CLDR/DUCET to me.
If we have control, we need to sort out the sorting of:
- U+1033 MYANMAR VOWEL SIGN MON II
- U+105A MYANMAR LETTER MON NGA
- U+105B MYANMAR LETTER MON JHA
so that they sort near the corresponding 'Burmese' Burmese script characters. MON NGA does occur in Pali, even if it shouldn't. I think we should also accommodate
- U+105E MYANMAR CONSONANT SIGN MON MEDIAL NA
- U+105F MYANMAR CONSONANT SIGN MON MEDIAL MA
- U+1060 MYANMAR CONSONANT SIGN MON MEDIAL LA
just in case. I've seen MEDIAL MA in a Sanskritic spelling; I don't know which language it was. I don't have the privilege to edit the definition of the sort keys. @Octahedron80 --RichardW57 (talk) 04:14, 2 July 2021 (UTC)
U+1028 MYANMAR LETTER MON E already sorts next to U+1027 MYANMAR LETTER E.
- English Wiktionary simply sorts by code point (the character's binary value), because that is what the community wants, so it has to override the sorting algorithm on its own. On Thai Wiktionary, we use the defined Unicode collation, so fewer problems occur. However, neither method matches what real dictionaries do for some individual languages. You can present new sort keys here to be reviewed, and I hope they work once the old ones are replaced. --Octahedron80 (talk) 06:02, 2 July 2021 (UTC)
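To make the request concrete, here is a small Python sketch of a sort key that moves the Mon characters next to their Burmese counterparts instead of leaving them at their code point positions. It is an illustration only, not the sort key mechanism used on Wiktionary: the pairing of each Mon character with a Burmese one is an assumption about what "corresponding" means here, and the function name is hypothetical.

```python
# Assumed pairings: each Mon character sorts right after the paired Burmese one.
MON_TO_BURMESE = {
    "\u1033": "\u102E",  # VOWEL SIGN MON II -> VOWEL SIGN II
    "\u105A": "\u1004",  # LETTER MON NGA    -> LETTER NGA
    "\u105B": "\u1008",  # LETTER MON JHA    -> LETTER JHA
}


def sort_key(word):
    """Primary key: the word with Mon characters replaced by their Burmese
    counterparts. Secondary key: marks which positions were Mon, so that a
    Mon spelling sorts just after the otherwise identical Burmese one."""
    primary = "".join(MON_TO_BURMESE.get(ch, ch) for ch in word)
    secondary = tuple(1 if ch in MON_TO_BURMESE else 0 for ch in word)
    return (primary, secondary)


words = ["\u1005", "\u105A", "\u1004"]      # CA, MON NGA, NGA
by_code_point = sorted(words)               # MON NGA ends up after CA
by_sort_key = sorted(words, key=sort_key)   # MON NGA ends up right after NGA
```

The three Mon medial signs (U+105E to U+1060) could be added to the same table once it is agreed which Burmese characters they should sort beside.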