shy

You've made the translation say "act shy toward strangers", but the actual English definitions do not say that. The mention of strangers is only a usage example. Equinox 09:26, 30 April 2021 (UTC)

Thanks for the note. I created the new translation section because, although the translations I gave are all valid translations of 'be shy', particularly referring to children, I couldn't see which English definition was a better fit for them. Which do you think would be most appropriate?
Incidentally, the expressions given there, "shy bairns get nowt, shy bairns get noot", both refer, superficially at least, to the shyness of infants.
In the meantime, I will also be moving those definitions to "make strange", which is a regional expression used specifically with the meaning of "shy towards strangers".
Edit: I checked again and the correct placement would appear to be with "reserved" (shy with strangers). I'll put those translation equivalents there, unless you have any objections.
Bathrobe (talk) 14:09, 30 April 2021 (UTC)
I've made the changes that I alluded to. However, I've discovered a major problem with this word on Japanese Wikipedia, which is brought about precisely by the issue you mention.
As you point out, shyness in infants is not well covered by the English definition of 'shy' at this entry. However, shyness in infants is one of the legitimate uses of this word.
At English Wikipedia, the two senses of shyness are covered at two separate articles, "Shyness" (the personality trait) and "Stranger anxiety" (in infants).
At Japanese Wikipedia, the two are conflated into one article, 人見知り, which starts out: 人見知り(ひとみしり、英: Shyness)とは、従来は子供が知らない人を見て、恥ずかしがったり嫌ったりすることである。大人の場合は「内気」・「照れ屋」・「はにかみ屋」・「恥ずかしがり屋」の言葉をあてるのが標準的である。社会心理学では、社会的場面における上記のような行動傾向をシャイネスという[1]。
The meaning is: "Hitomishiri (English 'shyness') originally refers to the way that a child acts bashful or shows dislike upon seeing a stranger. In the case of an adult, the standard way of referring to this is uchiki, tereya, hanikamiya, hazukashigariya. In social psychology, a tendency to such behaviour in a social setting is called shainesu (i.e. shyness)".
The rest of the article proceeds to discuss shyness as a problem in adults.
The English Wikipedia article that this links to is "Shyness", which discusses ONLY the issue of shyness in adults. Since the Japanese article is entitled 人見知り (shyness in children), I attempted to redirect that to "Stranger anxiety" but was not allowed. In fact, the Japanese article should be retitled Shainesu and hitomishiri in children should not be referred to at all. Do you have any idea how this can be resolved?
The root cause of the problems with the Japanese article is the failure to realise that the English word "shy" (for reserved) has two separate senses. To use the Japanese term 人見知り (stranger anxiety in children) as the title of the article, and then deal almost exclusively with extreme shyness as a personality trait in adults, is an egregious example of taking English semantics as standard and ignoring the native semantics of Japanese.
As you will notice, this is also the problem that I tried to address when I set up a separate translation for "shy" at Wiktionary.
Any suggestions are welcome.
Bathrobe (talk) 23:19, 30 April 2021 (UTC)
"I attempted to redirect that to Stranger anxiety but was not allowed." Why were you not allowed? Was any feedback given by Wikipedians? Equinox 08:18, 1 May 2021 (UTC)
I can't reproduce it at the moment; I think I tried an incorrect method of editing (things have changed quite a lot) and was blocked by the system. It basically seemed to be saying that another article already linked to the English article, so I wasn't allowed to link from the article on 人見知り -- although that doesn't make sense because there is no Japanese article that matches the English article on "Stranger anxiety".
I then tried the 'correct' method of editing, which I had trouble with because all the foreign-language articles linked to were about Shyness as a psychological characteristic.
At any rate, changing the link would not have been the right thing to do since everything in the body of the article is about the psychological issue. It appears that the only way of fixing the Japanese Wikipedia article is to hive off "Hitomishiri" (infant shyness) as a stub linking to "Stranger anxiety", and to leave the bulk of the article at "Shainesu" (shyness), which would link to "Shyness" in English. I've left a message on the Talk Page supporting another editor's proposal to rename the article, but there have been no responses as yet. If no one responds within a week or two I might create the new article myself (at one stage the Wikipedia encouraged people to "be bold", although that might have gone by the wayside), which is sure to provoke a reaction and lead to one outcome or other.
Turning back to the Wiktionary article, as I mentioned, I've added translations relating to infant shyness to translation equivalents of "reserved" at the Wiktionary article on Shy. I don't think this is ideal but users have the right to know that it is inappropriate to talk about an infant being "shy" in Japanese using words like 恥ずかしがる (or similar words in other languages). It's simply a hole in the translations caused by the lack of fine-tuning in the English definition.
Bathrobe (talk) 08:43, 1 May 2021 (UTC)

Mongolian script in translation sections

Hi - having the Mongolian script in translation sections is clutter, because it takes up a large amount of space and doesn't add anything that the Cyrillic entries won't add (which will obviously include the Mongolian text). We don't generally treat Mongolian script entries as the main entry unless a Cyrillic one simply doesn't exist. Theknightwho (talk) 18:49, 4 September 2022 (UTC)

Thanks for the message. I'm a little puzzled.
1. Is this official Wiktionary policy? (You use "we", which suggests that it is).
2. What is the policy on Simplified and Traditional Chinese? (This is presumably not clutter because it doesn't take up space, but it is definitely clutter because it creates cluttered entries. Theoretically, the Traditional characters are only necessary if they are not inferrable from the Simplified).
Impressed that you found нэрлэх as an abbreviation of нэрлэхийн тийн ялгал.
Bathrobe (talk) 18:56, 4 September 2022 (UTC)
It's not policy - it's just how things have been. The primary issue is the vertical script (though it will be inconsistent, as many things are). If/when we can improve the layout of how translations are given, I'd be fine with it.
Frankly, the way we handle traditional/simplified Chinese in translation sections isn't great and probably isn't something we want to use as a model (e.g. having to give both manually, with no indication). Translation sections in general are a bit of a mess, but not all languages have to work in the same way. Theknightwho (talk) 19:15, 4 September 2022 (UTC)
By the way - I'm not saying that to imply we can automatically generate Mongolian text, either. Obviously that isn't possible. Theknightwho (talk) 19:17, 4 September 2022 (UTC)
Appreciate your comment.
Obviously "the way things have been" (with previous users' contributions) and "the way I would like them to be" (from now on) are not quite the same thing. (Although there are actually quite a few entries giving Mongol bichig, to which I've added a number.)
However, I do agree with you that it looks messy and I appreciate that it is a matter of concern. Whenever I add Mongol bichig I feel pangs at how bulky Mongolian entries are compared with other languages. On the other hand, the idea that information should be omitted or deleted because it takes up space in the current format makes me uneasy.
Incidentally, I also try to add Inner Mongolian usages in addition to the current Mongolian-centred entries. It feels strange writing these in Cyrillic. (I originally put Mongol bichig first for Inner Mongolian usages until I realised how unwieldy it was. I now invariably put Cyrillic first.)
At any rate, I am now at a bit of an impasse. I would like to have Mongol bichig there -- the rendering engine does a surprisingly good job at rendering it -- but it's obviously no use doing that if it runs against current policy, written or unwritten, and is going to be deleted anyway.
Bathrobe (talk) 19:41, 4 September 2022 (UTC)
Actually, if the rendering engine could render Mongol bichig horizontally -- which is quite possible and can be found in Chinese-language grammars of Mongolian -- the main grounds for your objection would disappear.
Bathrobe (talk) 19:46, 4 September 2022 (UTC)
@Bathrobe I appreciate your comments as well - from your userpage, I get the impression that you're probably much more comfortable using the Mongolian script than I am (Cyrillic is much easier for me), so I will defer to you on things relating to that. Plus the reality is that the number of users that edit Mongolian is very small and sporadic, so I think you're unlikely to run into any problems by adding them.
I think your idea of running it horizontally may be a good idea for translation sections - otherwise it's going to become a nightmare when (if) we get around to adding traditional Kalmyk and Buryat, too.
In terms of how we deal with the parallel scripts, you raise a good point as well. Most languages with multiple scripts can very easily stick to one or the other as the 'main' form, but when there's a geographic division that leads to regionalisms that only exist in one or the other, it is pretty weird to have the wrong script as the main entry. I know it's an issue with quite a few central Asian languages (for obvious reasons). I wouldn't be a fan of only including Inner Mongolian senses under bichig entries, as I think that would cause confusion.
The solution that I've come up with is to develop a series of templates that automatically copy the information from one entry to the other. This has the advantage of making neither the primary entry (from the perspective of the viewer), while keeping them both synchronised - as there's nothing worse than trying to maintain two parallel entries for every lemma. That won't apply to editing, where you'll need to edit the "main" entry, but that doesn't seem like a problem to me. Check out ᠪᠡᠭᠡᠵᠢᠩ. Currently, I only have templates for the definition and pronunciation sections (and both need to be refined to cope with multiple etymologies), but it's a start. Theknightwho (talk) 12:01, 5 September 2022 (UTC)
Your example from Mongolian Wiktionary looks great.
One problem on Wiktionary is that English has become a kind of central clearing house -- it seems to be the only one where there is a well-developed system of giving multilingual translations. You would want horizontal renderings only at English (for translations), not elsewhere on Wiktionary.
In html / css it is a cinch to convert Mongol bichig to horizontal format. I don't know how these things would work at Wiktionary. How would it be set up and implemented? How would it be decided? Who would actually do it?
I am curious how you manage to be a lawyer and simultaneously an expert in Mongolian and Chinese (to speak only of the two that I'm aware of).
Bathrobe (talk) 22:44, 5 September 2022 (UTC)
@Bathrobe Just to clarify that that's still an entry on English Wiktionary - though there is a Mongolian Wiktionary as well. It's (written) policy that we only include translation sections on English entries, to prevent having a huge amount of duplication.
It's probably worth asking at the Grease Pit, which is the place for technical questions/requests - someone should pick it up. Ultimately, it will be a case of CSS instructing the browser to render horizontally, but it's a matter of amending the translation template to achieve that.
Calling me an expert is probably a big exaggeration! I've had an amateur interest for quite a long time, and have picked things up over the years. How come you decided to move to Mongolia? That must have been quite a change. Theknightwho (talk) 00:12, 6 September 2022 (UTC)
My mistake, yes, that is on English Wiktionary.
Before asking at Grease Pit it's probably a good idea to ask Wiktionarians what they think. Horizontal is not the normal direction of writing for Mongol bichig; it's only used as a fallback when inserting Mongol bichig into horizontal text. Some people might take exception to it. I'm not sure where or how you could poll people on that.
Long story. Was working in Beijing, posted to Mongolia and started learning the language, returned to Beijing and learnt the old script from Inner Mongolians, decided to return to Mongolia after I stopped working to learn the language properly (vain hope, it's much harder to learn languages as you get older). At any rate, since I've read quite a few stories in Mongol bichig, for me it's a real writing system, not a decorative or heritage script as it is in Mongolia.
And yes, it's a big difference, from an agrarian-based to a pastoral-based culture. Moreover, it's pretty much outside the Sinosphere (although there is more influence from China than Mongolians want to admit). Bathrobe (talk) 06:10, 6 September 2022 (UTC)
This page has some discussion of horizontal left-to-right rendition of Mongol bichig.
https://github.com/w3c/mlreq/issues/6
Bathrobe (talk) 14:32, 6 September 2022 (UTC)
There is always the Beer Parlour, though from experience people tend to leave this up to the editors involved in a language - which is currently you and me, and particularly with smaller issues like this. If someone objects they will usually make it known, anyway. I suppose there is the wider question of how vertical scripts are handled in general in translation sections, but I'm not aware of any others that have active communities. I have done a small amount of experimental work with Middle Mongolian (e.g. ᠬᠠᠭᠠᠨ (qaɣan)), and I did notice an IP adding some Kalmyk pronunciations the other day, but that's about it really.
If you don't mind me asking, what do you do for work? That sounds like a pretty interesting time that you've had out there. I'm also curious how people feel about the reintroduction of bichig for administrative use - is it popular?
Thanks also for the link. The comment about running the script right-to-left in horizontal mode is an interesting one, although not really appropriate for this, and I would rather we had the vertical script wherever possible (e.g. the title, headword, bulletpoint links etc.). The translation section seems much more suited to horizontal, though, and possibly also the etymology section (where I've generally tried to stick to using Cyrillic if possible, but that isn't always possible/appropriate). Essentially, anywhere where there's running text/interaction with horizontal text.
It also occurs to me that we might want to find some way of refactoring things such as bulletpoints to deal with vertical text better. The large lists of derived terms as seen on хөх (xöx) would look awful with horizontal formatting. Theknightwho (talk) 15:10, 6 September 2022 (UTC)
I think that horizontal should be confined to the translation sections.
Formatting with mixed Cyrillic and Mongol bichig is doable but could be pretty clunky. See http://www.cjvlang.com/mongol/index.html for a not strictly comparable situation.
I am retired, now studying for an MA.
I'm not sure how people feel about the reintroduction of bichig for administrative use. Quite frankly, I think most Mongolians think bichig is a nice heritage script but Cyrillic is fine, thank you very much. Of course there is a minority that is quite into bichig (with Facebook groups devoted to it, for instance), but as in most places your average Joe isn't into culture that much. The new rules will force public servants to use bichig more but it will mostly be lip service. For instance, even now birth certificates in Mongolia are in both Cyrillic and bichig, but in actual practice the bichig is completely ignored by officials administering the system.
I don't think Mongolians are taught bichig very well. They teach it far better in Inner Mongolia. Children have to read bichig 'as spelt' for two years in Inner Mongolian primary schools, only being allowed to convert it to the current pronunciation in their third year. That means pronouncing ᠬᠠᠢᠮᠭ᠎ᠠ as 'hamiga' rather than 'haana' ('where'). It's a very effective way of inculcating irregular spellings. In Mongolia children are not forced to commit spellings to memory, which means they forget them quite quickly. It's pretty painful for adult Mongolians to read, let alone write, Mongol bichig.
When China decided to effectively phase out bichig I think there was a temporary upsurge in support for bichig but I think that has died down. Thanks to Xi Jinping's assimilationist policies, I believe the days of bichig are numbered, unless something intervenes. The Chinese authorities say that Mongolian will still be used in Mongolian-stream schools, except in Language and Literature classes and History classes (if I remember rightly). Well, the subject they call "Language and Literature" is equivalent to the subject we know as "English" in English-speaking countries, so it sounds as though they are going to stop teaching the Mongolian script and literature altogether.
Bichig has its disadvantages. These include: letters representing multiple phonemes (ᠳ for 'd' and 't', ᠥ for 'ö' and 'ü', etc.) and archaic spellings (representing pronunciations from way back). English has plenty of archaic spellings, too, but I guess that's not really a defence. Cyrillic also has its problems, including a completely messed up verb-ending system, partly caused by the fact that Cyrillic totally ignores epenthetic vowels (Mongol bichig is much better on verb endings), and the preservation of Russian spellings for borrowed Russian words.
Another problem with Mongol bichig in the digital age is the fact that, under the current input and rendering methods, surface representations (what you see in your browser) tell you nothing about the input used to achieve that representation. A person could input ᠳᠤᠷᠵᠢ (Дорж) as D-o-r-j-i (ᠳᠣᠷᠵᠢ), as it should be, or T-u-r-j-i (ᠲᠤᠷᠵᠢ), with no visible difference. This has security implications, including phishing and problems with legal documents.
Anyway, unless they OUTLAW Cyrillic, Mongol bichig will not make a comeback. Socialism was a good system in that sense, because one man (Stalin) had the power to force the Mongolians to switch from bichig to Latin to Cyrillic without much opposition. Bathrobe (talk) 03:27, 7 September 2022 (UTC)
"Language and Literature" is 语文 in Chinese and ᠬᠡᠯᠡ ᠪᠢᠴᠢᠭ (хэл бичиг) in Mongolian. Bathrobe (talk) 03:39, 7 September 2022 (UTC)
Thanks for the insight - in some ways the attitude to bichig in Mongolia reminds me of how things are with Irish in Ireland, where there is a general positive feeling towards it from a cultural perspective and it has a nominal status of being equal in every way to English, but it's taught terribly in schools, and most people don't use it except for heritage stuff. Language-revival can certainly be done well (Wales has done an exceptional job), but it's a difficult process.
That's a real shame to hear about in Inner Mongolia, though - but assimilation has always been the Chinese way with everything. I had assumed that there was an economic interest in encouraging Mongolian education, as it obviously assists with economic integration with Mongolia - but I don't know how much China values that, given that Mongolia is a relatively small country at the end of the day.
I know that there are plans to completely redo the encoding system as well, because you're right that the current system has serious problems. I saw this proposal a while back, which looks like a big improvement, but I don't know what stage things are at right now. It seems pretty urgent (to me) that codepoints like "o" and "u" simply need to be unified. It would make things mildly more difficult to transliterate automatically here at Wiktionary, but we'll cope. Theknightwho (talk) 16:50, 7 September 2022 (UTC)
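To make the input ambiguity discussed above concrete, here is a minimal Python sketch. It only shows that the two ways of typing Дорж are different code point sequences; the point that they can look the same on screen comes from the discussion, not from the code. The variable and function names are illustrative.

```python
import unicodedata

# Two ways of typing the name Дорж in Mongol bichig, written as escapes
# so that the code points are explicit.
dorji = "\u1833\u1823\u1837\u1835\u1822"  # D-o-r-j-i
turji = "\u1832\u1824\u1837\u1835\u1822"  # T-u-r-j-i


def describe(word):
    """Return the Unicode character name of each code point in the word."""
    return [unicodedata.name(ch) for ch in word]


for label, word in (("D-o-r-j-i", dorji), ("T-u-r-j-i", turji)):
    print(label, describe(word))

# The strings compare unequal even where a font draws them the same way,
# so searching, sorting and string comparison treat them as different words.
print(dorji == turji)  # False
```

Anything that compares the underlying text (a search index, a spellchecker, a signature check on a document) sees two different words here, which is why the visual identity is a problem.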
"I had assumed that there was an economic interest in encouraging Mongolian education, as it obviously assists with economic integration with Mongolia"
That's not how the Chinese think. Any kind of identification by "ethnic minorities" with coethnics abroad is a threat to Chinese unity. It is an extremely sensitive political issue and the Chinese bend over backwards to avoid it. Political considerations trump economic considerations every time. Xi's policies on educating children of ethnic groups in Chinese from the start, at the expense of the ethnic languages, are being applied to all groups that previously implemented native-language education at primary school. If I remember rightly (and I'm not sure), that includes Tibetan, Uyghur, Mongolian, Korean, and maybe Kazakh.
The current encoding system is based on the traditional way that the alphabet is taught to Mongols, although it's alphabetic rather than the traditional syllabic. Once you get the hang of it it's not too hard, but it's very cumbersome in dictionaries because you are expected to look words up by the underlying sound, not by the overt form. So if you don't actually know the word you have to look at multiple places in the dictionary. I can now do that reasonably well, although it was really confusing at the start. The encoding issue is a result of tradition encountering modern computer systems.
Bathrobe (talk) 23:40, 7 September 2022 (UTC)
One example of the political aspect I mentioned above is the naming of China. In Mongolia it's Хятад, which is identified with the Han Chinese ethnic group. In Inner Mongolia it's Дундад улс, a calque on 中国, because the ideology is that China is not just the Han Chinese, it's all the different ethnic groups in China as a unified whole. I once did an interview for an Inner Mongolian TV or radio station in which I used Хятад. The interviewer was visibly perturbed (perhaps that is too strong a word) and quickly corrected me to Дундад улс. Bathrobe (talk) 23:45, 7 September 2022 (UTC)
As you can imagine, China is very jealous of its minority ethnic groups. I have heard that China makes it difficult for foreigners to study Tibetan in China. It has also taken steps to prevent the study of Uyghur outside of China.
As part of ethnic policies early this century, the State sponsored the publication of a range of dictionaries of up-to-date (Inner) Mongolian terminology in different fields, including scientific. What was noticeable was that 1) all terminology was a direct and literal translation from the Chinese, 2) dictionaries disagreed in their translations -- there was no coordination among dictionaries as to how terminology was translated (the work was probably farmed out to students), 3) terminology used in Mongolia was studiously avoided, 4) nobody seems to actually use this terminology; people just use Chinese. Bathrobe (talk) 02:42, 8 September 2022 (UTC)
As a result of the dictionary issue, I've become wary of using translations found in these "up-to-date" vocabulary sources from China.
Another example of Chinese political control is the problem I had with Mongolian-language books (including children's books) in Inner Mongolia. I bought some books in Ulanhot, including illustrated books, and wanted to mail them to Mongolia as my luggage was getting too heavy. The woman in the post office said that she couldn't send the books unless I got certification from the Cultural Bureau that they didn't contain any politically sensitive material (specifically with regard to the Cultural Revolution). The fact that one book had illustrations of a Chinese myth, and that all books had the title and publication details in Chinese inside the back cover in accord with Chinese requirements, didn't make any difference. Since I didn't have time to go to the Cultural Bureau I had to take the books with me.
Not all of this has to do with Wiktionary, but it does indicate some of the issues involved in "vocabulary" in Inner Mongolia.
Bathrobe (talk) 03:02, 8 September 2022 (UTC)

Could you please help me with some Mongolian inflections?

Hiya - I’ve been doing quite a lot of work on building a Mongolian inflection template for nouns (just in Cyrillic for now, but eventually in both scripts); I’ve used a few sources to do this, including the 2018 national regulatory dictionary which has around 13,000 lemmas with specified inflections (which is extremely useful for determining the accuracy of my template algorithm). I’m trying to automate it as much as possible, to minimise room for mistakes - ideally so that the only parameters that a user needs to give are whether a stem has an unstable n or g, and whether or not it’s a loanword, as those are the only factors that are truly not possible to determine automatically (or so I thought). Russian loanwords in particular also modify the vowel harmony of у to э rather than а, which doesn’t seem to occur in other loanwords - but this is semi-automatable, as I can check the etymology section automatically. So far, I’m getting an accuracy of between 99 and 99.5% (depending on the noun case) - which is obviously great - but I was wondering if you could help me shed some light on the remaining outliers, as I suspect your Mongolian is much better than mine, so you’ll have a much more intuitive feeling for the language. I’ve also been checking against a corpus of around 700 million words, so I’m reasonably confident that any deviations from the official dictionary are reasonable (which has eliminated most of the inexplicable exceptions it contains).

The three troublesome areas are:

  1. Whether a noun takes a -(а)д or -т in the dative-locative when it ends in a б, в, р or с. My observations show that (almost) all of these take -т when they’re multisyllabic, but there are a tiny number of exceptions. With monosyllabic words, it seems to be pretty random - for example, клубт vs вебд, гэрт vs сард. I’ve read explanations that it’s to do with whether or not the bichig ends in a vowel, but having checked about 50 lemmas I see no correlation at all. It also doesn’t seem to correlate with actual pronunciation, either.
  2. I did extensive testing to check exactly when the vowel in the final syllable is deleted when it occurs before a final consonant. However, I still have a tiny number of holdouts that I cannot explain - particularly those ending in -рав4, where the deletion seems to be pretty random. However, билиг unexpectedly does delete the vowel to become билгийг etc (though maybe it’s just irregular).
  3. I can’t seem to determine exactly when a final vowel is deleted when adding a suffix. Native stems and very old loanwords always seem to lose it, but with newer loanwords it’s not always clear. Those ending in лба4, мба4, нба4 and нга4 seem to, unless the vowel in the syllable before is of a different vowel harmony (e.g. Тонга becomes Тонгагийн). My understanding was that an unaccented final vowel in a loanword should be dropped altogether even in the base form, but presumably this doesn’t occur with these words due to the quirks in the Cyrillic that already exist with words ending in these patterns (where the vowel is necessary due to the pronunciation not matching the spelling). However, I don’t understand why Гана becomes Ганын or Ботсвана becomes Ботсванын. I understand that the final а might be necessary to determine the pronunciation of the н as being “n” (not “ng”) in male words, but most words ending in -ана4 don’t seem to drop the final vowel. There are a handful of others that do this, too. Plus, there is also кило inflecting as (e.g.) килээр and авто as автын, despite the final syllables being stressed in both; perhaps they’re just irregular - particularly given the unexpected vowel harmony of кило, which is the only non-compound word I can find that takes a female vowel harmony despite containing a male vowel. Theknightwho (talk) 15:44, 18 October 2022 (UTC)
Hi theknightwho. I'm not ignoring you deliberately but am in an extremely busy period in "real life". I might be a while getting back to you.
Bathrobe (talk) 20:49, 19 October 2022 (UTC)
@Bathrobe No worries at all. I realise I never responded to your previous comments either, so apologies for that! To give you a flavour of the template, it's in use at баатар. Theknightwho (talk) 21:27, 19 October 2022 (UTC)
Every time I feel like goofing off I find myself editing Wiktionary, so I may as well answer your question.
To be honest, I have no idea. I find the Cyrillic script difficult to fathom and difficult to use. As far as I can tell it was implemented hastily and without sufficient thought. Although people seem to think it was a big improvement on the old script (which it was), in fact it failed to create a completely rational system and introduced new problems. A lot of Mongolians spell very badly. I've been told that people educated during the Socialist period didn't make spelling mistakes, which is essentially blaming any problems on the post-1991 education system. Well, maybe.
There are dictionaries for looking up the correct spelling (I have one), and as you may be aware, there is also a government website where you can check whether your spelling is right or not (https://spellcheck.gov.mn). That is an indication of the seriousness of the problem.
I realise this is not an answer to your question. I admire you for trying to create an algorithm for this. It would be completely beyond my technical capabilities. At any rate, I've given up on making sense of this reformed spelling system and can't really help you with the issues that you raise (of which I wasn't actually aware!) Bathrobe (talk) 07:23, 21 November 2022 (UTC)
I've looked at your first question. Сард appears to be a way of disambiguating from сарт, which means the same as сартай. As to why this is the case, well, as you suggest, bichig might be playing a role. Mongol bichig for сар 'moon' is ᠰᠠᠷᠠ (sara), so the suffix -д might indicate the presence of a final vowel. I know that some spellings were brought across from Mongol bichig (IIRR, сурч), so wild as it might seem it can't be entirely discounted. But it doesn't help you much.
You mention билгийг, but I notice that бэлэг also drops the vowel (бэлгийг). As does бичгийн.
I realise that I'm not being very helpful here. I suspect that in country names there is indeed confusion between the final а (in native words) to ensure that the pronunciation is /n/, and a final pronounced /а/ in foreign words.
Sorry to be of so little help with your question! Bathrobe (talk) 07:42, 21 November 2022 (UTC)
"I suspect that in country names there is indeed confusion". What I mean is that the Mongolians themselves get mixed up and rely on some kind of Sprachgefühl. Bathrobe (talk) 07:44, 21 November 2022 (UTC)
No problem at all! I suspect I'm just running up against the limits of what is algorithmically possible to calculate, so I'll just have to make exceptions for the rest. Someone has created a rather impressive HunSpell spellchecking dictionary, and has written quite extensively on their complaints about Mongolian spelling here. I don't agree with all the decisions they've made (with certain words, a corpus analysis reveals that no-one spells things that way), but then I can say the same thing for the official government spellings at http://toli.gov.mn/, too. It's been a very interesting learning experience, anyway, and I think I can call a ~99% accuracy rate a success. Theknightwho (talk) 18:06, 21 November 2022 (UTC)
Just to add - I also did manage to solve one of these: monosyllabic words starting with consonant clusters take -т. My theory is that it’s because they’re treated as polysyllabic due to the consonant cluster at the beginning, as these don’t occur in native words: к(ә)-луб (a Mongolianised respelling probably something like кэлүүб). Whether or not people actually say it that way is a different matter, but it’s one of those rules that I suspect developed out of intuition (the Sprachgefühl), rather than by design. It’s also similar to the pseudo-syllables that occur at the end of borrowed words like буддизм, as зм is not a permitted cluster at the end of words. Theknightwho (talk) 19:00, 21 November 2022 (UTC)
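The dative-locative observations in this thread (words ending in б, в, р or с; multisyllabic ones almost always take -т; monosyllabic ones starting with a consonant cluster take -т; other monosyllables are unpredictable) could be sketched roughly as below. This is a toy Python illustration and not the actual Wiktionary module: the function names, the vowel list and the way syllables are counted are all assumptions made for the example.

```python
# Cyrillic vowel letters used for a rough syllable count (an assumption for
# this sketch; the real module's syllabification is more involved).
VOWELS = set("аэиоуөүыеёюя")


def syllable_count(word):
    """Count runs of vowel letters, so a long vowel such as 'аа' counts once."""
    count, in_vowel_run = 0, False
    for ch in word.lower():
        is_vowel = ch in VOWELS
        if is_vowel and not in_vowel_run:
            count += 1
        in_vowel_run = is_vowel
    return count


def starts_with_cluster(word):
    """True if the first two letters are both consonants, as in 'клуб'."""
    w = word.lower()
    return len(w) >= 2 and w[0] not in VOWELS and w[1] not in VOWELS


def dative_suffix_guess(word):
    """Return 'т' when the heuristics decide, or None when they do not."""
    if word.lower()[-1] not in "бврс":
        return None  # outside the scope of the observations in this thread
    if syllable_count(word) > 1:
        return "т"   # multisyllabic: almost always -т, with rare exceptions
    if starts_with_cluster(word):
        return "т"   # monosyllabic with an initial cluster, e.g. клуб
    return None      # e.g. гэр, сар, веб: needs a lexicon or manual override


print(dative_suffix_guess("клуб"))  # т
print(dative_suffix_guess("гэр"))   # None
```

Returning None for the undecidable monosyllables reflects the point made above that the remaining cases have to be handled as listed exceptions.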
I take my hat off to you!
I once tried to come up with a rational system to explain Cyrillic verb endings by dividing them into classes (verb root ending in x-consonant). It failed. In the end you have to rely on their quasi-phonotactic spelling rules, which I am very bad at applying. It's the complex interaction of all the rules that produces the correct forms; you can't just divide them into classes and think it will work.
Bathrobe (talk) 19:15, 21 November 2022 (UTC)
Yes, absolutely! To get the inflections working, I had to design the module in a completely different way to those for Indo-European languages. With those, you apply a series of rules to determine exactly which class the root fits into, and then apply the relevant set of endings. With Mongolian, I have a different set of rules for each inflection - otherwise the number of possible classes quickly spirals out of control. In some cases, I have to determine one form first before I can determine the others (the most obvious examples being the fleeting-n and the nominative plural). After the application of each suffix, I apply the syllabification rules to determine whether a vowel needs to be dropped. This neatly simplifies some of the inflection rules, actually: e.g. the genitive with a fleeting-n stem is -ан followed by -ы. In all cases the а is dropped due to the syllabification rules, which results in -ны. Theknightwho (talk) 20:04, 21 November 2022 (UTC)
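The "attach a suffix, then re-syllabify" design described above can be illustrated with a small Python sketch of the fleeting-n genitive. Everything here is a simplified assumption rather than the real module: the vowel-dropping step is a single toy rule, the vowel harmony picker only looks at the first vowel, and only back-vowel (male) stems with the -ы ending are handled.

```python
VOWELS = set("аэиоуөүыеёюя")


def harmonic_vowel(stem):
    """Pick the а/э/о/ө variant of an а4 suffix from the first vowel (simplified)."""
    for ch in stem:
        if ch in "ау":
            return "а"
        if ch == "о":
            return "о"
        if ch == "ө":
            return "ө"
        if ch in "эүи":
            return "э"
    return "а"


def attach(stem, suffix):
    """Attach a suffix, then apply one toy syllabification step.

    If the suffix starts with a vowel and the stem ends in
    consonant + short vowel + consonant, with an earlier vowel in the stem,
    the short vowel of the final syllable is dropped.
    """
    if (suffix and suffix[0] in VOWELS and len(stem) >= 3
            and stem[-1] not in VOWELS
            and stem[-2] in VOWELS
            and stem[-3] not in VOWELS
            and any(ch in VOWELS for ch in stem[:-2])):
        stem = stem[:-2] + stem[-1]
    return stem + suffix


def genitive_fleeting_n(stem):
    """Genitive of a back-vowel fleeting-n stem: stem + -ан4, then + -ы."""
    with_n = attach(stem, harmonic_vowel(stem) + "н")  # мод -> модон
    return attach(with_n, "ы")                         # модон + ы -> модны


print(genitive_fleeting_n("мод"))  # модны
print(genitive_fleeting_n("ус"))   # усны
```

The genitive function never mentions -ны; that ending falls out of the generic vowel-dropping step, which is the simplification described in the comment above.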

Capitalising Mongolian transliterations

Hiya - I just wanted to give you a heads up that I've recently made it possible to use ^ before a letter to capitalise it in the transliteration, when scripts don't have capital letters. For example, {{l|mn|^ᠭᠣᠪᠢ}} gives ᠭᠣᠪᠢ (Ɣobi) (compared with {{l|mn|ᠭᠣᠪᠢ}}, which gives ᠭᠣᠪᠢ (ɣobi)). This works anywhere in the term (though obviously it usually only makes sense to do it at the start of a word). Theknightwho (talk) 17:20, 1 March 2023 (UTC)
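As a rough illustration of the caret convention described above, the Python sketch below lets a ^ pass through a transliteration step and then capitalises the letter that follows it. It is not the Wiktionary implementation: the four-letter transliteration table and both function names are hypothetical and cover only the example word.

```python
import re

# Hypothetical, deliberately tiny transliteration table covering only the
# letters of the example word.
TRANSLIT = {
    "\u182d": "ɣ",  # MONGOLIAN LETTER GA
    "\u1823": "o",  # MONGOLIAN LETTER O
    "\u182a": "b",  # MONGOLIAN LETTER BA
    "\u1822": "i",  # MONGOLIAN LETTER I
}


def transliterate(term):
    """Transliterate letter by letter, then capitalise any letter after a caret."""
    raw = "".join(TRANSLIT.get(ch, ch) for ch in term)  # carets pass through
    return re.sub(r"\^(.)", lambda m: m.group(1).upper(), raw)


def display_form(term):
    """The caret is markup only, so it is stripped from the displayed term."""
    return term.replace("^", "")


gobi = "\u182d\u1823\u182a\u1822"
print(transliterate(gobi))        # ɣobi
print(transliterate("^" + gobi))  # Ɣobi
```

Because the substitution runs over the whole string, a caret works before any letter in the term, not just the first.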