Script recognition module
I would like it if {{auto cat}} (or {{charactercat}} or whatever template), when used in Category:Bb, automatically recognized that "Bb" is in Latin script. For example, the category could be placed in "Category:Latin script something", it could have "Latin script" in the description, and the "Bb" in the description would get the right script label in the code.
Likewise, Category:Δδ can be created for Greek script.
And Category:Bb: ⠃ (Latin–Braille) already exists. The category name has a mixture of scripts, but the module is already prepared to recognize the different contents before and after the colon.
But findBestScript requires a language code, and the categories mentioned are multi-language categories. Can't we change the module so that it iterates over all scripts when the language is und or something?
That could work, but what about cases like Latn vs. Latinx? A single language would never have both as its scripts, but if the function blindly goes over all the scripts, both can match the same text.
You're right. A letter like "C" is probably both Latn and Latinx. The same problem would probably happen with pa-Arab, ota-Arab, etc. if we had similar categories for the Arabic script.
Maybe it's not feasible, but can findBestScript iterate over all scripts while giving priority to scripts with 4-letter codes? If it finds a match in Latn or Arab, it stops the search and does not go on to Latinx and fa-Arab.
Or maybe just give priority to Latn over Latinx and forget Arab and the others unless they become a problem at some point.
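To make the "4-letter codes first" idea concrete, here is a rough sketch in Python rather than in the module's Lua. Everything in it is an assumption for illustration: the SCRIPTS table, the use of Unicode character names as a stand-in for the real character data, and the function names. It is not how findBestScript works today.

```python
import unicodedata

# Hypothetical, heavily simplified script data: script code -> Unicode name prefix.
# The real script data is not organised like this; it is only a stand-in.
SCRIPTS = {
    "Latinx": "LATIN",
    "Latn": "LATIN",
    "Grek": "GREEK",
    "Brai": "BRAILLE",
    "fa-Arab": "ARABIC",
    "Arab": "ARABIC",
}


def matches(code, text):
    """Return True if every non-space character of text belongs to the script."""
    prefix = SCRIPTS[code]
    chars = [c for c in text if not c.isspace()]
    return bool(chars) and all(
        unicodedata.name(c, "").startswith(prefix) for c in chars
    )


def find_script_any_language(text):
    """Check plain 4-letter codes first, then the longer, more specific ones."""
    ordered = sorted(SCRIPTS, key=lambda code: (len(code) != 4, code))
    for code in ordered:
        if matches(code, text):
            return code  # stop at the first match, so Latinx is never reached for "C"
    return None


print(find_script_any_language("Bb"))
print(find_script_any_language("Δδ"))
```

With this ordering, "Bb" resolves to Latn and never reaches Latinx. The obvious weakness is that it relies on the length of the code alone, which says nothing about how overlapping 4-letter scripts should be ranked against each other.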
We could also change the data format of the scripts a bit, giving them a "hierarchy" of some sort.
Suggestion: add parent = "Latin", to Latinx, nv-Latn, pjt-Latn...
Add parent = "top", to Latn, Grek, Cyrl...
And in findBestScript, give priority to scripts that have parent = "top".
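A possible shape for the hierarchy idea, again sketched in Python and not in the module's Lua. The Script class, the SCRIPT_DATA list, the name-prefix matching and the function names are all hypothetical; only the parent values "Latin" and "top" come from the suggestion above.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Script:
    code: str
    name_prefix: str  # hypothetical stand-in for the real character data
    parent: str       # "top", or the name of the script this one derives from


# Deliberately listed with Latinx before Latn, to show that list order does not decide.
SCRIPT_DATA = [
    Script("Latinx", "LATIN", parent="Latin"),
    Script("nv-Latn", "LATIN", parent="Latin"),
    Script("Latn", "LATIN", parent="top"),
    Script("Grek", "GREEK", parent="top"),
    Script("Cyrl", "CYRILLIC", parent="top"),
]


def covers(script, text):
    """Return True if every non-space character of text belongs to the script."""
    chars = [c for c in text if not c.isspace()]
    return bool(chars) and all(
        unicodedata.name(c, "").startswith(script.name_prefix) for c in chars
    )


def find_best_script_by_parent(text):
    """Try scripts whose parent is "top" before any derived script."""
    # sorted() is stable: top-level scripts move to the front, the rest keep their order.
    ordered = sorted(SCRIPT_DATA, key=lambda s: s.parent != "top")
    for script in ordered:
        if covers(script, text):
            return script.code
    return None


print(find_best_script_by_parent("C"))
```

Compared with ranking by code length, an explicit parent field states the relationship in the data itself. It still does not settle what to do when two scripts marked "top" match the same text.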
I added the parent to all scripts in Module:scripts/data. Feel free to check whether I did it right. I'm not sure what to do with cases like Jpan, Hira, Kana, Hani, Hans, where scripts overlap, so when in doubt I used parent = "top", in all cases.
I also created a function :getParent(). I tested it; it's working.
I don't know yet whether I'll be able to make findBestScript give priority to scripts that have parent = "top". If you'd like to do it, please be my guest. Otherwise, I think I'll try later.