Derbeth
Welcome!
Hello, and welcome to Wiktionary. Thank you for your contributions. I hope you like the place and decide to stay. Here are a few good links for newcomers:
- Wiktionary Tutorial
- How to edit a page
- How to start a page
- Our format guidelines
- Criteria for inclusion
- Wiktionary Sandbox (a safe place for testing syntax)
- What Wiktionary is not
- FAQ
I hope you enjoy editing here and being a Wiktionarian! By the way, you can sign your name on Talk (discussion) and vote pages using four tildes, like this: ~~~~, which automatically produces your name and the current date. If you have any questions, see the help pages, add a question to the beer parlour or ask me on my Talk page. Again, welcome! --Connel MacKenzie 08:05, 22 July 2006 (UTC)
Audio
I see you're doing some audio today. Thank you, and please keep it up! Polish is one of those languages that I look at and think, "can I buy a vowel?" It looks to me like you're doing just fine, but if you have any questions, please let me know. I have a bit of experience at it, myself, if not in Polish. —Dvortygirl 14:59, 16 October 2006 (UTC)
Where do you take IPA pronunciation from? --Derbeth talk 14:26, 16 October 2006 (UTC)
- Only from speaking it a little and studying it, and from my knowledge of phonetics. There is a good guide to the basic phonemes of Polish at w:Polish phonology. Why, did I get something wrong? Widsith 17:40, 16 October 2006 (UTC)
Translation help
Could you check/add the Polish, Upper Sorbian, and Yiddish translations for listen and parrot? Thanks, --EncycloPetey 11:01, 20 July 2007 (UTC)
- Thank you for your help, --EncycloPetey 09:22, 21 July 2007 (UTC)
Arabic pronunciations
Just a reminder that you have to be careful with these. You’re putting some of them with the wrong word. In Arabic, a single spelling can be pronounced in numerous ways with different parts of speech and different meanings. You always have to look for the correct one and not just plop it down on the first one you see. —Stephen 22:47, 31 October 2007 (UTC)
Problem with one of the Mandarin audio files
- Just so you are aware, the audio file which is currently included in 因為/因为 is incorrect. The filename says zh-yīnwèi.ogg, which should be the correct file. However, the man in the audio is actually saying yīnwéi. If the file were named zh-yīnwéi.ogg, it would be accurately labeled (but not correct for 因為/因为). Hopefully, this note makes sense. This is the only audio file that I have come across so far which is mislabeled. -- A-cai 13:21, 26 April 2008 (UTC)
Uploading Hungarian audio
Hi Derbeth, I'd like to upload Hungarian audio to Wiktionary using DerbethBot. So far I used Shtooka for recording, then uploaded the .ogg files to Commons and added the audio template to the entry manually. Very time-consuming. How does DerbethBot work? Would you have time to help? The Hungarian pronunciation is much simpler than English. Only one pronunciation section per entry, and the audio template goes to the end of the section below other possible items. Or if there is no pronunciation section, it would have to be created. Thanks. --Panda10 19:04, 17 June 2008 (UTC)
- Thanks for responding. There is no hurry. Your master's thesis is much more important. Please contact me at your convenience. Good luck! --Panda10 18:26, 4 July 2008 (UTC)
- I don't have a lot right now, maybe about 40 files and they are on my PC. I tried to upload them to Commons using Commons:Tools/Commonist but it did not work. It somehow could not log in with my valid ID and password. How do you upload the files to Commons? I think it would be great help if I could just figure out how to do that since I don't have thousands of files and I don't want to take your time for such a small amount. Thanks. --Panda10 18:24, 6 November 2008 (UTC)
- I do use Shtooka to record words in .ogg format. Does your script handle special characters in file names such as áéíóöőúüű? --Panda10 23:35, 6 November 2008 (UTC)
- I don't have index.tags.txt with file descriptions. Is this something that I should create or is it something that Shtooka is supposed to create? And do you mean file name when you say description? One file name per line? --Panda10 00:41, 7 November 2008 (UTC)
No, I just copied it from Polish Wiktionary ^_^. I have no Lower Sorbian dictionaries, so I can't verify it either. There's the word for border, granica, so it would be reasonable to assume that there exists a verb with sense to border too, but whether it's granicyś or something else, I have no knowledge of.. --Ivan Štambuk 12:49, 6 October 2008 (UTC)
Wolof pronunciation
This is a list of more words in the Wolof language; you can check it at this address: here. Can I help you record the pronunciation of words? Wolof is my mother tongue. Thanks for your work. OK, --Ahloubadar 01:23, 1 May 2009 (UTC)
Audio bot
Hi, I noticed that you have a bot that adds audio. Would you be able to run it on Simple Wiktionary?--Brett 15:01, 9 May 2009 (UTC)
- Simple is only English.--Brett 18:05, 9 May 2009 (UTC)
Hi, I reverted your edit as it has clearly attestable plural use. [1]. (Though I don't disagree there will be others that are actually incorrect). Conrad.Irwin 10:12, 10 May 2009 (UTC)
- Ok, I've updated the entry to show countable and uncountable for the crop sense, and uncountable for the shaving sense (I was confused, because I saw "shaving" in the books.google.com page and assumed it was face-shaving, on closer inspection it was field-shaving :D). Conrad.Irwin 10:53, 10 May 2009 (UTC)
Audio bot, again
Hello, Derbeth. I'm interested in your Audio Bot for importing my recordings of Armenian words. If it’s possible, how do I feed your bot with recordings? If I upload a file like this to Meta, can the bot import it to an entry like աշխարհ, as I have done it manually? --Vahagn Petrosyan 14:46, 13 May 2009 (UTC)
Hello. Do you check this page? I left some comments. ---> Tooironic 01:16, 26 December 2011 (UTC)
Bot request
Hi Derbeth,
If you have time, could you run your audio bot again? See also here. Thanks in advance! -- Curious (talk) 21:01, 22 January 2014 (UTC)
- Thank you very much for your bot run! -- Curious (talk) 22:32, 27 February 2014 (UTC)
Some Thai audio files
Hello! Your bot keeps adding incorrect audio files to these entries: ความรัก and จีน. Could you please prevent it from doing that again? Thank you so much!
P.S. Correct audio files are not available at the moment.
User:DerbethBot's automatic pronunciation adder might need some workaround for this new template that can contain audio files under parameters named a, audio, a2, a3 etc. The bot doesn't need to be able to add pronunciations this way (leaving a report or similar is perfectly fine), but it should hopefully at least recognize the presence of the audio files. — surjection ⟨?⟩ 22:14, 17 August 2019 (UTC)
- Thank you for the reminder! --Derbeth talk 10:34, 18 August 2019 (UTC)
Since your last run of User:DerbethBot the parameter |lang= has been removed, as in many other templates in a general cleanup. Now the positional parameter |1= should be used instead. Fay Freak (talk) 23:19, 11 November 2019 (UTC)
hyphenation and hyph
Hi, I thought I'd notify you that DerbethBot doesn't treat the templates {{hyphenation}} and the more recent shorthand form {{hyph}} in the same way. ←₰-→ Lingo Bingo Dingo (talk) 08:59, 15 January 2020 (UTC)
- What do you mean by that? My bot does not process any of those templates. --Derbeth talk 18:22, 15 January 2020 (UTC)
Armenian pronunciations
Hi. Wikimedia Commons uses the file name Hy-entry.ogg for pronunciations in Eastern Armenian and HyW-entry.ogg for pronunciations in Western Armenian. Can your bot automatically add the description based on the file name? I.e. do this. --Vahag (talk) 18:06, 29 August 2020 (UTC)
- Sure, I will make an appropriate change. --Derbeth talk 19:34, 31 August 2020 (UTC)
- Thanks. Please be sure to use "Eastern Armenian" and "Western Armenian", not the less common "East Armenian" and "West Armenian". --Vahag (talk) 10:20, 1 September 2020 (UTC)
- This is done. Note that changes will apply only to new edits and currently there are no new Armenian pronunciation files. --Derbeth talk 08:25, 6 September 2020 (UTC)
- Thanks! --Vahag (talk) 09:56, 6 September 2020 (UTC)
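The prefix-to-description rule discussed above can be sketched roughly as follows. This is only an illustration and not DerbethBot's actual code: the table name `PREFIX_LABELS` and the function `describe_audio` are hypothetical, and only the two prefixes and labels mentioned in this thread are assumed.

```python
from typing import Optional

# Hypothetical lookup table; only the prefixes named in the discussion are listed.
PREFIX_LABELS = {
    "HyW-": "Western Armenian",
    "Hy-": "Eastern Armenian",
}


def describe_audio(file_name: str) -> Optional[str]:
    """Return a description for a Commons file name, or None if no prefix matches."""
    # Try longer prefixes first so a more specific prefix always wins.
    for prefix in sorted(PREFIX_LABELS, key=len, reverse=True):
        if file_name.startswith(prefix):
            return PREFIX_LABELS[prefix]
    return None
```

For example, `describe_audio("HyW-entry.ogg")` would give "Western Armenian", while a file with an unrelated prefix gives `None` and would fall back to whatever default description the bot uses.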
Hello again. Would it be difficult to run the bot to add the audio descriptions to old edits? --Vahag (talk) 14:59, 25 October 2020 (UTC)
- Can you ask other bot owners in Beer Parlour? Fixing markup of already existing entries isn't the thing I have ready. I would need to implement it from scratch. --Derbeth talk 15:53, 25 October 2020 (UTC)
- OK, no problem. --Vahag (talk) 16:24, 25 October 2020 (UTC)
Hi again! Can you make DerbethBot treat <և> and <եւ> as the same thing? E.g. both File:Hy-բարև.ogg and File:Hy-բարեւ.ogg should be added both to բարև and բարեւ, whichever exists. --Vahag (talk) 18:09, 19 August 2022 (UTC)
- Hello, looks like an easy thing, I will let you know when I'm done. --Derbeth talk 17:50, 20 August 2022 (UTC)
- I've done this but it seems all existing Armenian entries have their audio right. Maybe it will help in the future. --Derbeth talk 10:17, 21 August 2022 (UTC)
- In the past pronunciation files with և in them were not uploaded to entries with եւ, e.g. File:Hy-բարև.ogg to բարեւ. We recently moved all եւ spellings to և in Wiktionary. Vahag (talk) 14:32, 21 August 2022 (UTC)
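One way to treat the two spellings as equivalent is to generate both variants of a word and try each as a page title. The sketch below is an assumption about how this could be done, not the change Derbeth actually made; `spelling_variants` is a hypothetical helper. It relies only on the fact that the ligature և is U+0587 and the digraph եւ is U+0565 followed by U+0582.

```python
LIGATURE = "\u0587"        # ARMENIAN SMALL LIGATURE ECH YIWN
DIGRAPH = "\u0565\u0582"   # ARMENIAN SMALL LETTER ECH + YIWN


def spelling_variants(word: str) -> set:
    """Return the ligature and digraph spellings of an Armenian word."""
    return {
        word.replace(DIGRAPH, LIGATURE),
        word.replace(LIGATURE, DIGRAPH),
    }
```

A bot could then check each variant against existing entries and add the audio to whichever page exists.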
Can the bot add the audio after the IPA, as described in Wiktionary:Pronunciation#Section_layout_and_templates? --Vahag (talk) 16:58, 13 January 2023 (UTC)
- Sure, thanks for letting me know about the rule. --Derbeth talk 07:48, 14 January 2023 (UTC)
I noticed that many Armenian pronunciations that are being now manually added by ԱշոտՏՆՂ were not picked up by your bot even though they were uploaded to the Commons long ago. Is there a problem with those files? --Vahag (talk) 08:04, 19 April 2024 (UTC)
- Hello. I found 181 such files (a Hy-NAME.ogg file exists on Commons, NAME exists here, but the entry isn't in Category:Armenian terms with audio links). The list is here (here). ԱշոտՏՆՂ (talk) 08:47, 19 April 2024 (UTC)
Hello, sorry for my late reply! I was on vacation and had to catch up with my things. The reason my bot does not add audio to some entries is that multiple etymologies are not supported, as in cases like բջջային. This was requested on en.wikt because multiple etymologies often mean that the pronunciation differs (at least in some languages). A human has to decide in each case how to handle the audio file. I prepared a list in User:DerbethBot/Add manually that you can use to add missing audio files. --Derbeth talk 08:11, 8 May 2024 (UTC)
- Armenian orthography is phonetic, homonyms are pronounced the same way. Is it possible to automatically upload audio for Armenian multiple etymologies? It goes to an L3 ===Pronunciation=== header, before ===Etymology 1===. Vahag (talk) 16:38, 8 May 2024 (UTC)
- I will see if I can adapt the code to make an exception for Armenian. --Derbeth talk 06:20, 9 May 2024 (UTC)
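The rule under discussion (skip entries with several numbered etymology sections unless the language is exempt) might look something like the sketch below. Everything here is hypothetical: the names `needs_human_decision` and `ETYMOLOGY_EXEMPT`, and the idea of a per-language exemption set, are assumptions for illustration, not the bot's real logic.

```python
import re

# Matches level-3 headers such as "===Etymology 1===" on their own line.
NUMBERED_ETYMOLOGY = re.compile(r"^===\s*Etymology \d+\s*===\s*$", re.MULTILINE)

# Hypothetical set of language codes where homographs are assumed to share
# one pronunciation, as argued for Armenian above.
ETYMOLOGY_EXEMPT = {"hy"}


def needs_human_decision(lang_code: str, language_section: str) -> bool:
    """True if the section has several numbered etymologies and no exemption."""
    count = len(NUMBERED_ETYMOLOGY.findall(language_section))
    return count > 1 and lang_code not in ETYMOLOGY_EXEMPT
```

With such a check, an Armenian section containing `===Etymology 1===` and `===Etymology 2===` would be processed automatically, while the same layout in another language would still be routed to the manual list.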
File name format
Hi Derbeth. I may have missed it, but I couldn't find any documentation on how you handle country codes in file names. Will you automatically identify File:Sw-ke-tatu as belonging at tatu#Swahili? Even better, do you automatically add Audio (Kenya) as the description when the country code is provided? Thanks! —Μετάknowledgediscuss/deeds 06:19, 5 September 2020 (UTC)
- Hello. The naming format of pronunciation files is defined in commons:Category:Pronunciation but it is not very precise. I see there are nearly no files in commons:Category:Swahili pronunciation - do you plan to upload them? The naming scheme you suggested will be absolutely fine. I can configure my bot to describe Sw-ke- files as "Audio (Kenya)". Cheers --Derbeth talk 16:17, 5 September 2020 (UTC)
- Yes, the plan is to upload a lot of them soon. Could you please do that, and also configure Sw-tz- files as "Audio (Tanzania)"? Thank you!
- On another note, it seems like your bot doesn't upload files with the Lingua Libre naming format, like File:LL-Q150 (fra)-Fhala.K-Afrique.wav. (That's LL-[Wikidata object number of the language] ([ISO 639-3 code of the language])-[Username of uploader]-Word.filetype, if it wasn't clear.) Would it be possible to support those as well? —Μετάknowledgediscuss/deeds 16:43, 5 September 2020 (UTC)
- Sure, I will add a case for Tanzania as well. I will make changes after any new files appear (please let me know as I don't monitor them). I don't see much sense in making changes prior to files appearing: the bot is started manually by me, it's not that immediately as files appear the bot begins its work. By seeing the files I will be able to test that my edits are correct.
- I used to have an extensive support for Lingua Libre recordings in different languages. However I became suspicious of those files as I found out they allow non-native speakers to add recordings. It's a ridiculous, insane idea and disqualifies the whole project as a useful data source. I blacklisted all their recordings except for Mandarin, where I found only Chinese citizens made recordings. Independently other Wiktionary users became dissatisfied with Lingua Libre recordings, even so much that they suggested blocking my bot for adding Lingua Libre files. So generally I don't plan to support Lingua Libre any more. --Derbeth talk 08:33, 6 September 2020 (UTC)
- Thanks. A few Sw-ke files have already been uploaded, with more to come, but there'll be no Sw-tz in the near future.
- And thank you for explaining the history with LL (maybe that should be mentioned in the bot FAQ?). Would you consider using a whitelist, or would that be too much work? I would be happy to help by going through the files and making a whitelist of users who make good-quality recordings and their native language(s). That would probably eliminate the issues raised about the non-native audio. —Μετάknowledgediscuss/deeds 05:51, 9 September 2020 (UTC)
- I could apply a whitelist to LL files, but this would probably need to wait a month or two for the implementation. It's not complicated, but I don't want to spend too much time working on my bot.
- As for the Swahili files, so far I see only 1 new file. Are the rest properly categorised? --Derbeth talk 19:50, 10 September 2020 (UTC)
- I guess only one has been uploaded so far... not entirely sure what the hold up is, but maybe the idea is to wait until there are enough for mass upload? As for a LL whitelist, there's no rush, but I think it would be a great addition. Maybe you could ping me when you feel ready to work on it? —Μετάknowledgediscuss/deeds 20:26, 10 September 2020 (UTC)
- I would prefer the other way around - that you try to prepare a small whitelist and let me know, and I will implement a solution after some time. As I have written before, I prefer to have some real-life data before I start coding. --Derbeth talk 05:46, 11 September 2020 (UTC)
- I have created a small whitelist that you can work with: User:Metaknowledge/audiowhitelist. In the cases where a country is given, that should ideally be displayed in the {{audio}} template as well. Let me know if you need anything else to proceed. —Μετάknowledgediscuss/deeds 06:29, 25 September 2020 (UTC)
- Thanks. I made some preliminary work on whitelisting Lingua Libre for Bengali, so applying your whitelist shouldn't be complicated. --Derbeth talk 17:24, 25 September 2020 (UTC)
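A rough sketch of how Lingua Libre file names in the format described earlier in this thread could be parsed and checked against a whitelist. The names `LLFile`, `parse_ll_name` and `is_whitelisted` are hypothetical, and the whitelist is assumed to be a set of (ISO 639-3 code, username) pairs; the real format of User:Metaknowledge/audiowhitelist may differ. Note the limitation stated in the comments: a hyphen inside a username cannot be told apart from the separator.

```python
import re
from typing import NamedTuple, Optional, Set, Tuple


class LLFile(NamedTuple):
    wikidata_id: str
    iso_code: str
    user: str
    word: str
    extension: str


# LL-[Wikidata id] ([ISO 639-3 code])-[uploader]-[word].[filetype]
# Assumption: the username contains no hyphen, so the first hyphen after
# the language part separates the username from the word.
LL_NAME = re.compile(r"^LL-(Q\d+) \(([a-z]{3})\)-(.+?)-(.+)\.(\w+)$")


def parse_ll_name(file_name: str) -> Optional[LLFile]:
    """Split a Lingua Libre file name into its parts, or return None."""
    match = LL_NAME.match(file_name)
    return LLFile(*match.groups()) if match else None


def is_whitelisted(file_name: str, whitelist: Set[Tuple[str, str]]) -> bool:
    """True if the file parses and its (language, uploader) pair is allowed."""
    parsed = parse_ll_name(file_name)
    return parsed is not None and (parsed.iso_code, parsed.user) in whitelist
```

Keying the whitelist on both language and uploader reflects the concern raised above: a speaker may be native in one language while also recording words in others.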
- The bot is adding Moroccan Arabic (ary) audio to Arabic (ar) entries. Can you please remove these and fix the code? —Μετάknowledgediscuss/deeds 16:48, 5 October 2020 (UTC)
The bot added this to lumi. It's however not a reading of the word lumi, but of the Finnish Wikipedia article Lumi. The filename isn't wrong (it matches the format used by other such clips), so I don't know what the best approach is to prevent the bot from adding it again. Has this been a problem before? — surjection ⟨??⟩ 21:15, 19 September 2020 (UTC)
- It's the first time I see such a problem. I will blacklist commons:Category:Spoken Wikipedia - Finnish and this will fix the problem. Thanks for letting me know. --Derbeth talk 08:03, 20 September 2020 (UTC)
audio En-six (2).oga
Hello,
Your bot added a barely audible OGA file in 2018... https://en.wiktionary.org/w/index.php?title=six&type=revision&diff=49294671&oldid=49270177 --Wisdood (talk) 13:54, 21 September 2020 (UTC)
- It's not the bot's fault. The only way to solve this is to get it deleted at Commons. I will start the deletion process there now. —Μετάknowledgediscuss/deeds 17:45, 21 September 2020 (UTC)
Occitan audio
Hello, do you plan on importing Occitan audio from LL user Davidgrosclaude? I added his name to the whitelist, but I haven't seen any of his files imported yet. Hopefully I can clear up any doubts you have about them. Ultimateria (talk) 18:28, 16 November 2020 (UTC)
- Just a reminder about the audio whitelist. I'm also looking forward to the Swahili audio files being added, as there are now thousands of files at Commons! —Μετάknowledgediscuss/deeds 22:35, 11 January 2021 (UTC)
- And on another topic, I see that your bot added a duplicate audiofile here; might be worth finding a way to avoid that happening. —Μετάknowledgediscuss/deeds 19:21, 17 January 2021 (UTC)
- These are two separate files, so it's not a duplicate. Is this a bug? There may be different ways to pronounce a word. Speakers may be from different regions, have different gender etc. I try to avoid adding two new files for the same 'dialect' of a language, but it's not easy to automatically parse existing information. --Derbeth talk 06:55, 21 January 2021 (UTC)
- Nope, they're the same file. One is just a redirect to the other. That's why it should be something your bot can avoid, right? —Μετάknowledgediscuss/deeds 17:15, 21 January 2021 (UTC)
- The person who created the redirect should have fixed all the usages so that all point to the new file name. The job of my bot is to add new audio, not fix old incorrectly added audio files. At least the edit of my bot makes it clearly visible that something is wrong with the entry and it can be fixed. There is no straightforward way to prevent such a case. I would need to check each single audio file for redirects. Is this a frequent case? I don't want to slow down the bot and spend time implementing a check that will prevent one or two such cases per month. --Derbeth talk 13:58, 23 January 2021 (UTC)
- I see. In that case, I agree that it's not your responsibility. On the other topic, when do you think you can add the Swahili audio? —Μετάknowledgediscuss/deeds 18:03, 23 January 2021 (UTC)
- At the end of this week I have a scheduled run of my bot for German. I will try at the first week of February. --Derbeth talk 06:40, 25 January 2021 (UTC)
Esperanto entries should not be using Template:audio
Esperanto has Template:eo-pron that is more useful than audio and completely supersedes it for Esperanto usage; see https://en.wiktionary.org/w/index.php?title=boltingo&diff=prev&oldid=63027212 .--Prosfilaes (talk) 08:41, 10 July 2021 (UTC)
Audio bot
Hi, could you add contents of this next time you run the bot? Allahverdi Verdizade (talk) 18:35, 10 September 2021 (UTC)
- Hello. Sure, I will. Thanks for letting me know. --Derbeth talk 06:58, 14 September 2021 (UTC)
- Hey! Could you also add the Spanish audios of AdrianAbdulBaha (he is whitelisted). Thanks in advance! — Fenakhay (حيطي · مساهماتي) 18:24, 24 September 2021 (UTC)
Fenakhay: does this edit look good? For IPA there is 'Spain' and 'Latin America'. Is there a rule to use something like this for audio? --Derbeth talk 18:29, 26 September 2021 (UTC)
The audio should be below the IPA. You can add (Colombia) in the audio for this user. — Fenakhay (حيطي · مساهماتي) 19:34, 26 September 2021 (UTC)
Hi, Derbeth. Can you also add https://commons.wikimedia.org/wiki/Special:ListFiles/Vahagn_Petrosyan? All should be marked as "Eastern Armenian". --Vahag (talk) 19:25, 27 January 2022 (UTC)
Done. --Derbeth talk 08:14, 28 January 2022 (UTC)
- Thanks. Will my new recordings be automatically uploaded or should I ask you again after a new batch? Vahag (talk) 20:16, 28 January 2022 (UTC)
They will be picked up automatically, but please be patient – there have been lots of uploads for other languages, and Armenian (ISO code hy) may need to wait. --Derbeth talk 21:32, 28 January 2022 (UTC)
- No rush. I just wanted to know if the system is automatic. Vahag (talk) 12:46, 29 January 2022 (UTC)
My Armenian uploads starting from 17:59 August 19 are not being picked up by your bot (the reason may be because their file name format is different). Can you check? All my uploads should continue to be tagged as "Eastern Armenian". --Vahag (talk) 12:32, 26 September 2022 (UTC)
- Vahag: This is because those files have a new speaker: Yevgenya Shamshyan. Does this person also pronounce East Armenian? --Derbeth talk 09:51, 28 September 2022 (UTC)
- Should work fine now. --Derbeth talk 15:49, 28 September 2022 (UTC)
IPA Templates With Parameters on Separate Lines
I've just cleaned up a few cases like this where the template took up more than one line and what DerbethBot thought was the line after the template was actually in the middle of it. Is there any way to check for this? I know parsing template syntax is a pain, but perhaps you could check for pipes or closing curly brackets with no opening curly brackets in between. Thanks! Chuck Entz (talk) 15:24, 2 October 2021 (UTC)
- I have just added a fix. Thanks a lot for catching this. --Derbeth talk 15:51, 2 October 2021 (UTC)
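One simple way to avoid inserting a line in the middle of a multi-line template is to count opening and closing double braces before the insertion point. This is a minimal sketch of that brace-counting idea, which is a slightly different check from the one suggested above and not necessarily the fix that was applied; `inside_open_template` is a hypothetical name, and triple-brace parameter syntax is deliberately ignored.

```python
from typing import List


def inside_open_template(lines: List[str], index: int) -> bool:
    """True if inserting before lines[index] would land inside an unclosed {{...}}."""
    before = "\n".join(lines[:index])
    # More openers than closers means some template is still open here.
    return before.count("{{") > before.count("}}")
```

A bot could advance `index` until this returns `False` and only then insert the audio line.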
Audio bot and pages with several Etymology sections
Thanks for your audio bot! I believe I found a bug: the bot doesn't consider words that have several "Etymology" sections. Here is one such page where I had to add the audio link manually: vua. I'd appreciate it if you could check if there's a bug. Thanks! Tbm (talk) 02:20, 6 November 2021 (UTC)
- Hello. This isn't a bug, see the FAQ in User:DerbethBot. When there are multiple etymology sections, a human decision is required. I used to prepare reports which listed all audio files that require a human action – but stopped since no one took any action. If you are interested in such a report, I can prepare it. --Derbeth talk 08:26, 6 November 2021 (UTC)
- In Swahili the pronunciation doesn't depend on the etymology. Can your script handle this? If not, I can add them manually. If you can generate a report easily, that would be nice. If not, don't worry. I am working on a script to download all Swahili entries anyway, so I could just grep through them. Tbm (talk) 07:24, 13 December 2021 (UTC)
- It won't be easy to handle Swahili differently. For now I can give you the report: User:DerbethBot/Add manually#Swahili. --Derbeth talk 08:55, 14 December 2021 (UTC)
- Thanks for the list. I believe I've fixed all of them. Tbm (talk) 03:19, 25 December 2021 (UTC)
New template Template:it-pr
Hi. I have created a new template {{it-pr}} that is similar in semantics to {{fi-pronunciation}}/{{fi-p}}. Given the respelling of an Italian term, it auto-generates the pronunciation, rhyme and hyphenation. It allows the audio to be specified, in a format like this (for the page forchetta):
{{it-pr|forchétta<audio:It-una forchetta.ogg>}}
The caption for the audio can be specified after a semicolon, like this (for the page tra):
{{it-pr|tra**<audio:LL-Q652 (ita)-Happypheasant-tra.wav;Audio (Milan)>}}
It would be great if your bot could recognize the presence of audio files specified this way, and not redundantly add them. All you would need to do is check (for an audio file FILENAME) to see if the string <audio:FILENAME> or <audio:FILENAME; exists in any numbered parameter of {{it-pr}}.
You don't need to be able to understand the format of this template in detail or modify the format, and it's fine to add an audio line at the end of the section, like this:
{{it-pr|tra**}}
* {{audio|it|LL-Q652 (ita)-Happypheasant-tra.wav|a=Milan}}
Thanks!
Benwing2 (talk) 21:50, 7 November 2021 (UTC)
- Thanks for letting me know, I have just made changes. --Derbeth talk 07:00, 8 November 2021 (UTC)
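The check described in this request can be sketched as a plain substring test inside each {{it-pr}} call. This is only an illustration of the rule as stated in the request, not the bot's real implementation; `it_pr_has_audio` is a hypothetical name, and the regular expression assumes no nested templates inside {{it-pr}}.

```python
import re

# Captures everything between "{{it-pr|" and the next "}}".
# Assumption: {{it-pr}} calls do not contain nested templates.
IT_PR_CALL = re.compile(r"\{\{it-pr\|(.*?)\}\}", re.DOTALL)


def it_pr_has_audio(wikitext: str, file_name: str) -> bool:
    """True if file_name already appears as <audio:...> inside an {{it-pr}} call."""
    needles = ("<audio:" + file_name + ">", "<audio:" + file_name + ";")
    return any(
        needle in params
        for params in IT_PR_CALL.findall(wikitext)
        for needle in needles
    )
```

Both forms from the examples are covered: the bare `<audio:FILENAME>` form and the `<audio:FILENAME;caption>` form.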
Polish pronunciation
Hey there! I've got two requests regarding Polish pronunciation.
Firstly, I've noticed your bot is still adding pronunciation using the {{audio}} template. Somebody should've notified you earlier probably, so sorry for that, but {{pl-p}} has been introduced which combines IPA, syllables, rhymes and pronunciation; the audio is added as an argument, so for example {{audio|pl|Pl-różany.ogg|Audio}} should be {{pl-p|a=Pl-różany.ogg}}. Would it be possible to change the code to add the audio to the template as an argument like that?
Secondly, looking through its edits, I've noticed the bot seems to omit Polish LinguaLibre recordings (while it appears to be adding French ones, looking at its edits from December 15th, so I assume this is not how it's supposed to be working). Any way to amend that?
Thanks in advance. Hythonia (talk) 18:57, 23 December 2021 (UTC)
- Hello. Thanks a lot for letting me know! I will change the bot, but for now I will omit Polish. As for LinguaLibre, I was asked at some point to stop adding LinguaLibre recordings, because it includes non-native authors and some irresponsible people upload those non-native recordings to Commons (some discussion here: Wiktionary:Beer_parlour/2020/July#Labeling_non-native_audio). I use a whitelist for verified LL authors; for now the list includes author Poemat. If there are more authors, let me know. --Derbeth talk 09:29, 24 December 2021 (UTC)
- Gotcha, makes sense to be careful with the recordings. We seem to have three other natives, though: Olaf, KaMan, and ThineCupOverfloweth. It'd be nice to include them all as well. Hythonia (talk) 13:59, 24 December 2021 (UTC)
- On this note - it'd be nice if when the bot runs (if we can get it to add audio to the {{pl-p}}), it could remove the {{rfap|pl}} I've been adding. Vininn126 (talk) 12:48, 12 January 2022 (UTC)
- The bot always removes rfap, regardless of the language. Adding support for pl-p will need to wait a while. --Derbeth talk 19:43, 12 January 2022 (UTC)
- Hi. I know you've been working on the bot and that's cool. Recently, I've been recording words thru lingualibre and I'd like to know if your bot will be able to add the files to the correct entries? Tashi (talk) 18:31, 13 January 2022 (UTC)
- Hi Tashi. I'm adding you to the whitelist for Polish. Your audio files will be added automatically. --Derbeth talk 07:56, 14 January 2022 (UTC)
- Thank you! Could you let me know when the process starts? Tashi (talk) 19:37, 14 January 2022 (UTC)
- I'm already adding on de.wikt: de:Special:Contributions/DerbethBot. For en.wikt I need to modify the code to handle the new template. I need some spare time for that and I cannot declare any particular time. I plan to do it somewhere before the end of January. --Derbeth talk 21:51, 14 January 2022 (UTC)
Hythonia: what should I do in cases like besserwisser? This entry does not use pl-p. --Derbeth talk 07:52, 28 January 2022 (UTC)
- All pages have been converted to have pl-p, or {{IPA|pl}}. A thousand-odd non-lemmas still don't have it, but I'm working on those. Vininn126 (talk) 15:54, 2 February 2022 (UTC)
- Notably: most will just be {{pl-p}} and to add audio, they need to have {{pl-p|a=FILE NAME}}. Some will (accidentally) have an unclosed a=, in which case the file name should just be added directly. Vininn126 (talk) 10:46, 3 February 2022 (UTC)
- It would seem most bot work is done. Should be safe to run the audio bot now! Vininn126 (talk) 21:04, 26 February 2022 (UTC)
Thanks for letting me know! I will re-run my bot in a few days. --Derbeth talk 11:35, 27 February 2022 (UTC)
English Pronunciation
Hello! I recently recorded a bunch of English words on lingua libre, I will probably do more. Could we get the bot to deploy them? Vininn126 (talk) 23:23, 22 January 2022 (UTC)
- Hello. Are they uploaded to commons:Category:English pronunciation? --Derbeth talk 09:46, 23 January 2022 (UTC)
- It's here[2], how do i get it to categorize? Vininn126 (talk) 12:22, 23 January 2022 (UTC)
- You don't need to change anything, it's already in a subcategory of English pronunciation. I will add your pronunciation soon. --Derbeth talk 14:38, 23 January 2022 (UTC)
- Not sure if I made an oopsie, but I marked my place of residence as Warsaw, despite being a General American speaker. Not sure if there's anything I need to change about that. Vininn126 (talk) 18:07, 23 January 2022 (UTC)
- Not a problem, I added you manually and marked your pronunciation as US: example. --Derbeth talk 06:41, 25 January 2022 (UTC)
Afrikaans Pronunciation
I noticed that Wikimedia Commons has around 2,900 audio files for Afrikaans from Lingua Libre (Afrikaans_pronunciation) but only around 300 are linked on Wiktionary. Can your script handle these? As an aside, some of the 300 have a description of "Audio (AF)". I don't think it makes sense to indicate "(AF)" since they are in the Afrikaans section. Should I remove that? What do you think? (Some say "Johannesburg", which I guess might make some sense; I'm not sure how different pronunciations in different regions are). Tbm (talk) 00:07, 29 March 2022 (UTC)
- Hello. Thanks for letting me know, I did not notice those files are available. I will change the script to add them. I cannot speak of files linked so far – this wasn't done by my script. As for different regions, I also don't know, I'm not very active on Wiktionary. Sorry for that ;) You can ask in Wiktionary:Beer parlour. --Derbeth talk 06:43, 29 March 2022 (UTC)
Wrongly added Norwegian pronunciations
Hi Derbeth, I have noticed that your bot added audio files relating to various other languages (mainly, but not only, German) in the Norwegian Bokmål section of the entries. You can see a list at WT:Todo/Template language code doesn't match header under "audio" in the first column and "Norwegian Bokmål (nb)" in the third column. I assume this was an error - could you please take a look and correct it if possible? Thanks. This, that and the other (talk) 05:23, 13 May 2022 (UTC)
- Hello. Thanks for letting me know. I fixed my bot's code and will manually fix wrong edits. --Derbeth talk 06:38, 13 May 2022 (UTC)
Swahili Pronunciations
I've uploaded more Swahili audio to Commons. Can you please run your bot again? They are in the same format as before, so no changes to the scripts are required. Thank you! Tbm (talk) 06:06, 11 December 2022 (UTC)
- Thank you! Tbm (talk) 01:30, 12 December 2022 (UTC)
- It's not urgent but if you have time you could update Swahili again. I've uploaded more files recently. Thanks. tbm (talk) 19:02, 24 May 2023 (UTC)
- Sure! --Derbeth talk 20:25, 24 May 2023 (UTC)
- Thanks! tbm (talk) 07:31, 25 May 2023 (UTC)
Maltese pronunciations
Could you please run your bot for Maltese pronunciations by the user GħawdxiVeru? He is a native speaker from Gozo, Malta. I've already added him to User:Metaknowledge/audiowhitelist.
Thanks in advance. — Fenakhay (حيطي · مساهماتي) 13:58, 11 December 2023 (UTC)
- Hi, I will, I have this on my list. :) The bot code needs small updates to support Maltese. --Derbeth talk 18:16, 11 December 2023 (UTC)
Hi, on Bri'ish, DerbethBot made this edit (diff), which added the same link twice. I assume that you would normally have it check for duplicates, but in this case the other file had an HTML entity in it that would have seemingly caused it to not identify the files as being the same. Theoretically it could be possible to (1) expand the names of the audio files already on the page to get rid of templates or anything else, and then (2) normalize the text by removing those entities and so on, which could potentially help the problem. But in the first place this page could be fixed to not use the HTML element. Kiril kovachev (talk・contribs) 20:46, 28 January 2024 (UTC)
- Hi, thanks for informing me. I actually do some normalization already, just HTML entity normalization isn't part of it (yet). I will enhance the code. --Derbeth talk 07:53, 29 January 2024 (UTC)
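For illustration only, the normalization step discussed in this thread could look roughly like the sketch below. It is not DerbethBot's actual code; the function names and the example file names are hypothetical, and step (1) from the report (expanding templates in file names already on the page) is not covered. It decodes HTML entities with Python's standard `html.unescape`, then applies the usual title clean-up (underscores, namespace prefix, first-letter case) before comparing names.

```python
import html
import re
import unicodedata


def normalize_audio_name(name: str) -> str:
    """Return a canonical form of an audio file name for duplicate checks.

    Hypothetical helper: the exact rules a real bot needs may differ.
    """
    name = html.unescape(name)                  # e.g. "&#39;" becomes "'"
    name = unicodedata.normalize("NFC", name)   # unify composed/decomposed letters
    name = name.replace("_", " ")               # underscores and spaces are equivalent in titles
    name = re.sub(r"\s+", " ", name).strip()    # collapse runs of whitespace
    # Drop an optional "File:" or "Image:" namespace prefix.
    name = re.sub(r"^(?:File|Image):\s*", "", name, flags=re.IGNORECASE)
    # The first letter of a file title is treated as case-insensitive here.
    return name[:1].upper() + name[1:]


def is_duplicate(candidate: str, existing: list[str]) -> bool:
    """True if `candidate` matches any name in `existing` after normalization."""
    target = normalize_audio_name(candidate)
    return any(normalize_audio_name(e) == target for e in existing)
```

With this sketch, a name written with an HTML entity and the same name written with a literal apostrophe compare as equal, so the second link would be skipped.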
Russian MP3s
Hello! I've recently started importing pronunciations to Commons (category). They are all high-quality. But the format is Ru-{{PAGENAME}}.mp3. How could I help you make it compatible with your bot? Nyuhn (talk) 04:46, 12 March 2024 (UTC)
- Hi, thanks for letting me know. When I wrote my bot, it was not allowed to upload mp3 files to Commons, so the bot ignored this extension. I made a small change and your files are now handled: edit. Your naming is fine, you don't need to change anything. --Derbeth talk 10:59, 12 March 2024 (UTC)
- Thank you! I've also noticed that there are a lot of Lingua Libre pronunciations lying at Commons and not yet referenced by Wiktionary. Some of Svetlov Artem's files have romanized names (264 items), but this has to be fixed by hand, I think.
- Missing nicknames:
- Native, good quality: 'DomesticFrog', 'Pacha Tchernof', 'Cokewanna', 'Rominf', 'Infovarius', 'Sofia Sycheva', 'DoctorandusManhattan'
- Native, medium quality: 'Герман Мейстер'
- Non-native and not from a Russian-speaking country: XANA000
- Nyuhn (talk) 18:16, 12 March 2024 (UTC)
- There was a strong objection against non-native pronunciation here on English Wiktionary, so I enabled just authors you mentioned as 'good quality'. After those objections, I only allow chosen nicknames from Lingua Libre. Thanks for checking the ones above. --Derbeth talk 07:36, 13 March 2024 (UTC)
Ukrainian (uk) Lingua Libre natives (by record quality):
- High: 'Fanat22012', 'Gzhegozh' (first letter of all entries in uppercase for no reason), 'Po ukraińsku (Andriana)', 'Renvoy', 'Snizhana Umanets', 'Tohaomg'
- Varies from high to low: 'Bicolino34' (273 records)
captions for audio files added
Hi. I changed {{audio}} so it automatically adds a caption "Audio" if the caption isn't explicitly given, and automatically adds a colon after the caption. I also added a bunch of parameters to allow you to specify the IPA, text, accent, etc. as separate parameters and in general I'm trying to eliminate the use of explicit captions as much as possible. Can you change your script not to auto-add a caption that reads "Audio"? And if your bot ever adds a different caption, let me know what it's doing and I'll tell you how to convert it to use the new params. Benwing2 (talk) 03:53, 5 June 2024 (UTC)
- Hi, thanks for letting me know. I will change my bot's code before the next run. Derbeth talk 10:19, 7 June 2024 (UTC)