
Admin again

Hey Dan. I'd like to nom you for adminship again. Sure, Wiktionary:Votes/sy-2016-08/User:Dan Polansky for admin failed and caused lots of people to talk, and since then you've only been blocked once, and I can't see anyone on your talk page reprimanding you. Wanna try again? --Wonderfool Dec 2018 (talk) 13:40, 1 January 2019 (UTC)

Back in 2016, the probability of passing seemed rather low to me. Now, the probability of passing (crossing the 2/3 threshold) seems to approach zero. It does not seem worth the energy. --Dan Polansky (talk) 14:55, 1 January 2019 (UTC)
Too bad. Admin votes are a great social event and usually generate such stimulating conversations. --Wonderfool Dec 2018 (talk) 16:46, 1 January 2019 (UTC)

Creating entries in languages you don't know

You've been making a lot of these recently, like дэмакрат (demakrat) and demokratas. In general, it's a bad habit to create entries in languages you know so poorly that you can't even provide gender. But if you're going to do this, you should make reasonable efforts for those entries to meet some basic level of quality. For example, that would mean using the proper headword-line templates so that when you don't know the gender, the entry is placed into a category requesting that. For a language like Lithuanian, where we also try to provide the accent and declensional paradigm on the headword for nouns, you can easily grab that from the very dictionary that you're adding as a reference. As it stands, once those entries are blue links that aren't in any maintenance categories, it will likely be a very long time until anybody cleans them up, or even checks whether you've gotten them right. —Μετάknowledgediscuss/deeds 16:35, 9 February 2019 (UTC)

Czech pivní sýr was created on 25 January 2019 by Metaknowledge, who does not profess to know Czech on their user page; no verification was entered into the entry except for Wikipedia. As for gender, my German is quite serviceable and actually battle tested, so to speak, but when entering German gender I always check with sources; the phrase "languages you know so poorly that you can't even provide gender" makes no sense to me. Accuracy and verification available to the customer are great things, and we should have more of them; by contrast, our redlinks in translation tables lack verification and people should work on turning them into bluelinks. --Dan Polansky (talk) 19:56, 9 February 2019 (UTC)
Yep, I don't speak Czech, so I am careful to check that I've got it right and that I can add whatever the headword ought to have (in this case, just gender) when I create such an entry. You are not putting in that level of care. Even for German, which you do know, you are not bothering to use {{de-noun}}, the standard for our entries. Why? —Μετάknowledgediscuss/deeds 21:21, 9 February 2019 (UTC)
My prime concern is with accuracy combined with verification; {{head}} is generally fine and does not require the user to enter various additional bits. I want my undivided attention to be channeled toward making sure that the semantic information I am entering is correct, and that the lemma is appropriate. The effort not only leads to new accurate entries equipped with verification artifacts but also to removal of incorrect information such as in diff and diff. --Dan Polansky (talk) 21:27, 9 February 2019 (UTC)
The main problem is with incomplete entries that don't show up in any maintenance categories and won't be noticed for years. If there are no maintenance categories showing, it might be a good idea to put {{attention|de|needs gender}} or something along those lines. Either that, or look it up in Duden online. I find myself doing both when I find misformatted new German entries in my patrolling, though my German is far worse than yours. Chuck Entz (talk) 22:55, 9 February 2019 (UTC)
English Wiktionary entries are usually incomplete. For instance, for English entries, only a tiny fraction of them have IPA pronunciation and another tiny fraction have at least one example sentence; these entries do not carry the {{attention}} template. People looking for incomplete entries to work on have to learn how to find them. For instance, insource:/head\|be\|noun\|g=}}/ finds Belarusian nouns without gender. On a related note, the reader looking for gender is very often served by the further reading or by another Wiktionary, both one click away; but they are only served so if an entry with a further reading exists. --Dan Polansky (talk) 06:18, 10 February 2019 (UTC)
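For illustration, the pattern from the insource: search above can also be tried offline on raw wikitext. The following is only a minimal Python sketch, not how the site search is implemented; the function name `lacks_gender` and the sample wikitext are hypothetical, and it assumes the pattern behaves like an ordinary regular expression with the pipes escaped.

```python
import re

# Same pattern as in the insource: search; the closing braces are escaped for clarity.
MISSING_GENDER = re.compile(r"head\|be\|noun\|g=\}\}")


def lacks_gender(wikitext: str) -> bool:
    """Return True if the text contains a Belarusian noun headword with an empty g= parameter."""
    return MISSING_GENDER.search(wikitext) is not None


# Hypothetical sample wikitext, made up for this example.
samples = {
    "entry A": "===Noun===\n{{head|be|noun|g=}}",
    "entry B": "===Noun===\n{{head|be|noun|g=m}}",
}

for title, text in samples.items():
    print(title, "lacks gender:", lacks_gender(text))
```

Note that the pattern only catches an empty g= parameter directly before the closing braces; a headword written as {{head|be|noun}}, with no g= at all, would not be found by it.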
Let's consider the hub benefits (one click away from other sources) on the example of Bulgarian entry китара. The entry is linked from guitar but since Bulgarian Wiktionary does not have the entry, there is no further link to another Wiktionary from the translation table. A user who lands in китара may be satisfied with the information provided: language, meaning and apparent verification (in this case I also entered gender, but let's suppose the gender is not there). If they are not satisfied, they can try the external link, which provides gender but no inflection. That failing, they can try a large competing Wiktionary, fr:китара, and there they find inflection and gender. They'll do well to try the French Wiktionary; some other Wiktionaries provide neither inflection nor gender. They can do all these things by clicking links with no typing; the only thing they have to type is guitar, in Latin script. --Dan Polansky (talk) 08:39, 10 February 2019 (UTC)
Later: For ease of reference: I now created Wiktionary:Beer parlour/2019/March#Translations in languages you don't know since another user has received a message to an effect that matches neither policy nor common practice. --Dan Polansky (talk) 10:06, 17 March 2019 (UTC)

Creating incomplete entries redux

I'm going to second what User:Metaknowledge said. You've recently created a whole lot of extremely low-quality entries in a lot of languages. PLEASE DO NOT DO THIS. If you're not willing to bother to learn how to create proper entries, it would be much better for you to refrain from creating entries at all than create bogus, low-quality entries. User:Metaknowledge has already addressed many of the issues, but I should note that e.g. an entry in Russian should

  1. include the accent
  2. use the proper headword template ({{head|ru|noun}} or {{head|ru|verb}} is NOT acceptable)
  3. include the declension or conjugation
  4. include the pronunciation using {{ru-IPA}}.

Your entry e.g. for полигамия does none of these things, and the corresponding Bulgarian entry looks just as bad. Plenty of Wiktionary editors, including many admins, have commented at various points on problems with entries you've created; please listen to them. Benwing2 (talk) 06:20, 13 February 2019 (UTC)

Let me add that part of being a good editor is working *WITH* the other members of a given language community. If other members have established certain standards for entries, you should follow those; you should not simply create your own rules and ignore what everyone else has done. If everyone did that, the result would be chaos. "English entries do it such-and-such a way" is NOT a good justification for ignoring a community's rules; English is not Russian is not Bulgarian, etc., and the goals of foreign language entries in the English Wiktionary are entirely different from the goals of English entries in the English Wiktionary. Benwing2 (talk) 06:24, 13 February 2019 (UTC)
The English Wiktionary has a long tradition of incomplete entries. For instance, English entries largely lack IPA markup of pronunciation, and so do Czech entries. I think providing a minimum entry with a definition and a further reading is excellent service for the dictionary user, and can be provided in volumes. I have been creating incomplete Czech entries for over a decade, and was thanked for it at the beginning. At the beginning, I could not even provide good further reading since there was none online.
Should I be prevented from contributing value in the form of accurate entries to the non-paying customer of the English Wiktionary with the use of force, I will use Beer parlour to address this as a policy issue. I submit to policy and to demonstrable consensus, as usual. I think it would be more proper of those who want to impose a non-existing policy to demonstrate consensus by starting a Beer parlour discussion themselves, but I can do it myself if required. I am certainly not "ignoring a community's rules", but rather, I am behaving in a way consistent with applicable policies that I know of, and with consensus in so far as I can determine its presence.
Let me emphasize that I have argued the matter on substance, not only on rules and policies. As for substance, a minimum entry with further reading is hugely better for the customer than no entry; this is argued in greater detail in #Creating entries in languages you don't know above. As for policy, I know of no policy prohibiting the creation of entries without pronunciation and inflection. As for non-policy-based consensus, I know of no Beer parlour discussion from which it follows there is consensus against minimum entries. --Dan Polansky (talk) 11:29, 16 February 2019 (UTC)
More comments on the things said. "You've recently created a whole lot of extremely low-quality entries": Not really. I created accurate entries that lack completeness, which is lack of vertical quantity, not quality. As for "than create bogus, low-quality entries": I did not create "bogus" entries; that is really inappropriate. Again, the quality is fine, but there is vertical quantity lacking. These disparaging comments are inaccurate and inappropriate. --Dan Polansky (talk) 11:35, 16 February 2019 (UTC)
I'm not too keen on you creating such stubs, but I agree that calling them "bogus" is inaccurate. Per utramque cavernam 13:40, 16 February 2019 (UTC)
As for whether minimum entries are better than no entries, Wiktionary:Votes/bt-2007-12/User:Tbot creating FL entries vote passed 12:2 and has not been rescinded. Unlike Tbot, I am not a bot and provide guarantee on accuracy. Unlike Tbot, I provide further reading, which is excellent added value. --Dan Polansky (talk) 11:39, 16 February 2019 (UTC)
That vote is more than 11 years old. I'm fairly sure such a bot wouldn't be allowed to operate today. Per utramque cavernam 13:40, 16 February 2019 (UTC)
The Tbot vote has not been rescinded so it still has all the legal force it needs. I know there are some people who hate Tbot, but I do not know whether they are a supermajority or at least a superminority. Putting the legalistic argument aside, that vote is an indication, imperfect as it may be, of views of a broader group of people than the limited group appearing on my talk page. The complaints I heard about Tbot were about lack of accuracy; what is discussed in this thread is lack of completeness. --Dan Polansky (talk) 13:49, 16 February 2019 (UTC)
I created Wiktionary:Beer parlour/2019/February#Stub entries and minimum required content. --Dan Polansky (talk) 19:11, 16 February 2019 (UTC)
Dan, I apologize for the tone of my comments. When I wrote them I was frustrated with your actions and it led to me saying things that I shouldn't have said. I will respond further on the beer parlour page. Benwing2 (talk) 19:21, 17 February 2019 (UTC)
I appreciate the apology; I know too many people who never apologize for anything and never admit any mistake. Let us continue the discussion in Beer parlour. Here only a brief motto to help memory: Make yolk and hub and skip all fluff. --Dan Polansky (talk) 07:38, 23 February 2019 (UTC)

Definitionless entries in the Russian Wiktionary

Some years ago, there was a Beer parlour discussion about volume creation of definitionless entries in the English Wiktionary. There was no consensus in either direction, from what I remember. I must have mentioned the Russian Wiktionary as an example of a Wiktionary which has too many definitionless entries. Let's have a look at how many definitionless entries the Russian Wiktionary currently has.

Definitionless entries seem to land in categories named like ru:Категория:Статьи без перевода/cs (172 entries), ru:Категория:Статьи_без_перевода/en (9049 entries), ru:Категория:Статьи_без_перевода/de (8513 entries), ru:Категория:Статьи_без_перевода/fr (14 157 entries), ru:Категория:Статьи_без_перевода/es (807 entries), ru:Категория:Статьи_без_перевода/ru (918 entries), etc. The template placing entries into these categories seems to be ru:Template:Нужен перевод. The search "insource:/Нужен перевод/" in ru wikt yields 54 044 entries; I do not know whether this number may be subject to error, but given the item counts for several large languages, the number seems plausible.

Entry count and page view statistics can be obtained from Report card for Russian Wiktionary, stats.wikimedia.org:

  • Page views per month are 16,612,080 for 1,002,462 entries as of December 2018. For comparison, en wikt has 185,677,042 page views per month for 5,896,720 entries, and fr wikt has 19,419,107 page views for 3,392,407 entries. Malagasy wikt has 254,755 views per month for 5,466,228 entries, nearly all of which are bot-taken from en wikt and fr wikt. A compact comparison of Wiktionaries from another source can be found at hypestat.com, under "Where do visitors go on this site?", where the Russian Wiktionary appears as second behind English.
  • Those 1,002,462 entries of the Russian Wiktionary could theoretically include non-lemmas. However, I checked inflected forms of кошка, and ru:кошки is a hard redirect to кошка, and ru:кошек has no entry. A quick look at ru:Категория:Русский язык does not show anything like non-lemma entries. These observations suggest these are in fact lemma entries.
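One way to read the figures above is as monthly page views per entry. The following Python sketch only restates the numbers quoted in the list; the helper name `views_per_entry` is hypothetical, and dividing views by entries is a rough normalization chosen here, not something the report card itself provides.

```python
# Figures quoted above (December 2018): (monthly page views, number of entries).
stats = {
    "en": (185_677_042, 5_896_720),
    "fr": (19_419_107, 3_392_407),
    "ru": (16_612_080, 1_002_462),
    "mg": (254_755, 5_466_228),
}


def views_per_entry(views: int, entries: int) -> float:
    """Average monthly page views per entry."""
    return views / entries


for code, (views, entries) in stats.items():
    print(f"{code} wikt: {views_per_entry(views, entries):.2f} page views per entry per month")
```

On these numbers, the Russian Wiktionary gets more views per entry than the French one, and the mostly bot-created Malagasy Wiktionary gets well under one view per entry.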

Another statistic can be had from the new v2 tool, Page views per country, stats.wikimedia.org. There we can see the share of Russian Wiktionary page views coming from each country. If we change the view from Map to Table in the upper right corner and do the calculation, we can see that 73% of page views come from Russia and Ukraine.

Highlight: Definitionless entries in the Russian Wiktionary seem to make up 5.4% of all lemmas, as per above.
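The 5.4% figure follows from dividing the search hit count by the entry count, both quoted above. A minimal Python check, assuming the search hits for the template are a fair proxy for definitionless entries:

```python
definitionless = 54_044  # hits for insource:/Нужен перевод/ in ru wikt
entries = 1_002_462      # ru wikt entries per the report card, December 2018

share = definitionless / entries
print(f"{share:.1%}")  # prints 5.4%
```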

Speculation: The percentage of about 5% of definitionless entries presents no significant detriment to the usefulness and popularity of the Russian Wiktionary.

--Dan Polansky (talk) 12:38, 3 March 2019 (UTC)

"en wikt has 185,677,042 page views per month for 5,896,720 entries, and fr wikt has 19,419,107 page views for 86,780,431 entries": There seems to be a mistake in the number of entries of fr.wikt. It can't have 90 million of them. Per utramque cavernam 13:06, 3 March 2019 (UTC)
Indeed, thanks a lot. Replaced with 3,392,407. --Dan Polansky (talk) 13:30, 3 March 2019 (UTC)
To remove doubt about whether these approximately 1 million entries could also include redirects, I now checked ru:Special:Statistics, which shows there are 1 009 655 content pages and 1 452 575 all pages, including redirects. A further check: The search for "Значение", which appears in headings in the sense of "meaning", shows 1 006 027 entries. --Dan Polansky (talk) 15:56, 3 March 2019 (UTC)

Definitionless entries in the Serbo-Croatian Wiktionary

Above, I deal with definitionless entries in the Russian Wiktionary, to investigate the value and impact of definitionless entries. The Serbo-Croatian Wiktionary is relevant for that kind of investigation since it has a great many definitionless entries: sh:Kategorija:Riječi bez definicije has 84 720 entries. An example entry is sh:adjekcija; it has pronunciation, hyphenation, gender, inflection table and a good further reading link, but no definition.

Per the new v2 tool, Page views per country, stats.wikimedia.org, the Serbo-Croatian Wiktionary had 64 000 page views in February from Serbia, Croatia and Bosnia and Herzegovina. By contrast, the same statistic for the English Wiktionary yields 626 000 page views in February from the same three countries. Admittedly, the English Wiktionary has much more content in languages other than Serbo-Croatian, so the two numbers are not directly comparable.

To get a more comparable number, let's consider the Czech Wiktionary. Per the new v2 tool, Page views per country, the Czech Wiktionary had 960 000 page views in February from the Czech Republic. The Czech Wiktionary has almost no definitionless entries, as far as I know. Czech entries there have very good coverage of pronunciation and inflection since some editors are really passionate about it, but in that regard, the Czech Wiktionary does not seem to differ from the Serbo-Croatian Wiktionary. The Czech Wiktionary has rather small coverage of non-Czech languages, especially when compared to the English Wiktionary. To account for population, let us note that there are around 10 600 000 people in the Czech Republic, while Serbia has around 7 000 000, Croatia has around 4 000 000, and Bosnia and Herzegovina has around 3 500 000, all per Wikipedia. To account for the total number of entries: The Czech Wiktionary has 109 407 content pages per cs:Special:Statistics, many of which are inflected form entries; the search insource:/Kategorie:Tvary/ yields 25 947 entries. The Serbo-Croatian Wiktionary has 911 552 content pages per sh:Special:Statistics, many of which are inflected form entries since sh:Kategorija:Srpskohrvatski flektirani oblici has 746 984 entries; lemma entries seem to be in sh:Kategorija:Srpskohrvatski indeks, which has 137 030 entries.
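The comparison above can be put on a per-inhabitant and per-lemma footing. The following Python sketch uses only the figures quoted in this section; treating "content pages minus Kategorie:Tvary hits" as the Czech lemma count is an assumption made for this example, and all variable names are hypothetical.

```python
# Page views in February from the home countries, as quoted above.
cs_views, cs_population = 960_000, 10_600_000
sh_views = 64_000
sh_population = 7_000_000 + 4_000_000 + 3_500_000  # Serbia + Croatia + Bosnia and Herzegovina

cs_per_capita = cs_views / cs_population
sh_per_capita = sh_views / sh_population

# Rough lemma counts (assumption: cs lemmas = content pages minus inflected-form hits).
cs_lemmas = 109_407 - 25_947
sh_lemmas = 137_030  # sh:Kategorija:Srpskohrvatski indeks

print(f"cs wikt: {cs_per_capita:.4f} views per inhabitant, {cs_views / cs_lemmas:.2f} per lemma")
print(f"sh wikt: {sh_per_capita:.4f} views per inhabitant, {sh_views / sh_lemmas:.2f} per lemma")
print(f"per-inhabitant ratio cs/sh: {cs_per_capita / sh_per_capita:.1f}")
```

On these rough numbers, the Czech Wiktionary gets about twenty times as many home-country page views per inhabitant as the Serbo-Croatian Wiktionary, despite having fewer lemma entries.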

I would argue that the above numbers suggest that people from Serbia, Croatia and Bosnia and Herzegovina are for the most part not interested in definitionless entries for the language they speak; pronunciation and inflection do not make up for the missing semantics. The kind reader can do their own analysis from the provided data sources or other sources.

--Dan Polansky (talk) 18:00, 3 March 2019 (UTC)

The above picture lacks the Serbian Wiktionary and the Croatian Wiktionary. The Serbian Wiktionary had 321 000 page views from Serbia in February 2019 per new v2 tool, Page views per country. The Croatian Wiktionary had 85 000 page views from Croatia in February 2019 per [1]; it had 30 000 page views from Bosnia and Herzegovina. This compares to 23 000 page views of the Serbo-Croatian Wiktionary from Croatia, and 15 000 from Bosnia and Herzegovina. Croatian lemmas in the Croatian Wiktionary appear to be in hr:Kategorija:hrvatski (indeks), which has 8021 items; hr:Kategorija:engleski (indeks) has 2171 items and hr:Kategorija:srpski (indeks) has 518 entries in Latin script. sr:Категорија:Српска именица (Serbian nouns) has 81 806 items.

As per above, if we focus on Croatia and Bosnia and Herzegovina (Serbia uses a different script), it seems that 8 000 Croatian lemmas in the Croatian Wiktionary (^) produced more page views than all the 137 030 Latin-script Serbo-Croatian lemmas in the Serbo-Croatian Wiktionary (^), of which 84 720 are definitionless (^). This could be explained by users not being interested in definitionless entries, but also by users refusing to consult a resource that they consider to mistreat what they consider to be separate languages; I do not have data to select between the two hypotheses, and other hypotheses could be possible. It is rather unclear what is going on since if we disregard the definitionless entries, there remain 52 310 Serbo-Croatian lemmas that one might think have some definitions; it could be that they are deficient or uninteresting in some way. --Dan Polansky (talk) 11:25, 8 March 2019 (UTC)
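The arithmetic behind this comparison can be spelled out as follows. This is a Python sketch using only the figures quoted above; the variable names are hypothetical, and attributing all Croatian Wiktionary page views to its Croatian lemmas is a simplification, since that wiki also has entries for other languages.

```python
# February 2019 page views from Croatia plus Bosnia and Herzegovina, as quoted above.
hr_views = 85_000 + 30_000  # Croatian Wiktionary
sh_views = 23_000 + 15_000  # Serbo-Croatian Wiktionary

hr_lemmas = 8_021           # hr:Kategorija:hrvatski (indeks)
sh_lemmas = 137_030         # sh:Kategorija:Srpskohrvatski indeks
sh_definitionless = 84_720  # sh:Kategorija:Riječi bez definicije

# Lemmas that presumably have some definition.
sh_with_definitions = sh_lemmas - sh_definitionless

print("hr wikt views:", hr_views, "| sh wikt views:", sh_views)
print("sh lemmas with definitions:", sh_with_definitions)
print(f"views ratio hr/sh: {hr_views / sh_views:.1f}")
```

Even counting only the 52 310 lemmas that presumably have definitions, the Serbo-Croatian Wiktionary has several times as many lemmas as the Croatian Wiktionary, yet about a third of its page views from these two countries.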
