User talk:Dan Polansky

Pronunciation of 'w' in Czech

Hi Dan, how would you pronounce 'w' when written in Czech? (e.g. Wikipedie, Winchester) – AWESOME meeos * (「欺负」我) 12:51, 2 January 2017 (UTC)

@Awesomemeeos: It is pronounced like v. Thus, Wikipedie would be /vɪkɪpɛdɪjɛ/. --Dan Polansky (talk) 09:09, 7 January 2017 (UTC)
You can hear the Czech pronunciation of v at váza. --Dan Polansky (talk) 09:09, 7 January 2017 (UTC)

Czech lemmatizer

A Czech lemmatizer is at http://lindat.mff.cuni.cz/services/morphodita/

My favorite setting for the lemmatizer is as follows:

  • Task: Lemmatize
  • Tag set: Raw lemmas
  • Output: Plain

Example input: Komu není shůry dáno, v apatyce nekoupí. Komu se nelení, tomu se zelení.

Example output: kdo být shůry dát, v apatyka koupit. kdo se lenit, ten se zelený.

One use of this lemmatizer is that you pick a piece of Czech text, run it through the lemmatizer, wikify words and fill redlinks by creating Wiktionary entries. This redlink-filling activity was suggested by SemperBlotto some time ago, without the lemmatization part. Since I am interested in creating lemmas rather than inflected forms, I need a lemmatizer.

--Dan Polansky (talk) 18:39, 13 January 2017 (UTC)

The following Python script grabs clipboard content, wikifies words and puts the result back to clip:

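import re
from tkinter import Tk  # Python 3 name; in Python 2 the module is Tkinter
root = Tk()  # a single Tk instance is enough for all clipboard calls
root.withdraw()  # hide the empty window; only the clipboard is needed
# Wrap each run of characters other than space , . : ; in [[...]]
newContent = re.sub(r"([^ ,\.:;]+)", r"[[\1]]", root.clipboard_get())
root.clipboard_clear()
root.clipboard_append(newContent)
root.update()  # process pending events so the new clipboard content is registered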

The regex may need fine-tuning. --Dan Polansky (talk) 18:54, 13 January 2017 (UTC)
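One possible refinement of the regex, offered only as a sketch: the pattern above treats newlines, question marks, exclamation marks and brackets as parts of words, since it excludes only the space, comma, period, colon and semicolon. The variant below links runs of letters instead and leaves everything else untouched. The function name wikify is made up for illustration, and the sketch assumes Python 3 and that hyphenated words may be split into separate links.

```python
import re

# Letters only: \w minus digits and underscore. In Python 3, \w is
# Unicode-aware for str patterns, so Czech letters such as ů and ř match.
WORD = re.compile(r"[^\W\d_]+")

def wikify(text):
    """Wrap every run of letters in [[...]], leaving everything else as is."""
    return WORD.sub(r"[[\g<0>]]", text)

if __name__ == "__main__":
    print(wikify("kdo být shůry dát, v apatyka koupit."))
```

Applied to the first sentence of the example lemmatizer output above, this gives [[kdo]] [[být]] [[shůry]] [[dát]], [[v]] [[apatyka]] [[koupit]]. The function can be dropped into the clipboard script in place of the re.sub call.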

CFI vote

I made the CFI vote start at 19:00 today (my local time is 18:49). I put back the end date as well. I removed the 'premature' tag. I hope that was the right thing to do.

John Cross (talk) 16:50, 17 January 2017 (UTC)

What you did in Wiktionary:Votes/pl-2017-01/Trimming CFI for Wiktionary is not an encyclopedia 2 was fine, thank you. --Dan Polansky (talk) 18:51, 17 January 2017 (UTC)

Czech words for females

I am unsure how to mark up definition lines for Czech words for females such as učitelka, lékařka, ředitelka and prezidentka. The problem obviously applies to other languages as well, e.g. German Professorin.

One option that I have often used and that is quite possibly prevalent in the Czech entries is like this:

  1. female teacher

A disadvantage of that is that the word "female" does not usually appear in translation; you do not say "she is a female teacher" but rather "she is a teacher".

Another option that I must have used at least once is this:

  1. teacher (female)

What I do not like about this is that the disambiguator "female" appears only in the gloss, but maybe it's okay. Furthermore, I like the gloss to be an abbreviated definition, which "female" isn't; it would be "female teacher", which would lead to a repetition of "teacher" in:

  1. teacher (female teacher)

Another option that I must have seen somewhere is the use of a context label:

  1. (female) teacher

That actually looks okay since, in an English sentence like "she is a teacher", the subject of the sentence (she) is in the context of the predicate (be teacher).

Based on the above, I may stay with "female teacher", or I may switch to "(female) teacher".

--Dan Polansky (talk) 11:49, 22 January 2017 (UTC)

  • It would be good if {{cs-noun}} allowed you to put "m=učitel" or "f=učitelka" in the headword. But, anyway, I think "teacher (female)" or "teacher (male)" is the way to go. SemperBlotto (talk) 11:54, 22 January 2017 (UTC)
I don't see "female teacher" (or indeed "male teacher") being a problematic definition: they don't have to be an exact word-for-word phrase that you can insert into a translation without thinking. There are plenty of English entries of the same kind, like usherette. Equinox 20:54, 22 January 2017 (UTC)