User talk:Mormegil

Czech declension

Hi, I'm going to revert your edit and before I do I'll tell you why. I wikilinked the parameters to facilitate creation of form of entries without wasting the time that would be wasted in creating them if there were no links in the tables. If no other templates were like that one then I probably either forgot about doing it, got lazy or moved onto doing something else. Since I have noticed this again now I will go about doing this to all the Czech templates and also making the links to nonexistent entries black instead of red. 50 Xylophone Players talk 18:46, 28 June 2009 (UTC)


Good to see you here adding etymologies. I just want to explain my recent edit at kapitán: the first etymon (item) in the etymology chain can be formatted as any other etymon. That is, the etymology does not say "the word so-and-so means so-and-so"; the meaning is passed as a parameter to the {{term}} template instead: {{term||kapitán|deputy of a king}}. I omitted the second parameter of "term", which leads to the creation of no hyperlink, as I do not know whether Old Czech is going to be explicitly represented at Wiktionary. I see no language code for Old Czech at Wiktionary:Index_to_templates/languages, hence the uncertainty.

Hope you don't mind. --Dan Polansky 12:35, 23 November 2009 (UTC)

OK. I don’t think “Old Czech” can be considered a proper language, it’s IMHO just a quite vaguely defined historical period of Czech language, so I believe your solution is correct. Thanks. --Mormegil 16:45, 25 November 2009 (UTC)

Etymologies, Rejzek and possible copyvio

Hi, I see that you are entering etymologies from Rejzek 2007, such as in "klokan". I have Rejzek 2001 at home, but have refrained so far from entering etymologies from Rejzek 2001 into Wiktionary, for the fear of copyright violation. If a substantial portion of Rejzek 2001 or Rejzek 2007 is entered into Wiktionary, then I see no way how this avoids constituting copyright violation of Rejzek. Stating Rejzek as the source in the references section of Wiktionary entries is the right and honest thing to do, but it does not prevent copyright violation, from what I understand. But I am not a lawyer.

One thing that could prevent copyright violation is the doctrine that ideas cannot be copyrighted, only their fixed expressions (W:Idea-expression divide). Thus, the etymological information would be free from copyright protection. But I doubt that etymological information is viewed as free from copyright protection. --Dan Polansky 10:45, 4 September 2010 (UTC)

Any information is free from copyright protection – the exact expression is what is protected. When I enter etymological information, I try to read more sources and then create my own expression of the underlying information. The information that klokan was coined by Presl is a fact, it’s not like nobody can repeat this fact without infringing on Rejzek’s copyright; especially given the exact same information is given by Machek (as I have also indicated).
I agree this has to be done carefully and I am trying to be careful. I am also not going through Rejzek’s dictionary alphabetically, copying it into Wiktionary or anything like that. But if we want to have etymological information here, there is no other option than using already existing works (again: not copying them verbatim, but using information from them). Of course, ideally, we want to use as many sources as we can, and synthesize.
--Mormegil 14:58, 4 September 2010 (UTC)
Thank you for the explanation. Re "But if we want to have etymological information here, there is no other option than using already existing work": in the worst case, there would be no etymological information here other than the obvious. We are not acting under compulsion to have etymological information here; we are acting under compulsion to avoid copyright violation. And in fact, there is another option: to do again the etymological research work done by Rejzek, using only sources free from copyright.
The problem with synthesizing from several Czech etymological sources is that there are very few Czech etymological sources: I know of Rejzek and Machek (and Machek seems rather unreliable at that), and there is also Josef Holub, Stanislav Lyer: Stručný etymologický slovník jazyka českého, 1967.
The part in "klokan" that says "the initial k- probably according to German Känguruh" is only sourced from Rejzek, it seems from the page "klokan". And the statement is likely not even a fact but rather Rejzek's assessment. I am rather unclear how copyright deals with assessments and hypotheses, hence my doubt.
The situation is very different for English etymologies, as they can be sourced from several public domain sources. --Dan Polansky 17:23, 4 September 2010 (UTC)
An addendum: on another note, Rejzek does not state positively that "klokan" was coined by Presl; he states "Asi Preslův výtvor podle skokan", that is, "Probably coined by Presl on the model of skokan". --Dan Polansky 17:31, 4 September 2010 (UTC)
Once again, “assessments”, “hypotheses”, “facts”, “theories”, etc. are all not copyright protected. The only thing protected by copyright is the exact expression (sequence of words) which an author used to describe his ideas/theories/…
If your idea that using information from copyrighted sources (to write an independent article based on the information, not to copy the text of the sources) constitutes a copyright violation were true, there could be no article in Wikipedia about anything recent. But, once again, that is not true. As you have linked to w:Idea-expression divide yourself, I have nothing more to add.
--Mormegil 19:31, 4 September 2010 (UTC)
You know, what makes me puzzled about this is that this would mean I am free to take over Rejzek 2001 completely to Wiktionary. I would be translating to English, so the exact phrasing would be slightly different. If the work were being done in Czech, I would just perform some synonymous replacements in Rejzek content, and copy that content to Czech Wiktionary, claiming that Rejzek's statements as contrasted to their particular phrasing are free from copyright. In English, I could even stop worrying about tweaking the phrasing per "merger doctrine", related to the idea-expression divide: if the particular idea can be phrased in only very few ways, then even that particular phrasing is not protected.
Whether I copy systematically or haphazardly, and whether the copying is done by one person or several people, none of that should make much difference to copyright issues.
This would apply not only to etymologies but also to translation pairs. But there are people in Wiktionary who seem to think that copying a significant portion of translation pairs from a copyrighted dictionary can constitute copyright violation, as follows from this discussion: User_talk:Razorflame#Czech_entries, June 2010. Now I do not know who is right, but all this is rather unclear and suspect to me.
On another note, I am not presenting the idea that using information from copyrighted sources constitutes copyright violation; I am presenting the idea that using a lot of information from only one copyrighted source could constitute copyright violation. This idea may be wrong, but it is not as naive as the other one. --Dan Polansky 09:08, 5 September 2010 (UTC)
First of all, using a translation of somebody’s copyrighted work does constitute a copyright infringement; the translation is a derivative work. Similarly, doing just a mechanical transformation of a copyrighted work does not free you from the original copyright. What you need to do is to “extract the information” from the original work, then use this information to form your own expression (ideally, you could explain the underlying information to somebody else, who has never seen the original work at all, and let him write the new expression). If the new expression is very similar to the original, it might be a signal the original is, indeed, not a copyrightable work (merger doctrine). (Or, it might be a signal you were not careful enough when extracting the information.)
The biggest threat I see against copying a substantial part of a dictionary is the sui generis database right. With regard to this right, it would be illegal to systematically copy the dictionary (regardless of whether the content itself is copyrightable, or even whether it is public domain because of its age). But I repeat: I am not systematically copying substantial portions of the dictionary, and I am not telling anybody else to do it. I am only using a very limited amount of information which I acquired as a legal user of the dictionary, which is legal.
As for whether translation pairs per se are copyrightable, I am not sure, but I would definitely not recommend anyone to copy a copyrighted dictionary in bulk (and I do not see a reason to do that, anyway).
--Mormegil 15:21, 5 September 2010 (UTC)
(unindent) Thank you for your detailed and clear response.
The part on extracting the information and then presenting the extracted thing again in your own way seems rather inapplicable to etymologies: I see not many ways how to read a Rejzek etymology, extract the information, and then restate the information accurately without sounding much like Rejzek. This would have to fall under the head of "merger doctrine", then.
I am not accusing you in particular of systematically copying the etymological dictionary. But if a group of people haphazardly transfers (by reading and rephrasing) information from Rejzek into a wiki as happens in Czech Wikipedia, the result is much as if a single person systematically transferred Rejzek into a wiki, using some theoretically clean decoding-and-encoding process. If one person starts haphazardly transferring Rejzek into a wiki, other people are likely to follow, and a collective transfer (by reading and rephrasing) of Rejzek is likely to happen, as there simply are not many other Czech etymological dictionaries than Rejzek, and there is no other modern one. In the end, significant portions of Rejzek end up in the wiki, just mildly rephrased. No single person would have violated copyright, it would seem, but the collective of the people would have created something that looks to me like a mildly rephrased copy of Rejzek, one that would be barred from copyright violation merely by the merger doctrine. Well, I don't really know and I have bothered you enough with this already. Thank you for your patience. --Dan Polansky 20:31, 6 September 2010 (UTC)

Rejzek template

Hello, I have created {{R:Rejzek 2007}}, so you do not need to paste the details of the reference work again and again. In formatting, I have followed the one of some of the most used reference templates: {{R:Webster 1913}} and {{R:Century 1911}}. The publisher could be added to the template, yet the author, the title and the year are sufficient for the identification of the work, so omitting the publisher seems okay to me, making the text that identifies the reference work shorter. What could theoretically be needed for the identification would be ISBN, for the case that the author and the publisher would publish more editions within one year, but I estimate this is very rarely the case. In any case, most reference templates don't mention ISBN. --Dan Polansky 10:09, 28 April 2011 (UTC)

Etymologies and words vs symbols

I saw you entering "<" in etymologies rather than "from". You may do it if you like. I want just to point out that a significant majority of editors prefer the use of "from" instead of "<". There is not yet enough support to make "from" a Wiktionary standard, though. I was entering "<" in the past, but now I am planning to enter "from". --Dan Polansky 11:43, 28 April 2011 (UTC)

I tend to use “<” only inside a multi-step derivation in a single language and use “from” otherwise, e.g. “from English XYZ < XY from Latin QQ < Q”; I consider repeated “from” (“from English XYZ from XY from Latin QQ from Q”) to be a bit tedious. --Mormegil 14:29, 28 April 2011 (UTC)
I am speaking of the second, third, and further "<". I personally prefer "<", but there was a poll that showed that more people prefer "from" in all locations in an etymology chain (WT:BP#Poll:_Etymology_and_the_use_of_less-than_symbol). There was a vote to make "from" a standard, but it failed with 65.6% support (Wiktionary:Votes/pl-2011-02/Deprecating_less-than_symbol_in_etymologies). I will use the formatting proposed by the vote nonetheless. You do as you see fit; there is no standard. --Dan Polansky 16:28, 28 April 2011 (UTC)

Machek template

FYI, I have created {{R:Machek 1968}}. --Dan Polansky 16:15, 30 June 2011 (UTC)

Czech palindromes

Hi, just a little question: are you sure that a word like tápat, for example, is truly a palindrome? In some other languages where the letters á and a are considered different and have different places in the alphabet, this is not considered a palindrome. However, I don't know Czech, so maybe it's possible. Thanks in advance for your answer. Unsui (talk) 15:43, 18 August 2013 (UTC)

You are right that somebody might argue those are different letters, however, many Czech sources (most of them, I believe) do not consider accent differences (especially acute accents, which are of lesser relevance in Czech than e.g. č/c) to be important when talking about palindromes. --Mormegil (talk) 12:58, 19 August 2013 (UTC)
Ok, thanks for your answer. Unsui (talk) 19:19, 19 August 2013 (UTC)
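As an aside, the convention described above (acute accents ignored, other diacritics such as the háček kept) can be sketched in Python. This is only an illustration and not how any Wiktionary category is actually populated; the function names `strip_acute` and `is_palindrome`, and the choice to strip only the combining acute accent (U+0301), are assumptions made for the example.

```python
import unicodedata


def strip_acute(text: str) -> str:
    """Remove only the combining acute accent (U+0301).

    Other marks (e.g. the háček in "č" or the ring in "ů") are kept.
    Stripping only the acute is an assumption made for this sketch.
    """
    decomposed = unicodedata.normalize("NFD", text)
    without_acute = decomposed.replace("\u0301", "")
    return unicodedata.normalize("NFC", without_acute)


def is_palindrome(word: str, ignore_acute: bool = True) -> bool:
    """Check whether a word reads the same in both directions."""
    # Compare composed characters so that reversing never separates
    # a base letter from its combining mark.
    normalized = unicodedata.normalize("NFC", word.casefold())
    if ignore_acute:
        normalized = strip_acute(normalized)
    return normalized == normalized[::-1]


print(is_palindrome("tápat"))                      # acute ignored
print(is_palindrome("tápat", ignore_acute=False))  # á and a treated as distinct
print(is_palindrome("čac"))                        # háček is not ignored
```

Under the lenient reading, "tápat" counts as a palindrome; under the strict reading, where á and a are distinct letters, it does not.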

Rejzek 2001 vs. Rejzek 2007

We have Template:R:Rejzek 2007, which I think I have created based on what I saw you entering to mainspace, but I am not sure. I have an edition from 2001, ISBN 80-85927-85-3. I cannot find a mention of the 2007 edition on the web. Do you happen to have an edition from 2007 and its ISBN?

In W:sk:Český_etymologický_slovník, they tell me that there is a 2012 edition, which is unchanged from 2001. If there is also a 2007 edition and if it is unchanged from 2001, we should move the template to Template:R:Rejzek 2001 I think, and mention further unchanged edition years in the documentation of the template. --Dan Polansky (talk) 15:32, 30 August 2015 (UTC)

As per[1], the 2001 edition is the first one, and the 2015 edition is the 2nd updated one. --Dan Polansky (talk) 17:08, 30 August 2015 (UTC)
This was the original reference I used. This is a page about the edition on the publisher’s website. --Mormegil (talk) 08:32, 31 August 2015 (UTC)
Thanks. So you used Český etymologický slovník - elektronická verze pro PC, with EAN 8594037280648, as per your link. It seems safe to rename the template to Template:R:Rejzek 2001. --Dan Polansky (talk) 19:14, 31 August 2015 (UTC) (Stricken out. --Dan Polansky (talk))
I noticed the link states the year 2008. Do they perhaps have more electronic versions? Or is it like a reprint?
They have the following "Vydání":
  • Vydání*: Pro systémy Win XP, Win Vista, Win 7, 2008 | EAN: 8594037280648[2]
  • Vydání*: 2. , dotisk 2012 | Vazba: pevná | Stran: 752 | Formát: A5 | ISBN: 978-80-7335-296-7 | EAN: 9788073352967[3]
  • Vydání*: 3. , | Vazba: pevná | ISBN: 978-80-7335-393-3 | EAN: 9788073353933[4]
I will think twice before I rename it. --Dan Polansky (talk) 19:50, 31 August 2015 (UTC)
