Vietnamese entries with classifiers

Please see Wiktionary:Requests for deletion#Non-idiomatic Vietnamese words and Wiktionary:Requests for moves, mergers and splits#Non-idiomatic Vietnamese words. Do you have any problem with the precedent that these deletions would set? – Minh Nguyễn (talk, contribs) 10:36, 3 January 2014 (UTC)

Thanks for the heads up. I'll try to draw up a response at the first link: Wiktionary:Requests for deletion#Non-idiomatic Vietnamese words TeleComNasSprVen (talk) 21:29, 3 January 2014 (UTC)

Re: Yellow background

Do you use the MonoBook skin? The Vietnamese wikis customize that skin to show File:Headbgyellow.jpg in the background of non-article namespaces. (The English Wikipedia used to use a periwinkle content background in those cases.) Originally, the Standard (Classic) skin used to turn golden for non-article namespaces, presumably so it would be harder to pass off (say) a user page as an article. In Vector, the Vietnamese wikis still change all the light blue elements to light green. – Minh Nguyễn (talk, contribs) 23:23, 4 January 2014 (UTC)

Yes, I use Monobook and thanks for the clarification. I was a bit confused at first why some pages were yellow and some retained the 'whiteness' that the Monobook skin has on other wikis. That makes sense, although it could be made more apparent. (Or maybe it is explained somewhere and I just haven't looked closely enough into viwiki's documentation). TeleComNasSprVen (talk) 01:15, 5 January 2014 (UTC)
It used to be a software feature, so I guess we never thought to document it anywhere. I mentioned it at w:vi:Wikipedia:Bài bách khoa là gì?, which is quite obscure, but we always assumed that people would notice the connection between the color change and the namespace in the title. I'm currently toying with other background changes, like diagonal stripes on sandbox pages and "PREVIEW" wordmarks when previewing changes. – Minh Nguyễn (talk, contribs) 13:57, 5 January 2014 (UTC)

Copyright

Re: "I always thought copyrighting a dictionary is pretty silly given that definitions of words contain no original thought."

Be advised that definitions are copyrighted, whether you find it silly or not. "Original thought" is not a prerequisite for copyright; it is the "original expression" in a very weak sense of "original" that is a prerequisite. Actually, original thought itself is not protected, since it is the expression that is protected, not the thought. --Dan Polansky (talk) 07:48, 11 January 2014 (UTC)

Yes, my original statement was a rationale for the {{MultiLicensePD}} above. There are more considerations on this at the Wikipedia page on the 'threshold of originality', which I'll quote here:

The threshold of originality is a concept in Anglo-American-based copyright law systems that is used to assess whether a particular work can be copyrighted. It is used to distinguish works that are sufficiently original to warrant copyright protection from those that are not. In this context, "originality" refers to "coming from someone as the originator/author" (insofar as it somehow reflects the author's personality), rather than "never having occurred or existed before" (which would amount to the protection of something new, as in patent protection).

As I understand it, because most definitions are common knowledge (read: facts) rather than expressions of thought, it is easy to take a number of non-original pages from a dictionary without having to comply with copyright. It is only when it reaches an arbitrary threshold, say of about a hundred or so pages, that it is necessary to comply with copyright or claim fair use. And even then, I do not believe we have any power, legal or otherwise, to stop people from simply not giving attribution to us, out of sheer laziness or otherwise: they could just claim they got it from their personal dictionary.
I've just been thinking harder lately about why we bother with the CC-BY-SA attribution at all. Perhaps the other Wiktionary users will think differently. Perhaps you may think differently about how copyright works on this site. I'm just exploring its ramifications here. TeleComNasSprVen (talk) 08:04, 11 January 2014 (UTC)
Definitions are not common knowledge; meanings are. Definitions are specific phrasings, and these are not common knowledge. If you randomly ask people to give you definitions for various words, they will either give you no definition at all or they will give you one that differs from what you find in dictionaries, unless they actually check a dictionary. If you check OneLook dictionaries, you will find that different dictionary makers try to come up with slight variations in the definition phrasing.
Whatever you think, I ask you to refrain from copying English definitions from other copyrighted dictionaries (as contrasted to public domain ones, such as Webster 1913); I do not claim you have already added anything.
I do not know what you mean by "non-original pages from a dictionary". --Dan Polansky (talk) 08:12, 11 January 2014 (UTC)
"Non-original pages from a dictionary" refers to the copyright context of "originality" in the quote above. It does not make sense to say that people violate copyright when they quote dictionaries in everyday speech and forget to give proper attribution, in response to questions posed by others like "What does X mean?" Well, it is one thing to question copyright rules and another to comply with them. I can choose to disagree with copyrights, and still comply with them, just as I do the rest of the policies on this site if I have found a policy I disagree with. But I guess the ontological questions of thoughts, expressions, meanings and copyrights could be outside the scope of this dictionary.
Re: "Definitions are not common knowledge", then they are not expressions of fact? Expressions of meaning? Opinions are original thought, and copyrightable; the fact that the sky is blue is not, and a picture snapshot of the sky in its entire blueness is not copyrightable. To the extent definitions are also expressions of an author's opinion on what the meaning ought to be, then they are copyrightable. But most definitions are true to the fact they represent, which is common knowledge. Copyright on words is a maze of legal territory to get lost in.
Re: "refrain from copying English definitions" All the entries I have so far published are my own, and they are also everyone's knowledge, as I've indicated in the template. I've not gone even as far as to quote the great Webster 1913. Non-originality is saying fuego means fire, and that "fire burns". It is the presentation of that fact, how we think it should be, using the various templates and headers and metadata that we use, that is copyrightable. TeleComNasSprVen (talk) 08:30, 11 January 2014 (UTC)

CFI and idiomaticity

"CFI:idiomaticity" does not refer to "conveying meaning"; we have WT:CFI#Conveying meaning for that as a separate section in CFI, and "conveying meaning" is part of "attested", not "CFI:idiomatic". "CFI:idiomatic" is not the same as "idiomatic"; CFI:idiomatic is the same as "not sum of parts"; CFI:idiomatic is defined at WT:CFI#Idiomaticity. --Dan Polansky (talk) 20:57, 17 January 2014 (UTC)

Then explain the quotes I brought up and the arguments presented therein. TeleComNasSprVen (talk) 21:03, 17 January 2014 (UTC)
Wikitiki got it wrong in the quotes: SOP is the same as not CFI:idiomatic. There is nothing to explain other than human error.
Let us see what WT:CFI says:
  • "... of including a term if it is attested and idiomatic."
  • “Attested” means verified through [...] use in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year (different requirements apply for certain languages).
  • An expression is “idiomatic” if its full meaning cannot be easily derived from the meaning of its separate components.
--Dan Polansky (talk) 21:06, 17 January 2014 (UTC)
All you have done is refer back to CFI like a legal document; you've not provided any constructive argument for a possible vote that I called for to set up which would solve most of this. You're still making it about keeping or deleting the entry instead of responding to my question appropriately. Anyway, this is better discussed at the RFD which will gain better input. TeleComNasSprVen (talk) 21:09, 17 January 2014 (UTC)
Okay, since that is what you proclaim to be 'human error', I suggest you leave that comment about that in the thread on WT:RFD#television show rather than here. TeleComNasSprVen (talk) 21:11, 17 January 2014 (UTC)
Well yes, this is how RFD works, referring to what CFI sets up as criteria for inclusion. Most of the time, we actually do abide by CFI, and do check whether the nominated terms meet WT:CFI#Idiomaticity or not. Some RFD discussions may use non-CFI arguments (including mine), but that should be made explicit. --Dan Polansky (talk) 21:14, 17 January 2014 (UTC)
I do not disagree that CFI is a good document to abide by; however, it's not the end-all and be-all argument-stopping cut-short-all-thought binding document that you seem to make it out to be. Sometimes the exception proves the rule, not just its adherents. RFD is one of many venues by which we can do a reassessment of CFI and how it applies to our treatment of words.
You've still not provided a solid argument in the RFD discussion to indicate whether or not you support what I've said. References to CFI do not adequately address the point I made earlier in the RFD discussion about how best to think about and treat repetitively occurring entries, which are not necessarily sums of parts. TeleComNasSprVen (talk) 21:47, 17 January 2014 (UTC)
In WT:RFD#hahaha, I made my argument entirely within CFI. There, in 'Okay, does anyone think that our entries for "ha" units beyond a certain number, say three or four, convey any idiomaticity to them, ...', you spoke of "idiomaticity", which in RFD normally refers to WT:CFI#Idiomaticity. So I think it relevant to point out that "hahahahahahahahaha" is CFI:idiomatic. --Dan Polansky (talk) 22:05, 17 January 2014 (UTC)
I believe I spoke about 'general' "idiomaticity", referred to by WikiTiki, not CFI idiomaticity. I've said it there and I've said it here, it does not have to pertain to the particular entry in question. It's about other things in general, like "great-great-" etc, and I think you've missed that part. And anyway, this discussion should be moved back to RFD. TeleComNasSprVen (talk) 22:09, 17 January 2014 (UTC)
Since CFI defines the term "idiomaticity" and since failure of idiomaticity is the main reason for terms being nominated at RFD, as opposed to RFV, there is no way an experienced Wiktionarian can think that "idiomaticity" in RFD refers to general idiomaticity as opposed to WT:CFI#Idiomaticity. --Dan Polansky (talk) 22:13, 17 January 2014 (UTC)
I deliberately chose to talk to you here rather than at RFD, since the explanations I am making are very basic, trivial for editors experienced with RFD and CFI. --Dan Polansky (talk) 22:15, 17 January 2014 (UTC)
Yes, but I understand it well enough, having been here for some time now, and I've even cited WT:COALMINE as a reason to keep a few entries. The nomination I made may have been in error, but it's an opportunity for us to explore this issue beyond what CFI just tells us. TeleComNasSprVen (talk) 22:19, 17 January 2014 (UTC)
──────────────────────────────────────────────────────────────────────────────────────────────────── When, in RFD, you use the word "idiomaticity" with the intent not to refer to CFI, that suggests you do not understand RFD and CFI well enough. When you further claim that "idiomaticity" refers to "conveying meaning", that suggests very poor knowledge of CFI. --Dan Polansky (talk) 22:22, 17 January 2014 (UTC)
Of course, I would much rather have other things to do than dealing with RFD nominations created by people who contribute close to nothing of value to Wiktionary. --Dan Polansky (talk) 22:23, 17 January 2014 (UTC)
What is your purpose here then? Using RFD to examine CFI's merits, whether to apply it or ignore it, or god forbid actually change it, seems fine with me. There's nothing wrong with using RFD as a means to take a closer look at it.
Re: "Of course, I would much rather have other things to do than dealing with RFD nominations created by people who contribute close to nothing of value to Wiktionary." That is your opinion and your choice, and I have respected that. If you insinuate I have not contributed constructively to Wiktionary, that is close to a personal attack and I would ask you to voice that opinion on the RFD discussion rather than here on my talkpage, where I am free to remove it at my discretion. I believe that I've contributed my fair share of English entries as well as Vietnamese translations, which you can check if you'd like. TeleComNasSprVen (talk) 22:32, 17 January 2014 (UTC)
Edit counters show you have contributed very little, yet you are busy clarifying CFI by making RFD nominations. I for one find that annoying. When that is combined with poor understanding of CFI and faulty arguing about "idiomatic" with no "I stand corrected", the annoyance is amplified. Your user name tops it all. --Dan Polansky (talk) 22:36, 17 January 2014 (UTC)
Why don't you just choose not to participate in the discussion? I'm certain any unproblematic entries meeting CFI would be voted to be kept by other participants. I still don't see how RFD can break Wiktionary in any way. It's the same deal given to you when you nominated apple at RFV.
Edit counters are not everything. They only show the quantity of my work, not its quality, which I find severely lacking on this dictionary. But perhaps it is the nature of a wiki where everyone can edit that quantity naturally supersedes quality. They also neglect to show deleted edits, most of which I made while RC patrolling. Perhaps I'm wasting my time increasing it bit by bit responding to you now?
Unfortunately I chose this name at a time when most usernames had been taken by SUL, and for fear of breaking any backlinks to my name I'd rather keep this than have a rename or edit under a different name. Most other people do not bother to complain about it. Do you use your real name and prefer others do the same? For me, it is no. And besides, the username or how you address me is irrelevant so long as the message is communicated. TeleComNasSprVen (talk) 22:54, 17 January 2014 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Consider things like great-great-great-grandfather at the RFD discussion nominated by Connel MacKenzie and the recent RFD/RFV nominations by WikiTiki, though they seem idiomatic to you. The arguments presented at those discussions initially did not cite CFI, but pointed to problems with how we handle words using CFI. Would you consider them annoying as well? TeleComNasSprVen (talk) 23:03, 17 January 2014 (UTC)

Well, now that we have a vote up and running to change CFI, which is what I had planned for when I wrote what I did at RFD, the point is now moot. TeleComNasSprVen (talk) 04:43, 26 January 2014 (UTC)

Pointless rfe and rfp

In diff, you added etymology and pronunciation sections. This you should not do, IMHO, since both the etymology and the pronunciation of "cod liver oil" are sum-of-parts and obvious. And even if the etymology were worth having, then you should be adding it rather than placing rfe around, since the etymology must be clear to you. --Dan Polansky (talk) 07:42, 18 January 2014 (UTC)

Why? This is to clearly distinguish it from a sense that is overtly "sum of parts": cod + liver + oil. You've basically supported its deletion at RFD... you know you can do so by actually participating in RFD, right? TeleComNasSprVen (talk) 07:52, 18 January 2014 (UTC)
  • Ditto: diff. --Dan Polansky (talk) 10:40, 18 January 2014 (UTC)
    • You can replace the rfp with an IPA markup template as appropriate. You'll have to explain that more, as I do not see anything wrong with that edit. WT:ELE states "Ideally, every entry should have a pronunciation section, and perhaps a sound sample to accompany it. However, pronunciations vary widely between dialects, and non-linguists often have trouble writing down pronunciations properly." The course of action would be, if one does not know IPA markup, for fear of misusing a wrong character in the markup, they should ask someone who is intimate enough with IPA to be able to do the job. TeleComNasSprVen (talk) 10:46, 18 January 2014 (UTC)
      • The pronunciation of black hole is the pronunciation of black followed by the pronunciation of hole, right? Similarly for white dwarf, right? So placing conspicuous RFP, which cries "something's wrong", on an entry that does not really need pronunciation (thick skin) is borderline vandalism, right? --Dan Polansky (talk) 10:57, 18 January 2014 (UTC)
        • I do not believe it is vandalism. I think that all entries should have a pronunciation section regardless, because the pronunciations of some entries may still behave differently together than they do apart. "white train" for instance is pronounced differently from "white" + "train". In fact most of the Vietnamese compound syllables here that are composed of more than two syllables also have different pronunciation sections even though they are pronounced the same apart, thanks to Wyang's new tool {{vi new}} which provides IPA for compounds only, like khách sạn even though khách and sạn can have two separate pronunciation sections and keys. Anyway, I will ask you again to stop posting on my talkpage about pointless matters like issues of copyright, and to let me get back to building the dictionary. TeleComNasSprVen (talk) 20:26, 18 January 2014 (UTC)
          • It's not vandalism, but it's not harmless, either. We only have pronunciation sections for a fraction of our entries, so prioritizing their addition is important. What you're doing is cluttering the request category with entries that might be deleted, which would be a waste of time for those adding the pronunciations. If it were only your own time wasted- no problem. Wasting the time of the few people who know how to do proper IPA pronunciations- not a good idea. Please note that I'm not endorsing Dan's ways of interacting with others (I personally find them annoying), just asking you to consider this on the merits, and not by who brought it up. Chuck Entz (talk) 21:06, 18 January 2014 (UTC)
            • Oh I see, so you're saying I should not be adding pronunciation sections to entries that might be deleted under RFD/RFV? Would it still be appropriate to add the sections to already existing entries then, which have not been tagged as such? And like I said to Dan, I try my best to consider the central points of his arguments rather than its unfriendly tone, so your input is much appreciated.
            • I'm tempted to consider his posts as borderline harassment by now, Chuck Entz. To be sure, the blocking policy would be quite clear on this: "Causing our editors distress by directly insulting them or by being continually impolite towards them." TeleComNasSprVen (talk) 21:13, 18 January 2014 (UTC)
            • Creating pronunciation sections is ok, but I would advise against request templates such as rfe and rfp, since you would be asking someone else to spend their time on it. It's just a matter of consideration for those who would be wading through the requests. It's a judgement call, so I'm not saying "don't ever do that", but I would never do that, myself. As for Dan's behavior, it wouldn't be the first time he's been blocked for harassing others, though the block I'm aware of had some irregular aspects to it that kind of blunted the moral authority of the block. We're not all that clear or consistent when it comes to whether to block long-time, productive contributors. Chuck Entz (talk) 22:57, 18 January 2014 (UTC)
  • black hole could be stressed in two ways, so pronunciation isn't completely obvious. —CodeCat 23:01, 18 January 2014 (UTC)
    black hole can't be stressed in two ways. There is only one way to stress an adjective followed by a noun. --WikiTiki89 00:07, 19 January 2014 (UTC)
    But just by looking at the term "black hole" as written, you don't see that the first word is an adjective. It could be another noun, which would turn it into a compound, and compounds are stressed differently from noun-adjective phrases. And there's nothing in the black hole entry that unambiguously confirms either possibility. —CodeCat 00:45, 19 January 2014 (UTC)
    I guess that's why they're usually written as one word in other Germanic languages. But anyway, that information can be obtained by looking up the constituent words. --WikiTiki89 00:47, 19 January 2014 (UTC)
    If I look up black, I find both an adjective and a noun, so that doesn't really tell me anything. —CodeCat 00:49, 19 January 2014 (UTC)
    Then we are missing usage information at [[black]] and we are missing a sufficiently thorough etymology at [[black hole]]. --WikiTiki89 00:57, 19 January 2014 (UTC)
    All of that, just to avoid putting a pronunciation in? —CodeCat 01:09, 19 January 2014 (UTC)
    I never said that we shouldn't put the pronunciation in. But the IPA transcription shouldn't be the only way to figure this stuff out. --WikiTiki89 01:22, 19 January 2014 (UTC)

alt forms

I'm not sure whether e.g. off with her head, off with his head are alternative forms of off with one's head. They are more specific forms, limiting by gender. Perhaps redirects would make more sense, if these entries are useful at all? Equinox 04:25, 2 February 2014 (UTC)

Perhaps, but I find there's no hard and fast rule for determining the relationship between inflected forms of "one" in entries. I've made many such entries to reflect some of the practices I found on other parts of Wiktionary - they're not necessarily part of any policy, so you're free to change them as you see fit. I will just follow a different practice for my entries, however, until I find a rule that says that I cannot. I'm not sure what other Wiktionarians have to say on this, but I assume they follow their own particular idiosyncrasies with the creation of their entries. TeleComNasSprVen (talk) 04:28, 2 February 2014 (UTC)

Distinguishing Hán-Việt vs. Nôm and Hán-Việt vs. thuần Việt

A lot of the Vietnamese entries have conflated Hán-Việt readings with thuần Việt meanings. For example, nghệ was given as a noun meaning "profession, trade". But you could never say "cái nghệ" to mean "a profession". Rather, nghệ is a reading of , and definitions like "profession" should go there. The existing Vietnamese Hán character entries also fail to distinguish between Hán-Việt and Nôm readings. To make matters worse, some well-meaning editors went through and added {{vi-readings}}, setting hanviet to Nôm readings or nom to Hán-Việt readings.

I've been correcting incorrect entries by hand based on the Vietnamese Wiktionary. (The Vietnamese Wiktionary only did a partial import of WinVNKey's Nôm dictionary: existing quốc ngữ entries got corresponding Hán-Việt and Nôm tables, and existing CJK entries got Hán-Việt and Nôm readings, but plenty of characters got left out.) Unfortunately, at this rate, it's going to take a long time to clean up the mess. Any ideas on how we could automatically detect and flag problematic entries?

 – Minh Nguyễn 💬 10:00, 16 December 2014 (UTC)
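
On the question of automatic detection, one possible starting point is a script that reads each entry's {{vi-readings}} call and compares the hanviet and nom parameters against a trusted reference list for that character. The following is only a rough sketch under several assumptions: that readings are given as comma-separated plain text, that the template call contains no nested templates, and that a reference list is available at all. The function names and the sample reference data are hypothetical and have not been checked against any dictionary.

```python
import re

# Matches a {{vi-readings|...}} call. Assumes no nested templates inside it.
VI_READINGS = re.compile(r"\{\{vi-readings\|([^{}]*)\}\}")


def parse_params(body):
    """Split 'hanviet=a, b|nom=c' into {'hanviet': ['a', 'b'], 'nom': ['c']}."""
    params = {}
    for part in body.split("|"):
        if "=" not in part:
            continue  # skip positional parameters
        key, value = part.split("=", 1)
        params[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return params


def flag_entry(wikitext, known_hanviet, known_nom):
    """Return (parameter, reading, reason) tuples for readings that the
    reference lists place in the opposite category."""
    flags = []
    for match in VI_READINGS.finditer(wikitext):
        params = parse_params(match.group(1))
        for reading in params.get("hanviet", []):
            if reading not in known_hanviet and reading in known_nom:
                flags.append(("hanviet", reading, "listed only as Nôm in reference"))
        for reading in params.get("nom", []):
            if reading not in known_nom and reading in known_hanviet:
                flags.append(("nom", reading, "listed only as Hán-Việt in reference"))
    return flags


# Illustrative call with made-up reference data for a single character.
suspect = flag_entry(
    "{{vi-readings|hanviet=nghề|nom=nghệ}}",
    known_hanviet={"nghệ"},
    known_nom={"nghề"},
)
for param, reading, reason in suspect:
    print(f"{param}={reading}: {reason}")
```

A script like this would only flag readings that the reference places squarely in the other category. Readings missing from the reference altogether are left alone, so the output would be a worklist for manual review rather than a list of confirmed errors.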