User talk:-sche/Archive/2014
Periods at the end of sentences and consensus
There is no consensus to have periods at the end of definitions and thus you should not be using automated means of adding the periods. --Dan Polansky (talk) 21:31, 6 January 2014 (UTC)
The edit summary "misc" suggests a nefarious action, and indeed, there it is. A bit like when you started demoting various spellings without previous discussion and consensus. I think you should desysop yourself. --Dan Polansky (talk) 21:32, 6 January 2014 (UTC)
- Despite its name, AWB is not automated. I am using it to manually clean up various -meter and -metre entries, mostly by adding etymologies to the -metre forms and clarifying on the -meter forms that the -metre forms are nonstandard, but also to perform other minor cleanup operations as long as I'm there. AWB gathers all the -meter and -metre entries in one place, which is the only reason I am using it rather than Firefox or another browser. - -sche (discuss) 21:37, 6 January 2014 (UTC)
- The fact that I am performing several manual cleanup operations at once is the reason I supplied a general edit summary. (Had I been using Firefox, it's possible I would have followed the common practice of simply not supplying an edit summary at all for self-evident cleanup, but AWB requires one.) - -sche (discuss) 21:42, 6 January 2014 (UTC)
- WT:AGF. —CodeCat 21:38, 6 January 2014 (UTC)
- @-sche: You are mass adding periods to the end of definitions without there being a consensus for that; what specific means you have chosen for mass adding is of less importance. Was it your intention to be adding the periods, or was it not? What prevented you from using a meaningful edit summary, anyway? --Dan Polansky (talk) 21:43, 6 January 2014 (UTC)
- And when that action is questioned by me, you continue. No, I am not assuming good faith; this is blatant bad faith. --Dan Polansky (talk) 21:44, 6 January 2014 (UTC)
- Responding to a misplaced response above: What makes you think that adding a period is "self-evident cleanup"? --Dan Polansky (talk) 21:51, 6 January 2014 (UTC)
- Because so many of us do it, probably. DCDuring TALK 21:53, 6 January 2014 (UTC)
- Do you know, DCDuring, that countless editors prefer definitions without periods, and that editors removing periods had to be stopped, and pointed out that there is no consensus for removing periods? --Dan Polansky (talk) 21:54, 6 January 2014 (UTC)
Language code templates causing script errors
Can all of them be deleted? —CodeCat 16:21, 12 January 2014 (UTC)
- Yes; thanks for pointing that out. - -sche (discuss) 19:52, 12 January 2014 (UTC)
- There are a few more now. —CodeCat 23:36, 14 January 2014 (UTC)
Errors
By reinstating the original situation in Module:links, you also made it trigger errors again, but they're still hidden so you see a void. The actual hiding of errors was done in MediaWiki:Common.css, so if you want it to show "script error" again you need to undo that. —CodeCat 21:02, 15 January 2014 (UTC)
Conundrum (Please help!)
Sche, this issue has been bothering me ever since I first became a linguist and became cognisant of its [the issue's] presence. Might you be able to help explain it to me?
My pronunciations (and the pronunciations held by everyone else I know in my area) of /u:/-type words seem to be quite different from the pronunciations listed on Wiktionary (and anywhere else I've checked, for that matter). Now, I'm not attempting to have the IPA transcriptions be changed or anything of that sort, I'm merely hoping that you might be able to explain these inconsistencies to me.
My pronunciations of the words "dew", "new", "newt", "knew", "sue", "lieu", "flute" and "lewd" (seem to) have the diphthong /ɪuː/. My pronunciation of the words "few", "pew", "mew", "feud" and "cue" (seem to) have /jɪu:/. My pronunciations of the words "rebuke" and "puke" have /ju:/. My pronunciation of the words "who", "mood", "food", "boot", "poot", "coup", "aloof", "rule", "tool" and "loot" have /uː/.
Yet my pronunciation of the word "you" seems to be /jɪu:/ when stressed, /ju:/ when slightly unstressed and mid-sentence, /jɪ/ when mostly unstressed and /jə/ when completely unstressed. Similarly, "to", "too" and "two" are /ɪu:/ when stressed, /u:/ when immediately following another word (that may be stressed or partially stressed) [i.e. twenty-two is /twɛnˈtʰi.tʰ(ɪ)u:/ when stressed, /twɛnˈtiˈtʰu:/ when partially stressed, and /twɛnɾi'tu:/ when completely unstressed], /ɪ/ when mostly unstressed, and /ə/ when completely unstressed. My pronunciation of the word "mute" fluctuates between /juː/ and /jɪu:/, tending towards /juː/. My pronunciation of the words "rude" and "prelude" fluctuates between /u:/ and /ɪu:/, tending towards /u:/. My pronunciation of the words "dude" and "nude" fluctuates between /ɪu:/ and /u:/, tending towards /ɪu:/. My pronunciation of "shoot" and "chute" seems to fluctuate between /ʃɪu:/ and /ʃu:/ at unclear intervals.
So, might you be able to explain this huge pronunciation conundrum? If you can't explain the stuff I said in the fourth paragraph, can you at least explain the four way distinction I talked about in the third paragraph?
[NOTE: Before you ask, I'm not confusing /u:/ with /ʊ/ or anything like that. "Roof" for me has the same vowel as "who", and "who" for me doesn't have the same vowel as "wood" or "hood".] Tharthan (talk) 19:20, 20 January 2014 (UTC)
- Hmm. I may not be the best person to ask about the peculiarities of English dialects' pronunciations; I'm sorry; you might instead solicit the input of native speakers in the Tea Room. But the phenomena of yod-dropping vs yod-retention are at work. Elision of /j/ before /u/ after /tʃ/, /dʒ/, /j/, /ɹ/, /l/, /s/, /z/, /θ/, /t/, /d/ and /n/ is common in many varieties of American English, per WP, which is probably why yods are missing in those circumstances from the pronunciations Wiktionary marks as US/GenAm. (Some entries note that words can either have yods or drop them in the US, e.g. eschew, but it seems the default practice is to simply omit the yods.) In contrast, yod-dropping after /l/, /s/, /z/ and /θ/, and especially after /t/, /d/ and /n/, was formerly nonstandard in England. The speech of New England is similar to that of England, which may explain why you don't drop the yod* from "dew", "new", "sue" and "lewd".
*Or the yod-like vowel. The distinction between /Cjuː/ and /Ciu/~/Cɪu/ is not always easy to make.
WP also notes that there are some accents, such as Welsh, where pairs like chews/choose, yew/you and threw/through are distinct, with the first member having [ɪu] while the second has [uː]. Out of curiosity, are those pairs distinct for you, or homophonous?
Regarding the pronunciation of "you" as /jɪ/ vs /jə/: the distinction between unstressed /ɪ/ and unstressed /ə/ is also grey, and that unstressed /ɪ/ would reduce to /ə/ is unsurprising. Some dictionaries, e.g. the OED, recognise that some speakers pronounce words with a vowel that fluctuates or is intermediate between the two, which they transcribe /ᵻ/; (some of) our entries use /ɨ/ for much the same purpose.
Regarding the pronunciation of "you" as /jɪu:/ vs /ju:/: perhaps when speakers give a word particular emphasis, they introduce elements to the pronunciation that would not otherwise be there, such as the intrusive /ɪ/, or perhaps yod-dropping vs yod-retention is at work (recall that yods are sometimes dropped between /j/ and /u/).
The variation between "/ɪu:/" and "/u:/" in "rude", "dude", "nude" and "prelude" sounds like more yod-dropping vs yod-retention (you sometimes drop the yods and sometimes don't). - -sche (discuss) 21:57, 22 January 2014 (UTC)
- "Chews" and "choose" are distinct for me in the way that you described, as are "threw" and "through". "Yew" and "you" are not distinct, however. Tharthan (talk) 12:17, 23 January 2014 (UTC)
- Actually I think [ɪu:] is a common realization of the /uː/ phoneme usually after alveolars or dentals, even with no yod-dropping (for example, in words such as soon). After labials or velars (in words such as pool), this is less common. --WikiTiki89 22:31, 22 January 2014 (UTC)
- Hmm. It's good to hear that I'm not the only one. Though I pronounce soon as /suːn/, not /sɪu:n/. Tharthan (talk) 12:17, 23 January 2014 (UTC)
koq
Currently the koq language code incorrectly contains the same information as the kfe language code (both called Kota). It seems that koq should refer to a language in the Bantu family spoken in Gabon. See Module talk:languages/data3/k. --WikiTiki89 23:52, 25 January 2014 (UTC)
- Disregard that, there was no issue. --WikiTiki89 00:41, 26 January 2014 (UTC)
Why are there two packages and what to do?
Do I have to combine the core package and the compat package together? --kc_kennylau (talk) 08:55, 27 January 2014 (UTC)
- What is this in reference to? - -sche (discuss) 08:56, 27 January 2014 (UTC)
Please purge
Please purge Special:UncategorizedPages. --kc_kennylau (talk) 09:39, 3 February 2014 (UTC)
- I don't have the authority / ability to do that, but it will happen at some point. (I'm not sure if the devs do it or if it's set up to happen automatically.) - -sche (discuss) 19:11, 3 February 2014 (UTC)
Mioko language
Challenge: figure out what the hell Mosel (1980) is talking about when he mentions the (unattestably named) Mioko language. It's something very closely related to the many-named languages {{ksd}}, {{rai}}, and {{lbb}}, but I think he's a reliable enough source to assume that this language, whatever it may be, actually exists. —Μετάknowledgediscuss/deeds 02:24, 7 February 2014 (UTC)
- Hmm, it appears to refer to an island mentioned in the Wikipedia article for the Duke of York Islands, but I still can't identify it with an ISO code or Ethnologue entry, or even prove its existence/identity. —Μετάknowledgediscuss/deeds 02:31, 7 February 2014 (UTC)
- Ah, yes, the use of placenames as language names is a bane of students of Oceanic (region) languages (regardless of whether or not that's what happened here—I can't offhand tell). If it's an Oceanic (family) language, it's probably not the Miyako language. Hmm... Ethnologue lists Mioko — and Molot, which she (Mosel) also mentions in proximity to Mioko — as dialects of Ramoaaina. - -sche (discuss) 02:49, 7 February 2014 (UTC)
- That makes sense, given the alternate name 'Duke of York'. So now the policy question: whose lead shall we follow? Oh, and thank you not only for Mosel's book but for correcting my rather sexist assumption about Mosel (should've looked at the first name). —Μετάknowledgediscuss/deeds 02:53, 7 February 2014 (UTC)
- I am rather conservative about splitting languages, it seems. My inclination is to keep Mioko and Molot under {{rai}} for now, particularly because I don't see that anyone has actually taken the position that they are separate languages (has someone?). In listing rai’s dialects, Ethnologue says Makada is "very different, possibly not intelligible to speakers of other dialects", but it doesn't say that about Mioko or Molot. And Mioko and Molot words make their way into Mosel's comparative wordlists, but it's not unusual for dialects to make their way into wordlists where they have distinct terms. I could compare Danish foobar to Hamburgisch fubar and contrast it with Low Prussian kazoo (and compare that to Polish kazu) without implying that Hamburgisch and Low Prussian were separate languages. Entries can use {{context}} and etymology sections can use "from Mioko {{etyl|rai|foo}}" so that no information is lost.
You're welcome / no problem. Actually, that reminds me that I need to upload all the Cimbrian material I acquired (for my use and for others')... ugh, that's gonna take forever. - -sche (discuss) 05:54, 7 February 2014 (UTC)
"bear" Usage notes
Thanks; see User_talk:Wikitiki89#"bear", specifically the last entry (Thnidu (talk) 04:07, 7 February 2014 (UTC)). --Thnidu (talk) 04:46, 7 February 2014 (UTC)
roa-ptg
Do you think we should rename this code to roa-opt to match the others? —CodeCat 21:50, 8 February 2014 (UTC)
- I have no strong feelings on the matter, if other people think it's better to let sleeping dogs / assigned codes lie, but yes, my weak preference would be for it to be renamed; as it is, "ptg" doesn't indicate that it applies to Old rather than modern Portuguese. Let's ping our Portuguese editor and see what he thinks. - -sche (discuss) 22:08, 8 February 2014 (UTC)
- No strong opinion. — Ungoliant (falai) 23:52, 8 February 2014 (UTC)
I think you made a mistake here...
diff —CodeCat 14:11, 11 February 2014 (UTC)
- Indeed; thanks for catching it. That was a manual error rather than the result of [what would thus have been faulty] search-and-replace regex, and it can't have happened on many pages, in part because there were not many pages that used {{cx|obsolete}} + {{alternative spelling of}} to begin with. When the next dump comes out I'll check for any other instances of \{\{(context|cx)\|obsolete (spelling|form) of. - -sche (discuss) 18:16, 11 February 2014 (UTC)
- I found it because it caused a hidden script error, so I think this is the only one. —CodeCat 18:29, 11 February 2014 (UTC)
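The dump check mentioned above could be sketched roughly as follows. This is only an illustration: the regex is the one quoted in the reply, but the `pages` mapping (title to wikitext) and the function name `find_suspect_pages` are hypothetical stand-ins for however a dump is actually read.

```python
import re

# Regex quoted in the discussion: a context label that was wrongly
# merged with the start of a form-of template name.
SUSPECT = re.compile(r"\{\{(context|cx)\|obsolete (spelling|form) of")


def find_suspect_pages(pages):
    """Return the titles whose wikitext matches the suspect pattern.

    `pages` is assumed to be a dict of page title -> wikitext;
    reading a real dump into that shape is left out of this sketch.
    """
    return [title for title, text in pages.items() if SUSPECT.search(text)]


if __name__ == "__main__":
    sample = {
        "broken": "# {{cx|obsolete spelling of|foo}}",
        "fine": "# {{cx|obsolete}} {{alternative spelling of|foo}}",
    }
    print(find_suspect_pages(sample))
```

Only pages with the merged form are reported; a correctly separated label and form-of template do not match.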
roa-ptg to roa-opt
Do you think you could do this move? You seem very experienced with working with language codes, and I don't really know how to track down all uses of a code. —CodeCat 03:41, 15 February 2014 (UTC)
- I can track down all the uses — it's simpler in this case than it would be with a code like "de"; the string "roa-ptg" isn't used anywhere except as a language code, so I can just search a dump for all pages which contain it. There are 2444 uses, most of which are in the main, Appendix and Category namespaces: User:-sche/roa-ptg. I don't foresee any problems if you have Mewbot update those to replace roa-ptg with roa-opt. (2400+ is a bit too many for me to do with AWB.) Then there are a lot of categories like Category:roa-ptg:Sound that incorporate the code into their names and will need to be deleted and "moved". There are only 26 pages that use the code outside those three namespaces; I updated all of those by hand just now. - -sche (discuss) 04:23, 15 February 2014 (UTC)
- You think that if I just do a text search for "roa-ptg" then that should be ok? There would not be any false positives? —CodeCat 21:47, 16 February 2014 (UTC)
- I can't think of circumstances under which "roa-ptg" would be used other than as a language code... maybe as part of a link to a Google Book? To be careful, you could search for |roa-ptg (as in {{l|roa-ptg|foo}}, {{head|roa-ptg|noun}}) and =roa-ptg (as in {{term|foo|lang=roa-ptg}}), and then see if there were any pages left that used roa-ptg not preceded by | or =. (To be even more careful, you could also add that the instances had to be followed by either | or }.) - -sche (discuss) 21:57, 16 February 2014 (UTC)
- I have a library that can parse wiki source code into templates and such, so it lets me go through all the templates, and change parameters. So I could just say "if template name is l, and parameter 1 is roa-ptg, replace parameter 1 with roa-opt". Of course, that means that you need to consider in advance which templates it should fix, which is hard in this case. Normally, when I fix template parameters, I use tracking categories, so that the category is automatically updated as the bot goes through the entries. But with a manual list like the one you generated, that's not possible. So I'm not sure how to check when I actually caught and fixed all instances of roa-ptg. —CodeCat 22:04, 16 February 2014 (UTC)
- OK, I have updated User:-sche/roa-ptg so that it now lists all pages which contain (\||\=)roa\-ptg(\||\}). I updated by hand the few entries that were on the first list but not the second; they were uses of {{roa-ptg-noun}} or the nonexistent {{l/roa-ptg}}. On the remaining pages, you can just perform a simple replacement of all instances of roa-ptg → roa-opt, or if you want to be careful, you can make four replacements: |roa-ptg| → |roa-opt|, |roa-ptg} → |roa-opt}, =roa-ptg| → =roa-opt| and =roa-ptg} → =roa-opt}. At that point, the only remaining instances of roa-ptg will be in the pagenames of the staggering number of Old Portuguese topical categories, which will have to be "moved". - -sche (discuss) 22:48, 16 February 2014 (UTC)
- Oh, and replace {roa-ptg → {roa-opt. (I'm not sure why so many uses of {{roa-ptg-noun}} didn't show up in my previous searches...) - -sche (discuss) 22:55, 16 February 2014 (UTC)
- I made the four replacements you gave originally. I haven't changed the template names yet, nor the topical categories. I suspect many of the old topical categories will be empty now, though, because the changes I made already probably included a lot of categorising label templates too. —CodeCat 01:43, 17 February 2014 (UTC)
- Thanks! I've moved all the templates I could find — the templates themselves, not the uses of them (which still work, thanks to redirects). I started to fix the uses of them with AWB, since there were only ~160, and in doing so I noticed another replacement that can be made: :roa-ptg → :roa-opt, to update the topical categories in the entries. Since I can review each edit before I make it in AWB, I'm using regex that replaces all instances of roa-ptg (regardless of preceding or following characters), and I've yet to find a false positive, FWIW. - -sche (discuss) 02:02, 17 February 2014 (UTC)
As of now, the only pages that the site search finds containing "roa-ptg" are 15 main-namespace pages, which I just updated, 6 other pages, which I also updated, and various topical categories. :) - -sche (discuss) 07:36, 18 February 2014 (UTC)
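For reference, the careful replacements discussed in this thread can be sketched in a few lines of Python. This is an assumption-laden illustration, not the code Mewbot or AWB actually ran: the function names and the sample text are made up, and only the patterns themselves come from the thread.

```python
import re

OLD, NEW = "roa-ptg", "roa-opt"

# The "four replacements": the code preceded by | or = and followed by
# | or }. The lookahead leaves the following character in place.
STRICT = re.compile(r"(\||=)" + re.escape(OLD) + r"(?=\||\})")

# Follow-up cases from later replies: {roa-ptg (template names such as
# roa-ptg-noun) and :roa-ptg (topical category names).
LOOSE = re.compile(r"([{:])" + re.escape(OLD))


def rename_code(wikitext):
    """Apply the strict replacements, then the two looser ones."""
    text = STRICT.sub(lambda m: m.group(1) + NEW, wikitext)
    return LOOSE.sub(lambda m: m.group(1) + NEW, text)


def leftovers(wikitext):
    """Offsets of any uses of the old code that survive renaming,
    i.e. the candidates for manual review."""
    renamed = rename_code(wikitext)
    return [m.start() for m in re.finditer(re.escape(OLD), renamed)]


if __name__ == "__main__":
    sample = "{{l|roa-ptg|foo}} {{term|foo|lang=roa-ptg}} {{roa-ptg-noun}}"
    print(rename_code(sample))
```

Anything reported by `leftovers` is a use not preceded by `|`, `=`, `{` or `:` (for example inside a URL), which is the kind of possible false positive raised in the thread.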
Bot mistake?
In diff your bot used an invalid language code. It also added "m-f" as the gender which doesn't make any sense. "Masculine Feminine" isn't a gender that I know of, in the same way that "Masculine Plural" is. —CodeCat 01:50, 17 February 2014 (UTC)
- Ah, yes, I (manually) incompletely changed {{roa-ptg-noun}} to {{head|roa-opt|noun}}. Good thing that causes a script error; it makes it easy to find! I copied the "m-f" bit from another entry, which had "{{roa-ptg-noun}} {{g|m-f}}". Should "m-f" be a shortcut for "g1=m|g2=f" (or, put another way, should it display "m, f")? - -sche (discuss) 01:56, 17 February 2014 (UTC)
- It should be g=m|g2=f yes. Shortcuts would complicate the logic of the module, I don't think it's worth it. —CodeCat 02:11, 17 February 2014 (UTC)
mwparserfromhell
Have you ever tried this for your bot? It's very useful because it more or less eliminates any danger of mis-parsing code. —CodeCat 21:59, 17 February 2014 (UTC)
- I've never tried it, but I'll bookmark it; thanks! I don't do many fully automated things with User:-sche-bot, though (for which reason, I've lately wondered if I should rename it User:-sche-AWB... but that's probably not worth the bother). - -sche (discuss) 02:13, 18 February 2014 (UTC)
Fraktur script
I don't think this makes much sense. It's a typographical variant of the regular Latin script, not a separate script altogether. If we really wanted to be consistent about this, we'd need a separate script for Old English, Uncial script for Irish, Carolingian minuscule for Old High German, Old French and such, Roman cursive for Latin, and so on. —CodeCat 01:55, 18 February 2014 (UTC)
- But compare Cyrl/Cyrs, Grek/polytonic, Hani/Hans/Hant, etc. --WikiTiki89 02:04, 18 February 2014 (UTC)
- In each of those cases, they're mutually exclusive. Languages don't have more than one in each group. -sche on the other hand recently added Latf alongside Latn as one of the scripts for German. I think that's completely pointless. Script detection doesn't do anything because they're encoded the same way, so the only useful addition that Latf would give is on Category:German language. And that page currently lists Latin twice thanks to this change. —CodeCat 02:39, 18 February 2014 (UTC)
- In the case of Cyrl/Cyrs, that seems to be due to our decision to (rightly or wrongly) exclude modern and Cyrl-script Church Slavonic from the code cu. In the case of Hans/Hant, it is only because there exists the third code Hani that includes within it the first two. Re Category:German language, that is because whoever added Latf to the module failed to give it a distinct name; I have fixed that. The point that Latf is encoded the same as Latn is valid... but then, as Wikitiki points out, the same is true of Cyrs and Cyrl, leading to situations like the OCS headword line and the Russian headword line looking different on [[а]]. (If someone proposed deleting Latf and Cyrs, I'm not sure I'd oppose it...) - -sche (discuss) 05:18, 18 February 2014 (UTC)
- It's also due to inadequate font support for many of the characters in Cyrs. We need to use special fonts for older Cyrillic languages that are fine for that purpose but would look ridiculous if used for modern Cyrillic languages. --WikiTiki89 06:19, 18 February 2014 (UTC)
- Do those have ISO script codes? Fraktur does. Were those contrasted with Latin script as if they were different scripts? Fraktur was, hence Bismarck famously rejected gifts of books written in Latin script, saying „Deutſche Bücher in lateiniſchen Buchſtaben leſe ich nicht.“ And (although I don't necessarily agree with this!) several of our German entries do have usexes and citations which are explicitly in Latf. - -sche (discuss) 02:06, 18 February 2014 (UTC)
- Off topic question: How much exposure does the average modern German speaker have to Fraktur? Is the average modern German speaker able to read Fraktur easily? --WikiTiki89 02:11, 18 February 2014 (UTC)
- People can read Fraktur. Books aren't normally printed in it anymore, but plenty still exist on library shelves from the era when they were, and even if you don't read them (most people probably don't, I'm probably an oddity in that I do — often over the course of citing things on RFV), many pubs, apothecaries, newspapers, etc write their names in Fraktur. (I get the impression this is also true in the UK and US.) And most Fraktur letters are similar enough to their Antiqua equivalents that they're hard to misunderstand. People do mis- and disuse long s, though.
Really, Sütterlin is what people have trouble with... - -sche (discuss) 05:18, 18 February 2014 (UTC)
- It's also easy to confuse the b's, v's, and h's, and ligatures are sometimes hard to parse. I was just wondering if Germans generally have less trouble than I do reading it. (It's not that I can't read it, it's just that it's harder.) --WikiTiki89 06:19, 18 February 2014 (UTC)
Although the lemma notes the offensiveness of the term, I think it would be prudent to include it here also, since someone may look up the gerund without going to the lemma. bd2412 T 18:00, 18 February 2014 (UTC)
- Our current practice is not to do that, because it wrongly conveys that the inflected form is more offensive than the lemma, and/or that "jewing" is an offensive present participle of "jew", as opposed to some non-offensive present participle of "jew". Compare "boughten" and "laught", which correctly describe themselves as archaic/obsolete past participles of "buy" and "laugh", because they are more archaic/obsolete than "buy" and "laugh", and there are past participles of "buy" and "laugh" that are not archaic/obsolete ("bought" and "laughed"). Compare also Wiktionary:Votes/sy-2011-08/User:Mglovesfun for desysop, where Stephen notes that the words that sparked the vote "were already appropriately marked as Croatian on the lemma page (kolovoz). It is not usual to insert dialect tags on form-of pages" if the form-of is merely as restricted as the lemma/whole paradigm. If someone looks up the gerund, they have to go to the lemma to see the word's senses, and are presented at that time with information on which of those senses are offensive (in this case: all of them). Of course, if someone adds a ===Noun=== section to jewing, any offensive senses that it has should be so marked... - -sche (discuss) 18:24, 18 February 2014 (UTC)
- I may be wrong, but I am of the opinion that marking offensive derivations as offensive is a bit more immediate than noting their other contextual characteristics. bd2412 T 19:26, 18 February 2014 (UTC)
- (I apologise in advance for the verbosity of what follows...hopefully it spells out my thinking on the matter...)
I think the offensiveness/obsoleteness/etc of some information, such as one or more senses (even all currently attested senses) of a word, should be indicated in the place where that information is. In the case of jew (“bargain”), the offensive senses are stored in the lemma entry, [[jew]], so that is where the tag "{{cx|offensive}}" belongs. The information that "jewing" = "the present participle of jew" is not what is offensive — there is not another present participle of "jew" that one can use instead to avoid giving offence — it is the use of "jew" to mean "bargain" (which happens to be the only verb-al use of the word that exists) that is offensive.
Similarly, the information that "abstruding" = "the present participle of abstrude" is not obsolete — there is not another present participle of "abstrude" that is not obsolete — it is the use of "abstrude" to mean "push away" (which happens to be the only use of the word that ever existed) that is obsolete, so [[abstrude]] is where the tag "{{cx|obsolete}}" belongs.
Noting that the use of "jew" to mean "bargain" is offensive may well be more important than noting that the use of "abstrude" to mean "push away" is obsolete, but I don't think the note should be made in a different place in one entry vs the other. As I said, putting an "{{cx|offensive}}" tag in jewing would convey incorrectly that the information that "jewing" = "the present participle of jew" was something that was offensive.
Contrastively, the information that "laught" = "the past participle of laugh" is obsolete. There is another past participle that is not obsolete, and the word "laugh" is not obsolete. Following the above-mentioned principle (that the obsoleteness of some information should be indicated in the place where the information is), the information that "laught" is obsolete is correctly stored in the entry [[laught]] (and also next to the mention of laught in [[laugh]]).
It might also be useful to consider terms with multiple senses, only some of which are offensive/obsolete/etc, e.g. eggplant: is the information that "eggplants" = "the plural of eggplant" {{cx|sometimes|slang|offensive}}, because one sense of eggplant but not another is slangy and offensive? Or should we have two senses at [[eggplant]], one "# {{plural of|eggplant}}" and the other "# {{cx|slang|offensive}} {{plural of|eggplant}}"? No, the information that "eggplants" = "the plural of eggplant" is not restricted to any context (such as slang, or offensive speech); all countable senses of eggplant have eggplants as their plural. It is the "black person" sense of eggplant that is offensive. - -sche (discuss) 21:50, 18 February 2014 (UTC)
- I get what you are saying, but there is no inoffensive meaning of "jewing". I would not want to give the impression that "jew" is offensive but "jewing" is okay (the circumstances where this can occur are imaginable, as it is considered more offensive, for example, to say "that man is a Jew doctor" than to say "that man is a doctor and a Jew"). bd2412 T 22:11, 18 February 2014 (UTC)
- If I may butt in, I totally agree with -sche here. But on a separate note, I don't like the way we represent the term jew as offensive. Firstly, it didn't used to be offensive—the OED says "These uses are now considered to be offensive." (my emphasis). Secondly, even today not everyone would necessarily consider it offensive, Jews and non-Jews alike (in fact I often get the impression, perhaps wrongly, that non-Jews are much more sensitive to these kinds of words than Jews). But this can lead us into a wider discussion of the offensive context tag. Whether a word is considered offensive depends on the time, the place, and the specific person. It may be difficult, but I think we should include more detailed information about things like that. --WikiTiki89 22:13, 18 February 2014 (UTC)
- I frankly don't see how using "jew" to mean "defraud" or the like could fail to be offensive. bd2412 T 22:26, 18 February 2014 (UTC)
- You're just furthering my impression that non-Jews are more touchy about this stuff. But anyway, that's not the point. The point is, offensiveness varies across time and place and we should try to cover that. --WikiTiki89 22:36, 18 February 2014 (UTC)
- (I apologise in advance for the verbosity of what follows...hopefully it spells out my thinking on the matter...)
- I don't know to what extent this is comparable, but in German, Zigeuner is no longer used by sensitive speakers (the people who one might derogatorily call "PC"), and several Roma groups have denounced it in very strong terms, but individual Romanis still describe themselves as Zigeuner. (Our entry on Zigeuner only mentions the first half of this because I could not find a source that documented the second half.) Likewise, in English, Gypsy is not used by sensitive speakers, and is denounced by some Romanis, but is still used by other Romanis (and by plenty of speakers who are simply ignorant of its potential offensiveness). Gypsy describes itself as {{cx|sometimes|offensive}}. Both terms have extensive usage notes. - -sche (discuss) 19:05, 19 February 2014 (UTC)
Descendants of Proto-Algonquian entries
These are all using {{term}}, but they should use {{l}}. Could you work on fixing that if you have the time? —CodeCat 21:35, 22 February 2014 (UTC)
- One links to predecessor terms in Etymology sections using {{term}}, why can't one link to descendant terms using {{term}}? I don't see what difference it makes which template is used. - -sche (discuss) 22:29, 22 February 2014 (UTC)
- {{term}} is optimized for use in running text. {{l}} is optimized for use in lists. Currently, that only manifests itself in italicization of the term (if it is in the Latin script) or its transliteration, but it is undoubtedly possible that there could be other things added to them as well in the future. --WikiTiki89 22:35, 22 February 2014 (UTC)
Don't forget to use the preview button before saving your edits. I know it may be annoying on a large page like water, but I noticed you misspelling templates ({{qualifer:less commonly}} [sic]) and using templates that no longer exist ({{t-SOP|...}}). If you don't check your own edits on such a large page, there is a good chance mistakes like that will go unnoticed for quite a while. --WikiTiki89 00:46, 24 February 2014 (UTC)
- Thanks for catching those mistakes. I had forgotten about t-SOP's deletion. - -sche (discuss) 01:26, 24 February 2014 (UTC)
List of headers
I could probably clean up all the misspellings by bot, with much less effort than it would take you to do them all manually. I'd just give it a list of bad headers and their replacements and tell it to go through all the entries listed under "not recognised", and it would weed them all out. So it may be better if you focused on headers that are not misspellings? —CodeCat 02:56, 3 March 2014 (UTC)
- Good idea. :) You could probably bot-fix certain of the "wrong-level" headers, too, like L3 Antonyms (just up it to L4). - -sche (discuss) 03:07, 3 March 2014 (UTC)
- It's more efficient, but less effective as the multiple other problems often in the entries may remain unnoticed for quite a while. DCDuring TALK 03:12, 3 March 2014 (UTC)
- I would not try to automatically fix things like antonyms. Often they're just at the end of the entry, without regard for what sense or part of speech they should go with. DTLHS (talk) 03:13, 3 March 2014 (UTC)
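The bot approach described above (a list of bad headers and their replacements, applied to each listed entry) could be sketched roughly as follows. This is only an illustration in Python, not the bot that was actually run: the function name `fix_headers` and the misspellings in `HEADER_FIXES` are made-up placeholders, and the real list would come from the "not recognised" report. It deliberately changes only the header name, never the header level, so wrong-level headers such as Antonyms are left for manual review.

```python
import re

# Hypothetical mapping of misspelled headers to their standard forms;
# the real list would come from the "not recognised" report.
HEADER_FIXES = {
    "Pronounciation": "Pronunciation",
    "Etymolgy": "Etymology",
    "Synonymns": "Synonyms",
}

# Matches a wikitext header line such as "===Etymolgy===", capturing the
# runs of equals signs so that the header level can be kept as it is.
HEADER_RE = re.compile(
    r"^(={2,6})[ \t]*(.*?)[ \t]*(={2,6})[ \t]*$", re.MULTILINE
)


def fix_headers(wikitext):
    """Return wikitext with known misspelled header names replaced."""
    def repl(match):
        opening, name, closing = match.groups()
        fixed = HEADER_FIXES.get(name)
        if fixed is None or opening != closing:
            # Leave unknown names and unbalanced headers untouched.
            return match.group(0)
        return f"{opening}{fixed}{closing}"

    return HEADER_RE.sub(repl, wikitext)
```

Anything not in the mapping passes through unchanged, which keeps the automated pass limited to plain misspellings.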
ISO codes
Where can I find a full list of all current ISO language codes, preferably all on one page? --WikiTiki89 04:47, 8 March 2014 (UTC)
- Here is the list of current ISO codes I've been using to find out which ISO codes we're missing. (And here is their not-entirely-complete list of retired codes.) I don't know what you plan to do with the data, but note that it wouldn't be a good idea to automatedly import "missing" codes, since in many cases they are codes that have been intentionally excluded. Hope that helps, - -sche (discuss) 05:17, 8 March 2014 (UTC)
- Thanks! I just wanted the convenience of searching through them with Ctrl+F. --WikiTiki89 05:22, 8 March 2014 (UTC)
Does this mean real "silver" "foil", or just silver foil made from aluminium? (I'm pretty sure that Silberpapier means the aluminium product silver paper.) SemperBlotto (talk) 17:52, 8 March 2014 (UTC)
- It has both meanings. You can wrap a pita in Silberfolie made of aluminium, or you can "gild" (err..."silver") something in a thin layer of Silberfolie made of silver. Silberpapier likewise has both meanings. google books:"silver foil" gild suggests that "silver foil" also has both senses, and google books:"silver paper" gold OR gild suggests that "silver paper", in addition to meaning aluminium foil, can also (rarely, and possibly SOP-ily) refer to a certain Oriental product — apparently paper which has been coated in a thin foil of silver. - -sche (discuss) 19:37, 8 March 2014 (UTC)
- I have never heard of "silver foil" referring to aluminum foil and would have always assumed it meant real silver. In the US, aluminum foil is generally known as tin foil. --WikiTiki89 23:43, 8 March 2014 (UTC)
Your proposal to merge the Norwegians made me think: do you think that we should continue to keep Sardinian divided? I thought that these were dialects, and I don’t think we are supposed to treat dialects as independent languages. --Æ&Œ (talk) 10:53, 14 March 2014 (UTC)
- Wow, our treatment of Sardinian is weird.
- There is quite a difference between the case of the written standards of Norwegian, which have existed since only the late 1800s / early 1900s, and the case of the dialects of Sardinian, whose predecessors have been separate from Italian's since the first century BCE and which may have started to distinguish themselves from each other only a little later than that. There is, however, an amusing parallel between our granting of codes to Norwegian and (only) two of its 4+ written standards, and our granting of codes to Sardinian and (only) two of its 3+ dialects (not counting sdc and sdn, since there is disagreement over whether they are Sardinian, Corsican, or independent languages).
- There does seem to be general agreement that src and sro are mutually intelligible. WP says they "differ mostly in phonetics, which does not hamper intelligibility among the speakers". Roberto Bolognesi, in his Phonology of Campidanian Sardinian, does assert that "is only for the Campidanian area, as already recognized in Wagner (1951), that it is possible to speak of a uniform variety of Sardinian and of a general mutual intelligibility of the different dialects". Nonetheless, the variety that is an official language of Sardinia is one that unifies the two dialects, and moreover, (you must already see this, I'm just saying it so I can copy and paste this comment into any discussion about unifying Sardinian) we ourselves already unify them: we have a Category:Sardinian language with most of our Sardinian entries in it, we just also have a Category:Campidanese Sardinian language with 36 entries and a Category:Logudorese Sardinian language with 13 entries.
- Do you think the topic of merging Sardinian should be raised now, or would it be better to wait until some of the other major lect-merger discussions we're having have been resolved? - -sche (discuss) 19:32, 14 March 2014 (UTC)
- Interesting. I’m not exactly sure if I am understanding you when you say that they’re already unified; I’m assuming that you are referring to the hypernymous code (sc) much like we have Norwegian no. I would feel more comfortable if we postponed the Sardinian debate since I’d rather we focus on problems individually. --Æ&Œ (talk) 20:01, 14 March 2014 (UTC)
- Yes, I'm referring to sc. I think the fact that we already have most of our entries using sc shows that we de facto accept that Sardinian is possible to unify (rather than taking Bolognesi's view and baulking at the idea of a unified Sardinian the way we baulked at e.g. the idea of there being a unified "Berber" language). We're just being "schizophrenic" by also having dialect codes. Waiting till the other discussions resolve is my preference as well, so, great. - -sche (discuss) 20:50, 14 March 2014 (UTC)
- The Norwegian vote ended recently. Is it time to start a new election? --Æ&Œ (talk) 09:48, 19 April 2014 (UTC)
- Yes. I'm sorry for not replying to this until now.
My inclination would be to start a discussion on WT:RFM, since the issue seems minor (we have few entries in the affected dialects) and unlikely to be controversial, but you could start discussion in the BP if you would prefer. (In any case, I don't imagine there will be enough controversy to merit moving to a formal vote on WT:V.) - -sche (discuss) 00:02, 6 May 2014 (UTC)
JA phonetics shifts
I noticed this edit, changing ">" to "from". As introduced to me at Wiktionary by User:Bendono's edits, this is meant to convey "this older reading became this other reading", so "[kɨ] > [ki]" is meant to convey "[kɨ] became [ki]", kind of backwards from your edit. I'm not sure what the best notation would be, and I'm tired enough that my brain's not fully firing on all cylinders; I'd appreciate it if you could rework that phrasing as you see appropriate. ‑‑ Eiríkr Útlendi │ Tala við mig 05:01, 16 March 2014 (UTC)
- Ha, wow, this is exactly why words are better than greater-than-signs: some of the places on en.Wikt that I've seen people write ">", they've meant "from", other places, they meant "to". (Words are also more helpful than ">" to screen readers.)
So many etymologies contain phonological information of that sort that someone (possibly me) should design a template for them, firstly to automatically apply the class that is currently applied by {{IPAchar}}, and secondly to standardise whatever wording we decide to put around and between "[kɨ]" and "[ki]". Perhaps the template could take readings as numbered parameters, and {{ja-reading-etymology|kɨ|ki}} would display The phonological evolution was from [kɨ] to [ki]. And if a third parameter were supplied, {{ja-reading-etymology|kɨ|ki|ke}}, it would display The phonological evolution was from [kɨ] to [ki] to [ke]. And so on. Would that be a good wording? - -sche (discuss) 05:36, 16 March 2014 (UTC)
- Where have you ever seen ">" meaning "from"? --WikiTiki89 07:48, 16 March 2014 (UTC)
- I have AWB set to point it out to me whenever a page I'm on uses ">" or "<". I've encountered maybe ~500 instances of those characters (not counting uses in HTML tags), of which there have been maybe a dozen that had "[newer language] [word] > [older language] [word]" in their etymologies, which have been the most unambiguous cases of ">" = "from". I've seen about as many entries with "<" for "to". Using "greater than" to mean either "from" or "to" is such a strange idea to begin with that I wouldn't even try to guess whether such uses were intentional or the result of someone's finger slipping and typing the wrong one of the two. I don't feel like tracking down specific examples at the moment, but I'll let you know next time I see one. (And BTW, that's not to speak of the variation between entries that present their etymologies in the format "[current form] [some terse symbol or word, typically either ">" or "<" but sometimes "←", meaning "from"] [older form] [terse symbol] [oldest form]" vs those that use the format "[oldest form]" [terse symbol or word meaning "to"] [older form] [terse symbol] [current form]". That variation can itself lead to confusion, particularly when — as in the case of numerous Finnish etymologies — the "[current word]" is omitted, and the string just starts or ends with ">"/"<"/"←"/"→".) - -sche (discuss) 08:47, 16 March 2014 (UTC)
- The only way I've ever seen them is "<" = "from" and ">" = "to". It may be a strange idea, but it is widespread in historical linguistics. --WikiTiki89 09:00, 16 March 2014 (UTC)
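The wording proposed above for a {{ja-reading-etymology}} template (readings supplied oldest first as numbered parameters, joined with "from ... to ...") can be sketched as a small function. This is a Python illustration of the output format only, assuming the wording in the proposal is adopted unchanged; it is not an existing template or module, the function name is hypothetical, and it leaves out the IPA class that {{IPAchar}} would apply.

```python
def reading_etymology(*readings):
    """Format a chain of readings, oldest first, as one sentence.

    Mirrors the proposed output of the (hypothetical) template
    {{ja-reading-etymology|...}}; at least two readings are needed.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    # Wrap each reading in phonetic brackets and join oldest -> newest.
    chain = " to ".join(f"[{reading}]" for reading in readings)
    return f"The phonological evolution was from {chain}."


print(reading_etymology("kɨ", "ki"))
print(reading_etymology("kɨ", "ki", "ke"))
```

Spelling the direction out in words avoids the ambiguity discussed above, since the order of the parameters alone decides which reading is older.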
Lukewarmer
Hello, I've been trying to mark lukewarmer for deletion because of the discussion taking place at w:Wikipedia:Articles for deletion/Lukewarmer. However the edit filter won't let me. It would be awesome if you could delete the page yourself. Thanks, Jinkinson (talk) 17:15, 21 March 2014 (UTC)
A request
Based on this piece of comment, is it okay to make Perfective Counterpart and Imperfective Counterpart headers? --KoreanQuoter (talk) 09:46, 23 March 2014 (UTC)
- I wouldn't. It doesn't seem necessary to add a new header just for that: if a word has only one or two perfective/imperfective counterparts, they could just be listed on the headword line; if there are more (or even if there are only a few), they could be listed in the ====Synonyms==== or ====Related terms==== section, whichever is appropriate. If you want to list them on the headword line, the current {{ru-verb}} templates could probably be expanded to accommodate that; you could ask about that in the Grease Pit. - -sche (discuss) 19:04, 23 March 2014 (UTC)
protestant
You can't call protestant a capitalisation, it's the exact opposite; the uncapitalised form of Protestant. In my opinion that template doesn't apply in this situation, and using it leads to confusion. I have come across this paradox elsewhere, but I can't remember where. Donnanz (talk) 18:25, 24 March 2014 (UTC)
- That's like saying that you can't call a nanometer a length because it's not long. --WikiTiki89 18:32, 24 March 2014 (UTC)
- What's a nanometre got to do with it? Donnanz (talk) 18:41, 24 March 2014 (UTC)
- Because we use "capitalization" to mean some sort of measure of how capitalized a word is, and not capitalized at all is one possibility. The word "length" is another example of how we take a biased word and use it as a neutral term for a measurement. We don't have to switch to the word "shortness" when talking about nanometers. --WikiTiki89 18:46, 24 March 2014 (UTC)
- Wikipedia's w:Capitalization (disambiguation) oddly does a better job of defining "capitalization" than our entry: one of the things it refers to is "choice of case in text", and one choice of case is "lowercase". This is the sense used by the template. As I noted in my edit summary, the template is regularly used for variations in capitalization in either direction. If you look at Special:WhatLinksHere/Template:alternative capitalization of, roughly half of the uses of it are on uncapitalized pages, soft-redirecting them to capitalized pages, and the other half are soft-redirects in the other direction. - -sche (discuss) 18:42, 24 March 2014 (UTC)
- Do you expect the man in the street (i.e. average user) to understand this inflexible philosophy? I, for one, do not. It's rather daft in situations like this. Donnanz (talk) 19:00, 24 March 2014 (UTC)
- The only "inflexible philosophy" I see is your philosophy that "alternative capitalization of" has to mean "alternative form written with an uppercase letter of". "Choice of case (whether ALL CAPS, CamelCase, Sentence case, or all lowercase)" is a regular meaning of "capitalization" in English. I've expanded our entry on capitalisation accordingly. - -sche (discuss) 19:05, 24 March 2014 (UTC)
- My "inflexible philosophy"? Huh! I know my interpretation of "capitalisation" is correct; the statement "Choice of case ...... or all "lowercase"" (sic) is WRONG (that's capitalisation for you). I see I'm not going to win this argument, but I hope you have a light-bulb moment one day. In the meantime, I have better things to do. End of argument. Donnanz (talk) 20:01, 24 March 2014 (UTC)
- Condensed version of your comment: "I don't have an inflexible philosophy, I'm just right and you and Wikitiki [and all the books that use 'capitalization' to mean 'choice of case'] are wrong!" lol. - -sche (discuss) 20:11, 24 March 2014 (UTC)
- Correct. Donnanz (talk) 20:30, 24 March 2014 (UTC)
- This may actually be an American thing. Compare definition 4 in the Collins English Dictionary to definition 5 in the Collins American English Dictionary. --WikiTiki89 20:29, 24 March 2014 (UTC)
- I find I'd be inclined to follow Donnanz's usage. To me capitalization in "His capitalization of the letters is wrong" is equivalent to "That he capitalized all of the letters is wrong". I could accept the other meaning as a possibility, but it would not be my favored interpretation. DCDuring TALK 20:38, 24 March 2014 (UTC)
- How about (discussing e.g. a student's paper) "her capitalization is all over the place". To me, that implies that her paper contained things like "the Russian armed Forces attacked budapest with Tanks and planes", not usually (and certainly not exclusively) that it consisted of "THE RUSSIAN ARMED FORCES ATTACKED BUDAPEST WITH TANKS AND PLANES". - -sche (discuss) 20:51, 24 March 2014 (UTC)
- I was just about to add a comment on that. There are situations in which I could accept the meaning as "choice of orthographic case", but I would interpret "her capitalization is all over the place" as "her use of capital letters is inconsistent." Note that this interpretation would probably not lead to not getting the point intended. I think that my English has a narrow construction of words derived from the orthographic sense of capital and capitalize. I wouldn't say "Her capitalization is wrong." if she failed to use any capital letters and would not give that interpretation to someone else's utterance of that sentence. DCDuring TALK 21:06, 24 March 2014 (UTC)
- The sole definition of capitalization at MWOnline is "the use of a capital letter in writing or printing", completely in accord with my idiolect's usage and interpretation of the word. DCDuring TALK 21:13, 24 March 2014 (UTC)
Can this go? —CodeCat 20:41, 26 March 2014 (UTC)
- I'd like to keep it as a model of how entries would look if we gathered citations from all stages of a language's development onto one page. That's something that is not done at present except partially for English, where somewhere between 50–100+ entries include Middle English (but AFAIK never Old English) citations.
- As an aside, I wonder how hard it would be to find all the files like File:Ég.ogg (with no langcode prefix), and find what langcode prefix they should have, and [have Commons admins] move them...
- - -sche (discuss) 21:24, 26 March 2014 (UTC)
- Not too hard if we add some code to the template we use to add the files to entries. It could check the parameter to see if it begins with the given language code, and categorise it if not. —CodeCat 21:49, 26 March 2014 (UTC)
- Great, especially if it could categorise (or a list could be made) based on which prefix was missing. I think Commons has tools for moving things in batches, e.g. "move all these 200 files to have a de- prefix". - -sche (discuss) 22:22, 26 March 2014 (UTC)
- I don't really know anything about how audio files are handled on Wiktionary, though. Which templates should this be added to? —CodeCat 23:39, 26 March 2014 (UTC)
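The check suggested above (see whether the file parameter begins with the given language code, and categorise the page if not, ideally by which prefix is missing) might look like this in outline. It is a Python sketch under stated assumptions rather than actual template code: the function name, the category name and the example file names are invented for illustration, and the case-insensitive comparison is a guess that prefixes may be written either "de-" or "De-".

```python
def missing_prefix_category(filename, langcode):
    """Return a tracking category if filename lacks the langcode prefix.

    Returns None when the file name already starts with "<langcode>-".
    The category name is a hypothetical placeholder.
    """
    name = filename
    # Ignore an optional "File:" namespace prefix.
    if name.startswith("File:"):
        name = name[len("File:"):]
    expected = langcode.lower() + "-"
    # Assumption: prefixes are compared case-insensitively.
    if name.lower().startswith(expected):
        return None
    # Naming the category after the code groups files by missing prefix.
    return f"Audio files missing the {langcode} prefix"
```

Grouping by the missing prefix in this way would produce the per-language lists that a batch move on Commons could then work through.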
Shortcuts to templates
I've updated {{shortcut}} so that it can be used easily to indicate shortcuts for template names. The parameters can optionally include "Template" as part of the name, the template strips it out anyway. See {{context}} and {{label}} for examples. —CodeCat 23:06, 28 March 2014 (UTC)
- Oh, good idea; thanks for doing that. :} - -sche (discuss) 01:17, 29 March 2014 (UTC)
The noun is really just the gerund of the verb. I wanted to remove the entry altogether as our verb entries generally don't include the gerund, but it has a quotation that I didn't want to remove. So I did diff instead. It looks kind of odd though, having a form-of definition pointing to the same page. What do you think? —CodeCat 02:10, 30 March 2014 (UTC)
- To whatever extent that the gerund is considered a form of the verb and doesn't have distinctly nounal features like a plural (like English gerunds or would-otherwise-be-gerunds sometimes have), the quotation could be placed under the relevant verb sense and the noun section could be removed. The translation could be amended to "the disclosing of our DNA". If the noun section is to be retained, I think it would be helpful if the same terminology were used in it as in the verb's conjugation table, which means either the verb's conjugation table could be adapted to label the gerund as such, like "infinitive (and gerund)", or the template used to define the noun could be adapted to use the same label as the conjugation table, like "gerund (infinitive) of". - -sche (discuss) 02:37, 30 March 2014 (UTC)
- Dutch gerunds are certainly nouns. They don't have plurals, but they do have gender. It's the same in German. —CodeCat 03:05, 30 March 2014 (UTC)
- Then why do you want to remove the noun section? - -sche (discuss) 03:15, 30 March 2014 (UTC)
- Because all verbs have a gerund and it's always identical in form to the infinitive. So it would imply that we'd always have a noun section on the same page as verbs. I'm not sure if that's practical. —CodeCat 03:17, 30 March 2014 (UTC)
- All German verbs can be substantivised, too. It happens that they get capitalised as part of the process, and Wiktionary puts different capitalisations on different pages... would it be consistent to exclude Dutch gerunds because they don't get capitalised? Substantivisations also inflect nounally in German, as in des Schwimmens — but then, Dutch gerunds also inflect nounally, or did in the past, right, as in willens en wetens? Also, it's only in theory that all verbs can be substantivised; in both Dutch and in German, there are probably cases where a verb is attested while its substantivisation isn't. (Strictly in theory, even the reverse could be true: there could be three citations of a gerund and only two of its verb.) So, Dutch gerunds could be allowed but not made a high priority. Or, if not, then as I said: to whatever extent the gerund is subsumed into the verb, quotations of it can go under verb senses. - -sche (discuss) 04:40, 30 March 2014 (UTC)
Reversal
Hi,
Your reversal and the summary don't make sense to me. --Anatoli (обсудить/вклад)
- I was looking at the edit backwards, I apologise; Wikitiki clarified that in his subsequent reversal of my reversal of your reversal of the IP's removal of the links. - -sche (discuss) 21:05, 30 March 2014 (UTC)
aWa
Wiktionary:Sandbox/aWa is a page specially designated for testing. And you could have just asked me about adding archiving capability to Wiktionary:RFM unresolved requests subpages… — Keφr 05:55, 4 April 2014 (UTC)
- I wasn't aware of that sandbox, but it's alright, I was doing a "real conditions" test of the archival of a section that didn't contain a pagename in its header to a manually-supplied 'target' talk page. When I archived the discussion which ultimately ended up here, the archiver initially put it here. I was testing whether that was because of a bug in the archiver or because I accidentally copied the whole string from the top of the page I wanted to put the discussion (having navigated to the category to be sure of what it was called), as opposed to just the pagetitle. It turned out to be the latter. - -sche (discuss) 07:02, 4 April 2014 (UTC)
{{en-noun}} doesn't support pl2= anymore and hasn't for a while now. Just letting you know. —CodeCat 18:05, 5 April 2014 (UTC)
- I actually worked on some Garifuna stuff for Wikipedia. Are there any entries for Garifunan words on here? Tharthan (talk) 15:11, 6 April 2014 (UTC)
Hello. I note your reversion on "Catholic", but don't agree. You may not be much aware of Christian Orthodoxy or its relation to Roman Catholicism, but both churches lay claim to catholicity. This is one result of the East-West schism which separated them in 1054, which may seem like a long time ago, but the ramifications are still very much alive. The Roman church came to be known as "Roman Catholic" specifically because of this claim to catholicity, and chose that name as part of establishing its claim. What you may not know is that what we know here in the west as the "Eastern Orthodox Church" also calls itself officially the "Orthodox Catholic Church", for much the same kinds of reasons. That knowledge is simply not widely known in the west. Even in the east, the title itself is not unduly emphasized. Instead, the teaching of the church as to its catholicity is made more prominent. But the counter-claim to the Roman church is just as firmly established. For "Catholic" then, with a capital C, it is indeed used as an adjective in modifying references to the Orthodox Church.
I think the official church title is enough to establish the capitalized form in relation to eastern orthodoxy. Other orthodox church references in English can probably be found, but Orthodoxy is spreading in the west only in more recent times, from areas that don't historically speak English so much, and rules of capitalization vary in other languages. Translations from original documents in other languages may or may not be done by people who are fully aware of such specific English usage.
I would argue that the entry ought to be changed back to something similar to what I had put there. What do you say? Evensteven (talk) 18:29, 10 April 2014 (UTC)
- As far as I know "as opposed to" means "contrasting with" and has nothing to do with their opinions. --WikiTiki89 18:40, 10 April 2014 (UTC)
- No argument there. And "in contrast to" also has the same meaning. But it's a fact of human psychology that how something (including a wording) is interpreted can sometimes be subject to association: "opposition" in this case. The wrong association can sometimes result in misinterpretation, and non-native speakers of English (or the less fluent) are more subject to this kind of difficulty. "Contrast" or "differentiation" are more neutral, less likely to get there. It's not about being incorrect; it's about being helpful. This is a totally different issue than "Catholic" in reference to the Orthodox Church, however. Evensteven (talk) 21:34, 10 April 2014 (UTC)
- I imagine citations (passages from books that use the term "Catholic" in its various senses) will do more than anything to establish what senses the word has. It'd be particularly useful to see under what circumstances (C|c)atholic is used, with a particular meaning, in a way that actually contrasts with other possible meanings. (Uppercase "Catholic" in the sense "Roman Catholic" contrasts with e.g. "Protestant" when people speak of "Protestant England passing anti-Catholic laws", laws directed against the church that follows the Pope/Bishop of Rome, not specifically against e.g. Orthodox churches.) The inclusion of "Catholic" in the name of Orthodox churches that claim catholicity is easy to see as merely an instance of honorific capitalisation, in the absence of evidence that "Catholic" is used by itself to mean (or include) the Orthodox. Compare the inclusion of "Holy" in the names of churches — it doesn't mean "Holy" with a capital "H" means "sacred", it means "holy" (like "catholic") has one of its usual meanings, and was given honorific capitalisation as part of a Proper Noun/Name of an Important Thing. - -sche (discuss) 00:11, 11 April 2014 (UTC)
- Well, I edit mostly on Wikipedia, and don't know what your standard ways of looking at things are. As I said, the official title of the eastern church is "Holy Orthodox Catholic Church", capitalized "Catholic" because it is part of the title, hence a proper name. That is seen, of course, in formal references. Outside that context, Catholic is not generally used (although nothing would prevent it). Then, it's Orthodox Church, or perhaps Eastern Orthodox Church. The one other instance of a capitalized Catholic that I see normally lies in the phrase from the Nicene Creed, "one, holy, Catholic, and Apostolic Church", when the Orthodox church is identified as being just that. I would argue that's not an honorific or a proper name, but a reference to its catholicity. I see both these types of usage in Timothy Ware's The Orthodox Church, chapter 16, The Reunion of Christians, pp 315, 330, Penguin Books, 1991. (Note the edition is not the more recent 1993 one, with revisions, but both are current in use.) Ware also quotes an Anglican, Bishop Ken the Non-Juror, as saying "I die in the faith of the Catholic Church, before the disunion of east and west", p 325. Here the Catholic reference is again clearly to catholicity, and it was taken in a context where the Bishop was expressing his affirmation as an Anglican to catholicity, and his connection to Orthodoxy as a result of it. That usage can cut a number of ways, but it is not in this case separated from Orthodoxy either. I don't know if these quotes are retained in the 1993 edition of Ware, but if they are, the page numbers and placement in the volume might be different. The 1991 edition also has many references to "Catholic", alone and with "Roman", meaning the Roman church, in the common abbreviated uses we tend to see often in English. Frankly, it's not that many books on Orthodoxy that tend to mention other churches in multiple contexts like this one does.
And "Catholic" definitely tends to be restricted in context with respect to Orthodoxy to just the title, and to affirmations of catholicity.
- One more thing I can attest to: that the usage is notable among Orthodox. Both uses I mention are the guarantor of official recognition on the part of the church itself that it identifies with being "Catholic". Secondly, not all rank and file Orthodox, especially in more isolated regions, actually know this about the Orthodox church. It has been surprising to me how often the WP article on the "Eastern Orthodox Church" has been edited to reject the "Catholic" label, by strongly anti-Roman Orthodox. A number of those have even rejected official pronouncements from reliable Orthodox sources, saying that "Catholic" cannot mean anything but Roman Catholic. One was asked by another editor if he said the Nicene Creed in services. (If he doesn't, he doesn't go to Orthodox services; it's in virtually all of them). The WP article is burdened by proofs. I suspect that some of that controversy can also spring from lack of expertise with English, as references to Orthodoxy are found in English much less than references to Catholicism, and the latter may be all some of those more isolated Orthodox have ever seen in English.
- In any case, the Orthodox church takes its catholicity very seriously, though an Orthodox attitude on that point does not get so emotional when confronted with rival claims. (It talks regularly with Catholics, Anglicans, and others about ecumenical issues, and this one comes up all the time.) Whatever the sensitivity of an individual, reference to these matters in English is as common as might be expected in relation to the frequency with which these matters come up in English, and are quite notable there. In addition, English Wiktionary is (I am sure) referenced quite often by speakers (principally) of other languages, in other nations, and I would imagine that Catholics and Orthodox alike in those locations would reference this particular entry more often than the general populations.
- You know my opinion fairly completely now, and have one explicit source that gives all the senses I have seen in English. I'll leave it to you Wiktionarians to decide what to do next. Just don't make the decision based only on conformity with the way things are normally done on Wiktionary, because this item has its own peculiarities that should be taken fully into account. I think the significance to readers may alter the normal balance point. Oh, and if you've been wondering, no, I'm really not trying to push some anti-Catholic point of view, just trying to expose a smaller-scale but important usage. The existing material relating to the Roman church is clearly correct and suitable in its essence.
- Thanks for listening. Evensteven (talk) 03:13, 11 April 2014 (UTC)
- One more thing. "Catholic" within the title to both Roman and Orthodox churches are in fact references to catholicity also. Note above the Anglican's use of "Catholic", capitalized, to highlight catholicity. The capitalization in the titles might be thought to come from their being in titles, but the capitalization for reference to catholicity is retained in all contexts. Use of "Catholic" as in Roman church can be seen as taken wholesale from the title, but it can also be seen as reference to catholicity. The origin of the capitalization is not clear, as usage is mixed. Capitalization is a mixed art in English anyway. Look at most 18th century documents. Evensteven (talk) 04:01, 11 April 2014 (UTC)
- A Wiktionary question for you. - Would it have been better of me to create a discussion page for the "Catholic" entry when raising this whole question? That's what I would have done on Wikipedia, but I note that most entries have no discussion page already. I hesitated, not knowing if that's as much a part of the working culture here. I expect to drop in on Wiktionary from time to time, and would like to establish my bearings a little better. Evensteven (talk) 04:25, 11 April 2014 (UTC)
- See the edit notice. — Keφr 14:31, 11 April 2014 (UTC)
RfD close
When you deleted financial service, did you also intend to delete financial institution? The latter is under the header of the former. Cheers! bd2412 T 14:16, 25 April 2014 (UTC)
- Oh, thanks for catching that, and sorry I didn't notice this message until now. (I wonder why I didn't get one of those bright orange notifications until Anatoli's comment, below. Oh well, I've deleted the entry now...) - -sche (discuss) 23:55, 5 May 2014 (UTC)
kv language code
Hi,
I must have missed something. What happened to "kv"? Is it retired? It's still in Module:languages/data2. Or are you orphaning it first. I noticed that Komi changes to Komi-Zyrian in translations. Which module does this conversion? --Anatoli (обсудить/вклад) 23:30, 5 May 2014 (UTC)
- I am converting kv to kpv per WT:RFM#Komi_language. (Note also several older discussions, including WT:RFM#Category:Komi_language and Template talk:kpv.) Since it was (not unanimously, but sufficiently) decided to treat the two varieties of the macrolanguage Komi (kv) as separate lects, and we already have entries using the varieties' codes (kpv and koi), kv is no longer necessary or desirable: having it allows people to add words without specifying variety, which we don't want if (and insofar as) we consider the varieties to be separate languages. According to their headers, our kv entries are all Komi-Zyrian, anyway (as opposed to pan-Komi), so I am converting them to kpv. Then kv can be removed from the module. - -sche (discuss) 23:51, 5 May 2014 (UTC)
- OK, agreed. For "pan-Komi", if we get any, we can have duplicate entries. --Anatoli (обсудить/вклад) 23:58, 5 May 2014 (UTC)
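The kv to kpv conversion described above amounts to swapping a language-code parameter inside templates. A minimal sketch of that idea follows, assuming the code appears as the first parameter of a small set of templates; the function name `convert_lang_code` and the template list are hypothetical and chosen only for illustration, not taken from any actual tool used for this cleanup.

```python
import re

# Hypothetical list of templates whose first parameter is a language code.
# This is an illustrative assumption, not an exhaustive or authoritative list.
TEMPLATES = ("t", "t+", "l", "m")


def convert_lang_code(wikitext: str, old: str = "kv", new: str = "kpv") -> str:
    """Replace `old` with `new` where it is the first parameter of a listed template."""
    names = "|".join(re.escape(name) for name in TEMPLATES)
    # Match "{{<template>|<old>" only when followed by "|" or "}}",
    # so that longer codes that merely start with `old` are left alone.
    pattern = re.compile(
        r"(\{\{(?:" + names + r")\|)" + re.escape(old) + r"(?=\||\}\})"
    )
    return pattern.sub(lambda match: match.group(1) + new, wikitext)
```

Entries already tagged with another code, such as koi, are left untouched, which matches the intent of keeping the two varieties separate.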
You had added Sirup as a Middle High German word with a quote from 1563. It seemed to be an obvious mistake to me, since Middle High German is normally used for German until ca. 1350, maybe 1400~50, but certainly not 1563. So I deleted it. Only then did I see that you were the one who'd made this entry. I don't know, but it seems that sirup was one of the spellings used in (actual) Middle High German, but we don't normally use capitalized lemmas for MHG, do we? As far as I know capitalization of nouns was still very rare at that time... Anyway, you decide whether you want to put it back up, but you shouldn't use that quote, because it's Early Modern German, not Middle High German. Best regards. Kolmiel (talk) 11:38, 14 May 2014 (UTC)
Unblock script
Hey -sche,
I know I haven't been keeping up with Wiktionary affairs much nowadays, since I've been so busy with Commons and MediaWiki.org and all, but do you still remember Wiktionary:Beer parlour/2014/January#Infinite-duration blocks of IP addresses? I cooked up a script in Python which I believe should do the job, and if you want to run it on your account I can email the script if necessary. TeleComNasSprVen (talk) 09:50, 21 May 2014 (UTC)
- Have the blocks not been lifted already? — Keφr 15:54, 21 May 2014 (UTC)
- I lifted or shortened all of the blocks except the blocks of Tor nodes and proxies, and the blocks of IPs which were also subject to global bans, blocks, or locks. Hundreds of those blocks remain. TeleComNasSprVen and Amgine argued in the BP that it would be good to lift them, too, because "even those addresses get reassigned", and because the "WMF makes exceptions for certain users which have justified to the stewards their use of such IPs[, ... and] for such users the local blocks prevent their participation in project even though they have been cleared by community Stewards". We just have to decide if we do, in fact, want to lift those blocks. Note that prior to running the script we may want to manually shorten (rather than entirely lift) our handful of recent (post-2011) permablocks: of 110.173.0.18, 193.63.87.227, 108.62.89.242, and 81.70.250.48.
If we do want to lift all the remaining local blocks (I think it would be reasonable to), you should probably send your script to someone with enough Python skillz to check it before running it (not me, lol). :p - -sche (discuss) 18:32, 21 May 2014 (UTC)
- I think we might need to poke someone from the WMF for that. And tell them to fix the database in the meantime. — Keφr 19:01, 21 May 2014 (UTC)
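The plan discussed in this thread (lift the remaining blocks, but shorten the four recent permablocks by hand) could be checked with a small dry-run step before any script touches the wiki. The sketch below is not TeleComNasSprVen's script, which is not shown here; `plan_unblocks` and its input list are hypothetical, and it makes no calls to the wiki at all.

```python
import ipaddress

# The four post-2011 permablocks named in the thread, to be shortened
# manually rather than lifted outright.
SHORTEN_MANUALLY = {
    "110.173.0.18",
    "193.63.87.227",
    "108.62.89.242",
    "81.70.250.48",
}


def plan_unblocks(blocked_targets):
    """Split block targets into (to_lift, to_shorten, skipped).

    `blocked_targets` is a hypothetical list of strings, for example one
    exported from the block log. Anything that is not a single IP address
    (usernames, and ranges in this simple sketch) is skipped untouched.
    """
    to_lift, to_shorten, skipped = [], [], []
    for target in blocked_targets:
        try:
            ipaddress.ip_address(target)
        except ValueError:
            skipped.append(target)
            continue
        if target in SHORTEN_MANUALLY:
            to_shorten.append(target)
        else:
            to_lift.append(target)
    return to_lift, to_shorten, skipped
```

Reviewing the three lists by eye first would give someone without much Python experience a way to confirm what a real unblock run would do.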
Yes, the existing definition was not too good. I inserted what I think is a better definition for the sense of wash after the one you worked on. I think it fits the citation perfectly. Importantly, I think this sense of wash only applies to an outcome, actual or expected. DCDuring TALK 18:01, 25 May 2014 (UTC)
- Thanks. I've moved the citation and the wording about "no net change". Should the "something where no progress is made" sense be removed now, or do you think it is attested and distinct from the "losses and gains are equivalent, no net change" sense? - -sche (discuss) 21:17, 25 May 2014 (UTC)
- In my idiolect, insofar as it is distinct from the new sense, it is not correct. But the English-speaking world has lots of variation.
- A couple of OneLook dictionaries say it is US. Neither Collins nor Macmillan have it, suggesting it is not UK. It was not in Websters 1828 or 1913. A related sense of wash = wash sale appeared in Century. The wording in the dictionaries that have the sense usually includes no explicit mention of "progress", sometimes "yield" or other indication of result, so I would be happy deleting or "merging" the senses.
- Relatedly, I don't get the sense development from the other senses to the "equivalence/balance" sense(s) or to the "wash sale" sense. I could see sense development from the "wash sale" sense to the "equivalence/balance" sense. DCDuring TALK 00:24, 26 May 2014 (UTC)
- On stackexchange, one commenter suggests that "the usage derived from "wash out" back in the 20's". Nothing about their evidence suggests to me that the usage derives from "wash out", but ngram data does at least suggest that the dating is broadly right: it seems to have arisen sometime between 1880 and 1950. Stackexchange also suggests that "wash sale" may derive from this sense of "wash". - -sche (discuss) 21:02, 26 May 2014 (UTC)
- On Google Books, I spot a couple instances of "wash loan", an interest-free loan, which seems like another derivation of this sense of "wash". - -sche (discuss) 21:06, 26 May 2014 (UTC)
- "Wash sale" is related, but it mostly leads to discussions of the specific US tax law provisions. "Wash transaction" is more informative, showing that the term covered a variety of types of transactions in which an apparent "real" transaction is offset by an exactly corresponding undisclosed transaction, to achieve some financial gain by fooling someone. I still have trouble grasping the metaphorical connection with other senses of wash. DCDuring TALK 06:37, 28 May 2014 (UTC)
Hi, please take a look at this, especially at any languages you have renamed. DTLHS (talk) 22:18, 1 June 2014 (UTC)
- Will do; thanks for regenerating the list. - -sche (discuss) 15:24, 2 June 2014 (UTC)
You added the Oroko language section with a Bantu language header and language code (bdu), but cited Blust's Austronesian comparative dictionary, which has oi as the Bimanese (our Bima, code bhp) word for water to match your definition.
The obvious questions:
- Is oi really the Oroko word for water?
- Do we need to create a Bima section?
- What should I do with the Oroko categories I just created based on Category:Oroko nouns' presence in Wanted categories? Chuck Entz (talk) 23:37, 14 June 2014 (UTC)
- Hm, it looks like I got the codes for w:Bima language (Bantu) and w:Bima language (Austronesian) mixed up in adding the translation to [[water]] and in creating that entry. (Or possibly WT:EDIT's autocomplete-the-language-name function mixed them up.) I'll fix it up. Good of you to notice. I'd just leave the Oroko categories; I've never been a fan of deleting empty POS and language categories (the latter in particular should never be deleted, IMO), since they're bound to fill up eventually. - -sche (discuss) 01:49, 15 June 2014 (UTC)
I'm confused.
There is no question that I don't merge "cot" and "caught", but I still wonder something...
I've been wondering exactly how to pronounce /ɒ/ all of this time, because I assumed that I didn't have it. The closest that I have come (or so I thought) was by rounding my lips whilst making the /ɑ/ sound. Yet all that sounded like to me was a short "/ɔ/".
So I decided to compare the audio files listed on Wikipedia for /ɔ/ and /ɒ/. These are the aforementioned audio files: http://upload.wikimedia.org/wikipedia/commons/0/02/Open-mid_back_rounded_vowel.ogg , http://en.wikipedia.org/wiki/File:Open_back_rounded_vowel.ogg
In comparing them, I noticed something. The sound file that supposedly represented /ɒ/ sounded closer to my "/ɔ/". In fact, only when I dropped the "r" from "north" and stressed the remaining vowel did I actually produce a sound that was more-or-less identical to the sound clip attached to /ɔ/.
In addition, my "/ʌ/" resembles the sound clip attached to "/ɐ/" far more than it does the sound clip attached to /ʌ/ (the sound clip attached to that sounds more like /ɜ/ to me. Though... not identical.)
However, in all of the cases where I mention that something sounded "closer" to something, I meant just that. More specifically, my "/ʌ/" does not sound exactly like the sound clip of /ɐ/; in particular, my "/ʌ/" is a little gentler sounding (to explain, if I try to mimic the sound clip for /ɐ/, then my teeth are exposed and my front teeth push back). Meanwhile, my "/ʌ/" has the softness of a non-rhotic /ɶ/, if that's possible (by that, I mean that the sound clip for /ɶ/ [1] sounds rhotacised to me). In addition, my "ɔ" indeed sounds more like the sound clip given for /ɒ/ than the sound clip given for /ɔ/, but it has a similar issue in "softness/gentleness" as the previous.
Any ideas as to what the problem here is/what sounds I'm actually producing? Tharthan (talk) 18:35, 24 June 2014 (UTC)
- It's not always easy to tell what precise phoneme someone is producing even when one hears them speak, but when one merely reads text, it's very difficult indeed. I can make two comments, though:
- I've never been convinced that all of WP's IPA phoneme audio files, especially the audio files of unusual sounds like /ɶ/, are as perfectly spot-on as WP's decision to provide them in the infoboxes of the IPA phonemes suggests they are.
- If your /ɔ/ sounds more like WP's /ɒ/ than like its /ɔ/, and your /ʌ/ sounds more like WP's /ɐ/ than like its /ʌ/, then it could be that either the person who did WP's audio files was pronouncing things slightly too high, or you are pronouncing things slightly lower than is canonical. (Apologies if this is a trivially obvious and unhelpful observation.)
- - -sche (discuss) 00:45, 25 June 2014 (UTC)
- No apologies necessary, I am very much appreciative of both your observations and your opinions on this particular matter. From what you said, I was able to figure that (if we presume that the audio files for the vowel sounds that are used on Wikipedia are spot-on) this is how my vowels work. Areas where I seem to deviate are coloured green:
- My question is, how would I then more accurately transcribe these questionable vowels? Should I transcribe my "/ʊ/" as /ʊ̈ /, or as /ɵ/(there is no audio clip for /ʊ̈ /, so I don't know whether it would be the best representation of my "/ʊ/" or not, but I can say for certain that my "/ʊ/" is very much like the audio clip for /ɵ/; only a tad softer)? I'm already leaning towards transcribing my "/ʌ/" as /ɞ/ if it comes to that, but I am unsure about how I should transcribe the other "green vowels".
- Now, the reason I ask about how I should transcribe these "green vowels" is because someone else listened to my audio clip for Boston on here, and decided it would be more accurate to transcribe my pronunciation of it as [ˈbɒːstɪn] rather than what I was already transcribing it as. The thing is, I can't refute the idea that my "/ɔ/" is more of an /ɒː/, due to what we have just been discussing, and I don't want to consciously lie to people about my pronunciations, so...
- How would you recommend I transcribe my pronunciations on Wiktionary from now on? Should I just stick with how I was already transcribing them because such particular accuracy might not be necessary here, or should I only change my transcription of some of the "green vowels", OR should I change the transcription of all of the "green vowels", and (if so) how do you think I should transcribe them? Tharthan (talk) 18:13, 25 June 2014 (UTC)
- I had the same observation about the /ɒ/ in the sound files, it sounds more closed and I would interpret it as /ɔ/. I don't speak a language variety that contrasts these, but I do contrast Dutch /ɔ/ and /ɑ/ with English /ɒ/. And to me, the sound file is closer to my Dutch /ɔ/ than to my English /ɒ/. —CodeCat 18:24, 25 June 2014 (UTC)
- The way I see it, IPA symbols are more relative than absolute and thus have different absolute locations for each accent of each language. Since the Rhode Island accent (which is mostly what Tharthan speaks) has all three of (using Wikipedia-style phonemes) /ɑː/, /ɒ/, and /ɔː/, it makes sense to use different letters for each of them. I prefer, respectively, [a(ː)], [ɒ(ː)], and [ɔ(ː)]. The second one is rounded or at least semi-rounded, which is why I don't think it should be [ɑ(ː)]; and the third one is more closed than the second, and the next closest symbol is [ɔ(ː)]. --WikiTiki89 18:41, 25 June 2014 (UTC)
- @CodeCat I wasn't aware that Dutch didn't distinguish between /ɔ/ and /ɒ/. Is it a dialectal thing, or is it standard all around? Like, would somebody from Friesland who is speaking Dutch maybe have that phoneme, whilst a Hollander would not (or something like that)? Also, if the audio clip for /ɒ/ actually represents /ɔ/, then what does the audio clip for /ɔ/ represent?
- @Wikitiki89 But wasn't IPA supposed to clear up a lot of that relativity when it comes to transcription? That's the whole reason why I've loved IPA ever since I've first seen it (which was when I was a wee lad of but four and a half years old. I saw it in my father's copy of Oxford, you see). In either case, my "/a/" is much softer (and not all my vowels are soft, by the way. My /æ/ is pretty much identical to that of the voice clip on Wikipedia) than actual /a/, hence why I've transcribed it up till now as /ɑ/ (but perhaps /ɑ̈/ would be a better way to transcribe it?) Either way, how would you recommend I transcribe the vowels marked in green? Tharthan (talk) 20:46, 25 June 2014 (UTC)
- IPA was supposed to be a single set of symbols that can be used for any human language. No system can clear up all the relativity. How you should transcribe the vowels in green depends on the other vowels you distinguish. Can you perhaps re-create this image with the other vowels you use bolded? --WikiTiki89 20:52, 25 June 2014 (UTC)
- @Tharthan: Dutch has /ɔ/ and /ɑ/, but no /ɒ/. I suppose the latter might be an allophone of /ɑ/ for some speakers or in some places in a word, but it's not really contrastive. —CodeCat 21:18, 25 June 2014 (UTC)
@Wikitiki89 Sure. I'll do that tomorrow. It's wicked late at the moment, so I need to get to sleep. @CodeCat Ah. I see... Tharthan (talk) 03:27, 26 June 2014 (UTC)
@Wikitiki89 I have now done what you requested. The vowels that I have that sound exactly the same as Wikipedia's sound clips, and whose transcription I don't question, are marked in red. In addition, another green vowel was added because I forgot to mark it before. So the red vowels and green vowels make up my vowel inventory that I use every day when speaking English. Tharthan (talk) 13:58, 26 June 2014 (UTC)
- Thanks! If I were you, I would continue to transcribe exactly as you have them there, except that you could switch to using [ɐ] for /ʌ/. You can't use [ɞ] as you mentioned above because [ɞ] is the rounded counterpart of [ɜ], and I doubt your pronunciation is close enough to [ɜ] anyway. I really don't know how to tell apart [ɐ] and [ʌ], despite how far apart they seem to be on the chart. Even Russian has a sound that is interchangeably transcribed as either [ɐ] or [ʌ]. You can't use [ɒ] for /ɔ/ because it's already taken. You could also use [ɵ] for /ʊ/ (I know [ɵ] is even used for the "feminine" RP realization of /ʊ/). And just to make sure I understood you correctly, your /e/ and /o/ are meant to represent the first part of the diphthongs /eɪ̯/ and /oʊ̯/, correct? --WikiTiki89 14:17, 26 June 2014 (UTC)
If I can't use /ɒ(ː)/ for my "/ɔ/", then what should I use for it? It clearly sounds different than a real /ɔ/ (which I have before words like "north" [/ɔɹ/] and the like), yet it's not the same as plain /ɒ/ either.
My /o/ is usually /oʊ/. HOWEVER, when pronouncing stressed words like "ohhhhh" or something like that, I do actually pronounce it with a monophthong vowel. This vowel is actually in between /o/ and /o̞/. This could be because (when I was around nine or ten years old) I started studying Japanese, which I think has an /o̞/ sound. Or it could have something to do with my Polish ancestry (I am Irish, Polish, French and Portuguese, as well as about 1% Algonquian, along with maybe a few other things because my father didn't ever know his natural mother [except that he visited her once when he was quite little and remembers that she was blonde] and was instead adopted by his aunt [who was the sister of his father]. The first two of those five [Irish {Some dialect of Irish English was spoken by my Irish ancestors} and Polish] have been the most influential on the idiolect-bund of my family.) Specifically, the fact that my Gran (my mother's mother) spoke Polish (almost as well as my Babci and Dziadziu, but they passed away when my Mom was seven; the same age I was when my Gran passed away), and taught my mother a good several words of Polish herself, along with proper pronunciation, is likely the reason behind why my /ɑ/ sounds far different than an /a/ to me, and why I've always tended to pronounce Japanese words/Japanese loan words ending in "e" with /ɛ/ instead of /e/ (because in my household we grew up saying "na zdrowie" [which we actually usually pronounced as /nɑz‿stroβvijɛ/ or {when speaking very fast} /nɑz‿stroʋyjɛ:/] when we did a toast), and why I have always had both /ɾ/ and /r/ in my phonemic inventory (this is mostly due to the fact that we ate pierogis [which we pronounced /pɛˈɾoːɡi/] a lot). Other things we often said were "dziękuję" (which we pronounced /dʒ‿ɛ̃kujɝ/), and "jak się masz" (which we pronounced /jakʔʃɛmɑʃ/). For reference, by the way, all of my ancestors (whether Irish, Polish, French or Portuguese) initially came to New England around the late 1800s.
The first generation Irish part of my family lived until about 1990 or so.
As is the case with my /o/, my /e/ tends to be a diphthong (much more so than my /o/ tends to be a diphthong, however). Nonetheless, in rare cases I will pronounce a true /e/ in a word. Tharthan (talk) 17:13, 26 June 2014 (UTC)
- Well if you use [ɒ(ː)] for your /ɔː/, then how would it be differentiated from your /ɒ/? At least /ɔɹ/ is differentiable by the /ɹ/. The other option is to start using diacritics, but I think of that as a last resort or for super narrow transcriptions. --WikiTiki89 21:06, 26 June 2014 (UTC)
- Yeah, I suppose you're right. I'd rather not use diacritics for IPA if I don't have to, so I'll just stick to something similar to what you recommended. I think I'll use /ɵ/ for /ʊ/ unless I somehow find an audio clip for /ʊ̈ / (or maybe I'll just stick with /ʊ/ à la what I am doing with using /ʌ/ instead of /ɐ/) Tharthan (talk) 18:11, 27 June 2014 (UTC)
Usexes at приказать долго жить
Since you have not actually created an RFC discussion, I will make my comment here: The usexes are actually just usage examples quoted from a Russian phraseological dictionary. I guess it's possible they are actually quotations of literature, but the phraseological dictionary does not say where they came from, implying in my mind that they are just usage examples. --WikiTiki89 23:24, 26 June 2014 (UTC)
- I don't know whether we actually allow this, but if not, blame Anatoli. --WikiTiki89 23:26, 26 June 2014 (UTC)
- That's my thinking too. The usage examples only look like they're from literature but they may have been made up by the creators of the dictionary. I couldn't find the same texts in Google books. --Anatoli (обсудить/вклад) 23:32, 26 June 2014 (UTC)
- I didn't create a thread on WT:RFC because I figured it was an easy issue to fix and one of the page's watchers would fix it quickly.
I think it would be desirable to either find actual examples of use, or make up our own usexes, or at least cite the dictionary that's being quoted (i.e. cite it in the manner other quoted works are cited in, "Year, Author, Work")... but it's certainly odd to cite another dictionary's usex... - -sche (discuss) 00:08, 27 June 2014 (UTC)
- Well that's why I added the Tolstoy quote. I was too lazy to look for more after that, but I think ideally there should be an old quote and a recent quote each for both senses. --WikiTiki89 00:22, 27 June 2014 (UTC)
- I have changed it to a simpler usage example. Too much hassle! --Anatoli (обсудить/вклад) 02:42, 27 June 2014 (UTC)
Riemen
Hi, thanks for adding the etymological info that I had asked about to [[Riemen]]. Could you add a References section and list the full names of the works by Chudinov, Dahl, and Vasmer? Thanks! —Aɴɢʀ (talk) 13:55, 27 June 2014 (UTC)
- Done; cheers! - -sche (discuss) 15:46, 27 June 2014 (UTC)
I tried to change "Greece" to something else ("Greeks") because the sentence looks a bit odd to me, since, as you know, there was a concept of the polis back then, but no concept of a unified society or state called Greece to which Greeks could be loyal. Could you reword that sentence without mixing state and people, or could you think of an alternative? --Z 18:46, 29 June 2014 (UTC)
- Hmm, what about "to be loyal to Persians rather than Greeks"? - -sche (discuss) 18:51, 29 June 2014 (UTC)
Guess what? Three citations from sources having nothing to do with the subject, and more, from Google Books. Gotcha there! Rædi Stædi Yæti {-skriv til mig-} 02:05, 6 July 2014 (UTC)
TR
What does TR mean as said here? Pass a Method (talk) 14:49, 15 July 2014 (UTC)
User:Rakkalrast added a word in this language to *šan-, but we don't have a language code for it, since it does not have an ISO code. Should we create this language (LINGUIST List gives it the code: lsd-bet) or merge it with Leshana Deni (code: lsd)? --WikiTiki89 15:35, 20 July 2014 (UTC)
- A variety of Neo-Aramaic spoken by ≤ 17 families?... there comes a point, for me at least, when even considering something a dialect as opposed to an idiolect becomes iffy; certainly, I don't see any evidence that we need to handle Betanure as a full language rather than using qualifier tags and one of our existing headers. And Hezy Mutzafi's 2008 work on The Jewish Neo-Aramaic Dialect of Betanure affirms that Betanure is part of the Lishana Deni Jewish Neo-Aramaic dialect cluster. - -sche (discuss) 16:06, 20 July 2014 (UTC)
Proto-Algonquian
I wonder why the Proto-Algonquian words have all those dots in them. Are they morpheme boundaries? —CodeCat 17:16, 21 July 2014 (UTC)
- Placing a mid-dot after a [Proto-Algonquian] vowel is the usual way of indicating that the vowel is long. One also sees colons used for this purpose (presumably approximating IPA /ː/), mostly though not exclusively in works that also lack other "advanced" characters and substitute e.g. "?" and "0" for "ʔ" and "θ". (On rare occasion, I've even seen authors just double the vowel, or use circumflexes as in Fox orthography.) Morpheme boundaries are indicated when necessary by a hyphen, which does mean that reading old messily-printed or poorly-scanned texts sometimes requires one to have enough knowledge to tell whether a long vowel or a morpheme boundary is to be expected in the particular place. This is documented on WT:AALG#Vowels (though perhaps not clearly enough—what do you think?), and will be documented on WT:AAQL once I add info on phonology to that page. - -sche (discuss) 17:58, 21 July 2014 (UTC)
- I wonder why they didn't go with the more common practice of using macrons... —CodeCat 18:12, 21 July 2014 (UTC)
- I don't know. The practice of using mid-dots for PA seems to have originated with Leonard Bloomfield, who first reconstructed PA, and who also used mid-dots for Menominee. Neither language uses any diacritics. - -sche (discuss) 18:33, 21 July 2014 (UTC)
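The notation variants described in this thread (a colon in place of the mid-dot, and "?" and "0" standing in for "ʔ" and "θ") can be folded into the mid-dot convention mechanically. The following is only a rough sketch: the function name `normalize_pa` is hypothetical, the vowel set a/e/i/o is an assumption about the reconstruction, and the sketch does not handle doubled vowels or circumflexes.

```python
import re

MID_DOT = "\u00b7"  # the mid-dot used to mark a long vowel

def normalize_pa(form: str) -> str:
    """Rewrite ASCII stand-ins in a Proto-Algonquian form (hypothetical helper)."""
    # A colon directly after a vowel is taken to mark length (assumed vowels: a, e, i, o).
    form = re.sub(r"(?<=[aeio]):", MID_DOT, form)
    # "?" and "0" are taken to be substitutes for the glottal stop and theta.
    form = form.replace("?", "\u0294").replace("0", "\u03b8")
    # Hyphens are left alone, since they mark morpheme boundaries.
    return form
```

For instance, under these assumptions `normalize_pa("a:-0e")` leaves the hyphen in place and rewrites only the colon and the zero.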
Prakrits in Module:etymology language/data
WT:FSCK spews warnings about these not having a parent; I guess they should be put under pra, pka, pmh or psu. All of these were added by User:DerekWinters in June. Since you are the one who usually maintains our languages lists, what do you think should be done with them? — Keφr 22:55, 23 July 2014 (UTC)
- I've put them under "pra". If anyone wants to make a case for sorting them more narrowly (e.g. under "pka"), they're welcome to. The codes don't seem to be used anywhere, though, and I'm not sure it makes sense to have them in Module:etymology language/data rather than (a) having them in Module:languages/datax or (b) not having them at all. I guess the logic is that they're not different enough from "pka" to be worth giving separate L2s, but they might still be worth mentioning in etymologies (although so far they haven't been)? - -sche (discuss) 00:03, 24 July 2014 (UTC)
I have nominated this template for deletion. I don't see how it is useful for anything that we do, and it seems to create a large number of unattractive and intractable problems. Cheers! bd2412 T 16:44, 24 July 2014 (UTC)
- I admit it's not without its problems. However, I think it does more good than harm, and you may be surprised to learn that the situation prior to the creation of the template was even more unattractive and problematic. I'll comment on RFD with some links to evidence of that, among other things. - -sche (discuss) 19:09, 24 July 2014 (UTC)
Quick question
Is the nonstandard pronunciation of trough a result of an "anti-th fronting" of sorts? Mayhap... a "hypercorrection"? Or did some dialects actually have a /x/ → /θ/ shift?
I remember seeing that there was once a time when some (Middle?) English dialects confused /ʍ/ and /θ/, but I don't recall seeing anything about a dialectal /x/ → /θ/ shift anywhere. Tharthan (talk) 18:39, 25 July 2014 (UTC)
- I don't know. Based on my feeling that /f/ and /θ/ alternate more often than /x/ and /θ/, I would have hazarded a guess that the shift was not an early one of /x/ → /θ/ but rather a later one of /f/ → /θ/. However, William Dwight Whitney, in his 1870 Language and the Study of Language, writes:
- Thus, when the English gave up in pronunciation its palatal spirant—still written in so many of our words with gh—while it usually simply silenced it, prolonging or strengthening, by way of compensation, the preceding vowel, as in light, bough, Hugh, it sometimes substituted the labial spirant f, as in cough, trough; and, in the latter word, a common popular error, doubtless going back to the time of the first abandonment of the proper gh sound, substitutes the lingual spirant th, pronouncing troth.
- Elsewhere I can find a note from 1917 by American linguist Edgar Howard Sturtevant that "until my thirtieth year I pronounced 'trough' as 'trouth'", and confirmation from Peter Davies' 1983 Success with words: a North American guide to the English language that even at that date "trough is [still] sometimes pronounced /trōth/". The comments on this blog post contain some more information on the subject.
- Perhaps someone in the wt:Tea Room can tell you more.
- Incidentally, Merriam-Webster contains the interesting note (using their non-IPA notation) that the term is pronounced "by bakers often /trō/".
- - -sche (discuss) 19:17, 25 July 2014 (UTC)
- Ah, I see then. That word in particular must have been very variable in its pronunciation for a good long while now. Thanks for the explanation. Tharthan (talk) 13:41, 26 July 2014 (UTC)
{{alternative form of}} doesn't even categorise anymore, so there's no problem removing the key. —CodeCat 21:33, 1 August 2014 (UTC)
Mordvin
jade#Etymology 2 mentions the language Mordvin, but we consider Mordvin to be two different languages: Erzya (myv) and Moksha (mdf). Is it worth creating a small language family for Mordvinic languages (probably not)? If not, how can I determine which language was meant in the etymology? --WikiTiki89 19:58, 4 August 2014 (UTC)
- As a first step, I'd declare the term's language to be "und", and say it's from "either Erzya or Moksha". This obviates the need to create a code for Mordvinic (though one could still be created if there happened to be other reasons why it would be useful). Next, knowing that Moksha and Erzya are both written in Cyrillic, I'd test various possible Cyrillic spellings of the term combined if possible with various possible Russian translations, to see if I could find any Russian linguistic texts that mentioned the term — I've been able to verify the identity of some Lak and other Caucasian terms that way.
- PS #1: that reminds me of how useful it would be if we had entries for the Russian abbreviations of various languages' names. I've added some (д.-в.-н.), but I think I stupidly didn't record the Caucasian abbreviations at the time I had them in front of me, even though it took me a while to figure them all out with the help of ru.wikipedia. Maybe I'll go looking for them again; shouldn't be too hard to find them again, and you and Anatoli can help verify what they're abbreviations of.
- PS #2: do you think it's redundant to say "obviates the need" or "obviates the requirement", since obviate already specifies "bypass a requirement" in its definition? I've never been sure... - -sche (discuss) 20:54, 4 August 2014 (UTC)
- After taking a look at the languages' orthographies, the only Cyrillic spelling of al'd'a that makes sense is альдя, for which Google shows several results in some strange language that might be either Moksha or Erzya, or might be something else entirely. I don't know nearly enough about these languages to be able to identify them, and none of the results are dictionaries. RE PS #1: Russian abbreviations always confuse me too. I'm not even sure whether the language abbreviations are standardized enough between dictionaries for it to make sense to add them. RE PS #2: I think the definition is supposed to be "bypass [a requirement]" in other words the requirement (or the word requirement itself) is meant to be the direct object of "obviate". --WikiTiki89 02:12, 5 August 2014 (UTC)
- I checked such spellings as алда, ал'д'а and альдьа after I posted, and I couldn't find anything in a Uralic language, either. Some hits were Kazakh(!).
- Per Thorson's 1936 Anglo-Norse studies: an inquiry into the Scandinavian elements in the modern English dialects, volume 1, derives dialectal English yad / yaad / yaud (used in "Sc Nhb Lakel Yks Lan", which I take to be Scotland, North Humberside?, Lakeland?, Yorkshire, and Lancashire) from Old Norse jalda (dialectal Swedish jäldä), from a Finnish word "elde" (citing "FT p. 319, Torp. p 156 fol."), but says "Eng. jade is not related." Likewise the Saga Book of the Viking Society for Northern Research, page 18, says "There is thus no etymological connection between ME. jāde MnE. jade and ME. jald MnE. dial. yaud etc. But the two words have influenced each other mutually, both formally and semantically." I'll see about expanding jade and yaud with this information. - -sche (discuss) 03:04, 5 August 2014 (UTC)
- One last question, though. Should "Mordvin" be added as an alternative name for both Moksha and Erzya? --WikiTiki89 13:45, 5 August 2014 (UTC)
- Yeah, enough references (especially old ones but even some modern ones) speak of "Mordvin" as a language made up of Moksha and Erzya dialects, rather than as a family, that recording that as an alt name would be helpful. - -sche (discuss) 15:26, 5 August 2014 (UTC)
It's эльде both in Moksha and Erzya. See Имяреков, Мокшанско-русский словарь, 1953, page 124b, and Серебренникова Б. А., Бузакова Р. Н., Мосина М. В. (ред.), Эрзянско-русский словарь, 1993, page 781b. If you can't find a spelling for any Uralic, Altaic or a Caucasian language, ask me, I have a lot of sources. --Vahag (talk) 08:44, 7 August 2014 (UTC)
- Awesome, that's good to know. I knew you had resources on Caucasian languages, but didn't know about Finno-Ugric. I'll add a Moksha section to [[эльде]]. :) - -sche (discuss) 17:47, 7 August 2014 (UTC)
The power of 'and'
We COULD have both fixing AND fixing to be intelligible on their own, something we do with many comparable situations. DCDuring TALK 21:57, 14 August 2014 (UTC)
- I've replied at WT:RFM so as to keep discussion in one place. Cheers! - -sche (discuss) 22:31, 14 August 2014 (UTC)
Attestability of "yellowman"
The search for attestability seems to yield mostly references to a White Jamaican reggae artist. Purplebackpack89 04:41, 22 August 2014 (UTC)
- Thanks for looking. I tried searching for the plural, "yellowmen", and although that turned up some scannos, it also turned up enough valid hits that I've now created yellowman. - -sche (discuss) 21:56, 22 August 2014 (UTC)
- Two is enough? DCDuring TALK 23:55, 24 August 2014 (UTC)
- The search turned up more than the two hits I typed up. CFI doesn't require that citations be typed up and put in entries unless the entries are challenged, but I have typed up a third citation. Incidentally, it also contains "whitemen" and "blackmen". - -sche (discuss) 00:56, 25 August 2014 (UTC)
I gathered these from the book mentioned in the Appendix at a WP edit-a-thon held today at a local library. You provided an etymology for Mamaroneck that was better than that in the book, by Richard Lederer (or his father?). A few of the toponyms in the Appendix (eg, Osceola, Mohegan) are taken from native American tribes not from the immediate area, a few from neighbors on the west side of the Hudson, Connecticut, farther north in New York, or possibly from Long Island, but at least 80% are from tribes that lived in what are now Westchester, Putnam, or Bronx counties. The spellings are the only ones Lederer had. I assume he rejected some for good reason. He seems to have taken many of them from land purchase records of the 17th century. DCDuring TALK 23:51, 24 August 2014 (UTC)
- Oh, neat. I will look over the list and see if I can clarify / expand any of the etymologies. Should I remove placenames from the list once we have entries for them with complete etymologies (as in the case of Ossining), or what? - -sche (discuss) 00:58, 25 August 2014 (UTC)
- Let's keep them as examples of what can be achieved, at least for now.
- Lederer seems to have worked fairly diligently through his sources, which include hundreds of primary documents and secondary works. I didn't see any works in the bibliography that seemed to be specifically books or articles on the native languages themselves, but I ran out of time so I didn't look all that carefully. I'll be able to take a closer look soon. I may also extract the Dutch origin names. The English ones are fairly uninteresting, even to locals.
- Why are Germans so fascinated by native Americans? DCDuring TALK 04:27, 25 August 2014 (UTC)
- BTW, I have the towns there to provide a hint where in the county these places are, in case geography might have a bearing on the language of the toponym. There are a few from the Long Island Sound area, more from Bronx and Yonkers and along the Hudson to Peekskill, and others inland in northern Westchester. HTH. DCDuring TALK 04:33, 25 August 2014 (UTC)
- I'm sure there are books written on that subject. I think it's partly the earlier European Noble Savage myths, combined with the lack of territorial conflicts that might have provided motivation for negative stereotypes, but also just the lure of the exotic and safely far away. Chuck Entz (talk) 05:05, 25 August 2014 (UTC)
- I wonder that myself sometimes.
- If you're asking why so many materials on native American tribes and languages were compiled by Germans, a large part of the answer is prosaic. Germany has long produced large numbers of ethnographers and linguists. A lot of materials on Pacific and African peoples and languages were also compiled by Germans.
- If you're asking why so many non-linguists love "Indian" things ... well, that's Karl May's doing. He bought into and sold others on the romanticized notion Chuck mentions of simple and noble, exotic people living "authentic lives".
- The town names should be helpful. - -sche (discuss) 07:01, 26 August 2014 (UTC)
Neologisms and "Web Words"
Personally, I've always taken "Web words" happily so long as they met certain criteria. I've always been particularly fond of Germanic or otherwise native ones, due to my love of writing "native" poetry.
Anent neologisms... it's been somewhat iffy. I am accepting of some, but not of others. For instance, "selfie" is a term that I never use; opting for the fairer "self-snapshot" or "snapshot of oneself". On the other hand, "troll" (as in the sense of "to bait and wait so as to start trouble" or the like) is one that I have happily accepted with open arms (mayhap due to its origins in angling terminology, though I honestly can't say for sure).
Now, the reason why I bring this up is because it seems that Wiktionary's methods of determining which "web words" and which neologisms are acceptable for inclusion are somewhat murkily composed. Whilst terms like "halgi" are included, others are not. I can't really tell what the "criteria for inclusion" entails sometimes, because it seems a bit vague.
Might you be able to shed some light on this? Tharthan (talk) 17:02, 31 August 2014 (UTC)
- Yeah, numerous discussions have made it apparent that Wiktionary's policy on citing the internet is not as clear as it could be; in particular, it can take a while to unpack the ramifications of the words "durably archived" / "permanently recorded" in WT:CFI. But once those words are unpacked, "web words" and "print words" are subject to the same criteria for inclusion. Words in major languages have to be used, as in "he took a selfie", and not just mentioned, as in "he used the word 'selfie' to describe the picture he took of himself". (Lines like "he took what he called a 'selfie'" fall into a grey area of debatable use-vs-mention-ness.) The uses have to span a year, to weed out fad words that are only popular for a month, like the Russian translation of "pink slime" (which was only somewhat less of a fad in English). And the uses have to be in "durably archived"/"permanently recorded" media.
- What is durably archived? Books, newspapers, journals and magazines are durably archived. (Google Books and Issuu are good ways of using the internet to search through those media.) Websites are not durable, because they go offline (and moreover are edited and reworded) without warning. Even articles on the websites of news organizations can be taken down — a Wikipedia article I just edited discussed a story which was removed at the request of the journalist, allegedly after he was intimidated. Even the Internet Archive, which has been discussed in the past, is not a durable archive, because it removes pages if site owners request that. The only online corpus which is durably archived is Usenet, because it is decentrally archived, and attempts to censor things from it have indeed failed (e.g. someone at one point tried to delete alt.religion.scientology, and failed). This failure of most web sources to be durably archived can make it harder to cite "web words" (cf. this). However, if a web word is attested per those criteria, it can have an entry just like any other attested word.
- Does this clear things up any?
- Note that because of the nature of Wiktionary (it's a work in progress, and it's a wiki anyone can edit in real-time), some unattested words may have entries (you can RFV those), and some attested words may not have entries yet (you can create those). Also note that strings that are analysable as misspellings (e.g. strings like licencise, and probably also uncommon strings from lolcat-speak or doge-speak) may be excluded as such. - -sche (discuss) 23:02, 31 August 2014 (UTC)
- Yes. That clears up a lot. I now have a more adequate understanding of how the process works. Thanks much.
- So citations from Usenet are considered to be among those of the "durably archived" / "permanently recorded" variety? Or, are they only somewhat so, and are thusly taken with a grain of salt? Tharthan (talk) 23:55, 31 August 2014 (UTC)
- Usenet is as durably archived as print media, so a use of a word on Usenet is 'worth' as much as a use of a word in a book. But Usenet is more likely than print media to contain typos/misspellings, so if a string is analysable as a typo/misspelling, and it is only supported by Usenet citations, people may be more likely to analyse it as a misspelling and not an intentional use of a certain spelling/word. (For example, book citations might have done more to convince people of the word-hood of licencise than these Usenet citations did in this discussion.) But even books contain typos: I can't find an example offhand, but in RFV, if a book uses an unusual spelling sometimes and the usual spelling other times, it's usually assumed that the instances of the unusual spelling are typos. And when it's clear that something isn't a typo/misspelling, like "Rightpondian" or the video-game sense of "pull", then it doesn't matter whether the citations come from books or Usenet. There seem to be about 1100 entries that cite Usenet. - -sche (discuss) 01:56, 1 September 2014 (UTC)
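The attestation criteria described in this thread (uses rather than mentions, in durably archived media, spanning a year) can be roughly sketched as a checklist. This is only an illustration under stated assumptions: the `Citation` class and `looks_attested` function are hypothetical names, the three-citation minimum is an assumed default that this thread does not spell out, and years are compared as whole numbers rather than exact dates.

```python
from dataclasses import dataclass

@dataclass
class Citation:
    year: int               # year the citation was published
    durably_archived: bool  # e.g. a book or Usenet post, not an ordinary website
    is_use: bool            # True for a use, False for a mere mention

def looks_attested(citations, min_citations=3, min_span_years=1):
    """Hypothetical check; min_citations=3 is an assumed default."""
    # Only uses in durably archived media count toward attestation.
    usable = [c for c in citations if c.durably_archived and c.is_use]
    if len(usable) < min_citations:
        return False
    # The usable citations must span at least the minimum number of years.
    years = [c.year for c in usable]
    return max(years) - min(years) >= min_span_years
```

A fad word with several citations all from the same year would fail the span check here, and a word cited only from ordinary websites would fail the count check. Grey-area cases such as "he took what he called a 'selfie'" still need human judgement to classify as use or mention.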
I believe this rollback was done in error. The alternate pronunciations that were there were intentional. I intend to restore them. - Gilgamesh (talk) 13:52, 26 September 2014 (UTC)
- I should have undone your edit with a more informative summary, I'm sorry. The pronunciations you added are unattested and dubious, per discussion on Angr's talk page, so I've removed them until such time as evidence of them comes along. Rollback is sometimes/often used as a quick way of undoing edits around here (if the edits are merely felt to make the entry worse, without the implication 'rollback' has on Wikipedia that the edits are vandalism), since Wiktionary's relatively small number of admins tend to be a lot busier than Wikipedia's larger number of admins... but it can tend to cause confusion, like now, when the edit was intentional and in good faith, but still made the entry worse. - -sche (discuss) 14:18, 26 September 2014 (UTC)
- I've started a thread at Wiktionary:Tea_room/2014/September. It's important that this be sorted out, because bowl-bull, cull-coal, etc. have indeed become homophones, and it affects even General American for most people, certainly my age (34) and younger. - Gilgamesh (talk) 14:21, 26 September 2014 (UTC)
Hey, erm...
I would have e-mailed you this or sent this message to you via a more private method if I could have, because I feel posting this here might come off as rude to the person in question (though I do not intend it as such).
User:Angr and I seem to be in disagreement over what should be allowed transcription-wise for a certain word, and we seem to be at a deadlock. As such, I thought that maybe a third party could be brought in so as to maybe give their opinion on the matter.
Now, I don't know really anything about your dialect, -sche, (and I don't mind being blissfully ignorant on that subject, since I think it's irrelevant to most parts of editing on Wiktionary) [though I remember seeing a reference to you at some point being in the Inland-North, though I don't really know the relevance of that] so I don't know where you'd fall anent this matter, but I would honestly hope (and truly do think) that that wouldn't (and shouldn't) matter, considering the argument here is transcription, and any linguist worth their salt knows how to properly transcribe vowel phonemes, and knows the difference between two different phonemes, whether monophthong, diphthong, or otherwise, irrespective of whether or not the vowel phonemes in question occur in his or her dialect.
Now, I firmly trust your knowledge and expertise in this field, hence why I have come to you. I think you may be able to help in settling this issue. So, if you'd be willing to offer your tuppence-worth on this matter, I'd be very grateful.
The aforementioned discussion can be found here: http://en.wiktionary.org/wiki/User_talk:Angr#Affair Tharthan (talk) 16:13, 28 September 2014 (UTC)
- My "e-mail this user" link should be enabled (in the toolbar on the side of this page, a few items below "what links here"); if it's not, let me know. (Not that I check my e-mail with any frequency at all...)
- My "expertise in this field" is amateur compared to Angr's. But since I've been asked, I'll give my thoughts:
- I remember noticing during a previous Tea Room discussion of the M-m-m merger that one of the problems one faces if one wants to transcribe 'marry', 'merry' or 'Mary', or for that matter 'air' or 'ear', is that the IPA doesn't have symbols that denote these sounds perfectly, so one is left using approximate transcriptions. That's not automatically problematic — if a language's "e" sound is actually 15% closer to canonical /ɛ/ than canonical /e/ is, it's fine to nonetheless transcribe it as /e/, or if necessary /e̞/; one needn't invent a whole new letter for it. It does, however, mean that discussions of whether or not sounds are distinct (and discussions of how to transcribe them) are more difficult. For example, according to our entry and dictionary.com, 'merry' is /ˈmɛɹi/ and 'Mary' is /ˈmɛəɹi/ for speakers who don't have the M-m-m merger, while both are /ˈmɛɹi/ for speakers who do. However, both our audio clips and dictionary.com's contain a vowel that is distinct from the /ɛ/ in 'bet' (i.e., the Vr sequences in the audio clips aren't just /ɛ/ followed by /ɹ/). That means that someone who was trying to figure out whether her pronunciation of 'Mary' used /ɛə/ or /ɛ/ would run into trouble if she tried pronouncing 'Mary' and then pronouncing words with /ɛ/ in them like 'bet' to see if she used the same vowel in both — she'd probably conclude that she didn't use the same vowel for the two words, even if the vowel she used in 'Mary' was the one we transcribe as /ɛ/.
- However, setting that issue to the side...
- According to our entry and dictionary.com, 'air' is /ɛəɹ/*, with the same vowel as unmerged 'Mary'. Our audio clip is curt and sounds like it contains only a single (non-diphthong) vowel, but dictionary.com's has more of a /ə/. Likewise, 'affair' is /əˈfɛəɹ/ per dictionary.com, and the vowel in the audio file is the same as the vowel (diphthong) in dictionary.com's 'Mary' audio file.
- That means it would be reasonable to transcribe the sound as /ɛəɹ/ (or /ɛɚ/, which is synonymous) for (some) American accents. But is /ɛɹ/ wrong? Well, is there an American accent that contrasts /ɛɹ/ and /ɛəɹ/ in this (non-intervocalic) context? If not, then the worst one can say is that /ɛɹ/ is potentially confusing, but as long as there's a page explaining how the symbols are used, it's not wrong, and it's possibly not even any more confusing (or any less accurate) than our use of /ɛ/ to mean one thing in merry and another in bet.
- Merriam-Webster and Random House use the same transcription for 'merry' and 'affair', but also for 'Mary' (apparently they treat the M-m-m merger as standard). The various dictionaries that make up thefreedictionary.com transcribe 'merry' and 'affair' differently.
- You can raise the issue in the Tea Room for broader discussion if you think the default transcription of the 'air'/'affair' sound should be switched from /ɛɹ/ to /ɛəɹ/ (or /ɛɚ/). I have no strong preference, since I don't think either transcription is ideal (I don't think there is any ideal transcription of the sound).
- Note that transcribing 'air' (and 'affair', etc) narrowly, in square brackets, as [ɛəɹ] or [ɛɚ] is another matter entirely, and probably a lot more straightforward.
- (* Our entry also lists /ɛːɹ/ as a possible US pronunciation of 'air', but this is suspect, since vowel length is not phonemic in American English. Actually, that's another case where a small distinction is glossed over and one symbol is used for two slightly different but non-contrastive things: /i/, /u/, etc is longer in some words than in others in American English, but they're not distinguished as having /i/ vs /iː/ because vowel length is not actually contrastive.)
- - -sche (discuss) 09:02, 29 September 2014 (UTC)
- Oh, you're right. I didn't notice that on the sidebar.
- I actually agree with you there, because I initially transcribed the /ɛə/ vowel as /e/, because that's how my mind thought of it (this might be due to plain /ɛ/ indeed being a plain /ɛ/ in my dialect, whilst /ɛə/ is more of an /ɛ̝ə/ in my dialect). Nevertheless, I agreed that the sound was far closer to /ɛə/ than /e/, so I changed my transcription practices accordingly.
- Then is it fine to list both pronunciations /ɛɚ/ and /ɛɹ/? You're right to say that there is probably no English dialect that contrasts /ɛɹ/ and /ɛɚ/ (my non-mMm merger dialect doesn't, since, as far as I know, /ɛɹ/ doesn't end any word in the language [with the possible exception of "err", as I mentioned on Angr's talk page]), but it's still better to list both pronunciations /ɛɚ/ and /ɛɹ/ than to list just /ɛɹ/ and have people say "Wait a minute... "affair" has the same vowel as "fairy", which is /ɛɚ/ for me in my non-mMm merger dialect, but yet the only pronunciation listed here is /ɛɹ/. Am I wrong in pronouncing it /ɛɚ/?" Furthermore, it couldn't do any harm to have both pronunciations listed. So could we at least have both /ɛɚ/ and /ɛɹ/ pronunciations given for affair? Tharthan (talk) 11:03, 29 September 2014 (UTC)
bear
Thanks for resolving the mini-contretemps at "bear"... AnonMoos (talk) 17:17, 29 September 2014 (UTC)
Clarification
I hadn't realized I had accidentally put words in the wrong category. Thanks for the heads up. — LlywelynII 23:50, 30 September 2014 (UTC)
- Discussion moved to Talk:lebendig.
Lewis and Clark
I've borrowed Lewis and Clark: Pioneering Naturalists, which has two appendices of plants and animals "discovered" by Lewis and Clark. For my purposes the listed species name(s) and vernacular names are of greatest interest. The appendices don't have non-English names. But the discoveries have references to the volume and page in Thwaites's edition and most have a date and location for the discovery. Have you already mined Lewis and Clark for native names? Do you intend to do so? Are there other sources for that? DCDuring TALK 20:15, 2 November 2014 (UTC)
- I've only 'spot mined' Lewis and Clark, i.e. when Google Books let me know that a page of their journals mentioned pasheco, I checked the surrounding pages for other native / native-derived words. I haven't mined the whole work. If you'd like me to (try to) find and add native names for any of the species or vernacular names you add, I'll see what I can do. I've been rather distracted from my Native American word documentation project. - -sche (discuss) 01:40, 24 November 2014 (UTC)
Geology
I've started a page User:DCDuring/Geology and copied your items there, as well as a WP table. It suggests some lines for improving our entries as well as showing redlinks. I also came across the Geowhen Database, which is a convenient source of confirmation of the meaning of some of these terms. DCDuring TALK 17:01, 3 November 2014 (UTC)
- Just so you know, although I haven't had much time for editing lately, I'm still available to help with geological terms, as I have some training in the field. If you leave me a message on my talkpage or tag me in relation to any issue you have when adding geological jargon or etymologies thereof, I'll be sure to respond. —Μετάknowledgediscuss/deeds 20:57, 3 November 2014 (UTC)
- If I can find the time, I'll check out which terms are (a) most-linked to within Wiktionary or, probably more usefully, Wikipedia (I wonder if there's a toolserver/wmflabs tool that does that), and/or (b) most common in ngrams. It would make sense to tackle those first. - -sche (discuss) 01:40, 24 November 2014 (UTC)
Hi. I saw you reverted some of my edits on this word. I was mistaken to change the etymology in the way I did. I thought the theory of its deriving from Slavic was outdated, so I put that one into a "postscript". I've since seen that Kluge is also of this opinion and I was about to make that revert myself. -- As to the quotation I deleted, I just think that it misleads people to believe the word is obsolete and there are no more current quotations to be found. I don't think such quotations are very useful, but I will refrain from deleting them from now on. Sorry! And best regards! Kolmiel (talk) 00:20, 4 November 2014 (UTC)
- Yeah, and I made a little edit on the wording of your version, because I thought it might suggest that German Schmetten is from English (which of course you didn't intend). Kolmiel (talk) 00:22, 4 November 2014 (UTC)
think of the children
Hi there -sche, you had previously pitched in and helpfully formatted an entry I improved, Streisand effect, as Word of the day.
Equinox (talk • contribs) created the entry on think of the children and I recently improved it.
I nominated it at Wiktionary:Word of the day/Nominations, however Ungoliant MMDCCLXIV (talk • contribs) mentioned at user talk:Equinox that unfortunately these days most of those that appear on the Main Page are recycled entries from prior years because it's pretty inactive.
I was wondering if you could add it to one of the upcoming dates for Word of the day?
Thank you,
-- Cirt (talk) 20:57, 5 November 2014 (UTC)
- I was able to get help from others, but thanks for your time. :) -- Cirt (talk) 18:56, 16 November 2014 (UTC)
- I'm glad someone helped you, and glad a new word will be featured. I'm sorry I didn't respond sooner. Perhaps over the upcoming holidays, when people have time off from work and school, someone will have time to set a bunch more Words of the Day. 01:40, 24 November 2014 (UTC)
Non-Oxford British English standard spelling
Why put this at all? The fact that Oxford University Press uses the z spelling has nothing to do with the usage of the word. But I know you must have some reason for putting it in. What is it? Renard Migrant (talk) 21:12, 15 November 2014 (UTC)
- Hi; sorry for not responding sooner. It seemed like the best way of distinguishing the two British spellings. Everyone (in Britain) spells flavour the same, but with something like actuali[sibilant]e, some Brits (most noticeably those affiliated with the OUP) spell it actualize, while many others spell it actualise. As I mentioned to an IP on Stephen's talk page, there have been a few discussions of how to describe the spellings that are used by British people, and other people throughout the Commonwealth, and all of the wordings have problems. Calling the spelling Oxford uses "Oxford British", and the other by elimination "non-Oxford", seemed best to me, but I'm open to being persuaded that another wording would be better. - -sche (discuss) 01:56, 24 November 2014 (UTC)
Archiv(e)s
Re diff: I do think this is "more worthy of an 'uncommon' label than other -es genitives vs -s ones", because Archives really is virtually unknown in any German written in the past 175 years. That's why I wanted to label it "archaic", but the anon changed it to "rarer" because of a single cite on b.g.c from 2006 (which I think is simply a mistake on the author's part, but I can't prove it). —Aɴɢʀ (talk) 21:38, 17 December 2014 (UTC)
- As the user points out on WT:RFD, there are more modern cites than just the one in the entry. And ngram data for both eines Archivs vs eines Archives and the compound Staatsarchivs vs Staatsarchives show that the -es version is still about half as common now as it was in the past (i.e. there does not seem to have been any sharp drop-off in usage), and it is about 1/25th as common in the modern era as the -s version, which is not an unusual ratio for an -es vs an -s form. Compare how, in the other direction, Geschäftsfreunds is now about 1/25th as common as Geschäftsfreundes, and Jubiläumsjahrs is about 1/15th as common as Jubiläumsjahres. (Those are two of the words the Duden cites in explaining how euphony helps decide which genitive ending to use.) - -sche (discuss) 22:39, 17 December 2014 (UTC)
- But those are compounds, which are always skewed toward using the e-less form (eines Hofes is 15× more common than eines Hofs, but Hauptbahnhofes is only half as common as Hauptbahnhofs). The fact that Archiv isn't a compound would lead us to expect Archives to be more common than Archivs, not 25× rarer. —Aɴɢʀ (talk) 23:37, 17 December 2014 (UTC)
I'd be shocked if you found this as the imperfect subjunctive is a literary tense and fucker is new and extremely informal. Previous discussions have been favourable to creating all hypothetical verb forms because RFVing them would be a monstrously time consuming issue. See for example défragmentassions and the definition of défragmenter. Renard Migrant (talk) 20:36, 24 December 2014 (UTC)