Wiktionary:Votes/pl-2022-06/Disallowing typos as misspelling entries

Disallowing typos as misspelling entries

Voting on: Changing WT:CFI#Spellings as follows:

Current text:

Spellings

Misspellings, common misspellings and variant spellings:[1] Rare misspellings should be excluded while common misspellings should be included.[2] There is no simple hard and fast rule, particularly in English, for determining whether a particular spelling is “correct”. Published grammars and style guides can be useful in that regard, as can statistics concerning the prevalence of various forms.

Most simple typos are much rarer than the most frequent spellings. Some words, however, are frequently misspelled. For example, occurred is often spelled with only one c or only one r, but only occurred is considered correct.

It is important to remember that most languages, including English, do not have an academy to establish rules of usage, and thus may be prone to uncertain spellings. This problem is less frequent, though not unknown, in languages such as Spanish where spelling may have legal support in some countries.

Regional or historical variations are not misspellings. For example, there are well-known differences between British and American spelling. A spelling considered incorrect in one region may not occur at all in another, and may even dominate in yet another.

Combining characters (like this) should exist as main-namespace redirects to their non-combining forms (like this) if the latter exist.[3]

Proposed text:

Spellings

All attested spellings except for some misspellings (see below for more) should be included on Wiktionary. There is no simple hard and fast rule, particularly in English, for determining which category (correct spellings, misspellings, variant spellings) a specific spelling belongs to. Published dictionaries, grammars, style guides, and statistics can be useful guides in this regard but they are not necessarily binding.

Misspellings

Only common misspellings should be included.[1] For example, occurence with one r is a common misspelling of occurrence. This also applies to intentional misspellings.

Typos are words whose spelling comes about by an accident of typing or type-setting, without the intention of the writer. Typos should not be included, not even if they are relatively frequent. When a misspelling only differs from a correct spelling by one character, especially if the alternate characters are adjacent on the keyboard layout that the writer has likely used, this is a good indication that the misspelling may be a typo. Typos can often also be recognized as such by the fact that the term also occurs in the correct spelling in the same work. Conversely, when an author consistently uses a misspelled form, it is a sign that the misspelling is not merely a typo.

Variant spellings

Regional or historical variations are not misspellings. For example, in English, there are well-known differences between British and American spelling, such as color (US) versus colour (UK). Both should be included. And musick, now archaic, was once the most common way to spell music. A spelling considered incorrect in one region may not occur at all in another, and may even dominate in yet another.

Combining characters

Combining characters (like the combining acute accent) should exist as main-namespace redirects to their non-combining forms (like the plain acute accent) if the latter exist.[2]

Schedule:

Discussion:

Support

  1.   Support as the proposer. — Fytcha T | L | C 00:34, 21 June 2022 (UTC)
  2.   Support - Sarilho1 (talk) 09:02, 21 June 2022 (UTC)
  3.   Support  --Lambiam 09:49, 21 June 2022 (UTC)
  4.   Support sounds like a good idea, and that some typo entries will be axed. Graeme Bartlett (talk) 11:54, 22 June 2022 (UTC)
  5.   Strong support - I like establishing the precedent that it is the intention of the writer that matters. The one part that I don't agree with is the section on how to deal with combining characters, but it is the same as the status quo so I won't take issue with it here. Theknightwho (talk) 12:19, 22 June 2022 (UTC)
  6.   Support Sgconlaw (talk) 15:46, 22 June 2022 (UTC)
  7.   Support Equinox 16:04, 24 June 2022 (UTC)
  8.   Support Granger (talk · contribs) 20:50, 24 June 2022 (UTC)
  9.   Support Numberguy6 (talk) 17:33, 25 June 2022 (UTC)
  10.   Support. Imetsia (talk) 20:44, 25 June 2022 (UTC)
  11.   Support In practice, they mostly don't survive RFD. —Svārtava (talk) • 04:04, 26 June 2022 (UTC)
  12.   Support. PUC 12:34, 26 June 2022 (UTC)
  13.   Support. Pablussky (talk) 14:06, 26 June 2022 (UTC)
  14.   Support. There are an open-ended number of typos. Not beneficial IMO to try to include them. BTW, in the talk leading up to this vote, about "scannos", scanned docs may be copy-pasted into new documents, with the scannos retained. (I just fixed a bunch on WS.) Hopefully we either won't use such docs, or will exclude the scannos as typos.
    And yes, the sectioning makes this guideline much easier to follow. kwami (talk) 00:22, 28 June 2022 (UTC)
  15.   Support Benwing2 (talk) 02:25, 28 June 2022 (UTC)
  16.   Support, and would like all misspellings out of the main namespace. Thank you ‑‑Sarri.greek  I 09:43, 28 June 2022 (UTC)
    Deliberate misspellings and very common errors should be kept, I think. Theknightwho (talk) 21:40, 30 June 2022 (UTC)
  17.   Support brittletheories (talk) 17:16, 30 June 2022 (UTC)
  18.   Support Fenakhay (حيطي · مساهماتي) 01:59, 3 July 2022 (UTC)
  19.   Support - TheDaveRoss 12:15, 5 July 2022 (UTC)
  20.   Support The Editor's Apprentice (talk) 23:29, 9 July 2022 (UTC)
  21.   Support Graham11 (talk) 01:26, 10 July 2022 (UTC)
  22.   Support Thank you for bringing this forward. John Cross (talk) 06:22, 10 July 2022 (UTC)
  23.   Support Ffffrr (talk) 20:46, 10 July 2022 (UTC)
  24.   Support PseudoSkull (talk) 14:11, 12 July 2022 (UTC)
  25.   Support Vininn126 (talk) 15:54, 14 July 2022 (UTC)
  26.   Support Eru·tuon 20:26, 20 July 2022 (UTC)

Oppose

  1.   Oppose One, the main reason we include entries for misspellings at all is to help people search for the correct spelling (as reflected by the guidance in WT:Misspellings); this is as useful for frequent typos as it is for the ordinary sort of misspelling. Two, although the author of a typo may well know what they intended to type, this does not necessarily hold true for the reader, especially for shorter words or non-native speakers (the latter may well not be able to recognize that [typo] is a typo of [non-typo] rather than a correct spelling and typing of a slightly-different word, even if [non-typo] occurs frequently in the same passage as [typo]); when this is the case, including entries for frequent typos allows the reader to know what word the author meant to type (and potentially even that the author meant to type something different from what they actually typed). Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 18:18, 22 June 2022 (UTC)
    @Whoop whoop pull up: If you search for "ciaplatin" on Wiktionary and then click on either of the search engine names where it says "Or, try searching the site using Google, DuckDuckGo", it will show the desired entry as the first result. I don't think it is realistic to posit that, in the case of somebody looking up a typo on Wiktionary to no avail, the thought of using a search engine will at no point cross their mind. And further, while I acknowledge that we are hereby relying on third parties, the solution is to finally fix the horrible MediaWiki search function to allow for this kind of fuzziness, not to include and maintain an unbounded number of non-words. — Fytcha T | L | C 18:49, 22 June 2022 (UTC)
    Fuzziness is possible in MediaWiki searches. Searching [word]~n will give results with up to n changed characters. Theknightwho (talk) 12:26, 24 June 2022 (UTC)
    The template {{misspelling}} has been created specifically for this purpose. For example, if a user searches for naxalone, they will see the result naloxone with the clarification “Sometimes misspelled naxalone”. This allows us to enable searching also for less frequent misspellings (or frequent typos) without turning them into an entry.  --Lambiam 22:19, 22 June 2022 (UTC)
    That isn’t the kind of misspelling we are looking to exclude. In fact, that is precisely the kind of misspelling we want to include. Theknightwho (talk) 17:38, 25 June 2022 (UTC)
  2.   Oppose per User:Whoop whoop pull up. Old Man Consequences (talk) 11:26, 26 June 2022 (UTC)
  3.   Oppose I do not see a concrete way of telling whether the author implied the misspelling or did not. Thus, a misspelling is technically just an alternative spelling. It is not Wiktionary's job to decide whether an existing word is correct or not. GareginRA (talk) 16:59, 8 July 2022 (UTC)
    One can tell, reasonably well, by checking if the same author usually used the standard spelling, and only once used the wrong spelling: that's a typo. Or we can make an accurate guess based on our own collective expertise in the language. There is a reason why the OED would not include an entry for mispeling. Equinox 17:04, 8 July 2022 (UTC)
    One can often tell through knowledge of things like phonics and keyboard layout. A spelling like prinsiple for principle might be a misspelling, because it's phonetically motivated, whereas prinicple or ptinciple are more likely typos, because they are plausible slips of the finger but phonically incoherent. —Granger (talk · contribs) 21:00, 8 July 2022 (UTC)
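
The two checks described in the replies above (keyboard-motivated slips versus phonetically motivated spellings, and how often each form occurs in the same work) can be sketched roughly in code. This is only an illustration under stated assumptions, not part of the proposal or of any Wiktionary tooling: it assumes a US QWERTY layout with letters only, the function names are hypothetical, and it does not model omitted or inserted letters.

```python
import re
from collections import Counter

# Assumption: a US QWERTY layout, letters only. Each row is shifted right
# by a fraction of a key width to approximate the physical stagger.
QWERTY_ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]
KEY_POS = {
    ch: (row, offset + col)
    for row, (keys, offset) in enumerate(QWERTY_ROWS)
    for col, ch in enumerate(keys)
}


def keys_adjacent(a, b):
    """True if two different letters sit next to each other on the layout."""
    if a == b or a not in KEY_POS or b not in KEY_POS:
        return False
    (r1, x1), (r2, x2) = KEY_POS[a], KEY_POS[b]
    if r1 == r2:
        return abs(x1 - x2) == 1
    return abs(r1 - r2) == 1 and abs(x1 - x2) < 1


def classify_slip(variant, standard):
    """Name a likely slip of the finger, or return None if none is recognized."""
    if len(variant) != len(standard):
        return None  # omissions and insertions are not modelled in this sketch
    diffs = [i for i, (a, b) in enumerate(zip(variant, standard)) if a != b]
    if len(diffs) == 1 and keys_adjacent(variant[diffs[0]], standard[diffs[0]]):
        return "adjacent-key substitution"
    if len(diffs) == 2 and diffs[1] == diffs[0] + 1:
        i, j = diffs
        if variant[i] == standard[j] and variant[j] == standard[i]:
            return "transposition"
    return None


def same_work_counts(text, variant, standard):
    """Count how often each form occurs within a single work."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return words[variant], words[standard]
```

Under these assumptions, prinicple and aqcuire are classified as transpositions and ptinciple as an adjacent-key substitution, while prinsiple matches no slip pattern, which is consistent with reading it as a phonetically motivated misspelling. A result of None does not show that a form is not a typo, and a single occurrence next to many standard spellings in the same work is only supporting evidence; the final judgment remains editorial.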

Abstain

  1.   Abstain for now. Apologies for not bringing this up earlier, but "not even if they are relatively frequent." confuses me a bit as wouldn't common/frequent typos become general misspellings? Maybe I'm lost on that part. AG202 (talk) 02:49, 21 June 2022 (UTC)
    @AG202: No, frequency is not relevant when discriminating between typos and non-typo misspellings. What matters is whether the typist's intention aligns with what is found in the work: if they intended to write complete but by an accident the work then contained cojplete, that is a typo; if they intended to write occurence and then the work also contained that, that is a misspelling. — Fytcha T | L | C 10:00, 21 June 2022 (UTC)
    If it can be shown that these common uses are no longer typos – such as by the fact that several writers use them consistently – they have become common misspellings. But misspellings that arise by accident, such as fat finger errors, remain typos. A good example is the very frequent misspelling errror.[1] There is no way the authors meant to type this, and it just survived the proofreading, so this is clearly a typographical terrror not worthy of inclusion.  --Lambiam 10:04, 21 June 2022 (UTC)

Comments

  • Comment I am not sure how to vote. If a misspelling occurs with significant abundance before 1873 and QWERTY, would that be a disproof of typo status? See Citations:aqcuire where I am assembling a few cites, including one (so far) from 1795. I will look for more from before 1873. Also, on the basis of the cites I found, I took the liberty of making aqcuire a redirect page. --Geographyinitiative (talk) 00:29, 21 June 2022 (UTC)
    @Geographyinitiative: Typos have also occurred in typeset texts but the nature of the typo is generally different (typeset has relatively more missing letters, typewriter has relatively more transpositions). I want to point out however that at least the Bertrand Russell: Philosopher of the Century citation that you've provided is almost certainly a typo as the typist used the incorrect qc form only once whereas they used the correct cq form over 5 times. I also want to add that Talk:aqcuire was deleted mainly because it doesn't appear to be a common misspelling, a clause that will be left untouched by this vote. — Fytcha T | L | C 01:01, 21 June 2022 (UTC)
    The same holds for the other citations. The book by Rollin has eight occurrences of acquire next to a single one of aqcuire. The railroad article has Acquire in the title, and many occurrences of acquisition. The typo in the Kokoschka letter was almost certainly introduced by the typesetter; compare the text on p. 15 here. Even if mistyped by the artist himself, the letter has two occurrences of acquainted.  --Lambiam 09:47, 21 June 2022 (UTC)
  • Comment Although I oppose disallowing typos (see my vote above for why), I do support subsectioning and reformatting the current version of the misspelling CFI in a manner similar to that in the proposal. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 18:20, 22 June 2022 (UTC)
  • Comment: I'm a bit late to the party, but how should one determine what the intent of an author was when a language's corpus is relatively small? I currently work with a language that has a complete corpus of approximately 30 books, most of which are re-written in a newer orthography (so, the actual number of original texts is around 16), and that's a pretty large corpus when compared to languages worldwide. Seems like intent would be quite difficult to define in such circumstances... Thadh (talk) 22:38, 2 July 2022 (UTC)
    @Thadh: While the proposed text gives some pointers, it does not provide any conclusive rule with which to divine the intentions of the typist (which is the only thing that matters at the end of the day). Only once the legwork of distinguishing has been done does the vote actually come into play. So your question is technically out of the scope of this vote. However, in practice we should probably assume nothing to be a typo until convinced otherwise: innocent until proven guilty. — Fytcha T | L | C 13:08, 3 July 2022 (UTC)
  • Comment for pages of 'Misspellings' in all wiktionaries _1. (having in mind second language learners): can misspellings and misconstructions get out of the SearchBox? Perhaps a distinct namespace? Especially in wiktionaries of languages we do not understand, one can never guess if the PoS.title or description is for an 'alternative spelling' or a 'misspelling'. Probably all? people are expected to understand English and the "misspelling of". In analogy, who can recognize 的拼寫錯誤? Billions of people can, but not all. Should there be an interwiki coordination of how to present errors?
    _2. At section 'Misspellings', could we have a reference to w:Error (linguistics)#Difference between error and mistake?
    _3. Example: Can "occurence" perhaps get out of the Category:English non-lemma forms? The misspellings are discussed at occurrence, but the double rr is not explained. I think that this is not a mistake like a typo, but an error of misconstruction. Thank you very much ‑‑Sarri.greek  I 16:41, 3 July 2022 (UTC)

Decision

Passed: 26-3-1 — Fytcha T | L | C 00:10, 21 July 2022 (UTC)