Wiktionary talk:Votes/pl-2022-07/Stubifying alternative forms
Derived and descendants from the stub
I welcome the proposal for the most part, but I have some reservations about where to list derived terms and descendants. For instance, Gronelândia is listed as the European Portuguese alternative form of Groenlândia. From the first is derived the adjective gronelandês, while the second yields groenlandês. I would find it very weird to see gronelandês listed as a derived term of Groenlândia. The proposal doesn't mention these subsections, but I think a likely implication would be that they would all fall under the so-called full article. Can't we also make an exception to list derived terms and descendants in the stub when it is clear that they came from the stub? - Sarilho1 (talk) 10:15, 5 July 2022 (UTC)
- @Sarilho1: Thanks for providing this practical example. In the first version of the vote, I actually put derived and related terms in the second section ("included if they differ") but I decided to remove them because I thought they should be listed in the main article. Your example shed new light on that decision for me, though I personally still think having a derived term in Gronelândia is unexpected and very easy to miss. If I were on the article Groenlândia it wouldn't occur to me to check the alt-form for more derived terms. I think I'll include them again for now, let's see if somebody thinks we should do otherwise. — Fytcha〈 T | L | C 〉 11:26, 5 July 2022 (UTC)
- Visibility is indeed a good point, though. So although I don't really like the solution, I wouldn't be too opposed to keeping the derived terms or descendants sections in a single place. However, I would still like to present some possible alternatives. The easiest one is to include a pointer to the stub indicating that the user could find more derived terms/descendants there (similar to Template:see desc). This could also be an option for the pronunciation problem from below. Another one would be to transclude the terms with a function similar to Template:desctree. Thus, we wouldn't lose (much) visibility and still have the derived terms and descendants properly sorted. - Sarilho1 (talk) 11:41, 5 July 2022 (UTC)
- @Sarilho1: Thinking about it again, maybe adding the descendants to the list of exceptions is not a good idea. The current architecture with {{desc|alts=1}}/{{desctree}} doesn't seem to mesh well with that. {{desctree}} assumes all alt-forms to be present in the main descendant article and it only incorporates the recursive descendants found within that article. So if we split the descendants to the different alt-forms, then, while the alt-forms are found by {{desctree}}, their descendants are not.
- The derived terms I need to think about more; I have tentatively included them in the exceptions. Transcluding alt-forms' derived terms on the main article (with a nice little separator à la "derived terms of alternative form such-and-such") sounds like an idea, if technologically possible. — Fytcha〈 T | L | C 〉 11:54, 5 July 2022 (UTC)
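The {{desctree}} limitation described in this thread can be sketched with a toy Python model. This is not the module behind the template, only an illustration of the behaviour as Fytcha describes it: the tree is built from the descendants listed in the main entry, and alt-forms are shown but their own pages are not searched for descendants. The `entries` structure and all entry names are hypothetical.

```python
# Toy model (an assumption, not the real template code): each entry lists
# its alternative forms and the descendants recorded on its own page.
entries = {
    "main-form": {"alts": ["alt-form"], "descendants": ["desc-a"]},
    "alt-form": {"alts": [], "descendants": ["desc-b"]},
    "desc-a": {"alts": [], "descendants": []},
    "desc-b": {"alts": [], "descendants": []},
}


def desctree(entries, term):
    """Build a descendant tree for `term`, following only the descendants
    listed on each visited page. Alt-forms are recorded but not visited."""
    entry = entries.get(term, {"alts": [], "descendants": []})
    return {
        "term": term,
        "alts": list(entry["alts"]),
        "descendants": [desctree(entries, d) for d in entry["descendants"]],
    }


def flatten(tree):
    """List every term that appears anywhere in the tree."""
    found = [tree["term"]] + tree["alts"]
    for child in tree["descendants"]:
        found.extend(flatten(child))
    return found


found = flatten(desctree(entries, "main-form"))
print(found)
```

In this model the alt-form itself is listed, but `desc-b`, which is recorded only on the alt-form's page, never reaches the tree. That is the loss that splitting descendants across alt-form stubs would cause under the stated assumption.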
Pronunciations
I'm opening a second topic because I think these should be two different discussions. Currently the policy I follow with Portuguese entries is to only list the pronunciations compatible with the spelling of the word. For instance, Arménia contains the European Portuguese pronunciation, while Armênia contains the Brazilian one. Similarly, agourar and agoirar contain different pronunciations, each with their own diphthong. The only exceptions are cases like requisito, which contains two pronunciations for European Portuguese, while the misspelling/eye dialect requesito contains a single one. I would like to know what we should expect the policy to be if the stubification comes to pass. - Sarilho1 (talk) 10:25, 5 July 2022 (UTC)
- @Sarilho1: If I understand this correctly, requisito is the correct spelling for both variants but requesito is the eye dialect only for European Portuguese? In that case, nothing has to be changed because the IPA differs between the two articles (being more restricted / containing less is also differing) and thus may be included per the vote's text ("that which differs and only applies to the alternative form"). — Fytcha〈 T | L | C 〉 11:44, 5 July 2022 (UTC)
- Indeed. My question concerned mostly the first two cases where the pronunciations are incompatible with the spellings. - Sarilho1 (talk) 11:48, 5 July 2022 (UTC)
- @Sarilho1: Arménia and Armênia's pronunciation sections will be left untouched by this vote, seeing that they differ. I only want to eliminate those that are identical (so that nobody can copy-paste the pronunciation section from naïveté to naiveté for instance). — Fytcha〈 T | L | C 〉 11:58, 5 July 2022 (UTC)
- Ok, that clears it. - Sarilho1 (talk) 11:59, 5 July 2022 (UTC)
Form-of entries
Devil's advocate: why shouldn't these rules apply to all form-of entries, not just alternative forms? Non-lemma forms obviously won't have inflection headers, and presumably not Description or Glyph origin either, but other than that, it seems plausible enough. This, that and the other (talk) 12:01, 5 July 2022 (UTC)
- Some of the form-of entries allow for extra senses, though, such as those using Template:diminutive of. For instance, see carrinho. - Sarilho1 (talk) 12:08, 5 July 2022 (UTC)
- I have some sympathies for that actually and funnily enough I was just about to start a BP discussion asking where our policy for forbidding nyms in inflected forms is because I couldn't find it. I would draw a line at {{synonym of}}, {{diminutive of}}, {{clipping of}}, {{ellipsis of}} etc. however, allowing them to be full articles because they are separate terms and not mere variations. I am not sure however if I want to extend this vote to inflected forms because there are some slight variations in the headers that should be allowed for them (e.g. References / Further reading don't belong in inflected form entries IMO). — Fytcha〈 T | L | C 〉 12:12, 5 July 2022 (UTC)
- Yes, there are some nuances now that you mention it. Certainly {{synonym of}} is a bit of an oddball one. This, that and the other (talk) 12:17, 5 July 2022 (UTC)
This proposal effectively bans this template in the entries to which it applies. Intentional or not? This, that and the other (talk) 12:11, 5 July 2022 (UTC)
- Yes. Alternative forms of lemmas are themselves lemmas. naiveté is in its dictionary form and therefore also rightly part of Category:English lemmas. — Fytcha〈 T | L | C 〉 12:14, 5 July 2022 (UTC)
- Heh it was a silly question, wasn't it. Let me think about what I actually meant to ask, and come back to you. This, that and the other (talk) 12:34, 5 July 2022 (UTC)
- @Fytcha: But alternative forms of non-lemmas are non-lemmas!
- Additionally, redirecting text may look better than empty space under empty Etymology n headings. --RichardW57m (talk) 12:39, 28 July 2022 (UTC)
Etymologies
What if some form is archaic, but you can trace the various forms back to specific dates? Would that all be in the etymology section of the main entry? E.g. biszkopt and biszkokt Vininn126 (talk) 20:58, 5 July 2022 (UTC)
- On that note, what about alt forms where one is an unadapted borrowing and the other is not? Would there not be use in having the unadapted borrowing category? E.g. biznes and business Vininn126 (talk) 10:54, 6 July 2022 (UTC)
- The etymology section is one of the sections that can be included in the stub, provided it is different from the main entry so there is no necessity to have all in the main entry. - Sarilho1 (talk) 11:12, 6 July 2022 (UTC)
- Ah, I must have missed that, thanks. Vininn126 (talk) 11:13, 6 July 2022 (UTC)
- I think this is actually unclear. If I'm interpreting the footnote correctly, it says that an etymology can only be included for senses other than the alt form entry. For example, august (“kind of clown”, noun), an alt form of auguste, cannot have an etymology. But august (adjective) can have an etymology. Again, if I'm understanding the proposal, this would result in, for example, throwing out all those etymologies explaining the different romanization schemes for Chinese localities, etc. I would consider that a loss of valuable information and strongly oppose if so. @Fytcha, This, that and the other: Could you please clarify? 98.170.164.88 04:23, 8 July 2022 (UTC)
- Can you give an example of an entry with an etymology referring to "different romanization schemes for Chinese localities"? I assume you are talking about English terms here. This, that and the other (talk) 04:27, 8 July 2022 (UTC)
- See Peking for a particularly notable example, or Taibei/Taipeh, or really most entries in Geographyinitiative's contributions list. But I don't mean to imply Chinese localities are the only cases where explaining the origin of an alt form is useful; it's just the first thing that sprang to mind. 98.170.164.88 04:33, 8 July 2022 (UTC)
- In that case, it seems to me that the additional ("differing") information about romanisation justifies the inclusion of the Etymology header there. I tried to clarify the footnote in a way that hopefully helps. This, that and the other (talk) 04:43, 8 July 2022 (UTC)
- Ah, now I see that the conditions were intended to be disjunctive (either the etymology must differ between the alt forms, or the word must have multiple etymologies justifying a bare header). Thank you for clearing that up. 98.170.164.88 04:46, 8 July 2022 (UTC)
- I've tweaked the condition, because your clarification didn't work. --RichardW57m (talk) 12:10, 28 July 2022 (UTC)
- Sometimes alternative forms have different origins from the main entry, but the words are still just alternative forms. Taipei is of late 19th century origin via Wade-Giles, and Taibei is of late 20th century origin, via Hanyu Pinyin. These words couldn't be anything but alternative forms, unless they were somehow synonyms right? They are intended to have the exact same pronunciation (as proven by the 1952 gazetteer on the Taipei page showing the "b" pronunciation) and they have the same meaning. BUT they have truly divergent etymologies. Also, sometimes words are alternative forms one of the other, but neither is a main entry, like Jiang and Chiang or Xie and Hsieh. --Geographyinitiative (talk) 10:04, 8 July 2022 (UTC)
Senses vs. entries
@This, that and the other: I'm not too fond of diff. Your new wording doesn't apply to e.g. haybag anymore (meaning sense 2 therein may now have synonyms associated with it) whereas mine did. — Fytcha〈 T | L | C 〉 02:04, 7 July 2022 (UTC)
- Of course. But the text as written was not very clear on whether it applied to individual senses or entire entries. I think that is worth clarifying, perhaps reverting the change I just made to the first sentence, and adding an extra paragraph at the end discussing what to do with lone alt-form senses in otherwise non-alt-form entries. This, that and the other (talk) 02:13, 7 July 2022 (UTC)
- @This, that and the other: I think I see what you meant and I have now revamped the wording completely. Tell me what you think. — Fytcha〈 T | L | C 〉 12:33, 7 July 2022 (UTC)
Excluding regional tags
While I agree with the core of the vote, I foresee a problem, though I cannot find an apt example to demonstrate it. The section I'm concerned with is the following:
- Only the labels that differ between the full entry and the stub may be included. For instance, the senses in anemia have the labels (American spelling), (uncountable), (countable), and (pathology) associated with them. However, the alternative form stub entry anaemia may only have the (British spelling) label, not (uncountable), (countable), and (pathology).
Say, we have an entry with three common regional spellings, one of which is used in North America and two in the UK, and the entry is lemmatised at one of the UK specific spellings. If I understand correctly, this vote would mean we could no longer label the UK specific alternative form as such because the same information is already found in the main entry. This, I think, would be rather confusing. brittletheories (talk) 07:14, 7 July 2022 (UTC)
- I understood that passage as meaning that labeling the regional spellings always occurs, but the other labels (the examples given are uncountable/countable and pathology) are only placed on the main entry. The entry anaemia is already written according to the example. - Sarilho1 (talk) 08:48, 7 July 2022 (UTC)
- @Sarilho1: No, I don't think so. The passage is quite clear: "Only the labels that differ between the full entry and the stub may be included." Fytcha seems to agree with my assessment.
- Also, anemia would only work as an example of this were it lemmatised at anaemia, at which point, according to the rule above, anæmia could no longer retain the UK label. brittletheories (talk) 12:03, 7 July 2022 (UTC)
- @brittletheories: Good point actually, I see what you mean. What do you think about changing that part of the text to something like "Stubs may only contain labels if that kind of label differs among the different forms."? This allows all three entries to have a regional label in the scenario you constructed. The "kind of label" phrase is a bit vague though so that would have to be clarified in a footnote. — Fytcha〈 T | L | C 〉 11:28, 7 July 2022 (UTC)
- @Fytcha: Yeah, I think that would be a more comprehensive and sensible solution than outright excepting all regional tags. Maybe: "Only labels that differ among the alternative forms may be included." brittletheories (talk) 12:03, 7 July 2022 (UTC)
- @brittletheories: See diff. — Fytcha〈 T | L | C 〉 12:43, 7 July 2022 (UTC)
- That looks good to me. brittletheories (talk) 13:10, 7 July 2022 (UTC)
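The difference between the two wordings discussed in this thread can be sketched in Python. This is a rough model under stated assumptions, not the vote text: labels are treated as flat per-entry lists rather than per-sense, the entry is assumed to be lemmatised at anaemia as in the hypothetical above, and the function names and the `LABELS` data are illustrative only.

```python
# Illustrative label lists per form; assumes the hypothetical scenario in
# which the entry is lemmatised at "anaemia" (not necessarily current practice).
LABELS = {
    "anaemia": ["British spelling", "uncountable", "countable", "pathology"],
    "anæmia": ["British spelling", "uncountable", "countable", "pathology"],
    "anemia": ["American spelling", "uncountable", "countable", "pathology"],
}


def differing_from_primary(labels, primary):
    """Original wording: a stub keeps only labels absent from the full entry."""
    primary_labels = set(labels[primary])
    return {
        form: [label for label in form_labels if label not in primary_labels]
        for form, form_labels in labels.items()
        if form != primary
    }


def differing_among_forms(labels):
    """Suggested wording: keep labels that are not shared by every form.
    Assumes `labels` is non-empty."""
    shared = set.intersection(*(set(v) for v in labels.values()))
    return {
        form: [label for label in form_labels if label not in shared]
        for form, form_labels in labels.items()
    }


old_rule = differing_from_primary(LABELS, "anaemia")
new_rule = differing_among_forms(LABELS)
print(old_rule)
print(new_rule)
```

Under the first reading the stub anæmia loses its regional label because the full entry already carries it. Under the second reading all three forms keep a regional label, since that label is not common to every form, while (uncountable), (countable) and (pathology) stay only on the full entry.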
Inflection, Declension or Conjugation
So another question occurred to me. Why are we restricting the inflection, declension or conjugation sections to the main entry page? For instance, the Portuguese verb agoirar is an alternative form of agourar. The conjugations of both forms are thus also different (e.g.: eu agouro vs. eu agoiro), so imo, they should be listed somewhere. - Sarilho1 (talk) 09:13, 7 July 2022 (UTC)
- The vote doesn't do this. In fact, the inflection section is explicitly allowed to exist in the stub entries. This, that and the other (talk) 10:28, 7 July 2022 (UTC)
- I think we are all suffering from a severe case of misreadingness. :D - Sarilho1 (talk) 10:41, 7 July 2022 (UTC)
- @Sarilho1, @Fytcha I have tried to shuffle the proposed text around a bit so we don't keep misreading it. All the stub sense, full entry, ... stuff was just doing my head in. In particular, it wasn't clear how it was to apply to entries with mixes of form-of senses and normal senses (clubs - form-of sense next to normal sense under same POS header - or freeze - Ety 3 is a stub but the whole entry is not a stub entry). I have reduced the metalanguage to two terms: stub sense and primary form. What matters is not when an entry consists only of stub senses, but when the etymology section consists only of stub senses. This, that and the other (talk) 04:39, 8 July 2022 (UTC)
Split into Proposal 1 / 2
@AG202: I agreed with your objection that the section to which I wanted to append the new text was not that comprehensive. I've now included a second proposal to clarify the existing text. From what I can tell, it should be relatively clear now:
- Such and such lemmas are variations of other lemmas
- Variations are to be defined using the form-of templates
- Senses that are defined using form-of templates must be stubified
Against your personal opinion, I have included dialectal variations. It is to be noted however that the text still leaves a loophole open, if you can call it that, in that a consensus to not define Yoruba dialectal forms as forms of one another means that they don't have to be stubified. — Fytcha〈 T | L | C 〉 16:04, 7 July 2022 (UTC)
"may only be defined using a definition line that consists of a suitable form-of template."
Hey, how would Suchow be affected by this wording? Wiktionary kind of needs a little tiny bit of extra definitional content outside the form-of template (the province name) to make things easier on the reader. Please clean up the entry generally if you see fit (it needs work). Thanks. --Geographyinitiative (talk) 10:11, 8 July 2022 (UTC)
- @Geographyinitiative It might be best if it read, 'should most often be defined'. An exception to the rule is already mentioned in the text, so we should not use language that might insinuate a steadfast rule on editors' choice of template. As I see it, the passage ought only to serve as precedent to affirm that most stubified alternative form entries' definitions be as terse and regular as possible. brittletheories (talk) 14:34, 12 July 2022 (UTC)
- There must be quite a few cases where something is an alt form of only one meaning or etymology of another string of letters. I've seen these handled by including the clarifying information inside the form-of template as part of the gloss, which in the case of Suchow might look like:
- Alternative form of Xuzhou (“Jibin, a city in Sichuan”)
- but I agree the shorter format used at Suchow is also fine / should also be allowed. Probably we need a note that stub / form-of senses can or should include as much of a gloss or qualifier as necessary to clarify when they apply to only some of the senses or etymology sections of the target page. - -sche (discuss) 19:15, 28 July 2022 (UTC)
@Fytcha Previously, we have added {{trans-see}} templates to English alternative forms so as to, on one hand, help the reader, and on the other, channel the contributions in one place. Both of these points, the first one in particular, will stand even after this vote. What I suggest is 1) adding "Translations" among the permitted headers with a note, saying 'Only the template {{trans-see}} may be used in stub entries', and 2) removing the word "translations" later in the text. brittletheories (talk) 14:28, 12 July 2022 (UTC)
Attestation
Unless a quotation can be part of a 'definition line', I think we have a problem. I notice that the 'Quotations' heading is not allowed for 'stub senses'. We sometimes need attestation evidence for forms in alternative scripts; the forms across scripts are not always totally predictable. For example, Burmese Pali has သံဃ (saṃgha) rather than the expected *သင်္ဃ (saṅgha). More familiarly, we would have issues if we tried to merge British draught and American draft. Some of the senses of draft are not normally spelt 'draught' in Britain. For example, in Britain we write 'a draft document'. Do we really want to put quotations for the use of 'draught' under 'draft' when we are demonstrating the British usage? Do we want Burmese script Pali under the Roman script usage when we are demonstrating the Burmese script spelling, as opposed to demonstrating cross-script Pali usage? --RichardW57m (talk) 17:07, 25 October 2022 (UTC)
- The second paragraph of Proposal 2 says "Quotations may be provided". I suppose the intention is that they be provided inline using #* under the relevant sense line, not under a separate "Quotations" header, because that header is only supposed to be used when there is ambiguity between senses, and that ought not to be an issue in an entry with only stub senses. This, that and the other (talk) 02:00, 27 October 2022 (UTC)
- But that text does not extend the meaning of 'stub sense definition line'. A solution is to modify the list in the third paragraph to "Stub sense definition line (as set out above)" to "Stub sense definition line (as set out above) plus satellite quotations". --RichardW57m (talk) 09:56, 27 October 2022 (UTC)
- In any case, you raise an interesting point about placement of citations. I too would generally prefer to see "interesting" quotations centralised at the primary form entry. However, if three otherwise uninteresting cites have been collected solely to satisfy RFV, it is probably better to keep them at the stub sense. Not sure what the best practice is here. This, that and the other (talk) 02:00, 27 October 2022 (UTC)
- What I've found myself doing with Pali is to use a non-Roman script quotation both for RfV in the subsidiary non-Roman script lemma, to satisfy RfV for the script form, and in the main, Roman script lemma, to demonstrate some point for the word in general, such as an irregular inflection. As I hold the text for the quotation in a central module, which makes sense as I collect text to ward off RfV's while economising on translation effort (I'm not protected by US pro-piracy laws such as fair use), I avoid the usual problem of duplicating text. I think I've also duplicated as a defence against RfV when I can't find the word in the major dictionaries - that's fairly rare. --RichardW57m (talk) 09:56, 27 October 2022 (UTC)
Synchronic Derivation
Am I right in thinking that details of synchronic derivation across clusters of alternative forms are to be deleted if the vote passes? For example, under these proposals, the only hints that synchronisation is the verbal noun of synchronise rather than of synchronize will have to be in a headword line or an inflection section.
For example, at present in පාචෙති (pāceti) we have the single definition line:
- Sinhala script form of pāceti, which is causative of පචති (pacati)
so that the user can see the base form in the same script, or directly click on the Roman script form of it to see the senses before the causative transformation is applied. As for Pali we have the policy that the Roman script lemma is the primary lemma, will this have to be hacked back to
- Sinhala script form of pāceti ?
I fear I've ignored the Johnsonian[1]* principle of 'Fuck the user!' by trying to approximate a one-click service. --RichardW57m (talk) 10:49, 11 July 2023 (UTC)
References
- ^ Boris, not Samuel, originally in relation to Business.