Wiktionary talk:Votes/pl-2020-02/CFI for chemical formulae
Stronger CFI requirement needed
I foresee a problem with the requirement of contexts that don't make clear that they're formulae by e.g. explicitly discussing chemical formulae or by listing their component parts. I think this is too weak. There are many contexts in which the formulaic nature of terms is rather implicit. Take the article title “On the use of AsH3 in the molecular beam epitaxial growth of GaAs”. (Note that AsH₃ was deleted by consensus.) This scientific article is not about formulae or components, but about uses and physical properties of these compounds. This is just an example; the scientific literature is rife with articles about the uses and physical properties of all kinds of compounds that are simply identified by their chemical formulae. With the proposed formulation, AsH₃ would (presumably) have survived the RfD. I think we need to exclude scientific and technical publications from the sources that can be used here for attestations. Readers of Appl. Phys. Lett. will know what AsH3 stands for anyway; they are not going to look it up on Wiktionary. We need to cater, though, for the reader who encounters such a term in mainstream media (including pop-science journals). Even there, I’d like to see a further strengthening of the inclusion criterion, namely by requiring that the contexts use the term without explaining its meaning. That will exclude uses like in “methanol (CH3OH) can well be blended with petroleum products like petrol and diesel”. --Lambiam 14:04, 24 February 2020 (UTC)
- @Lambiam: I think that's reasonable, but how would you specify non-scientific contexts? Should we simply insert "in works not written for a scientific audience"? —Μετάknowledgediscuss/deeds 16:46, 24 February 2020 (UTC)
- That should do most of the job regarding my first concern; I think though that we should say “scientific or technical audience” – the scientists among us may not think of chemical engineers or metallurgists as constituting a scientific audience. I further feel that “publications” is better than “works”, which makes me think of hefty tomes. --Lambiam 17:58, 24 February 2020 (UTC)
- @Lambiam, KevinUp, -sche: I have incorporated your wording, which is stricter (and therefore presumably better?), but it may not read as clearly. Please look it over again. —Μετάknowledgediscuss/deeds 18:56, 24 February 2020 (UTC)
- What about this?:
- ... attestations in publications that (1) are not written for a scientific or technical audience; (2) don't make clear that they're formulae by e.g. explicitly discussing chemical formulae or by listing their component parts; and (3) do not otherwise explain the meaning of the formula.
- And add to the rationale:
- The general idea is that the rule will still allow the inclusion of chemical formulas that are in common use in publications written for a general audience.
- --Lambiam 20:05, 24 February 2020 (UTC)
- Don't mix "formulae" and "formulas" though! (They might explode.) Equinox ◑ 20:11, 24 February 2020 (UTC)
- Based on the BP straw poll results and the subjectivity of deciding if a work is written for a "scientific" audience (what of e.g. "popular science" books?), I am not as optimistic about the chances of passing this as I'd like to be; sorry that that's an unhelpful comment. Even the straw-polled proposal to exclude all formulas by default and allow individual ones by entry-specific consensuses seems like it might falter on people's perhaps reasonable fears that their favorite formulas won't be kept, though it seems like a possible starting point if this fails (or second option to provide here). - -sche (discuss) 20:27, 24 February 2020 (UTC)
- @-sche: The BP straw poll offered too many options; I think that one (reasonable) option at a time increases the chance of us getting any workable policy passed. Popular science is predicated on being science for an audience that is not scientific, so I think that one is clear-cut. Any thoughts about the wording of this proposal, though? —Μετάknowledgediscuss/deeds 04:48, 25 February 2020 (UTC)
Include formulas attested in use that have CFI-meeting names
We could include attested formulas for chemicals that have CFI-meeting names. I made this proposal in Talk:LiBr and then again in Talk:AsH₃. The proposal was not included in the poll. E.g. H₂SO₄ has sulfuric acid or LiBr has lithium bromide. This criterion ensures that the inclusion of chemical formulas no more than doubles the number of items in the dictionary; in fact, the amplification factor is much less than 2 given the chemical names are not only in English but also in other languages. I have not heard any counterarguments to this proposal. --Dan Polansky (talk) 07:58, 29 February 2020 (UTC)
- Yes. We should include all such formulae that can be found in the literature. SemperBlotto (talk) 08:04, 29 February 2020 (UTC)
- Having an "idiomatic" name does not correlate to having a common or short or particularly "idiomatic" formula; as I noted in the straw poll, "AlF₆Na₃ would meet that criterion as cryolite while CO₂ would fail as carbon dioxide, which seems opposite to what most people would expect"; titin is another one with a short (in that case, single-word) name but a looooong formula. - -sche (discuss) 08:28, 29 February 2020 (UTC)
- A CFI-meeting name has to meet WT:CFI#Idiomaticity, which is "An expression is idiomatic if its full meaning cannot be easily derived from the meaning of its separate components". carbon dioxide meets WT:CFI, and therefore would support CO₂. I do not know what "idiomatic formula" is supposed to mean. If, on the other hand, someone thinks carbon dioxide does not meet CFI and should be deleted, that's a different discussion. Even if carbon dioxide fails WT:CFI#Idiomaticity, it seems supported by WT:THUB, and still thereby meets WT:CFI and thereby would support CO₂.
- As for titin (a protein), where is the formula stated, and is the formula attested in use, as required by CFI? --Dan Polansky (talk) 08:46, 29 February 2020 (UTC)
- On another note, we could introduce such a policy as tentative, just like we did with WT:THUB; thus, editors could still make ad-hoc exceptions to the tentative policy in RFD discussions. This would reflect our uncertainty about unintended consequences that we did not foresee. --Dan Polansky (talk) 11:33, 29 February 2020 (UTC)
- I support keeping H₂SO₄ because it can be cited as an idiomatic term in non-technical works (and I've just added such a citation to that entry to support it). However, whereas tryptophan synthase is citable, I do not support adding its full chemical formula even though it has been published in technical works (for example in: Colista Moore (2011), Student's Dictionary, page 524, →ISBN). I think that the difference between technical works and non-technical works for CFI qualification purposes is crucial. -Stelio (talk) 09:20, 2 March 2020 (UTC)
- @Stelio: Why is the difference crucial? In general, we do define terms that are only found in technical works. Furthermore, published is not enough; the formula has to be used, and a quotation to the effect of "The chemical formula of X is Y" does not attest Y in use. By contrast, a quotation of the form "Y has a strong smell" where Y is a chemical formula does attest Y in use. --Dan Polansky (talk) 09:40, 6 March 2020 (UTC)
- Some thoughts that are not fully formed:
- The CFI note on repetition is somewhat of a parallel. We allow no, noo, nooo, but no elongations beyond that.
- Consider, say, only the alkanes. Allowing for every chemical formula used in technical literature leads to entries for CH₄, C₂H₆, C₃H₈, C₄H₁₀, C₅H₁₂, C₆H₁₄, C₇H₁₆, C₈H₁₈, C₉H₂₀, C₁₀H₂₂, C₁₁H₂₄, C₁₂H₂₆, C₁₃H₂₈, C₁₄H₃₀, C₁₅H₃₂, C₁₆H₃₄, C₁₇H₃₆, C₁₈H₃₈, C₁₉H₄₀, C₂₀H₄₂, C₂₁H₄₄, C₂₂H₄₆, C₂₃H₄₈, C₂₄H₅₀, C₂₅H₅₂, C₂₆H₅₄, C₂₇H₅₆, C₂₈H₅₈, C₂₉H₆₀, C₃₀H₆₂, C₃₁H₆₄, C₃₂H₆₆, C₃₃H₆₈, C₃₄H₇₀, C₃₅H₇₂, C₃₆H₇₄, C₃₇H₇₆, C₃₈H₇₈, C₃₉H₈₀, C₄₀H₈₂, C₄₁H₈₄, C₄₂H₈₆, C₄₃H₈₈, C₄₄H₉₀, C₄₅H₉₂, C₄₆H₉₄, C₄₇H₉₆, C₄₈H₉₈, C₄₉H₁₀₀, C₅₀H₁₀₂, C₅₁H₁₀₄, C₅₂H₁₀₆, C₅₃H₁₀₈, C₅₄H₁₁₀, C₅₅H₁₁₂, C₅₆H₁₁₄, C₅₇H₁₁₆, C₅₈H₁₁₈, C₅₉H₁₂₀, C₆₀H₁₂₂, C₆₁H₁₂₄, C₆₂H₁₂₆, C₆₃H₁₂₈, C₆₄H₁₃₀, C₆₅H₁₃₂, C₆₆H₁₃₄, C₆₇H₁₃₆, C₆₈H₁₃₈, C₆₉H₁₄₀, C₇₀H₁₄₂, C₇₁H₁₄₄, C₇₂H₁₄₆, C₇₃H₁₄₈, C₇₄H₁₅₀, C₇₅H₁₅₂, C₇₆H₁₅₄, C₇₇H₁₅₆, C₇₈H₁₅₈, C₇₉H₁₆₀, C₈₀H₁₆₂, C₈₁H₁₆₄, C₈₂H₁₆₆, C₈₃H₁₆₈, C₈₄H₁₇₀, C₈₅H₁₇₂, C₈₆H₁₇₄, C₈₇H₁₇₆, C₈₈H₁₇₈, C₈₉H₁₈₀, C₉₀H₁₈₂, C₉₁H₁₈₄, C₉₂H₁₈₆, C₉₃H₁₈₈, C₉₄H₁₉₀, C₉₅H₁₉₂, C₉₆H₁₉₄, C₉₇H₁₉₆, C₉₈H₁₉₈, C₉₉H₂₀₀, C₁₀₀H₂₀₂...
- If the aim is to identify a chemical's common name from its formula, there are sites that do that (like PubChem); I personally don't see that as being a goal for Wiktionary.
- But what about the common terms, which can also be theoretically extended without limit (practically limited only by citations of use): methane, ethane, propane, butane, pentane, hexane, heptane, octane, nonane, decane, undecane, dodecane, tridecane, tetradecane, pentadecane, hexadecane, heptadecane, octadecane, nonadecane, icosane, heneicosane, docosane, tricosane, tetracosane, pentacosane, hexacosane, heptacosane, octacosane, nonacosane, triacontane, hentriacontane, dotriacontane, tritriacontane, tetratriacontane, pentatriacontane, hexatriacontane, heptatriacontane, octatriacontane, nonatriacontane, tetracontane, hentetracontane, dotetracontane, tritetracontane, tetratetracontane, pentatetracontane, hexatetracontane, heptatetracontane, octatetracontane, nonatetracontane, pentacontane, henpentacontane, dopentacontane, tripentacontane, tetrapentacontane, pentapentacontane, hexapentacontane, heptapentacontane, octapentacontane, nonapentacontane, hexacontane, henhexacontane, dohexacontane, trihexacontane, tetrahexacontane, pentahexacontane, hexahexacontane, heptahexacontane, octahexacontane, nonahexacontane, heptacontane, henheptacontane, doheptacontane, triheptacontane, tetraheptacontane, pentaheptacontane, hexaheptacontane, heptaheptacontane, octaheptacontane, nonaheptacontane, octacontane, henoctacontane, dooctacontane, trioctacontane, tetraoctacontane, pentaoctacontane, hexaoctacontane, heptaoctacontane, octaoctacontane, nonaoctacontane, nonacontane, hennonacontane, dononacontane, trinonacontane, tetranonacontane, pentanonacontane, hexanonacontane, heptanonacontane, octanonacontane, nonanonacontane, hectane...
- I see chemical formulae as being tangentially related to numerals (which are also translingual). We can find many attested citations of numerals in use, but we only allow them as entries in the range 0...100 or if they have an idiomatic sense. Compare with including the translingual symbols for all individual elements, but only a subset of the molecular formulae where "idiomatic sense" is paralleled by "used in non-technical literature".
- -Stelio (talk) 11:51, 6 March 2020 (UTC)
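For illustration only: the alkane list in the comment above follows the general formula CₙH₂ₙ₊₂, so it can be produced mechanically for any n. The sketch below is a minimal Python example; the function name `alkane_formula` and the choice of Unicode subscript digits are assumptions made for this illustration, not anything used by Wiktionary itself.

```python
# Map ASCII digits to Unicode subscript digits, matching the style of
# entry titles such as "C₂H₆" quoted in the discussion.
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def alkane_formula(n: int) -> str:
    """Return the molecular formula of the alkane with n carbon atoms.

    Uses the general alkane formula C(n)H(2n+2). A count of 1 is left
    implicit, so n = 1 gives "CH₄" rather than "C₁H₄".
    """
    if n < 1:
        raise ValueError("an alkane needs at least one carbon atom")
    carbon = "C" if n == 1 else "C" + str(n).translate(SUBSCRIPTS)
    hydrogen = "H" + str(2 * n + 2).translate(SUBSCRIPTS)
    return carbon + hydrogen


# The first hundred alkanes, i.e. the range spelled out in the comment above.
formulas = [alkane_formula(n) for n in range(1, 101)]
print(", ".join(formulas[:5]) + ", ...")
```

Because nothing in the rule bounds n, the set of formulae that could in principle be attested is open-ended, which is the concern the list is meant to convey.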