The purpose of this page is to openly develop my view about how a Wiktionary article should look. Naturally it would be un-Wiki to insist on this structure, but if this structured set of "rules" is accepted it will have served its purpose. The listings at any given level are in order of importance. Comments are welcome, but please make them after the double line. I'm making a point of numbering my proposals here for easier reference. Eclecticology
1. Breaking up articles
- Breaking up articles into smaller ones will in many cases be inevitable, but if this is done prematurely, before there is a big-picture grasp of what the word involves, it is likely to generate confusion.
- 1.1. When to Break-up
- This outlines the criteria to be applied in determining if an article should be split.
- 1.1.1. Size
- An article should be broken up when it is too big. It is rarely necessary to break up an article that is shorter than 10K; it becomes almost essential to break up articles longer than 30K.
- 1.2. Where to break-up
- Break-ups should be made at the highest possible hierarchical level available.
- 1.3. Designating broken-up articles
- Broken-up articles should as much as possible be titled in a way that reflects the criterion applied in 1.2.
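- The size rule of thumb in 1.1.1 can be sketched as a small check. This is only an illustration: the function name `split_advice` is made up here, and it assumes that "K" means 1024 bytes of UTF-8 encoded wikitext, which the proposal does not specify.

```python
# Thresholds from 1.1.1. Treating "K" as 1024 bytes is an assumption.
RARELY_SPLIT_BELOW = 10 * 1024
SPLIT_ABOVE = 30 * 1024


def split_advice(wikitext):
    """Return 'keep', 'judgement' or 'split' for an article's raw text."""
    size = len(wikitext.encode("utf-8"))
    if size < RARELY_SPLIT_BELOW:
        return "keep"       # rarely necessary to break up
    if size > SPLIT_ABOVE:
        return "split"      # almost essential to break up
    return "judgement"      # in between: size alone does not decide
```

- Anything between the two thresholds is left to editorial judgement, since size is only one possible criterion.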
2. Data hierarchy
- This defines how we progress through information in a way that recognizes that some separation criteria are more important than others. In theory, all the criteria used in subdividing at a given level are available to be applied for each and every category in the next higher level.
- 2.1. Language
- This is the highest level criterion for dividing an article. Dealing with this may not be particularly difficult, but getting it right is key to having a Wiktionary that functions smoothly in a multilingual environment. I use the term "base language" here to mean the language of the active Wiktionary.
- 2.1.1. Language order
- 2.1.1.1. "General comments" about a word that transcend languages. I anticipate that this will not be relevant for most words.
- 2.1.1.2. The word in the base language.
- 2.1.1.3. The word in other languages ordered by language name in the alphabetical order of the base language.
- 2.1.2. Headings
- Headings should be in H2 size (i.e. with "==" at the beginning and end of the title). The heading for the base language may be omitted if there are no "General comments."
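- The ordering in 2.1.1 and the heading rule in 2.1.2 can be sketched together as follows. The function names, the dictionary layout and the default base language are assumptions made for this illustration, and plain case-insensitive string sorting is only an approximation of "alphabetical order of the base language".

```python
GENERAL = "General comments"


def order_sections(sections, base_language="English"):
    """Return section names: general comments, base language, then the rest."""
    ordered = []
    if GENERAL in sections:
        ordered.append(GENERAL)
    if base_language in sections:
        ordered.append(base_language)
    others = [n for n in sections if n not in (GENERAL, base_language)]
    # Approximation of alphabetical order; a real collation may differ.
    return ordered + sorted(others, key=str.casefold)


def render(sections, base_language="English"):
    """Join sections under H2 ("==") headings, per 2.1.2."""
    # The base-language heading may be dropped when there are no general comments.
    omit_base_heading = GENERAL not in sections
    lines = []
    for name in order_sections(sections, base_language):
        if not (name == base_language and omit_base_heading):
            lines.append(f"== {name} ==")
        lines.append(sections[name])
    return "\n".join(lines)
```

- For example, `render({"Spanish": "...", "English": "...", "French": "..."})` would place the English text first without a heading, followed by the French and Spanish sections under their own headings.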
- 2.2. Topics
- This refers to the range of things we can say about a word, such as variants, pronunciation, etymology, definitions, etc. The list is ordered to move from the more general to the more specific.
- Alternative and obsolete spellings
- Etymology
- Pronunciation
- Part of Speech
- Definition
- Quotations
- Synonyms
- Translations
- Derived words and phrases
- References
- 2.2.1. Alternative and obsolete spellings
- Words spelled in alternative ways are essentially the same word. The alternatives may be archaic, representing uses in a former version of the language. They may be geographic, as in the "labour/labor" distinction between British and American English. There may be sub-headings that categorize the bases for these variants, but generally no further forking will result, and no further mention of these variants will be needed. Separate articles for the alternatives will often be necessary and may consist of nothing more than a "see" reference.
- 2.2.2. Etymology
- This topic explains where the word comes from, including its bases in other languages. It is essentially not concerned with what happened to the word once it was incorporated into the base language. With most words there is a single source and no fork will arise at this level. A fork will occur when different sources result in identically spelled homonyms.
- 2.2.3. Pronunciation
- It is impossible to provide precise pronunciations in any format. Even the best phonetic systems can only provide an approximation. Distinguishing most dialects would require an in-depth and subtle understanding of phonology well beyond the scope of this dictionary. Nevertheless, most readers will appreciate a rough understanding of the difference between British and American pronunciation. The most essential entry here will be in terms of pronunciation variations that make a difference, as with the usage of "minute" as a noun or as an adjective. Whether to use SAMPA or IPA or something else remains an open question to be discussed elsewhere, as do ways of distinguishing which system is being used if more than one is permitted.
- 2.2.4. Part of Speech
- This is one of the most important points of forking. Fortunately, in most instances with a basic understanding of grammar it will be one of the easiest to apply. It is here that we should also make note of plurals, tenses and other grammatical variations of a word.
- 2.2.5. Definition
- This is the heart of any dictionary entry. It is mostly descriptive. Sometimes a synonym can serve this function, but most often not. It is here that the lexicographer's art is most apparent. By the time you get here most of the easy forks in the word are past, and it becomes necessary for the various definitions to make clear the different ways that a word can be used, while retaining the concept that any two uses at this level have a common etymology. There is no one rule that can determine the order in which definitions should be arranged, but there are rules of thumb which can be used for guidance, with no one of them having precedence over the others.
- 2.2.5.1. The most frequent uses should precede the less frequent.
- 2.2.5.2. The uses closest to the etymological source should precede the more distant.
- 2.2.5.3. The most ancient uses should precede the most recent ones.
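- The three rules of thumb above can give three different orders for the same set of definitions, which is why none of them can simply take precedence. A minimal sketch, in which the `Sense` fields and all the numbers are invented purely for illustration:

```python
from dataclasses import dataclass


@dataclass
class Sense:
    gloss: str
    frequency: float      # hypothetical relative frequency (higher = more common)
    distance: int         # hypothetical steps away from the etymological source
    first_attested: int   # hypothetical year of the earliest known use


# One sort key per rule of thumb (2.2.5.1 to 2.2.5.3).
ORDERINGS = {
    "frequency": lambda s: -s.frequency,
    "etymology": lambda s: s.distance,
    "age": lambda s: s.first_attested,
}


def order_by(senses, rule):
    """Return the glosses ordered by one of the three rules."""
    return [s.gloss for s in sorted(senses, key=ORDERINGS[rule])]


# Made-up data, not taken from any real entry.
senses = [
    Sense("sense A", frequency=0.2, distance=0, first_attested=1300),
    Sense("sense B", frequency=0.7, distance=2, first_attested=1850),
    Sense("sense C", frequency=0.1, distance=1, first_attested=1200),
]
```

- With this invented data, each rule puts a different sense first, so an editor still has to weigh the rules against each other.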
- 2.2.6. Quotations
- The purpose for using quotations is to illustrate the actual use of a word at a particular time. Unlike examples which can be made up, quotations must also be accompanied by information about their sources. The format for these has been described elsewhere.
- 2.2.7. Synonyms
- Although some synonyms may appear in the definition of a word, they are not the same thing. They represent alternative words which may be substituted for the word in some contexts and some circumstances. The sphere of connotations which a word carries is often very different from that of its synonyms. A wrongly used synonym can give an impression of unfamiliarity with the language.
- 2.2.8. Translations
- In one sense a translation is nothing more than a synonym in another language. One must be especially careful under this heading about introducing false friends.
- 2.3. Sub-topics
- This refers to issues that come under one topic, most notable under definition. It includes quotations and translations which often relate to a very specific usage of a word.
- (more to come.)
I don't think we should break up articles because of size; we should either break up on logical grounds (definition, nouns/verbs/etc., etymology, whatever) or not at all, like the OED does it. --Imran 23:31 Dec 24, 2002 (UTC)
- Size is only one criterion - and a fairly obvious one when the article is just too big. I still need to think about the other possible criteria. Eclecticology
I use several computers, including a couple that are very slow and old. Wiktionary and Wikipedia load relatively easily and fast so long as they aren't more than 50k or so. What really screws you up with an older, slower computer is having to load and reload, go to a different page, flip back to the first page, etc. So Imran's notion that not having everything on one page, and keeping pages short, is somehow helpful to someone with older equipment and a slower connection is just wrong. It's much better to have it all there once you get it loaded. Fred Bauder 21:38 Dec 26, 2002 (UTC)
What's your opinion with regard to differentiating multiple definitions? As discussed elsewhere, we need some way to be able to refer to individual definitions, but just numbering them would cause problems if the article was reordered sometime in the future. --Imran
- This is a tough one. The definition is naturally at the heart of any dictionary entry. Both suggested bases for distinction (age and frequency) have their merits. Both also have their problems. The biggest argument against having the most common use first is that it requires a subjective judgement about which is the more common of two uses. For the British the most common use of the noun "boot" might be for the baggage compartment of a car, but for an American it might be a piece of footwear. I would prefer not to put anybody in the position of having to arbitrate that kind of conflict.
- I lean toward having the oldest usage first because it has the advantage of having fewer problems with subjectivity.
- but it introduces major problems of usability - people expect the most common usage first, not the oldest; defining dictionaries don't seem to have all that much difficulty figuring out which are the most common usages
- The difficulty here is that the resources required to decide which version is oldest are not at everybody's fingertips. Contributors who want to add to the article on a particular word should not feel the need to have access to obscure lexicological resources before they can contribute. When they add new definitions to an article they will do just as well to add them to the bottom of the list. Someone with a broader overall perspective may need to re-order them later. When ordering things we also need to remember that the differences between definitions will come in shades. Some are very distinct; others differ only in shades of meaning.
- I agree; referring to parts of articles by numbers has a serious risk of becoming meaningless or even misleading unless the list of definitions in an article is stable. We may still be able to make such references through the context of a particular usage, or the fact that it is only used in specialized circumstances. For now, a simple reference to the article may be the only accurate way to approach the problem. The structure of articles is still in a development stage. Perhaps the tech people could find some sort of indirect indexing scheme that would allow a definition to be moved around in an article without breaking links. My short answer is that I'm prepared to wait and see how this develops. Eclecticology 22:23 Dec 28, 2002 (UTC)
- One way we could do it is by having links with the number of the item in a standardized form, i.e. have the number of the definition after the word, so for the first def in dog we would have dog (1) linking directly to that def. That way we could have something similar to "what links here" which could be used to find the links that need to be corrected if the order is changed. --Imran
- I don't want to completely shut the door on the numbered entries. Other dictionaries certainly use them to varying extents.
- the major use of numbered entries is to express 1. most common, 2. next most common usage, certainly this is extremely important for readers who genuinely don't know what a word means in context, and consider it important enough to look up.
- My complaints are more about subdivisions being done prematurely than being done at all. If we are to have standardized forms, how will those standards be determined? That would be a whole other debate. If we are to have sub-division I am currently tending toward something more descriptive than numbers. "What links here" has certainly been helpful in trying to repair broken links in Wikipedia, and will continue to be helpful here, but I'm not sure that it will accomplish what you want. I'm finding that working out some of the words is a lot more difficult than I would have expected. If you look at the dog example, many of the alternative uses can somehow be connected back to the most familiar one, though sometimes quite indirectly. These simple words are likely to present far more problems than many of the longer less well known words. Eclecticology 07:53 Jan 4, 2003 (UTC)
- I agree that the problem of how/when to subdivide is important, but not having any standardized method is causing a retardation of the growth of Wiktionary. Perhaps we should do what the OED used to do: have a drafting page where all the information on a word is stored in an arbitrary fashion and, once there is enough information to distinguish the different uses and their properties, move the information into a standardized format in the main dictionary. --Imran 23:57 Jan 4, 2003 (UTC)
- yes, that's a good plan, since it doesn't exclude early contributors by making them wait for structure, and it maximizes the chance that we'll find a way to distinguish common from uncommon uses to generate sub-wiktionary texts and indices.
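- The "indirect indexing scheme" floated earlier in this thread could look something like the sketch below: each definition gets an identifier that never changes, and the visible number is computed from the current order. This is only an illustration of the idea, not a description of anything the software does; the `Entry` class, its methods and the `word#n` identifier format are all invented here.

```python
class Entry:
    """A word whose definitions keep stable ids when they are reordered."""

    def __init__(self, word):
        self.word = word
        self._order = []      # stable ids, in current display order
        self._glosses = {}    # stable id -> definition text
        self._next_id = 1

    def add(self, gloss):
        """Append a definition at the bottom and return its stable id."""
        stable_id = f"{self.word}#{self._next_id}"
        self._next_id += 1
        self._order.append(stable_id)
        self._glosses[stable_id] = gloss
        return stable_id

    def reorder(self, new_order):
        """Change the display order without touching any stable id."""
        if sorted(new_order) != sorted(self._order):
            raise ValueError("new order must contain exactly the existing ids")
        self._order = list(new_order)

    def display_number(self, stable_id):
        """The number a reader currently sees next to this definition."""
        return self._order.index(stable_id) + 1

    def resolve(self, stable_id):
        """What a link to this id points at, whatever the current order."""
        return self._glosses[stable_id]
```

- A link stored as the stable id would keep resolving to the same definition after a reorder, while the displayed number could still follow whichever ordering rule is eventually agreed on.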
I am not looking forward to the word It's that has over 43 defs!
Do you think immediately derivative words, for instance walker from walk, should have their own article or should be in the main word's article? --Imran 23:37 Jan 5, 2003 (UTC)
- I have thought about this. I'm favouring a bit of both. The "main" article should always be bringing things together, and should likely list the possible derivatives. At "walker" we would still need a simple definition as "a person who walks q.v.", assuming that it had no other meaning. Of course we would also need to consider that a "walker" can also be a device used by seniors who are having difficulty in walking. Even something like "walks" may need an entry that would guide non-English speakers to understanding that such a word could be the plural of the noun or the 3rd person singular of the verb. These derivative word articles could be much less comprehensive than the one on the root word. Eclecticology 01:46 Jan 6, 2003 (UTC)
- Longman's defining vocabulary deals with this by listing the suffixes separately, and using the suffixes only to qualify words in the most common way. Any exception to this convention, e.g. 'walker' to mean an old person's disability assistance device, gets its own definition, but not as part of the defining vocabulary.
- A lot of this is a matter of trying to second guess what people might do to look things up. I believe, to start with, that there should be separate entries for the prefixes and suffixes alone, such as -er (assuming that the system will allow an entry that begins with a hyphen).
- absolutely, it must, since this is the standard English dictionary way of defining suffixes. also **THE STUPID/WRONG WIKIPEDIA CONVENTION OF CAPITALIZING THE FIRST WORD OF EVERY ENTRY** must now certainly be removed from the software. It's bad in an encyclopedia, but there are ways to work around it. There's no way around it in a dictionary! The software simply has to change.
- You're preaching to the converted on this one. See Wiktionary:Bug reports. The serious distortions in an encyclopedia are more limited. It's difficult to work around the entry for the concept of pH in chemistry. A dictionary requires considerably more precision than an encyclopedia. Eclecticology
- I'm not really disputing your approach, but the more I get into this project the more I appreciate the difficulties there are in writing a dictionary. Eclecticology 19:04 Jan 10, 2003 (UTC)
- there are main wikipedia articles on w:defining vocabulary, w:defining dictionary, and w:core glossary (a more abstract concept needed by both). might be best to review those, and edit them to w:consensus, so that there is a strong standard basis for proceeding, rather than re-inventing the wheel. it is better to work out the ideal structure of a dictionary there, where the actual structure of real dictionaries is being explained to the casual reader.
- I think I'm grasping what you're trying to say. The consensus-building on this will likely need to be a continuing work in progress. There are some profoundly shallow views around concerning what a dictionary is. There's also often a natural human tendency to search for certainty where none can exist. We may not find too many people willing to participate in developing the consensus. The required sophistication is just not there. If we are going to work from a "defining vocabulary" as you suggest, it seems that one of the first things we will need is a properly wikied list of those words so that each of those concepts can be adequately discussed. Then there's the question of how we can possibly discuss the words on such a list without getting locked into a self-referential loop. Eclecticology 07:49 Jan 11, 2003 (UTC)
- This is out-of-date. I stopped reading at 2.1.2 - because I think we're supposed to include = = English = = at the top of ALL articles these days. I'll wait to comment until version 2.0, which I'm sure you'll be working on over the holiday weekend. :-) Cheers, --Stranger 15:01, 1 September 2005 (UTC)