
Below are links to 1,000 randomly selected main-namespace pages that contain a line matching ^== *English *== *$.

The list was generated from the June 26–27, 2012 database dump, using more or less the code on the talk page.
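The selection step can be sketched roughly as follows. This is not the code from the talk page, only a minimal illustration: the helper names `has_english_section` and `sample_titles` are hypothetical, and the sketch assumes the pages have already been read from the dump and restricted to the main namespace, arriving as (title, wikitext) pairs.

```python
import random
import re

# The pattern quoted above. MULTILINE makes ^ and $ match at every line,
# so the header can appear anywhere in the page text.
ENGLISH_HEADER = re.compile(r"^== *English *== *$", re.MULTILINE)


def has_english_section(wikitext):
    """Return True if some line of the page is a level-2 English header."""
    return ENGLISH_HEADER.search(wikitext) is not None


def sample_titles(pages, k, seed=None):
    """Pick up to k titles at random from pages that have an English section.

    `pages` is an iterable of (title, wikitext) pairs (an assumed input
    shape, not the dump's native format).
    """
    matching = [title for title, text in pages if has_english_section(text)]
    rng = random.Random(seed)
    return rng.sample(matching, min(k, len(matching)))
```

For a list like this one, the call would be something like `sample_titles(pages, 1000)`. Note that the pattern allows only spaces around "English", so a level-3 header such as `===English===` does not match.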

First 100

  • 80 are lemmas; the others are converted to lemma form for subsequent analysis. This biases the sample toward nouns, because verb forms are not converted when an adjective or noun could be the "ed", "ing", or "bare" (after stripping "s" or "es") form.
  • 2 are proper nouns, which are not normally well covered by dictionaries.
  • 25 are in All sources
  • 21 are Modern, not in MW 1913 or Century
  • 14 are Old, only in MW 1913 or Century
  • 12 are only in Wikipedia
  • 7 are in Few sources
  • 5 are only in 1 source, not WP or Old
  • 14 are only in Wiktionary

Group 1


Group 2


Group 3


Group 4


Unprocessed 5
