User:Dcljr/Statistics

This page contains terms related to probability and statistics. I've just copied it over from a subpage of my user page at Wikipedia and am in the process of paring it down to bare lists.

General terms

Etymology

Definitions

Some textbook definitions of statistics and related terms (italics added):

Stephen Bernstein and Ruth Bernstein, Schaum's Outline of Elements of Statistics II: Inferential Statistics (1999)
Statistics is the science that deals with the collection, analysis, and interpretation of numerical information.
In descriptive statistics, techniques are provided for collecting, organizing, summarizing, describing, and representing numerical information.
[Inferential statistics provides] techniques.... for making generalizations and decisions about the entire population from limited and uncertain sample information.
Donald A. Berry, Statistics: A Bayesian Perspective (1996)
Statistical inferences have two characteristics:
  1. Experimental or observational evidence is available or can be gathered.
  2. Conclusions are uncertain.
John E. Freund, Mathematical Statistics, 2nd edition (1971)
Statistics no longer consists merely of the collection of data and their representation in charts and tables — it is now considered to encompass not only the science of basing inferences on observed data, but the entire problem of making decisions in the face of uncertainty.
Gouri K. Bhattacharyya and Richard A. Johnson, Statistical Concepts and Methods (1977)
Statistics is a body of concepts and methods used to collect and interpret data concerning a particular area of investigation and to draw conclusions in situations where uncertainty and variation are present.
E. L. Lehmann, Theory of Point Estimation (1983)
Statistics is concerned with the collection of data and with their analysis and interpretation.
William H. Beyer (editor), CRC Standard Probability and Statistics Tables and Formulae (1991)
The pursuit of knowledge frequently involves data collection; and those responsible for the collection must appreciate the need for analyzing the data to recover and interpret the information therein. Today, statistics are being accepted as the universal language for the results of experimentation and research and the dissemination of information.
Oscar Kempthorne, The Design and Analysis of Experiments, reprint edition (1973)
Statistics enters [the scientific method] at two places:
  1. The taking of observations
  2. The comparison of the observations with the predictions from... theory.
Marvin Lentner and Thomas Bishop, Experimental Design and Analysis (1986)
The information obtained from planned experiments is used inductively. That is, generalizations are made about a population from information contained in a random sample of that particular population. ... [Such] inferences and decisions... are sometimes erroneous. Proper statistical analyses provide the tools for quantifying the chances of obtaining erroneous results.
Robert L. Mason, Richard F. Gunst and James L. Hess, Statistical Design and Analysis of Experiments (1989)
Statistics is the science of problem-solving in the presence of variability.
Statistics is a scientific discipline devoted to the drawing of valid inferences from experimental or observational data.
Stephen K. Campbell, Flaws and Fallacies in Statistical Thinking (1974)
Statistics... is a set of methods for obtaining, organizing, summarizing, presenting, and analyzing numerical facts. Usually these numerical facts represent partial rather than complete knowledge about a situation, as is the case when a sample is used in lieu of a complete census.

Basic concepts

Population vs. sample

Randomness, probability and uncertainty

Prior information and loss

Data collection

Sampling

Main article: Sampling (statistics)

Experimental design

Main article: Design of experiments

Data summary: descriptive statistics

Main article: Descriptive statistics

Levels of measurement

Main article: Level of measurement
  • Qualitative (categorical)
    • Nominal
    • Ordinal
  • Quantitative (numerical)
    • Interval
    • Ratio
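As a rough illustration of why the level of measurement matters, the sketch below maps each level to the measures of central tendency usually considered meaningful for it. The names `LEVELS`, `is_quantitative` and `allowed_summaries` are hypothetical helpers invented for this example, and the mapping is the common textbook convention rather than a strict rule.

```python
from statistics import mean, median, mode

# Levels ordered from weakest to strongest (assumed ordering for this sketch).
LEVELS = ("nominal", "ordinal", "interval", "ratio")


def is_quantitative(level: str) -> bool:
    """Return True for the numerical levels (interval, ratio)."""
    if level not in LEVELS:
        raise ValueError(f"unknown level of measurement: {level!r}")
    return level in ("interval", "ratio")


def allowed_summaries(level: str) -> list[str]:
    """List the central-tendency summaries conventionally used at a level."""
    rank = LEVELS.index(level)  # raises ValueError for unknown levels
    summaries = ["mode"]        # meaningful at every level
    if rank >= 1:
        summaries.append("median")  # needs an ordering
    if rank >= 2:
        summaries.append("mean")    # needs meaningful differences
    return summaries


# Nominal data: only the mode is meaningful.
eye_color = ["brown", "blue", "brown", "green"]
print(allowed_summaries("nominal"), mode(eye_color))

# Interval data (temperatures in degrees Celsius): mean and median also apply.
temperatures = [18.5, 21.0, 19.5]
print(allowed_summaries("interval"), median(temperatures), mean(temperatures))
```

Ratio data additionally support statements such as "twice as large", which this sketch does not model.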

Graphical summaries

Main article: ?

Numerical summaries

Main article: Summary statistics

Data interpretation: inferential statistics

Main article: Statistical inference

Estimation

Main article: Statistical estimation

Prediction

Main article: Statistical prediction

Hypothesis testing

Main article: Statistical hypothesis testing

Relationships and modeling

Correlation

Regression

Time series

Data mining

Statistical practice and methods

Statistics in other fields

Subfields or specialties in statistics

Probability:

Related areas of mathematics

Also: Statistical physics

Typical course in mathematical probability

Below are the topics typically (?) covered in a one-year course introducing the mathematical theory of probability to undergraduate students in mathematics and statistics. (Actually, this list contains much more material than is typically covered in one year.)

Topics of a more advanced nature are italicized, including those typically only covered in mathematical statistics or graduate-level probability theory courses (e.g., topics requiring measure theory). See also "Typical course in mathematical statistics" below.

order?

  • And so on, and so forth...

Typical course in mathematical statistics

Would cover many of the topics from "Typical course in mathematical probability" outlined above, plus...

  • And so on, and so forth...

Typical course in applied statistics

Less theoretical than "Typical course in mathematical statistics" outlined above. (Sometimes portions of the following form the basis of a second statistics course for mathematics majors, or the third in the sequence if probability is the first course.)

  • And so on, and so forth...

Bayesian analysis

Hmm...

Terms from categorical data analysis

(By chapter: Agresti, 1990.)

  1. (none)
  2. contingency table, two-way table, two-way contingency table, cross-classification table, cross-tabulation, relative risk, odds ratio, concordant pair, discordant pair, gamma, Yule's Q, Goodman and Kruskal's tau, concentration coefficient, Kendall's tau-b, Somers' d, proportional prediction, proportional prediction rule, uncertainty coefficient, Gini concentration, entropy (variation measure), tetrachoric correlation, contingency coefficient, Pearson's contingency coefficient, log odds ratio, cumulative odds ratio, Goodman and Kruskal's lambda, observed frequency
  3. expected frequency, independent multinomial sampling, product multinomial sampling, overdispersion, chi-squared goodness-of-fit test, goodness-of-fit test, Pearson's chi-squared statistic, likelihood-ratio chi-squared statistic, partitioning chi-squared, Fisher's exact test, multiple hypergeometric distribution, Freeman-Halton p-value, phi-squared, power divergence statistic, minimum discrimination information statistic, Neyman modified chi-squared, Freeman-Tukey statistic, ...
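Several of the terms above (relative risk, odds ratio, log odds ratio, Yule's Q, expected frequency, Pearson's chi-squared statistic) can be illustrated on a single two-way table. The following is a minimal sketch using the standard textbook formulas for a 2x2 table. The function name `two_by_two_measures`, the table layout and the example counts are assumptions made for illustration and are not taken from Agresti.

```python
import math


def two_by_two_measures(a, b, c, d):
    """Association measures for the 2x2 contingency table [[a, b], [c, d]].

    Assumed layout: rows are two groups; the first column counts the outcome
    of interest and the second column counts its absence. All four observed
    frequencies must be positive for every measure here to be defined.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")

    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d

    # Ratio of the outcome proportions in the two rows.
    relative_risk = (a / row1) / (c / row2)
    # Cross-product ratio of the table.
    odds_ratio = (a * d) / (b * c)
    # Yule's Q rescales the odds ratio to the interval [-1, 1].
    yules_q = (a * d - b * c) / (a * d + b * c)

    # Expected frequencies under independence: row total * column total / n.
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    pearson_chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    return {
        "relative_risk": relative_risk,
        "odds_ratio": odds_ratio,
        "log_odds_ratio": math.log(odds_ratio),
        "yules_q": yules_q,
        "pearson_chi2": pearson_chi2,
    }


# Made-up counts: 10 of 100 in the first group and 5 of 100 in the second
# show the outcome.
for name, value in two_by_two_measures(10, 90, 5, 95).items():
    print(f"{name}: {value:.4f}")
```

The sketch stops at the test statistic. Turning it into a p-value would require the chi-squared distribution with one degree of freedom, which the Python standard library does not provide.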

Statistical software

List of statistical software or List of statistical software packages...

Commercial

Free versions of commercial software

  • Gnumeric — not a clone of Excel, but implements many of the same functions (can it use Excel add-ins?)
  • R — free version of S
  • FIASCO or PSPP — free version of SPSS

Other free software

Licensing unknown

World Wide Web

  • StatLib — large repository of statistical software and data sets

Online sources of data

See also

External link

References

  • Agresti, Alan (1990). Categorical Data Analysis. NY: John Wiley & Sons. →ISBN.
  • Casella, George & Berger, Roger L. (1990). Statistical Inference. Pacific Grove, CA: Wadsworth & Brooks/Cole. →ISBN.
  • DeGroot, Morris (1986). Probability and Statistics (2nd ed.). Reading, Massachusetts: Addison-Wesley. →ISBN.
  • Kempthorne, Oscar (1973). The Design and Analysis of Experiments. Malabar, FL: Robert E. Krieger Publishing Company. →ISBN. [Rpt.; orig. 1952, NY: John Wiley & Sons.]
  • Kuehl, Robert O. (1994). Statistical Principles of Research Design and Analysis. Belmont, CA: Duxbury Press. →ISBN.
  • Lentner, Marvin & Bishop, Thomas (1986). Experimental Design and Analysis. Blacksburg, VA: Valley Book Company. →ISBN.
  • Manoukian, Edward B. (1986). Modern Concepts and Theorems of Mathematical Statistics. NY: Springer-Verlag. →ISBN.
  • Mason, Robert L.; Gunst, Richard F.; and Hess, James L. (1989). Statistical Design and Analysis of Experiments: With Applications to Engineering and Science. NY: John Wiley & Sons. →ISBN.
  • Ross, Sheldon (1988). A First Course in Probability (3rd ed.). NY: Macmillan. →ISBN.

And eventually...

  • Berger, James O. (1985). Statistical Decision Theory and Bayesian Analysis (2nd ed.). NY: Springer-Verlag. →ISBN. (Also, Berlin: →ISBN.)
  • Berry, Donald A. (1996). Statistics: A Bayesian Perspective. Belmont, CA: Duxbury Press. →ISBN.
  • Feller, William (1950). An Introduction to Probability Theory and Its Applications, Vol. 1. NY: John Wiley & Sons. ISBN unknown. (Current: 3rd ed., 1968, NY: John Wiley & Sons, →ISBN.)
  • Feller, William (1971). An Introduction to Probability Theory and Its Applications, Vol. 2 (2nd ed.). NY: John Wiley & Sons. →ISBN.
  • Lehmann, E. L. [Eric Leo] (1991). Theory of Point Estimation. Pacific Grove, CA: Wadsworth & Brooks/Cole. →ISBN. (Orig. 1983, NY: John Wiley & Sons.)
  • Lehmann, E. L. [Eric Leo] (1994). Testing Statistical Hypotheses (2nd ed.). NY: Chapman & Hall. →ISBN. (Orig. 2nd ed., 1986, NY: John Wiley & Sons.)