Introduction to statistics and probability
Statistics is a mathematical science concerned with the collection, presentation, analysis and interpretation of data.
An analysis of any situation can be done in two ways:
- Statistical analysis
- Non-statistical analysis
Statistical analysis is the science of collecting, exploring and presenting large amounts of data to identify patterns and trends. It is also called quantitative analysis.
Non-statistical analysis provides generic information and draws on text, sound, still images and moving images. It is also called qualitative analysis.
There are two major categories of statistics:
- Descriptive statistics
- Inferential statistics
Descriptive statistics helps organize data and focuses on the main characteristics of the data. It also provides a summary of the data, either numerically or graphically.
Inferential statistics generalizes from a sample to the larger dataset and applies probability theory to draw conclusions. It allows you to infer population parameters from sample statistics and to model relationships within the data.
Differences between descriptive and inferential statistics
Descriptive statistics
- Organizing and summarizing data using numbers and graphs.
- Data summary: bar graphs, histograms, pie charts etc., including the shape of the graph and its skewness.
- Measures of central tendency: mean, median and mode.
- Measures of variability: range, variance and standard deviation.
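The measures listed above can be computed with Python's standard `statistics` module. The sketch below is only an illustration: the `scores` list is invented data, not taken from these notes, and it treats the values as a sample, so the variance divides by n - 1.

```python
import statistics

# Hypothetical sample of exam scores, invented for illustration.
scores = [62, 70, 70, 75, 81, 88, 94]

# Measures of central tendency
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

# Measures of variability
value_range = max(scores) - min(scores)  # largest minus smallest value
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)       # square root of the sample variance

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={value_range} variance={variance:.2f} std_dev={std_dev:.2f}")
```

If the data were a whole population rather than a sample, `statistics.pvariance` and `statistics.pstdev` would be the matching functions.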
Inferential statistics
- Using sample data to make an inference or draw a conclusion about the population.
- Uses probability to determine how confident we can be that the conclusions we make are correct (confidence intervals & margins of error).
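As a rough sketch of how a confidence interval and margin of error come out of sample data, the example below builds a 95% interval for a mean. Everything in it is an assumption made for illustration: the sample is simulated, and the interval uses the normal approximation (z of about 1.96), which is reasonable for a sample of this size but would usually be replaced by a t-based interval for small samples.

```python
import math
import random
import statistics

# Hypothetical sample: 100 simulated measurements (invented data).
rng = random.Random(42)
sample = [rng.gauss(50, 5) for _ in range(100)]

sample_mean = statistics.mean(sample)
# Standard error of the mean: sample standard deviation / sqrt(n)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# z value that leaves 2.5% in each tail, i.e. 95% confidence
z = statistics.NormalDist().inv_cdf(0.975)

margin_of_error = z * standard_error
ci = (sample_mean - margin_of_error, sample_mean + margin_of_error)

print(f"sample mean     = {sample_mean:.2f}")
print(f"margin of error = {margin_of_error:.2f}")
print(f"95% CI          = ({ci[0]:.2f}, {ci[1]:.2f})")
```

A wider interval means less certainty about where the population mean lies; a larger sample shrinks the standard error and therefore the margin of error.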
Statistics are characteristics that describe a sample.
Parameters are characteristics that describe a population.
A sample is a subset of the population.
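The distinction between a parameter and a statistic can be illustrated with a small sketch. The population here is hypothetical (the integers 1 to 1000), chosen only so that its mean is known exactly; in practice the population is usually too large to measure in full, which is why a sample is used.

```python
import random
import statistics

# Hypothetical population: the integers 1..1000 (invented for illustration).
population = list(range(1, 1001))

# Parameter: describes the whole population.
population_mean = statistics.mean(population)

# A sample is a subset of the population, drawn here at random
# without replacement.
rng = random.Random(0)
sample = rng.sample(population, 50)

# Statistic: describes the sample and estimates the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean) = {population_mean}")
print(f"statistic (sample mean)     = {sample_mean:.2f}")
```

The sample mean will generally differ a little from the population mean, and that gap is what inferential statistics tries to quantify.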
Scales of measurement
Nominal scale data
- Qualitative/categorical.
- Names, colours, labels, gender etc.
- Order does not matter.
Ordinal scale data
- Ranking/placement.
- The order matters.
- Difference cannot be measured.
Interval scale data
- The order matters.
- Differences can be measured, but ratios are not meaningful.
- No true ‘0’ starting point.
Ratio scale data
- The order matters.
- Differences are measurable, and so are ratios.
- Has a true ‘0’ starting point.
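A short worked example may help show why ratios are meaningful only on a ratio scale. It uses temperature, which is not mentioned in these notes and is brought in purely for illustration: Celsius is an interval scale (its zero is arbitrary), while Kelvin is a ratio scale (its zero is a true zero).

```python
import math

def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

# Two hypothetical temperature readings in degrees Celsius.
a_c, b_c = 10.0, 20.0

# Differences agree on both scales: the gap is 10 degrees either way.
diff_c = b_c - a_c
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)

# Ratios do not agree: 20 C is not "twice as hot" as 10 C.
ratio_c = b_c / a_c                                        # 2.0, not meaningful
ratio_k = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)  # about 1.035

print(f"differences equal: {math.isclose(diff_c, diff_k)}")
print(f"Celsius ratio = {ratio_c:.3f}, Kelvin ratio = {ratio_k:.3f}")
```

The difference survives the change of scale, but the ratio does not, which is exactly the property that separates interval data from ratio data.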
Property | Nominal | Ordinal | Interval | Ratio |
Labelled | yes | yes | yes | yes |
Meaningful order | no | yes | yes | yes |
Measurable difference | no | no | yes | yes |
True ‘0’ starting point | no | no | no | yes |