What is Statistics?

In short, statistics is used to make more effective decisions based on data.

Types of Data

Qualitative

Data is qualitative when it is non-numeric, such as blood type or letter grade.

Quantitative

Data is quantitative when it is represented as numbers.

We're mostly concerned with this type throughout the course. Quantitative data takes numeric values that are either discrete (counted) or continuous (measured).

Types of Statistical Studies

Descriptive Statistics

Consists of collecting, organizing, presenting, and summarizing data in an informative way.

The target is to describe a certain situation represented by a data set.

Inferential Statistics

Involves drawing conclusions and making decisions based on data from descriptive statistics.

The target is to make inferences about a certain situation represented by a data set.

Populations vs Samples

Population

Samples

So the major task of inferential statistics is to make conclusions about a whole population based on a sample!
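As a rough illustration of that idea, the sketch below draws a random sample from a population and uses the sample proportion as an estimate of the population proportion. The population here is made-up data, and the names `population`, `sample`, and `proportion` are hypothetical, chosen only for this example.

```python
import random

# Hypothetical population: blood types of 1,000 people (made-up data).
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = rng.choices(["A", "B", "AB", "O"], weights=[4, 1, 1, 4], k=1000)

# Draw a simple random sample of 100 people, without replacement.
sample = rng.sample(population, k=100)


def proportion(data, value):
    """Fraction of the items in data that are equal to value."""
    return data.count(value) / len(data)


# The sample proportion serves as an estimate of the population proportion.
print("population share of type O:", proportion(population, "O"))
print("sample share of type O:    ", proportion(sample, "O"))
```

The two printed values will usually be close but not identical, which is exactly the gap inferential statistics has to reason about.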

Descriptive Statistics Steps

Step 1: Organization

We organize data into a frequency distribution, which is a table consisting of 3 columns: class, frequency, and relative frequency.

Categorical Frequency Distribution

Used for qualitative and small quantitative data sets.

  1. Construct a table with a row for each class.
  2. Count the number of occurrences of each class and place the results in the second column.
  3. Fill the 3rd column, relative frequency, by dividing each frequency by the total number of data values.
  4. Check that the relative frequencies sum to one!
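The steps above can be sketched in Python as follows. This is a minimal example, not a prescribed method: the function name `categorical_frequency_distribution` and the blood type data are assumptions made for illustration.

```python
from collections import Counter


def categorical_frequency_distribution(data):
    """Return one (class, frequency, relative frequency) row per class."""
    counts = Counter(data)  # step 2: count occurrences of each class
    total = len(data)
    # step 3: relative frequency = frequency / total number of values
    return [(cls, freq, freq / total) for cls, freq in sorted(counts.items())]


# Hypothetical data set of 10 blood types.
blood_types = ["A", "O", "B", "O", "AB", "A", "O", "A", "B", "O"]
table = categorical_frequency_distribution(blood_types)

for cls, freq, rel in table:
    print(f"{cls:<6}{freq:<6}{rel:.2f}")

# step 4: the relative frequencies should sum to one
print("sum of relative frequencies:", sum(rel for _, _, rel in table))
```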

Grouped F.D.

Used for large quantitative data sets.

  1. Find the range of the data: $\text{range} = \text{high} - \text{low}$.
  2. Increase the range by 1 and divide by the number of classes to get the class width. Round up if the result is not an integer.
  3. Create class limits using the class width calculated above.
  4. Fill in the table keeping in mind the left-end inclusive convention.
  5. Compute the values of the relative frequencies.
  6. Sum of relative frequencies must be one!

The left-end inclusive convention states that every class is an interval of the form $[s, e)$, where the starting point is included but the end point is not.
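A minimal sketch of the grouped procedure is shown below. It assumes integer data, that the first class starts at the lowest value, and that the number of classes is given. The function name `grouped_frequency_distribution` and the scores are hypothetical.

```python
import math


def grouped_frequency_distribution(data, num_classes):
    """Return (start, end, frequency, relative frequency) rows for [start, end) classes."""
    low, high = min(data), max(data)
    # Class width: (range + 1) / number of classes, rounded up.
    width = math.ceil((high - low + 1) / num_classes)
    total = len(data)
    rows = []
    for i in range(num_classes):
        start = low + i * width
        end = start + width
        # Left-end inclusive convention: start is counted, end is not.
        freq = sum(1 for x in data if start <= x < end)
        rows.append((start, end, freq, freq / total))
    return rows


# Hypothetical exam scores (20 values).
scores = [52, 55, 58, 61, 64, 64, 67, 70, 71, 73,
          75, 78, 78, 81, 84, 85, 88, 90, 93, 99]
rows = grouped_frequency_distribution(scores, 5)

for start, end, freq, rel in rows:
    print(f"[{start}, {end})  {freq:<4}{rel:.2f}")
```

Adding 1 to the range before dividing means the last class always extends past the highest value, so the maximum is never dropped by the half-open intervals.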

Important Notes:

Step 2: Data Presentation

Data is represented by different types of graphs!

The following are used to present quantitative data.

Histograms

[Figure: histogram]

Ogives

[Figure: ogive]

Relative Frequency

Useful when comparing data sets of different sizes, where raw frequencies would be misleading.
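To see why raw frequencies can mislead, consider this small sketch with two made-up class sections of different sizes. The data and the helper `relative_frequencies` are assumptions for illustration only.

```python
from collections import Counter


def relative_frequencies(data):
    """Map each class to its share of the data set."""
    counts = Counter(data)
    total = len(data)
    return {cls: freq / total for cls, freq in counts.items()}


# Hypothetical letter grades from two sections of different sizes.
section_a = ["A"] * 10 + ["B"] * 30 + ["C"] * 10  # 50 students
section_b = ["A"] * 6 + ["B"] * 10 + ["C"] * 4    # 20 students

rel_a = relative_frequencies(section_a)
rel_b = relative_frequencies(section_b)

# Raw frequencies: section A has more A grades (10 versus 6).
print("A grades, raw:     ", section_a.count("A"), section_b.count("A"))
# Relative frequencies: section B has the larger share of A grades.
print("A grades, relative:", rel_a["A"], rel_b["A"])
```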

Frequency Polygons

[Figure: frequency polygon]

Pie Charts

Used to represent qualitative data!

[Figure: pie chart]

Step 3: Summarization

Data summarization involves extracting information about the general distribution of data.

Central Tendency

We're interested in a value that represents the center of the distribution!

We'll study three definitions under central tendency:

The median (MD)

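It's the midpoint of the entire data array. To determine the midpoint, sort the data in ascending order. If the number of values is odd, the median is the middle value. If it is even, the median is the average of the two middle values.

NOTE: The median doesn't need to be a data value!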
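A short sketch of finding the midpoint, assuming the usual rule of averaging the two middle values when the number of values is even. The function below is written out for illustration (Python's standard library also provides `statistics.median`).

```python
def median(data):
    """Return the midpoint of the sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the middle value itself.
        return ordered[mid]
    # Even number of values: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # odd count: the middle value, 5
print(median([7, 1, 5, 2]))  # even count: 3.5, which is not a data value
```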

The mode

It's the data value that has the highest frequency in a data set.
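The mode can be found by counting, as in the sketch below. Returning every value that ties for the highest frequency is an assumption of this example, since the definition above does not say how ties are handled. The function name `modes` is hypothetical.

```python
from collections import Counter


def modes(data):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(data)
    highest = max(counts.values())
    # Assumption: on a tie, all tied values are reported.
    return [value for value, freq in counts.items() if freq == highest]


print(modes(["O", "A", "O", "B"]))  # a single mode
print(modes([2, 3, 3, 5, 5, 1]))    # two values tie for the highest frequency
```

Unlike the median, this also works for qualitative data, because it only counts occurrences and never sorts or averages the values.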

Summary

Data Organization

If the number of classes is given

If an initial value is given but NO number of classes

If the number of classes is NOT given and NO initial value (Stem-Leaf)

Data Summary