Statistics is important in many different areas of study and business.  

We look at statistics to see how a business is doing, how the economy is growing, what a psychology experiment shows, how children perform at school, and much more.  A working knowledge of statistics is essential to many jobs and careers.

Statistics is used in every sub-discipline of applied science, and it is important in the planning and management of any business or enterprise.

Statistics are scientific methods employed to gather, organise and analyse data.  We are able to draw conclusions and make inferences on the basis of such analyses.  Descriptive statistics describe a set of data, while inferential statistics make inferences about large groups based on data from a smaller subset of the group.  To infer means to draw a conclusion based on facts or premises.  Thus an inference is the end result: a proposition reached by the act of inferring.

Distributions

Distributions are a way of displaying the chaos of numbers in an organised manner.  A frequency distribution is simply a table and often a graph that, at minimum, displays how many times in a set of data each response or "score" occurs.

Often the first stage of assessing data, to obtain descriptive statistics, is to plot the frequency distribution. This also illustrates the range (the spread from the largest to the smallest values) that the data from a sample of a test population occupies, and how ‘concentrated’ the data is within that range. The value that occurs most often has the highest frequency.  The data should be sorted in ascending order, and then it can be grouped into small ranges.  Each small range is known as a class or a category.

A frequency is simply the count of the number of data values within each class or category. The data can be plotted discretely if the data set is large enough, or in groups to show the frequency of classes.  In the table below, the classes that the data is grouped into are listed under Test Scores, and the frequency of the data occurring in each class is listed in the column called Frequency.

General Rules for creating a Frequency Distribution

When we look at unorganised data it is often overwhelming and no pattern can be drawn out of it.  In order to organise our data there are some general rules to follow: 

  • Find the largest and smallest numbers in the data.  From this find the range.  Range = largest number – smallest number.

  • Divide the range into a convenient number of class intervals. For example, test results would normally have intervals of 10, ranging from 0 through to 100.  The intervals must all be the same size and are generally picked so that their midpoints coincide with actual data values.

  • Count the number of data values that fall within each interval.

  • Graph the results making sure you include a title, label each axis, add interval values and draw in the data.
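
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive method: the scores are made up, the function name frequency_distribution is hypothetical, and it assumes whole-number scores grouped into intervals of 10.

```python
from collections import Counter

def frequency_distribution(scores, width=10):
    """Group whole-number scores into equal-width classes and count each class."""
    lowest, highest = min(scores), max(scores)
    data_range = highest - lowest  # Range = largest number - smallest number

    # Map each score to the lower bound of its class, e.g. 67 -> 60.
    counts = Counter((score // width) * width for score in scores)

    # List every class between the lowest and highest score,
    # including empty ones, so gaps in the data stay visible.
    first = (lowest // width) * width
    last = (highest // width) * width
    table = [(low, low + width - 1, counts.get(low, 0))
             for low in range(first, last + width, width)]
    return data_range, table

scores = [55, 62, 67, 71, 74, 78, 78, 83, 85, 91]  # hypothetical test scores
data_range, table = frequency_distribution(scores)

print("Range:", data_range)
for low, high, count in table:
    # A row of '#' characters serves as a rough text-only graph.
    print(f"{low}-{high}: {'#' * count} ({count})")
```

With this grouping a score of exactly 100 would fall into a class of its own (100-109), so real test data may call for a slightly different choice of class boundaries.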

 

Central Tendency

Central tendency gives a single description of the average or "typical" score in the distribution, while variability quantifies how "spread out" the scores are in the distribution.

Measures of central tendency, or "location", attempt to quantify what we mean when we think of the "typical" or "average" score in a data set. The concept is extremely important and we encounter it frequently in daily life. For example, before purchasing a car we often want to know its average distance per litre of petrol. Or before accepting a job, you might want to know what a typical salary is for people in that position so you will know whether or not you are going to be paid what you are worth. Or, if you are a smoker, you might often think about how many cigarettes you smoke "on average" per day. Statistics geared toward measuring central tendency all focus on this concept of "typical" or "average."

Measuring Central Tendency

Range  

Range is the spread from the largest to the smallest values.  Strictly speaking, it describes variability rather than central tendency.

Percentiles

Percentiles divide the ordered values into 100 equal parts.  A given percentile is the value below which that percentage of the data falls.

Quartiles

Quartiles divide the ordered values into quarters, with the middle 50% of the values lying between the lower and upper quartiles.
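
As a rough illustration of range, percentiles and quartiles, the sketch below uses Python's standard statistics module on a made-up set of test scores. Several conventions exist for interpolating percentiles; the "inclusive" method used here is only one of them, chosen as an assumption for the example.

```python
import statistics

scores = [55, 62, 67, 71, 74, 78, 78, 83, 85, 91]  # hypothetical test scores

# Range: the spread from the largest to the smallest value.
data_range = max(scores) - min(scores)

# quantiles() returns the n - 1 cut points that split the data into n parts.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
percentiles = statistics.quantiles(scores, n=100, method="inclusive")
p90 = percentiles[89]  # index 89 holds the 90th percentile

print(data_range)   # 36
print(q1, q2, q3)   # 68.0 76.0 81.75
print(q3 - q1)      # 13.75, the spread covered by the middle 50% of scores
print(p90)          # 85.6
```

Other methods, such as the module's default "exclusive" option, give slightly different cut points for small data sets.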

Mode

The mode is by far the simplest, but also the least widely used, measure of central tendency. The mode in a distribution of data is simply the score that occurs most frequently.
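
A short sketch of finding the mode with Python's standard statistics module follows. The scores and colour responses are invented for the example.

```python
import statistics

scores = [55, 62, 67, 71, 74, 78, 78, 83, 85, 91]  # hypothetical test scores
most_common_score = statistics.mode(scores)
print(most_common_score)  # 78, the only score that occurs twice

# The mode only needs counting, so it also works for qualitative data.
colours = ["red", "blue", "blue", "green"]  # hypothetical survey responses
favourite = statistics.mode(colours)
print(favourite)  # blue

# When several values tie for most frequent, multimode() returns all of them.
tied = statistics.multimode([1, 1, 2, 2, 3])
print(tied)  # [1, 2]
```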

Median

Technically, the median of a distribution is the value that cuts the distribution exactly in half, such that as many scores are larger than that value as are smaller than it. The median is by definition what we call the 50th percentile. This is an ideal definition: often distributions cannot be cut exactly in half in this way, but we can still define the median in the distribution. Distributions of qualitative data do not have a median. The median is most easily computed by sorting the data in the data set from smallest to largest. The median is then the "middle" score in the distribution (with an even number of scores, the mean of the two middle scores is conventionally used).
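
The sort-and-take-the-middle procedure can be sketched as below. The function name median_of is hypothetical, the scores are invented, and averaging the two middle scores when the count is even is a common convention assumed here.

```python
def median_of(values):
    """Return the middle score of the sorted data."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)  # smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle score.
        return ordered[mid]
    # Even count: average the two scores either side of the halfway point.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([3, 1, 2]))  # 2
print(median_of([55, 62, 67, 71, 74, 78, 78, 83, 85, 91]))  # 76.0
```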

Mean

The mean, or "average", is the most widely used measure of central tendency. The mean is defined technically as the sum of all the data scores divided by n (the number of scores in the distribution). In a sample, we often symbolise the mean with X̄, pronounced "X-bar."
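
The definition translates directly into code. The following is a minimal sketch using invented scores; the variable name x_bar is just an illustrative label for the sample mean.

```python
def mean(values):
    """Return the sum of all the scores divided by n, the number of scores."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / n

scores = [55, 62, 67, 71, 74, 78, 78, 83, 85, 91]  # hypothetical test scores
x_bar = mean(scores)
print(x_bar)  # 74.4 (the scores sum to 744 and n is 10)
```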
