- The Language Of Physics Is Mathematics-


Friday, 13 July 2012


Data

We collect information in order to draw conclusions, to make business decisions or to predict trends.  Information that has been gathered in some way is called data.  We may collect data ourselves or obtain data from an existing source.


Surveys

The collection of information for a specific purpose is called a survey.  Data is needed to help solve a problem.  If it is not already available, we collect it by conducting a survey.  The main methods of collecting data are by observation, interview and questionnaire.  Some surveys use a combination of these.


The following terms are often used when we collect data:
Population
The population is the whole set of items under consideration.  For example, if information on the best-selling book in Australia is required, the population is all books on sale in bookshops in Australia.

Sample
A sample is drawn from a population.  That is, a sample is a part of the population that has been selected in order to find information about the whole population.


For example, a store manager wants to know the favourite washing powder of customers in his area, so that he can stock the more popular brands.  He decides to conduct a survey and chooses the first 200 customers as a sample group, asking each of them the question 'What is your favourite washing powder?'


When selecting a sample, two things are taken into consideration:
  • The sample should be representative of the population (i.e. it should have the same traits as the population).
  • All items of the population should have the same chance of being selected for the sample.
If a sample has these two features, it should be a fair representation of the population and unbiased towards particular sub-groups in the population.
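
The second condition (every item has the same chance of being selected) can be sketched in Python with the standard library's random.sample.  This is only an illustration: the function name, the customer ID numbers and the seed are assumptions, not part of the survey described above.

```python
import random

def draw_simple_random_sample(population, sample_size, seed=None):
    """Return sample_size distinct items, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(population), sample_size)

# Hypothetical population: customer ID numbers 1 to 1000.
customers = range(1, 1001)
sample = draw_simple_random_sample(customers, 200, seed=1)
print(len(sample))  # 200
```

An equal chance of selection does not guarantee that a particular sample is representative, but it avoids favouring any sub-group when the sample is drawn.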


Mean, Median and Mode



We use statistics such as the mean, median and mode to obtain information about a population from our sample set of observed values.


Mean

The mean (or average) of a set of data values is the sum of all of the data values divided by the number of data values.  That is:

$$ \text{Mean} = \frac{\text{Sum of all data values}}{\text{Number of data values}} $$


Example 1

The marks of seven students in a mathematics test with a maximum possible mark of 20 are given below:


     15     13     18     16     14     17     12


Find the mean of this set of data values.
Solution:

$$ \text{Mean} = \frac{15 + 13 + 18 + 16 + 14 + 17 + 12}{7} = \frac{105}{7} = 15 $$

So, the mean mark is 15.


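Symbolically, we can set out the solution as follows, where $\bar{x}$ is the mean, $\sum x$ is the sum of the data values and $n$ is the number of data values:

$$ \bar{x} = \frac{\sum x}{n} = \frac{105}{7} = 15 $$

So, the mean mark is 15.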
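
The calculation in Example 1 can be checked with a short Python sketch.  The function name mean is our own choice for illustration.

```python
def mean(values):
    """Sum of the data values divided by the number of data values."""
    return sum(values) / len(values)

marks = [15, 13, 18, 16, 14, 17, 12]  # the seven marks from Example 1
print(mean(marks))  # 15.0
```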


Median

The median of a set of data values is the middle value of the data set when it has been arranged in ascending order, that is, from the smallest value to the largest value.


Example 2

The marks of nine students in a geography test that had a maximum possible mark of 50 are given below:


     47     35     37     32     38     39     36     34     35


Find the median of this set of data values.
Solution:
Arrange the data values in order from the lowest value to the highest value:


     32     34     35     35     36     37     38     39     47


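The fifth data value, 36, is the middle value in this arrangement.  So, the median is 36.


Note:
The number of values in the data set is n = 9, which is odd.  The middle value is the ½(9 + 1)th value, that is, the 5th value.


In general:

If the number of values, n, in the data set is odd, then the median is the middle value of the ordered list:

$$ \text{Median} = \tfrac{1}{2}(n + 1)\text{th value} $$

If the number of values in the data set is even, then the median is the average of the two middle values.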


Example 3

Find the median of the following data set:


     12     18     16     21     10     13     17     19
Solution:
Arrange the data values in order from the lowest value to the highest value:


     10     12     13     16     17     18     19     21


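The number of values in the data set is 8, which is even.  So, the median is the average of the two middle values:

$$ \text{Median} = \frac{16 + 17}{2} = 16.5 $$


Alternative way:
There are 8 values in the data set, so the position of the median is

$$ \tfrac{1}{2}(8 + 1) = 4.5 $$

The fourth and fifth scores, 16 and 17, are in the middle.  That is, there is no one middle value, so the median is the average of these two scores, (16 + 17) ÷ 2 = 16.5.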




Note:
  • Half of the values in the data set lie below the median and half lie above the median.
  • The median is the figure most commonly quoted for property prices.  Unlike the mean, it is not distorted by a few expensive properties that are not representative of the general property market.
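
The two cases for the median (an odd or an even number of values) can be sketched in Python as follows.  The function name median is our own, and the data are the marks from Examples 2 and 3.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[middle]
    # even count: average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([47, 35, 37, 32, 38, 39, 36, 34, 35]))  # 36
print(median([12, 18, 16, 21, 10, 13, 17, 19]))      # 16.5
```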


Mode

The mode of a set of data values is the value(s) that occurs most often.

The mode has applications in printing.  For example, it is important to print more of the most popular books, because printing different books in equal numbers would cause a shortage of some books and an oversupply of others.
Likewise, the mode has applications in manufacturing.  For example, it is important to manufacture more of the most popular shoes, because manufacturing different shoes in equal numbers would cause a shortage of some shoes and an oversupply of others.


Example 4

Find the mode of the following data set:


     48     44     48     45     42     49     48
Solution:
The mode is 48 since it occurs most often.

Note:
  • It is possible for a set of data values to have more than one mode.
  • If there are two data values that occur most frequently, we say that the set of data values is bimodal.
  • If no data value occurs more often than the others, we say that the set of data values has no mode.
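
The cases in the note above can be sketched in Python using collections.Counter.  The function name modes is our own, and treating "every value occurs equally often" as "no mode" is our reading of the last point, so take it as an assumption.  Only the first data set comes from Example 4; the other two are made up for illustration.

```python
from collections import Counter

def modes(values):
    """Return the list of most frequent values (empty if there is no mode)."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # Assumption: if several distinct values all occur equally often, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([48, 44, 48, 45, 42, 49, 48]))  # [48]    (Example 4)
print(modes([1, 1, 2, 2, 3]))               # [1, 2]  (bimodal, made-up data)
print(modes([1, 2, 3]))                     # []      (no mode, made-up data)
```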

Frequency and Frequency Tables

The frequency of a particular data value is the number of times the data value occurs.
For example, if four students have a score of 80 in mathematics, then the score of 80 is said to have a frequency of 4.  The frequency of a data value is often represented by f.


A frequency table is constructed by arranging collected data values in ascending order of magnitude with their corresponding frequencies.


Example 5

The marks awarded for an assignment set for a Year 8 class of 20 students were as follows:


     6     7     5     7     7     8     7     6     9     7
     4     10    6     8     8     9     5     6     4     8


Present this information in a frequency table.
Solution:
To construct a frequency table, we proceed as follows:

Step 1:

Construct a table with three columns.  The first column shows what is being arranged in ascending order (i.e. the marks).  The lowest mark is 4.  So, start from 4 in the first column as shown below.




Step 2:

Go through the list of marks.  The first mark in the list is 6, so put a tally mark against 6 in the second column.  The second mark in the list is 7, so put a tally mark against 7 in the second column.  The third mark in the list is 5, so put a tally mark against 5 in the second column as shown below.




We continue this process until all marks in the list are tallied.

Step 3:

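Count the number of tally marks for each mark and write it in the third column.  The finished frequency table is as follows:

     Mark     Tally      Frequency
     4        ||         2
     5        ||         2
     6        ||||       4
     7        ||||/      5
     8        ||||       4
     9        ||         2
     10       |          1
     Total               20

Here ||||/ stands for a group of five tally marks (four strokes with a line drawn through them).
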

In general:
We use the following steps to construct a frequency table:

Step 1:

Construct a table with three columns.  Then in the first column, write down all of the data values in ascending order of magnitude.

Step 2:

To complete the second column, go through the list of data values and place one tally mark at the appropriate place in the second column for every data value.  When the fifth tally is reached for a mark, draw a horizontal line through the first four tally marks as shown for 7 in the above frequency table.  We continue this process until all data values in the list are tallied.

Step 3:

Count the number of tally marks for each data value and write it in the third column.
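
The three steps can be sketched in Python, with collections.Counter doing the tallying.  The function name frequency_table is our own, and the data are the 20 marks from Example 5.

```python
from collections import Counter

marks = [6, 7, 5, 7, 7, 8, 7, 6, 9, 7,
         4, 10, 6, 8, 8, 9, 5, 6, 4, 8]

def frequency_table(values):
    """Return (data value, frequency) pairs in ascending order of data value."""
    counts = Counter(values)  # tallies every data value
    return [(value, counts[value]) for value in sorted(counts)]

# Print one row per mark: the mark, a plain run of tally strokes, the frequency.
for mark, frequency in frequency_table(marks):
    print(f"{mark:>4}  {'|' * frequency:<6} {frequency}")
```

The frequencies should add up to the number of data values (20 here), which is a quick check that no mark was missed.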


Class Intervals (or Groups)

When the set of data values is spread out, it is difficult to set up a frequency table for every data value as there will be too many rows in the table.  So we group the data into class intervals (or groups) to help us organise, interpret and analyse the data.

Ideally, we should have between five and ten rows in a frequency table.  Bear this in mind when deciding the size of the class interval (or group).

Each group starts at a data value that is a multiple of the group size.  For example, if the size of the group is 5, then the groups should start at 5, 10, 15, 20 etc.  Likewise, if the size of the group is 10, then the groups should start at 10, 20, 30, 40 etc.
The frequency of a group (or class interval) is the number of data values that fall in the range specified by that group (or class interval).
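
As a rough sketch, the group that a data value belongs to can be found with integer division.  The function names are our own, and the code assumes whole-number, non-negative data.  The three values used are the first three quoted in Example 6 below, with its group size of 40; the full data set of that example is not reproduced here.

```python
from collections import Counter

def class_interval(value, size):
    """Return (lowest, highest) value of the group containing value."""
    start = (value // size) * size  # largest multiple of size not above value
    return (start, start + size - 1)

def grouped_frequencies(values, size):
    """Count how many values fall in each group (empty groups are not listed)."""
    counts = Counter(class_interval(v, size) for v in values)
    return dict(sorted(counts.items()))

for v in [28, 122, 217]:
    low, high = class_interval(v, 40)
    print(f"{v} -> {low}-{high}")
# 28 -> 0-39
# 122 -> 120-159
# 217 -> 200-239
```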


Example 6

The number of calls from motorists per day for roadside service was recorded for the month of December 2003.  The results were as follows:


Set up a frequency table for this set of data values.
Solution:
To construct a frequency table, we proceed as follows:


Step 1:  Construct a table with three columns, and then write the data groups or class intervals in the first column.  The size of each group is 40.  So, the groups will start at 0, 40, 80, 120, 160 and 200 to include all of the data.  Note that in fact we need 6 groups (1 more than we first thought).




Step 2:  Go through the list of data values.  For the first data value in the list, 28, place a tally mark against the group 0-39 in the second column.  For the second data value in the list, 122, place a tally mark against the group 120-159 in the second column.  For the third data value in the list, 217, place a tally mark against the group 200-239 in the second column.




We continue this process until all of the data values in the set are tallied.

Step 3:  Count the number of tally marks for each group and write it in the third column.  The finished frequency table is as follows:





Frequency Tables and the Mean

To find the mean of a large set of data values, we can use a frequency table.  Add an extra column to the frequency table and label it Frequency × Data Value.  Then calculate the sum of the values in this fourth column and use it to find the mean.


Example 7

A computer repair service received the following number of calls per day over a period of 30 days.


     6     5     6     9     7     4     2     4     7     8
     3     4     9     8     2     3     5     9     7     8
     9     7     5     6     7     7     4     6     2     4


Using a frequency table, find the average number of calls per day.  Round your answer to 1 decimal place.
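Solution:

     Calls per day (x)     Frequency (f)     f × x
     2                     3                 6
     3                     2                 6
     4                     5                 20
     5                     3                 15
     6                     4                 24
     7                     6                 42
     8                     3                 24
     9                     4                 36
     Total                 30                173

$$ \text{Mean} = \frac{\text{Sum of (Frequency} \times \text{Data Value)}}{\text{Sum of frequencies}} = \frac{173}{30} \approx 5.8 $$

So, the average number of calls per day is 5.8.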
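
The result of Example 7 can be checked with a short Python sketch that builds the frequency table and adds up the Frequency × Data Value column.  The variable names are our own.

```python
from collections import Counter

calls = [6, 5, 6, 9, 7, 4, 2, 4, 7, 8,
         3, 4, 9, 8, 2, 3, 5, 9, 7, 8,
         9, 7, 5, 6, 7, 7, 4, 6, 2, 4]

table = sorted(Counter(calls).items())            # (data value, frequency) rows
total_fx = sum(value * f for value, f in table)   # sum of the fourth column
total_f = sum(f for _, f in table)                # number of data values
mean_calls = total_fx / total_f

print(total_fx, total_f, round(mean_calls, 1))  # 173 30 5.8
```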




Displaying Data

Statistical graphs are often used to display the data values of a random sample.  In the following sections, we will consider bar charts, pie charts and line graphs.


Bar Charts

Bar charts are often used to present data in a pictorial form to illustrate the information collected and highlight important points.  They are especially useful to depict monthly car production, monthly sales, quarterly profit, average annual rainfall etc.  A bar chart provides a useful comparison of data over time.  The height of each bar shows the total amount of the item of interest for each month (or year).
Bar charts are drawn with parallel bars placed vertically (or horizontally).  The width of each bar and the spacing between the bars are kept the same to avoid giving a misleading representation.  The height of the bar is drawn to scale to represent the amount of the item.


Example 8

The yearly production of cars by a particular company is recorded as follows:





Draw a bar chart to display this information.
Solution:



Thus, bars of equal width whose heights are in the ratio of 4 : 5 : 8 will represent the company's yearly production.
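
Drawing bars to scale amounts to multiplying every amount by the same factor.  The sketch below uses only the ratio 4 : 5 : 8 stated above, since the production figures themselves are not reproduced here.  The function name, the year labels and the choice of 16 units for the tallest bar are assumptions for illustration.

```python
def bar_heights(amounts, tallest=16):
    """Scale the amounts so that the largest one gets a bar `tallest` units high."""
    largest = max(amounts)
    return [amount * tallest / largest for amount in amounts]

ratio = [4, 5, 8]                       # ratio of the yearly production figures
labels = ["Year 1", "Year 2", "Year 3"]  # hypothetical labels

# Horizontal text bars of equal width, one per year.
for label, height in zip(labels, bar_heights(ratio)):
    print(f"{label}  {'#' * round(height)}")
```

Because every bar is scaled by the same factor, the bar lengths (8, 10 and 16 units) stay in the ratio 4 : 5 : 8.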







Pie Charts

Pie charts are useful to compare different parts of a whole amount.  They are often used to present financial information.  For example, a company's expenditure can be shown as the sum of its parts, with expense categories such as salaries, borrowing interest, taxation and general running costs (e.g. rent, electricity, heating etc).

A pie chart is a circular chart in which the circle is divided into sectors.  Each sector visually represents an item in a data set to match the amount of the item as a percentage or fraction of the total data set.


Example 9

A family's weekly expenditure on its house mortgage, food and fuel is as follows:





Draw a pie chart to display the information.
Solution:



We can find what percentage of the total expenditure each item equals.
Percentage of weekly expenditure on:



To draw a pie chart, divide the circle into 100 percentage parts.  Then allocate the number of percentage parts required for each item.
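
The percentage calculation can be sketched in Python as follows.  The weekly amounts are hypothetical (they are not the figures from Example 9), and the function name is our own.  The sketch also gives each sector's angle in degrees, on the assumption that the chart is drawn with a protractor rather than a 100-part circle.

```python
def pie_sectors(amounts):
    """Map each item to (percentage of the total, sector angle in degrees)."""
    total = sum(amounts.values())
    sectors = {}
    for item, amount in amounts.items():
        fraction = amount / total
        sectors[item] = (fraction * 100, fraction * 360)
    return sectors

# Hypothetical weekly expenditure in dollars.
weekly = {"Mortgage": 300, "Food": 225, "Fuel": 75}

for item, (percentage, angle) in pie_sectors(weekly).items():
    print(f"{item}: {percentage:.1f}% of the circle, {angle:.0f} degrees")
```

The percentages should add up to 100 (and the angles to 360), which is a useful check before drawing the chart.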







Line Graphs

A line graph is often used to represent a set of data values in which a quantity varies with time.  These graphs are useful for finding trends, that is, general patterns in data such as temperature, sales, employment, company profit or cost over a period of time.


Example 10

A cylinder of liquid was heated.  Its temperature was recorded at ten-minute intervals as shown in the following table.





a.  Draw a line graph to represent this information.


b.  Estimate the temperature of the cylinder after 25 minutes of heating.
Solution:



b.  The estimated temperature after 25 minutes of heating is 52°C.
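
Reading a value off a line graph between two plotted points is the same as estimating along the straight line that joins them.  A minimal Python sketch of that idea follows.  The function name and the two readings are hypothetical; they are not the readings from Example 10, so the result differs from the 52°C found above.

```python
def estimate_between(t0, y0, t1, y1, t):
    """Straight-line estimate at time t between the points (t0, y0) and (t1, y1)."""
    fraction = (t - t0) / (t1 - t0)  # how far t lies between t0 and t1
    return y0 + fraction * (y1 - y0)

# Hypothetical readings: 40°C at 20 minutes and 60°C at 30 minutes.
print(estimate_between(20, 40.0, 30, 60.0, 25))  # 50.0
```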
