Thursday, May 31, 2012

Measures of Central Tendency

 



Measures of central tendency are measures of the location of the middle or the center of a distribution. The definition of "middle" or "center" is purposely left somewhat vague so that the term "central tendency" can refer to a wide variety of measures. Measures of central tendency also attempt to quantify what we mean when we think of the "typical" or "average" score in a data set. The concept is extremely important and we encounter it frequently in daily life. For example, before purchasing a car we often want to know the average distance it travels per liter of diesel. Or before accepting a job, you might want to know what a typical salary is for people in that position so you will know whether or not you are going to be paid what you are worth. Or, if you are a smoker, you might often think about how many cigarettes you smoke "on average" per day and what your life span would be. Statistics geared toward measuring central tendency all focus on this concept of "typical" or "average."
The most common measures of central tendency are the mean, the median, and the mode. The examples here are for ungrouped data only; grouped data use different formulas.


Mean

The mean, or "average", is the most widely used measure of central tendency. The mean is defined technically as the sum of all the data scores divided by n (the number of scores in the distribution). In a sample, we often symbolize the mean with a letter with a line over it. If the letter is "X", then the mean is symbolized as X̄, pronounced "X-bar." If we use the letter X to represent the variable being measured, then symbolically the mean is defined as

$$\bar{X} = \frac{\sum X}{n}$$

For example, if the n = 5 values of X are 5, 7, 6, 1, and 8, the mean is (5 + 7 + 6 + 1 + 8) / 5 = 5.4. The mean number of sexual partners reported by UNE students who responded to the question is, from Figure 4.1, (1 + 0 + 2 + 4 + . . . + 0 + 6 + 2 + 2) / 177 = 1.864. Note that this is higher than both the mode and the median. In a positively skewed distribution, the mean will be higher than the median because its value will be dragged in the direction of the tail. Similarly, in a negatively skewed distribution, the mean will be dragged lower than the median because of the extremely small values in the left-hand tail.
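The definition above can be sketched in a few lines of Python. This is only an illustration for ungrouped data; the function name `mean` is our own choice, and the result is compared with `statistics.mean` from the Python standard library.

```python
import statistics


def mean(scores):
    """Return the sum of the scores divided by n, the number of scores."""
    if not scores:
        raise ValueError("mean of an empty data set is undefined")
    return sum(scores) / len(scores)


data = [5, 7, 6, 1, 8]          # the n = 5 values from the example
print(mean(data))               # 5.4
print(statistics.mean(data))    # the standard library gives the same value
```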

Median
 
The median is the numerical value separating the higher half of a sample, a population, or a probability distribution from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one, so that an equal number of points lie above and below it. If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values.
For example, in the data set {1,2,3,4,5} the median is 3; there are two data points greater than this value and two data points less than this value. In this case, the median is equal to the mean. But consider the data set {1,2,3,4,10}. In this data set, the median is still 3, but the mean is equal to 4. The single large value 10 pulls the mean upward while leaving the median unchanged.
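The sorting procedure can be written out as a short Python sketch. The function name `median` is our own, and the even-length case uses the usual convention of averaging the two middle values.

```python
def median(scores):
    """Sort the data and return the middle value.

    With an even number of observations, return the mean
    of the two middle values.
    """
    if not scores:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of the two middle values


print(median([1, 2, 3, 4, 5]))    # 3
print(median([1, 2, 3, 4, 10]))   # 3, although the mean of this set is 4
print(median([1, 2, 3, 4]))       # 2.5
```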

Mode

The mode is the most frequently occurring value in the data set. For example, in the data set {1,2,3,4,4}, the mode is equal to 4. A data set can have more than a single mode, in which case it is multimodal. In the data set {1,1,2,3,3} there are two modes: 1 and 3.
The mode can be very useful for dealing with categorical data. For example, if a sandwich shop sells 10 different types of sandwiches, the mode would represent the most popular sandwich. The mode also can be used with ordinal, interval, and ratio data. However, in interval and ratio scales, the data may be spread thinly with no data points having the same value. In such cases, the mode may not exist or may not be very meaningful.
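A minimal Python sketch of finding the mode, including the multimodal case, could look like the following. The function name `modes` and the sandwich names are made up for illustration; `collections.Counter` is part of the Python standard library.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # Keep the values in the order in which they first appear in the data.
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 3, 4, 4]))   # [4]
print(modes([1, 1, 2, 3, 3]))   # [1, 3]

# Categorical data: hypothetical sandwich orders
orders = ["ham", "tuna", "ham", "egg", "ham", "tuna"]
print(modes(orders))            # ['ham']
```

If every value occurs exactly once, this function returns all of the values, which reflects the case where the mode is not very meaningful.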
