Skewness is defined as asymmetry in the distribution of the sample data values: values on one side of the distribution tend to be further from the 'middle' than values on the other side. For skewed data, the usual measures of location will give different values; for example, mode < median < mean would indicate positive (or right) skewness.
Positive (or right) skewness is more common than negative (or left) skewness.
For univariate data Y1, Y2, ..., YN, the formula for skewness is:

skewness = Σ (Yi − Ȳ)³ / (N s³)

where Ȳ is the mean, s is the standard deviation, and N is the number of data points. The skewness of a normal distribution is zero, and any symmetric data should have skewness near zero. Negative values for the skewness indicate data that are skewed left, and positive values indicate data that are skewed right. By skewed left, we mean that the left tail is long relative to the right tail; similarly, skewed right means that the right tail is long relative to the left tail. Some measurements have a lower bound and are skewed right. For example, in reliability studies, failure times cannot be negative.
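The formula above is easy to compute directly. A minimal sketch in Python (the helper name `skewness` and the sample data are illustrative, not from the source):

```python
import math

def skewness(values):
    """Fisher-Pearson coefficient of skewness: the mean cubed deviation
    divided by the cube of the (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((y - mean) ** 2 for y in values) / n)
    return sum((y - mean) ** 3 for y in values) / (n * s ** 3)

# A right-skewed sample (long right tail), so mode < median < mean
data = [1, 2, 2, 3, 3, 3, 4, 5, 9, 14]
print(skewness(data))  # positive value -> skewed right
```

A symmetric sample gives a skewness of exactly zero with this formula, since the cubed deviations cancel in pairs.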
Properties of skewness
Skewness can be infinite, as when
Pr[X > x] = x^(−3) for x > 1,  Pr[X < 1] = 0
or undefined, as when
Pr[X < x] = (1 − x)^(−3) / 2 for negative x
Pr[X > x] = (1 + x)^(−3) / 2 for positive x.
In this latter example, the third cumulant is undefined. One can also have distributions such as
Pr[X > x] = x^(−2) for x > 1,  Pr[X < 1] = 0
where both the second and third cumulants are infinite, so the skewness is again undefined. If Y is the sum of n independent random variables, all with the same distribution as X, then the third cumulant of Y is n times that of X, and the second cumulant of Y is likewise n times that of X (cumulants are additive over independent sums), so Skew[Y] = n κ3 / (n κ2)^(3/2) = Skew[X] / √n. This shows that the skewness of the sum is smaller, as the sum approaches a Gaussian distribution in accordance with the central limit theorem.
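The 1/√n shrinkage can be checked empirically. A small simulation sketch (the choice of the Exponential(1) distribution, whose skewness is 2, and the sample sizes are illustrative assumptions):

```python
import math
import random

def sample_skewness(xs):
    """Fisher-Pearson skewness: mean cubed deviation over s^3."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

random.seed(0)
results = {}
# Exponential(1) has skewness 2, so a sum of n iid copies
# should have skewness about 2 / sqrt(n).
for n in (1, 4, 16, 64):
    sums = [sum(random.expovariate(1.0) for _ in range(n))
            for _ in range(20000)]
    results[n] = sample_skewness(sums)
    print(n, round(results[n], 2), "theory:", round(2 / math.sqrt(n), 2))
```

The estimated skewness falls roughly fourfold as n grows by a factor of sixteen, matching the 1/√n prediction.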