Outliers
An outlier is an observation that lies outside the overall pattern of a distribution and is numerically distant from the rest of the data. Grubbs defines an outlier as "one that appears to deviate markedly from other members of the sample in which it occurs." The presence of an outlier often indicates some sort of problem: outlier points can indicate faulty data, erroneous procedures, or areas where a certain theory might be invalid. An outlier can arise from experimental or measurement error, or sometimes from a long-tailed population.
If the cause is measurement error, identifying and removing the outlying data is widely preferred, because outliers affect the accuracy and appropriateness of a result. The quartile method is one way to identify outliers.
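As a minimal sketch of the quartile method mentioned above, the following Python snippet flags values that fall more than 1.5 interquartile ranges outside the first and third quartiles. The 1.5 multiplier is a common convention and an assumption here, since the text does not specify a cutoff; the function name `iqr_outliers` and the sample readings are made up for illustration.

```python
import statistics

def iqr_outliers(data, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Made-up sample with one suspicious reading.
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(readings))  # [102]
```

Different quartile conventions give slightly different fences on small samples, so a borderline value may be flagged under one convention and not another.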
If the cause is a long-tailed population, the distribution has a high kurtosis (kurtosis measures how heavy the tails of the probability distribution of a real-valued random variable are; it is often loosely described as peakedness). In that case, one must be cautious about using tools or intuitions that assume a normal distribution.
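To make the idea of high kurtosis concrete, here is a small sketch that computes the excess kurtosis of a sample from its second and fourth central moments. It uses the simple population-moment formula (an assumption; other estimators apply bias corrections), and the name `excess_kurtosis` and both samples are illustrative only. A normal distribution has excess kurtosis 0, so clearly positive values suggest heavier tails.

```python
def excess_kurtosis(data):
    """Excess kurtosis from population moments: m4 / m2**2 - 3."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Made-up samples: one evenly spread, one with two extreme values.
flat = [1, 2, 3, 4, 5]
heavy = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]
print(excess_kurtosis(flat))   # negative: lighter tails than a normal
print(excess_kurtosis(heavy))  # positive: heavier tails than a normal
```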
Causes Of Outliers:
The presence of outliers is due to one of the following reasons:
Outlier Detection:
There are three basic approaches to the problem of outlier detection:
Apart from these, Grubbs described a separate method, called the ESD (extreme studentized deviate) method, for detecting outliers. To detect an outlier in a sample of data, one must first find how far the suspected outlier is from the rest of the data: take the difference between the outlier and the mean and divide it by the standard deviation (SD). If that value is large, the outlier is far from the others.
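The ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the sample standard deviation, treats the point farthest from the mean as the suspected outlier, and does not include the comparison against a critical value that a formal test requires. The name `esd_statistic` and the readings are hypothetical.

```python
import statistics

def esd_statistic(data):
    """Return (suspect, ratio): the point farthest from the mean and
    its distance from the mean in units of the sample SD."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample SD (n - 1 in the denominator)
    suspect = max(data, key=lambda x: abs(x - mean))
    return suspect, abs(suspect - mean) / sd

# Made-up sample with one suspicious reading.
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
suspect, ratio = esd_statistic(readings)
print(suspect, round(ratio, 2))
```

What counts as a large ratio depends on the sample size, so the ratio is normally judged against a tabulated critical value rather than a fixed threshold.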
In a large sample of data, a small number of outliers is to be expected. Such outliers can easily be spotted in histograms. They may include the sample maximum or the sample minimum, or at times both, depending on whether they are extremely high or extremely low. However, the sample maximum and sample minimum are not always outliers, because they may not be unusually far from the other observations.