In statistics, an outlier is a data point that differs significantly from the other data in a sample. Outliers can alert statisticians to abnormalities in an experiment or to errors in measurement, and a measurement error may justify omitting the outlier from the dataset. Because outliers help in understanding an experiment more clearly, identifying them is one of the most important parts of a statistician's work. A quartile is a type of quantile that also helps in finding outliers.
There are three quartiles in a data set. The first quartile (denoted Q1) is the middle number between the smallest value and the median of the data set. It separates the lowest 25% of the data from the highest 75%. The second quartile (denoted Q2) is the median of the data, whereas the third quartile (denoted Q3) is the middle value between the median and the highest value of the data set. It separates the highest 25% of the data from the lowest 75%.
Q1 is known as the first quartile, the lower quartile, or the 25th percentile. Q2 is known as the second quartile, the median, or the 50th percentile. Q3 is known as the third quartile, the upper quartile, or the 75th percentile.
There is no universal agreement on how to determine the quartiles of a data set. However, we can use the following three methods to find them.
In the first method, you first sort the data and divide the data set into two halves using its median. If the data set contains an odd number of values, do not include the median (the middle value of the ordered list) in either half. If the data set contains an even number of values, split it into exactly two equal halves.
After partitioning, you can find the quartiles of the data set. The lower quartile is the median of the lower half of the data, and the upper quartile is the median of the upper half.
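The first method can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function name `quartiles_exclusive` and the sample values are made up for the example, and the input is assumed to contain at least two numbers.

```python
from statistics import median


def quartiles_exclusive(values):
    """Return (Q1, Q2, Q3), leaving the median out of both halves.

    Hypothetical helper written for this illustration only.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    half = len(data) // 2
    lower = data[:half]
    if len(data) % 2:
        # Odd count: skip the middle value so it belongs to neither half.
        upper = data[half + 1:]
    else:
        # Even count: the two halves are simply the two equal parts.
        upper = data[half:]
    return median(lower), median(data), median(upper)


# Sample values chosen for illustration (11 points, an odd count).
sample = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
print(quartiles_exclusive(sample))  # (15, 40, 43)
```

With 11 values the median (40) is dropped, so each half has five values and its own middle value becomes the quartile.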
In the second method, you also divide the data set into two halves using the median, but the rule is slightly different. If the data set contains an odd number of values, include the median in both halves. If the data set contains an even number of values, split it into exactly two equal halves.
Q1 is the median of the lower half of the data set, and Q3 is the median of the upper half. The values found by this method are also known as “Tukey’s hinges”.
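A minimal Python sketch of this median-inclusive rule might look like the following. The function name `quartiles_inclusive` and the sample values are assumptions made for the example, and the input is assumed to contain at least two numbers.

```python
from statistics import median


def quartiles_inclusive(values):
    """Return (Q1, Q2, Q3), putting the median into both halves.

    Hypothetical helper written for this illustration only.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    if n % 2:
        # Odd count: the middle value is shared by both halves.
        lower = data[:half + 1]
        upper = data[half:]
    else:
        # Even count: split into two equal parts.
        lower = data[:half]
        upper = data[half:]
    return median(lower), median(data), median(upper)


# Sample values chosen for illustration (11 points, an odd count).
sample = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
print(quartiles_inclusive(sample))  # (25.5, 40, 42.5)
```

Because the median (40) appears in both halves here, each half has six values, and each quartile is the mean of the two middle values of its half.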
The third method has three cases. If the data set contains an even number of values, the third method gives the same result as either of the two methods above. This is because the median is then not a single data point, so both methods split the data in the same way.
If there are (4n+1) data points, then the lower quartile is 25% of the nth data value plus 75% of the (n+1)th data value, while the upper quartile is 75% of the (3n+1)th data value plus 25% of the (3n+2)th data value.
If there are (4n+3) data points, then the lower quartile is 75% of the (n+1)th data value plus 25% of the (n+2)th data value, and the upper quartile is 25% of the (3n+2)th data value plus 75% of the (3n+3)th data value.
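The weighted rules of the third method can be sketched in Python as shown below. This is an illustration under stated assumptions, not a reference implementation. The function name `quartiles_weighted` and the sample values are made up for the example. For (4n+1) points the sketch weights Q1 as 25% and 75% and Q3 as 75% and 25%, on the assumption that each quartile should fall midway between the results obtained by excluding and by including the median. The (4n+3) weights already fit that pattern. For an even count it simply takes the medians of the two equal halves.

```python
from statistics import median


def quartiles_weighted(values):
    """Return (Q1, Q2, Q3) using weighted neighbours for odd counts.

    Hypothetical helper written for this illustration only.
    """
    x = sorted(values)
    size = len(x)
    if size < 2:
        raise ValueError("need at least two data points")
    n, remainder = divmod(size, 4)
    if size % 2 == 0:
        # Even count: same as splitting into two equal halves.
        half = size // 2
        q1, q3 = median(x[:half]), median(x[half:])
    elif remainder == 1:
        # 4n+1 points. 1-based positions n, n+1 and 3n+1, 3n+2
        # become 0-based indices n-1, n and 3n, 3n+1.
        q1 = 0.25 * x[n - 1] + 0.75 * x[n]
        q3 = 0.75 * x[3 * n] + 0.25 * x[3 * n + 1]
    else:
        # 4n+3 points. 1-based positions n+1, n+2 and 3n+2, 3n+3
        # become 0-based indices n, n+1 and 3n+1, 3n+2.
        q1 = 0.75 * x[n] + 0.25 * x[n + 1]
        q3 = 0.25 * x[3 * n + 1] + 0.75 * x[3 * n + 2]
    return q1, median(x), q3


# Sample values chosen for illustration: 11 points, so 4n+3 with n = 2.
sample = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
print(quartiles_weighted(sample))  # (20.25, 40, 42.75)
```

For this sample, Q1 is 0.75 × 15 + 0.25 × 36 = 20.25 and Q3 is 0.25 × 42 + 0.75 × 43 = 42.75.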
In every method, Q2 is the median of the whole original data set.