$\begingroup$

I'm working on an outlier-detection project, and to get a feel for my dataset I'm using boxplots to see how the values of each feature are distributed. However, when I plot one with seaborn, the result doesn't make sense to me: all the plotted data points lie outside the whiskers, and none lie inside.

Boxplot

The code is very simple:

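    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    def boxplot():
        df = pd.read_csv('csv-data/test.csv')
        sns.boxplot(x='11', data=df)  # the feature column is named '11'
        plt.show()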

I thought the whiskers might be the problem, so I increased their length as a test. This is the result:

Boxplot with longer whiskers

As can be seen, the points simply move outward along with the whiskers. I'm not sure why this happens, and I know there should be points inside the whiskers, because feature 11 contains values such as 0.0768012746931151.

$\endgroup$

2 Answers

$\begingroup$

As you've observed, the only points a boxplot draws individually are the outliers, i.e. the points beyond the whiskers. The box and whiskers summarize the main part of the distribution, so the points they cover are not drawn one by one. The seaborn documentation says as much: "The box shows the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for points that are determined to be “outliers” using a method that is a function of the inter-quartile range."
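If you do want to see every observation, one option is to overlay a strip plot on the boxplot. The following is a minimal sketch, not something taken from the question: the helper name `boxplot_with_points` and the demo values (apart from 0.0768012746931151, which the question mentions) are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def boxplot_with_points(df, column):
    """Draw a boxplot of df[column] and overlay every observation on it."""
    fig, ax = plt.subplots()
    # Box, whiskers, and outlier markers, as in the question.
    sns.boxplot(x=column, data=df, ax=ax, color="lightgray")
    # Every row as a jittered dot, including rows inside the box and whiskers.
    sns.stripplot(x=column, data=df, ax=ax, color="black", size=3, alpha=0.6)
    return ax


# Hypothetical demo data standing in for feature '11' of the CSV file.
demo = pd.DataFrame({"11": [0.01, 0.05, 0.0768012746931151, 0.1, 0.12, 0.9]})
ax = boxplot_with_points(demo, "11")
```

Calling `plt.show()` afterwards displays the figure. With the data from the question, `demo` would be replaced by `pd.read_csv('csv-data/test.csv')`.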

$\endgroup$
$\begingroup$

The data points drawn individually are the ones classified as outliers, generally defined as points that lie more than 1.5 times the IQR below the first quartile or above the third quartile. All other data points are not plotted individually, since they are already represented by the box and whiskers. If you change the whisker length as a proportion of the IQR, you simply change the threshold at which points count as outliers, and therefore whether they get plotted. For more information on the components of a box plot, see the Wikipedia page on box plots.
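To make the rule concrete, here is a small standard-library sketch of the 1.5 IQR criterion. It is an illustration under assumptions, not seaborn's own code: the function names and the sample values are hypothetical, and the quartiles use linear interpolation, which may differ slightly from other quartile conventions.

```python
from statistics import quantiles


def whisker_bounds(values, whis=1.5):
    """Return the (lower, upper) limits beyond which points count as outliers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whis * iqr, q3 + whis * iqr


def split_outliers(values, whis=1.5):
    """Split values into those within the limits and those drawn as outliers."""
    lower, upper = whisker_bounds(values, whis)
    inside = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return inside, outliers


# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
inside, outliers = split_outliers(sample)            # default 1.5 * IQR
inside_wide, outliers_wide = split_outliers(sample, whis=30)  # much longer whiskers
```

With the default multiplier the extreme value is the only point flagged; with a much larger multiplier nothing is flagged, which is why lengthening the whiskers only changes which points are drawn individually. In seaborn the corresponding knob is the `whis` argument of `sns.boxplot`.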

$\endgroup$
