Variable Matplotlib Histogram Bin Width

Question

I am making a figure with 3 subplots, and some of the histogram bins appear to have different widths even though they should all be equal. My goal is to create a histogram with equal-width bars.

I am plotting data from three different data frames (df1, df2, df3), and each gets its own axis. The first two data frames (df1, df2) have 12 values, while the third (df3) has 21 values. A minimal working example:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

#Data
df1 = pd.DataFrame(data={'Delta_Thick': {0: -0.10257269427388138,1: -0.39092250646203491,2:-0.23459561055233191,3: 0.68753181981137268,4: -0.86443211703287937,5: -0.46963178960649432,6: 0.14070311160589327,7: 0.1885440568340489,8: 0.64210565529921859,9: -0.81346415594104837,10: 0.68175896505459788,11: 0.33673654536030828}})
df2 = pd.DataFrame(data={'Delta_Thick':{0: -0.38775619399296002,1: -0.32367407025583783,2: -0.56055783338428344,3: 0.23824247437746471,4: -0.64925233000340721,5: -0.44120245730257612,6: 0.027222094241818928,7: -0.091069018106476163,8: 0.0066066466889458386,9: -0.60477189852646174,10: 0.12878952794346843,11: -0.0077463979905486591}})
df3 = pd.DataFrame(data={'Delta_Thick':{0: 0.28518349971907864,1: -0.06724843620619711,2: 0.32596222283195153,3: 0.44928934543390797,4: 0.20911991461399143,5: -0.036989014816141919,6: -0.21517978702947216, 7: -0.028429332303918198,8: 0.037553921139760305,9: 0.98813506475654656,10: 0.51938760439670373,11: 0.11348101736407434,12: 0.79676269452200232,13: 0.27961307494052506,14: -0.55282685608381399,15: 0.63549900861027275,16: -0.20869225741458663,17: 0.55296943711112945,18: 0.34448294335085694,19: 0.18268186220418725,20: 0.36422880308671302}})

fig, (ax,ax1,ax2) = plt.subplots(ncols=3)

bins=[round(x,1)for x in np.linspace(-1,1,21)]
counts, division = np.histogram(df1.loc[:,'Delta_Thick'],bins=bins)
df1.loc[:,'Delta_Thick'].hist(ax=ax, bins=division,color='green',label='Thing',hatch='//')
ax.xaxis.set_ticks(np.arange(-1, 1.5, 0.5))
ax.yaxis.set_ticks(np.arange(0, 5, 1))
ax.set_title('A. 1990-2016')
ax.set_ylabel('Number of Sites')
ax.legend(fontsize='x-small',loc=2)

#Deficit
bins=[round(x,1)for x in np.linspace(-1,0.6,16)]
counts, division = np.histogram(df2.loc[:,'Delta_Thick'],bins=bins)
df2.loc[:,'Delta_Thick'].hist(ax=ax1, bins=division,color='green',hatch='//')
ax1.xaxis.set_ticks(np.arange(-1, 0.75, 0.5))
ax1.yaxis.set_ticks(np.arange(0, 5, 1))
ax1.set_title('B. 1990-2003')
ax1.set_xlabel('X axis label')

#Enrich
bins=[round(x,1)for x in np.linspace(-1,0.6,16)]
counts, division = np.histogram(df3.loc[:,'Delta_Thick'],bins=bins)
df3.loc[:,'Delta_Thick'].hist(ax=ax2, bins=division,color='green',hatch='//')
ax2.xaxis.set_ticks(np.arange(-1, 1.5, 0.5))
ax2.yaxis.set_ticks(np.arange(0, 5, 1))
ax2.set_title('C. 2003-2016')
plt.tight_layout()
plt.show()

In the above plot, the third subplot ax2 has a histogram bar that appears to have a bin width of 0.2.

Could the length of the third data frame be causing this issue?

Doesn't the variable division dictate the bin width?
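A quick way to test whether division really holds equal-width bins is to print np.diff(division). The sketch below rebuilds the edges used for the second and third subplots and feeds in a handful of the df3 values (rounded, as stand-in data). The names sample, widths and wide are introduced here for illustration only.

```python
import numpy as np

# Bin edges exactly as built in the question for the second and third subplots.
bins = [round(x, 1) for x in np.linspace(-1, 0.6, 16)]

# A few df3 values, rounded; stand-in data for this check only.
sample = np.array([0.285, -0.067, -0.215, -0.209, 0.988, -0.553])

counts, division = np.histogram(sample, bins=bins)
widths = np.diff(division)  # width of every bin

print(division)
print(np.round(widths, 3))

# Collect every bin that is clearly wider than the intended 0.1.
wide = [(float(division[i]), float(division[i + 1]))
        for i in np.flatnonzero(widths > 0.15)]
print(wide)
```

With these edges the only pair reported is the bin from -0.3 to -0.1, because rounding the linspace output to one decimal drops the -0.2 edge. np.histogram also ignores values outside the outermost edges, so any df3 values above 0.6 are not counted with these bins.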

I think the bins are all fine; it's just that one of the vertical lines doesn't show up. I ran your code on Python 3 with matplotlib 2.0.2 and the Qt5Agg backend, and I don't see the vertical lines separating the bins at all. If I save the figure as PNG, I get something very similar to the Qt5Agg backend. If I save it as PDF, some of the bins don't even show the hatching. I don't know if it's a bug in matplotlib or in whatever displays the figures.
– Francesco Montesano, Commented Jun 29, 2017 at 20:33

dubbbdan · Accepted Answer · 2017-06-29 22:06:24Z

I don't know why, but adjusting the x ticks (i.e. ax2.xaxis.set_ticks) somehow altered the appearance of the histogram bars. The working solution is:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

#Data
df1 = pd.DataFrame(data={'Delta_Thick': {0: -0.10257269427388138,1: -0.39092250646203491,2:-0.23459561055233191,3: 0.68753181981137268,4: -0.86443211703287937,5: -0.46963178960649432,6: 0.14070311160589327,7: 0.1885440568340489,8: 0.64210565529921859,9: -0.81346415594104837,10: 0.68175896505459788,11: 0.33673654536030828}})
df2 = pd.DataFrame(data={'Delta_Thick':{0: -0.38775619399296002,1: -0.32367407025583783,2: -0.56055783338428344,3: 0.23824247437746471,4: -0.64925233000340721,5: -0.44120245730257612,6: 0.027222094241818928,7: -0.091069018106476163,8: 0.0066066466889458386,9: -0.60477189852646174,10: 0.12878952794346843,11: -0.0077463979905486591}})
df3 = pd.DataFrame(data={'Delta_Thick':{0: 0.28518349971907864,1: -0.06724843620619711,2: 0.32596222283195153,3: 0.44928934543390797,4: 0.20911991461399143,5: -0.036989014816141919,6: -0.21517978702947216, 7: -0.028429332303918198,8: 0.037553921139760305,9: 0.98813506475654656,10: 0.51938760439670373,11: 0.11348101736407434,12: 0.79676269452200232,13: 0.27961307494052506,14: -0.55282685608381399,15: 0.63549900861027275,16: -0.20869225741458663,17: 0.55296943711112945,18: 0.34448294335085694,19: 0.18268186220418725,20: 0.36422880308671302}})

fig, (ax,ax1,ax2) = plt.subplots(ncols=3)

bins=[round(x,1)for x in np.linspace(-1,1,21)]
counts, division = np.histogram(df1.loc[:,'Delta_Thick'],bins=bins)
df1.loc[:,'Delta_Thick'].hist(ax=ax, bins=division,color='green',label='Thing',hatch='//')
ax.xaxis.set_ticks(np.arange(-1, 1.5, 0.5))
ax.yaxis.set_ticks(np.arange(0, 5, 1))
ax.set_title('A. 1990-2016')
ax.set_ylabel('Number of Sites')
ax.legend(fontsize='x-small',loc=2)

#Deficit
bins=[round(x,1)for x in np.linspace(-1,1,21)]
counts, division = np.histogram(df2.loc[:,'Delta_Thick'],bins=bins)
#ax1.hist(df2.loc[:,'Delta_Thick'],bins=counts.size)
df2.loc[:,'Delta_Thick'].hist(ax=ax1, bins=division,color='green',hatch='//')
ax1.xaxis.set_ticks(np.arange(-1, 1.5, 0.5))
ax1.yaxis.set_ticks(np.arange(0, 5, 1))
ax1.set_title('B. 1990-2003')
ax1.set_xlabel('X axis label')

#Enrich

bins=[round(x,1)for x in np.linspace(-1,1,21)]
counts, division = np.histogram(df3.loc[:,'Delta_Thick'],bins=bins)
df3.loc[:,'Delta_Thick'].hist(ax=ax2, bins=division,color='green',hatch='//')
ax2.xaxis.set_ticks(np.arange(-1, 2, 0.5))
ax2.yaxis.set_ticks(np.arange(0, 10, 1))
ax2.set_title('C. 2003-2016')


plt.tight_layout()
plt.show()

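Notice I changed ax2.xaxis.set_ticks(np.arange(-1, 1.5, 0.5)) to ax2.xaxis.set_ticks(np.arange(-1, 2, 0.5)). The bin edges for the second and third subplots also changed, from np.linspace(-1,0.6,16) to np.linspace(-1,1,21), and that is most likely what fixes the bars: 16 points over [-1, 0.6] are about 0.107 apart, so rounding them to one decimal skips -0.2 and leaves a single bin from -0.3 to -0.1 that is 0.2 wide.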
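If a narrower range such as -1 to 0.6 is still wanted for one of the panels, the number of edges can be derived from the bin width instead of being typed by hand. This is a minimal sketch; even_edges is a hypothetical helper, not part of NumPy or matplotlib.

```python
import numpy as np

def even_edges(start, stop, width):
    """Return equally spaced bin edges from start to stop (hypothetical helper)."""
    n_bins = int(round((stop - start) / width))  # number of bins, not edges
    # n_bins bins need n_bins + 1 edges; rounding only tidies float noise.
    return np.round(np.linspace(start, stop, n_bins + 1), 10)

edges = even_edges(-1, 0.6, 0.1)
print(len(edges))
print(np.diff(edges))
```

The result can be passed as bins= in place of the rounded list. It has 17 edges for the 16 bins between -1 and 0.6.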

I think that part of the problem is that sometimes the border of one bar gets behind the nearby one. You can also try to increase the width of the border lines: I don't know if this works, but it's worth a try.
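The border suggestion can be tried by passing edgecolor and linewidth through to the bars. The sketch below is an assumption-laden illustration rather than a confirmed fix: it uses random stand-in data, calls ax.hist directly instead of the pandas .hist wrapper, and selects the Agg backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
values = rng.uniform(-1, 1, 21)   # stand-in data, not the df3 values
edges = np.linspace(-1, 1, 21)    # 20 bins of width 0.1

fig, ax = plt.subplots()
# edgecolor and linewidth are forwarded to the bar patches,
# so every bar gets an explicit, thicker outline.
counts, division, patches = ax.hist(
    values, bins=edges, color='green', hatch='//',
    edgecolor='black', linewidth=1.2,
)
fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() as needed
```

An explicit outline makes adjacent bars easy to tell apart, which helps distinguish a bin that really is wider from a border that simply was not drawn.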
