#Why is my matplotlib histogram not including the x-value points in the x-axis in Python?

merry cradle

Hello, I'm getting into Data Science, but I have hit a roadblock with matplotlib graphing. I think I'm doing everything correctly, but I keep getting a weird-looking graph. I'm trying to plot a histogram of the frequency of clients' total income based on a dataframe I have. I want the x-axis to show the income bins, but the y-axis seems to have the values corresponding to 'AMT_INCOME_TOTAL'. I thought another pair of eyes on my code would be really helpful. Please help if you can, thanks!

import matplotlib.pyplot as plt

plt.rcdefaults()  # Restore matplotlib's default rc settings
# df is the dataframe loaded earlier in my notebook
plt.hist(x=df['AMT_INCOME_TOTAL'], bins=30, alpha=0.5, color='red')
plt.xlabel('Income Total')
plt.ylabel('Frequency')
plt.title('Histogram of Income Total')
plt.show()

In addition, I have checked the unique values in the 'AMT_INCOME_TOTAL' column, and they don't match what the graph produced by my plot shows.
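
One possible explanation, sketched on made-up numbers rather than the real column: `plt.hist` bins the same way as `np.histogram`, so `bins=30` means 30 equal-width bins between the minimum and maximum of the data, and the x-axis shows that span rather than the unique values. The `incomes` array below is hypothetical, and the single extreme value in it is an assumption about what the real data might contain.

```python
import numpy as np

# Hypothetical incomes: 999 typical values plus one extreme outlier.
# (Made-up data, not the real 'AMT_INCOME_TOTAL' column.)
rng = np.random.default_rng(0)
incomes = np.append(rng.uniform(50_000, 300_000, size=999), 100_000_000.0)

# bins=30 splits the span from min to max into 30 equal-width bins.
counts, edges = np.histogram(incomes, bins=30)

print("bin width:", edges[1] - edges[0])
print("values in first bin:", counts[0])
print("values in last bin:", counts[-1])
```

If the real column looks anything like this, each bin is millions wide, so almost every value lands in the first bar and the rest of the axis looks empty.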

The following is my colab file, so that anyone answering can get a better idea of what I'm trying to do. It has markdown cells that explain each step as well: https://colab.research.google.com/drive/1ExpIPT8orJ6_f1qw909ROx9IYrU5YJsS?usp=sharing
(I sent the colab file because I thought that was similar to sending a GitHub repo, even though that wasn't explicitly permitted. If that's not allowed, please let me know and I will refrain from doing it again, thanks.)

In the markdown cell above the problematic histogram plot, I say that I want to find outliers. I know I can use a boxplot for that, and that there's a different and better process for doing so, but I just wanted a histogram visual too. The histogram can be seen in the colab file. I don't think I can post pictures here as per the rules, so please take a look at that if you can, thanks. I just don't understand why it's not working. I would really appreciate any help in the matter.
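
For reference, the boxplot-style outlier check mentioned above is commonly done with the 1.5 × IQR rule. This is a minimal sketch on a small made-up array, not the approach used in the notebook, and the 1.5 multiplier is the conventional default rather than anything specific to this dataset.

```python
import numpy as np

# Small hypothetical sample (not the real data).
incomes = np.array(
    [90_000, 112_500, 135_000, 157_500, 180_000, 202_500, 225_000, 9_000_000],
    dtype=float,
)

# Quartiles and interquartile range.
q1, q3 = np.percentile(incomes, [25, 75])
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are flagged as outliers,
# which matches the default whisker length of a boxplot.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = incomes[(incomes < lower) | (incomes > upper)]

print("bounds:", lower, upper)
print("outliers:", outliers)
```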

merry cradle

For context, I am getting this graph using the code that I posted above. I then put the data into a website that can create histograms, and I get the image in the second screenshot with the blue bins. That one used 11 bins. I tried using 11 bins in my code instead of 30. This gave me a similarly bad graph, shown in the third screenshot with the red bins, with just one thicker bin. I'm trying something in my notebook right now by using a range for the bins parameter, but it's taking a while to output anything. I'll delete this question if it works.
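
A guess at why the range attempt is slow, again on made-up data: if the range has a step of 1 over the full span of incomes, it asks for roughly one bin per currency unit. A cheaper option might be to keep 11 bins but limit them to the bulk of the data with the `range` argument, which both `np.histogram` and `plt.hist` accept. The 99th-percentile cutoff below is an arbitrary choice for illustration.

```python
import numpy as np

# Hypothetical incomes with one extreme value (not the real column).
rng = np.random.default_rng(1)
incomes = np.append(rng.uniform(50_000, 300_000, size=999), 100_000_000.0)

# A step-1 range over the whole span would mean about this many bins.
n_unit_bins = int(incomes.max() - incomes.min())
print("bins implied by a step-1 range:", n_unit_bins)

# Alternative: 11 equal-width bins up to the 99th percentile only.
# Values outside `range` are simply left out of the counts.
upper = np.percentile(incomes, 99)
counts, edges = np.histogram(incomes, bins=11, range=(incomes.min(), upper))

print("counts per bin:", counts)
print("values left out:", incomes.size - counts.sum())
```

The same `bins=11, range=(low, high)` arguments can be passed to `plt.hist` to draw the restricted histogram, with the caveat that the excluded values are then not visible in the plot.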