Crash Visualization
3.5 Histogram Chart


Last updated 4 years ago


A frequency distribution shows how often each different value in a set of data occurs. A histogram is the most commonly used graph to show frequency distributions and can be a great first step in understanding a dataset.
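To make the idea of a frequency distribution concrete, the sketch below counts how many values fall into each interval with `np.histogram`, which performs the kind of binning and counting that a histogram chart draws as bars. The sample values and the choice of three bins are assumptions made only for illustration.

```python
import numpy as np

# Hypothetical sample values, chosen only to illustrate the counting
values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 6])

# Split the range [1, 6] into 3 equal-width bins and count the values in each
counts, edges = np.histogram(values, bins=3)

# Each bin is half-open [left, right), except the last one, which also
# includes its right edge
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.2f} to {right:.2f}: {count} values")
```

The counts always add up to the number of values, which is a quick sanity check when reading any histogram.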

1. Normal Distribution

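import numpy as np
import matplotlib.pyplot as plt

data = np.random.randn(1000)   # 1,000 samples from the standard normal distribution
plt.style.use('ggplot')        # customize the chart style
plt.hist(data)                 # uses 10 bins by default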

2. Bins

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 6))

# same data drawn twice: more bins give narrower, noisier bars
ax1.hist(data, alpha=0.5, bins=50, label='bins: 50', color='r')
ax1.legend()
ax2.hist(data, alpha=0.5, bins=200, label='bins: 200', color='r')
ax2.legend()
plt.suptitle('Histogram Chart with Different Bins')
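The number of bins decides how wide each bar is, and therefore how smooth or noisy the histogram looks. The sketch below shows this numerically with `np.histogram`, and then asks NumPy to suggest a bin count through `np.histogram_bin_edges(..., bins="auto")`. The fixed seed and the three bin counts compared here are assumptions for illustration, not values taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed (an assumption) so runs repeat
data = rng.standard_normal(1000)

# Fewer bins: wide bars with many values each; more bins: narrow, sparse bars
for bins in (10, 50, 200):
    counts, edges = np.histogram(data, bins=bins)
    width = edges[1] - edges[0]
    print(f"bins={bins:>3}  bin width={width:.3f}  tallest bar={counts.max()}")

# Let NumPy derive the bin edges from the data itself
auto_edges = np.histogram_bin_edges(data, bins="auto")
print("bins chosen by 'auto':", len(auto_edges) - 1)
```

`plt.hist` also accepts a string such as `bins='auto'`, so the same choice can be made directly when plotting.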

3. Skewed Distribution

The normal distribution is balanced and symmetric. Real-life data, however, often does not follow it. Take credit card transactions as an example: the vast majority of transactions are for very small amounts, so the distribution is extremely skewed.
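A minimal sketch of a skewed distribution, assuming a log-normal distribution as a stand-in for transaction amounts. The data is synthetic and the parameters (`mean=3.0`, `sigma=1.0`) are made up for illustration; they are not taken from any credit card dataset. In a right-skewed sample, a few large values pull the mean well above the median.

```python
import numpy as np

rng = np.random.default_rng(42)     # fixed seed (an assumption) so runs repeat

# Synthetic "transaction amounts": NOT real credit card data
amounts = rng.lognormal(mean=3.0, sigma=1.0, size=10_000)

mean = amounts.mean()
median = np.median(amounts)
share_below_mean = (amounts < mean).mean()

print(f"mean:   {mean:.2f}")
print(f"median: {median:.2f}")
print(f"share of values below the mean: {share_below_mean:.0%}")
```

Passing `amounts` to `plt.hist` draws the corresponding chart: a tall cluster of bars on the left and a long, thin tail to the right.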

4. Histogram Plot

import seaborn as sns
tips = sns.load_dataset('tips')

If you own a restaurant, you probably have a list of things you want to know about your business. For instance: how much money can I earn per day? What price range keeps my customers happy and still makes me happy? If you are a waiter or waitress, you probably care about a very similar question: how much can I earn in tips per table?

plt.rcParams.update({'font.size': 20})  # customize the font size

# create one row with two subplots that share the y-axis
fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True, figsize=(12, 6))

ax1.hist(tips.tip, color='c', alpha=0.5)  # customize color
ax1.set_title('Tips Distribution')        # set each title separately

ax2.hist(tips.total_bill, color='b', alpha=0.5)
ax2.set_title('Bills Distribution')

ax1.set_ylabel('Frequency')  # y-axis label (shared by both subplots)
ax1.set_xlabel('USD')        # x-axis label
ax2.set_xlabel('USD')

With the histogram plot, these questions are easy to answer. A waiter or waitress typically earns 2 to 4 dollars per table, and with some luck 6 dollars. The restaurant's prices are customer friendly as well: a nice meal costs only 15 to 25 dollars.
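Reading the most common range off a histogram can also be done in code. The sketch below is one possible helper, with the hypothetical name `busiest_bin`, that returns the edges and count of the tallest bar. The sample amounts are made up for demonstration and are not taken from the tips dataset.

```python
import numpy as np

def busiest_bin(values, bins=10):
    """Return (left_edge, right_edge, count) of the most populated bin."""
    counts, edges = np.histogram(values, bins=bins)
    i = int(np.argmax(counts))      # index of the tallest bar (first one on ties)
    return float(edges[i]), float(edges[i + 1]), int(counts[i])

# Made-up amounts for demonstration only
sample = np.array([1.0, 2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 5.0, 6.0, 10.0])

left, right, count = busiest_bin(sample, bins=3)
print(f"Most values fall between {left:.2f} and {right:.2f} "
      f"({count} of {len(sample)})")
```

With the dataset loaded as `tips`, calling `busiest_bin(tips.tip)` would report the range covered by the tallest bar of the tips histogram.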

The example above uses the "tips" dataset. You can download it here, or load it with the seaborn package.
Figure: Normal Distribution
Figure: Histogram chart in different bins
Figure 1.4.2 Skewed Distribution
Figure 1.4.3 Tips and Bills Distribution