Crash Visualization

4.4.1 Scatter Plot

Last updated 4 years ago

A scatter plot uses dots to represent values for two different numeric variables. The position of each dot on the horizontal and vertical axes gives the two values for an individual data point.

The primary use of a scatter plot is to observe and show the relationship between two numeric variables. The dots report the values of individual data points, and taken as a whole they also reveal patterns in the data.
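
As a rough illustration of how the dots reveal a pattern "as a whole", the sketch below plots synthetic data and prints the Pearson correlation in the title. The data, the column names `x` and `y`, and the random seed are assumptions made up for this example; they are not part of the datasets used in this section.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (assumed for illustration): y rises linearly with x, plus noise
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + rng.normal(0, 1.0, size=100)
df = pd.DataFrame({"x": x, "y": y})

# Each dot is one row of df; the upward drift of the cloud is the relationship
fig, ax = plt.subplots(figsize=(8, 6))
sns.scatterplot(x="x", y="y", data=df, ax=ax)

# The Pearson correlation summarizes the linear trend in a single number
r = df["x"].corr(df["y"])
ax.set_title(f"Pearson r = {r:.2f}")
```

A value of `r` close to 1 or -1 corresponds to dots that lie near a straight line, while a value near 0 corresponds to a shapeless cloud.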

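import matplotlib.pyplot as plt  # needed for the plt.* calls below
import seaborn as sns            # import the library

# Set the style and color palette.
# On Matplotlib 3.6 and later this style is named 'seaborn-v0_8-pastel'.
plt.style.use('seaborn-pastel')

# Set the font size and figure size at once
plt.rcParams.update({'font.size': 18, 'figure.figsize': (8, 6)})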

1. Simple Scatterplot

tips = sns.load_dataset("tips")   # load one of seaborn's example datasets

# Draw a simple scatter plot between two variables
sns.scatterplot(x="total_bill", y="tip", data=tips, s=80)

2. Differentiate groups by color

# Group by another variable and show the groups with different colors:
sns.scatterplot(x="total_bill", y="tip", hue="time", data=tips, s=80)

3. Differentiate groups by color and marker

# Show the grouping variable by varying both color and marker:
sns.scatterplot(x="total_bill", y="tip", hue="time", style="time", data=tips, s=200)

# Split the plot into one column per time, with color and marker showing the day:
sns.relplot(x="total_bill", y="tip",
            col="time", hue="day", style="day",
            kind="scatter", data=tips)

4. Categorical Scatterplot

A swarm plot draws a categorical scatterplot in which the points are adjusted so that they do not overlap. This represents the distribution of values better than a plain strip of overlapping points. It is also a good complement to a box or violin plot when you want to show all observations along with some representation of the underlying distribution.
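
One way to pair a swarm plot with a violin plot is to draw both on the same axes. The sketch below does this on synthetic data; the column names `group` and `value`, the group means, and the colors are assumptions chosen for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.collections as mcoll
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (assumed for illustration): three groups of 30 observations each
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 30),
    "value": np.concatenate([rng.normal(loc, 1.0, 30) for loc in (0, 1, 3)]),
})

fig, ax = plt.subplots(figsize=(8, 6))

# The violin shows the estimated distribution; inner=None hides its inner marks
sns.violinplot(x="group", y="value", data=df, inner=None, color="lightgray", ax=ax)

# The swarm adds every individual observation on top, without overlaps
sns.swarmplot(x="group", y="value", data=df, color="k", size=4, ax=ax)

# Count the plotted swarm points: one marker per row of df
n_points = sum(len(c.get_offsets()) for c in ax.collections
               if isinstance(c, mcoll.PathCollection))
```

Drawing the violin first keeps the individual points visible on top of the filled shapes.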

# "time" has two levels (Lunch and Dinner), so two palette colors are enough
sns.swarmplot(x="day", y="tip", hue="time",
              palette=["r", "b"], data=tips)

5. Differentiate the quantitative variable by size

# Load the example mpg dataset
mpg = sns.load_dataset("mpg")
sns.scatterplot(x="horsepower", y="mpg", size="weight", data=mpg)

6. Differentiate the quantitative variable by size and color

sns.scatterplot(x="horsepower", y="mpg", hue="origin", size="weight", data=mpg)

7. More complex

# Plot miles per gallon against horsepower, mapping origin to color and weight to point size
sns.relplot(x="horsepower", y="mpg", hue="origin", size="weight",
            sizes=(40, 400), alpha=.7, palette="muted",
            height=6, data=mpg)