When dealing with a set of data, often the first thing you’ll want to do is get a sense of how the variables are distributed.
1. Basic Histogram
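The most convenient way to take a quick look at a univariate distribution in seaborn is distplot(). By default, this draws a histogram and overlays a fitted kernel density estimate (KDE). Note that distplot() is deprecated as of seaborn 0.11 in favor of histplot() and displot().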
import numpy as np
import pandas as pd

# Create a simple dataset: 3000 draws from a correlated bivariate normal
d = np.random.multivariate_normal([1, 1], [[6, 2], [2, 2]], size=3000)
df = pd.DataFrame(d, columns=['S1', 'S2'])
2. Univariate Distribution
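import matplotlib.pyplot as plt
import seaborn as sns

# Histogram of S1 with a KDE curve drawn on top
sns.distplot(df['S1'])
plt.xlabel('x')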
Histograms are likely familiar, and matplotlib already provides a hist function. A histogram represents the distribution of data by dividing the range of the data into bins and drawing a bar for each bin whose height shows the number of observations that fall in it. Trying more or fewer bins, for example through the bins argument, may reveal other features in the data.
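To see how the bin count changes the picture, the sketch below draws the same sample with three different bin counts using matplotlib's hist. The sample is a stand-in for df['S1'] (a normal with mean 1 and variance 6, matching the S1 marginal above); the seed and the bin counts 10, 30, and 100 are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for df['S1']: mean 1, variance 6 (assumed, to keep this self-contained)
rng = np.random.default_rng(0)
s1 = rng.normal(loc=1.0, scale=np.sqrt(6.0), size=3000)

bin_choices = (10, 30, 100)
fig, axes = plt.subplots(1, len(bin_choices), figsize=(12, 3), sharex=True)

results = {}
for ax, bins in zip(axes, bin_choices):
    # hist returns the per-bin counts and the bin edges (bins + 1 values)
    counts, edges, _ = ax.hist(s1, bins=bins)
    results[bins] = (counts, edges)
    ax.set_title(f"bins={bins}")
    ax.set_xlabel("x")

axes[0].set_ylabel("count")
fig.tight_layout()
```

With few bins the shape is coarse, while many bins expose more sampling noise. In every case the counts add up to the number of observations, since each observation lands in exactly one bin.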