5.2 Data Sources

In this section, we will explain several ways of providing data for plots.

1. Providing directly

In Bokeh, it is possible to pass lists of values directly into plotting functions. In the example below, the data, x_values and y_values, are passed directly to the circle plotting method.

from bokeh.plotting import figure

x_values = [1, 2, 3, 4, 5]
y_values = [6, 7, 2, 3, 6]

p = figure()
p.circle(x=x_values, y=y_values)

When you pass in data like this, Bokeh works behind the scenes to make a ColumnDataSource for you. It is the core of most Bokeh plots, providing the data that is visualized by the glyphs of the plot. With theColumnDataSource , it is easy to share data between multiple plots and widgets.

2. ColumnDataSource

At the most basic level, a ColumnDataSource is simply a mapping between column names and lists of data. Once theColumnDataSourcehas been created, it can be passed into thesourceparameter of plotting methods which allows you to pass a column’s name as a stand-in for the data values.

from bokeh.plotting import figure
from bokeh.models import ColumnDataSource

data = {'x_value': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}

source = ColumnDataSource(data=data)

p = figure()
p.circle(x='x_values', y='y_values', source=source)

3. Pandas

The dataparameter can also be a Pandas DataFrame or GroupBy object.

source = ColumnDataSource(df)

Pandas GroupBy.

group = df.groupby(('colA', 'ColB'))
source = ColumnDataSource(group)

4. Streaming

ColumnDataSource streaming is an efficient way to append new data to a CDS. By using thestreammethod, Bokeh only sends new data to the browser instead of the entire dataset.

The stream method takes a new_data parameter containing a dict mapping column names to sequences of data to be appended to the respective columns.

source = ColumnDataSource(data=dict(foo=[], bar=[]))

# has new, identical-length updates for all columns in source
new_data = {
    'foo' : [10, 20],
    'bar' : [100, 200],
}

source.stream(new_data)

Last updated