PlotAccessor(data)
Make plots of Series or DataFrame with the matplotlib backend.
Examples:
For Series:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> ser = bpd.Series([1, 2, 3, 3])
>>> plot = ser.plot(kind='hist', title="My plot")
For DataFrame:
>>> df = bpd.DataFrame({'length': [1.5, 0.5, 1.2, 0.9, 3],
...                     'width': [0.7, 0.2, 0.15, 0.2, 1.1]},
...                    index=['pig', 'rabbit', 'duck', 'chicken', 'horse'])
>>> plot = df.plot(title="DataFrame Plot")
| Parameters | |
|---|---|
| Name | Description |
| data | Series or DataFrame. The object for which the method is called. |
| kind | str. The kind of plot to produce: 'line' (line plot, default), 'hist' (histogram), 'area' (area plot), 'scatter' (scatter plot, DataFrame only). |
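The `kind` parameter and the per-kind methods listed below are two routes to the same plot. The following is a minimal sketch of that accessor pattern in plain Python. `MiniPlotAccessor` is a hypothetical stand-in that records calls instead of drawing, and raising `ValueError` for an unknown kind is an assumption, not documented bigframes behavior.

```python
class MiniPlotAccessor:
    """Hypothetical stand-in for a plot accessor; it records calls instead of drawing."""

    _kinds = ("line", "hist", "area", "scatter")

    def __init__(self, data):
        self._data = data

    def __call__(self, kind="line", **kwargs):
        # Calling the accessor dispatches to the method named by `kind`.
        if kind not in self._kinds:
            # Assumption: an unknown kind is rejected with ValueError.
            raise ValueError(f"{kind!r} is not a valid plot kind")
        return getattr(self, kind)(**kwargs)

    def line(self, **kwargs):
        return ("line", kwargs)

    def hist(self, **kwargs):
        return ("hist", kwargs)

    def area(self, **kwargs):
        return ("area", kwargs)

    def scatter(self, **kwargs):
        return ("scatter", kwargs)


acc = MiniPlotAccessor([1, 2, 3, 3])
via_kind = acc(kind="hist", bins=4)   # accessor called with kind=...
via_method = acc.hist(bins=4)         # method called directly
default_plot = acc()                  # falls back to the 'line' kind
```

In this sketch both calls return the same record, which mirrors how `ser.plot(kind='hist')` and `ser.plot.hist()` are expected to request the same kind of plot.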
Methods
area
area(
x: typing.Optional[typing.Hashable] = None,
y: typing.Optional[typing.Hashable] = None,
stacked: bool = True,
**kwargs
)
Draw a stacked area plot. An area plot displays quantitative data visually.
This function calls pandas.plot to generate a plot with a random sample
of items. For consistent results, the random sampling is reproducible.
Use the sampling_random_state parameter to modify the sampling seed.
Examples:
Draw an area plot based on basic business metrics:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame(
...     {
...         'sales': [3, 2, 3, 9, 10, 6],
...         'signups': [5, 5, 6, 12, 14, 13],
...         'visits': [20, 42, 28, 62, 81, 50],
...     },
...     index=["01-31", "02-28", "03-31", "04-30", "05-31", "06-30"]
... )
>>> ax = df.plot.area()
Area plots are stacked by default. To produce an unstacked plot,
pass stacked=False:
>>> ax = df.plot.area(stacked=False)
Draw an area plot for a single column:
>>> ax = df.plot.area(y='sales')
Draw with a different x:
>>> df = bpd.DataFrame({
...     'sales': [3, 2, 3],
...     'visits': [20, 42, 28],
...     'day': [1, 2, 3],
... })
>>> ax = df.plot.area(x='day')
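To make the effect of `stacked=True` concrete, the sketch below computes the upper edge of each band when columns are piled on top of one another. It uses the same numbers as the business-metrics example above and plain Python only. `stacked_tops` is a hypothetical helper for illustration, not part of bigframes or pandas.

```python
# Same numbers as the business-metrics example above.
columns = {
    "sales": [3, 2, 3, 9, 10, 6],
    "signups": [5, 5, 6, 12, 14, 13],
    "visits": [20, 42, 28, 62, 81, 50],
}


def stacked_tops(cols):
    """Return the upper boundary of each band when columns are stacked in order."""
    tops = {}
    running = [0] * len(next(iter(cols.values())))
    for name, values in cols.items():
        # Each band starts where the previous band ended.
        running = [base + v for base, v in zip(running, values)]
        tops[name] = running
    return tops


tops = stacked_tops(columns)
print(tops["signups"])  # [8, 7, 9, 21, 24, 19]
print(tops["visits"])   # [28, 49, 37, 83, 105, 69]
```

With `stacked=False`, each column would instead be drawn from zero, so the bands overlap rather than accumulate.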
| Parameters | |
|---|---|
| Name | Description |
| x | label or position, optional. Coordinates for the X axis. By default uses the index. |
| y | label or position, optional. Column to plot. By default uses all columns. |
| stacked | bool, default True. Area plots are stacked by default. Set to False to create an unstacked plot. |
| sampling_n | int, default 100. Number of random items for plotting. |
| sampling_random_state | int, default 0. Seed for random number generator. |
| Returns | |
|---|---|
| Type | Description |
| matplotlib.axes.Axes or numpy.ndarray | Area plot, or array of area plots if subplots is True. |
hist
hist(by: typing.Optional[typing.Sequence[str]] = None, bins: int = 10, **kwargs)
Draw one histogram of the DataFrame's columns.
A histogram is a representation of the distribution of data.
This function groups the values of all given Series in the DataFrame
into bins and draws all bins in one matplotlib.axes.Axes.
This is useful when the DataFrame's Series are in a similar scale.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame(np.random.randint(1, 7, 6000), columns=['one'])
>>> df['two'] = np.random.randint(1, 7, 6000) + np.random.randint(1, 7, 6000)
>>> ax = df.plot.hist(bins=12, alpha=0.5)
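The description above says that values from all columns are grouped into bins drawn on one axes. A minimal sketch of that idea follows, assuming equal-width bins that span the combined range of every column, with the last bin closed on the right. Both helper functions and the small dataset are hypothetical, and the code does not claim to reproduce the library's exact binning.

```python
def shared_bin_edges(columns, bins):
    """Equal-width bin edges covering the combined range of all columns."""
    lo = min(min(values) for values in columns.values())
    hi = max(max(values) for values in columns.values())
    if hi == lo:
        raise ValueError("all values are identical; cannot build bins")
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]


def count_into_bins(values, edges):
    """Count values per bin; the last bin also includes its right edge."""
    bins = len(edges) - 1
    lo = edges[0]
    width = edges[1] - edges[0]
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical data on a similar scale, so shared bins are meaningful.
data = {"one": [1, 2, 2, 3, 6], "two": [2, 7, 7, 8, 12]}
edges = shared_bin_edges(data, bins=11)
counts = {name: count_into_bins(values, edges) for name, values in data.items()}
print(counts["one"])  # [1, 2, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(counts["two"])  # [0, 1, 0, 0, 0, 0, 2, 1, 0, 0, 1]
```

Because both columns share the same edges, their bars line up and can be compared directly, which is why a similar scale across columns matters.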
| Parameters | |
|---|---|
| Name | Description |
| by | str or sequence, optional. Column in the DataFrame to group by. It is not supported yet. |
| bins | int, default 10. Number of histogram bins to be used. |
| Returns | |
|---|---|
| Type | Description |
| matplotlib.AxesSubplot | A histogram plot. |
line
line(
x: typing.Optional[typing.Hashable] = None,
y: typing.Optional[typing.Hashable] = None,
**kwargs
)
Plot Series or DataFrame as lines. This function is useful to plot lines using DataFrame's values as coordinates.
This function calls pandas.plot to generate a plot with a random sample
of items. For consistent results, the random sampling is reproducible.
Use the sampling_random_state parameter to modify the sampling seed.
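The reproducibility claim above can be illustrated with a seeded sampler. This is a conceptual sketch using Python's standard `random` module, not the library's sampling code. The parameter names `sampling_n` and `sampling_random_state` come from this page, while `sample_rows` and the rule of returning everything when the input is small are assumptions made for the example.

```python
import random


def sample_rows(rows, sampling_n=100, sampling_random_state=0):
    """Pick at most `sampling_n` rows, deterministically for a given seed."""
    if len(rows) <= sampling_n:
        # Assumption: small inputs are used in full.
        return list(rows)
    rng = random.Random(sampling_random_state)  # private, seeded generator
    return rng.sample(rows, sampling_n)


rows = list(range(1000))
first = sample_rows(rows, sampling_n=5)
second = sample_rows(rows, sampling_n=5)
# A different seed will generally select a different subset.
other = sample_rows(rows, sampling_n=5, sampling_random_state=42)

print(first == second)  # True
```

Keeping the seed fixed means repeated calls plot the same subset, and changing it is how a different subset would be requested.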
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame(
...     {
...         'one': [1, 2, 3, 4],
...         'three': [3, 6, 9, 12],
...         'reverse_ten': [40, 30, 20, 10],
...     }
... )
>>> ax = df.plot.line(x='one')
| Parameters | |
|---|---|
| Name | Description |
| x | label or position, optional. Allows plotting of one column versus another. If not specified, the index of the DataFrame is used. |
| y | label or position, optional. Allows plotting of one column versus another. If not specified, all numerical columns are used. |
| color | str, array-like, or dict, optional. The color for each of the DataFrame's columns. Possible values are: (1) a single color string referred to by name, RGB or RGBA code, for instance 'red' or '#a98d19'; (2) a sequence of color strings referred to by name, RGB or RGBA code, applied to the columns in turn and reused when there are more columns than colors. For instance, with ['green','yellow'] the lines are drawn alternately in green and yellow. If only a single column is plotted, only the first color in the list is used; (3) a dict of the form {column name : color}, so that each column is colored accordingly. |
| sampling_n | int, default 100. Number of random items for plotting. |
| sampling_random_state | int, default 0. Seed for random number generator. |
| Returns | |
|---|---|
| Type | Description |
| matplotlib.axes.Axes or np.ndarray of them | An ndarray is returned with one matplotlib.axes.Axes per column when subplots=True. |
scatter
scatter(
x: typing.Optional[typing.Hashable] = None,
y: typing.Optional[typing.Hashable] = None,
s: typing.Optional[
typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]
] = None,
c: typing.Optional[
typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]
] = None,
**kwargs
)
Create a scatter plot with varying marker point size and color.
This function calls pandas.plot to generate a plot with a random sample
of items. For consistent results, the random sampling is reproducible.
Use the sampling_random_state parameter to modify the sampling seed.
Examples:
Let's see how to draw a scatter plot using coordinates from the values in a DataFrame's columns.
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1],
...                     [6.4, 3.2, 1], [5.9, 3.0, 2]],
...                    columns=['length', 'width', 'species'])
>>> ax1 = df.plot.scatter(x='length',
...                       y='width',
...                       c='DarkBlue')
And now with the color determined by a column as well.
>>> ax2 = df.plot.scatter(x='length',
...                       y='width',
...                       c='species',
...                       colormap='viridis')
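When `c` names a column, its values are turned into colors through a colormap. The sketch below shows the general idea in plain Python: scale the values to the range 0 to 1, then interpolate along a color ramp. The two-color ramp and both helper functions are hypothetical stand-ins; a real colormap such as 'viridis' uses many more control points.

```python
def normalize(values):
    """Scale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def lerp_color(start, end, t):
    """Linearly interpolate between two RGB triples for t in [0, 1]."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


# The 'species' column from the example above.
species = [0, 0, 1, 1, 2]
positions = normalize(species)
print(positions)  # [0.0, 0.0, 0.5, 0.5, 1.0]

# Hypothetical two-color ramp standing in for a real colormap.
ramp_start, ramp_end = (0, 0, 100), (200, 100, 0)
colors = [lerp_color(ramp_start, ramp_end, t) for t in positions]
print(colors[2])  # (100, 50, 50)
```

Points that share a species value receive the same color, which is what makes the groups visible in the scatter plot.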
| Parameters | |
|---|---|
| Name | Description |
| x | int or str. The column name or column position to be used as horizontal coordinates for each point. |
| y | int or str. The column name or column position to be used as vertical coordinates for each point. |
| s | str, scalar or array-like, optional. The size of each point. Possible values are: a string with the name of the column to be used for the marker's size, or a single scalar so all points have the same size. |
| c | str, int or array-like, optional. The color of each point. Possible values are: a single color string referred to by name, RGB or RGBA code, for instance 'red' or '#a98d19', or a column name or position whose values will be used to color the marker points according to a colormap. |
| sampling_n | int, default 100. Number of random items for plotting. |
| sampling_random_state | int, default 0. Seed for random number generator. |
| Returns | |
|---|---|
| Type | Description |
| matplotlib.axes.Axes or np.ndarray of them | An ndarray is returned with one matplotlib.axes.Axes per column when subplots=True. |
PlotAccessor
PlotAccessor(data)
| Returns | |
|---|---|
| Type | Description |
| matplotlib.axes.Axes or np.ndarray of them | An ndarray is returned with one matplotlib.axes.Axes per column when subplots=True. |