- 2.25.0 (latest)
- 2.24.0
- 2.23.0
- 2.22.0
- 2.21.0
- 2.20.0
- 2.19.0
- 2.18.0
- 2.17.0
- 2.16.0
- 2.15.0
- 2.14.0
- 2.13.0
- 2.12.0
- 2.11.0
- 2.10.0
- 2.9.0
- 2.8.0
- 2.7.0
- 2.6.0
- 2.5.0
- 2.4.0
- 2.3.0
- 2.2.0
- 1.36.0
- 1.35.0
- 1.34.0
- 1.33.0
- 1.32.0
- 1.31.0
- 1.30.0
- 1.29.0
- 1.28.0
- 1.27.0
- 1.26.0
- 1.25.0
- 1.24.0
- 1.22.0
- 1.21.0
- 1.20.0
- 1.19.0
- 1.18.0
- 1.17.0
- 1.16.0
- 1.15.0
- 1.14.0
- 1.13.0
- 1.12.0
- 1.11.1
- 1.10.0
- 1.9.0
- 1.8.0
- 1.7.0
- 1.6.0
- 1.5.0
- 1.4.0
- 1.3.0
- 1.2.0
- 1.1.0
- 1.0.0
- 0.26.0
- 0.25.0
- 0.24.0
- 0.23.0
- 0.22.0
- 0.21.0
- 0.20.1
- 0.19.2
- 0.18.0
- 0.17.0
- 0.16.0
- 0.15.0
- 0.14.1
- 0.13.0
- 0.12.0
- 0.11.0
- 0.10.0
- 0.9.0
- 0.8.0
- 0.7.0
- 0.6.0
- 0.5.0
- 0.4.0
- 0.3.0
- 0.2.0
DataFrameGroupBy(
    block: bigframes.core.blocks.Block,
    by_col_ids: typing.Sequence[str],
    *,
    selected_cols: typing.Optional[typing.Sequence[str]] = None,
    dropna: bool = True,
    as_index: bool = True
)Class for grouping and aggregating relational data.
Methods
agg
agg(
    func=None, **kwargs
) -> typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]Aggregate using one or more operations.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> data = {"A": [1, 1, 2, 2],
...         "B": [1, 2, 3, 4],
...         "C": [0.362838, 0.227877, 1.267767, -0.562860]}
>>> df = bpd.DataFrame(data)
The aggregation is for each column.
>>> df.groupby('A').agg('min')
    B         C
A
1  1  0.227877
2  3  -0.56286
<BLANKLINE>
[2 rows x 2 columns]
Multiple aggregations
>>> df.groupby('A').agg(['min', 'max'])
    B             C
       min max       min       max
A
1        1   2  0.227877  0.362838
2        3   4  -0.56286  1.267767
<BLANKLINE>
[2 rows x 4 columns]
| Parameter | |
|---|---|
| Name | Description | 
| func | function, str, list, dict or NoneFunction to use for aggregating the data. Accepted combinations are: - string function name - list of function names, e.g.  | 
aggregate
aggregate(
    func=None, **kwargs
) -> typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]Aggregate using one or more operations.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> data = {"A": [1, 1, 2, 2],
...         "B": [1, 2, 3, 4],
...         "C": [0.362838, 0.227877, 1.267767, -0.562860]}
>>> df = bpd.DataFrame(data)
The aggregation is for each column.
>>> df.groupby('A').aggregate('min')
    B         C
A
1  1  0.227877
2  3  -0.56286
<BLANKLINE>
[2 rows x 2 columns]
Multiple aggregations
>>> df.groupby('A').agg(['min', 'max'])
    B             C
       min max       min       max
A
1        1   2  0.227877  0.362838
2        3   4  -0.56286  1.267767
<BLANKLINE>
[2 rows x 4 columns]
| Parameter | |
|---|---|
| Name | Description | 
| func | function, str, list, dict or NoneFunction to use for aggregating the data. Accepted combinations are: - string function name - list of function names, e.g.  | 
all
all() -> bigframes.dataframe.DataFrameReturn True if all values in the group are true, else False.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b']
>>> ser = bpd.Series([1, 2, 0], index=lst)
>>> ser.groupby(level=0).all()
a     True
b    False
dtype: boolean
For DataFrameGroupBy:
>>> data = [[1, 0, 3], [1, 5, 6], [7, 8, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                    index=["ostrich", "penguin", "parrot"])
>>> df.groupby(by=["a"]).all()
        b       c
a
1   False    True
7   True    True
<BLANKLINE>
[2 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | DataFrame or Series of boolean values, where a value is True if all elements are True within its respective group; otherwise False. | 
any
any() -> bigframes.dataframe.DataFrameReturn True if any value in the group is true, else False.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b']
>>> ser = bpd.Series([1, 2, 0], index=lst)
>>> ser.groupby(level=0).any()
a     True
b    False
dtype: boolean
For DataFrameGroupBy:
>>> data = [[1, 0, 3], [1, 0, 6], [7, 1, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                    index=["ostrich", "penguin", "parrot"])
>>> df.groupby(by=["a"]).any()
        b       c
a
1   False    True
7   True    True
<BLANKLINE>
[2 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | DataFrame or Series of boolean values, where a value is True if any element is True within its respective group; otherwise False. | 
count
count() -> bigframes.dataframe.DataFrameCompute count of group, excluding missing values.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b']
>>> ser = bpd.Series([1, 2, np.nan], index=lst)
>>> ser.groupby(level=0).count()
a     2
b     0
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, np.nan, 3], [1, np.nan, 6], [7, 8, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                    index=["cow", "horse", "bull"])
>>> df.groupby(by=["a"]).count()
   b  c
a
1  0  2
7  1  1
<BLANKLINE>
[2 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Count of values within each group. | 
cumcount
cumcount(ascending: bool = True)Number each item in each group from 0 to the length of that group - 1. (DataFrameGroupBy functionality is not yet available.)
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b', 'b', 'c']
>>> ser = bpd.Series([5, 1, 2, 3, 4], index=lst)
>>> ser.groupby(level=0).cumcount()
a    0
a    1
b    0
b    1
c    0
dtype: Int64
>>> ser.groupby(level=0).cumcount(ascending=False)
a    0
a    1
b    0
b    1
c    0
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| ascending | bool, default TrueIf False, number in reverse, from length of group - 1 to 0. | 
| Returns | |
|---|---|
| Type | Description | 
| Series | Sequence number of each element within each group. | 
cummax
cummax(
    *args, numeric_only: bool = False, **kwargs
) -> bigframes.dataframe.DataFrameCumulative max for each group.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b']
>>> ser = bpd.Series([6, 2, 0], index=lst)
>>> ser.groupby(level=0).cummax()
a    6
a    6
b    0
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                   index=["fox", "gorilla", "lion"])
>>> df.groupby("a").cummax()
         b  c
fox      8  2
gorilla  8  5
lion     6  9
<BLANKLINE>
[3 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative max for each group. | 
cummin
cummin(
    *args, numeric_only: bool = False, **kwargs
) -> bigframes.dataframe.DataFrameCumulative min for each group.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b']
>>> ser = bpd.Series([6, 2, 0], index=lst)
>>> ser.groupby(level=0).cummin()
a    6
a    2
b    0
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                   index=["fox", "gorilla", "lion"])
>>> df.groupby("a").cummin()
         b  c
fox      8  2
gorilla  2  2
lion     6  9
<BLANKLINE>
[3 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative min for each group. | 
cumprod
cumprod(*args, **kwargs) -> bigframes.dataframe.DataFrameCumulative product for each group.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b']
>>> ser = bpd.Series([6, 2, 0], index=lst)
>>> ser.groupby(level=0).cumprod()
a     6.0
a    12.0
b     0.0
dtype: Float64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                   index=["cow", "horse", "bull"])
>>> df.groupby("a").cumprod()
          b     c
cow     8.0   2.0
horse  16.0  10.0
bull    6.0   9.0
<BLANKLINE>
[3 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative product for each group. | 
cumsum
cumsum(
    *args, numeric_only: bool = False, **kwargs
) -> bigframes.dataframe.DataFrameCumulative sum for each group.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b']
>>> ser = bpd.Series([6, 2, 0], index=lst)
>>> ser.groupby(level=0).cumsum()
a    6
a    8
b    0
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                   index=["fox", "gorilla", "lion"])
>>> df.groupby("a").cumsum()
          b  c
fox       8  2
gorilla  10  7
lion      6  9
<BLANKLINE>
[3 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative sum for each group. | 
diff
diff(periods=1) -> bigframes.series.SeriesFirst discrete difference of element. Calculates the difference of each element compared with another element in the group (default is element in previous row).
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'a', 'b', 'b', 'b']
>>> ser = bpd.Series([7, 2, 8, 4, 3, 3], index=lst)
>>> ser.groupby(level=0).diff()
a    <NA>
a      -5
a       6
b    <NA>
b      -1
b       0
dtype: Int64
For DataFrameGroupBy:
>>> data = {'a': [1, 3, 5, 7, 7, 8, 3], 'b': [1, 4, 8, 4, 4, 2, 1]}
>>> df = bpd.DataFrame(data, index=['dog', 'dog', 'dog',
...                   'mouse', 'mouse', 'mouse', 'mouse'])
>>> df.groupby(level=0).diff()
          a     b
dog    <NA>  <NA>
dog       2     3
dog       2     4
mouse  <NA>  <NA>
mouse     0     0
mouse     1    -2
mouse    -5    -1
<BLANKLINE>
[7 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | First differences. | 
expanding
expanding(min_periods: int = 1) -> bigframes.core.window.WindowProvides expanding functionality.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'c', 'c', 'e']
>>> ser = bpd.Series([1, 0, -2, -1, 2], index=lst)
>>> ser.groupby(level=0).expanding().min()
index  index
a      a         1
       a         0
c      c        -2
       c        -2
e      e         2
dtype: Int64
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | An expanding grouper, providing expanding functionality per group. | 
head
head(n: int = 5) -> bigframes.dataframe.DataFrameAPI documentation for head method.
kurt
kurt(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrameReturn unbiased kurtosis over requested axis.
Kurtosis obtained using Fisher's definition of kurtosis (kurtosis of normal == 0.0). Normalized by N-1.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b']
>>> ser = bpd.Series([0, 1, 1, 0, 0, 1, 2, 4, 5], index=lst)
>>> ser.groupby(level=0).kurt()
a        -6.0
b   -1.963223
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only  | 
kurtosis
kurtosis(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrameReturn unbiased kurtosis over requested axis.
Kurtosis obtained using Fisher's definition of kurtosis (kurtosis of normal == 0.0). Normalized by N-1.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b']
>>> ser = bpd.Series([0, 1, 1, 0, 0, 1, 2, 4, 5], index=lst)
>>> ser.groupby(level=0).kurtosis()
a        -6.0
b   -1.963223
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only  | 
max
max(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrameCompute max of group values.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b', 'b']
>>> ser = bpd.Series([1, 2, 3, 4], index=lst)
>>> ser.groupby(level=0).max()
a     2
b     4
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 5, 8], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                    index=["tiger", "leopard", "cheetah", "lion"])
>>> df.groupby(by=["a"]).max()
   b  c
a
1  8  5
2  6  9
<BLANKLINE>
[2 rows x 2 columns]
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only float, int, boolean columns. | 
| min_count | int, default 0The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed max of values within each group. | 
mean
mean(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrameCompute mean of groups, excluding missing values.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'A': [1, 1, 2, 1, 2],
...                    'B': [np.nan, 2, 3, 4, 5],
...                    'C': [1, 2, 1, 1, 2]}, columns=['A', 'B', 'C'])
Groupby one column and return the mean of the remaining columns in each group.
>>> df.groupby('A').mean()
    B         C
A
1  3.0  1.333333
2  4.0       1.5
<BLANKLINE>
[2 rows x 2 columns]
Groupby two columns and return the mean of the remaining column.
>>> df.groupby(['A', 'B']).mean()
         C
A B
1 2.0  2.0
  4.0  1.0
2 3.0  1.0
  5.0  2.0
<BLANKLINE>
[4 rows x 1 columns]
Groupby one column and return the mean of only particular column in the group.
>>> df.groupby('A')['B'].mean()
A
1    3.0
2    4.0
Name: B, dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only float, int, boolean columns. | 
| Returns | |
|---|---|
| Type | Description | 
| pandas.Series or pandas.DataFrame | Mean of groups. | 
median
median(
    numeric_only: bool = False, *, exact: bool = True
) -> bigframes.dataframe.DataFrameCompute median of groups, excluding missing values.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'a', 'b', 'b', 'b']
>>> ser = bpd.Series([7, 2, 8, 4, 3, 3], index=lst)
>>> ser.groupby(level=0).median()
a    7.0
b    3.0
dtype: Float64
For DataFrameGroupBy:
>>> data = {'a': [1, 3, 5, 7, 7, 8, 3], 'b': [1, 4, 8, 4, 4, 2, 1]}
>>> df = bpd.DataFrame(data, index=['dog', 'dog', 'dog',
...                    'mouse', 'mouse', 'mouse', 'mouse'])
>>> df.groupby(level=0).median()
        a    b
dog    3.0  4.0
mouse  7.0  3.0
<BLANKLINE>
[2 rows x 2 columns]
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only float, int, boolean columns. | 
| exact | bool, default TrueCalculate the exact median instead of an approximation. | 
| Returns | |
|---|---|
| Type | Description | 
| pandas.Series or pandas.DataFrame | Median of groups. | 
min
min(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrameCompute min of group values.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b', 'b']
>>> ser = bpd.Series([1, 2, 3, 4], index=lst)
>>> ser.groupby(level=0).min()
a     1
b     3
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 5, 8], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                    index=["tiger", "leopard", "cheetah", "lion"])
>>> df.groupby(by=["a"]).min()
   b  c
a
1  2  2
2  5  8
<BLANKLINE>
[2 rows x 2 columns]
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only float, int, boolean columns. | 
| min_count | int, default 0The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed min of values within each group. | 
nunique
nunique() -> bigframes.dataframe.DataFrameReturn DataFrame with counts of unique elements in each position.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'id': ['spam', 'egg', 'egg', 'spam',
...                           'ham', 'ham'],
...                    'value1': [1, 5, 5, 2, 5, 5],
...                    'value2': list('abbaxy')})
>>> df.groupby('id').nunique()
      value1  value2
id
egg        1       1
ham        1       2
spam       2       1
<BLANKLINE>
[3 rows x 2 columns]
prod
prod(numeric_only: bool = False, min_count: int = 0)Compute prod of group values. (DataFrameGroupBy functionality is not yet available.)
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b', 'b']
>>> ser = bpd.Series([1, 2, 3, 4], index=lst)
>>> ser.groupby(level=0).prod()
a     2.0
b    12.0
dtype: Float64
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only float, int, boolean columns. | 
| min_count | int, default 0The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed prod of values within each group. | 
quantile
quantile(
    q: typing.Union[float, typing.Sequence[float]] = 0.5, *, numeric_only: bool = False
) -> bigframes.dataframe.DataFrameReturn group values at the given quantile, a la numpy.percentile.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([
...     ['a', 1], ['a', 2], ['a', 3],
...     ['b', 1], ['b', 3], ['b', 5]
... ], columns=['key', 'val'])
>>> df.groupby('key').quantile()
     val
key
a    2.0
b    3.0
<BLANKLINE>
[2 rows x 1 columns]
| Parameters | |
|---|---|
| Name | Description | 
| q | float or array-like, default 0.5 (50% quantile)Value(s) between 0 and 1 providing the quantile(s) to compute. | 
| numeric_only | bool, default FalseInclude only  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Return type determined by caller of GroupBy object. | 
rolling
rolling(window: int, min_periods=None) -> bigframes.core.window.WindowReturns a rolling grouper, providing rolling functionality per group.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'a', 'a', 'e']
>>> ser = bpd.Series([1, 0, -2, -1, 2], index=lst)
>>> ser.groupby(level=0).rolling(2).min()
index  index
a      a        <NA>
    a           0
    a          -2
    a          -2
e      e        <NA>
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| min_periods | int, default NoneMinimum number of observations in window required to have a value; otherwise, result is  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Return a new grouper with our rolling appended. | 
shift
shift(periods=1) -> bigframes.series.SeriesShift each group by periods observations.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b', 'b']
>>> ser = bpd.Series([1, 2, 3, 4], index=lst)
>>> ser.groupby(level=0).shift(1)
a    <NA>
a       1
b    <NA>
b       3
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, 2, 3], [1, 5, 6], [2, 5, 8], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                   index=["tuna", "salmon", "catfish", "goldfish"])
>>> df.groupby("a").shift(1)
             b     c
tuna      <NA>  <NA>
salmon       2     3
catfish   <NA>  <NA>
goldfish     5     8
<BLANKLINE>
[4 rows x 2 columns]
| Parameter | |
|---|---|
| Name | Description | 
| periods | int, default 1Number of periods to shift. | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Object shifted within each group. | 
size
size() -> typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]API documentation for size method.
skew
skew(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrameReturn unbiased skew within groups.
Normalized by N-1.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> ser = bpd.Series([390., 350., 357., np.nan, 22., 20., 30.],
...                  index=['Falcon', 'Falcon', 'Falcon', 'Falcon',
...                         'Parrot', 'Parrot', 'Parrot'],
...                  name="Max Speed")
>>> ser.groupby(level=0).skew()
Falcon    1.525174
Parrot    1.457863
Name: Max Speed, dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only  | 
std
std(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrameCompute standard deviation of groups, excluding missing values.
For multiple groupings, the result index will be a MultiIndex.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'a', 'b', 'b', 'b']
>>> ser = bpd.Series([7, 2, 8, 4, 3, 3], index=lst)
>>> ser.groupby(level=0).std()
a     3.21455
b     0.57735
dtype: Float64
For DataFrameGroupBy:
>>> data = {'a': [1, 3, 5, 7, 7, 8, 3], 'b': [1, 4, 8, 4, 4, 2, 1]}
>>> df = bpd.DataFrame(data, index=['dog', 'dog', 'dog',
...                    'mouse', 'mouse', 'mouse', 'mouse'])
>>> df.groupby(level=0).std()
              a         b
dog         2.0  3.511885
mouse  2.217356       1.5
<BLANKLINE>
[2 rows x 2 columns]
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Standard deviation of values within each group. | 
sum
sum(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrameCompute sum of group values.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'b', 'b']
>>> ser = bpd.Series([1, 2, 3, 4], index=lst)
>>> ser.groupby(level=0).sum()
a     3
b     7
dtype: Int64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 5, 8], [2, 6, 9]]
>>> df = bpd.DataFrame(data, columns=["a", "b", "c"],
...                   index=["tiger", "leopard", "cheetah", "lion"])
>>> df.groupby("a").sum()
    b   c
a
1  10   7
2  11  17
<BLANKLINE>
[2 rows x 2 columns]
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only float, int, boolean columns. | 
| min_count | int, default 0The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed sum of values within each group. | 
var
var(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrameCompute variance of groups, excluding missing values.
For multiple groupings, the result index will be a MultiIndex.
Examples:
For SeriesGroupBy:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> lst = ['a', 'a', 'a', 'b', 'b', 'b']
>>> ser = bpd.Series([7, 2, 8, 4, 3, 3], index=lst)
>>> ser.groupby(level=0).var()
a   10.333333
b    0.333333
dtype: Float64
For DataFrameGroupBy:
>>> data = {'a': [1, 3, 5, 7, 7, 8, 3], 'b': [1, 4, 8, 4, 4, 2, 1]}
>>> df = bpd.DataFrame(data, index=['dog', 'dog', 'dog',
...                    'mouse', 'mouse', 'mouse', 'mouse'])
>>> df.groupby(level=0).var()
              a          b
dog         4.0  12.333333
mouse  4.916667       2.25
<BLANKLINE>
[2 rows x 2 columns]
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default FalseInclude only  |