plydata.helper_verbs.group_by_all

class plydata.helper_verbs.group_by_all(*args, **kwargs)

Groupby all columns
Parameters
- data : dataframe, optional
  Useful when not using the >> operator.
- functions : callable or tuple or dict or str
  Functions to alter the columns:
  - function (any callable) - The function is applied to each column and the result columns replace the original columns.
  - tuple of functions - Each function is applied to all of the columns and the name (__name__) of the function is postfixed to the resulting column names.
  - dict of the form {'name': function} - Allows you to apply one or more functions and also control the postfix to the name.
  - str - A string can be used for more complex statements, but the resulting names will be terrible.
- args : tuple
  Arguments to the functions. The arguments are passed to all functions.
- kwargs : dict
  Keyword arguments to the functions. The keyword arguments are passed to all functions.
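The naming rules for the functions parameter can be easier to follow in a small pure-Python sketch. This is not plydata's implementation: apply_to_all is a hypothetical helper that works on a plain dict of lists instead of a dataframe, assumes an underscore separator (as in the x_cat column names in the examples), assumes a tuple always keeps the original columns, and skips the str form entirely.

```python
def apply_to_all(columns, functions, *args, **kwargs):
    """Hypothetical illustration of the column naming rules.

    columns   : dict mapping column name -> list of values
    functions : a callable, a tuple of callables, or a dict {postfix: callable}
    args, kwargs are forwarded to every function.
    """
    # A single callable: results replace the original columns.
    if callable(functions):
        return {name: functions(values, *args, **kwargs)
                for name, values in columns.items()}

    # A tuple of callables: use each function's __name__ as the postfix.
    if isinstance(functions, tuple):
        functions = {func.__name__: func for func in functions}

    # A dict (or converted tuple): keep the originals, add postfixed columns.
    result = dict(columns)
    for postfix, func in functions.items():
        for name, values in columns.items():
            result[f"{name}_{postfix}"] = func(values, *args, **kwargs)
    return result


columns = {'x': [1, 2, 3], 'y': [6, 5, 4]}

replaced = apply_to_all(columns, sorted)                 # same column names
reversed_ = apply_to_all(columns, sorted, reverse=True)  # kwargs reach every function
by_name = apply_to_all(columns, (min, max))              # adds x_min, y_min, x_max, y_max
by_postfix = apply_to_all(columns, {'top': max})         # adds x_top, y_top
```

In this sketch the single-callable branch is the only one that drops the original values, which mirrors the statement that the result columns replace the originals in that case.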
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from plydata import *
>>> df = pd.DataFrame({
...     'alpha': list('aaabbb'),
...     'beta': list('babruq'),
...     'theta': list('cdecde'),
...     'x': [1, 2, 3, 4, 5, 6],
...     'y': [6, 5, 4, 3, 2, 1],
...     'z': [7, 9, 11, 8, 10, 12]
... })
Grouping by all the columns
>>> df >> group_by_all()
groups: ['alpha', 'beta', 'theta', 'x', 'y', 'z']
  alpha beta theta  x  y   z
0     a    b     c  1  6   7
1     a    a     d  2  5   9
2     a    b     e  3  4  11
3     b    r     c  4  3   8
4     b    u     d  5  2  10
5     b    q     e  6  1  12
Grouping by all columns after applying a function to them. The output looks the same as above, but all the columns are now categorical.
>>> result = df >> group_by_all(pd.Categorical)
>>> result
groups: ['alpha', 'beta', 'theta', 'x', 'y', 'z']
  alpha beta theta  x  y   z
0     a    b     c  1  6   7
1     a    a     d  2  5   9
2     a    b     e  3  4  11
3     b    r     c  4  3   8
4     b    u     d  5  2  10
5     b    q     e  6  1  12
>>> result['x']
0    1
1    2
2    3
3    4
4    5
5    6
Name: x, dtype: category
Categories (6, int64): [1, 2, 3, 4, 5, 6]
If you apply more than one function or provide a postfix, the original columns are retained.
>>> (df
...  >> select('x', 'y', 'z')
...  >> group_by_all(dict(cat=pd.Categorical)))
groups: ['x_cat', 'y_cat', 'z_cat']
   x  y   z x_cat y_cat z_cat
0  1  6   7     1     6     7
1  2  5   9     2     5     9
2  3  4  11     3     4    11
3  4  3   8     4     3     8
4  5  2  10     5     2    10
5  6  1  12     6     1    12