plydata.helper_verbs.add_tally

class plydata.helper_verbs.add_tally(*args, **kwargs)

Add a column with the tally of items in each group

Similar to tally, but it adds a column and does not collapse the groups.
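The difference between collapsing the groups and keeping every row can be sketched in plain Python. This is only an illustration of the semantics, not plydata's implementation; the names tally_sketch and add_tally_sketch and the list-of-dicts data layout are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for a dataframe: one dict per row.
rows = [
    {'x': 1, 'y': 'a', 'w': 1},
    {'x': 2, 'y': 'b', 'w': 2},
    {'x': 3, 'y': 'a', 'w': 1},
    {'x': 4, 'y': 'b', 'w': 2},
    {'x': 5, 'y': 'a', 'w': 1},
    {'x': 6, 'y': 'b', 'w': 2},
]


def tally_sketch(rows, key, weight=None):
    """Collapse the groups: return one total per group."""
    totals = defaultdict(int)
    for row in rows:
        # Without a weight column each row counts as 1.
        totals[row[key]] += row[weight] if weight else 1
    return dict(totals)


def add_tally_sketch(rows, key, weight=None):
    """Keep every row and attach its group's total as column 'n'."""
    totals = tally_sketch(rows, key, weight)
    return [{**row, 'n': totals[row[key]]} for row in rows]


collapsed = tally_sketch(rows, 'y', 'w')      # one entry per group
expanded = add_tally_sketch(rows, 'y', 'w')   # same number of rows as the input
```

The collapsed form has one entry per group, while the expanded form has as many rows as the input, each carrying its group's total.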

Parameters
data : dataframe, optional

Useful when not using the >> operator.

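weights : str or array-like, optional

Weight of each row in the group. If given, the tally is the sum of the weights instead of the number of rows. A string may be a column name or an expression such as 'x*w'.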

sort : bool, optional

If True, sort the resulting data in descending order.

See also

add_count

Examples

>>> import pandas as pd
>>> from plydata import *
>>> df = pd.DataFrame({
...     'x': [1, 2, 3, 4, 5, 6],
...     'y': ['a', 'b', 'a', 'b', 'a', 'b'],
...     'w': [1, 2, 1, 2, 1, 2]})

Without groups, the whole dataframe is treated as one large group

>>> df >> add_tally()
   x  y  w  n
0  1  a  1  6
1  2  b  2  6
2  3  a  1  6
3  4  b  2  6
4  5  a  1  6
5  6  b  2  6

Sum of the weights

>>> df >> add_tally('w')
   x  y  w  n
0  1  a  1  9
1  2  b  2  9
2  3  a  1  9
3  4  b  2  9
4  5  a  1  9
5  6  b  2  9

With groups

>>> df >> group_by('y') >> add_tally()
groups: ['y']
   x  y  w  n
0  1  a  1  3
1  2  b  2  3
2  3  a  1  3
3  4  b  2  3
4  5  a  1  3
5  6  b  2  3

With groups and weights

>>> df >> group_by('y') >> add_tally('w')
groups: ['y']
   x  y  w  n
0  1  a  1  3
1  2  b  2  6
2  3  a  1  3
3  4  b  2  6
4  5  a  1  3
5  6  b  2  6

Applying the weights to a column

>>> df >> group_by('y') >> add_tally('x*w')
groups: ['y']
   x  y  w   n
0  1  a  1   9
1  2  b  2  24
2  3  a  1   9
3  4  b  2  24
4  5  a  1   9
5  6  b  2  24

add_tally is equivalent to using sum() or n() in define.

>>> df >> group_by('y') >> define(n='sum(x*w)')
groups: ['y']
   x  y  w   n
0  1  a  1   9
1  2  b  2  24
2  3  a  1   9
3  4  b  2  24
4  5  a  1   9
5  6  b  2  24
>>> df >> group_by('y') >> define(n='n()')
groups: ['y']
   x  y  w  n
0  1  a  1  3
1  2  b  2  3
2  3  a  1  3
3  4  b  2  3
4  5  a  1  3
5  6  b  2  3

This is the same result as df >> group_by('y') >> add_tally() above.
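For comparison, the grouped tallies can be approximated in plain pandas with groupby and transform. This is a sketch of an equivalent computation under the same example data, not a description of how plydata implements add_tally; the variable names counts and weighted are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 3, 4, 5, 6],
    'y': ['a', 'b', 'a', 'b', 'a', 'b'],
    'w': [1, 2, 1, 2, 1, 2]})

# Unweighted tally: count the rows in each 'y' group and broadcast the
# count back to every row. 'count' skips missing values, which is safe
# here because the grouping column has none.
counts = df.assign(n=df.groupby('y')['y'].transform('count'))

# Weighted tally: sum x*w within each 'y' group and broadcast the sum
# back to every row.
weighted = df.assign(
    n=(df['x'] * df['w']).groupby(df['y']).transform('sum'))
```

Unlike the plydata result, these dataframes do not carry the grouping along, so any later step that needs the groups has to call groupby again.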