plydata.helper_verbs.count

class plydata.helper_verbs.count(*args, **kwargs)

Count observations by group

count is a convenient wrapper for summarise that calls either n or sum(n), depending on whether you are tallying for the first time or re-tallying. It is similar to tally, but it does the group_by for you.
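The difference between tallying and re-tallying can be illustrated with plain pandas. This is only a sketch of the idea, not plydata's implementation; the column z and the variable names are made up for the illustration.

```python
import pandas as pd

# Hypothetical data: y is the grouping of interest, z is a second key.
df = pd.DataFrame({
    'y': ['a', 'b', 'a', 'b', 'a', 'b'],
    'z': ['p', 'p', 'q', 'q', 'q', 'q']})

# First tally: count the rows in each (y, z) combination (the n case).
tally = df.groupby(['y', 'z']).size().reset_index(name='n')

# Re-tally: collapse to y alone by summing the existing counts
# (the sum(n) case). Counting the rows of `tally` instead would
# report 2 per group rather than the true 3.
retally = tally.groupby('y')['n'].sum().reset_index(name='n')

print(tally)
print(retally)
```

Summing n when re-tallying keeps the totals equal to the number of original observations.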

Parameters

data : dataframe, optional
    Useful when not using the >> operator.
*args : str, list
    Columns to group by.
weights : str or array-like, optional
    Weight of each row in the group.
sort : bool, optional
    If True, sort the resulting data in descending order.

Examples

>>> import pandas as pd
>>> from plydata import count, group_by, summarize
>>> df = pd.DataFrame({
...     'x': [1, 2, 3, 4, 5, 6],
...     'y': ['a', 'b', 'a', 'b', 'a', 'b'],
...     'w': [1, 2, 1, 2, 1, 2]})

Without groups, the whole dataframe is treated as one large group

>>> df >> count()
   n
0  6

With weights and no groups, n is the sum of the weights

>>> df >> count(weights='w')
   n
0  9

With groups

>>> df >> count('y')
   y  n
0  a  3
1  b  3

With groups and weights

>>> df >> count('y', weights='w')
   y  n
0  a  3
1  b  6

The weights can also be an expression, here the column x scaled by the weights w

>>> df >> count('y', weights='x*w')
   y   n
0  a   9
1  b  24

The same result can be obtained with group_by and summarize

>>> df >> group_by('y') >> summarize(n='sum(x*w)')
   y   n
0  a   9
1  b  24
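For comparison, the grouped results above can be approximated with plain pandas. This is a rough sketch rather than what plydata does internally; the variable names are illustrative, and the last step assumes that sort=True orders the rows by n.

```python
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 3, 4, 5, 6],
    'y': ['a', 'b', 'a', 'b', 'a', 'b'],
    'w': [1, 2, 1, 2, 1, 2]})

# Roughly count('y'): number of rows in each group.
counts = df.groupby('y').size().reset_index(name='n')

# Roughly count('y', weights='w'): sum of the weights in each group.
weighted = df.groupby('y')['w'].sum().reset_index(name='n')

# Roughly count('y', weights='x*w'): evaluate the expression per row,
# then sum it within each group.
expr = (df.assign(n=df['x'] * df['w'])
          .groupby('y', as_index=False)['n'].sum())

# Roughly sort=True (assumption: ordering is by n, largest first).
weighted_sorted = weighted.sort_values('n', ascending=False)

print(counts)
print(weighted)
print(expr)
print(weighted_sorted)
```

The pandas versions make the weighting explicit: an unweighted count is the group size, while a weighted count is a per-group sum.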