plydata.cat_tools.cat_infreq

plydata.cat_tools.cat_infreq(c, ordered=None)

Reorder categorical by frequency of the values

Parameters
c : list-like

Values that will make up the categorical.

ordered : bool

If True, the categorical is ordered.

Returns
out : categorical

Values, with the categories ordered from most to least frequent.

Examples

>>> import pandas as pd
>>> from plydata.cat_tools import cat_infreq
>>> x = ['d', 'a', 'b', 'b', 'c', 'c', 'c']
>>> cat_infreq(x)
['d', 'a', 'b', 'b', 'c', 'c', 'c']
Categories (4, object): ['c', 'b', 'd', 'a']
>>> cat_infreq(x, ordered=True)
['d', 'a', 'b', 'b', 'c', 'c', 'c']
Categories (4, object): ['c' < 'b' < 'd' < 'a']

When two or more values occur the same number of times, their relative order depends on the input. If the input is a categorical, the existing order of its categories is preserved. If it is not, the order depends on that of the values. Above, 'd' comes before 'a'; below, 'a' comes before 'd'.

>>> c = pd.Categorical(
...     x, categories=['a', 'c', 'b', 'd']
... )
>>> cat_infreq(c)
['d', 'a', 'b', 'b', 'c', 'c', 'c']
Categories (4, object): ['c', 'b', 'a', 'd']
>>> cat_infreq(c.set_ordered(True))
['d', 'a', 'b', 'b', 'c', 'c', 'c']
Categories (4, object): ['c' < 'b' < 'a' < 'd']
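
The tie-breaking behaviour shown in the examples can be sketched in plain Python. This is only an illustration of the ordering rule, not plydata's implementation: infreq_order is a hypothetical helper, and it assumes a stable sort on descending counts is enough to reproduce the category orders above.

```python
from collections import Counter


def infreq_order(values, categories=None):
    """Return the distinct values sorted from most to least frequent.

    Hypothetical helper, not part of plydata. Ties keep the order of
    `categories` when it is given, otherwise the order of first appearance.
    """
    counts = Counter(values)
    # Without explicit categories, fall back to first-appearance order,
    # which is the insertion order of the Counter.
    base = list(categories) if categories is not None else list(counts)
    # sorted() is stable, so equally frequent items keep their order in `base`.
    return sorted(base, key=lambda v: -counts[v])


x = ['d', 'a', 'b', 'b', 'c', 'c', 'c']
print(infreq_order(x))                                   # ['c', 'b', 'd', 'a']
print(infreq_order(x, categories=['a', 'c', 'b', 'd']))  # ['c', 'b', 'a', 'd']
```

In both calls 'd' and 'a' are tied with one occurrence each, so only the starting order decides which comes first.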