plydata.cat_tools.cat_collapse

plydata.cat_tools.cat_collapse(c, mapping, group_other=False)

Collapse categories into manually defined groups

Parameters
c : list-like

Values that will make up the categorical.

mapping : dict

New categories and the old categories contained in them.

group_other : bool (default: False)

If True, a category is created to hold all categories that have not been explicitly collapsed. This category is named 'other'. If a category with that name already exists, the name is suffixed with the first available integer, starting from 2 (for example, 'other2').
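
The naming rule for the catch-all category can be sketched in plain Python. This is an illustration only, not plydata's implementation: `pick_other_name` and its `base` parameter are hypothetical names, and the sketch assumes a clash is checked against the category names already in use.

```python
def pick_other_name(existing, base="other"):
    """Return `base`, or `base` plus the first free integer suffix >= 2.

    Hypothetical helper for illustration; not part of plydata.
    """
    taken = set(existing)
    if base not in taken:
        return base
    n = 2
    # Try other2, other3, ... until an unused name is found.
    while f"{base}{n}" in taken:
        n += 1
    return f"{base}{n}"


name_free = pick_other_name(["first_2", "second_2"])
name_clash = pick_other_name(["other", "another"])
print(name_free)   # other
print(name_clash)  # other2
```

The second call mirrors the documented case in which the mapping already defines a category called 'other'.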

Returns
out : categorical

Values, with the categories collapsed as specified by mapping.

Examples

>>> from plydata.cat_tools import cat_collapse, cat_rev
>>> c = ['a', 'b', 'c', 'd', 'e', 'f']
>>> mapping = {'first_2': ['a', 'b'], 'second_2': ['c', 'd']}
>>> cat_collapse(c, mapping)
['first_2', 'first_2', 'second_2', 'second_2', 'e', 'f']
Categories (4, object): ['first_2', 'second_2', 'e', 'f']
>>> cat_collapse(c, mapping, group_other=True)
['first_2', 'first_2', 'second_2', 'second_2', 'other', 'other']
Categories (3, object): ['first_2', 'second_2', 'other']

Collapsing preserves the order of the categories

>>> cat_rev(c)
['a', 'b', 'c', 'd', 'e', 'f']
Categories (6, object): ['f', 'e', 'd', 'c', 'b', 'a']
>>> cat_collapse(cat_rev(c), mapping)
['first_2', 'first_2', 'second_2', 'second_2', 'e', 'f']
Categories (4, object): ['f', 'e', 'second_2', 'first_2']
>>> mapping = {'other': ['a', 'b'], 'another': ['c', 'd']}
>>> cat_collapse(c, mapping, group_other=True)
['other', 'other', 'another', 'another', 'other2', 'other2']
Categories (3, object): ['other', 'another', 'other2']
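
The order-preserving behaviour shown above can be modelled on plain lists of category labels. This is a minimal sketch under stated assumptions, not plydata's implementation: `collapse_category_order` is a hypothetical helper, it ignores `group_other`, and it assumes each new category takes the position of the first old category it absorbs. It is checked only against the category orders in the documented examples.

```python
def collapse_category_order(categories, mapping):
    """Collapse category labels, keeping first-seen order.

    Hypothetical helper for illustration; not part of plydata.
    """
    # Invert {new: [old, ...]} into {old: new} for direct lookup.
    old_to_new = {old: new for new, olds in mapping.items() for old in olds}
    collapsed = []
    for cat in categories:
        new = old_to_new.get(cat, cat)  # unmapped categories keep their name
        if new not in collapsed:
            collapsed.append(new)
    return collapsed


mapping = {'first_2': ['a', 'b'], 'second_2': ['c', 'd']}
forward = collapse_category_order(['a', 'b', 'c', 'd', 'e', 'f'], mapping)
reverse = collapse_category_order(['f', 'e', 'd', 'c', 'b', 'a'], mapping)
print(forward)  # ['first_2', 'second_2', 'e', 'f']
print(reverse)  # ['f', 'e', 'second_2', 'first_2']
```

Both results match the `Categories` lines reported for `cat_collapse(c, mapping)` and `cat_collapse(cat_rev(c), mapping)`.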