plydata.cat_tools.cat_anon

plydata.cat_tools.cat_anon(c, prefix='', random_state=None)

Anonymise categories

Neither the value nor the order of the categories is preserved.

Parameters
c : list-like

Values that will make up the categorical.

prefix : str, optional

String prepended to each anonymised category name. Default is the empty string.

random_state : int or RandomState, optional

Seed or random number generator to use. If None, the global NumPy generator numpy.random is used.

Returns
out : categorical

Categorical holding the values of c under anonymised category names.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from plydata.cat_tools import cat_anon
>>> np.random.seed(123)
>>> c = ['a', 'b', 'b', 'c', 'c', 'c']
>>> cat_anon(c)
['0', '1', '1', '2', '2', '2']
Categories (3, object): ['1', '0', '2']
>>> cat_anon(c, 'c-', 321)
['c-1', 'c-2', 'c-2', 'c-0', 'c-0', 'c-0']
Categories (3, object): ['c-0', 'c-2', 'c-1']
>>> cat_anon(pd.Categorical(c, ordered=True), 'c-', 321)
['c-1', 'c-2', 'c-2', 'c-0', 'c-0', 'c-0']
Categories (3, object): ['c-0' < 'c-2' < 'c-1']
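
To make the behaviour in the examples above more concrete, the following is a minimal sketch of how such an anonymisation could be written with pandas alone. It is not plydata's implementation: the helper name anonymise_sketch and its seed argument are hypothetical, and the labels it produces for a given seed will not necessarily match the output of cat_anon. It only illustrates the two properties stated above, namely that category names are replaced and that category order is shuffled.

```python
import numpy as np
import pandas as pd


def anonymise_sketch(values, prefix="", seed=None):
    """Hypothetical helper: relabel and reorder categories at random.

    Assumption: this is an illustration, not the plydata source.
    """
    cat = pd.Categorical(values)
    # Simplification: a local generator is always created here. With
    # seed=None it is seeded from the OS, whereas cat_anon is documented
    # to fall back to the global numpy.random generator.
    rng = np.random.RandomState(seed)
    n = len(cat.categories)

    # Give every existing category a new name built from a random
    # permutation of 0..n-1, so the original values cannot be read off.
    labels = [f"{prefix}{i}" for i in rng.permutation(n)]
    renamed = cat.rename_categories(labels)

    # Shuffle the order in which the categories are stored as well.
    shuffled = [labels[i] for i in rng.permutation(n)]
    return renamed.reorder_categories(shuffled)


c = ["a", "b", "b", "c", "c", "c"]
first = anonymise_sketch(c, prefix="c-", seed=321)
second = anonymise_sketch(c, prefix="c-", seed=321)
```

Calling the sketch twice with the same seed yields the same labelling, which mirrors the role of random_state. Equal input values always map to the same anonymised label, so group sizes are preserved even though the names and order are not.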