plydata.tidy.spread

class plydata.tidy.spread(*args, **kwargs)

Spread a key-value pair across multiple columns

Parameters

data : dataframe, optional
    Useful when not using the >> operator.
key : str
    Name of the variable column.
value : str
    Name of the value column.
sep : str
    Character(s) used to separate the column names. This is used to add a hierarchy and resolve duplicate column names.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from plydata.tidy import spread
>>> df = pd.DataFrame({
...     'name': ['mary', 'oscar', 'martha', 'john'] * 2,
...     'subject': np.repeat(['math', 'art'], 4),
...     'grade': [92, 83, 85, 90, 75, 95, 80, 72]
... })
>>> df
     name subject  grade
0    mary    math     92
1   oscar    math     83
2  martha    math     85
3    john    math     90
4    mary     art     75
5   oscar     art     95
6  martha     art     80
7    john     art     72
>>> df >> spread('subject', 'grade')
     name  art  math
0    john   72    90
1  martha   80    85
2    mary   75    92
3   oscar   95    83
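
For readers more familiar with plain pandas, the reshaping shown above can be approximated with `DataFrame.pivot`. This is only a sketch of a comparable pandas operation, not a description of how `spread` is implemented; the variable name `wide` is introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Same long-format data as in the example above.
df = pd.DataFrame({
    'name': ['mary', 'oscar', 'martha', 'john'] * 2,
    'subject': np.repeat(['math', 'art'], 4),
    'grade': [92, 83, 85, 90, 75, 95, 80, 72]
})

# Turn each distinct value of 'subject' (the key) into its own column,
# filled with the matching 'grade' (the value). 'name' identifies the rows.
wide = df.pivot(index='name', columns='subject', values='grade').reset_index()

# pivot labels the column axis 'subject'; drop that label so the result
# is a flat frame with columns name, art, math.
wide.columns.name = None
```

As in the `spread` output, each name ends up on a single row with one column per subject.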