plydata.tidy.separate_rows

class plydata.tidy.separate_rows(*args, **kwargs)

Separate values of a variable along multiple rows
Parameters
    data : dataframe, optional
        Useful when not using the >> operator.
    *cols : list-like | select | str | slice
        Columns whose values will be separated into multiple rows.
    sep : str | regex
        The pattern at which to separate the variable. The default value separates on a string of non-alphanumeric characters.
    convert : bool
        If True, convert result columns to int, float or bool.
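To make the sep and convert parameters more concrete, here is a minimal pure-Python sketch of what happens to a single cell. It is not plydata's implementation: split_cell, convert_piece and DEFAULT_SEP are hypothetical names, the default pattern is only an assumed approximation of "a string of non-alphanumeric characters", and the real verb converts whole result columns rather than individual pieces.

```python
import re

# Assumed approximation of the default separator: one or more
# non-alphanumeric characters. The exact pattern plydata uses may differ.
DEFAULT_SEP = r'[^0-9A-Za-z]+'


def convert_piece(text):
    """Hypothetical helper: try bool, then int, then float, else keep the string."""
    if text in ('True', 'False'):
        return text == 'True'
    for caster in (int, float):
        try:
            return caster(text)
        except ValueError:
            pass
    return text


def split_cell(value, sep=DEFAULT_SEP, convert=False):
    """Hypothetical helper: split one cell and optionally convert the pieces."""
    pieces = re.split(sep, value)
    if not convert:
        return pieces
    return [convert_piece(p) for p in pieces]


print(split_cell('12,6,4'))                    # ['12', '6', '4']
print(split_cell('12,6,4', convert=True))      # [12, 6, 4]
print(split_cell('joe; vinny', sep=r';\s*'))   # ['joe', 'vinny']
```

Passing an explicit sep is the safer choice when the values themselves may contain punctuation, since a catch-all default would split on that punctuation as well.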
Examples
>>> import pandas as pd
>>> from plydata.tidy import separate_rows
>>> df = pd.DataFrame({
...     'parent': ['martha', 'james', 'alice'],
...     'child': ['leah', 'joe,vinny,laura', 'pat,lee'],
...     'age': ['3', '12,6,4', '2,7']
... })
>>> df
   parent            child     age
0  martha             leah       3
1   james  joe,vinny,laura  12,6,4
2   alice          pat,lee     2,7
>>> df >> separate_rows('child', 'age')
   parent  child age
0  martha   leah   3
1   james    joe  12
2   james  vinny   6
3   james  laura   4
4   alice    pat   2
5   alice    lee   7
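The row expansion in this example can be sketched without pandas or plydata. The snippet below is an illustration only: separate_record is a hypothetical helper, rows are modelled as plain dicts, and the separator is fixed to a comma. It splits the selected cells, pairs up the pieces by position, and copies the remaining columns into every new row.

```python
def separate_record(record, cols, sep=','):
    """Hypothetical helper: expand one dict-shaped row into several rows."""
    # Split every selected cell into its pieces.
    split = {c: record[c].split(sep) for c in cols}
    # The pieces are paired by position, so each selected column
    # must yield the same number of pieces for this row.
    lengths = {len(v) for v in split.values()}
    if len(lengths) != 1:
        raise ValueError('columns split into different numbers of pieces')
    n = lengths.pop()
    # Unselected columns are repeated unchanged in every new row.
    return [
        {k: (split[k][i] if k in split else v) for k, v in record.items()}
        for i in range(n)
    ]


rows = [
    {'parent': 'martha', 'child': 'leah', 'age': '3'},
    {'parent': 'james', 'child': 'joe,vinny,laura', 'age': '12,6,4'},
    {'parent': 'alice', 'child': 'pat,lee', 'age': '2,7'},
]
result = [new for row in rows for new in separate_record(row, ['child', 'age'])]
for row in result:
    print(row)
```

The three input rows become six output rows, one per child, matching the shape of the output shown above.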
Column selection uses plydata.one_table_verbs.select, so you can do:

>>> df >> separate_rows('-parent')
   parent  child age
0  martha   leah   3
1   james    joe  12
2   james  vinny   6
3   james  laura   4
4   alice    pat   2
5   alice    lee   7
or

>>> from plydata.one_table_verbs import select
>>> df >> separate_rows(select(matches=r'^[ac]'))
   parent  child age
0  martha   leah   3
1   james    joe  12
2   james  vinny   6
3   james  laura   4
4   alice    pat   2
5   alice    lee   7
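Both selections give the same result because they resolve to the same two columns. The following sketch reproduces that resolution on a plain list of column names. It assumes that a leading '-' excludes a column and that matches applies a regular expression to each column name. It only mirrors the idea and does not call plydata's select.

```python
import re

columns = ['parent', 'child', 'age']

# '-parent': keep every column except 'parent'.
excluded = [c for c in columns if c != 'parent']

# matches=r'^[ac]': keep the columns whose name starts with 'a' or 'c'.
matched = [c for c in columns if re.search(r'^[ac]', c)]

print(excluded)  # ['child', 'age']
print(matched)   # ['child', 'age']
```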
You can separate all columns by not specifying any column. In that case every column must be separable.
>>> df[['child', 'age']] >> separate_rows()
   child age
0   leah   3
1    joe  12
2  vinny   6
3  laura   4
4    pat   2
5    lee   7