plydata.tidy.unite

class plydata.tidy.unite(*args, **kwargs)

Join multiple columns into one

Parameters
data : dataframe, optional

Useful when not using the >> operator.

col : str

Name of new column

*unite_cols : list-like | select | str | slice

Columns to join. Uses select.

sep : str

Separator placed between the joined values. Default is '_'.

remove : bool

If True, remove the input columns from the output dataframe.

na_rm : bool

If True, missing values are dropped from each row before the remaining values are joined.

Examples

>>> import pandas as pd
>>> from plydata import select
>>> from plydata.tidy import unite
>>> df = pd.DataFrame({
...     'c1': [1, 2, 3, 4, None],
...     'c2': list('abcde'),
...     'c3': list('vwxyz')
... })
>>> df
    c1 c2 c3
0  1.0  a  v
1  2.0  b  w
2  3.0  c  x
3  4.0  d  y
4  NaN  e  z
>>> df >> unite('c1c2', 'c1', 'c2')
    c1c2  c3
0  1.0_a   v
1  2.0_b   w
2  3.0_c   x
3  4.0_d   y
4  nan_e   z
>>> df >> unite('c1c2', 'c1', 'c2', na_rm=True)
    c1c2  c3
0  1.0_a   v
1  2.0_b   w
2  3.0_c   x
3  4.0_d   y
4      e   z
>>> df >> unite('c2c3', 'c2', 'c3', sep=',')
    c1 c2c3
0  1.0  a,v
1  2.0  b,w
2  3.0  c,x
3  4.0  d,y
4  NaN  e,z
>>> df >> unite('c2c3', 'c2', 'c3', remove=False)
    c1 c2c3 c2 c3
0  1.0  a_v  a  v
1  2.0  b_w  b  w
2  3.0  c_x  c  x
3  4.0  d_y  d  y
4  NaN  e_z  e  z

You can specify the columns in any way that select understands, and you can also pass a select verb directly.

>>> df >> unite('c2c3', '-c1')
    c1 c2c3
0  1.0  a_v
1  2.0  b_w
2  3.0  c_x
3  4.0  d_y
4  NaN  e_z
>>> df >> unite('c2c3', select(matches=r'c[23]$'))
    c1 c2c3
0  1.0  a_v
1  2.0  b_w
2  3.0  c_x
3  4.0  d_y
4  NaN  e_z
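
To make the roles of sep, remove and na_rm concrete, the following is a minimal plain-Python sketch of the behaviour seen in the examples above. It is not plydata's implementation: unite_sketch and is_missing are hypothetical helpers, the table is assumed to be a dict of equal-length lists rather than a dataframe, and only None and float NaN are treated as missing. Placing the new column where the first input column was is an assumption based on the outputs shown above.

```python
import math


def is_missing(value):
    """Return True for None and float NaN (the missing markers assumed here)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def unite_sketch(table, col, *unite_cols, sep="_", remove=True, na_rm=False):
    """Join columns of a dict-of-lists table into one column of strings.

    Hypothetical helper for illustration only; not part of plydata.
    """
    n_rows = len(next(iter(table.values())))
    joined = []
    for i in range(n_rows):
        values = [table[name][i] for name in unite_cols]
        if na_rm:
            # Drop missing values before joining what is left
            values = [v for v in values if not is_missing(v)]
        joined.append(sep.join(str(v) for v in values))

    result = {}
    for name, column in table.items():
        if name == unite_cols[0]:
            # Put the new column where the first input column was
            result[col] = joined
        if not (remove and name in unite_cols):
            # Keep the column unless it is an input and remove=True
            result[name] = list(column)
    return result


table = {
    "c1": [1.0, 2.0, 3.0, 4.0, float("nan")],
    "c2": list("abcde"),
    "c3": list("vwxyz"),
}
print(unite_sketch(table, "c1c2", "c1", "c2")["c1c2"])
print(unite_sketch(table, "c1c2", "c1", "c2", na_rm=True)["c1c2"])
print(list(unite_sketch(table, "c2c3", "c2", "c3", remove=False)))
```

Without na_rm the missing value in c1 is converted to the string 'nan' and joined like any other value; with na_rm=True it is dropped, so the last row contains only 'e' and no separator.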