plydata.helper_verbs.mutate_all

class plydata.helper_verbs.mutate_all(*args, **kwargs)

Modify all columns
Parameters
    data : dataframe, optional
        Useful when not using the >> operator.
    functions : callable or tuple or dict or str
        Functions to alter the columns:
        - function (any callable) - The function is applied to each column and the resulting columns replace the original columns.
        - tuple of functions - Each function is applied to all of the columns and the name (__name__) of the function is postfixed to the resulting column names.
        - dict of the form {'name': function} - Allows you to apply one or more functions and also control the postfix added to the column names.
        - str - A string can be used for more complex statements, but the resulting names will be terrible.
    args : tuple
        Arguments to the functions. The arguments are passed to all functions.
    kwargs : dict
        Keyword arguments to the functions. The keyword arguments are passed to all functions.
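The naming rules described for the callable, tuple and dict forms of functions can be sketched with plain pandas. This is only an illustration of the documented behaviour, not plydata's implementation: mutate_all_sketch is a hypothetical helper, it ignores the str form and grouping, and it assumes each function receives the column as its first argument followed by args and kwargs.

```python
import numpy as np
import pandas as pd


def mutate_all_sketch(df, functions, *args, **kwargs):
    """Hypothetical pandas-only imitation of the documented naming rules."""
    if callable(functions):
        # Single function: the results replace the original columns.
        return df.apply(lambda col: functions(col, *args, **kwargs))
    if isinstance(functions, tuple):
        # Tuple: the postfix is each function's __name__.
        functions = {f.__name__: f for f in functions}
    out = df.copy()
    # Dict: the key is the postfix; originals are kept, new columns appended.
    for postfix, func in functions.items():
        for name in df.columns:
            out[f'{name}_{postfix}'] = func(df[name], *args, **kwargs)
    return out


df = pd.DataFrame({'x': [1, 2, 3], 'z': [7, 9, 11]})
replaced = mutate_all_sketch(df, np.add, 10)
postfixed = mutate_all_sketch(df, (np.add, np.subtract), 10)
named = mutate_all_sketch(df, {'inch': lambda col: np.round(col / 2.54, 2)})
print(list(postfixed.columns))
```

With the tuple form, the new columns are grouped by function (all the add columns, then all the subtract columns), which is the same order shown in the plydata output for that case.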
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from plydata import *
>>> df = pd.DataFrame({
...     'alpha': list('aaabbb'),
...     'beta': list('babruq'),
...     'theta': list('cdecde'),
...     'x': [1, 2, 3, 4, 5, 6],
...     'y': [6, 5, 4, 3, 2, 1],
...     'z': [7, 9, 11, 8, 10, 12]
... })
A single function with an argument
>>> df >> select('x', 'y', 'z') >> mutate_all(np.add, 10)
    x   y   z
0  11  16  17
1  12  15  19
2  13  14  21
3  14  13  18
4  15  12  20
5  16  11  22
Two functions that accept the same argument
>>> (df
...  >> select('x', 'z')
...  >> mutate_all((np.add, np.subtract), 10)
... )
   x   z  x_add  z_add  x_subtract  z_subtract
0  1   7     11     17          -9          -3
1  2   9     12     19          -8          -1
2  3  11     13     21          -7           1
3  4   8     14     18          -6          -2
4  5  10     15     20          -5           0
5  6  12     16     22          -4           2
Convert x, y and z from centimeters to inches and round to 2 decimal places.
>>> (df
...  >> select('x', 'y', 'z')
...  >> mutate_all(dict(inch=lambda col: np.round(col/2.54, 2)))
... )
   x  y   z  x_inch  y_inch  z_inch
0  1  6   7    0.39    2.36    2.76
1  2  5   9    0.79    1.97    3.54
2  3  4  11    1.18    1.57    4.33
3  4  3   8    1.57    1.18    3.15
4  5  2  10    1.97    0.79    3.94
5  6  1  12    2.36    0.39    4.72
Groupwise standardization of multiple variables.
>>> def scale(col):
...     return (col - np.mean(col))/np.std(col)
>>> (df
...  >> group_by('alpha')
...  >> select('x', 'y', 'z')
...  >> mutate_all(scale))
groups: ['alpha']
  alpha         x         y         z
0     a -1.224745  1.224745 -1.224745
1     a  0.000000  0.000000  0.000000
2     a  1.224745 -1.224745  1.224745
3     b -1.224745  1.224745 -1.224745
4     b  0.000000  0.000000  0.000000
5     b  1.224745 -1.224745  1.224745
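For comparison, the same groupwise standardization can be written with pandas alone. This is a rough equivalent for checking the numbers, not how plydata computes them. It assumes the population standard deviation (ddof=0), which is what np.std uses by default; the pandas default of ddof=1 would give values of -1, 0 and 1 instead.

```python
import pandas as pd

df = pd.DataFrame({
    'alpha': list('aaabbb'),
    'x': [1, 2, 3, 4, 5, 6],
    'y': [6, 5, 4, 3, 2, 1],
    'z': [7, 9, 11, 8, 10, 12],
})


def scale(col):
    # ddof=0 is the population standard deviation (the np.std default).
    return (col - col.mean()) / col.std(ddof=0)


# transform applies scale within each 'alpha' group and keeps the row order.
scaled = df.groupby('alpha')[['x', 'y', 'z']].transform(scale)
result = pd.concat([df[['alpha']], scaled], axis=1)
print(result)
```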