plydata.helper_verbs.arrange_at

class plydata.helper_verbs.arrange_at(*args, **kwargs)

Arrange by specific columns
- Parameters
  - data : dataframe, optional
    Useful when not using the >> operator.
  - names : tuple or dict
    Names of columns in dataframe. If a tuple, they should be names of columns. If a dict, the keys must be among the following:
    - startswith : str or tuple, optional
      All column names that start with this string will be included.
    - endswith : str or tuple, optional
      All column names that end with this string will be included.
    - contains : str or tuple, optional
      All column names that contain this string will be included.
    - matches : str or regex or tuple, optional
      All column names that match the string or a compiled regex pattern will be included. A tuple can be used to match multiple regexes.
    - drop : bool, optional
      If True, the selection is inverted. The unspecified/unmatched columns are returned instead. Default is False.
  - functions : callable() or tuple or dict or str, optional
    Functions to alter the columns before they are sorted:
    - function (any callable) - Function is applied to the column and the result columns replace the original columns.
    - tuple of functions - Each function is applied to all of the columns and the name (__name__) of the function is postfixed to resulting column names.
    - dict of the form {'name': function} - Allows you to apply one or more functions and also control the postfix to the name.
    - str - String can be used for more complex statements, but the resulting names will be terrible.
    Note that the functions do not change the data; they only affect the sorting.
  - args : tuple
    Arguments to the functions. The arguments are passed to all functions.
  - kwargs : dict
    Keyword arguments to the functions. The keyword arguments are passed to all functions.
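The selection keys accepted in a names dict can be illustrated with a small plain-Python sketch. The helper `select_columns` below is hypothetical and is not part of plydata; it assumes that the criteria are combined as a union and that `matches` patterns are searched anywhere in the column name, which may differ from what the library actually does.

```python
import re


def select_columns(columns, names=(), startswith=None, endswith=None,
                   contains=None, matches=None, drop=False):
    """Hypothetical helper (not plydata API) mimicking the selection keys."""

    def as_tuple(value):
        # Accept None, a single value, or a tuple of values.
        if value is None:
            return ()
        return value if isinstance(value, tuple) else (value,)

    # Strings are compiled; already-compiled patterns are used as they are.
    patterns = [re.compile(p) if isinstance(p, str) else p
                for p in as_tuple(matches)]

    def is_selected(col):
        # Assumption: a column is selected if ANY criterion matches it.
        return (
            col in names
            or any(col.startswith(s) for s in as_tuple(startswith))
            or any(col.endswith(s) for s in as_tuple(endswith))
            or any(s in col for s in as_tuple(contains))
            or any(p.search(col) is not None for p in patterns)
        )

    # drop=True inverts the selection and returns the unmatched columns.
    return [c for c in columns if is_selected(c) != drop]


columns = ['alpha', 'beta', 'theta', 'x', 'y', 'z']
eta_columns = select_columns(columns, contains='eta')
not_eta_columns = select_columns(columns, contains='eta', drop=True)
prefix_columns = select_columns(columns, startswith=('al', 'th'))
single_letter_columns = select_columns(columns, matches=r'^[xyz]$')
```

With the column names used in the examples on this page, `contains='eta'` picks `beta` and `theta`, and adding `drop=True` returns the remaining columns instead.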
Notes
Do not use functions that change the order of the values in the array. Such functions are most likely the wrong candidates because they corrupt the data. Use functions that return values that can be sorted.
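The reason behind this note can be sketched in plain Python, independent of plydata and pandas. The helper `arrange_by` below is hypothetical: it assumes the function receives a whole column and returns one sort key per row, in the same positions. A function that keeps positions (here, negation) gives a meaningful order, while one that reorders values (here, `sorted`) detaches the keys from their rows.

```python
def arrange_by(rows, columns, func=None, reverse=False):
    """Hypothetical helper: sort rows using func(column) as the sort key.

    The rows themselves are never modified; func only affects the ordering.
    """
    keys = []
    for col in columns:
        values = [row[col] for row in rows]
        # func must return one key per row, aligned with the input positions.
        keys.append(func(values) if func is not None else values)
    order = sorted(range(len(rows)),
                   key=lambda i: tuple(k[i] for k in keys),
                   reverse=reverse)
    return [rows[i] for i in order]


rows = [
    {'name': 'a', 'x': 3},
    {'name': 'b', 'x': 1},
    {'name': 'c', 'x': 2},
]


def negate(values):
    # Position-preserving: the key at index i still belongs to row i.
    return [-v for v in values]


# Sorting on the negated values gives descending order of x.
good = arrange_by(rows, ['x'], negate)
good_names = [row['name'] for row in good]   # ['a', 'c', 'b']

# sorted() reorders the values, so the keys no longer line up with the rows
# and the result is not ordered by x at all.
bad = arrange_by(rows, ['x'], sorted)
bad_x = [row['x'] for row in bad]            # [3, 1, 2]
```

In both calls the `x` values stored in the rows are unchanged; only the order of the rows differs.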
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from plydata import *
>>> df = pd.DataFrame({
...     'alpha': list('aaabbb'),
...     'beta': list('babruq'),
...     'theta': list('cdecde'),
...     'x': [1, 2, 3, 4, 5, 6],
...     'y': [6, 5, 4, 3, 2, 1],
...     'z': [7, 9, 11, 8, 10, 12]
... })
Arrange by explicitly naming the columns to arrange by. This is not much different from arrange.

>>> df >> arrange_at(('alpha', 'z'))
  alpha beta theta  x  y   z
0     a    b     c  1  6   7
1     a    a     d  2  5   9
2     a    b     e  3  4  11
3     b    r     c  4  3   8
4     b    u     d  5  2  10
5     b    q     e  6  1  12
Arrange by dynamically selecting the columns to arrange by. Here the selection is beta and theta.

>>> df >> arrange_at(dict(contains='eta'))
  alpha beta theta  x  y   z
1     a    a     d  2  5   9
0     a    b     c  1  6   7
2     a    b     e  3  4  11
5     b    q     e  6  1  12
3     b    r     c  4  3   8
4     b    u     d  5  2  10
In descending order.
>>> (df
...  >> arrange_at(
...     dict(contains='eta'),
...     pd.Series.rank, ascending=False)
... )
  alpha beta theta  x  y   z
4     b    u     d  5  2  10
3     b    r     c  4  3   8
5     b    q     e  6  1  12
2     a    b     e  3  4  11
0     a    b     c  1  6   7
1     a    a     d  2  5   9