plydata.helper_verbs.select_if

class plydata.helper_verbs.select_if(*args, **kwargs)

Select all columns that match a predicate

Parameters
data : dataframe, optional

Useful when not using the >> operator.

predicate : function

A predicate function to be applied to the columns of the dataframe. Good candidates for predicate functions are those that check the type of the column. Such functions are available in pandas.api.types, for example pandas.api.types.is_numeric_dtype().

For convenience, you can reference the is_*_dtype functions with shorter strings:

'is_bool'             # pandas.api.types.is_bool_dtype
'is_categorical'      # pandas.api.types.is_categorical_dtype
'is_complex'          # pandas.api.types.is_complex_dtype
'is_datetime64_any'   # pandas.api.types.is_datetime64_any_dtype
'is_datetime64'       # pandas.api.types.is_datetime64_dtype
'is_datetime64_ns'    # pandas.api.types.is_datetime64_ns_dtype
'is_datetime64tz'     # pandas.api.types.is_datetime64tz_dtype
'is_float'            # pandas.api.types.is_float_dtype
'is_int64'            # pandas.api.types.is_int64_dtype
'is_integer'          # pandas.api.types.is_integer_dtype
'is_interval'         # pandas.api.types.is_interval_dtype
'is_numeric'          # pandas.api.types.is_numeric_dtype
'is_object'           # pandas.api.types.is_object_dtype
'is_period'           # pandas.api.types.is_period_dtype
'is_signed_integer'   # pandas.api.types.is_signed_integer_dtype
'is_string'           # pandas.api.types.is_string_dtype
'is_timedelta64'      # pandas.api.types.is_timedelta64_dtype
'is_timedelta64_ns'   # pandas.api.types.is_timedelta64_ns_dtype
'is_unsigned_integer' # pandas.api.types.is_unsigned_integer_dtype

No other string values are allowed.
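As a rough illustration of how such a shortcut could map onto a pandas type check, here is a minimal plain-pandas sketch. The helpers `resolve_predicate` and `select_columns_if` are hypothetical names introduced only for this example; they are not part of plydata, and the sketch does not claim to mirror plydata's internals.

```python
import pandas as pd
import pandas.api.types as ptypes


def resolve_predicate(predicate):
    """Hypothetical helper: turn 'is_numeric' into is_numeric_dtype."""
    if callable(predicate):
        return predicate
    # Assumption: the shortcut is the pandas function name minus '_dtype'
    return getattr(ptypes, predicate + '_dtype')


def select_columns_if(df, predicate):
    """Hypothetical helper: keep the columns for which the predicate is true."""
    test = resolve_predicate(predicate)
    keep = [name for name in df.columns if test(df[name])]
    return df[keep]


df = pd.DataFrame({
    'alpha': list('aaabbb'),
    'x': [1, 2, 3, 4, 5, 6],
    'y': [6, 5, 4, 3, 2, 1],
})

numeric = select_columns_if(df, 'is_numeric')
print(list(numeric.columns))  # ['x', 'y']
```

Passing the callable directly, `select_columns_if(df, ptypes.is_numeric_dtype)`, gives the same result in this sketch.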

function : callable

Function to rename the column(s).

args : tuple

Arguments to the functions. The arguments are passed to all functions.

kwargs : dict

Keyword arguments to the functions. The keyword arguments are passed to all functions.


Examples

>>> import pandas as pd
>>> import numpy as np
>>> from plydata import *
>>> df = pd.DataFrame({
...     'alpha': list('aaabbb'),
...     'beta': list('babruq'),
...     'theta': list('cdecde'),
...     'x': [1, 2, 3, 4, 5, 6],
...     'y': [6, 5, 4, 3, 2, 1],
...     'z': [7, 9, 11, 8, 10, 12]
... })

Select all sorted columns and convert their names to uppercase.

>>> def is_sorted(col):
...     a = col.values
...     return all(a[:-1] <= a[1:])
>>> df >> select_if(is_sorted, str.upper)
  ALPHA  X
0     a  1
1     a  2
2     a  3
3     b  4
4     b  5
5     b  6

Group columns are always selected.

>>> df >> group_by('beta') >> select_if(is_sorted, str.upper)
groups: ['beta']
  beta ALPHA  X
0    b     a  1
1    a     a  2
2    b     a  3
3    r     b  4
4    u     b  5
5    q     b  6
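For readers who want to see the selection logic spelled out, the following plain-pandas sketch reproduces the column choices from the two examples above. `select_if_sketch` and its `rename` and `groups` parameters are hypothetical names used only here; it works on an ordinary DataFrame, ignores plydata's grouped dataframe type, and is not plydata's implementation.

```python
import pandas as pd

df = pd.DataFrame({
    'alpha': list('aaabbb'),
    'beta': list('babruq'),
    'theta': list('cdecde'),
    'x': [1, 2, 3, 4, 5, 6],
    'y': [6, 5, 4, 3, 2, 1],
    'z': [7, 9, 11, 8, 10, 12]
})


def is_sorted(col):
    # True when every value is <= the value that follows it
    a = col.to_numpy()
    return bool((a[:-1] <= a[1:]).all())


def select_if_sketch(df, predicate, rename=None, groups=()):
    """Hypothetical stand-in: group columns first, then matching columns."""
    groups = list(groups)
    # Group columns are kept unconditionally, so only test the others
    selected = [c for c in df.columns
                if c not in groups and predicate(df[c])]
    out = df[groups + selected]
    if rename is not None:
        # Assumption drawn from the output above: group columns keep
        # their names, only the selected columns are renamed
        out = out.rename(columns={c: rename(c) for c in selected})
    return out


ungrouped = select_if_sketch(df, is_sorted, str.upper)
grouped = select_if_sketch(df, is_sorted, str.upper, groups=['beta'])
print(list(ungrouped.columns))  # ['ALPHA', 'X']
print(list(grouped.columns))    # ['beta', 'ALPHA', 'X']
```

Only `alpha` and `x` are non-decreasing, so they are the columns that pass `is_sorted`; `beta` appears in the grouped result solely because it is listed as a group column.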