plydata.expressions.case_when

class plydata.expressions.case_when

Vectorized case
- Parameters
  - args : mapping, iterable
    (predicate, value) pairs, ordered from most specific to most general.
  - kwargs : collections.OrderedDict
    {predicate: value} pairs, ordered from most specific to most general.
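Notes
Because dict preserves insertion order (an implementation detail of CPython 3.6, guaranteed by the language from Python 3.7), you can get away with passing a plain dict:

    df >> define(divisible=case_when({
        'x%2 == 0': 'x+200',
        'x%3 == 0': 'x+300',
        True: -1
    }))

However, be careful: on older or alternative interpreters the order may not be preserved.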
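Where the order has to hold regardless of the interpreter, one option is to build the cases as a collections.OrderedDict from (predicate, value) pairs. The sketch below only constructs the mapping and checks its order. Passing it to case_when in place of a plain dict is an assumption based on the parameter description above, and the names `cases` and `predicates` are illustrative.

```python
from collections import OrderedDict

# Build the cases from (predicate, value) pairs so the order is explicit
# and does not depend on how the interpreter implements dict.
cases = OrderedDict([
    ('x%2 == 0', 'x+200'),   # most specific first
    ('x%3 == 0', 'x+300'),
    (True, -1),              # catch-all last
])

# Iteration follows insertion order, i.e. the order in which the
# predicates are meant to be tried.
predicates = list(cases)
print(predicates)
```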
Examples
>>> import pandas as pd
>>> from plydata import define
>>> from plydata.expressions import case_when
>>> df = pd.DataFrame({'x': range(10)})
Here we use an iterable of (predicate, value) tuples.
>>> df >> define(divisible=case_when([
...     ('x%2 == 0', 2),
...     ('x%3 == 0', 3),
...     (True, -1)
... ]))
   x  divisible
0  0          2
1  1         -1
2  2          2
3  3          3
4  4          2
5  5         -1
6  6          2
7  7         -1
8  8          2
9  9          3
When the most general predicate comes first, it obscures the rest. Each row takes the value of the first predicate it matches, so at most one predicate applies to any row.
>>> df >> define(divisible=case_when([
...     (True, -1),
...     ('x%2 == 0', 2),
...     ('x%3 == 0', 3)
... ]))
   x  divisible
0  0         -1
1  1         -1
2  2         -1
3  3         -1
4  4         -1
5  5         -1
6  6         -1
7  7         -1
8  8         -1
9  9         -1
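The first-match rule can be modelled in plain Python. This is a sketch of the rule only, not plydata's implementation (which is vectorized and takes string expressions); the helper `first_match` and the lambda predicates are hypothetical stand-ins.

```python
def first_match(x, cases):
    """Return the value paired with the first predicate that holds for x."""
    for predicate, value in cases:
        if predicate(x):
            return value
    return None  # no predicate matched


# Same cases as the examples above, written as callables.
specific_first = [
    (lambda x: x % 2 == 0, 2),
    (lambda x: x % 3 == 0, 3),
    (lambda x: True, -1),      # catch-all last
]
# Same pairs, but with the catch-all moved to the front.
general_first = [specific_first[2], specific_first[0], specific_first[1]]

xs = range(10)
result_specific = [first_match(x, specific_first) for x in xs]
result_general = [first_match(x, general_first) for x in xs]

print(result_specific)
print(result_general)
```

With the catch-all first, the loop returns before any other predicate is tried, which is why every row ends up as -1.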
String values must be quoted inside the value string; an unquoted value is treated as an expression.
>>> df >> define(divisible=case_when([
...     ('x%2 == 0', '"by-2"'),
...     ('x%3 == 0', '"by-3"'),
...     (True, '"neither-by-2or3"')
... ]))
   x        divisible
0  0             by-2
1  1  neither-by-2or3
2  2             by-2
3  3             by-3
4  4             by-2
5  5  neither-by-2or3
6  6             by-2
7  7  neither-by-2or3
8  8             by-2
9  9             by-3
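The need for the inner quotes can be illustrated with Python's built-in eval, assuming values are evaluated roughly like Python expressions. Plain eval is only a stand-in here for whatever plydata does internally, and `namespace` is an illustrative substitute for the dataframe's columns.

```python
# Stand-in for the names an expression can see (here a single value of x).
namespace = {'x': 4}

# A quoted value is a string literal, so evaluating it yields the string.
quoted = eval('"by-2"', {}, namespace)

# An unquoted value is parsed as the expression `by - 2`; `by` is not a
# known name, so evaluation fails.
try:
    eval('by-2', {}, namespace)
    unquoted_error = None
except NameError as err:
    unquoted_error = type(err).__name__

print(quoted, unquoted_error)
```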
The values can be expressions
>>> df >> define(divisible=case_when([
...     ('x%2 == 0', 'x+200'),
...     ('x%3 == 0', 'x+300'),
...     (True, -1)
... ]))
   x  divisible
0  0        200
1  1         -1
2  2        202
3  3        303
4  4        204
5  5         -1
6  6        206
7  7         -1
8  8        208
9  9        309
Combining Predicates
When combining predicate statements, you can use the bitwise operators |, &, ^ and ~. The different statements must be enclosed in parentheses, ().

>>> df >> define(y=case_when([
...     ('(x < 5) & (x % 2 == 0)', '"less-than-5-and-even"'),
...     ('(x < 5) & (x % 2 != 0)', '"less-than-5-and-odd"'),
...     ('(x > 5) & (x % 2 == 0)', '"greater-than-5-and-even"'),
...     ('(x > 5) & (x % 2 != 0)', '"greater-than-5-and-odd"'),
...     (True, '"Just 5"')
... ]))
   x                        y
0  0     less-than-5-and-even
1  1      less-than-5-and-odd
2  2     less-than-5-and-even
3  3      less-than-5-and-odd
4  4     less-than-5-and-even
5  5                   Just 5
6  6  greater-than-5-and-even
7  7   greater-than-5-and-odd
8  8  greater-than-5-and-even
9  9   greater-than-5-and-odd
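The parentheses matter because of Python operator precedence: & binds more tightly than comparison operators such as < and ==. The sketch below shows the difference on a plain integer, assuming predicate strings follow Python's precedence rules. On a dataframe column the unparenthesized form would likely raise an error rather than return a wrong answer.

```python
x = 2

# Parenthesized: each comparison is evaluated first, then combined with &.
with_parens = (x < 5) & (x % 2 == 0)

# Unparenthesized: % and & bind tighter than < and ==, so this is parsed
# as the chained comparison  x < (5 & (x % 2)) == 0.
without_parens = x < 5 & x % 2 == 0

print(with_parens, without_parens)
```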