plydata.one_table_verbs.distinct
class plydata.one_table_verbs.distinct(*args, **kwargs)

Select distinct/unique rows
- Parameters
- data : dataframe, optional
  Useful when not using the >> operator.
- columns : list-like, optional
  Column names to use when determining uniqueness.
- keep : {'first', 'last', False}, optional
  first : Keep the first occurrence.
  last : Keep the last occurrence.
  False : Do not keep any of the duplicates.
  Default is 'first'.
- kwargs : dict, optional
  {name: expression} computed columns. If specified, these are taken together with the columns when determining unique rows.
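The three keep options take the same values as the keep argument of pandas' DataFrame.drop_duplicates. As a rough sketch in plain pandas (an illustration only, not necessarily how plydata implements the verb), the effect of each option on the x column of the example data can be compared like this:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 1, 2, 3, 4, 4, 5],
                   'y': [1, 2, 3, 4, 5, 5, 6]})

# Uniqueness is judged on column 'x' only, so rows 0/1 and rows 4/5
# are treated as duplicates of each other.
first = df.drop_duplicates(subset=['x'], keep='first')
last = df.drop_duplicates(subset=['x'], keep='last')
none = df.drop_duplicates(subset=['x'], keep=False)

print(list(first.index))  # [0, 2, 3, 4, 6]
print(list(last.index))   # [1, 2, 3, 5, 6]
print(list(none.index))   # [2, 3, 6]
```

With keep=False every row that has a duplicate is dropped, so only the rows whose x value occurs exactly once remain.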
Examples
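>>> import pandas as pd
>>> from plydata import define, distinct
>>> df = pd.DataFrame({'x': [1, 1, 2, 3, 4, 4, 5],
...                    'y': [1, 2, 3, 4, 5, 5, 6]})
>>> # Unique on all columns
>>> df >> distinct()
   x  y
0  1  1
1  1  2
2  2  3
3  3  4
4  4  5
6  5  6
>>> # Unique on x only
>>> df >> distinct(['x'])
   x  y
0  1  1
2  2  3
3  3  4
4  4  5
6  5  6
>>> # Unique on x, keeping the last of each set of duplicates
>>> df >> distinct(['x'], 'last')
   x  y
1  1  2
2  2  3
3  3  4
5  4  5
6  5  6
>>> # Unique on the computed column z
>>> df >> distinct(z='x%2')
   x  y  z
0  1  1  1
2  2  3  0
>>> # Unique on x and the computed column z together
>>> df >> distinct(['x'], z='x%2')
   x  y  z
0  1  1  1
2  2  3  0
3  3  4  1
4  4  5  0
6  5  6  1
>>> # Same result with z defined in a separate step
>>> df >> define(z='x%2') >> distinct(['x', 'z'])
   x  y  z
0  1  1  1
2  2  3  0
3  3  4  1
4  4  5  0
6  5  6  1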
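For comparison, the computed-column cases above can be approximated in plain pandas by adding the column first and then dropping duplicates on it. This is a sketch under that assumption rather than plydata's implementation, and the variable names by_z and by_xz are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 1, 2, 3, 4, 4, 5],
                   'y': [1, 2, 3, 4, 5, 5, 6]})

# Roughly df >> distinct(z='x%2'):
# compute z, then keep the first row for each value of z.
by_z = df.assign(z=df['x'] % 2).drop_duplicates(subset=['z'], keep='first')

# Roughly df >> distinct(['x'], z='x%2'):
# rows must be unique on x and z taken together.
by_xz = df.assign(z=df['x'] % 2).drop_duplicates(subset=['x', 'z'], keep='first')

print(by_z)
print(by_xz)
```

In both cases the computed column z stays in the result, matching the outputs shown in the examples.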