Select
-
dabstract.abstract.abstract.Select(data, selector: Union[List[int], Callable, numbers.Integral], eval_data: Any = None, lazy: bool = True, workers: int = 1, buffer_len: int = 3, *args: List, **kwargs: Dict) → Union[dabstract.abstract.abstract.SelectAbstract, dabstract.abstract.abstract.DataAbstract, numpy.ndarray, list]
Factory function that lets you choose between lazy and direct example selection.
In both cases an instance of SelectAbstract is created. The difference is that direct selection evaluates all selected examples immediately, whereas lazy selection evaluates an example only when it is accessed.
For more information on the functionality of Select, see the docstring of SelectAbstract().
- Parameters
- data : Iterable
input data to perform selection on, if eval_data is None
- selector : List[int] OR Callable OR numbers.Integral
selection criterion
- eval_data : Any
if eval_data is not None, selection is performed on eval_data instead of data (default = None)
- lazy : bool
apply lazily or not (default = True)
- workers : int
number of workers used for loading the data (default = 1)
- buffer_len : int
buffer length of the pool (default = 3)
- args/kwargs
additional parameters to pass to the function if needed
- Returns
- SelectAbstract OR DataAbstract OR np.ndarray OR list
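The difference between lazy and direct selection can be illustrated with a minimal, self-contained sketch. It does not use dabstract; `LazySelection`, `CountingData`, and `select_sketch` are hypothetical names introduced only for this illustration, and the real Select may behave differently in its details.

```python
from typing import Any, List, Sequence


class CountingData:
    """Hypothetical data source that counts how often an example is read."""

    def __init__(self, items: Sequence[Any]):
        self.items = items
        self.reads = 0

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, i: int) -> Any:
        self.reads += 1
        return self.items[i]


class LazySelection:
    """Hypothetical lazy view: stores indices and reads examples on access."""

    def __init__(self, data: Any, indices: List[int]):
        self._data = data
        self._indices = indices

    def __len__(self) -> int:
        return len(self._indices)

    def __getitem__(self, i: int) -> Any:
        # The underlying example is only read when it is requested.
        return self._data[self._indices[i]]


def select_sketch(data: Any, indices: List[int], lazy: bool = True):
    """Return a lazy view, or evaluate every selected example immediately."""
    view = LazySelection(data, indices)
    if lazy:
        return view
    return [view[i] for i in range(len(view))]


source = CountingData(list("abcde"))
lazy_view = select_sketch(source, [0, 2, 4])            # nothing is read yet
direct = select_sketch(source, [0, 2, 4], lazy=False)   # three reads happen here
```

In this sketch the lazy call returns a view object, while the direct call returns a plain list of already evaluated examples.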
-
class
dabstract.abstract.abstract.SelectAbstract(data: Iterable, selector: Union[List[int], Callable, numbers.Integral], eval_data: Any = None, *args, **kwargs: Dict)
Bases: dabstract.abstract.abstract.Abstract
Select a subset of your input sequence.
Selection is based on a so-called ‘selector’, which may be a Callable or a list/np.ndarray of integers. A Callable selector must accept two arguments: (1) the data to base the selection on and (2) the index of the example to be evaluated.
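As a rough sketch of how the different selector forms could be turned into a list of indices, consider the helper below. `resolve_indices` is a hypothetical function written for this illustration, not part of dabstract, and it assumes that a Callable selector returns a truthy value for every example that should be kept.

```python
import numbers
from typing import Any, Callable, List, Sequence, Union


def resolve_indices(
    data: Sequence[Any],
    selector: Union[Sequence[int], Callable[[Any, int], Any], numbers.Integral],
) -> List[int]:
    """Turn a selector into a plain list of integer indices."""
    if callable(selector):
        # Assumption: the callable receives (data, index) and returns a
        # truthy value when the example at that index should be kept.
        return [k for k in range(len(data)) if selector(data, k)]
    if isinstance(selector, numbers.Integral):
        # A single integer selects exactly one example.
        return [int(selector)]
    # Otherwise treat the selector as a sequence of integer indices.
    return [int(k) for k in selector]


labels = ["dog", "cat", "dog", "bird"]
by_callable = resolve_indices(labels, lambda x, k: x[k] == "dog")
by_indices = resolve_indices(labels, [3, 1])
by_integer = resolve_indices(labels, 2)
```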
For the selector, you can use one of the built-in selectors in dabstract.dataset.select, a lambda function, a custom function of your own, or indices. For example:
random subsampling with:
$ SelectAbstract(data, dabstract.dataset.select.random_subsample(ratio=0.5))
select based on a key and a particular value:
$ SelectAbstract(data, dabstract.dataset.select.subsample_by_str(ratio=0.5))
use a lambda function such as:
$ SelectAbstract(data, (lambda x, k: x['data']['subdb'][k]))
directly use indices:
$ indices = np.array([0, 1, 2, 3, 4])
$ SelectAbstract(data, indices)
If no ‘eval_data’ is given, the evaluation is performed on ‘data’. If ‘eval_data’ is given, the evaluation is performed on ‘eval_data’ instead.
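The role of ‘eval_data’ can be sketched as follows. This is a plain-Python illustration under stated assumptions, not dabstract's implementation: `select_with_eval_data` is a hypothetical helper, and the selector is assumed to return a truthy value for examples to keep.

```python
from typing import Any, Callable, List, Optional, Sequence


def select_with_eval_data(
    data: Sequence[Any],
    selector: Callable[[Any, int], Any],
    eval_data: Optional[Sequence[Any]] = None,
) -> List[Any]:
    """Evaluate the selector on eval_data if given, else on data itself."""
    source = data if eval_data is None else eval_data
    # Assumption: a truthy selector result means "keep example k".
    indices = [k for k in range(len(source)) if selector(source, k)]
    # The selected examples are always taken from data.
    return [data[k] for k in indices]


clips = ["clip_a.wav", "clip_b.wav", "clip_c.wav"]
labels = ["speech", "music", "speech"]

# Selection criterion evaluated on a separate label sequence.
speech_clips = select_with_eval_data(
    clips, lambda x, k: x[k] == "speech", eval_data=labels
)

# Without eval_data, the criterion is evaluated on the clips themselves.
b_clips = select_with_eval_data(clips, lambda x, k: x[k].endswith("_b.wav"))
```

Here the labels decide which clips are kept, but the returned examples still come from the clip sequence.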
SelectAbstract provides the following methods:
.get - return an entry from SelectAbstract
.keys - return the list of keys
The full explanation of each method is given in that method's docstring.
- Parameters
- data : Iterable
input data to perform selection on, if eval_data is None
- selector : List[int] OR Callable OR numbers.Integral
selection criterion
- eval_data : Any
if eval_data is not None, selection is performed on eval_data instead of data (default = None)
- kwargs : Dict
additional parameters to pass to the function if needed
- Returns
- SelectAbstract class
-
get(index: int, return_info: bool = False, *args: List, **kwargs: Dict) → Union[List, numpy.ndarray, Any]
- Parameters
- index : int
index to retrieve data from
- return_info : bool
return the tuple (data, info) if True, else only data (default = False)
- args : List
additional parameters to pass to the function if needed
- kwargs : Dict
additional parameters to pass to the function if needed
- Returns
- List OR np.ndarray OR Any
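The effect of `return_info` can be sketched with a small stand-in class. `SelectionSketch` is hypothetical and not part of dabstract, and the contents of the info dictionary (`source_index`) are an assumption made for this illustration; the source does not specify what info contains.

```python
from typing import Any, Dict, List, Sequence, Tuple, Union


class SelectionSketch:
    """Hypothetical stand-in that maps selection indices to source indices."""

    def __init__(self, data: Sequence[Any], indices: List[int]):
        self._data = data
        self._indices = indices

    def get(
        self, index: int, return_info: bool = False
    ) -> Union[Any, Tuple[Any, Dict[str, Any]]]:
        # Translate the position in the selection to a position in the data.
        source_index = self._indices[index]
        value = self._data[source_index]
        if return_info:
            # Assumed info payload: only the original index of the example.
            return value, {"source_index": source_index}
        return value


selection = SelectionSketch(["a", "b", "c", "d"], [3, 1])
plain = selection.get(0)
with_info = selection.get(1, return_info=True)
```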
-
get_indices()
-
set_indices(selector, *args, **kwargs)