Split

dabstract.abstract.abstract.Split(data: Iterable, split_size: int = None, constraint: str = None, sample_len: int = None, sample_period: int = None, type: str = 'seconds', lazy: bool = True, workers: bool = 1, buffer_len: int = 3, *args: List, **kwargs: Dict) → Union[dabstract.abstract.abstract.SplitAbstract, dabstract.abstract.abstract.DataAbstract, numpy.ndarray, list]

Factory function to allow for choice between lazy and direct example splitting.

In both cases an instance of SplitAbstract is created. The difference is that direct splitting evaluates all examples immediately, whereas lazy splitting evaluates an example only when it is accessed.

For more information on splitting, see the docstring of SplitAbstract().
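The difference between lazy and direct splitting can be illustrated with a minimal, self-contained sketch. It does not use dabstract and is not its implementation: `LazySplitSketch` and `split_example` are hypothetical names, and dropping an incomplete trailing chunk is an assumption made for the illustration.

```python
from typing import List, Sequence


def split_example(example: Sequence, split_size: int) -> List[Sequence]:
    """Cut one example into consecutive chunks of split_size items."""
    n_chunks = len(example) // split_size  # assumption: an incomplete tail is dropped
    return [example[i * split_size:(i + 1) * split_size] for i in range(n_chunks)]


class LazySplitSketch:
    """Hypothetical stand-in: a chunk is only sliced out when it is indexed."""

    def __init__(self, data: Sequence[Sequence], split_size: int):
        self.data = data
        self.split_size = split_size
        # (example index, chunk index) for every chunk, in order
        self.index = [
            (ex, ch)
            for ex, example in enumerate(data)
            for ch in range(len(example) // split_size)
        ]

    def __len__(self) -> int:
        return len(self.index)

    def __getitem__(self, k: int) -> Sequence:
        ex, ch = self.index[k]
        start = ch * self.split_size
        return self.data[ex][start:start + self.split_size]


data = [list(range(6)), list(range(10, 14))]

# Lazy: nothing is sliced until an entry is requested.
lazy = LazySplitSketch(data, split_size=2)

# Direct: every chunk is computed up front.
direct = [chunk for example in data for chunk in split_example(example, 2)]
```

Both variants expose the same chunks in the same order; the lazy one only defers the work until an entry is requested.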

Parameters
data : Iterable

Iterable object to be split

split_size : int

split size in seconds or samples, depending on ‘type’

constraint : str

option ‘power2’ creates sizes that are a power of 2 (used for autoencoders)

sample_len : int

sample length (default = None)

sample_period : int

sample period (default = None)

type : str

split_size type (‘seconds’, ‘samples’) (default = ‘seconds’)

lazy : bool

apply lazily or not (default = True)

workers : int

number of workers used for loading the data (default = 1)

buffer_len : int

buffer_len of the pool (default = 3)

args : List

additional positional arguments to provide to the function if needed

kwargs : Dict

additional keyword arguments to provide to the function if needed

Returns
SplitAbstract OR DataAbstract OR np.ndarray OR list
class dabstract.abstract.abstract.SplitAbstract(data: Iterable, split_size: int = None, constraint: str = None, sample_len: Union[int, List[int]] = None, sample_period: int = None, type: str = 'seconds')

Bases: dabstract.abstract.abstract.Abstract

The class is an abstract wrapper around an iterable that splits this iterable in a lazy manner. Splitting refers to dividing a particular example into multiple chunks, e.g. 60s examples are divided into 1s segments.

Splitting is based on the parameters split_size, constraint, sample_len, sample_period and type.

If type is set to ‘samples’, both ‘sample_len’ and ‘split_size’ must be defined. In that case ‘sample_len’ is the number of samples in one example and ‘split_size’ is the size of one segment. ‘sample_len’ can be an integer if all examples have the same size, or a list of integers if the sizes differ between examples.
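As a rough illustration of the ‘samples’ case, the sketch below lists the segment boundaries that follow from a given sample length and split size. It is not taken from dabstract: `segment_table` and `n_examples` are hypothetical names, and dropping an incomplete trailing segment is an assumption.

```python
from typing import List, Tuple, Union


def segment_table(
    sample_len: Union[int, List[int]],
    split_size: int,
    n_examples: int = 1,
) -> List[Tuple[int, int, int]]:
    """Return (example index, start, stop) in samples for every segment."""
    # An int means every example has the same length; a list gives one length per example.
    if isinstance(sample_len, int):
        lengths = [sample_len] * n_examples
    else:
        lengths = list(sample_len)

    table = []
    for ex, length in enumerate(lengths):
        for seg in range(length // split_size):  # assumption: incomplete tail is dropped
            table.append((ex, seg * split_size, (seg + 1) * split_size))
    return table


# Two examples of 8 samples each, split into segments of 4 samples.
same = segment_table(sample_len=8, split_size=4, n_examples=2)

# Examples of different lengths: 8 and 5 samples.
mixed = segment_table(sample_len=[8, 5], split_size=4)
```

With a list, each example contributes as many full segments as its own length allows, so the second example in `mixed` yields only one segment.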

If type is set to ‘seconds’, ‘sample_len’, ‘split_size’ and ‘sample_period’ must all be defined. In this case each of these variables is expressed in seconds rather than in samples. ‘sample_period’ specifies the sample period of the data, which is needed to split it properly.
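For the ‘seconds’ case, the arithmetic can be sketched as follows. This assumes that a duration is converted to samples by dividing it by the sample period; the helper name `seconds_to_samples` and the use of rounding are illustrative choices, not dabstract's confirmed behaviour.

```python
def seconds_to_samples(duration_s: float, sample_period_s: float) -> int:
    """Convert a duration in seconds to a whole number of samples."""
    # round() guards against floating-point error in the division
    return round(duration_s / sample_period_s)


sample_period = 1 / 16000  # 16 kHz audio, expressed as a period in seconds
sample_len = seconds_to_samples(60, sample_period)  # one 60 s example
split_len = seconds_to_samples(1, sample_period)    # 1 s segments
n_segments = sample_len // split_len                # segments per example
```

Under these assumptions a 60 s example sampled at 16 kHz is divided into 60 segments of 16000 samples each.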

The SplitAbstract contains the following methods:

.get - return entry from SplitAbstract
.keys - return attribute keys of data

A full explanation of each method is provided in that method's docstring.

Parameters
data : Iterable

Iterable object to be split

split_size : int

split size in seconds or samples, depending on ‘type’

constraint : str

option ‘power2’ creates sizes that are a power of 2 (used for autoencoders)

sample_len : int or List[int]

sample length (default = None)

sample_period : int

sample period (default = None)

type : str

split_size type (‘seconds’, ‘samples’) (default = ‘seconds’)

Returns
SplitAbstract class
get(index: int, return_info: bool = False, *args: List, **kwargs: Dict) → Union[List, numpy.ndarray, Any]
Parameters
index : int

index to retrieve data from

return_info : bool

return tuple (data, info) if True, else data (default = False). info contains the information that has been propagated through the chain of operations

args : List

additional positional arguments to provide to the function if needed

kwargs : Dict

additional keyword arguments to provide to the function if needed

Returns
List OR np.ndarray OR Any
get_param()