Map

dabstract.abstract.abstract.Map(data, map_fct: Callable, info: List[Dict] = None, lazy: bool = True, workers: int = 1, buffer_len: int = 3, *arg: list, **kwargs: Dict) → Union[dabstract.abstract.abstract.MapAbstract, dabstract.abstract.abstract.DataAbstract, numpy.ndarray, list]

Factory function to allow for choice between lazy and direct mapping.

In both cases an instance of MapAbstract is created. The difference is that direct mapping evaluates all examples immediately, whereas lazy mapping evaluates each example only when it is accessed.

To have more information on mapping, please read the docstring of MapAbstract().
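The difference between lazy and direct mapping can be pictured with a small standard-library sketch. The names `LazyMapSketch` and `map_sketch` are hypothetical and are not part of dabstract; the sketch only assumes that "lazy" means the mapping function runs on access and "direct" means it runs for every example up front.

```python
from typing import Any, Callable, List, Sequence, Union


class LazyMapSketch:
    """Hypothetical stand-in for a lazy map: applies map_fct on access."""

    def __init__(self, data: Sequence, map_fct: Callable[[Any], Any]):
        self._data = data
        self._map_fct = map_fct
        self.calls = 0  # counts evaluations, for illustration only

    def __len__(self) -> int:
        return len(self._data)

    def __getitem__(self, index: int) -> Any:
        self.calls += 1
        return self._map_fct(self._data[index])


def map_sketch(
    data: Sequence, map_fct: Callable[[Any], Any], lazy: bool = True
) -> Union[LazyMapSketch, List[Any]]:
    """Hypothetical factory: return the lazy wrapper, or evaluate everything now."""
    wrapped = LazyMapSketch(data, map_fct)
    if lazy:
        return wrapped
    return [wrapped[k] for k in range(len(wrapped))]


lazy_result = map_sketch([1, 2, 3], lambda x: x * 2, lazy=True)
direct_result = map_sketch([1, 2, 3], lambda x: x * 2, lazy=False)
calls_before_access = lazy_result.calls  # 0: nothing has been evaluated yet
second_item = lazy_result[1]             # evaluated only now
print(calls_before_access, second_item, direct_result)
```

In this sketch the lazy result keeps a reference to the original data and the function, while the direct result is a plain list that no longer needs either.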

data :

The data that needs to be mapped

map_fct : Callable

Callable object that defines the mapping

info : List[Dict]

List of dictionaries containing information that has been propagated through the chain of operations (default = None)

lazy : bool

Apply the mapping lazily or not (default = True)

workers : int

Number of workers used for loading the data (default = 1)

buffer_len : int

Buffer length of the pool (default = 3)

arg : list

Additional positional parameters to provide to the function if needed

kwargs : Dict

Additional keyword parameters to provide to the function if needed

Returns
MapAbstract OR DataAbstract OR np.ndarray OR list

class dabstract.abstract.abstract.MapAbstract(data: Iterable, map_fct: Callable, info: List[Dict] = None, *args: list, **kwargs: Dict)

Bases: dabstract.abstract.abstract.Abstract

The class applies a mapping to data in a lazy manner.

For example, consider the following function:

$   import numpy as np
$   def some_function(input, multiplier, logarithm=False):
$       output = input * multiplier
$       if logarithm:
$           output = np.log10(output)
$       return output

You can apply this function with multiplier=5 and logarithm=True as follows:

$   data = [1,2,3]
$   data_map = MapAbstract(data, map_fct=some_function, multiplier=5, logarithm=True)
$   print(data_map[0])
0.6989
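One way to picture how the extra arguments reach the mapping function is `functools.partial` from the standard library. This is only an analogy and not a description of how MapAbstract forwards arguments internally. `math.log10` stands in for `np.log10` so the sketch has no third-party dependency.

```python
import math
from functools import partial


def some_function(value, multiplier, logarithm=False):
    """Same logic as the example above, using math instead of numpy."""
    output = value * multiplier
    if logarithm:
        output = math.log10(output)
    return output


# Bind the extra arguments once, then apply the bound function per example.
bound = partial(some_function, multiplier=5, logarithm=True)
data = [1, 2, 3]
first = bound(data[0])  # log10(1 * 5)
print(round(first, 4))
```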

Similarly, one could use a lambda function:

$   data = [1,2,3]
$   data_map = MapAbstract(data, lambda x: np.log10(x*5))
$   print(data_map[0])
0.6989

Another example is to use the ProcessingChain. This would allow propagation of information. For example, assume the following ProcessingChain:

$   class custom_processor(Processor):
$       def process(self, data, **kwargs):
$           return data + 1, {'multiplier': 3}
$   class custom_processor2(Processor):
$       def process(self, data, **kwargs):
$           return data * kwargs['multiplier'], {}
$   dp = ProcessingChain()
$   dp.add(custom_processor)
$   dp.add(custom_processor2)

And add this to some data with a MapAbstract:

$   data = [1,2,3]
$   data_map = MapAbstract(data,map_fct=dp)
$   print(data_map[0])
6

When using a ProcessingChain, one can rely on the fact that it propagates the so-called 'info' through lazy operations. To obtain the information that has been propagated, use the .get() method with return_info=True:

$   print(data_map.get(0, return_info=True))
(6, {'multiplier': 3, 'output_shape': ()})

For more information on how to use a ProcessingChain, please check dabstract.dataprocessor.ProcessingChain.
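The idea of propagating 'info' from one step to the next can be sketched without dabstract. The helper `run_chain` and the two step functions below are hypothetical; the sketch assumes each step returns a `(data, info)` tuple and receives the info gathered so far as keyword arguments, which mirrors the `process` methods in the example above but is not the ProcessingChain implementation.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Callable[..., Tuple[Any, Dict[str, Any]]]


def run_chain(steps: List[Step], data: Any) -> Tuple[Any, Dict[str, Any]]:
    """Hypothetical helper: run steps in order, merging each step's info dict."""
    info: Dict[str, Any] = {}
    for step in steps:
        data, new_info = step(data, **info)
        info.update(new_info)
    return data, info


def add_one(data, **kwargs):
    # Produces info that a later step depends on.
    return data + 1, {"multiplier": 3}


def multiply(data, **kwargs):
    # Consumes the info emitted by the previous step.
    return data * kwargs["multiplier"], {}


value, info = run_chain([add_one, multiply], 1)
print(value, info)
```

Unlike the ProcessingChain output shown above, this sketch does not add an `output_shape` entry; it only carries what the steps themselves emit.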

In some cases one would like to use a function that has not been defined as a dabstract Processor while still propagating information, e.g. the sampling frequency. This information can be attached to the mapping through the info argument:

$   data = [1,2,3]
$   data_map = MapAbstract(data, lambda x: x, info=({'fs': 16000}, {'fs': 16000}, {'fs': 16000}))
$   print(data_map.get(0, return_info=True))
(1, {'fs': 16000})

To index through the data one could directly use default indexing, i.e. [idx] or use the .get() method.
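The relation between `[idx]` and `.get()` can be illustrated with a hypothetical container. `IndexableSketch` is not a dabstract class. It assumes that plain indexing behaves like `get()` with `return_info=False`, which the text here does not state explicitly.

```python
class IndexableSketch:
    """Hypothetical container: [] delegates to get() with default arguments."""

    def __init__(self, data, info=None):
        self._data = list(data)
        # One info dict per example; empty dicts when none are given.
        self._info = list(info) if info is not None else [{} for _ in self._data]

    def get(self, index, return_info=False):
        value = self._data[index]
        return (value, self._info[index]) if return_info else value

    def __getitem__(self, index):
        return self.get(index)

    def __len__(self):
        return len(self._data)


items = IndexableSketch([1, 2, 3], info=[{"fs": 16000}] * 3)
plain = items[0]                            # data only
with_info = items.get(0, return_info=True)  # (data, info) tuple
print(plain, with_info)
```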

The MapAbstract contains the following methods:

.get - return entry from MapAbstract
.keys - return attribute keys of data

A full explanation of each method is provided in that method's docstring.

Parameters
data : Iterable

Iterable object to be mapped

map_fct : Callable

Callable object that defines the mapping

info : List[Dict]

List of dictionaries containing information that will be propagated through the chain of operations. Useful when the mapping function is not a ProcessingChain (default = None)

arg : list

Additional positional parameters to provide to the function if needed

kwargs : Dict

Additional keyword parameters to provide to the function if needed

Returns
MapAbstract class

get(index: int, return_info: bool = False, *args: List, **kwargs: Dict) → Union[List, numpy.ndarray, Any]
Parameters
index : int

Index to retrieve data from

return_info : bool

Return a tuple (data, info) if True, else only data (default = False). info contains the information that has been propagated through the chain of operations.

arg : List

Additional positional parameters to provide to the function if needed

kwargs : Dict

Additional keyword parameters to provide to the function if needed

Returns
List OR np.ndarray OR Any