Map

dabstract.abstract.abstract.Map(data, map_fct: Callable, info: List[Dict] = None, lazy: bool = True, workers: int = 1, buffer_len: int = 3, *arg: list, **kwargs: Dict) → Union[dabstract.abstract.abstract.MapAbstract, dabstract.abstract.abstract.DataAbstract, numpy.ndarray, list]

Factory function to allow for choice between lazy and direct mapping.

In both cases an instance of MapAbstract is created. The difference is that direct mapping evaluates all examples immediately, whereas lazy mapping evaluates an example only when it is accessed.

For more information on mapping, see the docstring of MapAbstract.

data: The data that needs to be mapped
map_fct (Callable): Callable object that defines the mapping
info (List[Dict]): List of dictionaries containing information that has been propagated through the chain of operations (default = None)
lazy (bool): Whether to apply the mapping lazily (default = True)
workers (int): Number of workers used for loading the data (default = 1)
buffer_len (int): Buffer length of the pool (default = 3)
arg (list): Additional parameters to pass to the function if needed
kwargs (Dict): Additional keyword parameters to pass to the function if needed

Returns

MapAbstract OR DataAbstract OR np.ndarray OR list
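
The difference between lazy and direct mapping described above can be illustrated with a minimal, self-contained sketch. The names `LazyMap`, `simple_map` and `double` are hypothetical and exist only for this illustration; this is not dabstract's implementation, only a stand-in that shows when the mapping function is actually called.

```python
from typing import Any, Callable, List, Sequence


class LazyMap:
    """Hypothetical stand-in for a lazy mapping; not dabstract's implementation."""

    def __init__(self, data: Sequence[Any], map_fct: Callable[[Any], Any]) -> None:
        self._data = data
        self._map_fct = map_fct

    def __len__(self) -> int:
        return len(self._data)

    def __getitem__(self, index: int) -> Any:
        # The mapping function runs only when an item is requested.
        return self._map_fct(self._data[index])


def simple_map(data: Sequence[Any], map_fct: Callable[[Any], Any], lazy: bool = True):
    """Hypothetical factory: return a lazy wrapper, or evaluate everything now."""
    mapped = LazyMap(data, map_fct)
    if lazy:
        return mapped
    # Direct mapping: evaluate every example immediately.
    return [mapped[i] for i in range(len(mapped))]


calls: List[int] = []


def double(x: int) -> int:
    calls.append(x)  # record each evaluation
    return 2 * x


lazy_result = simple_map([1, 2, 3], double, lazy=True)
calls_after_lazy = len(calls)      # no evaluation has happened yet
first = lazy_result[0]             # evaluates a single item
calls_after_index = len(calls)

direct_result = simple_map([1, 2, 3], double, lazy=False)
calls_after_direct = len(calls)    # three further evaluations
```

In this sketch, creating the lazy object costs nothing up front, while the direct variant pays for every example at construction time.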

class dabstract.abstract.abstract.MapAbstract(data: Iterable, map_fct: Callable, info: List[Dict] = None, *args: list, **kwargs: Dict)

Bases: dabstract.abstract.abstract.Abstract

The class applies a mapping to data in a lazy manner.

For example, consider the following function:

$   import numpy as np
$   def some_function(input, multiplier, logarithm=False):
$       output = input * multiplier
$       if logarithm:
$           output = np.log10(output)
$       return output

You can apply this function with multiplier=5 and logarithm=True as follows:

$   data = [1,2,3]
$   data_map = MapAbstract(data, map_fct=some_function, multiplier=5, logarithm=True)
$   print(data_map[0])
0.6989

Similarly, one could use a lambda function:

$   data = [1,2,3]
$   data_map = MapAbstract(data, lambda x: np.log10(x*5))
$   print(data_map[0])
0.6989
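
As an alternative to a lambda, the extra arguments can be bound ahead of time with `functools.partial` from the standard library. The sketch below does not use dabstract at all and uses `math.log10` instead of NumPy to stay self-contained; the name `scaled_log` is hypothetical. The resulting callable takes a single input, so it should be usable wherever a single-argument `map_fct` is expected, although the docstring does not state this explicitly.

```python
import math
from functools import partial


def some_function(value, multiplier, logarithm=False):
    # Same logic as the example function above, using math instead of NumPy.
    output = value * multiplier
    if logarithm:
        output = math.log10(output)
    return output


# Bind the extra arguments once; the result takes a single input.
scaled_log = partial(some_function, multiplier=5, logarithm=True)

data = [1, 2, 3]
results = [scaled_log(x) for x in data]
print(results[0])
```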

Another option is to use a ProcessingChain as the mapping function, which allows information to be propagated between processing steps. For example, assume the following ProcessingChain:

$   class custom_processor(Processor):
$       def process(self, data, **kwargs):
$           return data + 1, {'multiplier': 3}
$   class custom_processor2(Processor):
$       def process(self, data, **kwargs):
$           return data * kwargs['multiplier'], {}
$   dp = ProcessingChain()
$   dp.add(custom_processor)
$   dp.add(custom_processor2)

Then apply this chain to some data with a MapAbstract:

$   data = [1,2,3]
$   data_map = MapAbstract(data,map_fct=dp)
$   print(data_map[0])
6

When using a ProcessingChain, one can exploit the fact that it propagates the so-called ‘info’ through lazy operations. To obtain the information that has been propagated, use the .get() method with return_info=True:

$   print(data_map.get(0, return_info=True))
(6, {'multiplier': 3, 'output_shape': ()})

For more information on how to use a ProcessingChain, please check dabstract.dataprocessor.ProcessingChain.
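
The propagation idea can be sketched without dabstract as follows. This is a simplified illustration under stated assumptions, not how ProcessingChain is implemented: each step returns a `(data, info)` tuple, and the hypothetical helper `run_chain` merges the info and passes it to later steps as keyword arguments. The names `add_one`, `scale` and `run_chain` are invented for this sketch, and it does not reproduce extra keys such as `output_shape` that appear in the real output above.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Callable[..., Tuple[Any, Dict[str, Any]]]


def add_one(data: Any, **info: Any) -> Tuple[Any, Dict[str, Any]]:
    # Emits new info for later steps.
    return data + 1, {"multiplier": 3}


def scale(data: Any, **info: Any) -> Tuple[Any, Dict[str, Any]]:
    # Consumes info produced by an earlier step.
    return data * info["multiplier"], {}


def run_chain(steps: List[Step], data: Any) -> Tuple[Any, Dict[str, Any]]:
    """Hypothetical runner: thread data and accumulated info through all steps."""
    info: Dict[str, Any] = {}
    for step in steps:
        data, new_info = step(data, **info)
        info.update(new_info)
    return data, info


result, info = run_chain([add_one, scale], 1)
print(result, info)
```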

In some cases the mapping function is not a dabstract Processor, yet it is still desirable to propagate information such as the sampling frequency. In that case, the information can be attached to the mapping through the info argument, with one dictionary per example:

$   data = [1,2,3]
$   data_map = MapAbstract(data, lambda x: x, info=({'fs': 16000}, {'fs': 16000}, {'fs': 16000}))
$   print(data_map.get(0, return_info=True))
(1, {'fs': 16000})

To access an entry, either use default indexing, i.e. [idx], or call the .get() method.
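
The two access styles and the per-example info can be sketched with a small stand-in class. `InfoMap` is hypothetical and is not dabstract's implementation. In particular, the sketch assumes that plain indexing returns the data only and that `get(..., return_info=True)` returns a `(data, info)` tuple, as the `return_info` parameter description suggests.

```python
from typing import Any, Callable, Dict, Optional, Sequence


class InfoMap:
    """Hypothetical lazy map with per-item info; not dabstract's implementation."""

    def __init__(
        self,
        data: Sequence[Any],
        map_fct: Callable[[Any], Any],
        info: Optional[Sequence[Dict[str, Any]]] = None,
    ) -> None:
        if info is not None and len(info) != len(data):
            raise ValueError("info needs exactly one dict per data item")
        self._data = data
        self._map_fct = map_fct
        self._info = info

    def __len__(self) -> int:
        return len(self._data)

    def get(self, index: int, return_info: bool = False) -> Any:
        value = self._map_fct(self._data[index])
        if return_info:
            # Copy so callers cannot mutate the stored info by accident.
            item_info = dict(self._info[index]) if self._info is not None else {}
            return value, item_info
        return value

    def __getitem__(self, index: int) -> Any:
        # Assumption in this sketch: indexing returns the data only.
        return self.get(index)


data = [1, 2, 3]
data_map = InfoMap(data, lambda x: x, info=[{"fs": 16000} for _ in data])

plain = data_map[0]
with_info = data_map.get(0, return_info=True)
print(plain, with_info)
```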

The MapAbstract contains the following methods:

.get - return entry from MapAbstract
.keys - return attribute keys of data

Each method is explained in full in its own docstring.

Parameters

data (Iterable): Iterable object to be mapped
map_fct (Callable): Callable object that defines the mapping
info (List[Dict]): List of dictionaries containing information that will be propagated through the chain of operations. Useful when the mapping function is not a ProcessingChain (default = None)
arg (list): Additional parameters to pass to the function if needed
kwargs (dict): Additional keyword parameters to pass to the function if needed

Returns

MapAbstract class

get(index: int, return_info: bool = False, *args: List, **kwargs: Dict) → Union[List, numpy.ndarray, Any]

Parameters

index (int): Index to retrieve data from
return_info (bool): If True, return a tuple (data, info); otherwise return only the data (default = False). The info contains the information that has been propagated through the chain of operations
arg (List): Additional parameters to pass to the function if needed
kwargs (Dict): Additional keyword parameters to pass to the function if needed

Returns

List OR np.ndarray OR Any