Map
dabstract.abstract.abstract.Map(data, map_fct: Callable, info: List[Dict] = None, lazy: bool = True, workers: int = 1, buffer_len: int = 3, *arg: list, **kwargs: Dict) → Union[dabstract.abstract.abstract.MapAbstract, dabstract.abstract.abstract.DataAbstract, numpy.ndarray, list]
Factory function to allow for choice between lazy and direct mapping.
In both cases an instance of MapAbstract is created. The difference is that direct mapping evaluates all examples immediately, whereas lazy mapping evaluates an example only when it is accessed.
For more information on mapping, see the docstring of MapAbstract().
- Parameters
- data :
The data that needs to be mapped
- map_fct : Callable
Callable object that defines the mapping
- info : List[Dict]
List of dictionaries containing information that has been propagated through the chain of operations (default = None)
- lazy : bool
apply the mapping lazily or not (default = True)
- workers : int
number of workers used for loading the data (default = 1)
- buffer_len : int
buffer length of the pool (default = 3)
- arg : list
additional parameters to provide to the function if needed
- kwargs : Dict
additional parameters to provide to the function if needed
- Returns
- MapAbstract OR DataAbstract OR np.ndarray OR list
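The difference between lazy and direct mapping can be illustrated without dabstract. The sketch below is a stand-in, not the library's implementation: `LazyMap`, `direct_map` and `double` are hypothetical names, and the `calls` list exists only to show when the mapping function actually runs.

```python
from typing import Any, Callable, List, Sequence


class LazyMap:
    """Hypothetical stand-in: applies map_fct only when an item is requested."""

    def __init__(self, data: Sequence, map_fct: Callable, **kwargs: Any) -> None:
        self._data = data
        self._map_fct = map_fct
        self._kwargs = kwargs

    def __len__(self) -> int:
        return len(self._data)

    def __getitem__(self, index: int) -> Any:
        # The mapping is evaluated here, at access time.
        return self._map_fct(self._data[index], **self._kwargs)


def direct_map(data: Sequence, map_fct: Callable, **kwargs: Any) -> List[Any]:
    """Hypothetical stand-in: evaluates every example immediately."""
    return [map_fct(example, **kwargs) for example in data]


calls: List[int] = []  # records every input the mapping function has seen


def double(x: int) -> int:
    calls.append(x)
    return 2 * x


lazy = LazyMap([1, 2, 3], double)
calls_after_lazy_construction = len(calls)  # nothing has been evaluated yet
first = lazy[0]                             # evaluates a single example
calls_after_one_access = len(calls)

eager = direct_map([1, 2, 3], double)       # evaluates all three examples at once
```

In this sketch the lazy variant defers all work until indexing, which keeps construction cheap for large datasets, while the direct variant pays the full cost up front and returns a plain list.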
class dabstract.abstract.abstract.MapAbstract(data: Iterable, map_fct: Callable, info: List[Dict] = None, *args: list, **kwargs: Dict)
Bases: dabstract.abstract.abstract.Abstract
The class applies a mapping to data in a lazy manner.
For example, consider the following function:
$ import numpy as np
$ def some_function(input, multiplier, logarithm=False):
$     output = input * multiplier
$     if logarithm:
$         output = np.log10(output)
$     return output
You can apply this function with multiplier=5 and logarithm=True as follows:
$ data = [1,2,3]
$ data_map = MapAbstract(data, map_fct=some_function, multiplier=5, logarithm=True)
$ print(data_map[0])
0.6989
Similarly, one could use a lambda function:
$ data = [1,2,3]
$ data_map = MapAbstract(data, lambda x: np.log10(x*5))
$ print(data_map[0])
0.6989
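As a check on the printed value, both mappings can be evaluated with plain Python, without `MapAbstract`. This sketch makes two substitutions that are not in the original: it uses `math.log10` instead of `np.log10` to stay dependency-free, and it names the first parameter `value` to avoid shadowing the built-in `input`.

```python
import math


def some_function(value, multiplier, logarithm=False):
    """Multiply the input and optionally take the base-10 logarithm."""
    output = value * multiplier
    if logarithm:
        output = math.log10(output)
    return output


data = [1, 2, 3]

# Named function with multiplier=5 and logarithm=True.
via_function = [some_function(x, multiplier=5, logarithm=True) for x in data]

# Equivalent lambda, as in the second example.
log_scaled = lambda x: math.log10(x * 5)
via_lambda = [log_scaled(x) for x in data]
```

For the first example both variants give log10(5), approximately 0.69897, which matches the value shown above up to the number of printed digits.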
Another example is to use the ProcessingChain. This would allow propagation of information. For example, assume the following ProcessingChain:
$ class custom_processor(Processor):
$     def process(self, data, **kwargs):
$         return data + 1, {'multiplier': 3}
$ class custom_processor2(Processor):
$     def process(self, data, **kwargs):
$         return data * kwargs['multiplier'], {}
$ dp = ProcessingChain()
$ dp.add(custom_processor)
$ dp.add(custom_processor2)
Apply this chain to some data with a MapAbstract:
$ data = [1,2,3]
$ data_map = MapAbstract(data, map_fct=dp)
$ print(data_map[0])
6
When using a ProcessingChain one can utilise the fact that it propagates the so-called ‘info’ through lazy operations. To obtain the information that has been propagated, one can use the .get() method:
$ print(data_map.get(0, return_info=True))
(6, {'multiplier': 3, 'output_shape': ()})
For more information on how to use a ProcessingChain, please check dabstract.dataprocessor.ProcessingChain.
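The idea of propagating info through a chain can be sketched in plain Python. This is an illustration of the concept only, not how ProcessingChain is implemented: `add_one`, `scale` and `run_chain` are hypothetical names, and the sketch does not reproduce extra entries such as 'output_shape'.

```python
from typing import Any, Callable, Dict, List, Tuple

# Each step takes data plus the info gathered so far and returns (data, new_info).
Step = Callable[..., Tuple[Any, Dict[str, Any]]]


def add_one(data: Any, **info: Any) -> Tuple[Any, Dict[str, Any]]:
    # Mirrors custom_processor: adds 1 and announces a multiplier.
    return data + 1, {"multiplier": 3}


def scale(data: Any, **info: Any) -> Tuple[Any, Dict[str, Any]]:
    # Mirrors custom_processor2: consumes the multiplier set by an earlier step.
    return data * info["multiplier"], {}


def run_chain(steps: List[Step], data: Any) -> Tuple[Any, Dict[str, Any]]:
    """Run the steps in order, merging each step's info into a shared dict."""
    info: Dict[str, Any] = {}
    for step in steps:
        data, new_info = step(data, **info)
        info.update(new_info)
    return data, info


result, info = run_chain([add_one, scale], 1)
print(result, info)  # 6 {'multiplier': 3}
```

The second step never receives the multiplier as an explicit argument; it reads it from the info emitted by the first step, which is the behaviour the `(6, {'multiplier': 3, ...})` output above reflects.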
In some cases one would like to use a function that has not been defined as a dabstract Processor while still propagating information, e.g. the sampling frequency. One can attach such information to a mapping function as follows:
$ data = [1,2,3]
$ data_map = MapAbstract(data, lambda x: x, info=({'fs': 16000}, {'fs': 16000}, {'fs': 16000}))
$ print(data_map[0])
(1, {'fs': 16000})
To index through the data one can either use default indexing, i.e. [idx], or use the .get() method.
The MapAbstract contains the following methods:
.get - return entry from MapAbstract
.keys - return attribute keys of data
The full explanation of each method is provided in that method's docstring.
- Parameters
- data : Iterable
Iterable object to be mapped
- map_fct : Callable
Callable object that defines the mapping
- info : List[Dict]
List of dictionaries containing information that will be propagated through the chain of operations. Useful when the mapping function is not a ProcessingChain (default = None)
- args : list
additional parameters to provide to the function if needed
- kwargs : Dict
additional parameters to provide to the function if needed
- Returns
- MapAbstract class
get(index: int, return_info: bool = False, *args: List, **kwargs: Dict) → Union[List, numpy.ndarray, Any]
- Parameters
- index : int
index to retrieve data from
- return_info : bool
return tuple (data, info) if True, else data only (default = False). info contains the information that has been propagated through the chain of operations
- args : List
additional parameters to provide to the function if needed
- kwargs : Dict
additional parameters to provide to the function if needed
- Returns
- List OR np.ndarray OR Any
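The return_info switch described above can be sketched with a small stand-in class. `InfoMap` is a hypothetical name and this is not MapAbstract's code. The sketch assumes one info dict per example and assumes that plain indexing returns the data only.

```python
from typing import Any, Callable, Dict, Optional, Sequence


class InfoMap:
    """Hypothetical stand-in pairing each example with an optional info dict."""

    def __init__(
        self,
        data: Sequence,
        map_fct: Callable[[Any], Any],
        info: Optional[Sequence[Dict[str, Any]]] = None,
    ) -> None:
        if info is not None and len(info) != len(data):
            raise ValueError("info needs exactly one dict per example")
        self._data = data
        self._map_fct = map_fct
        self._info = info

    def get(self, index: int, return_info: bool = False) -> Any:
        value = self._map_fct(self._data[index])
        if return_info:
            # Copy so callers cannot mutate the stored info by accident.
            example_info = dict(self._info[index]) if self._info is not None else {}
            return value, example_info
        return value

    def __getitem__(self, index: int) -> Any:
        # Assumption of this sketch: indexing is shorthand for get(index).
        return self.get(index)


data_map = InfoMap([1, 2, 3], lambda x: x, info=[{"fs": 16000}] * 3)
only_data = data_map.get(0)
data_and_info = data_map.get(0, return_info=True)
```

Here `only_data` is the mapped value alone, while `data_and_info` is the `(data, info)` tuple described for return_info=True.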