Parallel_op
-
dabstract.abstract.abstract.parallel_op(data: Iterable, type: str = 'threadpool', workers: int = 0, buffer_len: int = 3, return_info: bool = False, *args: list, **kwargs: Dict) → Generator
Apply parallelisation to an iterable. This works for any iterable, including dabstract functions.
Consider the following pseudo code as an example:
$ class IterableToParallelize:
$     def __init__(self, data, process_function):
$         self.data = data
$         self.process_function = process_function
$     def __getitem__(self, k):
$         return self.process_function(self.data[k])
$
$ iterable = IterableToParallelize(data, process_function)
which could also be created using abstract.MapAbstract:
$ iterable = MapAbstract(data, process_function)
To get the data, one could simply loop over the iterable:
$ for example in iterable:
$     do_something(example)
However, if processing each example is costly, parallel_op can be used to speed up the loop:
$ par_iterable = parallel_op(iterable, workers=5)
$ for example in par_iterable:
$     do_something(example)
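The idea behind workers and buffer_len can be sketched with the standard library alone. The block below is not dabstract's implementation; buffered_parallel_map and slow_square are hypothetical names introduced only for illustration. It assumes a thread pool that computes a bounded number of items ahead of the consumer and yields them in input order.

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, Iterator


def buffered_parallel_map(
    data: Iterable[Any],
    process_function: Callable[[Any], Any],
    workers: int = 2,
    buffer_len: int = 3,
) -> Iterator[Any]:
    """Hypothetical stand-in: yield process_function(item) in input order.

    At most buffer_len items are submitted ahead of the consumer.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = deque()
        for item in data:
            # Submit work to the pool; it runs while the consumer is busy.
            pending.append(pool.submit(process_function, item))
            if len(pending) >= buffer_len:
                # Buffer is full: hand the oldest result to the consumer.
                yield pending.popleft().result()
        # Input exhausted: drain what is still queued.
        while pending:
            yield pending.popleft().result()


def slow_square(x: int) -> int:
    """Hypothetical costly processing step."""
    time.sleep(0.01)
    return x * x


results = list(buffered_parallel_map(range(8), slow_square, workers=4, buffer_len=3))
```

Because results are taken from the front of the queue, the output order matches the input order even when later items finish first.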
- Parameters
- data : Iterable
Iterable object to be parallelised
- type : str [‘threadpool’, ‘processpool’]
String to select either ‘threadpool’ or ‘processpool’
- workers : int
Number of parallel workers
- buffer_len : int
The length of the buffer when the result is consumed as a generator, for example:
for data in dataset: do_something(data)
This will queue up buffer_len instances of data while do_something() is busy.
- return_info : bool
Return information that has been propagated through a chain of processors and abstracts. For example, if WavDataReader from dabstract.dataprocessor has been used, this returns the sampling frequency (‘fs’).
- args : list
Additional positional arguments to provide to the iterable
- kwargs : dict
Additional keyword arguments to provide to the iterable
- Returns
- data : Generator
The generator will return Union[Generator, Tuple[Generator, Dict]]. When return_info is True, it returns a tuple of the example and a dictionary containing propagated information. When return_info is False, it returns the example.
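The two return shapes can be illustrated with a small stand-in. This is a sketch, not dabstract code: iterate_examples is a hypothetical helper, and it assumes that with return_info enabled each yielded item is an (example, info) pair. The ‘fs’ key mirrors the sampling-frequency example above, and the value 16000 is made up.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple, Union


def iterate_examples(
    data: Iterable[Any],
    info: Dict[str, Any],
    return_info: bool = False,
) -> Iterator[Union[Any, Tuple[Any, Dict[str, Any]]]]:
    """Hypothetical stand-in that mimics the two documented return shapes."""
    for example in data:
        if return_info:
            # Pair each example with the propagated information.
            yield example, info
        else:
            # Only the example itself.
            yield example


info = {"fs": 16000}  # assumed value, for illustration only
plain = list(iterate_examples([0.1, 0.2], info))
with_info = list(iterate_examples([0.1, 0.2], info, return_info=True))

for example, example_info in with_info:
    # Consumers unpack the pair when return_info is enabled.
    sampling_rate = example_info["fs"]
```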