Parallel_op

dabstract.abstract.abstract.parallel_op(data: Iterable, type: str = 'threadpool', workers: int = 0, buffer_len: int = 3, return_info: bool = False, *args: list, **kwargs: Dict) → Generator

Apply parallelisation to an iterable. This works for any iterable including dabstract functions.

Consider the following pseudo code as an example:

$ class IterableToParallelize:
$     def __init__(self, data, process_function):
$         self.data = data
$         self.process_function = process_function
$
$     def __getitem__(self, k):
$         # Processing happens lazily, each time an item is requested
$         return self.process_function(self.data[k])
$
$ iterable = IterableToParallelize(data, process_function)

which could also be created using the abstract.MapAbstract as:

$ iterable = MapAbstract(data, process_function)

To get the data one could simply loop over the data like:

$ for example in iterable:
$     do_something(example)

However, if this is costly, one would use this function to speed that up:

$ par_iterable = parallel_op(iterable, workers=5)
$ for example in par_iterable:
$     do_something(example)
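To make the idea concrete, the following is a minimal standard-library sketch of this kind of parallel iteration. It is not dabstract's implementation: the helper `buffered_parallel_iter` and the `Squares` class are hypothetical names introduced for illustration, and the sketch assumes the iterable supports both `len()` and integer indexing.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def buffered_parallel_iter(iterable, workers=2, buffer_len=3):
    """Yield iterable[k] in order while up to buffer_len items are computed ahead.

    Hypothetical helper for illustration only; assumes len() and indexing work.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = deque()
        for k in range(len(iterable)):
            # Indexing triggers the (possibly costly) processing in a worker thread.
            pending.append(pool.submit(iterable.__getitem__, k))
            if len(pending) >= buffer_len:
                # Buffer is full: hand the oldest result to the consumer.
                yield pending.popleft().result()
        # Drain whatever is still queued once all items have been submitted.
        while pending:
            yield pending.popleft().result()


class Squares:
    """Toy iterable whose __getitem__ stands in for a costly process_function."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, k):
        return self.data[k] ** 2


results = list(buffered_parallel_iter(Squares([1, 2, 3, 4, 5]), workers=2, buffer_len=3))
print(results)
```

Results come back in the original order because futures are consumed in the order they were submitted, even if a later item finishes first.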

Parameters

data : Iterable

Iterable object to be parallelised

type : str ['threadpool', 'processpool']

String to select either ‘threadpool’ or ‘processpool’

workers : int

Number of parallel workers

buffer_len : int

The length of the buffer when the result is consumed as a generator. For example, in:

$ for data in dataset:
$     do_something(data)

up to buffer_len instances of data are queued up while do_something() is busy.

return_info : bool

Return information that has been propagated through a chain of processors and abstracts. For example, if one has used WavDataReader from dabstract.dataprocessor, this will retrieve the sampling frequency ('fs').

args : list

Additional positional arguments to provide to the iterable

kwargs : dict

Additional keyword arguments to provide to the iterable

Returns

data : Generator

The return type is Union[Generator, Tuple[Generator, Dict]]. When return_info is True, it returns a tuple of the example and a dictionary containing propagated information. When return_info is False, it returns the example.
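The type parameter above selects between a thread pool and a process pool. As a rough guide in Python generally, thread pools tend to suit I/O-bound work such as reading files, while process pools tend to suit CPU-bound work but require the function and its data to be picklable. The sketch below shows one way such a string could map onto the standard-library executors; `EXECUTORS`, `make_executor` and `slow_square` are hypothetical names for illustration, and this is not a description of how dabstract handles the parameter internally.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# Hypothetical mapping from the `type` string to a standard-library executor class.
EXECUTORS = {
    "threadpool": ThreadPoolExecutor,
    "processpool": ProcessPoolExecutor,
}


def make_executor(type="threadpool", workers=2):
    """Return an executor for the given type string (illustrative helper)."""
    try:
        executor_cls = EXECUTORS[type]
    except KeyError:
        raise ValueError(f"type must be one of {sorted(EXECUTORS)}, got {type!r}")
    return executor_cls(max_workers=workers)


def slow_square(x):
    # Stand-in for a costly processing function. It is defined at module level
    # so that a process pool would be able to pickle it.
    return x * x


with make_executor("threadpool", workers=2) as pool:
    # Executor.map returns results in input order.
    squares = list(pool.map(slow_square, [1, 2, 3]))

print(squares)
```

When switching such a sketch to "processpool", the script's entry point would normally need an `if __name__ == "__main__":` guard on platforms that spawn new interpreter processes.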