cf.Data.cull_graph

Data.cull_graph()

Remove unnecessary tasks from the dask graph in-place.

Performance

An unnecessary task is one which does not contribute to the computed result. Such tasks are always automatically removed (culled) at compute time, but removing them beforehand might improve performance by reducing the amount of work done in later steps.
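To make the idea of an unnecessary task concrete, here is a minimal sketch of culling on a toy task graph held in a plain dictionary. It is an illustration only, not the code that cf or dask actually run: the function `cull`, the helper `_is_key`, and the simplified task format (a tuple of a callable followed by its arguments) are all assumptions made for this example.

```python
import operator


def _is_key(obj, graph):
    """Return True if obj is a key of graph (hypothetical helper)."""
    try:
        return obj in graph
    except TypeError:
        # Unhashable arguments (e.g. lists) can never be keys.
        return False


def cull(graph, outputs):
    """Return only the tasks of graph needed to compute outputs.

    Toy format assumed here: a task is a tuple (callable, *args),
    and any argument that is itself a key of the graph is a
    dependency. Anything else is treated as literal data.
    """
    needed = {}
    stack = list(outputs)
    while stack:
        key = stack.pop()
        if key in needed:
            continue
        task = graph[key]
        needed[key] = task
        if isinstance(task, tuple) and task and callable(task[0]):
            # Follow the dependencies of this task.
            stack.extend(arg for arg in task[1:] if _is_key(arg, graph))
    return needed


# Two stored chunks, but the slice only reads from the first one.
graph = {
    ("array", 0): [1, 2, 3],
    ("array", 1): [4, 5],
    ("getitem", 0): (operator.getitem, ("array", 0), slice(0, 2)),
}

culled = cull(graph, [("getitem", 0)])
print(sorted(culled))
```

The chunk `("array", 1)` is dropped because nothing reachable from the requested output depends on it, which is the same situation that the example at the end of this page shows for a real dask graph.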

New in version 3.14.0.

Returns

None

Examples

>>> d = cf.Data([1, 2, 3, 4, 5], chunks=3)
>>> d = d[:2]
>>> dict(d.to_dask_array().dask)
{('array-21ea057f160746a3d3f0943bba945460', 0): array([1, 2, 3]),
 ('array-21ea057f160746a3d3f0943bba945460', 1): array([4, 5]),
 ('getitem-3e4edac0a632402f6b45923a6b9d215f',
  0): (<function dask.array.chunk.getitem(obj, index)>, ('array-21ea057f160746a3d3f0943bba945460',
   0), (slice(0, 2, 1),))}
>>> d.cull_graph()
>>> dict(d.to_dask_array().dask)
{('getitem-3e4edac0a632402f6b45923a6b9d215f',
  0): (<function dask.array.chunk.getitem(obj, index)>, ('array-21ea057f160746a3d3f0943bba945460',
   0), (slice(0, 2, 1),)),
 ('array-21ea057f160746a3d3f0943bba945460', 0): array([1, 2, 3])}