Implementation details
DSC is implemented in Python 3. It relies on a number of libraries:
- DSC relies heavily on code from the SoS project for execution of pipelines. SoS implements job dispatch and management via networkx and, unlike timestamp-based workflow tools, a file signature system via xxHash. Development of DSC is contributed directly to SoS whenever appropriate.
- sympy is used to expand DSC benchmark specifications into pipelines, and to expand the logic of the @FILTER decorator.
- pandas is used to ensure proper conversion between R and Python data frames. It is also used to manipulate output data.
- scipy provides the sparse module, which allows scipy.sparse matrices to be stored in the DSC default storage format for Python.
- sqlalchemy allows dsc-query to use SQL-like syntax.
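The difference between a file signature system and a timestamp-based one can be illustrated with a minimal sketch. This is not the SoS implementation: the function names `file_signature`, `needs_rerun`, and `demo` are hypothetical, and the standard library's `hashlib.blake2b` is used as a stand-in for xxHash so that the example runs without extra dependencies.

```python
import hashlib
import os
import tempfile


def file_signature(path, chunk_size=1 << 20):
    """Return a hex digest of the file content (stand-in for an xxHash digest)."""
    digest = hashlib.blake2b(digest_size=16)
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_rerun(path, recorded_signature):
    """A step is stale only if the file is missing or its content changed."""
    return not os.path.exists(path) or file_signature(path) != recorded_signature


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "input.txt")
        with open(path, "w") as handle:
            handle.write("alpha\n")
        recorded = file_signature(path)

        # Push the modification time forward without touching the content.
        # A timestamp-based tool would now consider the file modified.
        future = os.path.getmtime(path) + 60
        os.utime(path, (future, future))
        stale_after_touch = needs_rerun(path, recorded)

        # Change the content itself: the signature no longer matches.
        with open(path, "w") as handle:
            handle.write("beta\n")
        stale_after_edit = needs_rerun(path, recorded)
    return stale_after_touch, stale_after_edit
```

In this sketch, `demo()` reports that the file is not stale after its timestamp alone changes, and is stale once its content changes. That is the behavior a signature-based system is meant to provide.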
In addition,
- Preliminary cross-language communication from R to Python is implemented with rpy2, and from Python to R with reticulate. This may be replaced in future versions by a data bus implementation in the SoS project.