Implementation details
DSC is implemented in Python 3. It relies on a number of libraries:
- DSC relies heavily on code from the SoS project for pipeline execution. SoS implements job dispatch and management via networkx and, unlike timestamp-based workflow tools, a file signature system via xxHash. Development work on DSC is contributed directly to SoS whenever appropriate.
- sympy is used to expand DSC benchmark specifications into pipelines, and to expand the logic of the `@FILTER` decorator.
- pandas is used to ensure proper conversion between R and Python data frames. It is also used to manipulate output data.
- scipy provides a `sparse` module, which lets DSC store `scipy.sparse` matrices in its default storage format for Python.
- sqlalchemy enables `dsc-query` to use SQL-like syntax.
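
The difference between a file signature system and timestamp-based checks, mentioned above for SoS, can be illustrated with a minimal sketch. This is not SoS code: `file_signature` and `needs_rerun` are hypothetical helpers, and `hashlib.blake2b` from the standard library stands in for xxHash so that the example has no third-party dependencies.

```python
import hashlib
import os
import tempfile


def file_signature(path, chunk_size=65536):
    """Return a hex digest of a file's content (hypothetical helper).

    Assumption: hashlib.blake2b is a standard-library stand-in;
    the text above says SoS uses xxHash for this purpose.
    """
    digest = hashlib.blake2b(digest_size=16)
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_rerun(path, recorded_signature):
    """A step is stale only if the content signature no longer matches."""
    return file_signature(path) != recorded_signature


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "input.txt")
    with open(target, "w") as handle:
        handle.write("alpha\n")
    recorded = file_signature(target)

    # Rewrite identical content: the modification time may change,
    # but the content signature does not.
    with open(target, "w") as handle:
        handle.write("alpha\n")
    unchanged_rerun = needs_rerun(target, recorded)

    # Change the content: the signature changes, so the step is stale.
    with open(target, "w") as handle:
        handle.write("beta\n")
    changed_rerun = needs_rerun(target, recorded)

print(unchanged_rerun, changed_rerun)  # False True
```

A timestamp-based tool would typically treat the first rewrite as a change, because the file was touched. A signature-based check compares content only, so an identical file does not trigger a rerun.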
In addition,
- Preliminary cross-language communication from R to Python is implemented with rpy2, and from Python to R using reticulate. This might be replaced in future versions with a data bus implementation in the SoS project.