Implementation details
DSC is implemented in Python 3. It relies on a number of libraries:
- DSC relies heavily on code from the SoS project for execution of pipelines. SoS implements job dispatch and management via networkx and, unlike timestamp-based workflow tools, a file signature system via xxHash. Development work on DSC is contributed directly to SoS whenever appropriate.
- sympy is used to expand DSC benchmark specifications into pipelines, and to expand the logic for the @FILTER decorator.
- pandas is used to ensure proper conversion between R and Python data frames. It is also used to manipulate output data.
- scipy provides the sparse module, which supports storing matrices of scipy.sparse type in the DSC default storage format for Python.
- sqlalchemy enables dsc-query to use SQL-like syntax.
In addition,
- Preliminary cross-language communication from R to Python is implemented using rpy2, and from Python to R using reticulate. This may be replaced in future versions by a data bus implementation from the SoS project.
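
To illustrate why a file signature system behaves differently from a timestamp-based one, here is a minimal sketch. It is not the SoS implementation: the helper names `file_signature` and `needs_rerun` are hypothetical, and the code falls back to a standard-library hash when the xxhash package is not installed. The idea is that a step is treated as stale only when the content digest of a file changes, regardless of its modification time.

```python
import hashlib
import os
import tempfile

try:
    import xxhash  # third-party package providing the xxHash family

    def _new_hasher():
        return xxhash.xxh64()
except ImportError:
    def _new_hasher():
        # Stand-in for illustration only; this is not xxHash.
        return hashlib.blake2b(digest_size=8)


def file_signature(path, chunk_size=1 << 20):
    """Return a hex digest of the file content (hypothetical helper)."""
    hasher = _new_hasher()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def needs_rerun(path, recorded_signature):
    """A step is stale only if the content digest differs from the record."""
    return file_signature(path) != recorded_signature


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "input.txt")
        with open(path, "w") as handle:
            handle.write("alpha\n")
        recorded = file_signature(path)
        mtime_before = os.stat(path).st_mtime_ns

        # Rewrite identical content and move the modification time forward.
        with open(path, "w") as handle:
            handle.write("alpha\n")
        later = mtime_before + 10**9
        os.utime(path, ns=(later, later))
        timestamp_changed = os.stat(path).st_mtime_ns != mtime_before
        rerun_when_unchanged = needs_rerun(path, recorded)

        # Change the content itself.
        with open(path, "w") as handle:
            handle.write("beta\n")
        rerun_when_changed = needs_rerun(path, recorded)
    return timestamp_changed, rerun_when_unchanged, rerun_when_changed
```

In `demo()`, a timestamp-based tool would see the newer modification time and rerun the step, while the signature comparison reports that nothing changed until the content is actually rewritten.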