DSC in 5 Minutes with R & Python mix
This tutorials shows re-implementation of the DSC Introduction mixing R and Python implementations. Source code to run this example can be found here.
If you are reading this page and are in need of R & Python communications, I suppose you might have experience interacting between R and Python, and appreciate the challenges. Data transfer between R and Python currently depends on rpy2
. This has 2 implications: 1) information flow is no longer lossless because it is impossible to support the Python counter part for any arbitary R object (and vise versa) and 2) installation of rpy2
and get it to work can be challenging. While I cannot provide support for rpy2
installation, here is a personal note on how I got it work on my system, possibly the hard way, but might be of interest to those who are in the same situation as I did. Additionally, 3) there is noticible performance overhead at the data transfer interface.
Data communication via a more universial and robust interface has been proposed. We hope to be able to implement it in a near future release.
Here is the DSC script:
#!/usr/bin/env dsc
normal: normal.py
n: 100
$data: x
$true_mean: 0
t: t.R
n: 100
df: 2
$data: x
$true_mean: 3
mean: mean.R
x: $data
$est_mean: y
median: median.py
x: $data
$est_mean: y
sq_err: sq.py
a: $est_mean
b: $true_mean
$error: e
abs_err: abs.R
a: $est_mean
b: $true_mean
$error: e
DSC:
define:
simulate: normal, t
analyze: mean, median
score: abs_err, sq_err
run: simulate * analyze * score
exec_path: R, PY
python_modules: numpy
output: dsc_result
To execute:
cd ~/GIT/dsc/vignettes/one_sample_location_python
./settings_mix.dsc -c 30
INFO: Checking R library dscrutils@stephenslab/dsc/dscrutils ...
INFO: Checking R library reticulate@rstudio ...
INFO: Checking Python module numpy ...
INFO: Checking Python module rpy2 ...
INFO: DSC script exported to dsc_result.html
INFO: Constructing DSC from ./settings_mix.dsc ...
INFO: Building execution graph & running DSC ...
[#############################] 29 steps processed (26 jobs completed, 3 jobs ignored)
INFO: Building DSC database ...
INFO: DSC complete!
INFO: Elapsed time 9.688 seconds.