Prototype in DSC
Although DSC is a benchmarking tool, it is also useful when you are in “research mode”, that is, the prototyping stage. Building a mini-benchmark in DSC from the start saves you the effort of migrating your code to DSC later, when you decide to expand your initial comparisons.
In the prototyping stage, the command options --target and --truncate are particularly relevant. In brief, --target accepts:
- A single module or a sequence of modules
- A single module group or a sequence of module groups
- A single run flag
and --truncate runs a small fraction of a DSC benchmark relatively quickly to check that everything is correct. Please read on for details.
Test out one module at a time
It is highly recommended that you check the correctness of DSC modules as you develop them. Do not run the entire benchmark until you have checked it module by module: that is, add the next module only when you are sure the previous ones work. The command options --target and --truncate can be used for this purpose. For example, for a DSC benchmark:
DSC:
  define:
    simulate: normal, t
    estimate: median, mean
  run: simulate * estimate * mse
When the benchmark is tested for the first time, one should at least run the following to ensure the first 2 modules work well:
dsc file.dsc --target normal --truncate -o prototype
dsc file.dsc --target t --truncate -o prototype
Here the -o option writes results to a separate folder called prototype (or any name you choose), which you can safely remove once you are done prototyping.
Then move on to testing the next modules:
dsc file.dsc --target "normal * mean" --truncate -o prototype
dsc file.dsc --target "normal * median" --truncate -o prototype
and finally:
dsc file.dsc --target "normal * median * mse" --truncate -o prototype
to test mse. When everything looks good, run:
dsc file.dsc
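The sequence of test commands above can also be generated rather than typed by hand. The following is a minimal sketch, not part of DSC itself: the helper name `incremental_commands` is hypothetical, and it only assembles the same `--target`, `--truncate` and `-o` options shown above into command lines, in the order the modules should be tested.

```python
import shlex


def incremental_commands(dsc_file, targets, out_dir="prototype"):
    """Build one truncated `dsc` command line per target, in the given order.

    Hypothetical helper: it only formats strings and does not call DSC.
    """
    return [
        shlex.join(["dsc", dsc_file, "--target", target, "--truncate", "-o", out_dir])
        for target in targets
    ]


# Targets taken from the walkthrough above, from upstream to downstream.
targets = [
    "normal",
    "t",
    "normal * mean",
    "normal * median",
    "normal * median * mse",
]

for cmd in incremental_commands("file.dsc", targets):
    print(cmd)
```

Each printed line can be pasted into a shell, or passed to `subprocess.run(..., shell=True, check=True)` if you want the sequence to stop at the first failing module.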
You can also use module groups in --target:
dsc file.dsc --target simulate --truncate -o prototype
will run both the normal and t modules.
Option --truncate
--truncate runs only one instance of a module. For example, for this DSC:
simulate: R(x=rnorm(n))
  n: 100, 200, 300, 400
DSC:
  replicate: 20
Then running
dsc file.dsc --target simulate
will result in 4 different n values and 20 replicates, a total of 80 module instances. However, with
dsc file.dsc --target simulate --truncate
it will only run the first n (n=100) with only 1 replicate.
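The instance counts in this example can be checked with a few lines of Python. This is only a mental model of the arithmetic, assuming that an instance is one combination of a parameter value and a replicate and that --truncate keeps the first such combination; it is not how DSC is implemented internally.

```python
from itertools import product

n_values = [100, 200, 300, 400]  # the four values of n in the simulate module
replicates = range(1, 21)        # replicate: 20

# Full run: every value of n crossed with every replicate.
full = list(product(n_values, replicates))

# Assumed model of --truncate: keep only the first parameter value
# and the first replicate.
truncated = full[:1]

print(len(full))   # 80
print(truncated)   # [(100, 1)]
```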
Test out a particular module downstream
--target can also accept temporary run flags. This is useful when testing newly added modules downstream. For example, in the DSC section below:
DSC:
  define:
    get_Y: original_Y
    init: init_mnm
    fit: fit_mnm, fit_susie, fit_varbvs, fit_finemap
  run:
    first_pass: get_data * get_Y * get_sumstats * init * fit
    dap: get_data * get_Y * get_sumstats * init * fit_dap
Two run flags are defined: first_pass and dap. The difference between them is that first_pass runs every module in the fit group, which does not include fit_dap, whereas dap runs only fit_dap and none of the other fit modules. As the names suggest, first_pass contains modules that have already been tested to work, as our first pass at the problem, while dap contains the module we are currently working on, fit_dap. To prototype fit_dap exclusively:
dsc file.dsc --target dap
You can remove the dap run flag after prototyping if it is no longer needed.
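To make the difference between the two run flags concrete, the sketch below expands each pipeline into the modules it would use at every stage. It is an illustration only: the `expand` helper is hypothetical, and it assumes that any name not listed under define (such as get_data or get_sumstats) is a single module rather than a group.

```python
# Groups copied from the `define` section above.
define = {
    "get_Y": ["original_Y"],
    "init": ["init_mnm"],
    "fit": ["fit_mnm", "fit_susie", "fit_varbvs", "fit_finemap"],
}

# Pipelines copied from the `run` section above.
run = {
    "first_pass": "get_data * get_Y * get_sumstats * init * fit",
    "dap": "get_data * get_Y * get_sumstats * init * fit_dap",
}


def expand(pipeline):
    """Replace each group name in a `*`-separated pipeline with its modules.

    Names that are not groups are assumed to be single modules.
    """
    stages = [stage.strip() for stage in pipeline.split("*")]
    return [define.get(stage, [stage]) for stage in stages]


# Show which modules each run flag uses at its final stage.
for flag, pipeline in run.items():
    print(flag, "->", expand(pipeline)[-1])
```

The first four stages are identical for both flags; only the final stage differs, which is why `--target dap` exercises fit_dap without rerunning the other fit modules.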
Test locally before running on a cluster system
If you are working with a cluster, we suggest starting an interactive session on the cluster and using --target and --truncate as described above to test your modules quickly. Do not use --host at this stage, so that your test runs stay local to the interactive node and you get feedback quickly. Once you are confident everything works, you can submit DSC instances as cluster jobs using the --host option, from either your own computer or the cluster’s head node. See this tutorial for details.