RAxML Workflow interface: raxmlhpc2_workflow.xml

This is one of several RAxML interfaces that can be used for REST access. Others include raxmlhpc2_tgb.xml, raxmlhpc2_tgb_rest.xml, and raxmlhpc8_rest_xsede.xml.

This interface was created to support the multiple command line workflows offered by raxmlGUI. If you need access to a more flexible interface, please see the raxmlhpc8_rest_xsede.xml interface.

To use this interface, the first parameter must identify the tool:
toolId=RAXMLHPC2_WORKFLOW

The basic structure of the raxmlhpc2_workflow.xml interface:
The goal was to offer every multi-command workflow that raxmlGUI offers, while preserving as much flexibility as possible. We divided the analyses supported by raxmlGUI into two cases: those that require a multi-line command line (which raxmlGUI generates by joining separate commands with &&) and those that require only a single command line.

This interface supports only the four multi-command line options supported by raxmlGUI. The files are all generated in a single working directory, so the files produced by one step of the workflow are automatically available for the next step.
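The chaining described above can be sketched in Python. This is a minimal illustration, not the service's implementation: the helper names chain and run_chain are hypothetical, and the two steps are placeholder shell commands rather than RAxML calls. It shows only that commands joined with && share one working directory, so a file written by one step can be read by the next.

```python
import shlex
import subprocess
import tempfile


def chain(commands):
    """Join argument lists into one shell line separated by &&."""
    return " && ".join(shlex.join(argv) for argv in commands)


def run_chain(commands, workdir):
    """Run the chained line in workdir; a step runs only if the previous one succeeded."""
    return subprocess.run(chain(commands), shell=True, cwd=workdir,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Placeholder steps (not RAxML): step 1 writes a file, step 2 reads it.
    steps = [["sh", "-c", "echo tree > step1.out"], ["cat", "step1.out"]]
    with tempfile.TemporaryDirectory() as workdir:
        print(run_chain(steps, workdir).stdout, end="")
```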

Single command line workflows can be run using the raxmlhpc8_rest_xsede.xml interface. Separate documentation for that interface is available.

Instructions for this interface:

To use the raxmlhpc2_workflow.xml interface, the user first selects the workflow type they want.

specify_workflow_=

The choices are the four multi-command line options from raxmlGUI: Fast Tree Search (FTS), Bootstrap/Consensus Tree (BOOTCON), Maximum Likelihood Search (MLS), and Maximum Likelihood/Thorough Bootstrap (MLTB). Default is FTS.

The workflow selection establishes the command line for the first step in the workflow, and confines possible subsequent steps to the subset supported by raxmlGUI. For an overview of the options available for the first command line, please see the raxmlhpc8_rest_xsede.xml documentation. The capabilities can be expanded if usage requires it; just let us know your needs.

The user then chooses a set of parameters (model selection, partition uploads, etc.) that are allowed for all workflows, as well as parameters specific to the chosen workflow type.

This interface does not support changing RAxML’s -f <foo> options; only the -f values that raxmlGUI uses are supported.

Supported multi-workflow parameters:

The parameters listed below are relevant to workflows supported by raxmlGUI. Most other RAxML parameters are supported if they do not conflict with the workflow, but are described elsewhere.

runtime_ - Float - required parameter. Default=0.25

specify_workflow_ - Excl - required parameter. Allowed values are FTS, MLTB, MLS, and BOOTCON. Default=FTS

altrun_number_ - Integer - Specify the number of alternative runs on distinct starting trees (-#/-N), for MLTB and MLS workflows only. Values between 2 and 1000 are allowed. Default=10.

choose_bootstop_ - Excl - Specify the bootstrap protocol: a specific number of bootstraps, or automatic bootstopping. Only for specify_workflow_=MLTB and BOOTCON. Allowed values are specify and bootstop. Default=specify.

bootstrap_value_ - Integer - Number of bootstrap iterations (-N). Only if choose_bootstop_=specify. Default=100

bootstopping_type_ - Excl - Select the bootstopping criterion (-N). Only if choose_bootstop_=bootstop. Allowed values are autoFC, autoMR, autoMRE, and autoMRE_IGN. Default=autoMRE
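The constraints in the list above can be checked before a job is submitted. The following is a minimal sketch under stated assumptions: check_shared_params and its messages are hypothetical, the parameter names use the trailing-underscore spelling (choose_bootstop_), and the rules are transcribed from the descriptions above rather than taken from the interface XML.

```python
# Hypothetical pre-submission check; rules are transcribed from the
# parameter descriptions in this document, not from the interface XML.
WORKFLOWS = {"FTS", "MLTB", "MLS", "BOOTCON"}
BOOTSTOP_TYPES = {"autoFC", "autoMR", "autoMRE", "autoMRE_IGN"}


def check_shared_params(params):
    """Return a list of problems found in a dict of parameter name -> value."""
    problems = []
    workflow = params.get("specify_workflow_", "FTS")  # documented default
    if workflow not in WORKFLOWS:
        problems.append(f"unknown workflow: {workflow}")

    if "altrun_number_" in params:
        if workflow not in {"MLTB", "MLS"}:
            problems.append("altrun_number_ applies only to MLTB and MLS")
        elif not 2 <= int(params["altrun_number_"]) <= 1000:
            problems.append("altrun_number_ must be between 2 and 1000")

    protocol = params.get("choose_bootstop_", "specify")  # documented default
    if "choose_bootstop_" in params:
        if workflow not in {"MLTB", "BOOTCON"}:
            problems.append("choose_bootstop_ applies only to MLTB and BOOTCON")
        elif protocol not in {"specify", "bootstop"}:
            problems.append("choose_bootstop_ must be specify or bootstop")

    if "bootstrap_value_" in params and protocol != "specify":
        problems.append("bootstrap_value_ requires choose_bootstop_=specify")

    if "bootstopping_type_" in params:
        if protocol != "bootstop":
            problems.append("bootstopping_type_ requires choose_bootstop_=bootstop")
        elif params["bootstopping_type_"] not in BOOTSTOP_TYPES:
            problems.append("unknown bootstopping_type_")
    return problems
```

An empty list means none of the documented constraints were violated; the service itself may apply further checks not described here.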

Specific workflow options:

specify_workflow_=FTS

FTS=Fast Tree search:

Example:
raxmlHPC-PTHREADS -T 8 -f E -p 12345 -m GTRGAMMA -n infile.tre -s infile.txt -O
&& raxmlHPC-PTHREADS -T 8 -f e -t RAxML_fastTree.infile.tre -n brL.infile.tre -s infile.txt -m GTRGAMMA -O
&& raxmlHPC-PTHREADS -T 8 -f J -t RAxML_result.brL.infile.tre -n sh.infile.tre -s infile.txt -m GTRGAMMA -O

FTS settable parameters, all available only if specify_workflow_=FTS

outsuffix_FTS_ - String - required. Set a name for output files in the FTS workflow. Delivered as -n <string>.tre. Default=infile.

fasttreesearch_workflow2_ - Switch (0/1) - Optimize model parameters and branch lengths (-f e). Default=0. This switch turns on the second step of the workflow, the optimization. The step is optional.

fasttreesearch_workflow3_ - Switch (0/1) - Find SH-like values (-f J). Default=0. This switch turns on the third step of the workflow. The third step is also optional, and the second and third steps are independent of each other.

parsimony_seed_val2_ - Integer - Enter a random seed value for parsimony inferences for the SH-like value step. Default=12345
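As an illustration, the FTS settings can be collected and written out in the name=value notation used in this document. This is only a sketch: fts_settings is a hypothetical helper, the trailing underscore on fasttreesearch_workflow3_ is assumed by analogy with the other parameter names, and sending parsimony_seed_val2_ only when the SH-like step is switched on is an assumption, not documented behavior.

```python
def fts_settings(outsuffix="infile", optimize=False, sh_like=False, sh_seed=12345):
    """Render FTS workflow settings as name=value lines (hypothetical helper)."""
    settings = {
        "toolId": "RAXMLHPC2_WORKFLOW",
        "runtime_": "0.25",
        "specify_workflow_": "FTS",
        "outsuffix_FTS_": outsuffix,
        # Steps 2 and 3 are optional and are switched on independently.
        "fasttreesearch_workflow2_": "1" if optimize else "0",
        "fasttreesearch_workflow3_": "1" if sh_like else "0",
    }
    if sh_like:
        # Assumption: the seed only matters when the SH-like step runs.
        settings["parsimony_seed_val2_"] = str(sh_seed)
    return "\n".join(f"{name}={value}" for name, value in settings.items())


if __name__ == "__main__":
    print(fts_settings(optimize=True, sh_like=True))
```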

specify_workflow_=MLTB

MLTB=Maximum Likelihood/Thorough Bootstrapping

Example: when runs = 10, no brL (-k)

raxmlHPC-PTHREADS.exe -T 2 -b 762 -m PROTGAMMABLOSUM62 -p 801 -N 100 -o 1_Euglena_gracilis -s infile2.txt -n testR -O
&& raxmlHPC-PTHREADS.exe -T 2 -f d -m PROTGAMMABLOSUM62 -o 1_Euglena_gracilis -s infile2.txt -N 10 -n testB -p 603 -O
&& raxmlHPC-PTHREADS.exe -T 2 -f b -t RAxML_bestTree.testB -z RAxML_bootstrap.testR -m PROTGAMMABLOSUM62 -s infile2.txt -n test.tre -O

MLTB settable parameters, values are only allowed if specify_workflow_=MLTB

mulparambootstrap_seed_val_ - Integer - Enter a random seed value for multi-parametric bootstrapping. This appears in the command line as -b <value>. Default=12345

outsuffix_MLTB1_ - String - Name for output files from the bootstrapping step, step 1. Delivered as -n $value to the first command line. Also used to name the -z file in step 3. Default=testR.

parsimony_seed_val_MLTB2_ - Integer - Enter a random seed value for the MLTB sampling step (step 2). Default=12345.

outsuffix_MLTB2_ - String - Name for the output files in step 2 of the ML + thorough bootstrap workflow. Delivered as -n $value to the second command line. Also used to name the -t file in step 3. Default=testB.

outsuffix_MLTB3_ - String - Set a name for output files in step 3 of the ML Thorough Bootstrapping analysis. Delivered as -n $value to the third command line. Default=MLTB_output
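The way the step 1 and step 2 suffixes feed step 3 can be sketched as follows. The helper mltb_step3_inputs is hypothetical; the RAxML_bootstrap.<suffix> and RAxML_bestTree.<suffix> file name patterns are taken from the example command lines above, not from the interface definition.

```python
def mltb_step3_inputs(outsuffix_mltb1="testR", outsuffix_mltb2="testB"):
    """Map the step 1 and step 2 suffixes to the files step 3 reads (hypothetical helper)."""
    return {
        # Bootstrap replicates written by step 1 (-n <outsuffix_MLTB1_>).
        "-z": f"RAxML_bootstrap.{outsuffix_mltb1}",
        # Best ML tree written by step 2 (-n <outsuffix_MLTB2_>).
        "-t": f"RAxML_bestTree.{outsuffix_mltb2}",
    }


if __name__ == "__main__":
    for flag, filename in mltb_step3_inputs().items():
        print(flag, filename)
```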

specify_workflow_=MLS

MLS=Maximum Likelihood Search

Example:
raxmlHPC-PTHREADS.exe -T 2 -f d -m PROTGAMMABLOSUM62F -N 1 -p 130 -o 1_Euglena_gracilis -s infile2.txt -n infile2.tre -O
&& raxmlHPC-PTHREADS.exe -T 2 -f J -m PROTGAMMABLOSUM62F -t RAxML_bestTree.infile2.tre -n sh.infile2.tre -s infile2.txt -O
&& cat RAxML_result.infile2.tre* > combined_results.infile2.tre

MLS settable parameters, values are only allowed if specify_workflow_=MLS

outsuffix_MLS_ - String - Set a name for output files in the MLS workflow. Delivered as -n $value to the first command line. Also used to name subsequent output files. Default=infile

mlsearch_shlike_ - Switch (0/1) - Add the second step in this workflow, find SH-like values (-f J). Default=0

mlsearch_combine_ - Switch (0/1) - Concatenate (cat) the results into a single tree file. Default=0. This activates the third step in the workflow. This step can be run without the second step.
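How the two MLS switches select steps can be sketched as follows. The function mls_steps and the step labels are hypothetical; the sketch only restates the rule above that the first step always runs and that the second and third steps are switched on separately.

```python
def mls_steps(mlsearch_shlike=0, mlsearch_combine=0):
    """List the MLS steps that the two switches activate (hypothetical helper)."""
    steps = ["ML search (-f d)"]  # step 1 always runs
    if mlsearch_shlike:
        steps.append("SH-like values (-f J)")
    if mlsearch_combine:
        # Step 3 does not depend on step 2 being switched on.
        steps.append("combine results (cat)")
    return steps


if __name__ == "__main__":
    print(mls_steps(mlsearch_shlike=0, mlsearch_combine=1))
```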

specify_workflow_=BOOTCON

BOOTCON=Bootstrap/Consensus Search

Example:
Bootstrap + Consensus: Two Steps
raxmlHPC-PTHREADS.exe -T 2 -m GTRGAMMA -n infile.tre -s infile.txt -O -x 4 -N 100 -p 368
&& raxmlHPC-PTHREADS.exe -T 2 -m GTRGAMMA -n con.infile.tre -J MR -z infile.tre

Both steps are required.

BOOTCON settable parameters, values are only allowed if specify_workflow_=BOOTCON

outsuffix_BOOTCON_ - String - Set a name for output files from the Rapid Bootstrap Step. Delivered as -n $value on the first line of the workflow. Default=boot

rapidbootstrap_seed_val_ - Integer - Enter a random seed value for rapid bootstrapping (-x). Default=12345

outsuffix_BOOTCON2_ - String - Name for output files from the Consensus Tree Step. Delivered on the second line as -n $value. Default=con.tre.

Working Examples/Test Suites for raxmlhpc2_workflow interface

A test suite for the interface is available in the read-only CIPRES SVN.
For the multi-step workflows, multiple tests are usually provided. Together they show how the workflow options can be combined. There is at least one test per raxmlGUI option.

http://svn.sdsc.edu/repo/scigap/trunk/rest/pycipres/testdata/tooltests/variants/raxmlgui/

There are also tests here for the single command line options that are run through the raxmlhpc8_rest_xsede.xml interface.

Multi-line tests use the raxmlhpc2_workflow.xml interface:

These include:

fast_tree_search (FTS), mlsearch (MLS), ML_thorough_bootstrap (MLTB), and bootstrap_consensus (BOOTCON).

Although all of these can generate multi-step workflows, some create just a single command line if certain options are not invoked.

The file documentation.txt for each validation test references an example working directory that contains the output file from that test experiment. Under the directory name you will also find the command line produced by this test. The output files are found at
http://svn.sdsc.edu/repo/scigap/trunk/rest/pycipres/testdata/tooltests/variants/raxmlgui/results_examples/ 

The input files and parameter settings that create the command line for each test are given in the files testInput.properties and testParam.properties, and all necessary input files are present in the test directory as well.
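For readers who want to inspect these settings programmatically, a minimal sketch of reading name=value lines follows. It assumes the files use simple name=value lines with optional # comments; the sample text is hypothetical and is not copied from the test suite, and parse_properties is an illustrative helper rather than part of pycipres.

```python
def parse_properties(text):
    """Parse simple name=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings


# Hypothetical contents, for illustration only.
example = """\
# parameter settings for one test
toolId=RAXMLHPC2_WORKFLOW
runtime_=0.25
specify_workflow_=MLS
"""

if __name__ == "__main__":
    print(parse_properties(example))
```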

Tests for raxmlGUI options that can only generate a single command line, no matter the options, are also provided. These tests always use the raxmlhpc8_rest_xsede.xml interface, to maximize the flexibility available to users. (The raxmlhpc8_xsede interface was created to accommodate the commands required by raxmlGUI.)

Single command line options that use raxmlhpc8_rest_xsede are:
ancestral_states, ml_rapidbootstrap, pairwise_distances

There is one test for each of these.

Please report any issues you find.

If there is a tool or a feature you need, please let us know.