RAxML HPC8 REST


RAxML REST interface: raxmlhpc8_rest_xsede

On January 8, 2021, CIPRES shifted to a new REST interface that uses the Expanse cluster.

The elements that have changed are shown in bold below. Items that may be critical are highlighted in red.

Please contact us if you have questions.

This is one of several RAxML interfaces that can be used from the REST API. Others include raxmlhpc2_tgb.xml, raxmlhpc2_tgb_rest.xml, raxmlhpc2bb.xml and raxmlhpc2_workflow.xml. This is the recommended interface for development using RAxML, unless you want to use one of the few specific multi-command-line workflows that are supported by raxmlhpc2_workflow.xml.

The tool ID for this interface is RAXMLHPC8_REST_XSEDE. It is selected with toolId=RAXMLHPC8_REST_XSEDE

This interface was constructed to simplify submissions from REST services. We found that some options available through the original browser interface (raxmlhpc2_tgb.xml) are not easily accessible through REST because of certain PISE evaluation conflicts. The raxmlhpc2_tgb.xml and raxmlhpc2_tgb_rest.xml interfaces will continue to be supported on the CRA, however. The restructuring makes it easier to add options, and all future development will be done on this interface, so we recommend that users adopt it to access new features made available through RAxML. This interface also exposes a "generic" parameter, which allows a user to enter native RAxML command line flags.

Configuring the analysis
The restructuring is based on the fact that the -f options that RAxML offers are mutually exclusive. They are all gathered under a single parameter, "select_analysis_". The default is select_analysis_=fd. This selection enters no value on the command line, because RAxML defaults to the -f d option. The naming convention for other analysis options is select_analysis_ followed by the value of the RAxML native flag with any dashes and white space removed. The -f a option is chosen with select_analysis_=fa, the -f A option with select_analysis_=fA, the -f o option with select_analysis_=fo, and so forth.

vparam.select_analysis_ - (Excl) - Allowed values:
    fa: Rapid bootstrap analysis / search for best-scoring ML tree (-f a)
    fd: Use the default, faster rapid hill-climbing algorithm (-f d) (Default)
    fD: Rapid hill-climbing searches that also generate RELL bootstraps (-f D)
    fb: Draw bipartitions onto a single tree topology (-f b)
    fA: Compute Marginal Ancestral States using a rooted reference tree (-f A)
    fJ: Compute SH-like support values on a given tree passed via -t (-f J)
    fe: Optimize model parameters+branch lengths for given input tree (-f e)
    fg: Compute per site log Likelihoods for one or more trees passed via -z (-f g) (new)
    fG: Compute per site log Likelihoods for one or more trees passed via -z (-f G) (new)
    fh: Compute a log likelihood test (-f h)
    fT: Do a Final Optimization of ML Tree (-f T)
    fx: Compute pair-wise ML distances (-f x; GAMMA models only)
    fk: Fix long branch lengths in partitioned data sets with missing data (-f k)
    fE: Very fast experimental tree search (-f E)
    fu: Morphological weight calibration using maximum likelihood (-f u)
    fv: Classify a bunch of environmental sequences into a reference tree using thorough read insertions (-f v)
    fo: Use the old and slower rapid hill-climbing without the heuristic cutoff (-f o)
    I: Use a posteriori bootstopping (-I)
    J: Compute majority rule consensus tree (-J)
    y: Only compute a randomized parsimony starting tree (-y)
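
The naming convention for select_analysis_ values can be sketched in Python. The helper name select_analysis_value is hypothetical and not part of the CIPRES API; it only strips dashes and white space from a native flag and does not check that the result is one of the allowed values listed above.

```python
def select_analysis_value(native_flag: str) -> str:
    """Derive a select_analysis_ value from a native RAxML flag.

    Hypothetical helper: removes dashes and white space, as the naming
    convention describes. It does not validate the result.
    """
    return "".join(
        ch for ch in native_flag if ch != "-" and not ch.isspace()
    )


# Print the parameter assignment for a few native flags.
for flag in ("-f a", "-f A", "-f o", "-J", "-y", "-I"):
    print(f"{flag!r} -> select_analysis_={select_analysis_value(flag)}")
```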

In addition, three other options are specified with "select_analysis_" because they are mutually exclusive with any -f option. The options are -J (compute a majority rule consensus tree), -y (generate a parsimony tree and quit), and -I (a posteriori bootstopping). These correspond to select_analysis_=J, select_analysis_=y, and select_analysis_=I, respectively. For select_analysis_=J, a second parameter, "specify_mr_", is set. If select_analysis_=J is submitted and no value for specify_mr_ is submitted, the default, specify_mr_=MR (Majority rule), is in force. specify_mr_ can also be set to MRE (Extended majority rule), STRICT (Strict Majority), MR_DROP, or STRICT_DROP.

Similarly, for select_analysis_=I, a second parameter, aposterior_bootstopping_, is used to specify the bootstopping criterion. If select_analysis_=I is submitted and no value for aposterior_bootstopping_ is specified, the default is autoMRE. Allowed values for aposterior_bootstopping_ are autoFC, autoMR, autoMRE, and autoMRE_IGN. These are analogous to the automatic bootstopping values supported by RAxML. The option select_analysis_=y has no related parameters.
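
As a rough sketch, the pairing between select_analysis_=J or select_analysis_=I and its second parameter could be checked on the client side before submission. The function name check_secondary is hypothetical, and the allowed values are simply those named in the text above; the service itself remains the authority on what is accepted.

```python
# Second parameter and allowed values for each analysis that has one,
# taken from the description above.
SECONDARY_PARAMETER = {
    "J": ("specify_mr_", {"MR", "MRE", "STRICT", "MR_DROP", "STRICT_DROP"}),
    "I": ("aposterior_bootstopping_",
          {"autoFC", "autoMR", "autoMRE", "autoMRE_IGN"}),
}


def check_secondary(select_analysis: str, value: str) -> dict:
    """Return the two vparam entries, or raise ValueError (hypothetical helper)."""
    if select_analysis not in SECONDARY_PARAMETER:
        raise ValueError(
            f"select_analysis_={select_analysis} has no related parameter"
        )
    name, allowed = SECONDARY_PARAMETER[select_analysis]
    if value not in allowed:
        raise ValueError(f"{name}={value} is not one of {sorted(allowed)}")
    return {
        "vparam.select_analysis_": select_analysis,
        f"vparam.{name}": value,
    }


print(check_secondary("J", "STRICT"))
print(check_secondary("I", "autoFC"))
```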

Bootstrapping:
Setting select_analysis_ to fo, fa, or fd enables the bootstrapping parameter "choose_bootstrap_" to be set. To turn on bootstrapping, use choose_bootstrap_=x or choose_bootstrap_=b (note that b is not valid with the -f a option). You can choose how bootstrapping is stopped with the choose_bootstop_ parameter. The value "specify" requests a specific number of bootstrap iterations, while the value "bootstop" requests automatic bootstopping. The default is specify. The number of bootstraps can be set to any integer up to 1000 using bootstrap_value_=(some integer); the default is 100 bootstraps.

Setting choose_bootstop_=bootstop means that RAxML will stop the bootstrapping automatically. The stopping criterion is chosen with the bootstopping_type_ parameter, where bootstopping_type_=autoFC, autoMR, autoMRE, or autoMRE_IGN. The default is autoMRE.
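
The bootstrapping rules above can be collected into a small client-side check. This is a minimal sketch: the function name bootstrap_params is hypothetical, and the lower bound of 1 on the number of iterations is an assumption (the text only gives the upper bound of 1000).

```python
BOOTSTRAP_ANALYSES = {"fa", "fd", "fo"}
BOOTSTOP_TYPES = {"autoFC", "autoMR", "autoMRE", "autoMRE_IGN"}


def bootstrap_params(select_analysis, bootstrap_type,
                     iterations=None, bootstop_type=None):
    """Build vparam entries for a bootstrapping run (hypothetical helper)."""
    if select_analysis not in BOOTSTRAP_ANALYSES:
        raise ValueError("bootstrapping requires select_analysis_ fa, fd, or fo")
    if bootstrap_type not in ("x", "b"):
        raise ValueError("choose_bootstrap_ must be x or b")
    if select_analysis == "fa" and bootstrap_type == "b":
        raise ValueError("b is not valid with the -f a option")
    params = {
        "vparam.select_analysis_": select_analysis,
        "vparam.choose_bootstrap_": bootstrap_type,
    }
    if bootstop_type is not None:
        # Automatic bootstopping: RAxML decides when to stop.
        if iterations is not None:
            raise ValueError("give iterations or bootstop_type, not both")
        if bootstop_type not in BOOTSTOP_TYPES:
            raise ValueError(f"unknown bootstopping type {bootstop_type}")
        params["vparam.choose_bootstop_"] = "bootstop"
        params["vparam.bootstopping_type_"] = bootstop_type
    else:
        # Fixed number of iterations; 100 is the documented default.
        count = 100 if iterations is None else iterations
        if not 1 <= count <= 1000:  # lower bound of 1 is an assumption
            raise ValueError("bootstrap_value_ must be between 1 and 1000")
        params["vparam.choose_bootstop_"] = "specify"
        params["vparam.bootstrap_value_"] = str(count)
    return params


print(bootstrap_params("fa", "x", iterations=250))
print(bootstrap_params("fd", "b", bootstop_type="autoMRE"))
```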

Alternative runs:
Alternative runs can be turned on using the switch specify_runs_=1. The default is 0. The number of runs is then set using altrun_number_=(any integer up to 1000). The default value is 10.

Mesquite option
The Mesquite option is enabled using mesquite_output_=1 (default=0). It is currently not allowed on runs that use bootstrapping or alternative runs, because these runs call the hybrid code, which does not currently handle the mesquite option.
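
A client could guard against the Mesquite restriction before submitting. The sketch below assumes the parameters are held in a dictionary of strings keyed by their full names; mesquite_allowed is a hypothetical helper, not part of the service.

```python
def mesquite_allowed(params: dict) -> bool:
    """Return True if mesquite_output_=1 may be combined with params.

    Hypothetical helper: the option is rejected when bootstrapping or
    alternative runs are requested.
    """
    bootstrapping = params.get("vparam.choose_bootstrap_") in ("x", "b")
    alternative_runs = params.get("vparam.specify_runs_", "0") == "1"
    return not (bootstrapping or alternative_runs)


plain_run = {"vparam.select_analysis_": "fd"}
multi_run = {"vparam.specify_runs_": "1", "vparam.altrun_number_": "10"}
print(mesquite_allowed(plain_run))
print(mesquite_allowed(multi_run))
```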

Generic Parameter
The generic parameter, generic_=, allows users to enter native RAxML command line flags that are not otherwise supported by our interface. There are restrictions on what can be added, and options that require file uploads are not currently supported through the generic parameter. If you want to add a -f option that is not supported, leave select_analysis_ at its default so that there aren't two -f options on the command line.
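
The -f rule for the generic parameter lends itself to a simple pre-submission check. This sketch assumes the generic flags are written as a space-separated string with -f as its own token; check_generic is a hypothetical name, and the other restrictions mentioned above are not modeled.

```python
import shlex


def check_generic(generic_flags: str, select_analysis: str = "fd") -> None:
    """Raise ValueError if the submission would carry two -f options.

    Hypothetical helper: select_analysis_=fd adds nothing to the command
    line, so it is the only value compatible with a -f in generic_.
    """
    tokens = shlex.split(generic_flags)
    if "-f" in tokens and select_analysis != "fd":
        raise ValueError(
            "generic_ contains -f; leave select_analysis_ at its default (fd)"
        )


check_generic("-f z")  # accepted: select_analysis_ is at its default
try:
    check_generic("-f z", select_analysis="fa")
except ValueError as err:
    print(err)
```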

RAXMLHPC8_REST_XSEDE Parameter Summary:
Note that when the term "Excl" follows a parameter name, it means that only one of several possible values is accepted. The accepted values are listed with the parameter. The term "Switch" means that only 0 or 1 is accepted, and the expression evaluates to either true or false. Other parameter type terms, such as Integer, String, and Float, are self-explanatory.
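
To illustrate the two type terms, here is a minimal sketch of how a client might check values before sending them. The helper names are hypothetical, and values are treated as strings because that is how they travel in a form submission.

```python
def is_valid_switch(value: str) -> bool:
    """A Switch accepts only 0 or 1."""
    return value in ("0", "1")


def is_valid_excl(value: str, allowed) -> bool:
    """An Excl accepts exactly one value from its listed set."""
    return value in allowed


# Allowed values for choose_bootstop_, as listed in this document.
CHOOSE_BOOTSTOP = ("specify", "bootstop")

print(is_valid_switch("1"))
print(is_valid_switch("yes"))
print(is_valid_excl("bootstop", CHOOSE_BOOTSTOP))
print(is_valid_excl("auto", CHOOSE_BOOTSTOP))
```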

Input File Parameters:
Input File Parameters must be accompanied by an upload file (see RAxML manual for format):
input.treetop_ - InFile - Supply a starting tree (Not available when doing rapid bootstrapping, -x) (-t)
input.constraint_ - InFile - Constraint (-g)
input.binary_backbone_ - InFile - Binary Backbone (-r)
input.partition_ - InFile - Use a mixed/partitioned model? (-q)
input.exclude_file_ - InFile - Create an input file that excludes the range of positions specified in this file (-E)
input.set_weights_ - InFile - Weight characters as specified in this file (-a)
input.user_prot_matrix_ - InFile - (protein data only) Upload a Custom Protein Substitution Matrix
input.user_prot_matrixq1_ - InFile - (protein only, when mulcustom_aa_matrices_=1; partition_ file is also required) Select the First Protein Substitution Matrix Called in Your Partition File
input.user_prot_matrixq2_ - InFile - (protein only, when user_prot_matrixq1_ is defined, partition_ file is required) Select the Second Protein Substitution Matrix Called in Your Partition File
input.user_prot_matrixq3_ - InFile - (protein only, when user_prot_matrixq2_ is defined, partition_ file is required) Select the Third Protein Substitution Matrix Called in Your Partition File
input.user_prot_matrixq4_ - InFile - (protein only, when user_prot_matrixq3_ is defined, partition_ file is required) Select the Fourth Protein Substitution Matrix Called in Your Partition File
input.user_prot_matrixq5_ - InFile - (protein only, when user_prot_matrixq4_ is defined, partition_ file is required) Select the Fifth Protein Substitution Matrix Called in Your Partition File
input.sec_str_file_ - InFile - (RNA structure only) Upload a Secondary Structure File (-S)
input.aposterior_topologies_ - InFile - (only when select_analysis_=I) Upload a file containing topologies for a posteriori bootstopping.

Visible Parameters:
General run characteristics:
vparam.runtime_   - Float - less than 168, default = 0.25
vparam.more_memory_ - Switch - use if a data set requires more than 20 GB of memory. (default=0)
vparam.nchar_ - Integer - (only when more_memory_=1) Enter the number of patterns in your dataset. This configures the run for the right amount of memory. Default=1000.
vparam.ntax_ - Integer - (only when more_memory_=1)  Enter the number of taxa in your dataset. This configures the run for the right amount of memory. No default.
vparam.many_partitions_ - Switch - My data set has more than 100 partitions.
vparam.use_ml_freqs_ - Switch - Make an ML estimate of frequencies.

vparam.specify_ml_ - Switch - Specify an ML estimate of base frequencies.
vparam.number_cats_ - Integer - (only for CAT models) Specify the number of distinct rate categories (-c)
vparam.provide_parsimony_seed_ - Switch (0/1) - Specify a random seed value for parsimony inferences, generally required if -t isn't used (-p) default=1
vparam.parsimony_seed_val_ - Integer - (only if provide_parsimony_seed_=1) Enter a random seed value for parsimony inferences (gives reproducible results from random starting tree) default 12345
vparam.datatype_ - Excl - protein/dna/rna(structure)/binary/multi. default=dna
vparam.partitionUnder_ - Excl - Choose the model for partitioning: K80, JC69, HKY85; DNA only (no default)
vparam.outgroup_ - String - Outgroup (one or more comma-separated outgroups, no blank space allowed)
vparam.invariable_ - Excl - Estimate proportion of invariable sites (GTRGAMMA + I). Values: I, null (default=null)
vparam.use_emp_freqs_ - Excl - Use empirical frequencies. Protein only. Values: null, F  (default=null)
vparam.use_ml_freqs_ - Excl - Make an ML estimate of frequencies; X is the only allowed value.
vparam.ascertainment_ - Excl - Use an ascertainment bias correction. Values: null, ASC_  (default=null)
vparam.ascertainment_corr - Excl - Select the ascertainment correction. Values: lewis, felsenstein, stamatakis (default = lewis)
vparam.rearrangement_yes_ - Switch (0/1) -  Specify an initial rearrangement setting (-i) (default=0)
vparam.number_rearrange_ - Integer - (only if rearrangement_yes_ =1) Specify the distance from original pruning point (-i), (required if rearrangement_yes_=1). No default.
vparam.estimate_perpartbrlen_ - Switch (0/1) -  Estimate individual per-partition branch lengths (-M) (default=0)
vparam.specify_ML_ - Switch (0/1) - Specify an ML estimate of base frequencies (GTRGAMMA + X)
vparam.printbrlength_ - Switch - Print branch lengths (-k). (default=0)
vparam.disable_seqcheck_ -  Switch (0/1) - Disable checking for sequences with no values (-O). (default=0)
vparam.mesquite_output_  - Switch (0/1) - Print output files that can be parsed by Mesquite. (-mesquite). (default=0)
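
Several of the switches above name the RAxML flag they control. As a rough illustration of that correspondence, the sketch below maps a few switches to their flags. The table contains only flags stated in this summary; the helper name switch_flags is hypothetical, and the order in which the real wrapper emits flags is not specified here.

```python
# Switch parameters and the flag each one adds when set to 1,
# as stated in the parameter summary.
SWITCH_FLAGS = {
    "vparam.disable_seqcheck_": "-O",
    "vparam.printbrlength_": "-k",
    "vparam.estimate_perpartbrlen_": "-M",
    "vparam.mesquite_output_": "-mesquite",
}


def switch_flags(params: dict) -> list:
    """Return the flags for every listed switch that is set to "1"."""
    return [
        flag for name, flag in SWITCH_FLAGS.items()
        if params.get(name) == "1"
    ]


submitted = {
    "vparam.disable_seqcheck_": "1",
    "vparam.printbrlength_": "1",
    "vparam.estimate_perpartbrlen_": "0",
}
print(" ".join(switch_flags(submitted)))
```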

Data type-specific options/model setting
****************************************************************************
nucleic acid options (only when datatype=dna)
****************************************************************************
vparam.dna_gtrcat_ - Excl - GTRGAMMA / GTRCAT for the bootstrapping phase, and GTRGAMMA for the final tree inference (GTRCAT=default)
****************************************************************************
protein options (only when datatype=protein)
****************************************************************************
vparam.prot_sub_model_ - Excl - PROTGAMMA or PROTCAT default=PROTCAT
vparam.prot_matrix_spec_ - Excl - Protein Substitution Matrix. Values allowed: DAYHOFF,DCMUT,JTT,MTREV,WAG,RTREV,CPREV,VT,BLOSUM62,MTMAM,LG,MTART,MTZOA,PMB,HIVB,HIVW,JTTDCMUT,FLU,DUMMY,DUMMY2,AUTO,LG4M,LG4X,PROT_FILE,GTR_UNLINKED,GTR default=DAYHOFF
vparam.mulcustom_aa_matrices_ -  Switch (0/1) - Use a Partition file that specifies AA Matrices. Turns on the ability to upload 1-5 partition files. default=0
****************************************************************************
RNA structure options (only when datatype=rna)
****************************************************************************
vparam.rna_model_ - Excl - Use an RNA Secondary Structure Substitution Model (-A). Values allowed: S6A, S6B, S6C, S6D, S6E, S7A, S7B, S7C, S7D, S7E, S7F, S16A, S16B. default=S16A
****************************************************************************
Binary data options (only when datatype=binary)
****************************************************************************
vparam.bin_model_ - Excl - Binary data model (-m) BINCAT/BINGAMMA, default = BINCAT
****************************************************************************
Multiple State Morphological Matrix Options (only when datatype=multi)
****************************************************************************
vparam.multi_model_ - Excl - Multiple State Data Model (-m) MULTICAT/MULTIGAMMA default MULTICAT
vparam.choose_multi_model_ - Excl - Select a Multiple state data model (-K) ORDERED, MK, GTR. default=GTR
****************************************************************************
Select the Analysis
****************************************************************************
vparam.specify_runs_ - Switch (0/1) - Specify the number of alternative runs on distinct starting trees? (-#/-N) default=0.
vparam.altrun_number_ - Integer - only when specify_runs_=1. Enter the number of alternative runs. default=10
vparam.no_bfgs_ - Switch (0/1) - turn off the new bfgs algorithm. default=0
vparam.intermediate_treefiles_ - Switch (0/1) -Write intermediate tree files to a file (-j) default = 0
vparam.convergence_criterion_ - Switch (0/1) - Use ML search convergence criterion. (-D) default=0
vparam.specify_mr_ - Excl - only when select_analysis_=J. Specify majority rule consensus tree (-J) technique. MRE, STRICT, MR_DROP, STRICT_DROP. default=MRE
****************************************************************************
Bootstrapping
****************************************************************************
vparam.choose_bootstrap_ - Excl - allowed values: b, x, or null. Only when select_analysis_=fa, fd, or fo. Choose a Bootstrapping Type. No default
vparam.seed_value_ - Integer - Enter a random seed value for bootstrapping; delivers -x or -b depending on the value of choose_bootstrap_. default=12345
vparam.choose_bootstop_ - Excl - Allowed values: specify or bootstop, Do you want to let RAxML stop bootstrapping automatically, or get a specific number of bootstraps? default=specify.
vparam.bootstrap_value_  - Integer - (only when choose_bootstop_=specify). Bootstrap iterations (-N) default = 100
vparam.bootstopping_type_ - Excl - (only when choose_bootstop_=bootstop). Select Bootstopping Criterion: (-N) autoFC, autoMR,autoMRE, autoMRE_IGN, default=autoMRE
vparam.aposterior_bootstopping_ - Excl - (only when select_analysis_=I). Select the criterion for a posteriori bootstopping analysis (-I). Requires a set of topologies in a file via -z, supplied with input.aposterior_topologies_. Allowed values: autoFC, autoMR, autoMRE, autoMRE_IGN. default=autoMRE
                       
EXAMPLES

Here are some samples:

A rapid bootstrap run:

tool=RAXMLHPC8_REST_XSEDE
input.infile_=infile.txt
vparam.runtime_=0.50
vparam.dna_gtrcat_=GTRGAMMA
vparam.select_analysis_=fa
vparam.choose_bootstrap_=x
vparam.disable_seqcheck_=1
vparam.outgroup_=Hetinte
vparam.printbrlength_=1

produces this command line:
raxmlHPC-HYBRID* -f a -n result -s infile.txt -N 100 -p 12345
-m GTRGAMMA -x 12345 -O -k  -o Hetinte
* (-T 8 will be added by the wrapper)
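
For readers scripting submissions, the parameter list of the rapid bootstrap run above can be turned into curl form arguments. This is a sketch under stated assumptions, not the service's documented client: the -u login, the cipres-appkey header, the use of -F form fields with an @ prefix for uploaded files, and the placeholders BASE_URL, USERNAME, PASSWORD, and APP_KEY are all assumptions to be checked against the CIPRES REST API user guide. The helper name curl_form_args is hypothetical. The script only prints the command; it does not contact any server.

```python
import shlex


def curl_form_args(params: dict) -> list:
    """Turn parameters into curl -F arguments (hypothetical helper).

    Fields whose names start with "input." are treated as file uploads
    and get curl's @ prefix; this convention is an assumption.
    """
    args = []
    for name, value in params.items():
        if name.startswith("input."):
            value = "@" + value
        args += ["-F", f"{name}={value}"]
    return args


# Parameters of the rapid bootstrap example above.
rapid_bootstrap = {
    "tool": "RAXMLHPC8_REST_XSEDE",
    "input.infile_": "infile.txt",
    "vparam.runtime_": "0.50",
    "vparam.dna_gtrcat_": "GTRGAMMA",
    "vparam.select_analysis_": "fa",
    "vparam.choose_bootstrap_": "x",
    "vparam.disable_seqcheck_": "1",
    "vparam.outgroup_": "Hetinte",
    "vparam.printbrlength_": "1",
}

# All connection details below are placeholders, not real endpoints.
command = [
    "curl", "-u", "USERNAME:PASSWORD",
    "-H", "cipres-appkey:APP_KEY",
    "BASE_URL/job/USERNAME",
] + curl_form_args(rapid_bootstrap)

print(shlex.join(command))
```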


An ancestral states example:
tool=RAXMLHPC8_REST_XSEDE
input.infile_=infile.txt
input.treetop_=tree.tre
vparam.runtime_=0.50
vparam.dna_gtrcat_=GTRGAMMA
vparam.select_analysis_=fA
vparam.outsuffix_=result
vparam.disable_seqcheck_=1

produces the command line:
raxmlHPC-PTHREADS* -f A -n result -s infile.txt -t tree.tre -p 12345 -m GTRGAMMA -O
* (-T 8 will be added by the wrapper)

A single replicate using the -f d algorithm:
tool=RAXMLHPC8_REST_XSEDE
input.infile_=infile.txt
vparam.runtime_=0.50
vparam.specify_runs_=1
vparam.altrun_number_=1
vparam.dna_gtrcat_=GTRGAMMA
vparam.parsimony_seed_val_=362
vparam.outsuffix_=infile
vparam.disable_seqcheck_=1

produces the command line:     
raxmlHPC-HYBRID*  -m GTRGAMMA -N 1 -O -p 362  -s infile.txt -n infile
* (-T 8 will be added by the wrapper) 

If there is a tool or a feature you need, please let us know.
