Javascript Cookbook
This page provides sample dynamic expressions for versions 1.x of the Common Workflow Language (CWL). If you are describing a tool using the sbg:draft-2 version of CWL, please see this page for sample expressions.
This page provides examples of expressions that are entered in the tool editor when wrapping a tool for use on CAVATICA. Expressions can be used to dynamically set different tool properties such as arguments, secondary files, metadata, output file names, etc. Dynamic expressions commonly use the predefined objects self and inputs, which are determined at runtime and denote properties of the tool's inputs or outputs in a given execution and the ongoing tool execution (the job). Expressions can be entered anywhere in the Tool Editor where the </> symbol is present. For general information about expressions, please refer to dynamic expressions in tool descriptions.
- Capture the name and content of an input file
- Name output files based on the Sample ID metadata field of input files
- Name output files based on input file names
- Setting metadata fields of output files based on inputs for paired end 1 and 2
- Order input reads based on paired-end metadata
- Configure a tool to unpack a TAR archive provided as its input
Capture the name and content of an input file
The following expression picks out the name of the data file input for each execution of the tool. The dynamic expression is based on the inputs object. We will use the following JavaScript expression to fetch the input file name:
$(inputs.<input_ID_for_data_file>.basename)
In this expression, replace <input_ID_for_data_file> with the ID of the input port that the data file goes into. The basename property holds the last component of the input file's path, that is, the file name including its extension (for example, input_file.bam for /path/to/input_file.bam).
To refer to the content of the file, use the following expression:
$(inputs.<input_ID_for_data_file>.contents)
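As before, <input_ID_for_data_file> refers to the ID of the input port that the data file is provided to. This expression evaluates to the content of the file supplied on that port, that is, the same file whose name is returned by the basename expression above. In CWL v1.x, contents is populated only when loadContents is enabled for the input, and it is limited to 64 KiB.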
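To see what these two expressions evaluate to, the sketch below mimics them in plain JavaScript that can be run outside the platform. The inputs object here is a hand-written mock, and the port ID data_file, the path, and the file contents are assumptions made up for illustration; during a real execution the object is supplied by the platform.

```javascript
// Mock of the inputs object for a single File input.
// The port ID "data_file" and all values are hypothetical.
var inputs = {
  data_file: {
    "class": "File",
    path: "/path/to/sample_1.txt",
    basename: "sample_1.txt",
    contents: "chr1\t100\t200\n"
  }
};

// Equivalent of $(inputs.data_file.basename)
var name = inputs.data_file.basename;

// Equivalent of $(inputs.data_file.contents)
var content = inputs.data_file.contents;

console.log(name);
console.log(content.split("\t").length);
```

The expressions only read properties of the file object, so they do not change the file itself.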
Name output files based on the Sample ID metadata field of input files
The expression below is used to name an output file of a tool based on the value in the Sample ID metadata field obtained from the input files. As some tools allow you to specify the output file name as a command line argument, this expression can be used to define the argument value in the Arguments section.
${
var input_files = [].concat(inputs.<input_port_ID>)
var filename = input_files[0].basename;
if (input_files[0].metadata && input_files[0].metadata.sample_id)
{
var filebase = input_files[0].metadata.sample_id
}
else
{
var filebase = "sample_unknown"
}
return filebase.concat("<file_extension>")
}
The expression works as follows:
The following two lines normalize the input to an array, whether a single file or an array of files was provided, and get the name of the first file.
var input_files = [].concat(inputs.<input_port_ID>)
var filename = input_files[0].basename;
Make sure to replace <input_port_ID> with the corresponding port ID of your app.
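The [].concat(...) idiom is what lets the same expression work for ports that take one file and ports that take an array of files. A minimal sketch, using made-up file objects reduced to the one field that matters here:

```javascript
// Hypothetical File objects, reduced to the basename field.
var single = { basename: "reads.fastq" };
var many = [{ basename: "reads_1.fastq" }, { basename: "reads_2.fastq" }];

// [].concat(x) wraps a single value in a new array,
// and copies the elements of an existing array into a new flat array.
var fromSingle = [].concat(single);
var fromMany = [].concat(many);

console.log(fromSingle.length, fromSingle[0].basename);
console.log(fromMany.length, fromMany[0].basename);
```

In both cases the result is an array, so input_files[0] is always a valid way to reach the first file.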
The next part of the expression checks whether the input file has a value set in the Sample ID metadata field. If there is a value, it is used as the base name of the output file. Otherwise, the base name will be sample_unknown.
if (input_files[0].metadata && input_files[0].metadata.sample_id)
{
var filebase = input_files[0].metadata.sample_id
}
else
{
var filebase = "sample_unknown"
}
Finally, the extension is appended to the base name of the file:
return filebase.concat("<file_extension>")
Make sure to replace <file_extension> with the extension of the tool's output file, including the dot (for example, .bam).
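The logic described above can be tried out locally by wrapping it in an ordinary function. This is only a sketch: the function name outputNameFromSampleId, the port ID input_bam, and the sample values are assumptions for illustration and are not part of the platform.

```javascript
// Hypothetical wrapper around the expression body, so it can run outside the tool editor.
function outputNameFromSampleId(inputs, extension) {
  var input_files = [].concat(inputs.input_bam); // "input_bam" is a made-up port ID
  var filebase;
  if (input_files[0].metadata && input_files[0].metadata.sample_id) {
    // Sample ID is set: use it as the base name
    filebase = input_files[0].metadata.sample_id;
  } else {
    // No metadata or no Sample ID: fall back to a fixed base name
    filebase = "sample_unknown";
  }
  return filebase.concat(extension);
}

// Single file with Sample ID metadata
var withId = outputNameFromSampleId(
  { input_bam: { basename: "a.bam", metadata: { sample_id: "NA12878" } } },
  ".bam"
);

// Array of files, first file has no metadata at all
var withoutId = outputNameFromSampleId({ input_bam: [{ basename: "a.bam" }] }, ".bam");

console.log(withId);
console.log(withoutId);
```

Checking input_files[0].metadata before reading sample_id is what keeps the expression from failing on files that carry no metadata.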
Name output files based on input file names
The following expression retrieves the name of the input file for the job and uses it as the basis for the name of the output file. This expression can also be used with tools that allow the output file name to be defined through a command line argument. It is entered as the argument value in the Arguments section of the tool editor's Visual Editor tab.
${
var reads = [].concat(inputs.<input_port_ID>)
var file_path = reads[0].path // result in format /path/to/input_file.bam
var filename = reads[0].basename // result in format input_file.bam
var filebase = reads[0].nameroot // result in format input_file
var out_name = filebase.concat("<file_extension>") //e.g. input_file.vcf
return out_name
}
The expression works as follows:
The following four lines of code get the file path, file name, and name root (the file name without its last extension) from the file or array of files provided as the input.
var reads = [].concat(inputs.<input_port_ID>)
var file_path = reads[0].path // result in format /path/to/input_file.bam
var filename = reads[0].basename // result in format input_file.bam
var filebase = reads[0].nameroot // result in format input_file
The last part of the expression appends the extension to the base name and returns the full name of the output file:
var out_name = filebase.concat("<file_extension>") //e.g. input_file.vcf
return out_name
You need to replace <file_extension> in the code to match the extension you need for your output file. The extension should also include the dot (for example, .vcf).
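As a rough illustration of how the three file properties relate, the sketch below runs the same steps against hand-written file objects. The port ID input_file, the function name outputNameFromInput, and the paths are hypothetical; the path, basename, and nameroot values are filled in by hand here, whereas during execution they are provided for you.

```javascript
// Hypothetical wrapper so the expression body can be run locally.
function outputNameFromInput(inputs, extension) {
  var reads = [].concat(inputs.input_file); // "input_file" is a made-up port ID
  var filebase = reads[0].nameroot;         // file name without its last extension
  return filebase.concat(extension);
}

// Mock file objects with the properties set by hand.
var bam = {
  path: "/path/to/input_file.bam",
  basename: "input_file.bam",
  nameroot: "input_file"
};
var gz = {
  path: "/path/to/reads.fastq.gz",
  basename: "reads.fastq.gz",
  nameroot: "reads.fastq" // only the final extension (.gz) is removed
};

var vcfName = outputNameFromInput({ input_file: bam }, ".vcf");
var gzDerived = outputNameFromInput({ input_file: [gz] }, ".vcf");

console.log(vcfName);
console.log(gzDerived);
```

For inputs with multi-part extensions such as .fastq.gz, nameroot keeps everything before the final dot, so additional trimming may be needed to get a clean base name.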
Setting metadata fields of output files based on inputs for paired end 1 and 2
This expression is used to copy the values in the Paired-end metadata field from input files for a given job to their corresponding paired-end output files.
${
var out = self;
var filename = self.basename;
var filebase = filename.split('.').slice(0, -3).join('.');
var reads = [].concat(inputs.<input_port_ID>);
for (var i=0; i<reads.length; i++)
{
var input_filename = reads[i].basename;
var input_filebase = input_filename.split('.').slice(0, -3).join('.');
// Copy paired_end from the input whose base name matches the output
if (filebase==input_filebase && reads[i].metadata && reads[i].metadata.paired_end)
{
out.metadata={"paired_end": ""}
out.metadata["paired_end"] = reads[i].metadata.paired_end;
return out
}
}
// No matching input with paired_end metadata: return the output unchanged
return out
}
The expression is entered in the OutputEval field on the output port's setup screen.
The code is analyzed below:
The OutputEval field is used to transform the output and modify its attributes. The following line defines a variable containing the original value of the output:
var out = self
The following two lines extract the base name of the output file on the output port:
var filename = self.basename
var filebase = filename.split('.').slice(0, -3).join('.')
The expression assumes that the file extension includes three dot-separated portions, such as <base_name>.pe_1.fastq.gz. The slice(0, -3) method in the second line above is used to extract the part of the file name before .pe_1.fastq.gz, but can be adjusted to match the file naming convention for your tool's input paired-end files. For example, if the naming convention is <base_name>.fastq, the method needs to be slice(0, -1).
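The effect of the slice argument is easy to check in isolation. In the sketch below, stripExtensionParts is a hypothetical helper introduced only for this illustration; it applies the same split, slice, and join sequence with a configurable number of trailing parts, and the file names are made up.

```javascript
// Hypothetical helper: drop the last `parts` dot-separated portions of a file name.
// `parts` must be at least 1, because slice(0, -0) would return an empty array.
function stripExtensionParts(filename, parts) {
  return filename.split('.').slice(0, -parts).join('.');
}

var threeParts = stripExtensionParts("sample_A.pe_1.fastq.gz", 3);
var onePart = stripExtensionParts("sample_A.fastq", 1);
var dottedBase = stripExtensionParts("sample.A.pe_1.fastq.gz", 3);

console.log(threeParts);
console.log(onePart);
console.log(dottedBase);
```

Because the count is taken from the end of the name, dots inside the base name itself are preserved.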
The next line creates an array containing the input file(s):
var reads = [].concat(inputs.<input_port_ID>)
Make sure to replace <input_port_ID>
with the actual ID of the input port for paired-end files.
Finally, a for loop iterates through the array of input files and gets the base name of each file.
for (var i=0; i<reads.length; i++)
{
var input_filename = reads[i].basename;
var input_filebase = input_filename.split('.').slice(0, -3).join('.');
if (filebase==input_filebase && reads[i].metadata && reads[i].metadata.paired_end)
{
out.metadata={"paired_end": ""}
out.metadata["paired_end"] = reads[i].metadata.paired_end;
return out
}
}
// Reached only if no input matched: return the output unchanged
return out
Once it finds the input file whose name matches the name of the output file, it checks whether the input file has metadata and has a value in the Paired-end metadata field. If there is a value, the value becomes the Paired-end metadata value for the output file.
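The matching behaviour can be exercised locally with mock objects. The sketch below is an assumption-laden illustration rather than the exact expression: copyPairedEnd, the port ID input_reads, and the file names are hypothetical, and the final return is placed after the loop so that every input is checked before the output is returned unchanged. Matching is done on the base name only, so the mock inputs deliberately use distinct base names.

```javascript
// Hypothetical wrapper: `self` is the output file, `inputs` is the job's inputs object.
function copyPairedEnd(self, inputs) {
  var out = self;
  var filebase = self.basename.split('.').slice(0, -3).join('.');
  var reads = [].concat(inputs.input_reads); // "input_reads" is a made-up port ID
  for (var i = 0; i < reads.length; i++) {
    var input_filebase = reads[i].basename.split('.').slice(0, -3).join('.');
    if (filebase == input_filebase && reads[i].metadata && reads[i].metadata.paired_end) {
      // Matching input found: copy its paired_end value to the output
      out.metadata = { "paired_end": reads[i].metadata.paired_end };
      return out;
    }
  }
  // No input matched: the output is returned as-is
  return out;
}

var mockInputs = {
  input_reads: [
    { basename: "sample_A_1.raw.fastq.gz", metadata: { paired_end: "1" } },
    { basename: "sample_A_2.raw.fastq.gz", metadata: { paired_end: "2" } }
  ]
};

var matched = copyPairedEnd({ basename: "sample_A_2.trimmed.fastq.gz" }, mockInputs);
var unmatched = copyPairedEnd({ basename: "other.trimmed.fastq.gz" }, mockInputs);

console.log(matched.metadata.paired_end);
console.log(unmatched.metadata);
```

The second call shows the fallback: when no input shares the output's base name, the output keeps whatever metadata it already had (none, in this mock).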
Order input reads based on paired-end metadata
Some tools, such as BWA MEM Bundle, require paired-end input reads to be ordered in the correct sequence (paired-end 1 first, followed by paired-end 2). This expression will automatically order the input reads based on the values entered in the paired_end metadata field:
${
var input_reads = inputs.input_reads
var read_metadata = input_reads[0].metadata
// Use an empty object when the first read has no metadata
if(!read_metadata) read_metadata = {}
var order = 0
// An empty metadata object means there is nothing to sort on
if(Object.keys(read_metadata).length == 0){ order = 0 }
else if('paired_end' in read_metadata){
var pe1 = read_metadata.paired_end
if(pe1 != 1) order = 1
}
if (input_reads.length == 1){
return input_reads[0].path
}
else if (input_reads.length == 2){
if (order == 0) return input_reads[0].path + ' ' + input_reads[1].path
else return input_reads[1].path + ' ' + input_reads[0].path
}
}
The expression works as follows:
The first code block declares a variable containing the input reads:
var input_reads = inputs.input_reads
The following part reads metadata from the first supplied input read. If there is no metadata, it assigns an empty object to the read_metadata variable:
var read_metadata = input_reads[0].metadata
if(!read_metadata) read_metadata = {}
The order flag is used to mark the order of input reads. The starting assumption is that the reads are ordered correctly, which is denoted by assigning the value 0 to order.
var order = 0
The following code block starts by checking whether the first input read has any assigned metadata values. If it finds no values, no sorting can be done based on metadata and it assumes that the reads are in the correct order (order = 0). Otherwise, it checks whether paired-end 1 corresponds to the first given read:
if(Object.keys(read_metadata).length == 0){ order = 0 }
else if('paired_end' in read_metadata){
var pe1 = read_metadata.paired_end
if(pe1 != 1) order = 1 // change order
}
Finally, the expression checks how many reads there are and returns them in the correct order. If only one input read is present, this read is returned as there is no need for ordering. If there are two input reads, they are returned in the correct order based on the value of the order flag.
if (input_reads.length == 1){
return input_reads[0].path
}
else if (input_reads.length == 2){
if (order == 0) return input_reads[0].path + ' ' + input_reads[1].path
else return input_reads[1].path + ' ' + input_reads[0].path
}
}
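The ordering logic can be checked outside the platform with a few mock reads. The sketch below condenses the steps described above into a function; orderReads and the file paths are hypothetical, and paired_end is written as a string on the assumption that metadata values are stored as text (the loose != comparison handles both strings and numbers).

```javascript
// Hypothetical wrapper around the ordering logic; takes the array of input reads.
function orderReads(input_reads) {
  var read_metadata = input_reads[0].metadata || {};
  var order = 0; // 0: keep the given order, 1: swap the two reads
  if ('paired_end' in read_metadata && read_metadata.paired_end != 1) {
    order = 1;
  }
  if (input_reads.length == 1) {
    return input_reads[0].path;
  } else if (input_reads.length == 2) {
    if (order == 0) return input_reads[0].path + ' ' + input_reads[1].path;
    else return input_reads[1].path + ' ' + input_reads[0].path;
  }
}

var r1 = { path: "/data/r1.fastq", metadata: { paired_end: "1" } };
var r2 = { path: "/data/r2.fastq", metadata: { paired_end: "2" } };

var swapped = orderReads([r2, r1]);   // given in the wrong order
var kept = orderReads([r1, r2]);      // already in the right order
var noMeta = orderReads([{ path: "/data/a.fastq" }, { path: "/data/b.fastq" }]);
var singleRead = orderReads([r2]);

console.log(swapped);
console.log(kept);
console.log(noMeta);
console.log(singleRead);
```

Only the first read's metadata is inspected, so the result relies on the two reads forming a proper pair; for more than two reads the function, like the expression, returns nothing.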
Configure a tool to unpack a TAR archive provided as its input
In some cases, input files taken by a tool come in the form of a TAR archive. TAR archives can be produced by e.g. aligner indexers, which are a set of indexing tools that index reference files and output an archive containing the reference and the index file(s). A tool that needs to use the files from a TAR archive can be configured to unpack it.
Prerequisite: In order to simplify the process of unpacking the archive using the expression below, the tool's input port that takes the archive file will have to be staged. This will make the TAR archive available directly in the tool's working directory.
1. Leave Base Command empty.
2. Define the entire command line via arguments.
3. Click Add in the File Requirements section and select Expression.
4. Click </> next to the first field.
5. Add the following code:
$(inputs.<input_port_ID>)
Please make sure to replace <input_port_ID> in the above code with the ID value of your tool's input port that takes the archive file.
6. Click Save. The input will now be available in the working directory.
7. In the Arguments section click Add an argument.
8. In the Value field on the right click </>.
9. Paste the following code:
${
var index_files_bundle = inputs.<input_port_ID>.basename
return 'tar -xf ' + index_files_bundle + ' ; '
}
The first line of the expression retrieves the name of the archive file using the inputs object. The second line appends the retrieved file name to the command that will unpack the archive file.
Please make sure to replace <input_port_ID> in the above code with the ID value of your tool's input port that takes the archive file.
10. Click Save.
11. Click the Save icon in the top-right corner of the tool editor.
Your tool is now configured to unpack a TAR archive it receives as its input.
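To preview the command prefix that the argument expression produces, the same two lines can be run against a mock input. This is a sketch only: unpackCommand, the port ID reference_archive, and the archive path are hypothetical names chosen for the example.

```javascript
// Hypothetical wrapper around the argument expression body.
function unpackCommand(inputs) {
  // "reference_archive" is a made-up port ID; the file is assumed to be staged
  // in the working directory, so the bare file name is enough.
  var index_files_bundle = inputs.reference_archive.basename;
  return 'tar -xf ' + index_files_bundle + ' ; ';
}

var cmd = unpackCommand({
  reference_archive: {
    path: "/path/to/reference.tar",
    basename: "reference.tar"
  }
});

console.log(cmd);
```

The trailing " ; " lets the tool's own command follow the unpacking step on the same command line. Like the expression, the sketch does not quote the file name, so it assumes archive names without spaces or shell metacharacters.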