
Advanced practices


## Annotate output files with metadata

You may want to annotate the files produced by a tool with metadata, which can then be used by other tools in a workflow. You can choose the name and type of these metadata by defining your own key-value pairs. Any keys can be used, but see this list for commonly used file metadata.

### Annotate output files with metadata in CWL v1.0 apps

To enter key-value pairs that will be added to output files as metadata:

1. In the **Output Ports** section, click the port for which you want to set metadata.
2. In the object inspector on the right, under **Metadata**, select the input port from which you would like to inherit metadata and use it to annotate files produced on the given output port.
   The **Output eval** field is automatically populated with an expression.
3. Click the **Output eval** field. The Expression Editor opens, showing an expression such as `$(inheritMetadata(self, inputs.bam))`. This is expected and means that the files produced on the output port you are editing will inherit metadata from the input port whose ID is **bam**.

To annotate output files with a custom metadata key-value pair (one not present on the input port), edit the expression in the **Output eval** field as follows:
[block:code]
{
  "codes": [
    {
      "code": "${\n    var out = inheritMetadata(self, inputs.<input_id>)\n    out.metadata['new_key'] = 'new_value'\n    return out\n}",
      "language": "javascript"
    }
  ]
}
[/block]
In the expression, `<input_id>` is the ID of the input port from which you are inheriting metadata, which was **bam** in the example above. The line `out.metadata['new_key'] = 'new_value'` sets a new, custom metadata key-value pair and can be repeated with different keys and values to set additional custom metadata.

### Annotate output files with metadata in CWL sbg:draft-2 apps

To enter key-value pairs that will be added to output files as metadata, click any output port in the **Output Ports** section. In the object inspector, under **Metadata**, click **+ Add Metadata**. There is a field for the metadata **Keys** and a field for the corresponding **Value**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/001faf5-annotate-output-files-with-metadata.png",
        "annotate-output-files-with-metadata.png",
        318,
        625,
        "#f7f6f4"
      ]
    }
  ]
}
[/block]

[block:callout]
{
  "type": "info",
  "title": "Using dynamic expressions to capture metadata",
  "body": "Metadata values can be string literals or dynamic expressions. For instance, the `$self` object can be used to refer to the path of the file being output. See the documentation on [dynamic expressions in tool descriptions](doc:dynamic-expressions-in-tool-descriptions-1) for more details."
}
[/block]
## Create custom input structures for tools

Certain tools require input data types that are more complex than files, strings, or similar simple types. In particular, you may need to input arrays of structures. This requires you to define a custom input structure, which is essentially an input type composed of further input types. A common situation in which you'll need a complex data structure is when the input to your bioinformatics tool is a genomic sample. This will consist of, for example, a BAM file and a sample ID (string), or perhaps two FASTQ files and an insert size (int).

To define a custom input structure:

1. In the **Input Ports** section of the tool editor, select an input port or add a new one by clicking **Add an Input**.
2. Edit the ID of your input. In the example shown below we have given the array the ID **Custom_input**. You can give it any ID you choose.
3. In the object inspector on the right, in the **Type** drop-down list, select **array**.
4. In the **Items Type** drop-down list, select **record** to define your own types.
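As a rough illustration of what an array of records holds at run time, the value of such an input can be pictured as a list of objects, one per sample. The sketch below uses the ID **Custom_input** from step 2 and the field IDs **FASTQ_1**, **FASTQ_2** and **Insert_size** from the example in this section. The file paths and the helper `describeSamples` are hypothetical and are not part of Cavatica.

```javascript
// Hypothetical run-time value of an input typed as "array of records".
// Paths and insert sizes are made up for illustration.
const inputs = {
  Custom_input: [
    {
      FASTQ_1: { class: "File", path: "/data/sampleA_1.fastq" },
      FASTQ_2: { class: "File", path: "/data/sampleA_2.fastq" },
      Insert_size: 300,
    },
    {
      FASTQ_1: { class: "File", path: "/data/sampleB_1.fastq" },
      FASTQ_2: { class: "File", path: "/data/sampleB_2.fastq" },
      Insert_size: 450,
    },
  ],
};

// Build one summary string per record, reading each field by its ID.
function describeSamples(records) {
  return records.map(
    (r) => `${r.FASTQ_1.path} ${r.FASTQ_2.path} insert=${r.Insert_size}`
  );
}

console.log(describeSamples(inputs.Custom_input));
```

Each element of the array carries all of the fields of the record, so a tool can treat the two FASTQ files and the insert size of one sample as a single unit.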
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/cca770f-create-custom-input-structures-for-tools-1.png",
        "create-custom-input-structures-for-tools-1.png",
        310,
        358,
        "#f8f6f2"
      ]
    }
  ]
}
[/block]
Once you have created the input array, you can define its fields. To do this, in the **Input Ports** section, click the **Add field** button within the newly created input.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/2139b0b-create-custom-input-structures-for-tools-2.png",
        "create-custom-input-structures-for-tools-2.png",
        1231,
        396,
        "#f4f6f6"
      ]
    }
  ]
}
[/block]
Define the individual input fields that the array is composed of. You define the fields of an input array in the same way as you would usually define an input port for a tool. In our example, the array data structure takes three inputs: two FASTQ files and an insert size. We label the fields with the IDs **FASTQ_1**, **FASTQ_2** and **Insert_size**, respectively. We set the **Type** of the first two to **File**, and the **Type** of the third to **int**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/c589978-create-custom-input-structures-for-tools-3.png",
        "create-custom-input-structures-for-tools-3.png",
        912,
        488,
        "#f8f7f4"
      ]
    }
  ]
}
[/block]
## Describe index files

Some tools generate index files, such as .bai, .tbi, or .bwt files. Typically, you will want Cavatica to treat indexes as attached to their associated data file, so that they are copied and moved along with the data file in workflows. To represent the index file as attached to its data file, make sure that:

1. The index file has the same file name as the data file (apart from the index extension) and is stored in the same directory.
2. You have indicated in the tool editor that an index file is expected to be output alongside the data file.

To do this:

1. In the **Output Ports** section of the editor, enter the details of the data file as the output.
2. Under **Secondary Files**, click **Add secondary file** and enter the extension of the index file, modified according to the following convention (suppose that the index file has the extension **.ext**):
   * If the index file is named by appending **.ext** to the full name of the data file, enter **.ext** in the field. For example, do this if the data file is a **.bam** and the index extension is **.bai** (so the resulting name of the index file is **file_name.bam.bai**).
   * If the index file is named by replacing the extension of the data file with **.ext**, enter **^.ext** in the field. For example, do this when the data file is a FASTA contig list and the index extension is **.dict**.

In the first example, we have named the output port **BAM** and set up globbing to catch all files with the **.bam** extension and designate them as its outputs. We have also specified that a secondary file is expected, and that it should have the same name as the BAM file, with the **.bai** extension appended after the **.bam** extension.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/cd78a11-describe-index-files-1.png",
        "describe-index-files-1.png",
        308,
        578,
        "#f8f8f9"
      ]
    }
  ]
}
[/block]
In the second example, the output port is named **Reference** and it catches files with the **.fasta** extension. We have specified that an index file is expected, and that it will have the same file name as the FASTA file, with the **.dict** extension replacing the **.fasta** extension.

### Attaching index files and data files

Some tools (indexers) only output index files for their input data files; they do not output the data file and the index together. If your tool is one of these, the index file will be written to the current working directory of the job, but the data file will not. This can make it difficult to forward the data file and index file together.
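To see why the location of the two files matters, the following minimal sketch derives the expected index path from a data file path and a **Secondary Files** pattern, using the append and `^` replace conventions described above, and assuming the index is looked up in the same directory as the data file. The function name `expectedIndexPath` and the `/inputs` and `/work` directories are hypothetical; this is an illustration of the convention, not Cavatica's implementation.

```javascript
// Derive the expected index path for a data file and a Secondary Files pattern.
// A plain pattern (".bai") is appended; each leading "^" first strips one
// extension from the data file name ("^.dict").
function expectedIndexPath(dataPath, pattern) {
  const slash = dataPath.lastIndexOf("/");
  const dir = dataPath.slice(0, slash + 1); // directory part, kept unchanged
  let name = dataPath.slice(slash + 1);     // file name without directory
  let suffix = pattern;
  while (suffix.startsWith("^")) {
    const dot = name.lastIndexOf(".");
    if (dot > 0) {
      name = name.slice(0, dot);            // drop the last extension
    }
    suffix = suffix.slice(1);
  }
  return dir + name + suffix;
}

console.log(expectedIndexPath("/work/file_name.bam", ".bai"));     // /work/file_name.bam.bai
console.log(expectedIndexPath("/work/reference.fasta", "^.dict")); // /work/reference.dict

// An indexer writes its index to the working directory (hypothetical path).
const writtenByTool = "/work/file_name.bam.bai";

// Data file left in its original location: the index is not next to it.
console.log(expectedIndexPath("/inputs/file_name.bam", ".bai") === writtenByTool); // false

// Data file copied to the working directory: the index sits beside it.
console.log(expectedIndexPath("/work/file_name.bam", ".bai") === writtenByTool);   // true
```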

In this case, if you need to forward an input file with an attached index file, copy the data file to the current working directory of the job (tool execution), since this is the directory to which the index file is written. To copy the data file to the current working directory, select **Copy** under **Stage Input** in the description of that input port (CWL version sbg:draft-2 only).
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/e6139d5-describe-index-files-2.png",
        "describe-index-files-2.png",
        308,
        577,
        "#f8f8f8"
      ]
    }
  ]
}
[/block]
### Describe index files using expressions (CWL v1.0 only)

CWL version 1.0 also supports expressions as a method of defining index file extensions. This allows you to implement more complex logic, such as:
[block:code]
{
  "codes": [
    {
      "code": "${\n    if (self.nameext == '.gz'){\n        return {'class': 'File', 'path': self.path + '.tbi'}\n    }\n    else{\n        return {'class': 'File', 'path': self.path + '.idx'}\n    }\n}",
      "language": "javascript"
    }
  ]
}
[/block]
In the expression above, the index file extension depends on the extension of the input file. Specifically, if the extension of the input file is **.gz**, the returned value is a File object that has the same path as the input file, with the **.tbi** extension added. Otherwise, the returned File object will have the **.idx** extension.

Note that an expression used for describing secondary (index) files can't return a pattern (such as **.tbi**). It must return a File object with the full path to the index file, as shown in the example above.

## Import your own CWL tool description

As an alternative to using the integrated editor on Cavatica, you can add your own tool description in JSON or YAML format, provided that the description is compliant with the Common Workflow Language. This method is useful if you already have CWL files for your tools.

To add your own tool description:

1. Click the **Code** tab in the tool editor.
2. Paste your CWL description, replacing the existing code.
3. Click the **Save** icon in the top-right corner.

Alternatively, you can use the API to [upload CWL descriptions](doc:add-an-app-using-raw-cwl) of your tools or workflows in JSON or YAML format.

To get started writing CWL by hand, see [this guide](http://www.commonwl.org/user_guide/). You can also inspect the CWL descriptions of public apps by opening them in the tool editor and switching to the **Code** tab.

## Configure log files for your tool

Logs are produced and kept for each job executed on Cavatica. They are shown on the visual interface and via the API:

* To access a job's logs on the visual interface of Cavatica, go to the [view task logs](doc:view-task-logs) page.
* To access a job's logs via the API, issue the API request to [get task execution details](doc:get-task-execution-details).

You can modify the default behavior so that further files are also presented as logs for a given tool. This is done using a hint that specifies the file name or file extension a file must match in order to be treated as a log:

1. Open your tool in the tool editor.
2. Scroll down to the **Hints** section.
3. Click **Add a Hint** to add a new hint, consisting of a key-value pair. To configure the log files for your tool, enter `sbg:SaveLogs` as the key, and a glob (pattern) as the value. Any file names matching the glob will be reported as log files for jobs run using the tool. For example, the `*.txt` glob will match all files in the working directory of the job whose extension is `.txt`. For more information and examples, see the [documentation on globbing](https://docs.sevenbridges.com/docs/glob). You can also enter a literal file name, such as `results.txt`, to catch this file specifically.

You can add values (globs) for as many log files as you like. A glob only matches files in the working directory of the job; it does not search subdirectories recursively.
[block:callout]
{
  "type": "info",
  "body": "Note that the files shown as logs differ depending on whether you are viewing the logs of a job via the visual interface or via the API.\n\n* By default, the [view task logs](doc:view-task-logs) page on the visual interface shows as logs any *.log files in the working directory of the job (including std.err.log and cmd.log) as well as the job.json and cwl.output.json files.\n* By default, the API request to [get task execution details](doc:get-task-execution-details) shows as the log only the standard error for the job, std.err.log.\n\nConsequently, if you set the value of `sbg:SaveLogs` to `*.log`, this will change the files displayed as logs via the API, but will *not* alter the files shown as logs on the visual interface, since the visual interface already presents all .log files as logs. However, if you add a new value of `*.txt` for `sbg:SaveLogs`, then .txt files will be added to the log files shown via both the visual interface and the API.",
  "title": "Differences between the API and visual interface"
}
[/block]
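
The combined effect of the defaults and the `sbg:SaveLogs` values can be sketched as follows. This is a simplified model under stated assumptions, not Cavatica's implementation: the matcher supports only `*` wildcards and literal file names, the `sbg:SaveLogs` values are assumed to be added to the defaults listed in the callout above, and the names `globToRegExp`, `matchesAny`, `shownAsLogs` and the example file list are hypothetical.

```javascript
// Convert a simple glob ("*" wildcards only) into an anchored regular expression.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  // "*" never matches "/", so files in subdirectories are not picked up.
  return new RegExp("^" + escaped.replace(/\*/g, "[^/]*") + "$");
}

function matchesAny(name, globs) {
  return globs.some((g) => globToRegExp(g).test(name));
}

// Default log selections as described in the callout above.
const UI_DEFAULTS = ["*.log", "job.json", "cwl.output.json"];
const API_DEFAULTS = ["std.err.log"];

// List which files each interface would present as logs.
function shownAsLogs(files, saveLogs) {
  return {
    visualInterface: files.filter((f) => matchesAny(f, UI_DEFAULTS.concat(saveLogs))),
    api: files.filter((f) => matchesAny(f, API_DEFAULTS.concat(saveLogs))),
  };
}

// Paths relative to the working directory of a job (made up for illustration).
const files = ["std.err.log", "cmd.log", "job.json", "results.txt", "sub/notes.txt"];

console.log(shownAsLogs(files, ["*.log"]));
console.log(shownAsLogs(files, ["*.txt"]));
```

With `*.log`, only the API list gains entries (here `cmd.log`), while `*.txt` adds `results.txt` to both lists. The file `sub/notes.txt` is never matched because it is in a subdirectory.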