
Set metadata using the command line uploader


You can use the Command Line Uploader to set some or all of a file's metadata during upload. Alternatively, you can [manually set metadata](doc:set-metadata-using-the-visual-interface) later.

## Set metadata for a single file

For each file queued for upload, the Uploader looks for a supplementary file containing the metadata to set for that file. The supplementary file must be in the same directory as the file being uploaded and have the same name as that file, with `.meta` appended.

For example, if you are uploading `sample1.fastq`, the supplementary file should be named `sample1.fastq.meta`.

The supplementary file should contain a valid JSON object, as shown in the example below. Key-value pairs from this JSON object will be set on the server as metadata describing the uploaded file. For a list of key-value pairs that should be used to set a file's metadata, see the section on the JSON metadata schema in our documentation on [metadata](doc:metadata-on-cavatica).

If the supplementary `.meta` file contains invalid JSON, or metadata values that fall outside their acceptable range, a warning is printed to standard output, but the file upload continues. Note that if you set invalid metadata values, the workflows you use with your files may not function correctly.
[block:callout]
{
  "type": "info",
  "body": "You do not need to include supplementary files in the upload for their metadata to be applied to the files being uploaded. Metadata is parsed and assigned automatically as long as each supplementary file is matched to its principal file by the naming convention described above."
}
[/block]
The following JSON object is an example of the metadata that the file `sample1.fastq.meta` could contain:
[block:code]
{
  "codes": [
    {
      "code": "{\n  \"sample_id\": \"sample1\",\n  \"library_id\": \"library1\",\n  \"paired_end\": \"1\",\n  \"platform\": \"illumina HiSeq\",\n  \"quality_scale\": \"illumina13\"\n}",
      "language": "json",
      "name": "JSON"
    }
  ]
}
[/block]

[block:callout]
{
  "type": "warning",
  "body": "If you are using old-style projects and want to set metadata using the Command Line Uploader, use the following key-value pairs instead of the example above: \n\n{\n  \"file_type\": \"fastq\",\n  \"sample\": \"sample1\",\n  \"library\": \"library1\",\n  \"paired_end\": \"1\",\n  \"qual_scale\": \"illumina13\",\n  \"seq_tech\": \"illumina\"\n}",
  "title": "For Seven Bridges users on AWS:"
}
[/block]
Apart from the standard set of metadata fields that can be seen through the visual interface, you can also add custom metadata to your files. Custom metadata fields are user-defined key-value pairs that let you attach additional metadata to files on Cavatica. Custom metadata can be added via the Command Line Uploader or via the API, but *not* through the visual interface.
[block:callout]
{
  "type": "info",
  "body": "Custom metadata fields will not be visible on the visual interface, but their values can be retrieved by [getting file details via the API](doc:get-file-details)."
}
[/block]
When adding custom metadata fields, follow these rules:
  * Keys and values are case sensitive unless explicitly treated differently by a tool or a part of the Platform.
  * The maximum number of key-value pairs per file is 1000, including null-value keys.
  * Keys and values are UTF-8 encoded strings.
  * The maximum length of a key is 100 bytes (UTF-8 encoding).
  * The maximum length of a value is 300 bytes (UTF-8 encoding).
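
The naming convention and the custom metadata rules above can be checked locally before an upload. The following is a minimal Python sketch, not part of the Command Line Uploader: the function names `check_custom_metadata` and `write_meta_file` are hypothetical, and the checks cover only the limits listed above (they do not validate the values of standard fields against the metadata schema).

```python
import json
from pathlib import Path

MAX_PAIRS = 1000       # key-value pairs per file, null-value keys included
MAX_KEY_BYTES = 100    # UTF-8 encoded length of a key
MAX_VALUE_BYTES = 300  # UTF-8 encoded length of a value


def check_custom_metadata(metadata):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"{len(metadata)} key-value pairs (limit is {MAX_PAIRS})")
    for key, value in metadata.items():
        if not isinstance(key, str):
            problems.append(f"key {key!r} is not a string")
            continue
        if len(key.encode("utf-8")) > MAX_KEY_BYTES:
            problems.append(f"key {key!r} is longer than {MAX_KEY_BYTES} bytes")
        if value is None:
            continue  # null-value keys still count toward the pair limit
        if not isinstance(value, str):
            problems.append(f"value for {key!r} is not a string")
        elif len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            problems.append(f"value for {key!r} is longer than {MAX_VALUE_BYTES} bytes")
    return problems


def write_meta_file(data_file, metadata):
    """Write <data_file>.meta next to data_file and return the new path."""
    problems = check_custom_metadata(metadata)
    if problems:
        raise ValueError("; ".join(problems))
    # Same directory, same name, with ".meta" appended.
    meta_path = Path(str(data_file) + ".meta")
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return meta_path
```

For example, `write_meta_file("sample1.fastq", {"sample_id": "sample1"})` creates `sample1.fastq.meta` in the same directory. Lengths are measured on the UTF-8 encoded bytes, so a value with non-ASCII characters can exceed the limit with fewer than 300 characters.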
[block:callout]
{
  "type": "success",
  "body": "Learn more about [metadata fields on Cavatica](doc:metadata-on-cavatica)."
}
[/block]

## Set metadata for multiple files using a manifest file

Metadata can be set for multiple files during the upload by supplying a manifest file that contains the metadata for a group of accompanying files.
[block:callout]
{
  "type": "info",
  "body": "Learn more about the [manifest file format](doc:format-of-a-manifest-file)."
}
[/block]
### Upload files and set metadata

To upload multiple files and set their metadata using the manifest, issue the following command:
[block:code]
{
  "codes": [
    {
      "code": "sbg-uploader.sh --manifest-file filename.csv --manifest-metadata",
      "language": "text"
    }
  ]
}
[/block]
This uploads all files specified in the manifest (here, `filename.csv`) and applies the relevant metadata to each of them.

The `--manifest-file` option specifies the name (and path) of the manifest file, while the `--manifest-metadata` option instructs the Command Line Uploader to also parse metadata values from the manifest.

### Upload files and set individual metadata fields

To upload multiple files and set individual metadata fields, issue the following command:
[block:code]
{
  "codes": [
    {
      "code": "sbg-uploader.sh --manifest-file filename.csv --manifest-metadata sample paired_end",
      "language": "text"
    }
  ]
}
[/block]
In the example above, the only two metadata fields that will be set for the uploaded files are `sample` and `paired_end`. The metadata fields are specified after the `--manifest-metadata` option.
[block:callout]
{
  "type": "info",
  "body": "You can specify any number of metadata fields by listing them after the `--manifest-metadata` option."
}
[/block]
### Upload files without setting metadata

The manifest file also lets you specify multiple files for upload without setting any metadata. This is useful when you are dealing with larger volumes of data, or when you want to automate the upload of a fixed list of files.

To upload the files specified in the manifest while omitting the metadata, issue the following command:
[block:code]
{
  "codes": [
    {
      "code": "sbg-uploader.sh --manifest-file filename.csv",
      "language": "text"
    }
  ]
}
[/block]
### Perform a dry run

Before performing an actual upload, you can do a dry run. A dry run only prints data to the terminal, allowing you to check that all the settings are correct without uploading anything. To perform a dry run, issue the following command:
[block:code]
{
  "codes": [
    {
      "code": "sbg-uploader.sh --manifest-file manifest.csv --manifest-metadata --dry-run",
      "language": "text"
    }
  ]
}
[/block]
To output information about specific metadata fields only, issue the following command:
[block:code]
{
  "codes": [
    {
      "code": "sbg-uploader.sh --manifest-file manifest.csv --manifest-metadata --dry-run sample library",
      "language": "text"
    }
  ]
}
[/block]
The `sample` and `library` metadata fields are the only ones that will be printed in the terminal.
[block:callout]
{
  "type": "info",
  "body": "You can specify any number of individual metadata fields by listing them after the `--dry-run` option."
}
[/block]
### General notes

The Command Line Uploader assumes that the files being uploaded and the accompanying manifest file reside in the same directory. If that is not the case, you can specify the path:

  * **within the manifest**, by prepending the file path to the file name.
  * **in the command line**, by specifying the full path to the manifest file.
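
When uploads are scripted, the command variants shown in this section (with or without `--manifest-metadata`, with or without `--dry-run`, and with a full manifest path) can be assembled in one place. The following Python sketch only builds the argument list. The helper name `build_uploader_command` is hypothetical, it assumes `sbg-uploader.sh` is on your `PATH`, and it places field names last because that is where they appear in the examples above.

```python
def build_uploader_command(manifest, metadata=False, fields=(), dry_run=False,
                           uploader="sbg-uploader.sh"):
    """Build the argument list for one of the manifest commands shown above."""
    fields = list(fields)
    if fields and not metadata:
        # Assumption: field names are only meaningful with --manifest-metadata.
        raise ValueError("metadata fields require metadata=True")
    # The manifest may be a bare file name or a full path.
    command = [uploader, "--manifest-file", str(manifest)]
    if metadata:
        command.append("--manifest-metadata")
    if dry_run:
        command.append("--dry-run")
    command.extend(fields)
    return command


# Dry run first, then the real upload, using a manifest in another directory.
dry = build_uploader_command("/data/run1/manifest.csv", metadata=True, dry_run=True)
real = build_uploader_command("/data/run1/manifest.csv", metadata=True)
# To execute: subprocess.run(dry, check=True), then subprocess.run(real, check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems when the manifest path contains spaces.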
If a file specified in the manifest also has an accompanying `.meta` file, the contents of that `.meta` file are applied in addition to what is parsed from the manifest: keys that appear only in the `.meta` file are added, and keys that appear in both take their value from the `.meta` file.
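
To preview the result of combining the two sources, the behavior described above can be modeled as a dictionary merge. This is a sketch of one reading of that paragraph, in which the `.meta` file wins when a key is set in both places; the function name `merge_metadata` and the sample values are illustrative, and the Uploader's actual implementation is not shown in this documentation.

```python
def merge_metadata(manifest_metadata, meta_file_metadata):
    """Combine manifest metadata with .meta metadata for a single file."""
    merged = dict(manifest_metadata)   # start from the manifest values
    merged.update(meta_file_metadata)  # .meta adds new keys and overrides shared ones
    return merged


from_manifest = {"sample": "sample1", "paired_end": "1"}
from_meta_file = {"paired_end": "2", "library": "library1"}
effective = merge_metadata(from_manifest, from_meta_file)
```

Here `effective` keeps `sample` from the manifest, gains `library` from the `.meta` file, and takes `paired_end` from the `.meta` file.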