
About libraries in a Data Cruncher analysis


Data Cruncher currently offers a set of predefined libraries, curated by Seven Bridges bioinformaticians, that are automatically available every time an analysis is started. The list of available libraries depends on the _environment_ you use (**JupyterLab** or **RStudio**) and the selected _environment setup_ (the set of libraries preinstalled in that environment). Both settings are chosen in the analysis creation wizard and cannot be changed once the analysis has been created.

## JupyterLab

Select the environment setup that best suits the purpose of your JupyterLab analysis. The following table lists the available JupyterLab environment setups and the tools and libraries each of them provides:

[block:parameters]
{
  "data": {
    "h-0": "Environment setup",
    "h-1": "Details",
    "0-0": "**SB Data Science - Python 3.8, R 3.6** (default)",
    "0-1": "This environment setup contains **Python version 3.8**, **R version 3.6.3** and **Julia 1.4.1**. The setup also includes libraries that are available in [datascience-notebook](https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook), with the addition of the **tabix** library.",
    "1-0": "**SB Data Science - Python 3.6, R 3.4** (legacy)",
    "1-1": "This environment setup contains **Python version 3.6.3**, **R version 3.4.1** and **Julia 0.6.2**. The setup also includes libraries that are available in [datascience-notebook](https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook), with the addition of the following libraries:\n\n*Python2 \\ Python3:* **path.py**, **biopython**, **pymongo**, **cytoolz**, **pysam**, **pyvcf**, **ipywidgets**, **beautifulsoup4**, **cigar**, **bioservices**, **intervaltree**, **appdirs**, **cssselect**, **bokeh**, **scikit-allel**, **cairo**, **lxml**, **cairosvg**, **rpy2**\n\n*R:* **r-ggfortify**, **r**, **r-stringi**, **r-pheatmap**, **r-gplots**, **bioconductor-ballgown**, **bioconductor-deseq2**, **bioconductor-metagenomeseq**, **bioconductor-biomformat**, **bioconductor-biocinstaller**, **r-xml**",
    "2-0": "**SB Machine Learning - TensorFlow 2.0, Python 3.7**",
    "2-1": "This environment setup is optimized for machine learning and *execution on GPU instances*. It is based on the **jupyter/tensorflow-notebook** image, which extends **jupyter/scipy-notebook** (popular packages from the scientific Python ecosystem) with popular Python deep learning libraries. Learn more about [available libraries](https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-tensorflow-notebook)."
  },
  "cols": 2,
  "rows": 3
}
[/block]

All available environment setups also contain the **sevenbridges-python** and **sevenbridges-r** API libraries, as well as the general-purpose tools **htop** and **openvpn**. The libraries are installed using **conda** because JupyterLab supports multiple programming languages and **conda** is a language-agnostic package manager.

You can also install libraries directly from the notebook and use them during your analysis. To get optimal performance and avoid potential conflicts, we recommend using **conda** when installing libraries within your analyses. However, unlike the default libraries, libraries installed this way are not automatically available the next time the analysis is started.

## RStudio (beta)

If you select RStudio as the analysis environment, you can also choose one of the available environment setups to match the purpose of your analysis. Because the libraries you need are preinstalled in the selected environment setup, this shortens the time it takes to get a fully functional environment that suits your needs.
Here are the available options:

[block:parameters]
{
  "data": {
    "h-0": "Environment setup",
    "h-1": "Details",
    "0-0": "**SB Bioinformatics - R 4.0** (default)",
    "0-1": "This environment setup is based on the official Bioconductor image **bioconductor_docker:RELEASE_3_11**, which is built on top of **rockerdev/rstudio:4.0.0-ubuntu18.04**. For more information about the image, please see [its Docker Hub repository](https://hub.docker.com/r/bioconductor/bioconductor_docker).\n\nThe following libraries are installed by default:\n\n*CRAN* - **BiocManager**, **devtools**, **doSNOW**, **ggfortify**, **gplots**, **pheatmap**, **Seurat**, **tidyverse**\n\n*Bioconductor* - **AnnotationDbi**, **arrayQualityMetrics**, **ballgown**, **Biobase**, **BiocParallel**, **biomaRt**, **biomformat**, **Biostrings**, **DelayedArray**, **DESeq2**, **edgeR**, **genefilter**, **GenomeInfoDb**, **GenomicAlignments**, **GenomicFeatures**, **GenomicRanges**, **GEOquery**, **IRanges**, **limma**, **metagenomeSeq**, **oligo**, **Rsamtools**, **rtracklayer**, **SummarizedExperiment**, **XVector**",
    "1-0": "**SB Bioinformatics - R 3.6**",
    "1-1": "This environment setup is based on the **rocker/verse** image from [The Rocker Project](https://www.rocker-project.org/) and contains **tidyverse**, **devtools**, TeX and publishing-related packages. For more information about the image, please see [its Docker Hub repository](https://hub.docker.com/r/rocker/verse).\n\nThe following libraries are installed by default:\n\n*CRAN* - **BiocManager**, **ggfortify**, **pheatmap**, **gplots**\n\n*Bioconductor* - **ballgown**, **DESeq2**, **metagenomeSeq**, **biomformat**, **BiocInstaller**",
    "2-0": "**SB Machine Learning - TensorFlow 1.13, R 3.6**",
    "2-1": "This environment setup is optimized for machine learning and *execution on GPU instances*. It is based on the **rocker/ml-gpu** image that is intended for machine learning and GPU-based computation in R. [Learn more](https://hub.docker.com/r/rocker/ml-gpu)."
  },
  "cols": 2,
  "rows": 3
}
[/block]

All available environment setups also contain the [sevenbridges-r](https://github.com/sbg/sevenbridges-r) API library, as well as the general-purpose tools **htop** and **openvpn**.
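
For the JupyterLab case described earlier, it can help to check which libraries the selected environment setup already provides before installing anything with conda. The following is a minimal sketch that uses only the Python standard library. It is not part of Data Cruncher: the helper name `missing_modules` and the list of module names are assumptions made for illustration. Import names can also differ from package names (for example, the **biopython** package is imported as `Bio`).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Hypothetical selection of import names; adjust it to your own analysis.
wanted = ["pysam", "Bio", "bokeh", "lxml"]
print("Not available in this environment:", missing_modules(wanted))
```

Anything reported as missing can then be installed from the notebook with conda, keeping in mind that libraries installed this way are not automatically available the next time the analysis is started.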