Adding mia examples #50
Open
SHillman836 wants to merge 84 commits into EBI-Metagenomics:main from SHillman836:adding-mia-examples
bec04a9 Initial commit (SandyRogers)
d8bced8 adds docker setups for local and shinyproxy; first notebooks (SandyRogers)
a757cc1 updates container config for quay.io (SandyRogers)
2c8c1c7 updates R notebooks: cheat sheet; output removal; cross-study taxonom… (SandyRogers)
af7a2dc use upstream jupyter/datascience-notebook layer instead of shiny-proxy's (SandyRogers)
604da99 pins some dependencies for a more reproducible build (SandyRogers)
3332498 adds a custom jupyter lab extension to redirect jupyterlab to specifi… (SandyRogers)
634b9d9 adds support for setting ENV VARs via query params. updates notebooks… (SandyRogers)
fcbbe13 Merge pull request #1 from EBI-Metagenomics/upstream-jupyter (SandyRogers)
5ce5715 cleanup of jl extension: subsume license and remove GHA (SandyRogers)
1bd99fb Adds integration tests (#2) (SandyRogers)
0f7e0ac adds integration status badge (SandyRogers)
cfbdd60 bioconda SIAMCAT install (Ales-ibt)
91d6090 Update environment.yml (Ales-ibt)
f02a5e4 Install metagenomeseq (Ales-ibt)
ff0bca7 Merge pull request #4 from EBI-Metagenomics/comparative_metagenomics (Ales-ibt)
b7cd231 Comparative metagenomics (#5) (Ales-ibt)
98aa836 docs: add SandyRogers as a contributor for code, example, and 3 more … (allcontributors[bot])
1b2a062 docs: add Ales-ibt as a contributor for code, example, and ideas (#9) (allcontributors[bot])
2f61427 Comparative metagenomics siamcat (#6) (Ales-ibt)
b4e7686 adds jupyter-lab extension with MGnify help (#12) (SandyRogers)
ad64119 updates comparative metagenomics notebook for lib upgrades (SandyRogers)
6f9c305 docs: add bebatut as a contributor for infra (#15) (allcontributors[bot])
7d1b4ee docs: add bgruening as a contributor for infra (#16) (allcontributors[bot])
f4b8887 docs: add vestalisvirginis as a contributor for ideas, code, and cont… (allcontributors[bot])
dea36fc fixes all-contributors config (SandyRogers)
7dd1c3b docs: add mberacochea as a contributor for ideas, code, and 2 more (#18) (allcontributors[bot])
b89fe4a rationalizing docker images and speeding up cache population (SandyRogers)
c93867b updates shinyproxy on GHA tests (SandyRogers)
7301308 fixes shinyproxy version in tests config (SandyRogers)
9c9c1b6 (re)adds notebooks to docker image during build (SandyRogers)
4ddef76 Static (preview) rendering (#19) (SandyRogers)
ba2dba2 Update issue templates (SandyRogers)
74e2e1b Siamcat2 interpretation plot (#20) (Ales-ibt)
47af244 adds info about deployment (SandyRogers)
d014d08 Adds static documentation (docs.mgnify.org) (#22) (SandyRogers)
88d0972 fixes case sensitive glossary links (SandyRogers)
5809641 Separating python and r kernels into their own conda envs (#23) (SandyRogers)
af478d4 simplifies mgnify_query notebook for faster rendering (SandyRogers)
691d0a8 Biohackaton2022 genomes nb (#11) (vestalisvirginis)
d809398 Added GSC workshop files (tgurbich)
3e8525f Fixes (tgurbich)
d1fe1e1 Cleaned execution printouts (tgurbich)
ad0bc28 Implemented suggestions from review (tgurbich)
fce5d7c Merge pull request #28 from EBI-Metagenomics/gsc_workshop (tgurbich)
b2869cb Corrected typos (tgurbich)
2bd0b89 Merge pull request #29 from EBI-Metagenomics/gsc_corrections (tgurbich)
980ea17 docs: add tgurbich as a contributor for ideas, code, and content (#31) (allcontributors[bot])
5c43d23 Pathways vis (#26) (Ales-ibt)
72c34b0 docs: add amartyanambiar as a contributor for code, example, and idea… (allcontributors[bot])
c1c9bc0 updates mgnifyr-cache compress (SandyRogers)
51c9fcc Multi stage build (#33) (SandyRogers)
6ccdd14 static render fixes and cleanup (SandyRogers)
f837385 Merge remote-tracking branch 'origin/main' (SandyRogers)
64f8d83 AtlantECO notebook (#35) (KateSakharova)
3a3cfdd docs: add KateSakharova as a contributor for ideas, code, and content… (allcontributors[bot])
66d2608 Push built containers to registry (#38) (SandyRogers)
f691ab1 free up disk space during preview build to make space for docker img (SandyRogers)
0c69fcc Updated submission link to reflect redirection of submit page, to the… (MGS-sails)
f099158 Added text update suggestion from Lorna (MGS-sails)
27f1de6 Merge pull request #40 from EBI-Metagenomics/data-flow-updates (MGS-sails)
9934d5b adds details of sourmash command and parameters used by MGnify, to docs (SandyRogers)
0449b05 Adds documentation page about "additional analyses" (RO-Crates) (#41) (SandyRogers)
f821261 Update MGnifyR repo (#42) (SandyRogers)
0114f51 Update genome-viewer.md (tgurbich)
088873e Update src/docs/genome-viewer.md (tgurbich)
6fbc862 Merge pull request #44 from EBI-Metagenomics/genome-viewer-update (tgurbich)
90b8256 Small bug fixed on Pathways Vis notebook (#43) (Ales-ibt)
cca64d9 Multiomics docu (#45) (Ales-ibt)
639f06c do not render atlanteco notebook into docs (SandyRogers)
953b077 extra try to not render atlanteco notebook (SandyRogers)
9ae8aec fix "Search for Samples or Studies" R notebook: sparse df merge (SandyRogers)
e2dfe77 Fix/update atlanteco (#47) (KateSakharova)
a37a29b Started adding in the comparative metagenomics in mia code (SHillman836)
2515770 quick gitignore change (SHillman836)
6dbcc07 added .Rdata to gitignore (SHillman836)
cf0c292 finished part 2 (SHillman836)
479e65c finished notebook draft (SHillman836)
ef696f3 Merge remote-tracking branch 'upstream/main' into adding-mia-examples (SHillman836)
7a5b6db updated changes (SHillman836)
806fa4d updated changes (SHillman836)
88d8d75 updated changes (SHillman836)
0dca210 changed file format (SHillman836)
994e0f1 Delete notebooks.Rproj (TuomasBorman)
902 changes: 902 additions & 0 deletions
src/notebooks/R Mia Examples/Comparative-Metagenomics.ipynb
Large diffs are not rendered by default.
143 changes: 143 additions & 0 deletions
src/notebooks/R Mia Examples/Fetch-Analyses-metadata-for-a-Study.ipynb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,143 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"---\n", | ||
"title: \"Fetch analyses metadata for a study\"\n", | ||
"author:\n", | ||
" - name: Noah de Gunst\n", | ||
" affiliation:\n", | ||
" - id: mia\n", | ||
" name: Department of Computing, University of Turku, Finland\n", | ||
" - name: Sam Hillman\n", | ||
" affiliation:\n", | ||
" - id: mia\n", | ||
" name: Department of Computing, University of Turku, Finland\n", | ||
"categories: [R]\n", | ||
"execute: \n", | ||
" eval: true\n", | ||
"---\n", | ||
"\n", | ||
"::: {style=\"max-width:1200px\"}\n", | ||
"\n", | ||
"\n", | ||
"# Fetch a study using MGnifyR; download the metadata for all of its analyses\n", | ||
"\n", | ||
"The [MGnify API](https://www.ebi.ac.uk/metagenomics/api/v1) returns data and \n", | ||
"relationships as JSON. [MGnifyR](https://www.bioconductor.org/packages/release/bioc/html/MGnifyR.html) \n", | ||
"is a package to help you read MGnify data into your R analyses.\n", | ||
"\n", | ||
"You can find all of the other \"API endpoints\" using the [Browsable API interface in your web browser](https://www.ebi.ac.uk/metagenomics/api/v1).\n", | ||
"\n", | ||
"This is an interactive code notebook (a Jupyter Notebook). To run this code, click \n", | ||
"into each cell and press the ▶ button in the top toolbar, or press `shift+enter`.\n", | ||
"\n", | ||
"------------------------------------------------------------------------\n", | ||
":::\n", | ||
"\n", | ||
"#### Setting the study accession\n", | ||
"First, we need to specify the accession number of the study we're working with. \n", | ||
"This can be done by setting the `mgnify_study_accession` variable. The accession \n", | ||
"number uniquely identifies the study in the MGnify database.\n", | ||
"\n", | ||
"```{r}\n", | ||
"#| output: false\n", | ||
"\n", | ||
"source(\"./utils/variable_utils.r\")\n", | ||
"\n", | ||
"mgnify_study_accession <- get_variable_from_link_or_input('MGYS', 'Study Accession', \n", | ||
" 'MGYS00005116')\n", | ||
"\n", | ||
"# You can also just directly set the accession variable in code, like this:\n", | ||
"# mgnify_study_accession <- \"MGYS00005292\"\n", | ||
"```\n", | ||
"\n", | ||
"#### Constructing an MgnifyClient object to access the database\n", | ||
"To interact with the MGnify database, we need to create an MgnifyClient object. \n", | ||
"This object allows us to fetch data from MGnify, and we can configure it to use \n", | ||
"a cache for efficiency. \n", | ||
"\n", | ||
"```{r}\n", | ||
"#| output: false\n", | ||
"# Importing the libraries\n", | ||
"library(vegan)\n", | ||
"library(ggplot2)\n", | ||
"library(mia)\n", | ||
"library(MGnifyR)\n", | ||
"\n", | ||
"# Check if the cache directory exists, if not, create it\n", | ||
"if (!dir.exists(\"./.mgnify_cache\")) {\n", | ||
" dir.create(\"./.mgnify_cache\", recursive = TRUE)\n", | ||
"}\n", | ||
"\n", | ||
"# Create the MgnifyClient object with caching enabled\n", | ||
"mg <- MgnifyClient(usecache = TRUE, cache_dir = \"./.mgnify_cache\")\n", | ||
"```\n", | ||
"\n", | ||
"#### Displaying the help file\n", | ||
"\n", | ||
"```{r}\n", | ||
"#| output: false\n", | ||
"library(IRdisplay)\n", | ||
"display_markdown(file = '../_resources/mgnifyr_help.md')\n", | ||
"```\n", | ||
"\n", | ||
"## Fetch a list of the Analyses for the Study\n", | ||
"Using the MgnifyClient object, we can search for all analyses associated with the \n", | ||
"study accession number we set earlier. This will return a list of analysis accession \n", | ||
"numbers.\n", | ||
"\n", | ||
"```{r}\n", | ||
"#| output: false\n", | ||
"analyses_accessions <- searchAnalysis(mg, \"studies\", mgnify_study_accession)\n", | ||
"analyses_accessions\n", | ||
"```\n", | ||
"\n", | ||
"## Download metadata for the first 10 Analyses\n", | ||
"...and put it into a dataframe.\n", | ||
"\n", | ||
"```{r}\n", | ||
"#| output: false\n", | ||
"analyses_metadata_df <- getMetadata(mg, head(analyses_accessions, 10))\n", | ||
"```\n", | ||
"\n", | ||
"## Display metadata\n", | ||
"The table could be big, so let's look at a sample of it (`head`).\n", | ||
"\n", | ||
"```{r}\n", | ||
"#| output: false\n", | ||
"t(head(analyses_metadata_df))\n", | ||
"```\n", | ||
"\n", | ||
"## Download the data to a multi-assay data object\n", | ||
"\n", | ||
"> [mia](https://microbiome.github.io/mia/) is a Bioconductor package designed to \n", | ||
"import, store and analyze microbiome data using an object called a `TreeSummarizedExperiment`. \n", | ||
"This is a tailored data container optimized for microbiome data analysis. Being \n", | ||
"built on the `SummarizedExperiment` class, the miaverse integrates seamlessly into the \n", | ||
"extensive `SummarizedExperiment` ecosystem. In this example we download the MGnify \n", | ||
"data to a `MultiAssayExperiment` (MAE), which contains multiple `TreeSummarizedExperiment` objects.\n", | ||
"\n", | ||
"\n", | ||
"```{r}\n", | ||
"#| output: false\n", | ||
"mae <- getResult(mg, accession = analyses_accessions)\n", | ||
"```\n", | ||
"\n", | ||
"You can use further `MGnifyR` features, for example to download data. Check the Cheat \n", | ||
"Sheet at the top for more." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |
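The notebook above limits the metadata download to the first 10 accessions and then prints a transposed `head` of the table. A minimal sketch of those two base-R steps on made-up data follows; the accession strings and column names are illustrative assumptions, not real MGnify output.

```r
# Hypothetical accession IDs standing in for the result of searchAnalysis()
accessions <- sprintf("MGYA%08d", 1:25)

# head(x, 10) keeps only the first 10 elements
first_ten <- head(accessions, 10)

# Toy metadata table; real getMetadata() output has many more columns
metadata <- data.frame(
  analysis_accession = first_ten,
  pipeline_version = rep("5.0", 10),
  stringsAsFactors = FALSE
)
rownames(metadata) <- first_ten

# head() defaults to 6 rows; t() flips them so each analysis becomes a column
preview <- t(head(metadata))
print(dim(preview))
```

Transposing is mainly a display convenience: wide metadata tables with few rows are easier to read when each analysis is a column.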
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
# Help with MGnifyR | ||
|
||
MGnifyR is an R package that provides a convenient way for R users to access data from [the MGnify API](https://www.ebi.ac.uk/metagenomics/api/). | ||
|
||
Detailed help for each function is available in R using the standard `?function_name` command. | ||
|
||
A vignette is available containing a reasonably verbose overview of the main functionality. | ||
This can be read either within R with the `vignette("MGnifyR")` command, or [on the Bioconductor vignette website](https://www.bioconductor.org/packages/release/bioc/vignettes/MGnifyR/inst/doc/MGnifyR.html). | ||
|
||
## MGnifyR Command cheat sheet | ||
|
||
For a full list of key MGnifyR functions, please look at the [MGnifyR website](https://ebi-metagenomics.github.io/MGnifyR/reference/index.html). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# Retrieve name and URL for a specific pathway in the KEGG database | ||
get_pathway_info <- function(pathway) { | ||
pathway <- paste("map", pathway, sep = "") | ||
pathway_name <- keggList(pathway)[[1]] | ||
pathway_url <- paste("https://www.kegg.jp/pathway/", pathway, sep = "") | ||
return(list(pathway_name = pathway_name, pathway_url = pathway_url)) | ||
} | ||
|
||
|
||
# Function to prompt users for pathway selection and return custom pathway IDs | ||
PathwaysSelection <- function() { | ||
display_markdown("#### Pathways Selection :\n\n | ||
- For the most general & most complete pathways, input 'G'\n\n | ||
- Press Enter to generate the most complete pathways\n\n | ||
- To add custom pathways, input pathway numbers (ex: 00053,01220)") | ||
|
||
flush.console() | ||
CUSTOM_PATHWAY_IDS <- get_variable_from_link_or_input('CUSTOM_PATHWAY_IDS', name = 'Pathways Accession', default = '') | ||
|
||
if (CUSTOM_PATHWAY_IDS == "") { | ||
CUSTOM_PATHWAY_IDS <- list() | ||
} else if (CUSTOM_PATHWAY_IDS == "G") { | ||
CUSTOM_PATHWAY_IDS <- list("00010", "00020", "00030", "00061", "01232","00240", "00190") | ||
} else { | ||
CUSTOM_PATHWAY_IDS <- strsplit(CUSTOM_PATHWAY_IDS, ",")[[1]] | ||
} | ||
|
||
message(if (length(CUSTOM_PATHWAY_IDS) > 0) { | ||
paste("\nUsing", CUSTOM_PATHWAY_IDS, " - ", sapply(CUSTOM_PATHWAY_IDS, function(id) paste(get_pathway_info(id)[1]," : ",get_pathway_info(id)[2])), "as a Custom Pathway") | ||
} else { | ||
"\nUsing NONE as a Custom Pathway" | ||
}) | ||
return(CUSTOM_PATHWAY_IDS) | ||
} | ||
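`PathwaysSelection` splits the user's comma-separated input into IDs, and `get_pathway_info` prefixes each ID with `map`. A minimal sketch of that parsing in base R is shown here. Whitespace trimming and dropping empty entries are added as assumptions (the original does neither), and `parse_pathway_ids` is a hypothetical helper name.

```r
# Hypothetical helper: turn "00053, 01220" into KEGG map identifiers
parse_pathway_ids <- function(input) {
  if (is.na(input) || input == "") {
    return(character(0))
  }
  ids <- trimws(strsplit(input, ",", fixed = TRUE)[[1]])
  ids <- ids[nzchar(ids)]  # drop empty entries such as a trailing comma
  if (length(ids) == 0) {
    return(character(0))   # paste0("map", character(0)) would return "map"
  }
  paste0("map", ids)
}

print(parse_pathway_ids("00053, 01220"))
print(parse_pathway_ids(""))
```

Trimming matters because an input such as `00053, 01220` would otherwise yield the ID `" 01220"` with a leading space.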
|
||
|
||
# Copy generated pathview figures into `pathway_plots/`, remove leftover PNG/XML files from the working directory, and display the figures | ||
generatePathwayPlots <- function() { | ||
# Create the output directory if it does not already exist | ||
if (!dir.exists("pathway_plots")) { | ||
dir.create("pathway_plots") | ||
} | ||
|
||
# `pattern` is a regular expression, not a shell glob | ||
file.copy(from = list.files(pattern = "pathview\\.png$"), to = "./pathway_plots/", overwrite = TRUE) | ||
|
||
# Remove the intermediate PNG and XML files left in the working directory | ||
png_files <- list.files(path = ".", pattern = "\\.png$") | ||
xml_files <- list.files(path = ".", pattern = "\\.xml$") | ||
files <- c(png_files, xml_files) | ||
output <- capture.output({ | ||
unlink(files) | ||
}) | ||
|
||
# Accessing the png files and displaying it | ||
images <- list.files("pathway_plots", full.names = TRUE) | ||
|
||
for (pathway in images) { | ||
display_markdown(get_pathway_info(gsub("[^0-9]", "", basename(pathway)))$pathway_name) | ||
display_png(file = pathway) | ||
} | ||
} |
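`list.files(pattern = ...)` takes a regular expression, not a shell glob, so patterns such as `*.png` are safer written as anchored regexes or converted with `glob2rx` from the `utils` package. A small sketch in a temporary directory, with made-up file names:

```r
# Work in a throwaway directory so nothing real is touched
tmp <- file.path(tempdir(), "pattern_demo")
dir.create(tmp, showWarnings = FALSE)
demo_names <- c("map00010.pathview.png", "map00010.png",
                "map00010.xml", "png_notes.txt")
file.create(file.path(tmp, demo_names))

# Anchored regular expressions match on the file extension only
png_files <- list.files(tmp, pattern = "\\.png$")
pathview_files <- list.files(tmp, pattern = "pathview\\.png$")

# glob2rx() converts a shell-style glob into the equivalent regex
xml_files <- list.files(tmp, pattern = glob2rx("*.xml"))

print(png_files)
print(pathview_files)
print(xml_files)

# Clean up the temporary directory
unlink(tmp, recursive = TRUE)
```

Note that `png_notes.txt` is not picked up by the anchored pattern, whereas an unanchored pattern containing `png` would match it.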
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
library(glue) | ||
|
||
get_variable_from_link_or_input <- function(variable, name = 'accession', default = NA) { | ||
# Get a variable value, either from an ENV VAR that would have been set by the jlab_query_params extension, or through direct user input. | ||
var <- Sys.getenv(variable, unset = NA) | ||
if (!is.na(var)) { | ||
print(glue('Using {name} = {var} from the link you followed.')) | ||
} else { | ||
determiner <- if (tolower(substr(name, 1, 1)) %in% c('a', 'e', 'i', 'o', 'u')) 'an' else 'a' | ||
var <- readline(prompt = glue("Type {determiner} {name} [default: {default}]: ")) | ||
} | ||
var <- ifelse(is.na(var) || var == '', default, var) | ||
print(glue('Using "{var}" as {name}')) | ||
var | ||
} | ||
|
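`get_variable_from_link_or_input` prefers an environment variable, falls back to a prompt, and finally to the default. The fallback chain can be sketched without the prompt, which is convenient for non-interactive runs. `resolve_variable` and the `DEMO_MGYS` variable name are hypothetical and not part of this PR.

```r
# Hypothetical non-interactive variant: environment variable first, then default
resolve_variable <- function(variable, default = NA_character_) {
  value <- Sys.getenv(variable, unset = NA)
  if (is.na(value) || value == "") default else value
}

# Simulate a value set by the query-params extension
Sys.setenv(DEMO_MGYS = "MGYS00005116")
from_env <- resolve_variable("DEMO_MGYS", default = "MGYS00005292")
print(from_env)

# With the variable unset, the default is used
Sys.unsetenv("DEMO_MGYS")
from_default <- resolve_variable("DEMO_MGYS", default = "MGYS00005292")
print(from_default)
```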