Update tutorial.md #4937

Open
wants to merge 24 commits into
base: main
Changes from 23 commits
Original file line number Diff line number Diff line change
@@ -38,7 +38,7 @@ redirect_from:
- /topics/proteomics/tutorials/clinical-mp-database-generation/tutorial
---

Metaproteomics is the large-scale characterization of the entire complement of proteins expressed by microbiota. However, metaproteomics analysis of clinical samples is challenged by the presence of abundant human (host) proteins which hampers the confident detection of lower abundant microbial proteins {% cite Batut2018 %} ; [{% cite Jagtap2015 %} .
Metaproteomics is the large-scale characterization of the entire complement of proteins expressed by microbiota. However, metaproteomics analysis of clinical samples is challenged by the presence of abundant human (host) proteins which hampers the confident detection of lower abundant microbial proteins {% cite Batut2018 %} ; {% cite Jagtap2015 %} .

To address this, we used tandem mass spectrometry (MS/MS) and bioinformatics tools on the Galaxy platform to develop a metaproteomics workflow that characterizes the metaproteomes of clinical samples. This clinical metaproteomics workflow has potential clinical applications, such as characterizing secondary infections during COVID-19 and microbiome changes in cystic fibrosis, and can also address broader research questions about host-microbe interactions.

@@ -106,7 +106,7 @@ The first workflow for the clinical metaproteomics data analysis is the Database
>
> 1. **Import the workflow** into Galaxy:
>
> {% snippet faqs/galaxy/workflows_run_trs.md path="topics/proteomics/tutorials/clinical-mp-database-generation/workflows/main_workflow.ga" title="Pretreatments" %}
> {% snippet faqs/galaxy/workflows_run_trs.md path="`https://usegalaxy.eu/u/galaxyp/w/wf1databasegenerationworkflow`" title="Pretreatments" %}

🚫 [rdjsonl] <GTN:036> reported by reviewdog 🐶
The linked file (https://usegalaxy.eu/u/galaxyp/w/wf1databasegenerationworkflow) could not be found.

>
>
> 2. Run **Workflow** {% icon workflow %} using the following parameters:
@@ -167,7 +167,7 @@ For this tutorial, a literature survey was conducted to obtain 118 taxonomic spe
## Merging databases to obtain a large comprehensive database for MetaNovo
Once generated, the Species UniProt database (~3.38 million sequences) will be merged with the Human SwissProt database (reviewed only; ~20.4K sequences) and contaminant (cRAP) sequences database (116 sequences) and filtered to generate the large comprehensive database (~2.59 million sequences). The large comprehensive database will be used to generate a compact database using MetaNovo, which is much more manageable.
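
As a rough illustration of the merge-and-filter step described above, the sketch below concatenates several FASTA databases and keeps only the first record for each distinct sequence. It assumes that "filtered" means removing duplicate sequences, which this page does not confirm; the helper names (`read_fasta`, `merge_unique`) and the toy sequences are hypothetical and are not part of the Galaxy workflow.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_unique(*fasta_texts):
    """Concatenate databases, keeping the first record for each distinct sequence.

    Assumption: duplicate removal stands in for the tutorial's filtering step.
    """
    seen = set()
    merged = []
    for text in fasta_texts:
        for header, seq in read_fasta(text):
            if seq not in seen:
                seen.add(seq)
                merged.append((header, seq))
    return merged


# Toy stand-ins for the species, human, and cRAP databases (invented data).
species = ">sp1\nMKT\nAAA\n>sp2\nGGG\n"
human = ">hum1\nMKTAAA\n>hum2\nCCC\n"
crap = ">crap1\nGGG\n"

merged = merge_unique(species, human, crap)
print([header for header, _ in merged])
```

In this toy input, `hum1` and `crap1` repeat sequences already seen, so only three of the five records survive, which mirrors how the merged database can end up smaller than the sum of its inputs.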

> <hands-on-title> Download contaminants with **Protein Database Downloader </hands-on-title>
> <hands-on-title> Download contaminants with **Protein Database Downloader** </hands-on-title>
>
> 1. {% tool [Protein Database Downloader](toolshed.g2.bx.psu.edu/repos/galaxyp/dbbuilder/dbbuilder/0.3.4) %} with the following parameters:
> - *"Download from?"*: `cRAP (contaminants)`
Original file line number Diff line number Diff line change
@@ -97,6 +97,15 @@ This step is to identify proteins based on mass spectrometry data. The algorithm
>
{: .hands_on}

> # Import Workflow
> <hands-on-title>Running the Workflow</hands-on-title>
>
> 7. **Import the workflow** into Galaxy:
>
> {% snippet faqs/galaxy/workflows_run_trs.md path="`https://usegalaxy.eu/u/galaxyp/w/wf2discovery-workflow`" title="Pretreatments" %}

🚫 [rdjsonl] <GTN:036> reported by reviewdog 🐶
The linked file (https://usegalaxy.eu/u/galaxyp/w/wf2discovery-workflow) could not be found.

>

reported by reviewdog 🐶
A None was opened, but closed with hands_on on line 106

{: .hands_on}


# Peptide identification
Using the compact database generated by MetaNovo as the input database, we will match MS/MS data to peptide sequences via sequence database searching.
@@ -458,7 +467,7 @@ MaxQuant is an MS-based proteomics platform that is capable of processing raw da
> <question-title></question-title>
>
> 1. What is the Experimental Design file for MaxQuant?
> >
>
> > <solution-title></solution-title>
> >
> > 1. In MaxQuant, the **Experimental Design** file is used to specify the experimental conditions, sample groups, and the relationships between different samples in a proteomics experiment. This file is a crucial component of the MaxQuant analysis process because it helps the software correctly organize and analyze the mass spectrometry data. The **Experimental Design** file typically has a ".txt" extension and is a tab-delimited text file. Here's what you might include in an Experimental Design file for MaxQuant: **Sample Names** (You specify the names of each sample in your experiment. These names should be consistent with the naming conventions used in your raw data files.), **Experimental Conditions** (You define the experimental conditions or treatment groups associated with each sample. For example, you might have control and treated groups, and you would assign the appropriate condition to each sample.), **Replicates** (You indicate the replicates for each sample, which is important for assessing the statistical significance of your results. Replicates are typically denoted by numeric values (e.g., "1," "2," "3") or by unique identifiers (e.g., "Replicate A," "Replicate B")), **Labels** (If you're using isobaric labeling methods like TMT (Tandem Mass Tag) or iTRAQ (Isobaric Tags for Relative and Absolute Quantitation), you specify the labels associated with each sample. This is important for quantification.), **Other Metadata** (You can include additional metadata relevant to your experiment, such as the biological source, time points, or any other information that helps describe the samples and experimental conditions.)
Original file line number Diff line number Diff line change
@@ -92,7 +92,16 @@ Interestingly, the PepQuery tool does not rely on searching peptides against a r
> 6. Users can create a database collection of the MGF files.
>
> {% snippet faqs/galaxy/datasets_add_tag.md %}
>
{: .hands_on}

# Import Workflow
> <hands-on-title>Running the Workflow</hands-on-title>
>
> 1. **Import the workflow** into Galaxy:
>
> {% snippet faqs/galaxy/workflows_run_trs.md path="`https://usegalaxy.eu/u/galaxyp/w/wf3verificationworkflow`" title="Pretreatments" %}

🚫 [rdjsonl] <GTN:036> reported by reviewdog 🐶
The linked file (https://usegalaxy.eu/u/galaxyp/w/wf3verificationworkflow) could not be found.

> 2. Import and Run the workflow.
{: .hands_on}

# Extraction of Microbial Peptides from SearchGUI/PeptideShaker and MaxQuant
@@ -273,7 +282,8 @@ We will use the Query Tabular tool {% cite Johnson2019 %} to search the PepQuery
> > <comment-title>SQL Query information</comment-title>
> > The query input files are the list of peptides and the peptide report we obtained from MaxQuant and SGPS. The query is matching each peptide (m.pep) from the PepQuery results to the peptide reports so that each verified peptide has its protein/protein group assigned to it.
> {: .comment}
shiltemann marked this conversation as resolved.
>
{: .hands_on}

> <hands-on-title> Remove Header with Remove beginning </hands-on-title>
>
> 1. {% tool [Remove beginning](Remove beginning1) %} with the following parameters:
@@ -332,8 +342,9 @@ Again, we will use the Query Tabular tool to retrieve UniProt IDs (accession num
> - *"Use first line as column names"*: `Yes`
> - *"Specify Column Names (comma-separated list)"*: `pep,prot`
> ` *"SQL Query to generate tabular output"*: `SELECT distinct(prot) AS Accession
> from t1`
> *"include query result column headers"*: `No`
> from t1`
>
> - *"include query result column headers"*: `No`
>
>
{: .hands_on}
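
To make the Query Tabular step above more concrete, here is a minimal sketch that runs the same `SELECT distinct(prot) AS Accession from t1` statement against an in-memory SQLite table with the column names `pep,prot`. The peptide and accession values are invented for illustration, and the `distinct_accessions` helper is hypothetical; in Galaxy, Query Tabular loads the tabular dataset itself.

```python
import sqlite3


def distinct_accessions(rows):
    """Return the unique protein accessions from (pep, prot) rows, sorted."""
    conn = sqlite3.connect(":memory:")
    try:
        # Mirror the tutorial's table name (t1) and column names (pep, prot).
        conn.execute("CREATE TABLE t1 (pep TEXT, prot TEXT)")
        conn.executemany("INSERT INTO t1 (pep, prot) VALUES (?, ?)", rows)
        cursor = conn.execute("SELECT distinct(prot) AS Accession FROM t1")
        # SQL gives no ordering guarantee here, so sort for a stable result.
        return sorted(row[0] for row in cursor.fetchall())
    finally:
        conn.close()


# Invented example rows: two peptides map to the same accession.
example_rows = [
    ("PEPTIDEA", "P12345"),
    ("PEPTIDEB", "P12345"),
    ("PEPTIDEC", "Q67890"),
]

print(distinct_accessions(example_rows))
```

Setting *"include query result column headers"* to `No` corresponds to returning only the values, without the `Accession` header row.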
Original file line number Diff line number Diff line change
@@ -86,7 +86,15 @@ In this current workflow, we perform Quantification using the MaxQuant tool and
> 6. Create a dataset of the RAW files.
>
> {% snippet faqs/galaxy/datasets_add_tag.md %}
{: .hands_on}

# Import Workflow
> <hands-on-title>Running the Workflow</hands-on-title>
>
> 1. **Import the workflow** into Galaxy:
>
> {% snippet faqs/galaxy/workflows_run_trs.md path="`https://usegalaxy.eu/u/galaxyp/w/wf4quantitationworkflow`" title="Pretreatments" %}

🚫 [rdjsonl] <GTN:036> reported by reviewdog 🐶
The linked file (https://usegalaxy.eu/u/galaxyp/w/wf4quantitationworkflow) could not be found.

> 2. Import and Run the workflow
{: .hands_on}


@@ -96,7 +104,7 @@ In this current workflow, we perform Quantification using the MaxQuant tool and

In the [Discovery Module](https://github.com/subinamehta/training-material/blob/main/topics/proteomics/tutorials/clinical-mp-discovery/tutorial.md), we used MaxQuant to identify peptides for verification. Now, we will again use MaxQuant to further quantify the PepQuery-verified peptides, both microbial and human. More information about quantitation using MaxQuant is available, including [Label-free data analysis](https://gxy.io/GTN:T00218) and [MaxQuant and MSstats for the analysis of TMT data](https://gxy.io/GTN:T00220).

The outputs we are most interested in consist of the `MaxQuant Evidence file`, `MaxQuant Protein Group`s, and `MaxQuant Peptides`. The `MaxQuant Peptides` file will allow us to group them to generate a list of quantified microbial peptides.
The outputs we are most interested in consist of the `MaxQuant Evidence file`, `MaxQuant Protein Groups`, and `MaxQuant Peptides`. The `MaxQuant Peptides` file will allow us to group them to generate a list of quantified microbial peptides.

> <hands-on-title> Quantify verified peptides (from PepQuery2) </hands-on-title>
>
Original file line number Diff line number Diff line change
@@ -81,6 +81,12 @@ The final workflow in the array of clinical metaproteomics tutorials is the data
>
> {% snippet faqs/galaxy/datasets_add_tag.md %}
>
> # Import Workflow
> <hands-on-title>Running the Workflow</hands-on-title>
>
> 7. **Import the workflow** into Galaxy:
>
> {% snippet faqs/galaxy/workflows_run_trs.md path="`https://usegalaxy.eu/u/galaxyp/w/wf5datainterpretationworklow`" title="Pretreatments" %}

🚫 [rdjsonl] <GTN:036> reported by reviewdog 🐶
The linked file (https://usegalaxy.eu/u/galaxyp/w/wf5datainterpretationworklow) could not be found.

{: .hands_on}

