Basic report for NIIOR CCI (Neurorehabilitation) and ACI (Pneumonia) groups

Report summary

Preprocessing of user data and analysis of the basic taxonomic and functional composition, without analysis of factors (sample metadata).

Created: 13/11/2018
Updated: 16/03/2022
Type: Basic report
Project: NIIOR CCI (Neurorehabilitation) and ACI (Pneumonia) groups
Uploaded samples: 82

Data quality

Assessment of raw data quality.

Number of reads

Read quantity distribution

Number of reads per sample before and after quality filtering. Quality filtering, performed with the split_libraries_fastq.py script from QIIME 1.9 (Caporaso et al. 2010), included trimming low-quality read ends (quality threshold = 20) and discarding trimmed reads shorter than 75% of their initial length. The vertical line marks the minimum required number of reads (5000).
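The filtering rule described above can be illustrated with a minimal Python sketch. It is not the split_libraries_fastq.py implementation, whose internals are not shown in this report; the function names and the choice to trim both ends base by base are assumptions made for illustration only.

```python
def trim_low_quality_ends(quals, threshold=20):
    """Return (start, end) bounds left after trimming low-quality ends.

    Assumption: bases are removed one by one from each end while their
    Phred quality is below the threshold.
    """
    start, end = 0, len(quals)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return start, end


def passes_quality_filter(quals, threshold=20, min_fraction=0.75):
    """Keep a read only if its trimmed length is at least 75% of the original."""
    if not quals:
        return False
    start, end = trim_low_quality_ends(quals, threshold)
    return (end - start) >= min_fraction * len(quals)


# Hypothetical per-base quality scores, not taken from this project.
good_read = [12, 30, 32, 35, 34, 33, 31, 30, 28, 15]  # 8 of 10 bases kept
bad_read = [10, 11, 12, 30, 31, 32, 33, 30, 9, 8]     # 5 of 10 bases kept
print(passes_quality_filter(good_read), passes_quality_filter(bad_read))
```

Samples would then be compared against the 5000-read minimum by counting the reads that pass this filter.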

Download read_quant_distr.svg

Samples with low coverage

List of samples that had an insufficient number of high-quality reads after quality filtering (< 5000 reads) and were excluded from further analysis.

All samples passed the filter.

Read classification statistics

Reads were classified using closed-reference OTU picking (uclust_ref algorithm) implemented in QIIME 1.9 (Caporaso et al. 2010) against a 16S rRNA sequence database (Greengenes v. 13.5 (DeSantis et al. 2006), 97% OTU similarity).

Proportion of classified reads

Proportion of successfully classified reads in each sample.

Download classif_reads_frac.svg

Samples with insufficient proportion of classified reads

Samples with a low proportion of classified reads (< 70%) are listed here. If any are found, it is recommended to repeat the analysis in a new project that excludes them.

All samples passed the filter.
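As a minimal sketch of the check above, the following Python function flags samples whose proportion of classified reads falls below 70%. The function name, the input layout (two dictionaries of read counts) and the example numbers are hypothetical and are not taken from this project.

```python
def low_classification_samples(total_reads, classified_reads, min_fraction=0.70):
    """Return {sample: fraction} for samples below the classification threshold."""
    flagged = {}
    for sample, total in total_reads.items():
        fraction = classified_reads.get(sample, 0) / total if total else 0.0
        if fraction < min_fraction:
            flagged[sample] = fraction
    return flagged


# Hypothetical read counts for two samples.
total = {"sample_A": 12000, "sample_B": 8000}
classified = {"sample_A": 11400, "sample_B": 5200}
flagged = low_classification_samples(total, classified)
print(flagged)
```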

Taxonomic composition

Heatmap of taxonomic composition

The interactive heatmap shows the relative abundance of major microbial taxa (columns) in the samples (rows). Use the “Heatmap settings” drop-down list to the right of the heatmap to select the taxonomic rank of interest. To make close values easier to compare, clicking a cell “freezes” its value on the legend, along with the displayed abundance of the top 10 taxa of the corresponding sample (click again, or click the cross near the sample name, to “unfreeze”). Use the Top control to switch the displayed major composition between the top features of the selected sample and the top features averaged across all samples.

Major taxa

The boxplots show the distribution of relative abundance for the 25 most abundant taxa across all samples (for each taxonomic rank). For proper display on a log scale, zero values were replaced with a pseudocount no higher than the minimum relative abundance of the major taxa.
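The zero replacement can be sketched in Python as follows. The report only states that the pseudocount is no higher than the minimum relative abundance of the major taxa; using exactly the smallest non-zero value is an assumption, as are the function name and the example values.

```python
import math


def log10_with_pseudocount(abundances):
    """Log-transform abundances, replacing zeros with a pseudocount.

    Assumption: the pseudocount equals the smallest non-zero value,
    which satisfies "no higher than the minimum" but may differ from
    the value actually used for the plots.
    """
    nonzero = [x for row in abundances.values() for x in row if x > 0]
    if not nonzero:
        raise ValueError("all abundances are zero")
    pseudocount = min(nonzero)
    return {
        taxon: [math.log10(x if x > 0 else pseudocount) for x in row]
        for taxon, row in abundances.items()
    }


# Hypothetical relative abundances (percent) of two taxa in three samples.
major = {"g__Bacteroides": [4.0, 0.0, 1.0], "g__Blautia": [0.1, 2.0, 0.0]}
logged = log10_with_pseudocount(major)
print(logged)
```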

Complete taxonomic composition

The table contains relative abundance of all microbial taxa for each taxonomic rank.

Taxonomic core

The plot shows the proportion of OTUs shared across a varying proportion of samples.

Download taxa_core.svg
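One way to compute such a core curve is sketched below in Python. It assumes that an OTU counts as present in a sample when its read count is above zero and that the denominator is the number of OTUs in the table; neither detail is stated in the report, and the function name and example table are hypothetical.

```python
def core_curve(otu_table, thresholds):
    """For each prevalence threshold, return the fraction of OTUs detected
    in at least that fraction of samples.

    otu_table maps an OTU id to its list of per-sample read counts.
    """
    n_otus = len(otu_table)
    prevalence = {
        otu: sum(1 for c in counts if c > 0) / len(counts)
        for otu, counts in otu_table.items()
    }
    return {
        t: sum(1 for p in prevalence.values() if p >= t) / n_otus
        for t in thresholds
    }


# Hypothetical table: four OTUs counted in four samples.
table = {
    "otu1": [5, 3, 9, 1],
    "otu2": [0, 2, 0, 4],
    "otu3": [0, 0, 0, 7],
    "otu4": [1, 1, 0, 2],
}
curve = core_curve(table, [0.25, 0.5, 0.75, 1.0])
print(curve)
```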

Analysis of outliers

Automatic filtering of user samples with extreme taxonomic composition (based on combined analysis of user and external data). For each sample, the median distance to its closest 50% of neighbours is computed; the distribution of these values is approximated by a normal distribution, and samples in its upper 1% tail are flagged as outliers. List of outliers:

IonXpress.019.run0, IonXpress.006.run1
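The outlier rule can be sketched in Python as below. This is an illustration under stated assumptions, not the pipeline's code: it works on a single precomputed distance matrix (the report also uses external data), it takes the closest half of the neighbours by rounding down, and it fits the normal distribution with the sample mean and standard deviation. The function name and the toy data are hypothetical.

```python
from statistics import NormalDist, mean, median, stdev


def find_outliers(names, dist, tail=0.01, neighbour_fraction=0.5):
    """Flag samples whose median distance to their closest neighbours lies
    in the upper tail of a normal distribution fitted to all such medians."""
    n = len(names)
    k = max(1, int((n - 1) * neighbour_fraction))  # size of the "closest 50%"
    scores = []
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        scores.append(median(others[:k]))
    fitted = NormalDist(mean(scores), stdev(scores))
    cutoff = fitted.inv_cdf(1 - tail)  # start of the upper 1% tail
    return [name for name, score in zip(names, scores) if score > cutoff]


# Toy data: nine samples close together on a line and one far away.
positions = list(range(9)) + [1000]
names = [f"s{i}" for i in range(len(positions))]
dist = [[abs(a - b) for b in positions] for a in positions]
print(find_outliers(names, dist))
```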

PCoA visualization based on taxonomic composition

Distribution of the samples by their taxonomic composition in reduced dimensionality. The closer two samples (points) are on the plot, the more similar their composition. Vectors show the directions in which the levels of the respective major taxa increase. Method of dimension reduction: PCoA (Principal Coordinate Analysis); dissimilarity metric: weighted UniFrac. Clicking a dot “freezes” the detailed information about the sample to the right of the plot (click again, or click the cross near the sample name, to “unfreeze”). Use the respective controls to switch between display modes with or without outliers and with or without the vectors showing major microbial “drivers”.

Enterotypes

Enterotyping (cluster analysis of samples by their composition) was performed using the Dirichlet multinomial mixtures (DMM) method for probabilistic modelling of microbial metagenomics data (Holmes et al., 2012). The optimal number of clusters was chosen as the one with the lowest Laplace approximation of the DMM model evidence. Silhouette width is a measure of clustering quality and was determined using Bray-Curtis distance. For each enterotype, there is a list of its drivers: microbial taxa that distinguish the samples belonging to the cluster from the other samples. A table of samples and their enterotypes is provided.

Number of enterotypes

3

Laplace approximation of the model evidence

53737.327

Average silhouette width of the clusters

0.079
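For reference, the average silhouette width on Bray-Curtis distance can be computed as in the following Python sketch. It uses the standard silhouette definition and assigns a width of 0 to samples in singleton clusters; whether the pipeline treats such cases the same way is an assumption, and the function names and example profiles are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    total = sum(u) + sum(v)
    return sum(abs(a - b) for a, b in zip(u, v)) / total if total else 0.0


def average_silhouette(profiles, labels):
    """Mean silhouette width over all samples for a given clustering."""
    n = len(profiles)
    dist = [[bray_curtis(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    widths = []
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist[i][j])
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            widths.append(0.0)  # singleton cluster, or only one cluster
            continue
        a = sum(own) / len(own)                                 # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())  # separation
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / n


# Hypothetical two-taxon profiles forming two clear clusters.
profiles = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [1, 1, 2, 2]
avg_width = average_silhouette(profiles, labels)
print(round(avg_width, 3))
```

Values close to 1 indicate compact, well-separated clusters, while values near 0, such as the one reported above, indicate that the clusters overlap substantially.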

Microbial drivers

Enterotype name: Enterotype 1

Table

taxon prevalence_percent
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacteriales;f__Enterobacteriaceae;g__ 4.4149
k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides 4.0240
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__;g__ 3.1686
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus 2.9642
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Ruminococcaceae;g__ 2.6196
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__ 2.1841
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Enterococcaceae;g__Enterococcus 1.7730
k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Parabacteroides 1.6645
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Blautia 1.6282
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Dorea 1.5539

Enterotype name: Enterotype 2

Table

taxon prevalence_percent
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Enterococcaceae;g__Enterococcus 43.5454
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus 6.6595
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacteriales;f__Enterobacteriaceae;g__ 4.8598
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Enterococcaceae;g__ 2.7684
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Carnobacteriaceae;g__Granulicatella 1.0144
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__;g__ 0.9119
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus 0.8913
k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus 0.8895
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacteriales;f__Enterobacteriaceae;g__Klebsiella 0.8099
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Enterococcaceae;g__Vagococcus 0.7728

Enterotype name: Enterotype 3

Table

taxon prevalence_percent
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Enterococcaceae;g__Enterococcus 7.8704
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__[Ruminococcus] 5.0795
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Dorea 4.5567
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Blautia 3.1035
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__;g__ 3.0483
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacteriales;f__Enterobacteriaceae;g__ 2.9254
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Ruminococcaceae;g__ 2.8886
k__Bacteria;p__Firmicutes;c__Erysipelotrichi;o__Erysipelotrichales;f__Erysipelotrichaceae;g__ 2.8483
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__ 2.7810
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus 2.7801

Sample-enterotype table

Samples and their enterotypes as determined by the DMM model.

Download sample_enterotypes.csv

Hierarchical clustering

The tree shows clustering of the samples by similarity of their taxonomic composition at varying levels of detail. Dissimilarity metric: weighted UniFrac; linkage: Ward’s method.

Alpha-diversity

Static plots

Shannon index

Chao1 index

Interactive plot

The measure describes the diversity of taxa within each sample. Metric: Shannon index. Clicking a dot “freezes” the displayed value on the Y axis, along with the abundance of the top 10 taxa (click it again, or click the cross near the sample name, to “unfreeze”). In addition, the mean and confidence interval appear when the mouse is over the boxplot. The controls at the top and bottom right allow the displayed data to be changed.
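The two indices named in this section can be computed from per-taxon read counts as in the Python sketch below. The logarithm base for the Shannon index is left as a parameter because the report does not state which one was used, and the Chao1 function uses the bias-corrected form; both choices, like the function names, are assumptions.

```python
import math


def shannon(counts, base=2):
    """Shannon index H = -sum(p * log(p)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total, base)
                for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate.

    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2 are the
    numbers of taxa observed exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical count vectors for two samples.
even_sample = [10, 10, 10, 10]
rare_sample = [5, 3, 1, 1, 2, 2]
print(shannon(even_sample), chao1(rare_sample))
```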

Taxa co-occurrence analysis

Co-occurrence graph

Co-occurrence of microbial genera was analyzed based on correlation analysis of their relative abundance using the SPIEC-EASI software. In the graph, vertices represent genera; pairs of highly co-occurring genera are connected with blue lines. The graph shows the members of the cooperatives, that is, groups of highly co-occurring genera corresponding to isolated connected components (singleton vertices are omitted). Parameters of the SPIEC-EASI algorithm: Meinshausen and Bühlmann neighbourhood selection method (MB), minimum lambda ratio = 0.1, number of lambda iterations = 20, model selection using the StARS algorithm (number of StARS subsamples = 50).
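Once the co-occurrence edges are known, extracting the cooperatives amounts to finding connected components of the graph. The Python sketch below shows that step only; it does not reproduce SPIEC-EASI, and the function name and the example edges are hypothetical.

```python
from collections import defaultdict


def cooperatives(edges):
    """Group genera into connected components of the co-occurrence graph.

    Genera without any edge never enter the graph, so singleton
    vertices are omitted automatically.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical edges between genera, not results from this project.
edges = [("Blautia", "Dorea"), ("Dorea", "Streptococcus"),
         ("Enterococcus", "Vagococcus")]
groups = cooperatives(edges)
print(groups)
```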

Members of the cooperatives

Genera belonging to each cooperative.

Download members_cooperative.csv

Abundance of the cooperatives

Relative abundance of each cooperative in the samples.

Download sample_cooperative.csv

Reconstruction of metabolic potential

Predicted functional composition of microbiota.

Heatmap of functional composition

The interactive heatmap shows the relative abundance of major pathways (columns) in the samples (rows). To switch between KEGG and MetaCyc nomenclatures, use the drop-down list in “Heatmap settings”. To make close values easier to compare, clicking a cell “freezes” its displayed value, along with the displayed abundance of the top features of the sample (click again, or click the cross near the sample name, to “unfreeze”). Use the Top control to switch the displayed major composition between the top features of the selected sample and the top features averaged across all samples.

Vitamin synthesis

Gut microbes are known to produce a number of vitamins. The boxplots show the median, standard deviation and quartiles of the vitamin biosynthesis pathways in the samples.

Gene groups

Relative abundance of KEGG Orthology gene groups involved in vitamin synthesis.

Download vitamin_genes.csv

Pathways

Relative abundance of pathways involved in vitamin synthesis.

Download vitamin.csv

Plots

Total relative abundance of the genes involved in vitamin biosynthesis summed across the respective pathways.
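The summation can be sketched in Python as follows. The gene group identifiers, pathway names and abundance values are placeholders, and the assumption that a gene group shared by two pathways is counted in both is not confirmed by the report.

```python
def pathway_totals(gene_abundance, pathway_genes):
    """Sum gene group relative abundance within each pathway for one sample."""
    return {
        pathway: sum(gene_abundance.get(gene, 0.0) for gene in genes)
        for pathway, genes in pathway_genes.items()
    }


# Placeholder identifiers and values for a single sample.
gene_abundance = {"gene_a": 0.25, "gene_b": 0.5, "gene_c": 0.125}
pathway_genes = {
    "pathway_1": ["gene_a", "gene_b"],
    "pathway_2": ["gene_b", "gene_c", "gene_missing"],
}
totals = pathway_totals(gene_abundance, pathway_genes)
print(totals)
```

The same summation applies to the butyrate and propionate plots further down.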

KEGG pathways

Complete functional composition

The table contains relative abundance of all functional features.

Synthesis of short-chain fatty acids (SCFAs)

Gut microbes are known to produce SCFAs. The boxplots show the median, standard deviation and quartiles of the SCFA biosynthesis pathways in the samples.

Synthesis of butyrate

Gene groups

Relative abundance of KEGG Orthology gene groups involved in butyrate synthesis.

Download butyrate_genes.csv

Pathways

Relative abundance of pathways involved in butyrate synthesis.

Download butyrate.csv

Plots

Total relative abundance of the genes involved in butyrate synthesis summed across the respective pathways.

KEGG pathways

Synthesis of propionate

Gene groups

Relative abundance of KEGG Orthology gene groups involved in propionate synthesis.

Download propionate_genes.csv

Pathways

Relative abundance of pathways involved in propionate synthesis.

Download propionate.csv

Plots

Total relative abundance of the genes involved in propionate synthesis summed across the respective pathways.

KEGG pathways

All feature tables

All calculated features can be downloaded here.

Alpha-diversity data

The table contains alpha-diversity values of all samples.

Download alpha_diversity.xlsx

Complete taxonomic composition

The table contains relative abundance of all microbial taxa for each taxonomic rank.

Complete functional composition

The table contains relative abundance of all functional features.

Beta-diversity data

Table of weighted UniFrac distances between samples.

Download beta_diversity.csv

datalab: 3.10.0
knb_lib: 4.8.71
knb_interactive: 2.0.2