
Features

  • FMAP provides a more sensible reference protein sequence database based on UniRef90.
  • Identification of differentially-abundant genes (KEGG Orthology)
  • Mapping differentially-abundant genes to pathways (KEGG Pathway)
  • Mapping differentially-abundant genes to operons (ODB3)

Downloads

  • GitHub page
  • Example script
    • The example data are derived from Case study 4.
    • The whole process needs 1.2 GB of disk space and less than 4 GB of RAM. It takes about 1 hour with 4 CPUs (Intel 2.60 GHz).
  • Example script (custom)
    • Builds a custom database of "Fusobacterium nucleatum" and "UniRef100" using "FMAP_database.pl" and "FMAP_prepare.pl" instead of "FMAP_download.pl".
    • It takes about 6 hours with 4 CPUs (Intel 2.60 GHz).
  • FMAP_ubuntu.vmdk (virtual machine disk, login: fmapuser, password: fmap)

Requirements

  • Perl - Scripting language
  • R - Statistical computing
  • Statistics::R - Perl interface with the R statistical program
    • Use CPAN to install the module
      perl -MCPAN -e 'install Statistics::R'
    • or download the source and compile manually
      wget 'http://search.cpan.org/CPAN/authors/id/F/FA/FANGLY/Statistics-R-0.33.tar.gz'
      tar zxf Statistics-R-0.33.tar.gz
      cd Statistics-R-0.33
      perl Makefile.PL
      make
      make test
      make install
  • DIAMOND or USEARCH - Mapping program providing BLASTX search of sequencing reads
  • Linux commands: wget, cat, sort
  • Bio::DB::Taxonomy - Access to a taxonomy database (required only if you want to build a custom database)
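
The requirements listed above can be checked before running FMAP. The following is a minimal sketch, not part of FMAP itself: the function name "check_fmap_requirements" is hypothetical, and the executable names "diamond" and "usearch" are assumptions (a downloaded USEARCH binary often carries a versioned file name, so adjust as needed).

```shell
#!/bin/sh
# Hypothetical helper (not part of FMAP): report which prerequisites are found.
check_fmap_requirements() {
    missing=0

    # Command-line tools named in the requirements list.
    for cmd in perl R wget cat sort; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "ok       $cmd"
        else
            echo "MISSING  $cmd"
            missing=$((missing + 1))
        fi
    done

    # One mapping program is enough. The executable names are assumptions.
    if command -v diamond >/dev/null 2>&1 || command -v usearch >/dev/null 2>&1; then
        echo "ok       diamond or usearch"
    else
        echo "MISSING  diamond or usearch"
        missing=$((missing + 1))
    fi

    # Required Perl module: try to load it without running any code.
    if perl -MStatistics::R -e 1 >/dev/null 2>&1; then
        echo "ok       Statistics::R"
    else
        echo "MISSING  Statistics::R"
        missing=$((missing + 1))
    fi

    # Needed only when building a custom database, so it is not counted.
    if perl -MBio::DB::Taxonomy -e 1 >/dev/null 2>&1; then
        echo "ok       Bio::DB::Taxonomy"
    else
        echo "optional Bio::DB::Taxonomy not found"
    fi

    echo "$missing required item(s) missing"
}

check_fmap_requirements
```

The function only reports what it finds and installs nothing, so it is safe to run repeatedly.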

FMAP workflow

image
  • You can generate a count table for metagenomeSeq, DESeq, or edgeR using the "-c" option of "FMAP_table.pl".
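
As a rough illustration of the "-c" option, the sketch below only prints a candidate command line; it does not run FMAP. It assumes that "FMAP_table.pl" takes abundance files as positional arguments and writes the table to standard output, so check the script's own help message before relying on it. The helper name "fmap_count_table_cmd" and all file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a FMAP_table.pl command that
# requests a count table via -c.
# Assumptions: abundance files are positional arguments and the table is
# written to standard output. All file names are placeholders.
fmap_count_table_cmd() {
    out=$1      # file that would receive the count table
    shift       # remaining arguments: per-sample abundance files
    printf 'perl FMAP_table.pl -c'
    for f in "$@"; do
        printf ' %s' "$f"
    done
    printf ' > %s\n' "$out"
}

fmap_count_table_cmd count_table.txt sample1.abundance.txt sample2.abundance.txt
```

Printing the command first makes it easy to review the sample list before the table is generated.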

Case study 1 - SRP002423: Crohn's disease (8) vs. healthy (4)

  • Data type: metagenomic sequencing (454, Single-end)
  • Data preprocessing: human sequence removal
  • Sample selection: the 12 samples used in the publication PMID:23209564
  • Result tables (mapping, orthology abundance, operon, pathway): MS Excel, Text zip
  • Heatmap of differentially-abundant genes (log2 RPKM)
image
  • Differentially-abundant operons
image
  • Pathways enriched with differentially-abundant genes
image
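
The heatmap values above are log2-transformed RPKM. As a rough illustration, the sketch below applies the standard RPKM definition, reads x 10^9 / (gene length in bp x total mapped reads), and then takes log2. Whether FMAP uses exactly this normalization, how it measures length for protein references, and how it treats zero counts are not stated here, so treat those points as assumptions. The function name "log2_rpkm" and the input numbers are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: log2 of the standard RPKM value.
# Usage: log2_rpkm READ_COUNT GENE_LENGTH_BP TOTAL_MAPPED_READS
# Assumes READ_COUNT > 0; a zero count has no finite log2 value.
log2_rpkm() {
    awk -v c="$1" -v len="$2" -v n="$3" 'BEGIN {
        rpkm = c * 1e9 / (len * n)          # reads per kilobase per million reads
        printf "%.4f\n", log(rpkm) / log(2) # awk log() is natural log
    }'
}

# 200 reads on a 1000 bp gene with 1,000,000 mapped reads gives RPKM = 200.
log2_rpkm 200 1000 1000000   # prints 7.6439
```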

Case study 2 - SRP000109: ocean microbiome 25 m (20) vs. 500 m (20)

  • Data type: metagenomic sequencing (454, Single-end)
  • Result tables (mapping, orthology abundance, operon, pathway): MS Excel, Text zip
  • Photosynthesis
image
  • Caprolactam degradation
image

Case study 3 - SRP050543: root caries (9) vs. sound root surfaces (10)

  • Data type: metatranscriptomic sequencing (Illumina, Single-end, Read length: 101)
  • Data preprocessing: NGS QC Toolkit and human sequence removal
  • Result tables (mapping, orthology abundance, operon, pathway): MS Excel, Text zip
  • Heatmap of differentially-abundant genes (log2 RPKM)
image
  • Example pathways
    • Flagellar assembly
    • Bacterial chemotaxis
    • Phosphotransferase system (PTS)
image

Case study 4 - SRP044400: schizophrenia (5) vs. none (9)

  • Data type: metagenomic sequencing (Illumina, Single-end, Read length: 101)
  • Data preprocessing: NGS QC Toolkit and human sequence removal
  • Sample selection: 14 SRA samples used in the analysis were selected based on QC-passed read counts.
  • Result tables (mapping, orthology abundance, operon, pathway): MS Excel, Text zip
  • Nitrotoluene degradation pathway including pyruvate ferredoxin oxidoreductase operon
image

References

  • UniProt: a hub for protein information. Nucleic Acids Res. 2015 Jan;43(Database issue):D204-12.
  • Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014 Jan;42(Database issue):D199-205.
  • KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000 Jan 1;28(1):27-30.
  • ODB: a database for operon organizations, 2011 update. Nucleic Acids Res. 2011 Jan;39(Database issue):D552-5.
  • ODB: a database of operons accumulating known operons across multiple genomes. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D358-62.

Citation

  • Kim J, Kim MS, Koh AY, Xie Y, Zhan X.
    "FMAP: Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies"
    BMC Bioinformatics. 2016 Oct 10;17(1):420.
    PMID: 27724866

Contacts

  • Jiwoong Kim (Jiwoong.Kim@UTSouthwestern.edu)
  • Min Soo Kim (MinS.Kim@UTSouthwestern.edu)
  • Xiaowei Zhan (Xiaowei.Zhan@UTSouthwestern.edu)