MegaPath-Nano: Accurate Compositional Analysis and Drug-level Antimicrobial Resistance Detection Software for Oxford Nanopore Long-read Metagenomics; MegaPath-Nano-Amplicon: filtering module for metagenomic amplicon data


The ultra-long ONT sequencing technology benefits metagenomic profiling with high alignment specificity. Yet its high per-read sequencing error rate remains a hurdle to distinguishing among closely related pathogens at lower taxonomic ranks and to refined drug-level antimicrobial resistance (AMR) prediction. In this study, we present MegaPath-Nano, successor to the NGS-based MegaPath, an accurate compositional analysis software with drug-level AMR identification for ONT metagenomic sequencing data. MegaPath-Nano takes ONT raw reads as input and performs data cleansing, taxonomic profiling, and drug-level AMR detection within a single workflow. The major outputs of the tool are 1) a taxonomic profiling report down to strain level with estimated abundance; and 2) an integrated class- and drug-level AMR report in tabular format with supporting information from different detection tools. As a key feature for taxonomic profiling, MegaPath-Nano performs a global optimization on multiple alignments and reassigns predictably misplaced reads to a single most likely species. To perform a consistent and comprehensive AMR detection analysis, MegaPath-Nano uses a novel consensus-based approach that incorporates a collection of AMR software and databases. We benchmarked MegaPath-Nano against other state-of-the-art software, including WIMP, Kraken 2, MetaMaps, ARMA and ARGpore, using real sequencing data, and it achieved the best performance in both tasks. MegaPath-Nano is therefore a well-rounded ONT metagenomic tool for clinical use in practice.
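To give a feel for what an integrated AMR report "with supporting information from different detection tools" might look like, here is a toy Python sketch that tallies which tools support each drug-level call. It is not MegaPath-Nano's actual consensus algorithm, which this README does not describe; the function name `merge_amr_calls`, the tool names, and the example calls are all hypothetical.

```python
from collections import defaultdict


def merge_amr_calls(calls_by_tool):
    """Merge per-tool drug calls into rows of (drug, supporting_tools).

    calls_by_tool maps a tool name to the set of drugs that tool flagged.
    Rows are ordered by number of supporting tools (descending), then by
    drug name, so the best-supported calls come first.
    """
    support = defaultdict(list)
    for tool in sorted(calls_by_tool):
        for drug in calls_by_tool[tool]:
            support[drug].append(tool)
    return sorted(support.items(), key=lambda row: (-len(row[1]), row[0]))


# Hypothetical calls for illustration only, not real MegaPath-Nano output.
example = {
    "tool_a": {"ciprofloxacin", "ampicillin"},
    "tool_b": {"ampicillin"},
    "tool_c": {"ampicillin", "gentamicin"},
}

report = merge_amr_calls(example)
for drug, tools in report:
    # One tab-separated row per drug: name, support count, supporting tools.
    print(f"{drug}\t{len(tools)}\t{','.join(tools)}")
```

A table like this makes it easy to see at a glance which calls are backed by several tools and which rest on a single one.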


Storage requirement: 80G
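
Before installing, it can be worth checking that the target disk has enough room. The sketch below uses Python's standard `shutil.disk_usage`; it reads "80G" as 80 GiB, which is an assumption, and the helper name `has_enough_space` is made up for this example.

```python
import shutil

# "80G" from the storage requirement, interpreted here as 80 GiB (assumption).
REQUIRED_BYTES = 80 * 1024**3


def has_enough_space(path=".", required=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has `required` free bytes."""
    return shutil.disk_usage(path).free >= required


if has_enough_space("."):
    print("Enough free space for the installation and databases.")
else:
    print("Less than 80G free in the current directory's filesystem.")
```

Pass the directory where the databases will be stored instead of `"."` if it lives on a different filesystem.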

Option 1: Bioconda

# prioritize channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

conda create -n mpn -c bioconda megapath-nano
conda activate mpn

Option 2: Conda Virtual Environment Setup

# prioritize channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

conda create -n mpn python=3.6.10
conda activate mpn

# installing all dependencies for both modules
conda install pandas psutil pybedtools porechop==0.2.4 bioconvert seqtk minimap2 bcftools samtools==1.9 'pysam>=0.16.0' tabulate cgecore==1.5.6 "ncbi-amrfinderplus>=3" "rgi>=5"
# MegaPath-Nano-Amplicon filter module
conda install clair=2.1.1 parallel=20191122 

# git clone MegaPath-Nano
git clone --depth 1

# MegaPath-Nano-Amplicon filter module
cd MegaPath-Nano/bin/realignment/realign/
g++ -std=c++14 -O1 -shared -fPIC -o realigner ssw_cpp.cpp ssw.c realigner.cpp
g++ -std=c++11 -shared -fPIC -o debruijn_graph -O3 debruijn_graph.cpp
# libssw.so is the usual output name for the SSW shared library (assumed here)
gcc -Wall -O3 -pipe -fPIC -shared -rdynamic -o libssw.so ssw.c ssw.h
cd - 
cd MegaPath-Nano/bin/Clair-ensemble/Clair.beta.ensemble.cpu/clair/
g++ ensemble.cpp -o ensemble
cd -
cd MegaPath-Nano/bin/samtools-1.13
./configure && make && make install
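
After a manual setup like the one above, a quick check that the command-line dependencies are on `PATH` can save a failed run later. This is a minimal sketch using Python's standard `shutil.which`; the list of executable names is an assumption derived from the conda packages installed above, and `missing_tools` is a hypothetical helper.

```python
import shutil

# Executable names assumed from the conda packages listed in the install step.
TOOLS = ["minimap2", "samtools", "bcftools", "seqtk", "porechop", "rgi", "amrfinder"]


def missing_tools(tools=TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

Run it inside the activated `mpn` environment so that the environment's `bin` directory is searched.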

Option 3: Docker

sudo docker build -f ./Dockerfile -t mpn_image . 
sudo docker run -it mpn_image /bin/bash

Pre-built Database Download

# Option 1, Bioconda: cd ${CONDA_PREFIX}/MegaPath-Nano
# conda info --env can show the ${CONDA_PREFIX} in the current environment.
# Option 2, Conda Virtual Env: cd ./MegaPath-Nano (the git clone)
# Option 3, Docker: cd /opt/MegaPath-Nano

# Taxon
wget -c -O - | tar -xvz

rgi load --card_json bin/amr_db/card/card.json
amrfinder -u

# Amplicon filter module
wget -c -O - | tar -xvz

Alternative: Online Database Installation for taxon and AMR detection

The latest RefSeq database can be downloaded with the scripts under db_preparation/.

# Taxon
# download RefSeq:
./ [${DB_DIR}=MegaPath-Nano/genomes/refseq/]

# build assembly metadata:
./ [${DB_DIR}=MegaPath-Nano/genomes/refseq/] [${ASSEMBLY_DIR}=MegaPath-Nano/genomes/]

# generate config files:
./ [${DB_DIR}=MegaPath-Nano/genomes/refseq/] [${CONFIG_DIR}=MegaPath-Nano/config/]

# prepare SQL db data:
./ [${DB_DIR}=MegaPath-Nano/genomes/refseq/] [${SQL_DIR}=MegaPath-Nano/db/]

# (optional) add custom FASTA sequences to the decoy database 
python --decoy_fasta ${fasta}

# prepare AMR databases:

Basic usage

(1) Run taxonomic analysis and AMR detection module

python --query ${fq/fa} [options]

required arguments:
  --query                     Query file (fastq or fasta)

optional arguments:
  --max_aligner_thread INT    Maximum number of threads used by the aligner, default: 64. The actual number of threads is min(available cores, threads specified)
  --output_prefix             Output Prefix, default: query file name
  --output_folder             Output folder, default: current working directory 
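
The thread rule for `--max_aligner_thread` can be read as a one-line `min`. The following sketch only illustrates that arithmetic and is not taken from the MegaPath-Nano source; `effective_threads` is a hypothetical name, and the fallback to 1 when the core count is unknown is an assumption.

```python
import os


def effective_threads(requested=64, available=None):
    """Return min(available cores, requested threads), per the option text."""
    if available is None:
        # os.cpu_count() may return None; fall back to 1 (assumption).
        available = os.cpu_count() or 1
    return min(available, requested)


# On an 8-core machine the default of 64 is capped at 8.
print(effective_threads(64, available=8))
# A smaller request is honored as given.
print(effective_threads(4, available=8))
```

So leaving the default in place on a small machine is harmless, and a lower value is only needed to leave cores free for other work.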

(2) Run taxonomic analysis module only

python --query ${fq/fa} --taxon_module_only [options]

(3) Run AMR detection module only with FASTQ/FASTA

python --query ${fq/fa} --AMR_module_only [options]

(4) Filter FQ/FA only: Adaptor trimming, read filtering and trimming, human or decoy filtering

python --query ${fq/fa} --filter_fq_only [options]

For all available options, please check
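
When running many samples, modes (1) to (4) can be driven from a small wrapper that assembles the command line. The sketch below only builds and prints the command; the flags come from the usage text above, while `build_command`, the mode labels, and the script path `path/to/main_script.py` are hypothetical placeholders (the script name is not given here).

```python
import shlex

# Mode flags as listed in usage modes (1) to (4); "full" runs both modules.
MODE_FLAGS = {
    "full": None,
    "taxon": "--taxon_module_only",
    "amr": "--AMR_module_only",
    "filter": "--filter_fq_only",
}


def build_command(script, query, mode="full", threads=None,
                  output_folder=None, output_prefix=None):
    """Return the argument list for one run; `script` is a placeholder path."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode}")
    cmd = ["python", script, "--query", query]
    if MODE_FLAGS[mode]:
        cmd.append(MODE_FLAGS[mode])
    if threads is not None:
        cmd += ["--max_aligner_thread", str(threads)]
    if output_prefix is not None:
        cmd += ["--output_prefix", output_prefix]
    if output_folder is not None:
        cmd += ["--output_folder", output_folder]
    return cmd


# Hypothetical script path and query file, shown for illustration only.
cmd = build_command("path/to/main_script.py", "sample.fastq", mode="amr", threads=8)
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` once the real script path is filled in.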

(5) Run AMR detection module only with BAM

python --query_bam ${bam} --output_folder ${dir} [options]

required arguments:
  --query_bam QUERY_BAM
                              Input bam
  --output_folder OUTPUT_FOLDER
                              Output directory

optional arguments:
  --taxon TAXON               Taxon-specific options for AMRFinder [e.g. --taxon Escherichia], see usage for the full list of curated organisms
  --threads THREADS           Max num of threads, default: available num of cores

(6) Run amplicon filter module with FASTQ

./MegaPath-Nano/bin/ -r ${fq}

Demo data

The demo data for AMR detection of five patient isolates are available for download. Samples were prepared using the ONT Rapid Sequencing Kit and sequenced on ONT R9.4.1 flow cells.

The experimental validation results for these AMR demo datasets are listed in Supplementary_info_AMR.

Demo run

python --query Escherichia_coli_isolate2_HKUBAL_20200103.fastq

