Ephantus-Wambui/EMseq_nextflow_pipeline

EMseq nextflow pipeline

This pipeline is designed to process raw data from EM-seq experiments. It is based on the nf-core framework.

Quick Start

  1. Install nextflow

Also ensure that fastqc, multiqc, trim galore and bismark are installed on your system.

  1. Install fastqc
  2. Install multiqc
  3. Install trim galore
  4. Install bismark
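A quick way to confirm the installs is to check that each executable is on your PATH. The sketch below assumes the tools are exposed under their usual command names (`nextflow`, `fastqc`, `multiqc`, `trim_galore`, `bismark`); the `missing_tools` helper is illustrative and not part of this repository.

```shell
# Print each named command that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command names are assumed to match the usual installs of each tool.
missing=$(missing_tools nextflow fastqc multiqc trim_galore bismark)
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing"
else
    echo "All required tools found."
fi
```

If a tool is reported missing even though it is installed, it may live in a conda environment that has not been activated yet.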

Clone repo

git clone https://github.com/Ephantus-Wambui/EMseq_nextflow_pipeline.git

Run pipeline

Before running the pipeline ensure you have the following files in the data directory:

  1. genome_test directory which contains TMEB117_chr16.fasta reference genome

  2. high yield and low yield fastq files
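The two requirements above can be checked before launching anything. This is a minimal sketch that assumes the data directory is called `data` (override it with a hypothetical `DATA_DIR` variable) and that the fastq files end in `.fastq`, `.fq`, or their gzipped forms; the actual file names are not listed in this README.

```shell
# check_inputs DIR: report missing inputs under DIR; returns non-zero if any are missing.
check_inputs() {
    dir=$1
    status=0
    # The reference genome path is taken from the list above.
    if [ ! -f "$dir/genome_test/TMEB117_chr16.fasta" ]; then
        echo "missing: $dir/genome_test/TMEB117_chr16.fasta"
        status=1
    fi
    # FASTQ naming is an assumption: *.fastq or *.fq, optionally gzipped.
    n=$(find "$dir" -type f \( -name '*.fastq' -o -name '*.fastq.gz' \
            -o -name '*.fq' -o -name '*.fq.gz' \) 2>/dev/null | wc -l | tr -d ' ')
    if [ "$n" -eq 0 ]; then
        echo "missing: no FASTQ files under $dir"
        status=1
    fi
    return $status
}

check_inputs "${DATA_DIR:-data}" || echo "Fix the items above before running the pipeline."
```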

After ensuring everything is in place, activate the conda environment which contains fastqc, multiqc, trim galore and bismark dependencies.

conda activate EMseq # activate conda environment

Before running the EMseq_pipeline.nf script, first run the QC pipeline to check the quality of the fastq files.

**Note:** cd into the scripts folder before running the commands below.

nextflow run EMseq_fastqc.nf

After running the QC pipeline, run the EMseq pipeline to align the reads to the reference genome and generate methylation calls.

nextflow run EMseq_pipeline.nf

**Note:** Adjust the trim galore parameters according to the fastqc results, and then run the pipeline.
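As an illustration of the kind of parameters that might be adjusted, the sketch below builds a Trim Galore command from a few variables and prints it for inspection. The flags are standard Trim Galore options, but the values, the file names, and the paired-end layout are assumptions; how EMseq_pipeline.nf actually passes these options is not shown in this README.

```shell
# Hypothetical values; choose them from your own fastqc reports.
QUALITY=20   # Phred cutoff for 3' quality trimming
CLIP_5P=10   # bases removed from the 5' end of each read
CLIP_3P=10   # bases removed from the 3' end after adapter trimming

# trim_cmd R1 R2 OUTDIR: print the Trim Galore command for one paired-end sample.
trim_cmd() {
    echo "trim_galore --paired --quality $QUALITY" \
         "--clip_R1 $CLIP_5P --clip_R2 $CLIP_5P" \
         "--three_prime_clip_R1 $CLIP_3P --three_prime_clip_R2 $CLIP_3P" \
         "--output_dir $3 $1 $2"
}

# Placeholder file names; the command is only printed, not executed.
trim_cmd sample_R1.fastq.gz sample_R2.fastq.gz trimmed
```

Printing the command first makes it easy to compare against the fastqc per-base plots before committing the same values to the pipeline script.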

Output

The pipeline generates the following in the output directory, each split into separate high yield and low yield directories:

  1. Individual fastqc reports for the fastq files.

  2. Multiqc reports summarising the fastqc reports.

  3. Trimmed fastq files.

  4. Bismark alignment reports.

  5. Bismark methylation calls.

Pipeline summary

  1. QC pipeline: This pipeline will generate fastqc reports of the fastq files and a multiqc report of the fastqc reports.

  2. EMseq pipeline: This pipeline will align the reads to the reference genome and generate methylation calls.
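For orientation, the two stages above roughly correspond to the command sequence sketched below for a single paired-end sample. This is a dry run that only prints commands; the paths, the paired-end layout, and the BAM file name are hypothetical, and the actual processes in the .nf scripts may use different options. For the Bismark steps, the reads passed in would be the trimmed files.

```shell
# emseq_steps GENOME_DIR R1 R2 OUTDIR: print, without running, one plausible
# command per step. All arguments are placeholders chosen by the caller.
emseq_steps() {
    genome_dir=$1; r1=$2; r2=$3; out=$4
    # QC pipeline: per-file reports, then one aggregated report.
    echo "fastqc -o $out/fastqc $r1 $r2"
    echo "multiqc $out/fastqc -o $out/multiqc"
    # EMseq pipeline: index the genome, align, then call methylation.
    echo "bismark_genome_preparation $genome_dir"
    echo "bismark --genome $genome_dir -1 $r1 -2 $r2 -o $out/bismark"
    # SAMPLE_pe.bam is a hypothetical name standing in for the aligner's output.
    echo "bismark_methylation_extractor --paired-end --bedGraph -o $out/methylation $out/bismark/SAMPLE_pe.bam"
}

emseq_steps data/genome_test sample_R1.fq.gz sample_R2.fq.gz output
```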

Contributors

  1. Ephantus Wambui
