Bismark Alignment
Prepare Genome Index
# One-time genome preparation (creates bisulfite-converted index)
bismark_genome_preparation --bowtie2 /path/to/genome_folder/
# Genome folder should contain FASTA files (e.g., hg38.fa, chr1.fa, etc.)
# Creates Bisulfite_Genome/ subdirectory with CT and GA converted indices
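Before starting the (slow) genome preparation it can help to confirm that the folder really contains FASTA files. The following is a minimal sketch; `has_fasta` is a hypothetical helper name, and the list of accepted extensions is an assumption that should be checked against the installed Bismark version.

```shell
# Hypothetical helper: succeed if DIR holds at least one FASTA file.
# The extension list (.fa, .fasta, optionally gzipped) is an assumption.
has_fasta() (
    dir=$1
    for f in "$dir"/*.fa "$dir"/*.fasta "$dir"/*.fa.gz "$dir"/*.fasta.gz; do
        # Unmatched globs stay literal, so confirm the file really exists
        [ -f "$f" ] && exit 0
    done
    exit 1
)

# Usage: only run the preparation when the folder looks right
# has_fasta /path/to/genome_folder && \
#     bismark_genome_preparation --bowtie2 /path/to/genome_folder/
```

The function body runs in a subshell, so its variables do not leak into the calling shell.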
Basic Single-End Alignment
bismark --genome /path/to/genome_folder/ reads.fastq.gz -o output_dir/
Paired-End Alignment
bismark --genome /path/to/genome_folder/ \
-1 reads_R1.fastq.gz \
-2 reads_R2.fastq.gz \
-o output_dir/
Common Options
# --bowtie2: use Bowtie2 (default)
# --parallel 4: number of parallel instances
# --temp_dir /tmp/: temporary directory
# --non_directional: for non-directional libraries only
# --nucleotide_coverage: generate nucleotide coverage report
bismark --genome /path/to/genome_folder/ \
--bowtie2 \
--parallel 4 \
--temp_dir /tmp/ \
--non_directional \
--nucleotide_coverage \
-o output_dir/ \
reads.fastq.gz
RRBS Mode
# Reduced Representation Bisulfite Sequencing
# Directional RRBS libraries align with the default settings (no RRBS-specific flag)
bismark --genome /path/to/genome_folder/ reads.fastq.gz
# MspI digestion (RRBS standard)
# Trim MspI-digested reads with trim_galore --rrbs before alignment; Bismark itself needs no extra option
PBAT Libraries
# Post-Bisulfite Adapter Tagging (e.g., scBS-seq)
bismark --genome /path/to/genome_folder/ --pbat reads.fastq.gz
Non-Directional Libraries
# For libraries where all 4 strands are present
bismark --genome /path/to/genome_folder/ --non_directional reads.fastq.gz
With Quality/Adapter Trimming (Pre-alignment)
# Trim adapters first with Trim Galore (recommended)
trim_galore --illumina --paired reads_R1.fastq.gz reads_R2.fastq.gz
# Then align
bismark --genome /path/to/genome_folder/ \
-1 reads_R1_val_1.fq.gz \
-2 reads_R2_val_2.fq.gz
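The alignment step above depends on knowing what Trim Galore calls its paired-end output. As a rough sketch, the naming pattern seen in the example (`reads_R1.fastq.gz` becomes `reads_R1_val_1.fq.gz`) can be derived in a script. `trim_galore_paired_name` is a hypothetical helper, and the rule that uncompressed input yields an uncompressed `.fq` file is an assumption.

```shell
# Hypothetical helper: predict the Trim Galore paired-end output name.
# Args: input FASTQ, mate number (1 or 2).
trim_galore_paired_name() (
    fastq=$1
    mate=$2
    # Drop the directory and a trailing .fastq/.fq, with or without .gz
    base=$(basename "$fastq" | sed -E 's/\.(fastq|fq)(\.gz)?$//')
    # Assumption: gzipped input gives gzipped output, otherwise plain .fq
    case $fastq in
        *.gz) suffix=.fq.gz ;;
        *)    suffix=.fq ;;
    esac
    echo "${base}_val_${mate}${suffix}"
)

trim_galore_paired_name reads_R1.fastq.gz 1   # prints reads_R1_val_1.fq.gz
trim_galore_paired_name reads_R2.fastq.gz 2   # prints reads_R2_val_2.fq.gz
```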
Multicore Processing
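# --parallel sets the number of Bismark instances that run at the same time
# Each instance runs 2 aligner threads (directional) or 4 (non-directional),
# so aligner threads = parallel * 2 or parallel * 4
# Perl, samtools and gzip add further load, and memory use grows with the instance count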
bismark --genome /path/to/genome_folder/ \
--parallel 4 \
reads.fastq.gz
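The rule of thumb above can be turned into a quick estimate when sizing a job request. This is a sketch only: `estimate_aligner_threads` is a hypothetical helper, it covers just the two library types named in the rule, and it counts aligner threads, not the extra processes Bismark starts around them.

```shell
# Hypothetical helper: aligner threads implied by a --parallel value.
# Args: --parallel value, library type (directional | non_directional).
estimate_aligner_threads() (
    parallel=$1
    library=${2:-directional}
    case $library in
        directional)     per_instance=2 ;;
        non_directional) per_instance=4 ;;
        *) echo "unknown library type: $library" >&2; exit 1 ;;
    esac
    echo $((parallel * per_instance))
)

estimate_aligner_threads 4 directional       # prints 8
estimate_aligner_threads 4 non_directional   # prints 16
```

Treat the result as a lower bound and leave headroom when requesting cores.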
Output Files
# Bismark produces:
# - reads_bismark_bt2.bam # Aligned reads
# - reads_bismark_bt2_SE_report.txt # Alignment report
# View alignment report
cat output_dir/reads_bismark_bt2_SE_report.txt
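In a pipeline it is often handy to predict these file names from the input FASTQ name. The sketch below follows the single-end pattern listed above. `bismark_outputs` is a hypothetical helper, and the paired-end names (`_pe.bam`, `_PE_report.txt`, derived from the read 1 file) are an assumption by analogy that should be confirmed against real output.

```shell
# Hypothetical helper: print the expected BAM and report names, one per line.
# Args: input FASTQ (read 1 for paired-end), mode (se | pe).
bismark_outputs() (
    fastq=$1
    mode=${2:-se}
    # Drop the directory and a trailing .fastq/.fq, with or without .gz
    base=$(basename "$fastq" | sed -E 's/\.(fastq|fq)(\.gz)?$//')
    case $mode in
        se) echo "${base}_bismark_bt2.bam"
            echo "${base}_bismark_bt2_SE_report.txt" ;;
        pe) # Assumption: paired-end names mirror the single-end pattern
            echo "${base}_bismark_bt2_pe.bam"
            echo "${base}_bismark_bt2_PE_report.txt" ;;
        *)  echo "unknown mode: $mode" >&2; exit 1 ;;
    esac
)

bismark_outputs reads.fastq.gz se
# prints reads_bismark_bt2.bam
#        reads_bismark_bt2_SE_report.txt
```

The names are relative; with `-o output_dir/` the files land inside that directory.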
Sort and Index BAM
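# Bismark output is unsorted; coordinate-sort a copy for viewers such as IGV
# For paired-end data, run deduplicate_bismark and bismark_methylation_extractor on the original (or name-sorted) BAM, not the coordinate-sorted one
samtools sort output.bam -o output.sorted.bam
samtools index output.sorted.bam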
Deduplicate (Optional)
# Remove PCR duplicates (recommended for WGBS; skip for RRBS, where reads legitimately share MspI start sites)
deduplicate_bismark --bam output_bismark_bt2.bam
# For paired-end
deduplicate_bismark --paired --bam output_bismark_bt2_pe.bam
Check Alignment Statistics
# Bismark generates detailed report
cat *_SE_report.txt
# Key metrics:
# - Sequences analyzed
# - Unique alignments
# - Mapping efficiency
# - C methylated in CpG context
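For batch runs, the key metrics can be pulled out of each report without reading it by hand. The following is a sketch, not a parser for every report version: `report_metric` is a hypothetical helper, and the label strings in the usage lines are assumptions about the report wording that should be checked against a real report file.

```shell
# Hypothetical helper: print the value that follows "LABEL...:" on the
# first line of REPORT that starts with LABEL.
report_metric() (
    label=$1
    report=$2
    awk -v label="$label" '
        index($0, label) == 1 {
            sub(/^[^:]*:[ \t]*/, "")   # drop the label, colon and padding
            print
            exit
        }' "$report"
)

# Usage (label wording is an assumption):
# report_metric "Mapping efficiency" output_dir/reads_bismark_bt2_SE_report.txt
# report_metric "C methylated in CpG context" output_dir/reads_bismark_bt2_SE_report.txt
```

Matching on the start of the line keeps a label such as "Mapping efficiency" from catching unrelated lines that merely mention it.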
Genome Preparation with HISAT2 (Recommended for Large Genomes)
# HISAT2 can be faster and lighter on memory for large mammalian genomes; the index must be built with --hisat2 before aligning with it
bismark_genome_preparation --hisat2 /path/to/genome_folder/
# Align with HISAT2
bismark --genome /path/to/genome_folder/ --hisat2 reads.fastq.gz
# HISAT2 paired-end
bismark --genome /path/to/genome_folder/ --hisat2 \
-1 reads_R1.fastq.gz \
-2 reads_R2.fastq.gz
Key Parameters
| Parameter | Description |
|---|---|
| --genome | Path to genome folder |
| --bowtie2 | Use Bowtie2 aligner (default) |
| --hisat2 | Use HISAT2 aligner |
| --parallel | Parallel alignment instances |
| --non_directional | Non-directional library |
| --pbat | PBAT library protocol |
| -o | Output directory |
| --temp_dir | Temporary file directory |
| --nucleotide_coverage | Generate nucleotide coverage report |
| -N | Mismatches allowed in seed alignment (0 or 1, default 0) |
| -L | Seed length (default 20) |
Library Types
| Type | Parameter | Description |
|---|---|---|
| Directional | (default) | Standard WGBS/RRBS |
| Non-directional | --non_directional | All 4 strands |
| PBAT | --pbat | Post-bisulfite adapter tagging |
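When the library type comes from a sample sheet, the table above can be encoded as a small lookup. This is a sketch; `library_flag` is a hypothetical helper name, and the type keywords are illustrative labels, not Bismark terms.

```shell
# Hypothetical helper: map a library type to the matching Bismark flag.
library_flag() {
    case $1 in
        directional)     echo "" ;;                   # default, no flag
        non_directional) echo "--non_directional" ;;
        pbat)            echo "--pbat" ;;
        *) echo "unknown library type: $1" >&2; return 1 ;;
    esac
}

# Usage (left unquoted on purpose so an empty flag expands to nothing):
# bismark --genome /path/to/genome_folder/ $(library_flag pbat) reads.fastq.gz
```

Failing on an unknown type stops a mistyped sample sheet entry from silently falling back to directional alignment.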
Related Skills
- methylation-calling - Extract methylation from Bismark BAM
- methylkit-analysis - Import Bismark output to R
- sequence-io/read-sequences - FASTQ handling
- alignment-files/sam-bam-basics - BAM manipulation
Repository
gptomics/bioskills