(howto) Revert a BAM file to FastQ format

NOTE: This tutorial has been replaced by a more recent and much improved version, Tutorial#6484.

Objective

Revert a BAM file back to FastQ. This comes in handy when you receive data that has been processed but not according to GATK Best Practices, and you want to reset and reprocess it properly.

Prerequisites

HTSlib installed (it provides the htscmd command used in the steps below)

Steps

1. Shuffle the reads in the BAM file
2. Revert the BAM file to FastQ format
3. Compress the FastQ file
4. Note for advanced users

1. Shuffle the reads in the BAM file

Action

Shuffle the reads in the BAM file by running the following HTSlib command, so that they are not in a biased order when you realign them:

htscmd bamshuf -uOn 128 aln_reads.bam tmp > shuffled_reads.bam

Expected Result

This creates a new BAM file, shuffled_reads.bam, containing the original reads. The reads still retain their mapping information, but they are no longer sorted.

The aligner uses blocks of paired reads to estimate the insert size. If you don't shuffle your original BAM file, the reads in each block will not be randomly distributed across the genome. Instead, they will all come from the same region, which biases the insert size estimate. This step is very important and is unfortunately often overlooked.

2. Revert the BAM file to FastQ

Action

Revert the BAM file to FastQ format by running the following HTSlib command:

htscmd bam2fq -a shuffled_reads.bam > interleaved_reads.fq

Expected Result

This creates an interleaved FastQ file called interleaved_reads.fq containing the now-unmapped paired reads.

Interleaved simply means that for each pair of reads in your paired-end data set, both the forward and the reverse reads are in the same file, as opposed to having them in separate files.
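As a rough sanity check on the interleaved layout described above, the following sketch confirms that the file holds complete 4-line records in adjacent pairs with matching names. The function name check_interleaved is hypothetical, and the check assumes unwrapped 4-line FastQ records; it tolerates read names with or without a trailing /1 or /2.

```shell
# Sketch: sanity-check an interleaved FastQ file.
# check_interleaved is a hypothetical helper, not part of HTSlib.
# Assumes every record is exactly 4 lines (no wrapped sequence lines).
check_interleaved() {
    awk '
        NR % 4 == 1 {                     # header line of each record
            name = $1
            sub("^@", "", name)           # drop the leading "@"
            sub("/[12]$", "", name)       # drop a trailing /1 or /2 if present
            n++
            if (n % 2 == 1) {
                first = name              # first read of the pair
            } else if (name != first) {   # second read must share the name
                printf "mismatched pair at record %d: %s vs %s\n", n, first, name
                bad = 1
            }
        }
        END {
            if (NR % 8 != 0) {            # 2 reads x 4 lines per pair
                print "line count " NR " is not a multiple of 8"
                bad = 1
            }
            if (!bad) print "OK: " (n / 2) " read pairs"
            exit (bad ? 1 : 0)
        }
    ' "$1"
}
```

Calling check_interleaved interleaved_reads.fq prints a pair count and returns 0 when the layout looks right, and returns a non-zero status otherwise.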

3. Compress the FastQ file

Action

Compress the FastQ file to reduce its size using the gzip utility:

gzip interleaved_reads.fq

Expected Result

This creates a gzipped FastQ file called interleaved_reads.fq.gz. This file is ready to be used as input for the Best Practices workflow.

BWA handles gzipped FastQ files natively, so you don't need to unzip the file to use it later on.
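Before handing the compressed file to the aligner, it can be worth confirming that it is intact and contains the expected number of reads. The sketch below does this with standard gzip options only, streaming the data rather than writing an uncompressed copy to disk. The function name count_fastq_reads is hypothetical, and the read count assumes 4-line FastQ records.

```shell
# Sketch: verify a gzipped FastQ file and count its reads.
# count_fastq_reads is a hypothetical helper, not part of HTSlib or BWA.
count_fastq_reads() {
    # gzip -t tests archive integrity and exits non-zero on corruption.
    gzip -t "$1" || return 1
    # Stream-decompress to stdout and count lines; tr strips the padding
    # that some wc implementations add.
    lines=$(gzip -dc "$1" | wc -l | tr -d ' ')
    # Each FastQ record occupies 4 lines.
    echo $((lines / 4))
}
```

For an interleaved paired-end file, the number printed by count_fastq_reads interleaved_reads.fq.gz should be even, since every pair contributes two reads.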

4. Note for advanced users

If you're feeling adventurous, you can do all of the above with a single piped command. Because the intermediate files are never written to disk, this saves the considerable time the programs would otherwise spend on I/O:

htscmd bamshuf -uOn 128 aln_reads.bam tmp | htscmd bam2fq -a - | gzip > interleaved_reads.fq.gz
