
Rough summary of how the test cases were generated.

The reads were hand-picked from an input generated by the
Mason read simulator. Indel realignment intervals were extracted
by hand from the output of GATK's RealignerTargetCreator.

Mouse reference from:

wget ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/Eukaryotes/vertebrates_mammals/Mus_musculus/GRCm38/Primary_Assembly/assembled_chromosomes/FASTA/chrY.fa.gz
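
The later commands refer to the reference as mouse_chrY.fa, so the
download presumably has to be decompressed and renamed first. A minimal
sketch, assuming gzip is installed; the function name prepare_reference
is made up for illustration and is not part of the original workflow:

```shell
# Hypothetical helper: decompress a gzipped FASTA to a chosen file name.
# Assumes the archive was fetched with the wget command above.
prepare_reference() {
    archive="$1"   # e.g. chrY.fa.gz
    target="$2"    # e.g. mouse_chrY.fa
    # -c writes to stdout, so the downloaded archive is left untouched.
    gunzip -c "$archive" > "$target"
}

# Usage (not run here): prepare_reference chrY.fa.gz mouse_chrY.fa
```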

Mason from:

http://www.seqan.de/projects/mason/

Mouse reads created by:

./bin/mason illumina -sq -n 100 -hn 2 -pi 0.005 -pd 0.005 -mp -i -rn 1 /home/andre/biotools/mouse_chrY.fa

Convert sam to bam and index:

samtools view -bS mouse_chrY.fa.fastq.sam > mouse_chrY.fa.fastq.bam
samtools index mouse_chrY.fa.fastq.bam

Fix read groups and such:

./picard-tools.sh AddOrReplaceReadGroups.jar I= mouse_chrY.fa.fastq.bam O= mouse_chrY.fa.fastq.fixed.bam SORT_ORDER=coordinate RGID="read_group_id" RGLB="library" RGPL="illumina" RGPU="platform_unit" RGSM="sequencing_center" CREATE_INDEX=True;

Reference sequence dictionary:

./picard-tools.sh CreateSequenceDictionary.jar R= mouse_chrY.fa O= mouse_chrY.dict

Here we notice that the BAM with the true alignments from Mason fails
GATK's mapping quality check, so the reads are realigned with BWA:

bwa index mouse_chrY.fa
bwa mem -M -t 4 mouse_chrY.fa mouse_chrY.fa_1.fastq mouse_chrY.fa_2.fastq > mouse_chrY.fa.bwa.sam

... and repeat the steps from above, with an extra sort because bwa mem output is not coordinate-sorted:
samtools view -bS mouse_chrY.fa.bwa.sam > mouse_chrY.fa.bwa.bam
samtools sort mouse_chrY.fa.bwa.bam mouse_chrY.fa.bwa.sorted
samtools index mouse_chrY.fa.bwa.sorted.bam
./picard-tools.sh AddOrReplaceReadGroups.jar I= mouse_chrY.fa.bwa.sorted.bam O= mouse_chrY.fa.bwa.sorted.fixed.bam SORT_ORDER=coordinate RGID="read_group_id" RGLB="library" RGPL="illumina" RGPU="platform_unit" RGSM="sequencing_center" CREATE_INDEX=True;
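
The mapping quality complaint noted earlier for the Mason alignments can
be inspected by tabulating the MAPQ column of a SAM file. A rough sketch
using only awk and sort; the name mapq_histogram is made up, and which
MAPQ values GATK objected to is not recorded in these notes:

```shell
# Hypothetical helper: count alignment records per MAPQ value in a SAM file.
# Column 5 of a SAM record is MAPQ; header lines start with '@'.
mapq_histogram() {
    awk -F '\t' '!/^@/ { count[$5]++ }
         END { for (q in count) print q "\t" count[q] }' "$1" | sort -n
}

# Usage (not run here): mapq_histogram mouse_chrY.fa.fastq.sam
```

Each output line is a MAPQ value followed by the number of records that
carry it, which makes it easy to compare the Mason SAM with the BWA one.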

Generate samtools mpileup for comparison:

samtools mpileup -f /home/andre/biotools/mouse_chrY.fa small_realignment_targets.bam > small_realignment_targets.pileup
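
Since the pileup is meant for comparison around indels, it can help to
list only the positions where the read-bases column shows an insertion or
deletion. An approximate sketch, assuming the usual six-column pileup
layout; the name pileup_indel_sites is made up for illustration:

```shell
# Hypothetical helper: print chromosome, position and depth for pileup lines
# whose read-bases column (column 5) contains an insertion or deletion marker.
# Such markers are written as +<length><bases> or -<length><bases>.
pileup_indel_sites() {
    awk -F '\t' '$5 ~ /[+-][0-9]+/ { print $1 "\t" $2 "\t" $4 }' "$1"
}

# Usage (not run here): pileup_indel_sites small_realignment_targets.pileup
```

Requiring digits after the sign keeps a read-start marker such as ^+ from
being mistaken for an insertion, since it is followed by a base rather
than a length.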

Notice that the MD tag is missing, so generate it:

samtools calmd small_realignment_targets.bam /home/andre/biotools/mouse_chrY.fa > small_realignment_targets.sam_new
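
To confirm that the MD tag is now present, the SAM output can be scanned
for records that still lack it. A minimal sketch using awk; the name
count_missing_md is made up, and unmapped reads (which carry no MD tag)
would be counted as missing too:

```shell
# Hypothetical helper: count alignment records in a SAM file without an MD tag.
# Optional tags start at column 12; the MD tag has the form MD:Z:<string>.
count_missing_md() {
    awk -F '\t' '!/^@/ {
        found = 0
        for (i = 12; i <= NF; i++) if ($i ~ /^MD:Z:/) found = 1
        if (!found) missing++
    } END { print missing + 0 }' "$1"
}

# Usage (not run here): count_missing_md small_realignment_targets.sam_new
```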
