I have been using Flux Capacitor to quantify transcript abundance for the samples in my RNA-seq study. However, for 2 of 126 samples, Flux Capacitor crashes with an out-of-memory error. I have increased the Java heap space to 16G using the environment variable FLUX_MEM=16G, but the error still occurs. The BAM file is about 2.4 GB with about 42 million paired-end reads. Do you have any idea what the problem could be? Strangely, I am telling Flux Capacitor to use only 2 threads, but it uses 22 threads on my server, so I wonder whether the two are related. The command and its output are below.
/net/wonderland/home/cgillies/programs/flux-capacitor-1.6.1/bin/flux-capacitor -i /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015//25870//final.bam -a /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015/FLUX//gtf.filtered.sorted.gtf -m PAIRED -o /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015/FLUX//25870.gtf --count-elements SPLICE_JUNCTIONS,INTRONS --threads 2 --force --tmp-dir /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015/FLUX//25870_tmp/
[INFO] Flux-Capacitor v1.6.1 (Flux Library: 1.29)
[PRE-CHECK] I am checking availability of the required lpsolve JNI libs.
[PRE-CHECK] * successfully loaded lpsolve JNI (version 5.5,release 0,build 14)
Scanning annotation file Checking GTF ********** OK (00:00:09)
scanning OK (00:00:48)
[WARN] Skipped 268333 lines.
[INFO] 53182 loci, 215170 transcripts, 1306656 exons.
OK (00:00:48)
Scanning mapping file [SAM] Setting validation stringency to SILENT
[INFO] The Flux Capacitor is not using the SAM flags for counting the number of reads in the mapping file.
[WARN] This process can be long for big files!
OK (00:14:23)
[INFO] 85007194 mapped reads, 85007194 mappings: R-factor 1.0
[INFO] 76440206 entire, 8566988 split mappings (10.077957%)
OK (00:14:23)
[INFO] Annotation and mapping input checked
[HEHO] We are set, so let's go!
[ANNOTATION_FILE] /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015/FLUX/gtf.filtered.sorted.gtf
[MAPPING_FILE] /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015/25870/final.bam
[INFO] minimum intron length 24
[SORT_IN_RAM] true
[TMP_DIR] /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015/FLUX/25870_tmp
[STDOUT_FILE] /net/assembly/cgillies/data/NEPTUNE/RNA-Seq/11_24_2015/FLUX/25870.gtf
[INFO] mate pairing information considered
[PROFILE] Loading profile
[PROFILE] Scanning the input and getting the attributes.
profiling Exception in thread "Thread-5" java.lang.OutOfMemoryError: Java heap space
at net.sf.samtools.DefaultSAMRecordFactory.createBAMRecord(DefaultSAMRecordFactory.java:30)
at net.sf.samtools.BAMRecordCodec.decode(BAMRecordCodec.java:201)
at net.sf.samtools.BAMFileReader$BAMFileIterator.getNextRecord(BAMFileReader.java:557)
at net.sf.samtools.BAMFileReader$BAMFileIndexIterator.getNextRecord(BAMFileReader.java:664)
at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:531)
at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:521)
at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:480)
at net.sf.samtools.BAMFileReader$BAMQueryFilteringIterator.advance(BAMFileReader.java:749)
at net.sf.samtools.BAMFileReader$BAMQueryFilteringIterator.next(BAMFileReader.java:719)
at net.sf.samtools.BAMFileReader$BAMQueryFilteringIterator.next(BAMFileReader.java:672)
at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:664)
at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:642)
at barna.io.sam.SAMMappingSortedIterator$1.run(SAMMappingSortedIterator.java:103)
at java.lang.Thread.run(Thread.java:744)
1 Comment
christopher gillies
I found out what the problem was. The flux-capacitor shell script passes the environment variable $JAVA_OPTS to the java command after the -Xmx$FLUX_MEM option. My $JAVA_OPTS was set to -Xmx1024m, so it overrode -Xmx16G: it appeared later on the command line, and the JVM used the last -Xmx value it was given. The relevant lines from the script:
java -Xmx$FLUX_MEM -DwrapperDir="$dir/bin" $MISC \
-Dflux.tool=capacitor \
-Dflux.app="barna.capacitor" \
${JAVA_OPTS} \
-cp $cp barna.commons.launcher.Flux "$@"
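
For anyone hitting the same thing, here is a minimal sketch of the ordering effect. The `effective_xmx` function is a hypothetical helper written for illustration, not part of Flux Capacitor, and it assumes the JVM keeps the last `-Xmx` it sees (which matches what I observed). It only inspects an argument list and does not start java.

```shell
# Hypothetical helper (not part of Flux Capacitor): print the -Xmx option
# that would take effect, assuming the last occurrence on the command line wins.
effective_xmx() {
    last=""
    for arg in "$@"; do
        case "$arg" in
            -Xmx*) last="$arg" ;;   # remember the most recent -Xmx option
        esac
    done
    printf '%s\n' "$last"
}

# Same order as the launcher script: -Xmx$FLUX_MEM first, then $JAVA_OPTS.
FLUX_MEM=16G
JAVA_OPTS="-Xmx1024m"
effective_xmx "-Xmx$FLUX_MEM" $JAVA_OPTS   # the 1024m setting comes last

# With JAVA_OPTS empty, only the FLUX_MEM setting remains.
JAVA_OPTS=""
effective_xmx "-Xmx$FLUX_MEM" $JAVA_OPTS
```

In practice, checking `echo "$JAVA_OPTS"` and running `unset JAVA_OPTS` in the shell before launching flux-capacitor should be enough for FLUX_MEM to take effect.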