Bcl2fastq2¶
The Illumina bcl2fastq2 conversion software demultiplexes sequencing data and converts base call (BCL) files into FASTQ files. For every cycle of a sequencing run, the Real-Time Analysis (RTA) software generates a BCL file containing base calls and associated quality scores (Q-scores).
Bcl2fastq2 is available as a module on Apocrita.
Usage¶
To run the default installed version of Bcl2fastq2, simply load the bcl2fastq2
module:
$ module load bcl2fastq2
$ bcl2fastq -h
Usage:
bcl2fastq [options]
For usage documentation, run bcl2fastq -h.
Example job¶
Serial job¶
Threading options
By default, Bcl2fastq2 uses as many processing threads as there are CPUs
on the node for conversion/demultiplexing, plus additional threads for
reading/writing data. To avoid overloading a compute node, set
--processing-threads to the number of cores requested, for example via
the SLURM_NTASKS variable, as shown in the example job below.
The file I/O threads (loading/writing) are idle most of the time and consume little CPU time. By default, loading and writing each use 4 threads, but we recommend 1 thread for loading and 1 for writing unless your job performs a lot of I/O. If your job is I/O intensive, increasing the loading/writing threads can improve overall job performance, especially when combined with using your scratch space.
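As a rough sketch of the advice above, the thread counts can be derived from the Slurm allocation, with a fallback of 1 when the script is run outside a job. The variable names used here (processing_threads, loading_threads, writing_threads, and the LOADING_THREADS/WRITING_THREADS overrides) are illustrative assumptions and are not settings read by Bcl2fastq2 or Slurm. The snippet only prints the resulting command so that it can be checked before use.

```shell
# Processing threads follow the number of cores requested from Slurm;
# fall back to 1 if SLURM_NTASKS is not set (e.g. outside a job).
processing_threads="${SLURM_NTASKS:-1}"

# Hypothetical overrides for the I/O threads; both default to the
# recommended value of 1.
loading_threads="${LOADING_THREADS:-1}"
writing_threads="${WRITING_THREADS:-1}"

# Build the thread-related part of the command and print it for inspection.
cmd="bcl2fastq --processing-threads ${processing_threads}"
cmd="${cmd} --loading-threads ${loading_threads}"
cmd="${cmd} --writing-threads ${writing_threads}"
echo "${cmd}"
```

For an I/O-intensive job, setting the two override variables to a higher value (up to the default of 4) before this snippet would raise the loading/writing thread counts without touching the processing threads.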
Here is an example job running on 4 cores with 8GB of memory in total (2GB per core):
#!/bin/bash
#SBATCH -n 4 # (or --ntasks=4) Request 4 cores
#SBATCH --mem-per-cpu=2G # Request 2GB RAM per core
#SBATCH -t 1:0:0 # Request 1 hour runtime
module load bcl2fastq2
bcl2fastq --runfolder-dir <runfolder_dir> \
          --output-dir <output_dir> \
          --processing-threads ${SLURM_NTASKS} \
          --loading-threads 1 \
          --writing-threads 1
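After a job like this finishes, a quick sanity check is to count the gzipped FASTQ files under the output directory. The helper below is a minimal sketch: count_fastq is a hypothetical function name, and it assumes the output was written as files ending in .fastq.gz.

```shell
# Hypothetical helper: count gzipped FASTQ files anywhere under a directory.
# Usage: count_fastq /path/to/output_dir
count_fastq() {
    # find lists matching files one per line; wc -l counts them;
    # tr strips the padding some wc implementations add.
    find "$1" -type f -name '*.fastq.gz' | wc -l | tr -d ' '
}
```

A count of 0 suggests the conversion did not write any reads to that directory, in which case the job's output and error logs are the next place to look.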