RepeatModeler
RepeatModeler identifies and models transposable element (repeat) families de novo from DNA sequences such as genome assemblies.
RepeatModeler is available to install from the Bioconda Anaconda channel.
Installation
Load the default Miniforge module:
module load miniforge
If required, create a new Conda environment:
mamba create -n rm_env
Activate your Conda environment:
mamba activate rm_env
In your activated environment, install RepeatModeler from the Bioconda channel, also specifying the conda-forge channel so that its dependencies can be resolved:
mamba install -c bioconda -c conda-forge repeatmodeler
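After installation, it can be useful to confirm that the commands used on this page are available in the activated environment. The following is a minimal sketch rather than an official check; the function name check_on_path is hypothetical.

```shell
#!/bin/bash
# Report whether each named command can be found on PATH.
# Returns 0 if all were found, 1 if any was missing.
check_on_path() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# RepeatModeler and BuildDatabase are the two commands used on this page.
check_on_path RepeatModeler BuildDatabase \
    || echo "Activate rm_env (or repeat the install step) and try again."
```

Run it with the rm_env environment active; any line beginning with "missing:" suggests the environment is not activated or the install did not complete.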
Usage
To run the installed version of RepeatModeler, simply load the miniforge module
and activate your Conda environment:
module load miniforge
mamba activate rm_env
For usage documentation, run RepeatModeler -help:
(rm_env) $ RepeatModeler -help
No database indicated
NAME
RepeatModeler - Model repetitive DNA
SYNOPSIS
RepeatModeler [-options] -database <XDF Database>
Example jobs
Core Usage
To ensure that RepeatModeler uses the number of cores requested for the job,
always pass the -threads ${SLURM_NTASKS} option.
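Outside a Slurm job (for example, when testing interactively), SLURM_NTASKS may be unset, in which case -threads would receive an empty value. The following is a minimal sketch of one way to guard against this, assuming a fallback of 1 thread is acceptable; the function name threads_for_job is hypothetical.

```shell
#!/bin/bash
# Print the number of threads to pass to RepeatModeler:
# the Slurm task count if it is set, otherwise 1 (an assumed fallback).
threads_for_job() {
    echo "${SLURM_NTASKS:-1}"
}

THREADS=$(threads_for_job)
# Only print the command here; DB_NAME is the placeholder used on this page.
echo "Would run: RepeatModeler -database DB_NAME -threads ${THREADS}"
```

Inside a job submitted with -n 4, this prints a command using -threads 4.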
Serial jobs
Here is an example job that builds a RepeatModeler database, running on 1 core and 1GB of memory:
#!/bin/bash
#SBATCH -n 1 # (or --ntasks=1) Request 1 core
#SBATCH --mem-per-cpu=1G # Request 1GB RAM per core
#SBATCH -t 1:0:0 # Request 1 hour runtime
module load miniforge
mamba activate rm_env
# create a database for RepeatModeler
BuildDatabase -name DB_NAME INPUT.fa
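Before submitting the database job, a quick sanity check on the input file can save a failed run. The following is a minimal sketch, not a full validator: it only tests that the file is non-empty and that its first line starts with '>', as FASTA header lines do. The function name looks_like_fasta is hypothetical.

```shell
#!/bin/bash
# Return 0 if the file is non-empty and its first line starts with '>'.
# This is a quick sanity check only; it does not validate the sequences.
looks_like_fasta() {
    local first_line=""
    [ -s "$1" ] || return 1
    # read fails on a final line without a newline but still fills the variable
    IFS= read -r first_line < "$1" || true
    case "$first_line" in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# INPUT.fa is the placeholder file name used in the job script above.
if looks_like_fasta INPUT.fa; then
    echo "INPUT.fa looks like FASTA"
else
    echo "INPUT.fa is missing, empty, or does not start with a '>' header"
fi
```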
Here is an example job that runs RepeatModeler on the database built above, using 4 cores and 4GB of memory (1GB per core):
#!/bin/bash
#SBATCH -n 4 # (or --ntasks=4) Request 4 cores
#SBATCH --mem-per-cpu=1G # Request 1GB RAM per core
#SBATCH -t 1:0:0 # Request 1 hour runtime
module load miniforge
mamba activate rm_env
RepeatModeler -database DB_NAME -threads ${SLURM_NTASKS}
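Because the RepeatModeler job depends on the database created by the BuildDatabase job, the two can be chained so that the second starts only if the first succeeds. The following is a minimal sketch using the --parsable and --dependency=afterok options of sbatch. The function name submit_pipeline and the script names build_db.sh and run_repeatmodeler.sh are hypothetical; it assumes the two example scripts above have been saved under those names.

```shell
#!/bin/bash
# Submit the database-build job, then submit the RepeatModeler job so that it
# starts only after the first job has finished successfully.
submit_pipeline() {
    local build_script="$1"
    local model_script="$2"
    local build_id
    # --parsable makes sbatch print only the job ID (optionally ";cluster")
    build_id=$(sbatch --parsable "$build_script") || return 1
    build_id=${build_id%%;*}
    sbatch --dependency=afterok:"$build_id" "$model_script"
}

# Usage (hypothetical file names), run from a node where sbatch is available:
#   submit_pipeline build_db.sh run_repeatmodeler.sh
```

With afterok, Slurm holds the second job in the queue until the first has exited with code 0.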