PDFtoText
PDFtoText is a tool for converting Portable Document Format (PDF) files to plain text.
PDFtoText is available as a module on Apocrita.
Usage
To run the default installed version of PDFtoText, simply load the
pdftotext module:
$ module load pdftotext
$ pdftotext --help
pdftotext version X.Y.Z
Usage: pdftotext [options] <PDF-file> [<text-file>]
-f <int> : first page to convert
-l <int> : last page to convert
-r <fp> : resolution, in DPI (default is 72)
...(output has been truncated)
For full usage documentation, run pdftotext --help or see the
user guide.
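The `-f` and `-l` options listed in the help output can be combined to convert only part of a document. The following is a minimal sketch, not part of the installed module: the function name `convert_page_range` and the output file naming scheme are assumptions made for illustration.

```shell
# Hypothetical helper: convert only pages FIRST..LAST of a PDF.
# The body runs in a subshell "( ... )" so its variables stay local.
convert_page_range() (
    pdf="$1"
    first="$2"
    last="$3"
    # Assumed naming scheme: report.pdf -> report_p2-5.txt
    out="${pdf%.pdf}_p${first}-${last}.txt"
    # -f and -l are the first/last page options shown in the help output
    pdftotext -f "$first" -l "$last" "$pdf" "$out" && echo "$out"
)
```

After loading the module, `convert_page_range input-file.pdf 2 5` would write pages 2 to 5 to `input-file_p2-5.txt` and print that file name.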
Example job
Here is an example job running on 1 core with 1 GB of memory:
#!/bin/bash
#SBATCH -n 1 # (or --ntasks=1) Request 1 core
#SBATCH --mem-per-cpu=1G # Request 1GB RAM per core
#SBATCH -t 1:0:0 # Request 1 hour runtime
module load pdftotext
# Convert PDF file to text
pdftotext input-file.pdf output-file.txt
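If a job needs to convert several files, the single `pdftotext` call can be wrapped in a loop. The sketch below assumes all PDFs sit in one directory and that each text file should be written next to its source; the function name `convert_all_pdfs` is hypothetical.

```shell
# Hypothetical helper: convert every *.pdf in a directory (default: current one).
# The body runs in a subshell "( ... )" so its variables stay local.
convert_all_pdfs() (
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # If the glob matched nothing, the pattern itself is returned; skip it
        [ -e "$pdf" ] || continue
        # Write input.pdf -> input.txt; report failures but keep going
        pdftotext "$pdf" "${pdf%.pdf}.txt" || echo "Failed: $pdf" >&2
    done
)
```

In the job script, calling `convert_all_pdfs my-pdfs` after `module load pdftotext` would process each file in the (assumed) `my-pdfs` directory in turn. The requested runtime may need adjusting for large batches.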