A command-line utility and Ruby library for splitting documents into text, images, PDFs, and metadata.