12/01/2022 By RuneLite
Unlike a text file, you can't edit a PDF directly. There are multiple ways to generate PDF files from text. But what if you want to go the other way round and convert PDFs to text files?
Luckily, Linux lets you do this easily, either from the terminal or with a graphical application. This article will demonstrate how to convert a PDF file to a text document on Linux.
Convert PDF to Text From the Terminal
Poppler is a software library used to render and modify PDF files. It includes a utility, pdftotext, that generates text files from PDFs. The poppler-utils package is not installed by default on every Linux distribution, so you may have to install it manually using a package manager.
On Ubuntu and Debian:
sudo apt install poppler-utils
To install Poppler on Arch Linux:
sudo pacman -S poppler
On Fedora, CentOS, and other RHEL-based distributions, install the poppler-utils package using either DNF or Yum:
sudo dnf install poppler-utils
sudo yum install poppler-utils
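To confirm that the installation worked, a quick check along these lines can help. This is a minimal sketch; check_pdftotext is a made-up helper name, not something Poppler provides.

```shell
# Report whether the pdftotext utility is available on the PATH.
# check_pdftotext is a hypothetical helper name used only for this sketch.
check_pdftotext() {
    if command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext found: $(command -v pdftotext)"
    else
        echo "pdftotext not found; install poppler-utils (poppler on Arch Linux)"
        return 1
    fi
}

# Run the check; "|| true" keeps a missing tool from aborting a larger script.
check_pdftotext || true
```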
Convert an Entire PDF to Text
The basic syntax of the pdftotext command is:
pdftotext [options] pdffile textfile
...where pdffile is the absolute or relative path to the PDF file, and textfile is the name of the output file.
For example, to convert lorem-ipsum.pdf to a text file:
pdftotext lorem-ipsum.pdf text.txt
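If you only want to inspect the text rather than save it, pdftotext can write to standard output when the output name is a single dash. The sketch below uses that to count matching lines; count_matches is a hypothetical helper name, and the file name in the usage comment is assumed to exist.

```shell
# Count the lines of a PDF's text that contain a given word (case-insensitive),
# without creating a text file on disk.
# count_matches is a hypothetical helper name used only for this sketch.
count_matches() {
    pdf=$1
    word=$2
    # "-" as the output file tells pdftotext to print to standard output.
    pdftotext "$pdf" - | grep -c -i -- "$word"
}

# Example (assumes lorem-ipsum.pdf is in the current directory):
# count_matches lorem-ipsum.pdf lorem
```

Note that grep returns a non-zero exit status when nothing matches, so a count of 0 also reports failure to the calling shell.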
If the file you're converting has watermarks or other diagonal text, you can discard it in the output by using the -nodiag flag. The flag drops any text that is not aligned close to the 0, 90, 180, or 270 degree axes.
pdftotext -nodiag lorem-ipsum.pdf random.text
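The same basic command can be wrapped in a loop to convert a whole directory of PDFs. This is a rough sketch, not a built-in pdftotext feature; convert_all is a hypothetical helper name, and each text file is written next to its PDF.

```shell
# Convert every .pdf file in a directory to a .txt file alongside it.
# convert_all is a hypothetical helper name used only for this sketch.
convert_all() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # When the directory has no PDFs the pattern stays literal; skip it.
        [ -e "$pdf" ] || continue
        # ${pdf%.pdf} strips the extension, so report.pdf becomes report.txt.
        pdftotext "$pdf" "${pdf%.pdf}.txt"
    done
}

# Example: convert_all ~/Documents
```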
Process Pages Within a Specific Range
Use the -f (first page) and -l (last page) flags if you want to convert pages that fall within a specific range. For example, to convert pages one to five in lorem-ipsum.pdf to text:
pdftotext -f 1 -l 5 lorem-ipsum.pdf output.txt
To convert only the first page of the PDF file:
pdftotext -f 1 -l 1 lorem-ipsum.pdf output.txt
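Setting -f and -l to the same page number also makes it possible to write every page to its own file. The sketch below assumes the pdfinfo tool, which ships with poppler-utils, is available to report the page count; split_pages is a hypothetical helper name.

```shell
# Write each page of a PDF to its own text file: name-1.txt, name-2.txt, ...
# split_pages is a hypothetical helper name used only for this sketch.
split_pages() {
    pdf=$1
    base=${pdf%.pdf}
    # pdfinfo prints a line such as "Pages:          5"; keep the number.
    pages=$(pdfinfo "$pdf" | awk '/^Pages:/ {print $2}')
    # Stop if the page count could not be determined.
    [ -n "$pages" ] || return 1
    page=1
    while [ "$page" -le "$pages" ]; do
        pdftotext -f "$page" -l "$page" "$pdf" "$base-$page.txt"
        page=$((page + 1))
    done
}

# Example (assumes lorem-ipsum.pdf is in the current directory):
# split_pages lorem-ipsum.pdf
```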
Convert Password-Protected PDF Files to Text
Pdftotext can even convert password-protected PDFs to text files. The -upw and -opw flags, which stand for user password and owner password respectively, take care of the authentication process while converting the PDF files.
pdftotext -upw password lorem-ipsum.pdf output.txt
pdftotext -opw password lorem-ipsum.pdf output.txt
Make sure to replace password with the password of the PDF file.
You can also combine multiple flags to get the desired output. For example, to convert pages one to three of a password-protected PDF to text:
pdftotext -f 1 -l 3 -upw password lorem-ipsum.pdf output.txt
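Typing the password directly on the command line leaves it in your shell history. One possible workaround, sketched below, is to read it from an environment variable. The names PDF_PASSWORD and convert_protected are made up for this example, and the password is still passed to pdftotext as an argument, so other users may be able to see it in the process list while the command runs.

```shell
# Convert a password-protected PDF, taking the user password from an
# environment variable instead of typing it into the command itself.
# PDF_PASSWORD and convert_protected are hypothetical names for this sketch.
convert_protected() {
    if [ -z "${PDF_PASSWORD:-}" ]; then
        echo "PDF_PASSWORD is not set" >&2
        return 1
    fi
    # $1 is the input PDF, $2 is the output text file.
    pdftotext -upw "$PDF_PASSWORD" "$1" "$2"
}

# Example (assumes PDF_PASSWORD was set earlier in the session):
# convert_protected lorem-ipsum.pdf output.txt
```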
Graphically Convert PDF to a Text File
If working with the command line is not your cup of tea, you can convert PDFs to text files using graphical software like Calibre. It is an ebook management application that you can use to view, organize, and modify PDF files on your system.
Calibre is available in the official repositories of most Linux distributions, so you can install it using a package manager.
To install Calibre on Ubuntu and Debian:
sudo apt install calibre
On Arch Linux:
sudo pacman -S calibre
On RHEL-based distributions like CentOS and Fedora, you can download Calibre using either DNF or Yum.
sudo dnf install calibre
sudo yum install calibre
How to Use Calibre to Convert PDF Files
Once installed, launch Calibre on your system using the Applications Menu. Alternatively, you can start Calibre from the terminal by typing:
calibre
To generate a text file from a PDF with Calibre:
Click on the Add Books option from the menu.
Locate and select the PDF file that you want to convert.
Highlight the PDF file from the center panel and select Convert Books from the menu.
From the Output format dropdown, select TXT.
Finally, click on OK to continue.
Calibre will now start converting the specified PDF file to a text document. You can check the status of the process by clicking on the Jobs option, located at the bottom-right of the window.
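Calibre also installs a command-line converter, ebook-convert, which can perform the same conversion without opening the graphical window. The following is a minimal sketch; pdf_to_txt_calibre is a hypothetical helper name, and the resulting text may differ from what pdftotext produces.

```shell
# Convert a PDF to plain text with Calibre's command-line converter.
# pdf_to_txt_calibre is a hypothetical helper name used only for this sketch.
pdf_to_txt_calibre() {
    pdf=$1
    # ebook-convert picks the output format from the output file's extension.
    ebook-convert "$pdf" "${pdf%.pdf}.txt"
}

# Example (assumes lorem-ipsum.pdf is in the current directory):
# pdf_to_txt_calibre lorem-ipsum.pdf
```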
Working With PDF Files in Linux
When you want to share a document with someone, converting it into a PDF before sharing is often the most efficient way. In the past, users had to install a dedicated PDF viewer on their system to display PDF files, but now almost every browser comes with a built-in PDF viewer.
You can find several applications that allow you to view and edit PDF files easily. Many Linux installations ship with LibreOffice, an office software suite that can also be used as a PDF editor.
About The Author
Deepesh is the Junior Editor for Linux at MUO. He writes informational guides on Linux, aiming to provide a blissful experience to all newcomers. Not sure about movies, but if you want to talk about technology, he's your guy. In his free time, you can find him reading books, listening to different music genres, or playing his guitar.