How To Search in PDF Files with PDFGREP on the Terminal
We show you how to use pdfgrep commands to search for text within PDF files in the Linux terminal.
What is PDFGREP?
Pdfgrep is a command-line utility that searches for text inside PDF files. It saves us the time of opening each file and searching it with a PDF viewer.
Step 1: Install Pdfgrep
In this case, we will use Ubuntu, so simply execute the following line.
sudo apt install pdfgrep
Other installation options are:
- Download the .TAR.GZ file at the following link.
- Or execute the following command:
git clone https://gitlab.com/pdfgrep/pdfgrep.git
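Then, from inside the source directory, enter each of the following lines in order (a fresh git checkout may first need ./autogen.sh to generate the configure script):
./configure
make
sudo make install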
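Whichever installation route is used, it can be worth confirming that the binary is reachable before moving on. The following is a minimal sketch, assuming a POSIX shell; the helper name check_pdfgrep is hypothetical and is not part of pdfgrep itself.

```shell
# check_pdfgrep is a hypothetical helper name, not something pdfgrep ships.
# It reports whether a pdfgrep command can be found and, if so, its version.
check_pdfgrep() {
    if command -v pdfgrep >/dev/null 2>&1; then
        pdfgrep --version
    else
        echo "pdfgrep not found in PATH" >&2
        return 1
    fi
}
```

Calling check_pdfgrep after the installation should print the version information; a non-zero return value suggests that the install step did not complete or that the install location is not in PATH.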
Step 2: How To Use Pdfgrep
Once pdfgrep is installed, this is the syntax to use:
pdfgrep [OPTION...] PATTERN [FILE]
Each of the elements is:
- Option: Sets the attributes that we can add to the search, for example -i or --ignore-case, which ignores the distinction between uppercase and lowercase letters when matching the pattern against the file.
- Pattern: Indicates an extended regular expression.
- File: This is the PDF file where the search will be executed.
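Because the pattern is an extended regular expression, the three elements above can be combined to search for several words at once. The sketch below is only an illustration: search_either is a hypothetical helper name, and it assumes the two words contain no characters that are special in regular expressions.

```shell
# search_either is a hypothetical helper: a case-insensitive search for
# either of two words, using extended-regex alternation (WORD1|WORD2).
search_either() {
    word1=$1
    word2=$2
    shift 2
    # -i is the OPTION, "word1|word2" is the PATTERN, "$@" holds the FILE(s).
    pdfgrep -i "${word1}|${word2}" "$@"
}

# Example call (file name taken from the article):
# search_either Solvetic Windows Solvetic.pdf
```

Quoting the pattern matters here, since an unquoted | would be interpreted by the shell as a pipe rather than passed to pdfgrep.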
We will start with a simple search: we look for the word Solvetic in the Solvetic.pdf file. To do this, we execute the following:
pdfgrep Solvetic Solvetic.pdf
In this case, the term appears only once in that file. Now we will look for the term Windows in an official Microsoft PDF file, and this will be the result that we will see:
We can see that the searched word is highlighted, which makes it easier to locate.
Now, if we add the -in parameter (-i to ignore case, -n to print page numbers), the results will include the page number where the term was detected:
Another option with pdfgrep is to list the PDF file(s) that contain a specific term. For this, we execute the following:
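With -n, each matching line of a single file is prefixed with its page number and a colon, which makes the output easy to post-process. As a rough sketch (pages_with_match is a hypothetical helper name, and it assumes exactly one PDF is searched so that no file name prefix appears), the distinct page numbers can be extracted like this:

```shell
# pages_with_match is a hypothetical helper: prints the distinct page
# numbers of one PDF on which PATTERN occurs, ignoring case.
# Usage: pages_with_match PATTERN FILE
pages_with_match() {
    # -i ignores case, -n prefixes every match with "PAGE:".
    # cut keeps only the page number; sort -nu orders and de-duplicates.
    pdfgrep -in "$1" "$2" | cut -d: -f1 | sort -nu
}
```

For the Microsoft document mentioned above, the call would take the form pages_with_match Windows followed by the name of that PDF file.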
pdfgrep Solvetic *pdf
In this way the PDF file where the term Solvetic is found will be listed:
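The command above prints every matching line prefixed with its file name. When only the file names are wanted, pdfgrep also offers -l (--files-with-matches), and -r (--recursive) to descend into subdirectories. A minimal sketch, assuming a pdfgrep version that supports both options; list_matching_pdfs is a hypothetical helper name:

```shell
# list_matching_pdfs is a hypothetical helper: prints only the names of
# the PDFs under DIR (searched recursively) that contain PATTERN.
# Usage: list_matching_pdfs PATTERN [DIR]   (DIR defaults to ".")
list_matching_pdfs() {
    # -r descends into directories, -l prints file names instead of matches.
    pdfgrep -rl "$1" "${2:-.}"
}

# Example call, searching the current directory tree:
# list_matching_pdfs Solvetic
```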
If we want to open the PDF file we can execute the following command:
With this, pdfgrep becomes an ideal solution when working with PDF files in Linux environments.