A simple program to remove the watermark from a PDF file.
- convert the PDF file into images using pdf2image
- convert the images to NumPy arrays
- find the pixels that match the watermark's RGB values and set them to white (255, 255, 255)
- save the modified images
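The steps above can be sketched roughly as follows. This is a minimal illustration, not necessarily what watermark.py does: the watermark colour `WATERMARK_RGB`, the `TOLERANCE` value, the function names and the output file names are all assumptions you would adapt to your own PDF.

```python
import numpy as np

# Assumed values: the watermark colour and tolerance depend on your PDF.
WATERMARK_RGB = (200, 200, 200)
TOLERANCE = 10


def whiten_watermark(image, rgb=WATERMARK_RGB, tolerance=TOLERANCE):
    """Return a copy of an H x W x 3 uint8 array with watermark-coloured pixels set to white."""
    # Work in int16 so the channel differences cannot wrap around.
    pixels = np.asarray(image).astype(np.int16)
    target = np.array(rgb, dtype=np.int16)
    # A pixel matches when every channel is within `tolerance` of the target colour.
    mask = (np.abs(pixels - target) <= tolerance).all(axis=-1)
    cleaned = np.array(image, copy=True)
    cleaned[mask] = 255
    return cleaned


def remove_watermark(pdf_path, out_dir="jiangyi3"):
    """Convert each page to an image, whiten the watermark, and save the result."""
    # Imported here so whiten_watermark can be used without these packages.
    from pdf2image import convert_from_path
    from skimage import io

    for number, page in enumerate(convert_from_path(pdf_path), start=1):
        cleaned = whiten_watermark(np.array(page))
        # Hypothetical naming scheme: one PNG per page.
        io.imsave(f"{out_dir}/page_{number}.png", cleaned)
```

A tolerance is used instead of an exact match because rendered watermarks often have slightly varying pixel values along their edges; set it to 0 if you want an exact RGB match.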
First you need to install the dependencies:
$ pip install pdf2image
$ pip install scikit-image
Inside the repository, create a directory to receive the modified images:
$ mkdir jiangyi3
To execute:
$ python watermark.py
Don't forget to specify the path of the PDF you want to convert.
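If you want the script to fail early on a wrong path, a small check like the one below may help. It is only a sketch: `prepare_paths` is a hypothetical helper, not part of watermark.py, and the default output directory simply mirrors the `jiangyi3` directory created above.

```python
from pathlib import Path


def prepare_paths(pdf_path, out_dir="jiangyi3"):
    """Check that the PDF exists and make sure the output directory is there."""
    pdf = Path(pdf_path)
    if not pdf.is_file():
        raise FileNotFoundError(f"PDF not found: {pdf}")
    out = Path(out_dir)
    # Creates the directory if it is missing; does nothing if it already exists.
    out.mkdir(parents=True, exist_ok=True)
    return pdf, out
```

Calling `prepare_paths("my_file.pdf")` before the conversion would raise a clear error for a missing file instead of failing later inside pdf2image.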