This Python script updates the metadata of the given PDF files, specifically the Producer and Creator fields. The purpose is to make PDFs that were not created by a scanner openable in applications such as ScanSnap Home. The script also creates 'output', 'input', and 'config' folders in the root directory to hold the processed files, the source files, and the configuration.
Dennis Biehl
MIT License
- os
- PyPDF2
- configparser
- Ensure that PyPDF2 is installed using `pip install PyPDF2`.
- Run the script by executing `start.py` to create 'output', 'input', and 'config' folders.
- Place PDF files in the 'input' folder.
- The script will merge each PDF's pages with its metadata and save the result in the 'output' folder.
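The folder setup in the steps above could look roughly like the following. This is a minimal sketch, not the script's actual code: the `root` parameter, the dictionary return value, and the `list_input_pdfs` helper are assumptions made for illustration.

```python
import os


def create_folders(root="."):
    """Create 'output', 'input', and 'config' under root and return their paths.

    The `root` parameter and the dict return type are assumptions.
    """
    paths = {}
    for name in ("output", "input", "config"):
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        paths[name] = path
    return paths


def list_input_pdfs(input_folder):
    """Return the sorted paths of all .pdf files in input_folder (hypothetical helper)."""
    return sorted(
        os.path.join(input_folder, name)
        for name in os.listdir(input_folder)
        if name.lower().endswith(".pdf")
    )
```

Because `os.makedirs` is called with `exist_ok=True`, running the script a second time leaves existing folders and their contents untouched.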
- `create_folders()`: Creates 'output', 'input', and 'config' folders if they don't exist and returns their paths.
- `create_config(config_file, config)`: Creates a configuration file with a 'Metadata' section.
- `update_config(config_file, config)`: Updates the configuration file with metadata values.
- `process_pdf(original_file_path, output_folder, config)`: Merges the pages of a PDF with its metadata and saves the result.
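As an illustration of the two configuration functions, here is a sketch built on the standard-library `configparser`. The default values, the fill-in-missing-keys behaviour of `update_config`, and the `load_config` wrapper are assumptions; the actual script may handle these differently.

```python
import configparser
import os

# Placeholder defaults; the real script's values are not documented here.
DEFAULT_METADATA = {"Producer": "ExampleProducer", "Creator": "ExampleCreator"}


def create_config(config_file, config):
    """Write a new configuration file with a 'Metadata' section."""
    config["Metadata"] = dict(DEFAULT_METADATA)
    with open(config_file, "w") as f:
        config.write(f)


def update_config(config_file, config):
    """Add any missing metadata keys (assumed behaviour) and rewrite the file."""
    if not config.has_section("Metadata"):
        config.add_section("Metadata")
    for key, value in DEFAULT_METADATA.items():
        if not config.has_option("Metadata", key):
            config.set("Metadata", key, value)
    with open(config_file, "w") as f:
        config.write(f)


def load_config(config_file):
    """Hypothetical wrapper: create the file on first run, update it afterwards."""
    config = configparser.ConfigParser()
    if os.path.exists(config_file):
        config.read(config_file)
        update_config(config_file, config)
    else:
        create_config(config_file, config)
    return config
```

Note that `configparser` lowercases option names by default, so the file will contain `producer` and `creator`, while lookups such as `config["Metadata"]["Producer"]` still work.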
- The 'output', 'input', and 'config' folders are created in the root directory.
- Metadata such as Producer and Creator are set during the merging process.
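One way the metadata step might be implemented is sketched below. It assumes PyPDF2 2.0 or later (for the `PdfReader` and `PdfWriter` names) and copies pages with `PdfWriter` rather than a merger class, which may differ from what the script actually does. The `build_metadata` helper and the returned output path are hypothetical.

```python
import os


def build_metadata(config):
    """Map the 'Metadata' config section to PDF info-dictionary keys (hypothetical helper)."""
    section = config["Metadata"]
    return {"/Producer": section["Producer"], "/Creator": section["Creator"]}


def process_pdf(original_file_path, output_folder, config):
    """Copy every page into a new PDF, set Producer and Creator, and save it."""
    # Imported here so the helper above works without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(original_file_path)
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)

    # Set the Producer and Creator fields from the configuration.
    writer.add_metadata(build_metadata(config))

    # Keep the original file name, but write into the output folder.
    output_path = os.path.join(output_folder, os.path.basename(original_file_path))
    with open(output_path, "wb") as f:
        writer.write(f)
    return output_path
```

The keys passed to `add_metadata` use the PDF info-dictionary names, which start with a slash (`/Producer`, `/Creator`).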
1.0.0 (2024-01-14)
- 2024-01-14: Initial release.
Ensure Python is installed on your system, navigate to the root folder, and execute:
`python start.py`
- The script assumes a specific folder structure. Adjust paths if your project structure is different.
- Confirm PyPDF2 installation using `pip install PyPDF2` before running the script.
