Repeated multi-image to PDF conversion: Automate it easily with Python code

In a previous post, I covered how to convert multiple images to PDF. Today, we're going to take it a step further. Suppose you've already created a multi-image PDF from one subfolder (A) of a specific folder, and that folder has more images in another subfolder (B). In other words, what if a folder contains a mix of subfolders that already have PDF files and subfolders that don't, and you want one PDF per subfolder? This is harder to describe in words than it is to see, so take a look below to see what I mean.

[Figure: illustration for the multi-image PDF conversion post]

Code description: Convert multiple images to PDF with Python

This code uses the Pillow and ReportLab libraries to automate multi-image to PDF conversion. It walks through a directory structure and builds a separate PDF from the images in each folder.

1. Load the required Python libraries

import os
from PIL import Image
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import portrait

The first step is to load the necessary libraries: os, Image from PIL (Pillow), and canvas from ReportLab. The portrait page-size helper from ReportLab is imported as well, although the code in this post never calls it.

  • os: Browse files in a directory and create paths.
  • Pillow (PIL): Open and process an image file.
  • ReportLab: Create a PDF file and insert an image.

2. Write an image processing function

def process_images(c, dir_path):
    # Collect the PNG files in the folder, sorted by name
    img_list = sorted(img_name for img_name in os.listdir(dir_path) if img_name.endswith(".png"))
    for img_name in img_list:
        img_path = os.path.join(dir_path, img_name)
        # Read the image dimensions, then close the file
        with Image.open(img_path) as img:
            img_width, img_height = img.size
        # Size the page to the image, draw the image, then finish the page
        c.setPageSize((img_width, img_height))
        c.drawInlineImage(img_path, 0, 0, width=img_width, height=img_height)
        c.showPage()

This function is a key part of processing image files and converting them to PDFs.

  • img_list: Collect only the ".png" files from the given directory and sort them. os.listdir(dir_path) returns every entry in the directory, and the condition if img_name.endswith(".png") keeps only the PNG files.
  • Open the image file: Build the image path, then open the image with Image.open().
  • Set the page size: Use setPageSize() to make the page match the image size.
  • Add the image: Use drawInlineImage() to insert the image into the PDF page.
  • showPage(): Finish the current page and start a new one.

This process converts each PNG file into one PDF page.
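
One thing to watch in the file-selection step: sorted() compares names as text, so "page10.png" sorts before "page2.png", and endswith(".png") does not match files named ".PNG". The following is a minimal sketch of a stricter listing step. It is not part of the original code, and natural_key and list_png_files are hypothetical helper names introduced only for illustration.

```python
import os
import re


def natural_key(name):
    """Split a file name into text and number chunks so that 2 sorts before 10."""
    parts = re.split(r"(\d+)", name)
    # With a capturing group, odd positions always hold the digit runs
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]


def list_png_files(dir_path):
    """Return the PNG file names in dir_path, matched case-insensitively, in natural order."""
    names = [
        name for name in os.listdir(dir_path)
        if name.lower().endswith(".png")
        and os.path.isfile(os.path.join(dir_path, name))
    ]
    return sorted(names, key=natural_key)
```

Using list_png_files(dir_path) in place of the sorted(...) line in process_images() would keep the page order aligned with numeric file names.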

3. Handle subdirectories within a directory

def process_subdirs(main_dir, output_dir):
    # First level: folders directly under main_dir
    for subdir in os.listdir(main_dir):
        subdir_path = os.path.join(main_dir, subdir)
        if os.path.isdir(subdir_path):
            # Second level: the folders that actually hold the images
            for second_subdir in os.listdir(subdir_path):
                second_subdir_path = os.path.join(subdir_path, second_subdir)
                if os.path.isdir(second_subdir_path):
                    pdf_path = os.path.join(output_dir, subdir, f"{second_subdir}.pdf")
                    if not os.path.exists(pdf_path):  # Run only if no PDF file has already been created
                        os.makedirs(os.path.dirname(pdf_path), exist_ok=True)
                        c = canvas.Canvas(pdf_path, pagesize=None)
                        process_images(c, second_subdir_path)
                        c.save()
                        print(f"Generated PDF file: {pdf_path}")

This function walks through the subdirectories of the given main directory and generates one PDF file for each second-level subdirectory.

  • os.listdir(): Lists the entries in the main directory and in each subdirectory.
  • os.path.isdir(): Checks whether an entry is a directory.
  • Create the path to the PDF file: os.path.join() builds the path, and the PDF file is only created if it doesn't already exist.
  • Create and save the PDF: Create a Canvas object, call process_images() to add the images, then save the PDF file.
  • os.makedirs(): Creates the folder that will hold the PDF file if it doesn't already exist.

This function traverses the directory structure and automatically converts the PNG files in each subdirectory to PDF.
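
The function above assumes exactly two levels of folders under main_dir. If the nesting depth varies, os.walk() can find every folder that holds PNG files. What follows is a rough sketch under that assumption, not the approach the original code takes. The names iter_image_folders and mirrored_pdf_path are hypothetical.

```python
import os


def iter_image_folders(main_dir):
    """Yield every folder below main_dir that directly contains a PNG file."""
    for folder, subdirs, files in os.walk(main_dir):
        subdirs.sort()  # visit subfolders in a stable, alphabetical order
        # Images placed directly in main_dir are skipped: there is no
        # folder name below main_dir to use for the PDF file name.
        if folder != main_dir and any(name.lower().endswith(".png") for name in files):
            yield folder


def mirrored_pdf_path(main_dir, output_dir, folder):
    """Map main_dir/a/b to output_dir/a/b.pdf, mirroring the input layout."""
    rel = os.path.relpath(folder, main_dir)
    return os.path.join(output_dir, rel + ".pdf")
```

Each yielded folder could then be passed to process_images() with a canvas created at the path returned by mirrored_pdf_path().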

4. Call the main function

main_dir = "C:\\Users\\user\\Documents\\Book"
output_dir = "C:\\Users\\user\\Documents\\Book_pdf"
process_subdirs(main_dir, output_dir)

Finally, run the PDF conversion on the given paths.

  1. main_dir: Path to the main directory that contains the images.
  2. output_dir: Path where the PDF files will be saved.
  3. Calling process_subdirs() converts all images under the main directory and saves them as PDFs.
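
Hard-coded paths mean editing the script for every new folder. As one possible alternative, here is a sketch that reads both paths from the command line with argparse. The parse_args helper, the argument names, and the script name convert.py in the comment are all assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Read the input and output directories from the command line."""
    parser = argparse.ArgumentParser(
        description="Convert each image folder under main_dir to a PDF in output_dir."
    )
    parser.add_argument("main_dir", help="folder that contains the image subfolders")
    parser.add_argument("output_dir", help="folder where the PDF files are written")
    return parser.parse_args(argv)


# Hypothetical usage, assuming the script is saved as convert.py:
#   python convert.py C:\Users\user\Documents\Book C:\Users\user\Documents\Book_pdf
# and, at the bottom of the script:
#   args = parse_args()
#   process_subdirs(args.main_dir, args.output_dir)
```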

Full Python code

import os
from PIL import Image
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import portrait

def process_images(c, dir_path):
    # Collect the PNG files in the folder, sorted by name
    img_list = sorted(img_name for img_name in os.listdir(dir_path) if img_name.endswith(".png"))
    for img_name in img_list:
        img_path = os.path.join(dir_path, img_name)
        # Read the image dimensions, then close the file
        with Image.open(img_path) as img:
            img_width, img_height = img.size
        # Size the page to the image, draw the image, then finish the page
        c.setPageSize((img_width, img_height))
        c.drawInlineImage(img_path, 0, 0, width=img_width, height=img_height)
        c.showPage()

def process_subdirs(main_dir, output_dir):
    # First level: folders directly under main_dir
    for subdir in os.listdir(main_dir):
        subdir_path = os.path.join(main_dir, subdir)
        if os.path.isdir(subdir_path):
            # Second level: the folders that actually hold the images
            for second_subdir in os.listdir(subdir_path):
                second_subdir_path = os.path.join(subdir_path, second_subdir)
                if os.path.isdir(second_subdir_path):
                    pdf_path = os.path.join(output_dir, subdir, f"{second_subdir}.pdf")
                    if not os.path.exists(pdf_path):  # Run only if no PDF file has already been created
                        os.makedirs(os.path.dirname(pdf_path), exist_ok=True)
                        c = canvas.Canvas(pdf_path, pagesize=None)
                        process_images(c, second_subdir_path)
                        c.save()
                        print(f"Generated PDF file: {pdf_path}")

# starting point
main_dir = "C:\\Users\\user\\Documents\\Book"
output_dir = "C:\\Users\\user\\Documents\\Book_pdf"
process_subdirs(main_dir, output_dir)

This code automates the process of finding PNG image files in multiple directories and converting them to PDF. Now you too can use Python to efficiently handle multiple image to PDF conversion tasks!

Additional description

Describing what I want in terms of Python code may make it harder to follow, so here are some additional details. For example, say folder A has subfolders 1, 2, 3, and so on, and folder B has the same subfolders 1, 2, 3, and so on. Each subfolder in folder A contains PNG image files. After running the code above, folder B holds one PDF file per image folder, named after that folder and containing all of its images.

However, suppose we keep the images in folder A even after their PDF file has been created in folder B. In that case, the code needs to skip any subfolder of folder A whose PDF file already exists when the conversion runs again. In the full code above, the line marked '# Run only if no PDF file has already been created' does this job. Now you should be able to easily convert multiple images to PDF, folder by folder.
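
To see which folders a rerun would skip before converting anything, the same existence check can be run on its own. This is only a sketch built on the two-level layout used by process_subdirs(). The helper split_done_and_pending is hypothetical and is not part of the original code.

```python
import os


def split_done_and_pending(main_dir, output_dir):
    """Return (done, pending) lists of second-level folders under main_dir.

    A folder counts as done when output_dir/<subdir>/<folder>.pdf already
    exists, the same test process_subdirs() uses to decide what to skip.
    """
    done, pending = [], []
    for subdir in sorted(os.listdir(main_dir)):
        subdir_path = os.path.join(main_dir, subdir)
        if not os.path.isdir(subdir_path):
            continue
        for second_subdir in sorted(os.listdir(subdir_path)):
            second_subdir_path = os.path.join(subdir_path, second_subdir)
            if not os.path.isdir(second_subdir_path):
                continue
            pdf_path = os.path.join(output_dir, subdir, f"{second_subdir}.pdf")
            if os.path.exists(pdf_path):
                done.append(second_subdir_path)
            else:
                pending.append(second_subdir_path)
    return done, pending
```

Printing the pending list first gives a quick preview of what the next run of process_subdirs() would generate.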
