I came across an amazing Python code snippet that converts PDF e-books into audiobooks with minimal code.
The code snippet uses two Python packages:
- PyPDF2: a free and open-source pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files. PyPDF2 can retrieve text and metadata from PDFs as well.
- pyttsx3: a text-to-speech conversion library for Python. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3.
The code is pretty straightforward, and it demonstrates how simple and cool Python is.
First, install the required packages:

    pip install PyPDF2
    pip install pyttsx3
Now create your Python script file and add:

    import PyPDF2
    import pyttsx3

    # Read the PDF by specifying its path on your computer
    # (PdfReader replaces the older PdfFileReader, which PyPDF2 3.0 removed)
    pdf_reader = PyPDF2.PdfReader(open('clcoding.pdf', 'rb'))

    # Get a handle to the speaker
    speaker = pyttsx3.init()

    # Go through the pages and read them aloud one by one
    pages_text = []
    for page in pdf_reader.pages:
        text = page.extract_text()
        pages_text.append(text)
        speaker.say(text)
        speaker.runAndWait()

    # Stop the speaker after completion
    speaker.stop()

    # Save the whole audiobook at the specified path
    # (raw string, so the backslash in the Windows path is not an escape)
    speaker.save_to_file('\n'.join(pages_text), r'E:\audio.mp3')
    speaker.runAndWait()
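Text extracted from a PDF often contains hard line breaks and words hyphenated across lines, which can make the speech sound choppy. Here is a rough sketch of one way to tidy the text before handing it to pyttsx3. The helper names `clean_page_text` and `pdf_to_audio_file`, and the default rate of 150, are my own assumptions and not part of the original snippet. The format of the saved audio also depends on the speech driver on your platform, so treat the file extension as a label and not a guarantee.

```python
import re


def clean_page_text(text):
    """Tidy raw PDF text so it reads more naturally when spoken (hypothetical helper)."""
    # Rejoin words hyphenated across a line break: "conver-\nsion" -> "conversion".
    # Limitation: a genuine compound split at a line end is also joined.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse remaining line breaks and repeated spaces into single spaces
    return re.sub(r"\s+", " ", text).strip()


def pdf_to_audio_file(pdf_path, audio_path, rate=150):
    """Extract all text from a PDF and save it as one audio file (hypothetical helper)."""
    # Imported here so clean_page_text can be used without these packages installed
    import PyPDF2
    import pyttsx3

    reader = PyPDF2.PdfReader(pdf_path)
    # extract_text() can return an empty string for image-only pages
    pages = [clean_page_text(page.extract_text() or "") for page in reader.pages]
    book_text = " ".join(p for p in pages if p)

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.save_to_file(book_text, audio_path)
    engine.runAndWait()
```

With both packages installed, calling `pdf_to_audio_file('clcoding.pdf', 'audio.mp3')` should write the whole book to a single file without reading it aloud first.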
I also found a very similar tutorial from 2020 by Aman Kharwal that explains the code in more detail.