PDF Generator
20/12/2022
A team that I am currently working with asks users to print off a paper form, fill it out, scan and finally email it to the appropriate people. This has lots of issues such as allowing user errors, not enforcing required fields, sending data the the wrong person or hard to read handwriting.
We could use a solution like Google Forms but I don’t think it’s professional to send users to another website for simple forms.
I’ll create a form on our website that is sent to an API to generate a PDF from the form data and send it to the appropriate people.
Generate a PDF
There are various python libraries that are available to use to generate .pdf
files, however many of them require the developer to install an executable and add it to the path. Some of the choices I came across were:
reportlab
pdfkit
WeasyPrint
fpdf2
pdfkit
looked promising since it could convert HTML along with css styling to a pdf but required the install of wkhtmltopdf
which broke my requirements. WeasyPrint
also required too much set up and ReportLab
was too complex.
I chose to go ahead with FPDF2
since it seemed to be the simplest to use. The form that I want to generate is not complicated. Using HTML to format the pdf seems to be the easiest way forward for me although a drawback of this package is that it does not support css styling. Therefore, I’m limited to a small number of core HTML tags that the package supports. Here is the documentation.
Tech stack
The data flow, from website to email, starts with my SvelteKit
frontend website. Users enter the data via a HTML form whic is then sent to an existing Python Flask API that I’ve modified. It was too much work to set up a whole new API just to receive data from one form. The API generates the PDF, saves it locally and then sends a copy to both the person who handles the data and one to the customer in a confirmation email.
Generating the PDF
As I said before, I’ll be using the FPDF2
library which can be installed via pip
.
pip install fpdf2
From here, creating a PDF is pretty straigh forward. I found that HTML is the best way to go forward rather than dealing with cells and formatting via the API.
First, let’s create a pdf and set the font. We set the font here since we can’t use styling within the HTML.
from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
pdf.set_font(family='helvetica', size=0)
From here, I use a .html
file that only uses basic tags. This makes up the template for my PDF file within which are variables that I can replace. Wherever I want to replace a variable, I enter {{variable_name}}
which is replaced using the replace()
method:
with open('template.html', encoding='UTF-8') as file:
html = file.read()
replacement = 'john.smith@gmail.com'
html = html.replace('{{email}}', replacement, 2)
Here, my HTML was a link with an email address. I need to replace the {{email}}
variable in 2 places so I set the occurences to 2.
<a href='mailto:{{email}}'>{{email}}</a>
From here, this can be writted to the PDF via the write_html
method. It’s also useful to add a title and author to the PDF to make it look nice:
pdf.write_html(html)
pdf.set_title('My Awesome PDF')
pdf.set_author('Fraser Rennie')
Finally, to export the file, you set the file name and call the output
method:
pdf.output('hello-world.pdf')
We are planning to send the PDF in an email so it’s important that you return the file name to use later.
Unique file name
If you are saving files on the server, you’ll likely want to prevent overwrites. In my case, I am saving the files as firstname-lastname-form-title.pdf
. Now what can happen is that if two people with the same name fill out the form, the second person will overwrite the first form causing a loss of data.
To get around this, I just add 5 random letters to the front of the file name:
import random
import string
letters = ''.join(random.choice(string.ascii_letters) for _ in range(5))
file_name = letters + '-' + file_name
There are of course other, and possibly better, ways of doing this like adding a timestamp at time of creation. Additionally, you can create logic to check if the current file exists, despite it being highly unlikely, and request a new string of random letters in the case that it does exist.
Sending emails
The form should be emailed to both the person that handles the form data as well as the person who submitted the form as confirmation.
Sending emails is pretty straight forward when it comes to just a simple HTML message. Below is a method to send the email given the sender
, receiver
(which can either be a string or list of strings) and the msg
. The msg
variable often causes some issues.
It is extremely important that you store your credentials in environment variables, especially if the source code goes anywhere on the internet, including GitHub etc.
I’ve made the mistake of uploading credentials which resulted in tens of thousands of spam emails being sent from my account. Once that has happened, your account will be flagged as spam and you’ll have a really tough time making it out of the spam folder. At that point, just create a new email to send from.
import os
import ssl
import smtplib
from dotenv import load_dotenv
load_dotenv()
def send_email(sender, receiver, msg):
email_username= os.getenv('EMAIL_USER')
email_pass = os.getenv('EMAIL_PASS')
context = ssl.create_default_context()
with smtplib.SMTP_SSL('mail.yourserver.co.uk', 465, context=context) as server:
server.login(email_username, email_pass)
server.sendmail(sender, receiver, msg.as_string())
server.quit()
To create the email we attach both a text and HTML version of the content. First we define our variables and import the salient libraries.
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr
sender = 'no-reply@domain.com'
receiver = 'john.smith@gmail.com'
text = 'Hello World'
with open('email_template.html', encoding='UTF-8') as file:
email_html = file.read()
# Replace your HTML variables if you need to.
From here, we format the header information of the message. Much of this is just useful information for the recipient. If this were not included, the email would be sent but there would be no information about the sender or recipient.
msg = MIMEMultipart('alternative')
msg['From'] = formataddr(('Sender Name', sender))
msg['To'] = receiver
msg['Subject'] = 'Your Subject'
part1 = MIMEText(text, 'plain')
part2 = MIMEText(email_html, 'html')
msg.attach(part1)
msg.attach(part2)
So far, our email consists of the text/html content along with metadata about the email. At this point, you can send off your email, although I want to attach the PDF that we created earlier. There are various ways that I found online however this works for me every time:
with open(pdf_file_name, 'rb') as f:
attach = MIMEApplication(f.read(),_subtype='pdf')
attach.add_header('Content-Disposition','attachment',filename=str(pdf_file_name))
msg.attach(attach)
# Now send your email
send_email(sender, recipients, msg)
Frontend Form
The form is just a simple HTML form where the names of each field make up the json
data that is posted to the API endpoint.
Here, the isSubmitting
variable disables the submission button so that the user cannot spam the form. The isSubmitted
variable displays a success message under the form once the data has been sent. This is reset after 8 seconds.
let isSubmitted = false
let isSubmitting = false
async function send_data(data) {
const res = await fetch('https://api.com/route/to/endpoint', {
method: 'POST',
body: JSON.stringify(data)
})
isSubmitting = false
}
function onSubmit(e) {
isSubmitting = true
const form_data = new FormData(e.target);
const data = {};
for (let field of form_data) {
const [key, value] = field;
data[key] = value;
}
send_data(data)
const form = document.querySelector('form');
form.reset();
isSubmitted=true
setTimeout(() =>{
isSubmitted=false
}, 8000);
}
The form is pretty standard and you’ll be able to find lots of Svelte examples on how to handle form submission. You may have noticed that I don’t bother to handle the status code of the response from the POST
request. This is for no good reason and I’ll implement that in the future if the form receives high traffic.