Convert HTML to PDF with JavaScript Using Puppeteer

Hey developers! In this article, I am going to show you how to convert HTML to PDF using JavaScript with Puppeteer. We will look at a few Puppeteer operations along the way, for instance loading raw HTML into a page and converting it to a PDF. We will also touch on how to add external CSS in a Puppeteer script.

What is Puppeteer? 

Now, for those of you who do not know, Puppeteer is a free, open-source NodeJS library (see Puppeteer on NPM). It provides a high-level API to control headless Chrome over the DevTools Protocol. It is essentially a browser automation tool that is also well suited to UI testing. You can read more about Puppeteer in its GitHub repository. It has a wide range of uses, including:

  • UI testing with keyboard input
  • Form submission testing
  • Crawling a SPA (Single-Page Application)
  • Generating pre-rendered content
  • Web scraping
  • Generating screenshots and PDFs
  • Automating form submission
  • … (and the list goes on)

What is a Headless Browser?

Simply put, a headless browser is a browser without a GUI. Headless browsing means performing most browser tasks without ever opening a browser window; the browser runs in the background and is driven from code. That suits this Puppeteer JavaScript tutorial well, as our project does not have a GUI either. In this article, we are going to use only one of Puppeteer’s features in a headless browser: the conversion of HTML to PDF. So, let’s begin.
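To make the idea of “headless” a little more concrete, here is a minimal sketch. It assumes the puppeteer package is already installed (we install it in Step 2), and the function name getRenderedTitle is made up for this illustration. It starts Chrome without a window, loads an HTML string, and reads the page title back.

```javascript
// Hypothetical helper for illustration only; not part of the project files.
// `puppeteer` is required inside the function, so this file can be loaded
// even before the package has been installed.
async function getRenderedTitle(html) {
  const puppeteer = require("puppeteer");

  // headless: true means no browser window is opened
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.setContent(html);
    return await page.title();
  } finally {
    // Close the browser even if something above throws
    await browser.close();
  }
}
```

Calling getRenderedTitle("<title>Hello</title>") from an async function should resolve to the page title, without any browser window appearing on screen.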

Table of Contents

  1. Create NodeJS Project
  2. Install Puppeteer in NodeJS
  3. Create Backend service in NodeJS
  4. Convert HTML to PDF Using JavaScript
  5. Testing Conversion of HTML to PDF in Postman
  6. Conclusion

Step 1: Create NodeJS Project

First, create a NodeJS project by making a new folder named pdf_generator. Then open a terminal, change the current directory to pdf_generator, and execute the following command:

				
npm init -y
				
			

Next, create a new file named “index.js”. You now have a NodeJS project. Let’s install all the required node packages in the next step.

Step 2: Install Puppeteer in NodeJS

Open a terminal inside your project root directory (“pdf_generator”). Now, execute these commands to install the required node packages, including Puppeteer and ExpressJS:

				
					npm i puppeteer 
npm i express 
npm i node-fetch@2.6.5 

				
			

Step 3: Create Backend Service in NodeJS

To convert HTML to PDF, we first need to receive a request to perform the conversion. Puppeteer, coupled with ExpressJS, will form a backend service that accepts incoming HTTP calls. This way, we can return the converted PDF file to the user. Let’s first create the routes for our NodeJS service. Add the following code to the index.js file:

				
					const express = require('express'); 
const app = express(); 
const service = require('./service'); // the service.js file created in Step 4
app.use(express.json()); 
  
app.get("/test", service.hello_world); 
app.post("/pdf", service.generate_pdf); 
  
app.listen(3000, ()=> { 
    console.log(`project running on port 3000`); 
}); 
				
			

Step 4: Convert HTML to PDF Using JavaScript

For the conversion of HTML to PDF, we need the HTML code first. We will read the HTML content from a template file stored beforehand in our project. Of course, you can also have the HTML sent as a payload in the request body to the service. I’ve created a “dummy.html” file in the project root and added dummy HTML code to it.
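If you would rather accept the HTML in the request body, one possible approach is sketched below. The helper name resolveHtml and the “html” field in the JSON payload are assumptions made for this example; they are not part of the service in this article. The helper uses the HTML from the body when it is present and falls back to the template file otherwise.

```javascript
const fs = require("fs");

// Hypothetical helper: assumes the client sends JSON such as { "html": "<h1>Hi</h1>" }.
// When no usable "html" string is present, the template file is read instead.
function resolveHtml(body, templatePath) {
  if (body && typeof body.html === "string" && body.html.trim() !== "") {
    return body.html;
  }
  return fs.readFileSync(templatePath, "utf8");
}
```

Inside a route handler, this could be called as resolveHtml(req.body, "./dummy.html"), since express.json() already parses JSON bodies into req.body. Keep in mind that rendering HTML supplied by a client means the headless browser will load whatever that HTML references, so this is best kept to trusted callers.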

Now, let’s create a file named “service.js” and add the following code to it.  

				
const puppeteer = require("puppeteer");
const fs = require("fs");

// Simple route to confirm the service is up
exports.hello_world = async (req, res) => res.send('Hello World');

exports.generate_pdf = async (req, res) => {
  let browser;
  try {
    browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();

    // Read the HTML template (the path is relative to where node is started)
    const html = fs.readFileSync("./dummy.html", "utf8");
    await page.setContent(html, { waitUntil: "domcontentloaded" });

    const pdf = await page.pdf({
      format: "A4",
      printBackground: false,
      preferCSSPageSize: true,
      displayHeaderFooter: true,

      headerTemplate: `<div class="header" style="font-size:20px; padding-left:15px;"><h1>Main Heading</h1></div> `,
      footerTemplate: '<footer><h5>Page <span class="pageNumber"></span> of <span class="totalPages"></span></h5></footer>',
      // Margins leave room on each page for the header and footer
      margin: { top: "200px", bottom: "150px", right: "20px", left: "20px"},
    });

    // Send the PDF back in response to the API call
    res.contentType("application/pdf");
    res.send(Buffer.from(pdf)); // Buffer.from accepts a Buffer or a Uint8Array
  } catch (err) {
    console.error(err);
    res.status(500).send("Failed to generate PDF");
  } finally {
    // Always close the browser so Chrome processes do not pile up
    if (browser) await browser.close();
  }
};
				
			

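The “headerTemplate” and “footerTemplate” parameters are the header and footer layouts for your PDF; they are printed only because “displayHeaderFooter” is set to true. They are repeated on every page of the generated PDF. Puppeteer fills the elements with the “pageNumber” and “totalPages” classes with the current page number and the total page count. Lastly, the “margin” parameter sets the page margins, which leaves room at the top and bottom of each page for the header and footer. To see how to inject CSS in various ways, check out this article.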
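As a rough illustration of one way external CSS could be applied, the sketch below reads a stylesheet and inlines it into the HTML string before it is handed to page.setContent. The helper name inlineCss and the file name style.css are assumptions for this example only.

```javascript
const fs = require("fs");

// Hypothetical helper: wraps the CSS in a <style> tag and places it just
// before </head>. If the HTML has no </head>, the tag is put at the start.
function inlineCss(html, css) {
  const styleTag = `<style>${css}</style>`;
  const headClose = html.search(/<\/head>/i);
  if (headClose === -1) {
    return styleTag + html;
  }
  return html.slice(0, headClose) + styleTag + html.slice(headClose);
}

// Hypothetical usage inside generate_pdf (style.css is an assumed file name):
//   const css = fs.readFileSync("./style.css", "utf8");
//   await page.setContent(inlineCss(html, css), { waitUntil: "domcontentloaded" });
```

Puppeteer also offers page.addStyleTag, which accepts a path, url, or content option and can be called after page.setContent. Note that with printBackground set to false, background colors and images defined in the CSS will not appear in the PDF.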

On a side note, you should also check out my article on how to download PDF in Angular Framework through a button click.

Step 5: Testing Conversion of HTML to PDF in Postman

Lastly, we are going to test the conversion of HTML to PDF using Puppeteer. Start the service with node index.js, then open Postman and enter the following URL: http://localhost:3000/pdf

Make sure to set the request method to POST, as shown in the picture below.

Testing to convert from HTML to PDF using JavaScript using Puppeteer

As a result, we have successfully converted raw HTML to PDF using Puppeteer. The generated PDF has been sent back to the user in the response body by the NodeJS service.  
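If you prefer to test from a script instead of Postman, here is a small sketch that uses only NodeJS built-in modules. The function names (isPdf, fetchPdf, savePdf) and the output file name are made up for this example. It assumes the service from Step 3 is running on port 3000.

```javascript
const http = require("http");
const fs = require("fs");

// A PDF file starts with the ASCII bytes "%PDF-"
function isPdf(buffer) {
  return buffer.length >= 5 && buffer.subarray(0, 5).toString("latin1") === "%PDF-";
}

// Sends a POST request and resolves with the raw response body as a Buffer
function fetchPdf(url) {
  return new Promise((resolve, reject) => {
    const req = http.request(url, { method: "POST" }, (res) => {
      const chunks = [];
      res.on("data", (chunk) => chunks.push(chunk));
      res.on("end", () => resolve(Buffer.concat(chunks)));
    });
    req.on("error", reject);
    req.end();
  });
}

// Downloads the PDF, checks that it looks like one, and writes it to disk
async function savePdf(url, outFile) {
  const body = await fetchPdf(url);
  if (!isPdf(body)) {
    throw new Error("Response does not look like a PDF");
  }
  fs.writeFileSync(outFile, body);
  return body.length; // number of bytes written
}
```

With the service running, savePdf("http://localhost:3000/pdf", "output.pdf") could be called from a separate script; output.pdf is just a placeholder name.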

Conclusion

In summary, we were successful in performing the following steps:  

  • Creating a NodeJS backend service
  • Using ExpressJS to handle routing and other HTTP services
  • Reading HTML from a static file
  • Converting HTML to PDF using JavaScript with Puppeteer
  • Adding a header and footer separately using Puppeteer
  • Sending the generated PDF to the user as an HTTP response
  • Pointing to how CSS can be added to the PDF (demonstrated here)

That’s all, folks! I hope you now have a good idea of how to convert HTML to PDF using JavaScript with Puppeteer. For any queries, please leave your comments below.

Have a great one!