How to Use Puppeteer in Node.js (with Examples)

Puppeteer is a powerful Node.js library that allows you to control the browser programmatically. In this article, we will explore how to use Puppeteer, along with some practical examples like taking website screenshots, saving HTML content to PDF, and downloading images from a website.
by Josephine Loo ·

Contents

    If you’re looking for a Node.js library that can run the browser and take screenshots of a website, save its content to a PDF file, or download images from it, try Puppeteer. As a developer, you may have come across scenarios where you have to perform these tasks programmatically for various reasons. It could be taking screenshots of your website to verify its appearance or behavior visually after a code change, saving content from another website for offline reference, or any other reasons that require you to perform these tasks repeatedly.

    In all these situations, Puppeteer can be a handy tool. In this article, we’ll learn how to install the Puppeteer library in your Node.js project using npm, and use it to perform the tasks mentioned above. Let’s dive in!

    What is Puppeteer

    Puppeteer is an open-source Node.js library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. It can be used to run the Chrome/Chromium browser in headless mode to automate the browser and perform tasks like taking website screenshots, generating PDFs, downloading images, and more, all using code.

    Although it is developed to be run on Chrome, the collaboration between Puppeteer and Mozilla to support cross-browser testing coverage for developers has made Puppeteer available on Firefox from version 2.1.0 onwards.

    Pre-requisites

    To use Puppeteer in Node.js, you will need to have these installed:

    For reference, the versions used in this tutorial are 20.5.0 for Node.js and 9.8.0 for npm. That said, it’s not necessary to use the same versions as this tutorial.

    Installing Puppeteer

    Create a new folder for your project by running the mkdir command followed by the folder name in your terminal/command prompt. The command below will create a new folder named “puppeteer-tutorial”:

    mkdir puppeteer-tutorial
    

    Then, navigate to the project directory:

    cd puppeteer-tutorial
    

    Next, initialize a new Node.js project in the folder by running the command below:

    npm init
    

    It will prompt you to input values for a few properties of the project, press enter all the way to use the default values. Otherwise, you can enter your custom values.

    Once you have run through all the npm init steps, a package.json file will be generated in the project directory.

    screenshot of project directory

    Next, run the command below to install the Puppeteer library in your Node.js project using npm:

    npm i puppeteer
    

    When you run the command above to install Puppeteer, a recent version of Chromium which is guaranteed to work with the Puppeteer API will also be downloaded automatically. You will see it in the terminal/command prompt:

    screenshot of Chromium download progress

    Now that we have the Puppeteer library installed in our Node.js project, let's learn how to use it!

    🐻 Bear Tips: You can refer to How to Run Puppeteer with Firefox to run it on Firefox too.

    Using Puppeteer in Node.js

    In your project directory, create a new file named “index.js”. This is where we will write our code to use Puppeteer.

    In the “index.js” file, import puppeteer:

    const puppeteer = require('puppeteer');
    

    Next, create a self-invoking function. We will write our code within it to launch the browser and perform various actions using Puppeteer.

    (async () => {
    	// use Puppeteer here
    })();
    

    Alternatively, you can also create a function and call the function explicitly to do the same thing.

    function launchPuppeteer(){
    	// use Puppeteer here
    }
    
    launchPuppeteer();
    

    Typically, we will start with launching the browser, opening a new page, and then navigating to the target URL. Then, perform various actions on the web page and close the browser when it's done.

    (async () => {
    
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        await page.goto('https://www.bannerbear.com/');
    
        // perform various browser actions
    
        await browser.close();
    
    })();
    

    Let’s look at some examples…

    Taking Website Screenshots

    One thing that you can do with Puppeteer is to take screenshots of a website. After navigating to the target URL, we can use the page.screenshot() method to take a screenshot of the web page. We can also control the size of the screenshot by setting the viewport to specific width and height, as shown in the code below:

    const puppeteer = require('puppeteer');
    
    (async () => {
    
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
    
      // Set viewport width and height
      await page.setViewport({ width: 1280, height: 720 });
    
      const website_url = 'https://www.bannerbear.com/blog/how-to-convert-html-into-pdf-with-node-js-and-puppeteer/';
      await page.goto(website_url, { waitUntil: 'networkidle0' });
    
      // Capture screenshot
      await page.screenshot({
        path: 'screenshot.jpg',
      });
    
      await browser.close();
    })();
    

    Here’s the result of the screenshot:

    screenshot taken by Puppeteer

    This can be useful for taking scheduled screenshots of a website automatically for archiving purposes. And by setting the viewport to different sizes, you can take screenshots of the website on different devices (eg. desktop, tablet, and mobile) with a single piece of code.

    Saving HTML to PDF

    Puppeteer can also be used in Node.js to save the HTML content of a web page to PDF. You can either pass the URL of the web page to Puppeteer or convert the HTML code into a PDF file directly.

    Similar to taking screenshots of a website, we will first launch the browser, open a new page, and navigate to the target URL. To enhance the output in the PDF, we need to change the CSS media type to screen using page.emulateMediaType('screen'). Next, save the web page to PDF using page.pdf() with properties like path, margin, printBackground, and format:

    const puppeteer = require('puppeteer');
    
    (async () => {
    
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        const website_url = 'https://www.bannerbear.com/blog/how-to-download-images-from-a-website-using-puppeteer/'; 
        await page.goto(website_url, { waitUntil: 'networkidle0' }); 
    
        //To reflect CSS used for screens instead of print
        await page.emulateMediaType('screen');
    
        // Download the PDF
        const pdf = await page.pdf({
            path: 'result.pdf',
            margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
            printBackground: true,
            format: 'A4',
        });
    
        await browser.close();
    })();
    

    The content of the web page will be saved to a PDF file in the specified file path:

    screenshot of the PDF file

    Different from taking a screenshot, saving the content of a web page in a PDF file not only saves its appearance but also preserves the links. You can click on the links on the PDF file and open them in a browser.

    screenshot of the PDF with the mouse hovering over a link

    To convert HTML code into PDF, follow the full tutorial on How to Convert HTML into PDF with Node.js and Puppeteer.

    Downloading Images

    You can also use Puppeteer to download images from a web page in different formats including JPEG, PNG, SVG, and GIF. As you can listen to page events using page.on(),  it allows you to listen to response and catch responses where the URL match the specified regular expression.

    The code below will catch responses with URL ending with ".jpg", ".png", ".svg", or ".gif" and save them into a folder named /images after renaming them. They will be renamed to image-x, where x is an index that counts the images.

    const puppeteer = require('puppeteer');
    const fs = require('fs');
    
    (async () => {
    
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
    
      let counter = 0;
      page.on('response', async (response) => {
        const matches = /.*\\.(jpg|png|svg|gif)$/.exec(response.url());
        console.log(matches);
        if (matches && (matches.length === 2)) {
          const extension = matches[1];
          const buffer = await response.buffer();
          fs.writeFileSync(`images/image-${counter}.${extension}`, buffer, 'base64');
          counter += 1;
        }
      });
    
      await page.goto('https://www.bannerbear.com/solutions/automate-your-marketing/');
    
      await browser.close();
    })();
    

    Here's a screenshot of the folder with the downloaded images:

    screenshot of the folder containing the downloaded images

    Using Bannerbear API

    If you want to add a watermark to the screenshots and images, or use them to create other visual content, you can automate the process with Bannerbear. It provides a simple REST API that offers easy integration into any existing platform or app to generate designed visual content like images and videos automatically.

    First, you will need a design template in your Bannerbear account like the one in the screenshot below:

    screenshot of a Bannerbear design template

    The image on the phone screen can be replaced with other images by passing their URLs to the Bannerbear API. We can integrate it into a Puppeteer task like one of the examples above by passing the images we got using Puppeteer (eg. a website screenshot) to the template, and it's all done with code.

    Here's an example of the result:

    screenshot of the image created automatically by Bannerbear using the design template

    As different images can be passed to the same Bannerbear template to generate unique results, it can be useful for scenarios where you need to generate a large number of images with different content. It can save you lots of time and effort compared to designing and creating each image manually.

    🐻 Bear Tips: You don't have to create the design template from scratch—duplicate one from the Template Library in seconds!

    What’s Next

    Now that you have learned how to use Puppeteer in Node.js to perform various tasks and how Bannerbear can be seamlessly integrated into the workflow to automate them, why not further explore how you can utilize Bannerbear?

    You can start for free and follow these tutorials:

    About the authorJosephine Loo
    Josephine is an automation enthusiast. She loves automating stuff and helping people to increase productivity with automation.

    Integrating Bannerbear Into Your Web Applications

    Bannerbear is an API that helps you automatically generate custom images, videos, and more from templates. In this article, we’ll explore how you can integrate Bannerbear into your web applications.

    5-Minute Crash Course to Facebook Single Image Ads (A 2024 Guide)

    Facebook single image ads can help both B2B and B2C brands reach and engage a diverse audience. Read our guide to learn what you need to know about this ad type.

    FFmpeg Video Editing Essentials: Trimming, Merging, Subtitling, and More!

    As an open-source tool, FFmpeg provides a wide array of functions for handling media files. Let's learn how to use FFmpeg to edit videos programmatically, including trimming, merging, adding subtitles, etc.

    Automate & Scale
    Your Marketing

    Bannerbear helps you auto-generate social media visuals, banners and more with our API and nocode integrations

    How to Use Puppeteer in Node.js (with Examples)
    How to Use Puppeteer in Node.js (with Examples)