How to Use Crawlers by Bannerbear to Auto Generate Instagram Stories from a News Site or Blog

June 2020

automation, nocode, tutorial

Crawlers is a new integration from Bannerbear that enables you to pull data in from public websites in order to generate images from your Bannerbear templates. One particular use case for this is auto-generating Instagram Stories based off blog / news articles. Here's how!

Here's a common scenario.

If you regularly add content to a blog, or regularly add products to an ecommerce store, most of the time you'll want to promote that content across multiple other channels. This means making visual assets, which can get very tedious, very quickly - I've been there!

There must be a better way, right?

I've already posted a tutorial on how to use Zapier, the Shopify API and Bannerbear to auto generate ecommerce banners. But what if there's no API to use? Or what if you don't have access to the API?

If you're creating automations for clients, you might not have access to their API or their site might not even have an API to begin with - in this scenario you can use Bannerbear Crawlers to extract the data you need.

Auto-generate Instagram Stories from Medium.com

In this tutorial we are going to use Heated by Medium as the data source.

The final auto-generated images will look like this - the long portrait style is Instagram Story size:

In this tutorial I will cover:

  1. Adding the Bannerbear Template
  2. Creating the Crawler
  3. Testing the Crawler
  4. Starting a Crawl
  5. Starting a Crawl via API

Let's get started!

Adding the Bannerbear Template

If you haven't created a Bannerbear account yet, create one now (it's free!) and start a new project called Heated (or anything you like).

I have already prepared a sample template you can use for this tutorial, which you can find in the template library. Add this to your Heated project:

Creating the Crawler

On the template page you will see the new Crawler option. Click it and choose Create New Crawler:

The only information Bannerbear needs at this point is the website you want to pull data from. Enter http://heated.medium.com here:

After saving, you will see more options.

Every layer of the template you just added is listed so that you can define a CSS Selector for it, plus the attribute to read from the matched element (such as content, src or text). The selectors identify the containers or tags on the page that you want to pull data from. To find the right selector for the pages you are crawling, you will either need to inspect the page's source code or you can try using a plugin such as Copy CSS Selector.

For this tutorial, enter the following configuration:

Background

  • meta[property="og:image"]
  • content

Avatar

  • article a[rel="noopener"] img
  • src

Author

  • meta[name="author"]
  • content

Date

  • article span span div a[rel="noopener"]
  • text

Title

  • meta[property="og:title"]
  • content

Reading Time

  • meta[name="twitter:data1"]
  • value

Like so:
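To get a feel for what each selector / attribute pair does, here is a rough Python sketch that reads the same meta tags from a page. This is not how Bannerbear works internally; it only handles the meta selectors from the config above, and SAMPLE_HTML, META_CONFIG, MetaCollector and extract_layers are names I made up for the illustration. The sample markup is hypothetical too, so real Medium pages will differ.

```python
from html.parser import HTMLParser

# Hypothetical sample page, only for illustration; real markup will differ.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Eating Red Beans and Rice">
<meta property="og:image" content="https://example.com/cover.jpg">
<meta name="author" content="Jane Doe">
<meta name="twitter:data1" value="5 min read">
</head><body></body></html>
"""

# layer -> (attribute to match on, expected value, attribute to read)
# Mirrors the meta[...] selectors from the tutorial config.
META_CONFIG = {
    "Background": ("property", "og:image", "content"),
    "Author": ("name", "author", "content"),
    "Title": ("property", "og:title", "content"),
    "Reading Time": ("name", "twitter:data1", "value"),
}


class MetaCollector(HTMLParser):
    """Collects the attributes of every <meta> tag in document order."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            self.metas.append(dict(attrs))


def extract_layers(html, config):
    """Return {layer: value} using the first matching <meta> tag per layer."""
    parser = MetaCollector()
    parser.feed(html)
    result = {}
    for layer, (match_attr, wanted, read_attr) in config.items():
        result[layer] = None
        for meta in parser.metas:
            if meta.get(match_attr) == wanted:
                result[layer] = meta.get(read_attr)
                break  # keep the first match only
    return result


layers = extract_layers(SAMPLE_HTML, META_CONFIG)
print(layers)
```

The Avatar and Date layers use descendant selectors (article a[rel="noopener"] img), which need a proper CSS selector engine, so I left them out of this sketch.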

Testing the Crawler

We can test the crawler right here on the config screen. Add any heated.medium.com url to the testing box and hit Run Test. For example, you can use this url: https://heated.medium.com/eating-red-beans-and-rice-is-a-version-of-self-care-54516505adbd

The test will take a few seconds. If the test is successful you will see a results box showing the data that Bannerbear found on the page, according to your above config:

Troubleshooting Common Problems

If the test comes back with empty or unexpected data, it probably means your CSS Selectors aren't quite right. There are a couple of things to check:

  • Are there any accidental spaces in your selectors (e.g. before or after)? These will confuse the crawler, so delete them.
  • Is there more than one of the specified element on the page? The crawler will return the first element only, so try to make your selector more specific.
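If you keep your selectors in a file or script before pasting them into Bannerbear, a tiny check like the one below can catch the stray-space problem early. It is only a sketch: CRAWLER_CONFIG and find_padded_selectors are hypothetical names, and the trailing space on the Author selector is there on purpose to show what gets flagged.

```python
# layer -> (CSS selector, attribute), copied from the tutorial config.
# The Author selector has a deliberate trailing space for the demo.
CRAWLER_CONFIG = {
    "Background": ('meta[property="og:image"]', "content"),
    "Avatar": ('article a[rel="noopener"] img', "src"),
    "Author": ('meta[name="author"] ', "content"),
    "Title": ('meta[property="og:title"]', "content"),
}


def find_padded_selectors(config):
    """Return the layers whose selector has leading or trailing whitespace."""
    return [
        layer
        for layer, (selector, _attribute) in config.items()
        if selector != selector.strip()
    ]


def clean_config(config):
    """Return a copy of the config with whitespace stripped from both fields."""
    return {
        layer: (selector.strip(), attribute.strip())
        for layer, (selector, attribute) in config.items()
    }


padded = find_padded_selectors(CRAWLER_CONFIG)
cleaned = clean_config(CRAWLER_CONFIG)
print("Layers to fix:", padded)
```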

Getting the right CSS selectors defined can take a little bit of time. Tip - try using a Chrome plugin such as Copy CSS Selector.

Starting a Crawl

Now that your crawl config is defined and tested, you can start initiating crawls. A crawl is when Bannerbear scans a url you give it, parses the data according to your config, and then generates an image. For you it is a single step; behind the scenes, Bannerbear goes through several.

Crawls take time to complete from start to finish. A single crawl resulting in a final image will take anywhere from 3 to 10 seconds to complete, depending on the webpage you are crawling.

In your crawler list, choose Start a Crawl on the Crawler you want to use:

The crawl will then begin and the page will automatically update when the final image is rendered:

Voila, a fresh image auto generated by Bannerbear and your crawler!

Starting a Crawl via API

Using the Start a Crawl link in Bannerbear is really just there for more testing or for quick ad-hoc crawls. The best way to use your new crawler is via the API!

Every crawler has an ID:

With this ID you can call a new API endpoint /crawls to initiate a new crawl.

// POST https://api.bannerbear.com/v2/crawls
{
  "crawler": "your_crawler_id",
  "url": "https://heated.medium.com/eating-red-beans-and-rice-is-a-version-of-self-care-54516505adbd"
}
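Here is roughly what that request could look like from Python, using only the standard library. Treat it as a sketch: I am assuming the API key is sent as a Bearer token in the Authorization header, and build_crawl_request, start_crawl and the YOUR_API_KEY placeholder are names I picked for the example. Check the API docs for the exact authentication details.

```python
import json
import urllib.request

API_URL = "https://api.bannerbear.com/v2/crawls"


def build_crawl_request(api_key, crawler_id, page_url):
    """Build (but do not send) the POST request that starts a crawl."""
    payload = json.dumps({"crawler": crawler_id, "url": page_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            # Assumption: Bearer-token authentication with your project API key.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def start_crawl(api_key, crawler_id, page_url):
    """Send the request and return the decoded crawl object."""
    request = build_crawl_request(api_key, crawler_id, page_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Building the request has no side effects, so it is safe to inspect first.
request = build_crawl_request(
    "YOUR_API_KEY",
    "your_crawler_id",
    "https://heated.medium.com/eating-red-beans-and-rice-is-a-version-of-self-care-54516505adbd",
)
print(request.get_method(), request.full_url)
```

Calling start_crawl with a real API key and crawler ID sends the request and returns the new crawl as a Python dict.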

The /crawls endpoint works a lot like the /images endpoint. Immediately after making the POST request a new crawl will be created and returned with a "pending" status:

{
  "url": "https://heated.medium.com/eating-red-beans-and-rice-is-a-version-of-self-care-54516505adbd",
  "created_at": "2020-06-08T07:32:23.902Z",
  "self": "https://api.bannerbear.com/v2/crawls/W7ErAeVjZAb6G1MpNQ",
  "uid": "W7ErAeVjZAb6G1MpNQ",
  "status": "pending",
  "image": null,
  "crawler": "your_crawler_id"
}

You can then GET the url in "self" to poll for the completed image which will be available in the "image" child object of the crawl object. As I mentioned above, crawls can take up to 10 seconds as Bannerbear is loading and parsing your requested website in addition to then generating an image based on your template.
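The polling step can be sketched in a few lines of Python. This version keeps fetching the "self" url until the "image" field is no longer null, then returns the crawl object. The function names, the 2 second interval and the 15 attempt limit are my own choices for the example, and the Bearer-token header in fetch_json is an assumption about authentication.

```python
import json
import time
import urllib.request


def fetch_json(url, api_key):
    """GET a url and decode the JSON body (assumes Bearer-token auth)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_image(self_url, fetch, interval=2.0, max_attempts=15, sleep=time.sleep):
    """Poll the crawl's "self" url until its "image" field is populated.

    `fetch` is any callable that takes a url and returns the crawl as a dict,
    which keeps the loop easy to try out without touching the network.
    """
    for attempt in range(max_attempts):
        crawl = fetch(self_url)
        if crawl.get("image") is not None:
            return crawl
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(
        f"crawl at {self_url} still has no image after {max_attempts} attempts"
    )
```

In practice you would call it with the "self" value from the POST response, for example wait_for_image(crawl["self"], lambda url: fetch_json(url, "YOUR_API_KEY")). If you would rather not poll at all, webhooks are the alternative.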

Crawls also support webhooks!

More info in the Bannerbear API Docs

Author
Jon Yongfook (@yongfook)

Jon is the founder of Bannerbear. He has worked as a designer and programmer for 20 years and is fascinated by the role of technology in design automation and design efficiency. Jon is a digital nomad who can be found riding a motorcycle around Asia, living out of Airbnbs and working from coworking spaces.
