Željko Šević | Node.js Developer

Web scraping with cheerio

January 19, 2024

Web scraping means extracting data from websites. This post covers extracting data from a page's HTML elements with cheerio.

Create a scraper object with the load method by passing the HTML content as an argument.

Set the decodeEntities option to false to preserve encoded characters (like &amp;) in their original form.

const { load } = require('cheerio'); // in an ES module: import { load } from 'cheerio';
const $ = load('<div><!-- HTML content --></div>', { decodeEntities: false });

Find DOM elements using CSS-like selectors.

const items = $('.item');

Iterate through the found elements using the each method.

items.each((index, element) => {
  // ...
});

Please check the website's terms of service before scraping it. Some websites may have terms of service that prohibit such activity.

Runnable code for this post lives in the scraping-cheerio-demo folder. Get access via code demos.