Browser automation with Puppeteer
Puppeteer is a Node.js library for automating browser tasks by controlling Chrome or Chromium, which it runs headless by default. Here are some of its features:
Turn off headless mode
    const browser = await puppeteer.launch({
      headless: false,
      // ...
    });

Resize the viewport to the window size
    const browser = await puppeteer.launch({
      // ...
      defaultViewport: null,
    });

Emulate the screen media type, so the page renders as it is shown to the user, via the emulateMediaType method
    await page.emulateMediaType('screen');

Save the page as a PDF file with a specified path, format, scale factor, and page range
    await page.pdf({
      path: 'path.pdf',
      format: 'A3',
      scale: 1,
      pageRanges: '1-2',
      printBackground: true,
    });

Use an existing user's credentials to skip logging in to some websites. The user data directory is the parent of the Profile Path value from the chrome://version page.
    const browser = await puppeteer.launch({
      userDataDir: 'C:\\Users\\<USERNAME>\\AppData\\Local\\Google\\Chrome\\User Data',
      args: [],
    });

Use a Chrome instance instead of Chromium by utilizing the Executable Path from the chrome://version URL. Close the Chrome browser before running the script.
    const browser = await puppeteer.launch({
      executablePath: puppeteer.executablePath('chrome'),
      // ...
    });

Get a value based on evaluation in the browser page
    const shouldPaginate = await page.evaluate((param1, param2) => {
      // ...
    }, param1, param2);

Get HTML content from a specific element
    const html = await page.evaluate(
      () => document.querySelector('.field--text').outerHTML,
    );

Wait for a specific selector to be loaded. You can also provide a timeout in milliseconds
    await page.waitForSelector('.success', { timeout: 5000 });

Manipulate a specific element and click on some of its elements
    await page.$eval('#header', async (headerElement) => {
      // ...
      headerElement.querySelectorAll('svg').item(13).parentNode.click();
    });

Extend the execution time of the $eval method
    const browser = await puppeteer.launch({
      // ...
      protocolTimeout: 0,
    });

Manipulate multiple elements
    await page.$$eval('.some-class', async (elements) => {
      // ...
    });

Wait for navigation (e.g., form submission) to finish
    await page.waitForNavigation({ waitUntil: 'networkidle0', timeout: 0 });

Trigger a hover event on an element
    await page.$eval('#header', async (headerElement) => {
      const hoverEvent = new MouseEvent('mouseover', {
        view: window,
        bubbles: true,
        cancelable: true,
      });
      headerElement.dispatchEvent(hoverEvent);
    });

Expose a function in the browser and use it in $eval and $$eval callbacks (e.g., simulate typing using the window.type function)
    await page.exposeFunction('type', async (selector, text, options) => {
      await page.type(selector, text, options);
    });
    await page.$$eval('.some-class', async (elements) => {
      // ...
      // Exposed functions return a promise, so wait for the typing to finish.
      await window.type(selector, text, { delay: 0 });
    });

Press the Enter key after typing the input field value
    await page.type(selector, `${text}${String.fromCharCode(13)}`, options);

Remove the value from the input field before typing the new one
    await page.click(selector, { clickCount: 3 });
    await page.type(selector, text, options);

Expose a variable in the browser by passing it as the third argument to the $eval and $$eval methods, and use it in their callbacks
    await page.$eval(
      '#element',
      async (element, customVariable) => {
        // ...
      },
      customVariable,
    );

Mock the response for a specific request
    await page.setRequestInterception(true);
    page.on('request', async function (request) {
      const url = request.url();
      if (url !== REDIRECTION_URL) {
        return request.continue();
      }
      await request.respond({
        contentType: 'text/html',
        status: 304,
        body: '<body></body>',
      });
    });

Intercept page redirections (via an interceptor) and open them in new tabs rather than following them in the same tab
    await page.setRequestInterception(true);
    page.on('request', async function (request) {
      const url = request.url();
      if (url !== REDIRECTION_URL) {
        return request.continue();
      }
      await request.respond({
        contentType: 'text/html',
        status: 304,
        body: '<body></body>',
      });
      const newPage = await browser.newPage();
      await newPage.goto(url, { waitUntil: 'domcontentloaded', timeout: 0 });
      // ...
      await newPage.close();
    });

Intercept the page response
    page.on('response', async (response) => {
      if (response.url() === RESPONSE_URL) {
        if (response.status() === 200) {
          // ...
        }
        // ...
      }
    });
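
As a rough sketch of the request-mocking item above, the interception logic can be pulled into a small handler factory so it can be tried out without launching a browser. The names createMockHandler and fakeRequest and the example.com URL are my own placeholders, not Puppeteer APIs. The only calls assumed on the request object are url(), continue(), and respond(), the same ones used in the snippets above.

```javascript
// Hypothetical helper (not part of Puppeteer): builds a handler for
// page.on('request', handler) that mocks one URL and lets the rest through.
function createMockHandler(targetUrl, mockResponse) {
  return function handleRequest(request) {
    if (request.url() !== targetUrl) {
      // Not the request we want to mock: forward it unchanged.
      return request.continue();
    }
    // Answer the matching request ourselves instead of hitting the network.
    return request.respond(mockResponse);
  };
}

// Assumed placeholder URL, standing in for REDIRECTION_URL in the list above.
const MOCKED_URL = 'https://example.com/redirect';

const mockHandler = createMockHandler(MOCKED_URL, {
  contentType: 'text/html',
  status: 304,
  body: '<body></body>',
});

// Minimal stand-in for a Puppeteer request object, used only to exercise
// the handler here; it records which method the handler called.
function fakeRequest(url) {
  const calls = [];
  return {
    calls,
    url: () => url,
    continue: async () => {
      calls.push(['continue']);
    },
    respond: async (response) => {
      calls.push(['respond', response]);
    },
  };
}

const passedThrough = fakeRequest('https://example.com/other');
const mocked = fakeRequest(MOCKED_URL);
mockHandler(passedThrough);
mockHandler(mocked);

console.log(passedThrough.calls[0][0]); // continue
console.log(mocked.calls[0][0]); // respond
```

With a real page, the same handler would be registered after await page.setRequestInterception(true) via page.on('request', mockHandler). Keeping the decision logic in a plain function makes it easy to check which requests get mocked before wiring it into a script.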
Boilerplate
Here is the link to the boilerplate I use for development.