Replaying network requests in Puppeteer

Oct 15, 2022
You can intercept network requests in Puppeteer scripts. It’s a handy way to reverse engineer a site or do something tricky; the harder part is replaying those requests. So I made a plugin for that.
If you’ve worked on Puppeteer headless browser scripts, you have probably used network interception to read the request headers or response bodies of the traffic a page generates. I’ve been writing Puppeteer scripts for reverse engineering and scraping, and realized that there is no built-in method for replaying a request. I needed one, so I made a plugin for it.

How to replay requests?

First, let me explain how it works. Chrome DevTools lets you copy a request, with its body and headers, as ready-to-run code through the “Copy as fetch” option in the Network tab. To replay the request, paste that code into the JavaScript console, modify its contents if needed, and run it.
Voilà, you replayed a request. It’s that easy.
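💡 But what about CORS policies? No worries: the code runs in the console of the page itself, so the request is sent from the page’s own origin, with its cookies, and fits the CORS policy. If you send the same request from the console of a tab on another origin, the browser may block it, assuming the server has a proper CORS policy. A curl command from your shell is not subject to CORS at all, but it lacks the page’s session context unless you copy the cookies and headers too, so the server may still reject it.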
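To make the manual flow concrete, here is a minimal sketch of what a copied request looks like once you start editing it. The URL, headers, and body are made-up placeholders, and `withJsonBody` and `replay` are hypothetical helper names, not something DevTools generates.

```javascript
// Shape of a request copied with "Copy as fetch": a URL plus a fetch init object.
// All values below are invented for illustration.
const copied = {
  url: "https://example.com/api/task.json?taskname=login",
  init: {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ username: "alice" }),
    credentials: "include",
  },
};

// Return a new copy with a patched JSON body; the original stays untouched.
function withJsonBody(request, patch) {
  const body = { ...JSON.parse(request.init.body), ...patch };
  return { ...request, init: { ...request.init, body: JSON.stringify(body) } };
}

// Replaying is just calling fetch with the (possibly modified) values.
// Call this from the console of the page that made the original request.
function replay(request) {
  return fetch(request.url, request.init);
}

const modified = withJsonBody(copied, { username: "bob" });
```

In the console you would then run `replay(modified)` and inspect the returned response.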

The methods

There are two methods:
  • .catchRequest on the page object
  • .replay on the value returned by .catchRequest
Check the plugin’s interface declarations for the exact signatures.
.catchRequest takes two arguments. The first is an options object whose pattern is a regex matched against request URLs, and the second is a callback that triggers the request you want to catch. If the request fires as the page loads, page.goto() must be called inside the trigger callback.
(async () => {
  // The request fires during page load, so navigation goes in the trigger callback.
  const request = await page.catchRequest(
    { pattern: /task\.json\?taskname=login/ },
    () => page.goto("https://twitter.com")
  );
  const response = await request.replay();
})();
If the request is triggered by clicking a button, call .click() inside the trigger callback, as here:
(async () => {
  await page.goto("https://twitter.com");

  // The request fires on click, so the click goes in the trigger callback.
  const request = await page.catchRequest(
    { pattern: /task\.json\?taskname=login/ },
    async () => {
      await page.waitForSelector("a[href='/login']");
      await page.click("a[href='/login']");
    }
  );

  const response = await request.replay();
})();
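To see why the trigger is passed as a callback, it helps to picture how such a method could be built. The sketch below is an assumption about the general approach, not the plugin’s actual source: `catchRequestSketch` is a hypothetical standalone function that uses Puppeteer’s `page.waitForRequest` and starts listening before the trigger runs, so a fast request is not missed.

```javascript
// Hypothetical stand-in for the plugin's catchRequest (not its real implementation).
// `page` is expected to be a Puppeteer Page.
async function catchRequestSketch(page, { pattern }, trigger) {
  // Register the listener first, then run the trigger, and wait for both.
  const [request] = await Promise.all([
    page.waitForRequest((req) => pattern.test(req.url())),
    trigger(),
  ]);

  // Keep only the plain data a later replay would need.
  return {
    url: request.url(),
    method: request.method(),
    headers: request.headers(),
    body: request.postData(),
  };
}
```

If the trigger ran before the listener was registered, a request sent immediately on navigation or click could slip past unnoticed.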
.replay() takes url, method, headers, and body as optional parameters, so you can modify the request by redefining any of them when replaying.
const response = await request.replay({
  url: "https://twitter.com/logout", // defining a new URL is possible
  method: "POST", // changing the request method is possible as well
  body: JSON.stringify({ test: true }),
  headers: { test: true },
});
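As a rough illustration of how a replay with overrides could be sent from a Puppeteer script, the sketch below runs fetch inside the page through `page.evaluate`, which mirrors pasting the code into the page’s console. `replayInPage` and the shape of `captured` are assumptions made for this example; the plugin may work differently.

```javascript
// Hypothetical helper, not the plugin's API.
// `captured` is assumed to be a plain object: { url, method, headers, body }.
async function replayInPage(page, captured, overrides = {}) {
  // Overrides win over the captured values; url is split off from the fetch init.
  const { url, ...init } = { ...captured, ...overrides };

  // page.evaluate serializes its arguments, so only plain values are passed in.
  return page.evaluate(
    async (targetUrl, fetchInit) => {
      // This function body runs in the browser, in the page's own origin.
      const res = await fetch(targetUrl, { ...fetchInit, credentials: "include" });
      return { status: res.status, body: await res.text() };
    },
    url,
    init
  );
}
```

Returning only the status and body text keeps the result serializable on its way back from the page to the script.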
If you want to build on the original values of the request instead of overwriting them, pass a callback that receives the original value and returns the new one.
const response = await request.replay({
  url: url => url.replace(/login/, "logout"),
  method: "POST", // changing the request method is possible as well
  body: body => body + ";test=true",
  headers: headers => ({ ...headers, test: true }), // parentheses are needed to return an object literal
});
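The value-or-callback convention above can be resolved with a small helper. This is a sketch of the idea only; `resolveOverride` and `applyOverrides` are hypothetical names, and the sample request is invented.

```javascript
// If the override is a function, call it with the original value;
// if it is undefined, keep the original; otherwise use it as-is.
function resolveOverride(original, override) {
  if (override === undefined) return original;
  return typeof override === "function" ? override(original) : override;
}

// Apply overrides field by field to a captured request.
function applyOverrides(original, overrides = {}) {
  const result = {};
  for (const key of ["url", "method", "headers", "body"]) {
    result[key] = resolveOverride(original[key], overrides[key]);
  }
  return result;
}

// Invented sample data for illustration.
const original = {
  url: "https://example.com/task.json?taskname=login",
  method: "GET",
  headers: { accept: "application/json" },
  body: undefined,
};

const next = applyOverrides(original, {
  url: (url) => url.replace(/login/, "logout"),
  method: "POST",
  headers: (headers) => ({ ...headers, test: "true" }),
});
```

Fields without an override, such as `body` here, pass through unchanged.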

One last note

You can check the Network tab in DevTools to see the replayed requests during development. Notice that their initiator value differs from that of the original requests.
Happy coding!

© Samet 2017 - 2024