#userAgent in different crawlers
In different crawlers? Or in different requests for the same crawler?
Like Cheerio and Puppeteer. The APIs are not always the same.
In my case I need to differentiate between the Cheerio UA and the Puppeteer UA.
You should use preNavigationHooks for it:
https://crawlee.dev/api/puppeteer-crawler/interface/PuppeteerCrawlerOptions#preNavigationHooks
example for Cheerio:
preNavigationHooks: [
    (crawlingContext, requestAsBrowserOptions) => {
        requestAsBrowserOptions.headers = {
            'User-Agent': 'La Centrale/6.17.1 (iPhone; iOS 13.6; Scale/2.00)',
            'accept-language': 'en-US;q=1',
            Accept: 'application/json',
        };
    },
],
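Note that the hook above replaces the whole headers object. If you would rather keep any headers that are already set, here is a minimal sketch of a hook that merges instead. `makeHeaderHook` and `lowerCaseKeys` are hypothetical helper names (not part of Crawlee), and the commented-out wiring assumes the `crawlee` package and a `requestHandler` of your own:

```javascript
// Hypothetical helper: HTTP header names are case-insensitive, so lower-case
// the keys to avoid ending up with both 'User-Agent' and 'user-agent'.
function lowerCaseKeys(headers) {
    return Object.fromEntries(
        Object.entries(headers ?? {}).map(([name, value]) => [name.toLowerCase(), value]),
    );
}

// Hypothetical helper: builds a pre-navigation hook that merges extraHeaders
// into the headers already present on the second hook argument.
function makeHeaderHook(extraHeaders) {
    return (crawlingContext, requestOptions) => {
        requestOptions.headers = {
            ...lowerCaseKeys(requestOptions.headers),
            ...lowerCaseKeys(extraHeaders),
        };
    };
}

const cheerioHook = makeHeaderHook({
    'User-Agent': 'La Centrale/6.17.1 (iPhone; iOS 13.6; Scale/2.00)',
});

// Assumed wiring (needs `crawlee` installed):
// const crawler = new CheerioCrawler({ preNavigationHooks: [cheerioHook], requestHandler });
```

The same factory can produce a different hook per crawler instance, so each one can carry its own UA.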
For Puppeteer, you should use the page object.
You can try setExtraHTTPHeaders() (also inside preNavigationHooks):
https://pptr.dev/next/api/puppeteer.page.setextrahttpheaders
example:
await page.setExtraHTTPHeaders({
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36',
    'upgrade-insecure-requests': '1',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9,en;q=0.8'
});
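For context, here is a rough sketch of how that call could sit inside a pre-navigation hook. `puppeteerUaHook` and `PUPPETEER_UA` are made-up names for illustration, and the commented-out wiring assumes `crawlee` and `puppeteer` are installed:

```javascript
const PUPPETEER_UA =
    'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36';

// Pre-navigation hook: the crawling context exposes the Puppeteer `page`,
// so the headers can be set before the navigation starts.
async function puppeteerUaHook({ page }) {
    await page.setExtraHTTPHeaders({
        'user-agent': PUPPETEER_UA,
        'accept-language': 'en-US,en;q=0.9',
    });
}

// Assumed wiring (needs `crawlee` and `puppeteer` installed):
// const crawler = new PuppeteerCrawler({ preNavigationHooks: [puppeteerUaHook], requestHandler });
```

Extra HTTP headers only change what is sent over the network. If the site also reads navigator.userAgent in the browser, Puppeteer's page.setUserAgent() may be the better fit.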
Or just add it to the request object: { url, headers: { 'user-agent': '[UA-STRING]' } }
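If you go the request-object route, one way to keep the two UAs apart is to build the request list per crawler. This is only a sketch: `USER_AGENTS`, `buildRequests` and the example.com URLs are placeholders, and it is worth checking that the crawler you use actually applies `headers` from the request:

```javascript
// Placeholder UA strings, one per crawler type.
const USER_AGENTS = {
    cheerio: 'La Centrale/6.17.1 (iPhone; iOS 13.6; Scale/2.00)',
    puppeteer:
        'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36',
};

// Hypothetical helper: turns plain URLs into request objects that carry
// the user-agent chosen for the given crawler type.
function buildRequests(urls, crawlerType) {
    const userAgent = USER_AGENTS[crawlerType];
    if (!userAgent) {
        throw new Error(`Unknown crawler type: ${crawlerType}`);
    }
    return urls.map((url) => ({ url, headers: { 'user-agent': userAgent } }));
}

const cheerioRequests = buildRequests(['https://example.com/a', 'https://example.com/b'], 'cheerio');
const puppeteerRequests = buildRequests(['https://example.com/c'], 'puppeteer');

// Assumed usage (needs `crawlee` installed and both crawlers created elsewhere):
// await cheerioCrawler.run(cheerioRequests);
// await puppeteerCrawler.run(puppeteerRequests);
```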