#How to make a request within a handler

twilit palm
#

I'm using Crawlee with PlaywrightCrawler and PlaywrightRouter. In some scenarios, inside one of my link handlers, I want to gather data from the page and use it to make multiple API calls within that same handler. I then use the data from those API calls to call addRequests for additional pages.

Obviously I could reach for a non-Crawlee HTTP client like Axios, but I was wondering whether there is a suggested way to make inline API requests.

Thanks for any help in advance

twilit palm
#

I found that the handler provides a sendRequest parameter in its callback. It seems like this is what I was looking for, but I'm open to other thoughts.

sturdy finch
#

Hi, sendRequest is the way to go. It also uses your proxy settings out of the box, which you would have to set manually with axios or fetch.

spare tapir
#

Perfect, thanks!

warm light
#

@twilit palm you want to gather data from some other page while in PlaywrightCrawler.requestHandler - right?

I am experimenting with page.goto for this. Well, it works, but I can't say much about side effects, disadvantages, etc.

Would you show some example code with sendRequest, please?

rose ether
#

sendRequest uses the got HTTP library, so it is much faster than page.goto, which has to load the page in the browser.
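
Since example code was requested above, here is a minimal sketch of calling a JSON API through sendRequest and collecting URLs to enqueue. The endpoint `https://api.example.com/items/...`, the `detailUrl` field, the `LIST`/`DETAIL` labels, and the helper name `collectDetailUrls` are all hypothetical; the local `SendRequestLike` type is only a stand-in so the helper can be tried without Crawlee, and it assumes sendRequest accepts got-style options and resolves to a response with a `body`.

```typescript
// Stand-in for the `sendRequest` function Crawlee passes to handlers.
// Assumption: it accepts got-style options and resolves to a response
// object exposing the parsed payload as `body`.
type SendRequestLike = (options: {
    url: string;
    responseType: 'json';
}) => Promise<{ body: unknown }>;

// Hypothetical shape of the API response; adjust to the real API.
interface ItemResponse {
    detailUrl: string;
}

// Calls a (hypothetical) JSON API once per ID, one after another,
// and collects the URLs that should be enqueued afterwards.
async function collectDetailUrls(
    sendRequest: SendRequestLike,
    ids: string[],
): Promise<string[]> {
    const urls: string[] = [];
    for (const id of ids) {
        const { body } = await sendRequest({
            url: `https://api.example.com/items/${encodeURIComponent(id)}`, // hypothetical endpoint
            responseType: 'json',
        });
        // Skip responses that do not carry the expected field.
        const detailUrl = (body as Partial<ItemResponse> | null)?.detailUrl;
        if (typeof detailUrl === 'string') urls.push(detailUrl);
    }
    return urls;
}

// Inside a PlaywrightRouter handler this might be wired up roughly as
// follows (untested sketch; selector and labels are made up):
//
// router.addHandler('LIST', async ({ page, sendRequest, addRequests }) => {
//     const ids = await page.locator('[data-id]').evaluateAll(
//         (els) => els.map((el) => el.getAttribute('data-id') ?? ''),
//     );
//     const urls = await collectDetailUrls(sendRequest, ids);
//     await addRequests(urls.map((url) => ({ url, label: 'DETAIL' })));
// });
```

The calls here run sequentially to keep the sketch simple; whether to parallelize them depends on how much load the target API tolerates.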

split walrus
#

is this a good way of doing it?

rose ether
#

Hmm, I'm not sure where the type is to be honest

split walrus
# rose ether Hmm, I'm not sure where the type is to be honest

Yeah, it was actually under PlaywrightCrawlingContext. I just used sendRequest: PlaywrightCrawlingContext['sendRequest'] and TS was happy with it. Just a suggestion, but it might be good to add a note to the TypeScript section of the docs that the Locator and Page types can be imported from playwright itself, since Crawlee is agnostic of the Playwright version; I only saw that in the package.json today. I think it's probably the same for Puppeteer too.

#

And just one last question: will sendRequest automatically pick up the browser context, session key, etc. if I call it in a Playwright crawler? The docs define sendRequest for CheerioCrawler. I need it to get the link after a redirect, and I am currently using page.goto for that.
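
For the redirect part of the question, a rough sketch follows. It assumes the response returned by sendRequest behaves like a got response, where `url` holds the final URL after HTTP redirects. The helper name `resolveFinalUrl` and the local `RedirectSendRequest` type are hypothetical stand-ins. This only covers server-side (HTTP) redirects; a redirect performed by JavaScript in the page would still need the browser. It does not settle whether the browser context is shared with sendRequest.

```typescript
// Stand-in for the response returned by `sendRequest`.
// Assumption: as with got, `url` is the final URL after redirects.
type RedirectAwareResponse = { url: string };

// Stand-in for the `sendRequest` function from the crawling context.
type RedirectSendRequest = (options: { url: string }) => Promise<RedirectAwareResponse>;

// Requests `url` over plain HTTP and returns where the request ended up.
async function resolveFinalUrl(
    sendRequest: RedirectSendRequest,
    url: string,
): Promise<string> {
    const response = await sendRequest({ url });
    return response.url;
}
```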