#Web Scraping After Input, Button Click, and Extracting Text


fringe inlet
#

Hello @carmine ginkgo ,

I believe you are using PuppeteerCrawler or PlaywrightCrawler.

context.jQuery only lets you use jQuery selectors to read data from the page (as the comment says); it does not modify the page, for example by setting an input value. (Or is this a JsDomCrawler? I don't have much experience with it...)

There is no JavaScript compiler that validates the code. Only the general syntax is checked, so code that references an undefined variable is still allowed to run. I suggest using TypeScript for better control over the code.

JavaScript in the browser differs from JavaScript in Node.js. For example, the document object is not accessible in Node.js.
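
To make the browser vs. Node.js difference concrete, here is a minimal sketch. The variable name hasDocument is only illustrative; the check itself relies on standard typeof behaviour, which never throws for an undeclared global.

```javascript
// In plain Node.js there is no global `document`, so this evaluates to false.
// Inside a browser page (for example inside page.evaluate) it would be true.
const hasDocument = typeof document !== 'undefined';

console.log(hasDocument ? 'browser-like context' : 'Node.js context, no document object');
```

This is why DOM code has to be sent into the page instead of being run directly in the crawler's handler.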

I am not sure this is valid syntax for jQuery selectors. Did you mean something like:
$('input[type="password"]') and $('input[type="text"]')

If you want to stick with this browser-like implementation, you need to wrap it in a page.evaluate() call, and await the result:

const token = await context.page.evaluate(() => {
    // This callback runs inside the browser page, so document is available here.
    document.querySelector('input[type="password"]').value = "abc";
    document.querySelector('input[type="text"]').value = "abc";

    // etc ...
    return document.querySelector('textarea').textContent;
});
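
As an alternative to setting values inside evaluate, here is a rough sketch of the full "type, click, read" flow using Puppeteer's page methods (waitForSelector, type, click, $eval). The helper name fillClickAndRead and all selectors are hypothetical, since the real page was not shared, so treat this as a starting point and not as the fix for your actor.

```javascript
// Hypothetical helper: fills an input, clicks a button, and reads a result field.
// All selectors are placeholders; replace them with the ones from the real page.
async function fillClickAndRead(page, opts) {
    const { inputSelector, value, buttonSelector, resultSelector } = opts;

    // Wait until the input actually exists in the rendered page.
    await page.waitForSelector(inputSelector);

    // page.type sends key events, so the site's own input handlers get a chance to run.
    await page.type(inputSelector, value);

    await page.click(buttonSelector);

    // Wait for the output element, then read it inside the browser context.
    await page.waitForSelector(resultSelector);
    return page.$eval(resultSelector, (el) => el.value || el.textContent);
}
```

Inside the request handler it could be called as `const token = await fillClickAndRead(context.page, { inputSelector: 'input[type="text"]', value: 'abc', buttonSelector: 'button', resultSelector: 'textarea' });`. If waitForSelector times out, the control is probably not in the page the crawler received.
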
carmine ginkgo
#

Hmm, still getting errors.

fringe inlet
#

Sorry, I cannot help you further without seeing the run or the errors you are getting.

tired crow
#

@carmine ginkgo take a snapshot. It looks like the target site is bot-protected, so when you open it in the crawler you are not getting the web form controls you expect.

tired crow
#

The header is usually not blocked by anti-bot scripts; the page content is blocked because that is what matters. If Puppeteer is blocked, the next best bet is Playwright. To check what is happening you need to save a snapshot, because what you see in your browser is not necessarily what the scraper gets at runtime.
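
A minimal sketch of such a snapshot, assuming a Puppeteer or Playwright page object (both provide page.content(), page.screenshot() and page.$()). The helper name saveSnapshot, the file names and the selector list are made up for illustration.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Hypothetical helper: stores what the crawler actually received and
// reports which of the expected form controls are missing from it.
async function saveSnapshot(page, dir, selectors) {
    await fs.mkdir(dir, { recursive: true });

    // page.content() returns the serialized HTML of the current page.
    const html = await page.content();
    await fs.writeFile(path.join(dir, 'snapshot.html'), html);

    // A screenshot shows challenge or block pages at a glance.
    await page.screenshot({ path: path.join(dir, 'snapshot.png'), fullPage: true });

    // page.$ resolves to null when nothing matches the selector.
    const missing = [];
    for (const selector of selectors) {
        if ((await page.$(selector)) === null) missing.push(selector);
    }
    return missing;
}
```

For example, `await saveSnapshot(context.page, './snapshots', ['input[type="password"]', 'textarea'])` returns the selectors that were not found. A non-empty result together with a snapshot.html that looks like a challenge page would point to bot protection and not to a bug in the form code.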