#Need help with shadow-root

1 messages · Page 1 of 1 (latest)

vast cliff
#

I was making an actor for my business, which extracts reviews from different platforms, but i was having issues with a website- it has every data inside shadow-root so no results are coming because of that, i couldn't find a solution over internet so i came here.
any help would be appreciated!

#

this is the link of the page i need to scrape reviews from.

midnight magnet
vast cliff
midnight magnet
#

Here is how you can get all of them:

let reviews = $('#main > page-container').shadowRoot.querySelector('x-lazy').shadowRoot.querySelector('listing-reviews').shadowRoot.querySelectorAll('ace-review')

reviews.forEach((item => console.log(item.shadowRoot.querySelector('.review-item'))));
vast cliff
# midnight magnet Here is how you can get all of them: ``` let reviews = $('#main > page-containe...

async requestHandler({ request, page, enqueueLinks }) { console.log(Scraping ${request.url}...`);
console.log('New page created')

    // let pageData = await page.evaluate(
    //     () => document.querySelector("*").outerHTML
    // )

    
    let pageData = await page.evaluate(
        () => document.querySelector("page-container").shadowRoot.querySelector("analytics-handler>div>x-lazy").shadowRoot.querySelector("#reviews-panel").shadowRoot.querySelectorAll("ace-review").shadowRoot.innerHTML
    );


    


    const $ = cheerio.load(pageData);
    const data = [];

    $('.review-item').each((i, el) => {
        let reviewDate = $(el).find("ace-link[data-testid='review-date-link']").text();
        let reviewAuthor = $(el).find("a>span.bolded").text();
        let reviewTitle = $(el).find("p.review-content.bolded").text();
        let reviewDesc = $(el).find("p.review-content").text();
        let overAllRatings = $(el).find("div[slot='label']").text().split(".")[0];
        data.push({
            author: reviewAuthor,
            date: reviewDate,
            sourceCollector: 'appexchange.salesforce.com',
            sourceURL: request.url,
            title: reviewTitle,
            description: reviewDesc,
            ratings: overAllRatings
        });
    });


    await Actor.pushData(data);
   },`
#

how do i include that code into this request handler

vast cliff
midnight magnet
#

querySelectorAll("ace-review") returns an array of nodes, you need to loop over each one to access the shadowRoot.innerHTML

#

Also cheerio.load doesn't work on arrays and cheerio.load is async, so you need to use await => const $ = await cheerio.load(pageData)