#How to handle variable selectors on pages to be scraped?

naive smelt
#

Hi, I am trying to scrape pages with the Apify Playwright crawler using selectors. The issue is that not all pages have the same selectors: some pages have 2 and some have 3. How do I handle that? The code I am using is:

    const investmentRange = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[3]').innerHTML();
    const sweetSpot = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[4]').innerHTML();
    const investmentsOnRecord = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[5]').innerHTML();
    const currentFundSize = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[6]').innerHTML();
undone pasture
#

You can use `page.locator` to grab all items under a single selector, then check the length of the list with `locator.count()`.

https://playwright.dev/docs/api/class-locator#locator-count

Which page are you scraping by the way?

Locators are the central piece of Playwright's auto-waiting and retry-ability. In a nutshell, locators represent a way to find element(s) on the page at any moment. Locator can be created with the page.locator(selector[, options]) method.
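
A minimal sketch of the `count()` approach, assuming `page` is the Playwright page object available in the crawler's request handler. The helper names `innerHtmlOrNull` and `readRows` are hypothetical, not part of Playwright or Apify; only `locator()`, `count()`, `first()`, `nth()` and `innerHTML()` are Playwright calls.

```javascript
// Hypothetical helper: inner HTML of the first match, or null when the
// selector matches nothing on this page.
async function innerHtmlOrNull(page, selector) {
    const locator = page.locator(selector);
    // count() resolves to 0 for a missing element instead of waiting for it
    if ((await locator.count()) === 0) return null;
    return locator.first().innerHTML();
}

// Hypothetical helper: inner HTML of every element the selector matches,
// however many there are on this page.
async function readRows(page, selector) {
    const rows = page.locator(selector);
    const count = await rows.count();
    const values = [];
    for (let i = 0; i < count; i++) {
        values.push(await rows.nth(i).innerHTML());
    }
    return values;
}
```

With the XPath from the question, `await readRows(page, '//*[@id="vc-profile"]/div/div[2]/div[2]/div')` should return one entry per row whether the page has 2 or 3 of them, and `await innerHtmlOrNull(page, '//*[@id="vc-profile"]/div/div[2]/div[2]/div[6]')` should give `null` on pages without a sixth row. Since `count()` does not wait for elements to appear, this assumes the profile block has already rendered.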

twin finch
#

You can also use a CSS selector list, which matches any of several selectors separated by commas, e.g. ``page.$(`${selector1},${selector2}`)``
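
A rough sketch of the comma-separated idea, again assuming `page` is a Playwright page. `firstMatchHtml` is a hypothetical helper name. Comma lists are a CSS feature, so this applies to CSS selectors; for XPath expressions like the ones in the question, the union operator `|` plays the same role.

```javascript
// Hypothetical helper: inner HTML of the first element matching ANY of the
// given CSS selectors, or null when none of them match.
async function firstMatchHtml(page, cssSelectors) {
    // 'a, b' is a CSS selector list: it matches elements matching a OR b
    const combined = cssSelectors.join(', ');
    const handle = await page.$(combined); // resolves to null when nothing matches
    return handle ? handle.innerHTML() : null;
}
```

A call might look like `await firstMatchHtml(page, ['.fund-size', '.current-fund-size'])`, where both class names are made up for illustration. The first match in document order is returned, so this fits best when the alternatives are different layouts of the same field.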

naive smelt
# undone pasture You can use `page.locator` to grab all items under a single selector, then check...

Thanks for the tip, let me try it. The actual URL I am trying to scrape is https://signal.nfx.com/investors/james-currier