scrape job status | Firecrawl | Page 1

frozen pecan Jul 3, 2024, 12:00 PM

#

Hi!

I noticed that when scraping websites with a lot of pages (>1000), the scrape job (or at least its reported status) gets stuck in a state where I can no longer tell what is going on.

For example, right now I have a running job (limited to a maximum of 1000 scraped URLs), and if I fetch the status using the API, I get the following data:

status: active
current: 1000
total: 1000
data: 0 items in the list
partial_data: 50 items in the list

This state has been unchanged for hours. The items in partial_data are always the same, from index 951 to 1000.
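For anyone who wants to detect this state on the client side, here is a minimal sketch. It assumes only the fields shown in the status output above (status, current, total, data, partial_data). The names `progress_fingerprint` and `StallDetector` and the threshold value are hypothetical and not part of any Firecrawl SDK; fetching the status itself is left to the caller.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


def progress_fingerprint(status: Dict[str, Any]) -> Tuple:
    """Reduce a status payload to the fields that should change while a job progresses."""
    data = status.get("data") or []          # tolerate null / missing lists
    partial = status.get("partial_data") or []
    return (
        status.get("status"),
        status.get("current"),
        status.get("total"),
        len(data),
        len(partial),
    )


class StallDetector:
    """Flags a job as stalled when it stays 'active' without any visible change."""

    def __init__(self, stall_after_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.stall_after_seconds = stall_after_seconds
        self._clock = clock
        self._last_fingerprint: Optional[Tuple] = None
        self._last_change = clock()

    def observe(self, status: Dict[str, Any]) -> bool:
        """Record one polled status; return True if the job looks stalled."""
        fingerprint = progress_fingerprint(status)
        now = self._clock()
        if fingerprint != self._last_fingerprint:
            # Something changed, so restart the stall timer.
            self._last_fingerprint = fingerprint
            self._last_change = now
            return False
        unchanged_for = now - self._last_change
        return status.get("status") == "active" and unchanged_for >= self.stall_after_seconds
```

Call `observe()` with each polled status; once it returns True you can alert or give up instead of waiting for hours. Note that the fingerprint only compares counts, so it would not notice partial_data changing content while keeping the same length.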

And that's it. Nothing is coming to the webhook, and the job isn't listed in logs (https://www.firecrawl.dev/app/logs).

Is this kind of behaviour expected?
Should we have to wait for hours to get the complete data?

frozen pecan Jul 4, 2024, 8:38 AM

#

The job eventually failed; its status is now failed.
But that only happened after a very long time, and nothing was sent to our webhook or reported anywhere else.

ivory girder Jul 4, 2024, 3:30 PM

#

Hm, that's odd, @frozen pecan. Looking into it.

#

Can you DM me your email so we can analyze the logs and see what happened?

#

It shouldn't have had this behavior.

frozen pecan Jul 5, 2024, 7:38 AM

#

dm sent

covert rapids Jul 9, 2024, 8:19 PM

#

I am getting a very similar issue on several websites I am attempting to crawl. I don't even get current or total; the job is just stuck in active, although on the dashboard I can download the documents. JobID: 52204c2f-6360-4851-97a0-a353fd2f4569

The status request returns:
{
"success": true,
"status": "active",
"data": null,
"partial_data": []
}
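A response like this is easy to mishandle because data is null and the counters are absent. As a rough sketch, assuming only the field names seen in this thread (the helper name `describe_status` is hypothetical), the payload could be normalized before logging or comparing it:

```python
import json
from typing import Any, Dict


def describe_status(raw: str) -> Dict[str, Any]:
    """Summarize a raw status response, tolerating null lists and missing counters."""
    payload = json.loads(raw)
    # Counters that are absent or null in the response.
    missing = [key for key in ("current", "total") if payload.get(key) is None]
    data_items = len(payload.get("data") or [])
    partial_items = len(payload.get("partial_data") or [])
    return {
        "status": payload.get("status"),
        "missing_counters": missing,
        "data_items": data_items,
        "partial_items": partial_items,
        # True when the response gives no way to judge progress at all.
        "no_progress_info": bool(missing) and data_items == 0 and partial_items == 0,
    }
```

Logging this summary together with the job ID on every poll gives support something concrete to look at when the job never leaves active.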

For other jobs I get no activity log at all, so I have nothing to go on to find the reason for the failure.

dreamy agate Jul 9, 2024, 9:57 PM

#

Hey all! Wanted to give y'all an update. We've built a fix for this and we're currently testing it. This behavior is an edge case that should currently only happen occasionally. If all goes well, the fix will be live soon.

frozen pecan Jul 10, 2024, 2:48 PM

#

Hi @dreamy agate, did it go well?

dreamy agate Jul 10, 2024, 2:58 PM

#

Hey, it's looking promising but we still need to iterate and test. I'll keep you updated 🤞🏻

frozen pecan Jul 11, 2024, 9:51 AM

#

thanks 👍
