#Crawler issues - Cannot crawl static document
@lapis charm it could be that the website is displayed differently to the crawler, e.g. an "are you a robot?" wall
That is happening with almost every crawl right now, sometimes with the same doc that worked seconds before
I reported that somewhere already
@here I've debugged the issue further on us-east-1, and it seems the website does an invalid redirect on HEAD requests
https://sentry.internal.jrbit.de/share/issue/0a942882e1d44021a859bfd5f5e5b924/
protocol mismatch AssertionError /var/www/crawler/functions/async.crawl.js IncomingMessage. False IncomingMessage.(functions:async.crawl)
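For readers following along, here is a minimal sketch of how a redirect that switches protocol could be spotted from a HEAD response. It is not the crawler's actual code: `classifyRedirect` and the result shapes are hypothetical names, and it only assumes that "protocol mismatch" refers to a redirect target whose scheme differs from the one requested.

```javascript
// Hypothetical helper, not taken from async.crawl.js.
// Classifies the answer to a HEAD request based on status code and Location header.
function classifyRedirect(requestUrl, statusCode, locationHeader) {
  const source = new URL(requestUrl);

  // Anything outside 3xx is not a redirect at all.
  if (statusCode < 300 || statusCode >= 400) {
    return { kind: "no-redirect" };
  }

  // A 3xx answer without a Location header cannot be followed.
  if (!locationHeader) {
    return { kind: "invalid-redirect", reason: "missing Location header" };
  }

  let target;
  try {
    // Relative Location values are resolved against the requested URL.
    target = new URL(locationHeader, requestUrl);
  } catch {
    return { kind: "invalid-redirect", reason: "unparsable Location header" };
  }

  // A change of scheme (e.g. https: -> http:) is reported separately.
  if (target.protocol !== source.protocol) {
    return {
      kind: "protocol-change",
      from: source.protocol,
      to: target.protocol,
      target: target.href,
    };
  }

  return { kind: "redirect", target: target.href };
}
```

Calling `classifyRedirect("https://example.org/doc.pdf", 301, "http://example.org/doc.pdf")` would report a `protocol-change`; the URLs are placeholders, not the affected site.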
Sadly, that's not on us, as the initial HEAD requests are required
Thanks for clarifying, I’ll send an email regarding the request
Would it be possible to create a new error message specifically for this? Then we curators would know what the issue is instead of relying on the devs to tell us.
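A rough sketch of what such a dedicated message could look like, assuming the internal error text contains "protocol mismatch" as in the Sentry entry above. `CURATOR_MESSAGES` and `describeCrawlError` are hypothetical names, and the wording of the message is only a suggestion.

```javascript
// Hypothetical mapping from internal error text to curator-facing text.
// Only the "protocol mismatch" pattern comes from the reported error.
const CURATOR_MESSAGES = [
  {
    pattern: /protocol mismatch/i,
    message:
      "The website answered the crawler's HEAD request with an invalid redirect " +
      "(protocol mismatch). This has to be fixed on the website's side.",
  },
];

// Returns a readable message for known errors and a generic one otherwise.
function describeCrawlError(err) {
  const text = String(err && err.message ? err.message : err);
  const hit = CURATOR_MESSAGES.find((entry) => entry.pattern.test(text));
  return hit ? hit.message : `Unexpected crawler error: ${text}`;
}
```

New entries could be appended to the list as other recurring failures are identified, so curators would not need a developer to interpret each one.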
i'll pass it onto the phoenix team 👍🏻
Furthermore, I can't seem to find where the issue is. I see no request header in the HEAD request when running it manually.
You state that it does an invalid redirect, and the error you sent shows that the protocols mismatch, but for me the HEAD request also seems to check out
Only when doing an http:// request do we get a similar issue
@craggy harness sorry for pinging again. But could you repeat the request but instead use https://?
The document is in https, the site is still broken with HEAD requests
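The http:// versus https:// comparison discussed above could be reproduced with something like the following sketch. It assumes Node 18+ (for the global `fetch`) and that `redirect: "manual"` exposes the 3xx status and `Location` header, which is how Node's built-in fetch behaves; `probeHead` and `compareSchemes` are hypothetical names, and this is not how the crawler itself issues requests.

```javascript
// Hypothetical helper: send a HEAD request without following redirects
// and report the status and Location header the server returned.
// fetchImpl defaults to the global fetch (Node 18+); it can be swapped for testing.
async function probeHead(url, fetchImpl = fetch) {
  const res = await fetchImpl(url, { method: "HEAD", redirect: "manual" });
  return {
    url,
    status: res.status,
    location: res.headers.get("location"),
  };
}

// Probe the same host and path over both schemes, http first, then https.
async function compareSchemes(hostAndPath, fetchImpl = fetch) {
  const results = [];
  for (const scheme of ["http", "https"]) {
    results.push(await probeHead(`${scheme}://${hostAndPath}`, fetchImpl));
  }
  return results;
}
```

Running `compareSchemes("example.org/doc.pdf")` (placeholder host) and comparing the two `status` and `location` values would show whether only one scheme triggers a redirect.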