#proxy not working today?

neon pecan
Yesterday I was getting around my JavaScript skill issues by building a simple HTTP crawler in Elixir using the HTTPoison library.

It worked. Today I ran the same code again and I'm getting an error. I tried different proxy groups with no luck.

Any ideas on how I could troubleshoot this? I checked the console for issues with my proxy token and found none, and I have enough credit.

wise skiff
Well, can you share the code base?

neon pecan
Thanks for replying. The problem seems to have been that I was using Task.async_stream, which appears to hammer the proxy endpoint with many requests at once.

Here's a simplified version of what I was using before:

def crawler do
  Apps.list_apps()
  |> Task.async_stream(&update_app/1)
end

def update_app(app) do
  url = app.url

  case HTTPoison.get(url, [],
         timeout: 10_000,
         recv_timeout: 10_000,
         follow_redirect: true,
         proxy: {"proxy.apify.com", 8000},
         proxy_auth: {"groups-RESIDENTIAL", @apify_proxy}
       ) do
    # ....
    # handles a bunch of errors
  end
end

Here's the error I was getting:

[error] proxy error: "HTTP/1.1 590 UPSTREAM400\r\nConnection: close\r\nDate: Thu, 08 Aug 2024 14:08:48 GMT\r\nContent-Length: 0\r\n\r\n"
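For anyone hitting the same thing, here is a minimal sketch of one way to keep Task.async_stream from opening too many proxy connections at once, by passing an explicit `:max_concurrency`. The module name `ThrottledCrawler`, the limit of 5, and the 30-second task timeout are assumptions for illustration, not values from the original crawler or from Apify's documentation. By default the limit is `System.schedulers_online/0`, and the stream is lazy, so it has to be consumed before any request is sent.

```elixir
defmodule ThrottledCrawler do
  # Hypothetical module for illustration; not part of the original crawler.

  # Assumed limit; tune it to whatever the proxy tolerates.
  @max_concurrency 5

  # Runs `fetch_fun` on each item with at most @max_concurrency tasks in flight.
  # Returns a list of {:ok, result} or {:exit, reason} tuples in input order.
  def run(items, fetch_fun) do
    items
    |> Task.async_stream(fetch_fun,
      max_concurrency: @max_concurrency,
      # Per-task timeout (assumed value); kill slow tasks instead of crashing the caller.
      timeout: 30_000,
      on_timeout: :kill_task
    )
    # The stream is lazy, so it must be consumed for the tasks to run.
    |> Enum.to_list()
  end
end
```

With the code above, the call would look something like `ThrottledCrawler.run(Apps.list_apps(), &update_app/1)`. Whether this alone avoids the 590 response is untested here.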

I'm going to try rebuilding the crawler with Crawly, which seems to be a port of Crawlee to Elixir. It might have something to do with your reply to my other thread, #1270567482398081084.

GitHub - elixir-crawly/crawly: Crawly, a high-level web crawling & scraping framework for Elixir.