#Create multiple threads in a loop

17 messages · Page 1 of 1 (latest)

dusky elm
#

Hello! I'm making a web scraper driven by a JSON config file. The config file can contain multiple objects, each representing a website, with multiple URLs if needed. For each object in the config file, I want to create a thread that downloads the HTML of that website, with all the threads running simultaneously. Let me be more precise:

Here's my config file:

{
  "websites": [
    {
      "id": "example",
      "name": "example of a website object",
      "urls": [
        "https://example.com",
        "https://example2.com"
      ]
    },
    {
      "id": "example2",
      "name": "example of a website object2",
      "urls": [
        "https://example.com",
        "https://example2.com"
      ]
    }
  ]
}

In my scraper, I get all the objects from the websites vec and iterate over them, and for each object I want to create a thread. But I don't know how; I'm not comfortable with loops in Rust... For now, I just have something like this:

for website in &websites {
    download_thread(website);
}

But this won't run the downloads simultaneously, only one after another, so it's not what I want.

eager pelican
#

You will probably also need to scope the threads to deal with the lifetime issue

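std::thread::scope(|scope| {
    let mut threads = Vec::new();
    for website in &websites {
        // `move` copies the `&Website` reference into the closure
        // instead of borrowing the loop variable itself
        threads.push(scope.spawn(move || download(website)));
    }

    // all threads are already running here; join just waits for them
    for thread in threads {
        thread.join().unwrap();
    }
});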
dusky elm
eager pelican
#

the lifetime issue is that thread::spawn requires any references captured by the closure to be 'static (valid for the rest of the program), since the type system has no guarantee that you will ever join the thread again.
thread::scope adds this guarantee: all threads that are still running are joined at the end of the scope, so they are allowed to borrow from outside it
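
for reference, a minimal self-contained sketch of the scoped version. `Website`, `download` and `download_all` are hypothetical stand-ins here (a real `download` would make an HTTP request; this one only builds a string so the example runs without a network), so take it as an illustration under those assumptions rather than the actual scraper:

```rust
use std::thread;

// Hypothetical stand-in for one entry of the "websites" array in the config.
struct Website {
    id: String,
    urls: Vec<String>,
}

// Placeholder for the real download: no network access, just a summary string.
fn download(website: &Website) -> String {
    format!("{}: {} url(s)", website.id, website.urls.len())
}

// Spawns one scoped thread per website and returns the results in input order.
fn download_all(websites: &[Website]) -> Vec<String> {
    thread::scope(|scope| {
        // `move` copies the `&Website` reference into each thread's closure.
        // Collecting first makes sure every thread is spawned before any join.
        let handles: Vec<_> = websites
            .iter()
            .map(|website| scope.spawn(move || download(website)))
            .collect();

        // Joining in spawn order keeps the results aligned with `websites`.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("download thread panicked"))
            .collect()
    })
}

fn main() {
    let websites = vec![
        Website {
            id: "example".to_string(),
            urls: vec!["https://example.com".to_string(), "https://example2.com".to_string()],
        },
        Website {
            id: "example2".to_string(),
            urls: vec!["https://example.com".to_string()],
        },
    ];
    for line in download_all(&websites) {
        println!("{line}");
    }
}
```

the threads only borrow the `Website` values, so nothing has to be cloned, and `websites` is still usable after `download_all` returns.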

dusky elm
#

Mmmh yeah ok, I don't get all of it yet but I will later. Thank you 🙂

dusky elm
eager pelican
#

ah yeah, it's just for general threading stuff and join handles

errant garden
eager pelican
#

ah yeah whoops

#

I'm sure they'll figure it out

errant garden
# dusky elm I already read it but I didn't remember anything about lifetime issue, but thank...

the last section, "Using move Closures with Threads", briefly shows the lifetime issue. this code doesn't compile, because thread::spawn doesn't allow borrowing
```rust
use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(|| {
        println!("Here's a vector: {v:?}");
    });

    handle.join().unwrap();
}
```
the book recommends making the closure on lines 6-8 a move closure, so that there's no borrowing. scoped threads are an alternative to thread::spawn that does allow borrowing, so the move isn't necessary.
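
a small sketch of the two fixes side by side, for comparison. the function names `with_move` and `with_scope` are made up for this example, and summing the vector just stands in for real work:

```rust
use std::thread;

// Fix 1 (the book's): `move` transfers ownership of `v` into the thread,
// so the closure holds no borrow and satisfies the `'static` bound.
fn with_move(v: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || v.iter().sum::<i32>());
    handle.join().unwrap()
}

// Fix 2: a scoped thread may borrow `v`, because the scope joins the thread
// before `thread::scope` returns, while the borrowed data is still alive.
fn with_scope(v: &[i32]) -> i32 {
    thread::scope(|s| s.spawn(|| v.iter().sum::<i32>()).join().unwrap())
}

fn main() {
    let v = vec![1, 2, 3];
    println!("scoped sum: {}", with_scope(&v)); // `v` is only borrowed here
    println!("moved sum: {}", with_move(v)); // `v` is moved and unusable afterwards
}
```

the trade-off: the move version gives up the vector to the thread, the scoped version leaves it with the caller.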

dusky elm
dusky elm
eager pelican
#

you use the move closure specifically to not pass a reference
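
applied to a loop like the one in the original question, that could look roughly like this. it's only a sketch: `fetch` and `fetch_all` are hypothetical names, `fetch` is a placeholder for the real download, and each thread gets its own cloned `String` so nothing is borrowed:

```rust
use std::thread;

// Hypothetical placeholder for the real download; returns the URL length
// so the example has something to compute without touching the network.
fn fetch(url: &str) -> usize {
    url.len()
}

// Spawns one unscoped thread per URL. Each closure owns its own `String`,
// so it borrows nothing from the caller and meets the `'static` bound.
fn fetch_all(urls: &[String]) -> Vec<usize> {
    let mut handles = Vec::new();
    for url in urls {
        let owned = url.clone(); // owned copy that is moved into the thread
        handles.push(thread::spawn(move || fetch(&owned)));
    }
    // All threads are already running; joining only waits for the results.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let urls = vec![
        "https://example.com".to_string(),
        "https://example2.com".to_string(),
    ];
    println!("{:?}", fetch_all(&urls));
}
```

the cost is one clone per thread; scoped threads avoid that by borrowing instead.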