#Usage of RefCell vs Just Cloning a String

1 messages · Page 1 of 1 (latest)

sudden junco
#

I had this code that needs to clone a String value to add it to a vector that is then returned at the end of the function:

let file = File::open(entry.path())?;
let mut reader = BufReader::new(file);
let line_buf = RefCell::new(String::new());

while reader.read_line(&mut line_buf)? > 0 {
    let line_buf_trim = line_buf.trim();

    if ["- [ ]", "- [x]"]
        .iter()
        .any(|s| line_buf.trim().to_lowercase().starts_with(*s))
        || org_regex.0.is_match(line_buf_trim)
        || org_regex.1.is_match(line_buf_trim)
    {
        if !file_tasks.contains_key(entry.path()) {
            file_tasks.insert(entry.path().to_path_buf(), Vec::new());
        }

        let Some(tasks) = file_tasks.get_mut(entry.path()) else {
            warn!(
                "Unable to add task to the list associated with {}",
                entry.path().display()
            );
            continue;
        };

        tasks.push(line_buf.clone());
    }

    line_buf.clear();
}

I wanted to try and avoid the potentially expensive String clone so I decided to try and refactor it using RefCell (and I did make an implementation of this) but would it even be worth it? Is there a big difference between a RefCell clone and a regular clone?

stark timber
#

RefCell doesn't have anything to do with how expensive a clone is

#

in a case like this, you need to allocate as much memory as needed to hold all the lines/tasks, and it happens that in this particular way of writing the program, clone() is the function that is doing the allocation and copying that you must do

#

there isn't a cheaper way to do it.

sudden junco
#

I see, ty!!

stark timber
#

to make this code faster, you should move file_tasks.get_mut(entry.path()) out of the loop so that you're not doing the exact same lookup many times. you can still check it's Some inside the loop if you want to delay the warning

#

also, avoid .to_lowercase() because that makes a copy of the entire string, and instead check for both "- [x]" and "- [X]" prefixes

#

wait I misread, you're inserting into file_tasks

#

so the warning is impossible to reach

#

it's a little tricker to make that optimal, but in any case the performance of this code is likely going to be dominated by the file IO and other fixed factors so it doesn't matter a whole lot