What's the most memory-efficient way to filter an array of objects based on their 'id' property? | Bun

subtle sorrel Apr 1, 2024, 4:33 PM

#

Sorry for the noob question, but I'm only asking because currently I'm using .filter() to perform this action and my app is taking up 4+ gigs of memory to do this! I know you can also use .reduce() for this so I'll be testing that to see if it's any better. I figured I'd ask here to see if there's a bun-specific answer :) And for the record, this is done on an array of 150k very large objects.

toxic thicket Apr 1, 2024, 4:34 PM

#

Unless there is some sort of apparent pattern in your ids, I don't see how you can optimize it, because in the end you would have to hit every item at least once.

#

Also no it is not a noob question..

subtle sorrel Apr 1, 2024, 4:40 PM

#

I'll run some tests and post the results :p I'm curious to see if reduce will use less memory. Idk how I'll measure memory; I'll prob just use compute time as a loose measurement unless there's some standard way to do things. I'm wondering too if I could make a Set of the IDs and then filter the list based on that. I think it's copying the entire array into memory again and that's why it's failing.
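A minimal sketch of the Set idea floated above, not anything posted in the thread. The `Item` shape, the `uniqueById` name, and the sample data are assumptions for illustration. It makes a single pass and remembers only the ids it has seen, though `.filter()` still allocates a second array for the survivors.

```typescript
// Hypothetical record shape: the thread only says each record has an `id`.
interface Item {
  id?: string;
  [key: string]: unknown;
}

// Keep the first record seen for each id; drop records that have no id.
function uniqueById(items: Item[]): Item[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    if (!item.id || seen.has(item.id)) return false;
    seen.add(item.id);
    return true;
  });
}

// Made-up sample data: one duplicate id and one record without an id.
const sample: Item[] = [{ id: "a" }, { id: "b" }, { id: "a" }, {}];
const unique = uniqueById(sample);
console.log(unique.map((item) => item.id));
```

Each record costs one Set lookup instead of a rescan of the whole array, so the work grows linearly with the number of records.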

acoustic solstice Apr 1, 2024, 4:47 PM

#

Index the object like:

{
   47: { ... }
}

Then look up via myObject[id] in a forEach loop for every id you need to find. Should at the very least be an order of magnitude faster (due to O(1) lookup)
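A rough sketch of that suggestion, assuming the records start out in an array. The `Item` shape, the `name` field, and the sample ids are made up for illustration.

```typescript
// Hypothetical record shape; `name` is only here to have something to read back.
interface Item {
  id: string;
  name: string;
}

const records: Item[] = [
  { id: "47", name: "forty-seven" },
  { id: "12", name: "twelve" },
];

// Build the index once: one pass over the array.
const byId: Record<string, Item> = {};
for (const record of records) {
  byId[record.id] = record;
}

// Look up each wanted id directly instead of scanning the array for it.
const wantedIds = ["47", "99"];
const found: Item[] = [];
wantedIds.forEach((id) => {
  const hit = byId[id];
  if (hit) found.push(hit); // ids missing from the index are skipped
});

console.log(found.map((item) => item.name));
```

A `Map<string, Item>` would work the same way and avoids collisions with inherited keys such as `constructor`; the plain object is used here because that is what the message describes.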

subtle sorrel Apr 1, 2024, 4:58 PM

#

This is the mega brain play I didn't even think of this

#

Yeah I could just store my items in an object instead of a list

#

I implemented this and it's working so well

#

This is perfect thank you so much @acoustic solstice !

boreal geyser Apr 1, 2024, 6:00 PM

#

@subtle sorrel can you explain the solution? Are you creating a new object where id is the index, filtering that, then indexing the object to return data?

subtle sorrel Apr 1, 2024, 6:40 PM

#

Yeah so just to be clear, I'm writing a function that fetches all records returned from an API endpoint. This endpoint is paginated, so I wrote a loop that fetches each page and adds the page's records to a master array. Each record is an object with an id property that's a random string. This function's output should only contain unique objects, so I wrote a step that filters the array using .filter() to only get unique records.

Here was the old filter method:

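// Keeps the first record for each id and drops records without an id.
// findIndex rescans the array for every item, so this is O(n^2) comparisons.
items.filter((item, index, self) =>
  item.id ? index === self.findIndex((t) => t.id === item.id) : false
);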

The new solution was to use an object to store all the records instead of an array. So in the fetch loop, I'd add the records by doing something like if(item.id) items[item.id] = item for each record returned by that page. Then I'd repeat that for each page incrementally until there are no more records being returned like I did before. At the end, I can just return Object.values(items) to get an array of all those records and they're going to be unique based on the id by default. Basically it just skips the whole filter step.
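A sketch of what that could look like, not the actual code from the app. The `ApiRecord` shape, the `collectUnique` name, and the in-memory `pages` array (standing in for the real paginated API calls so the example runs without a network) are all assumptions.

```typescript
// Hypothetical record shape; the real records are described only as "very large objects".
interface ApiRecord {
  id?: string;
  payload: string;
}

// Stand-in for the paginated endpoint: each inner array is one page of results.
const pages: ApiRecord[][] = [
  [{ id: "a1", payload: "first" }, { id: "b2", payload: "second" }],
  [{ id: "a1", payload: "first again" }, { payload: "no id" }],
  [],
];

function collectUnique(allPages: ApiRecord[][]): ApiRecord[] {
  const items: Record<string, ApiRecord> = {};
  for (const page of allPages) {
    if (page.length === 0) break; // no more records being returned
    for (const item of page) {
      // Records without an id are skipped; a repeated id overwrites the earlier record.
      if (item.id) items[item.id] = item;
    }
  }
  return Object.values(items);
}

const result = collectUnique(pages);
console.log(result.length);
```

One difference from the old filter is worth knowing: the filter kept the first record seen for an id, while this keeps the last one. Checking `if (item.id && !(item.id in items))` before assigning would keep the first instead.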

#

(The bandwidth is going to be the same, I just included it to show that the jobs align. That's the API pages being fetched.)

#

Damn this is one of those solutions that makes you think back on old code and be like "huh I should really go in there and fix that like right now" lmao

#

But thinking about collecting the records in an object instead of an array is pretty smart I'm glad I was exposed to this way of thinking about problems like this

#

Thanks again @acoustic solstice

#

Also some points, the IDs are UUIDs not ints. Also I can't scrape these pages concurrently if anyone was going to say anything about that lol
