Redis Pipeline Support | NestJS | Page 1

signal wagon Sep 26, 2023, 5:24 AM

#

Anyone know how to enable Redis Pipeline support for sending batches of data to Redis without waiting for Redis to respond?

#

https://redis.io/docs/manual/pipelining/

Redis

Redis pipelining

How to optimize round-trip times by batching Redis commands

rigid quail Sep 26, 2023, 5:27 AM

#

@signal wagon - Pipelining is for sending redis commands in batches, not sending data in batches. What do you need this for anyway? I've never seen a Nest dev require this.

signal wagon Sep 26, 2023, 5:28 AM

#

I have a scheduler/cron task that is fairly heavy and I am wanting to optimize the time it takes to store to Redis.

#

Cache data

#

That is regularly calculated and sent to Redis

rigid quail Sep 26, 2023, 5:29 AM

#

Is it just one command to store the data?

signal wagon Sep 26, 2023, 5:29 AM

#

Many.

rigid quail Sep 26, 2023, 5:30 AM

#

Hmm... then I'd imagine you'd need to work with the client itself. Right?

signal wagon Sep 26, 2023, 5:30 AM

#

Sure, if there's no direct support in the cache-manager and the cache-manager-redis-store

rigid quail Sep 26, 2023, 5:32 AM

#

Yeah, so "cache manager" and the redis store packages are for caching and not really for batch processing. You'd need to come up with something custom with the client/driver for sure.

signal wagon Sep 26, 2023, 5:32 AM

#

Well this is cache data

#

I'd just like to batch the cache stores in a single batch, rather than as separate connections

rigid quail Sep 26, 2023, 5:34 AM

#

What are you batch processing in a cron for your cache?

signal wagon Sep 26, 2023, 5:34 AM

#

Just storing keys.

#

Caching data.

#

I have a couple of thousand of individual keys that are set

#

i.e. await this.cacheService.set(cacheKey, data); in a loop where cacheKey is composed and data is dynamically set.

rigid quail Sep 26, 2023, 5:35 AM

#

Part of what process?

signal wagon Sep 26, 2023, 5:35 AM

#

i have a separate nest.js deployment just for scheduled tasks

#

scheduler/scheduler.service.ts, and they run at @Interval()'s

rigid quail Sep 26, 2023, 5:36 AM

#

Yes, but what are the cacheKeys used for at what point in the process?

signal wagon Sep 26, 2023, 5:37 AM

#

the keys are used to store data to each key in redis, and then they are looked up by other nest.js deployments

#

i.e. to avoid hitting the database for commonly accessed data (that changes relatively frequently)

rigid quail Sep 26, 2023, 5:37 AM

#

And at what point do you attach data to a key?

signal wagon Sep 26, 2023, 5:37 AM

#

in this scheduled job

rigid quail Sep 26, 2023, 5:38 AM

#

Ah, you changed the code above I see.

signal wagon Sep 26, 2023, 5:38 AM

#

Yeah

#

It was pseudocode

#

Sounds like cache-manager doesn't have pipelining support.

rigid quail Sep 26, 2023, 5:44 AM

#

So, I'd put forward that this isn't how cache should work anyway. My understanding is, you'd normally do the following:

1. Process starts.
2. Cache is checked for data.
3. No data, then data is pulled from the database.
4. Data is cached, with a TTL. The TTL should be long enough to lower the database load, but short enough to avoid stale data, IF you don't intend to have a mechanism to refresh the cache.
5. Then it is back to 1.
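
The steps above can be sketched roughly as follows. This is only an illustration: `TtlCache` and `getOrLoad` are hypothetical names, the cache is a synchronous in-memory stand-in so the snippet runs on its own (a real Redis client would be asynchronous), and the current time is passed in explicitly to keep the expiry logic easy to follow.

```typescript
// Minimal in-memory stand-in for a cache with per-key expiry.
// Hypothetical helper for illustration only; not part of cache-manager or any Redis client.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  get(key: string, now: number): V | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= now) {
      // Expired entries behave like misses.
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V, ttlMs: number, now: number): void {
    this.entries.set(key, { value, expiresAt: now + ttlMs });
  }
}

// Cache-aside read: check the cache, fall back to the loader on a miss,
// then store the loaded value with a TTL.
function getOrLoad<V>(
  cache: TtlCache<V>,
  key: string,
  ttlMs: number,
  now: number,
  loadFromDb: (key: string) => V, // assumed to be the expensive database query
): V {
  const cached = cache.get(key, now);
  if (cached !== undefined) return cached;
  const fresh = loadFromDb(key);
  cache.set(key, fresh, ttlMs, now);
  return fresh;
}
```

Calling `getOrLoad` twice for the same key within the TTL should run the loader only once; after the TTL has passed, the next call loads from the database again.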

"Priming" a cache with all results isn't needed. You are only "loading" from the database once (which you'd need to do too for the priming anyway).

signal wagon Sep 26, 2023, 5:44 AM

#

Right, that's normal scenarios.

#

And that's fine, for some cache needs.

#

In this case, I have a highly realtime need for cached data primed and regularly updated.

#

I am dealing with millions of inbound data every day, and there are some heavy queries where Redis caching of specific data saves a lot of load on the DB cluster.

rigid quail Sep 26, 2023, 5:45 AM

#

"regularly updated" data isn't normal for a cache.

signal wagon Sep 26, 2023, 5:45 AM

#

That's not true.

#

There's a lot of uses for Redis, it's not just "page fragment" caches or "api endpoint output" caches or basic JSON data caches

#

It's fine, I can simply extract this job to a go/rust/dart microservice, I was just trying to keep it in Nest.js for simplicity.

rigid quail Sep 26, 2023, 5:46 AM

#

How can regularly updated data be croned in a batch process to a cache?

#

How do you ensure you aren't ending up with stale data?

signal wagon Sep 26, 2023, 5:47 AM

#

rigid quail How can regularly updated data be croned in a batch process to a cache?

A batch can have many SET statements to Redis, and there's significant performance increase. There's no need to serialize or even parallelize the SETs with new connections and lots of roundtrips.

#

It's not stale because it's regularly updated (remember, it's a scheduled/cron job)

rigid quail Sep 26, 2023, 5:47 AM

#

That doesn't answer my question.

signal wagon Sep 26, 2023, 5:48 AM

#

I want to firehose it into Redis. I don't care if it's occasionally stale. It updates very frequently.

#

Of course, a TTL is set, so that if something goes wrong, the data goes away.

#

The cron/schedule job is very simple -- it pulls data from DB, it shoves into Redis. Other apps can pull from Redis (and then only hit DB when it doesn't exist). It's pretty straightforward.

#

It's working fine now. It's just slower than I'd like. I'd like to shave off a lot of overhead by pipelining it, since it's thousands of set() calls.

#

pipelining is truly beneficial for this scenario (I've done it many times before in other frameworks)

rigid quail Sep 26, 2023, 6:05 AM

#

Sorry, but "updates very frequently" kills the whole premise of caching in my mind. If you update frequently, that means you batch-query the database for these "heavy queries", which also puts a considerable load on the database, and does so "regularly", right? You might be caching data for users who rarely even need the service, so possibly wasting resources?

Sounds like you actually need a more performant database. 🤷🏻‍♂️ Or, you need a way to capture "heavy queries" to get faster replies, via something like database views, which are also a type of caching.

But, if you insist on making this happen, I'm sorry. I don't know of a Nest way to make it happen. And usually when there isn't a solution already made for your problem out there, it probably means it is either a very niche and yet legit problem or your approach is wrong. 🤷🏻‍♂️ I'll just accept it is the "it's a niche but legit problem" on this one, as I don't have the worldly experience to consider it a bad approach. 😄

#

This package is a bit older, but it looks like it can offer direct client access. https://www.npmjs.com/package/nestjs-redis

signal wagon Sep 26, 2023, 6:10 AM

#

It's alright. I'm going to do it in a go microservice.

#

Not looking to bend Nest.js to do something it doesn't do easily.

rigid quail Sep 26, 2023, 6:11 AM

#

That package works with ioredis, which has pipelining. 🙂
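
For reference, a pipelined version of the `set()` loop discussed earlier might look something like the sketch below. The `PipelineLike` and `RedisLike` interfaces are modeled on the shape of ioredis's pipeline API (`pipeline()`, chained `set(key, value, "EX", seconds)`, then `exec()`), but they are local stand-ins, and `primeCache`, `chunk`, `FakeRedis`, and the batch size of 1000 are all hypothetical. An in-memory fake is used so the snippet runs without a Redis server.

```typescript
// Structural types modeled on the shape of ioredis's pipeline API.
// They are local stand-ins, not imports from the real library.
interface PipelineLike {
  set(key: string, value: string, mode: "EX", seconds: number): PipelineLike;
  exec(): Promise<unknown>;
}
interface RedisLike {
  pipeline(): PipelineLike;
}

// Split a large list into fixed-size batches so no single pipeline grows unbounded.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Queue every SET of a batch on one pipeline, then flush it with a single exec().
// Returns the number of pipelines flushed.
async function primeCache(
  redis: RedisLike,
  entries: Array<[string, unknown]>, // [cacheKey, data] pairs
  ttlSeconds: number,
  batchSize = 1000, // hypothetical default; tune for your payload sizes
): Promise<number> {
  let flushed = 0;
  for (const batch of chunk(entries, batchSize)) {
    const pipeline = redis.pipeline();
    for (const [key, data] of batch) {
      pipeline.set(key, JSON.stringify(data), "EX", ttlSeconds);
    }
    // A real job should inspect the exec() result for per-command errors.
    await pipeline.exec();
    flushed += 1;
  }
  return flushed;
}

// In-memory fake so the sketch runs standalone. It ignores the TTL arguments.
class FakeRedis implements RedisLike {
  readonly store = new Map<string, string>();
  execCalls = 0;

  pipeline(): PipelineLike {
    const queued: Array<[string, string]> = [];
    const self = this;
    const p: PipelineLike = {
      set(key, value) {
        queued.push([key, value]);
        return p;
      },
      exec: async () => {
        self.execCalls += 1;
        for (const [k, v] of queued) self.store.set(k, v);
        return queued.length;
      },
    };
    return p;
  }
}
```

With 2,500 entries and the default batch size, this sketch would flush three pipelines instead of awaiting 2,500 individual `set()` calls.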

signal wagon Sep 26, 2023, 6:11 AM

#

Yeah, but if I'm writing non-framework code and trying to shoehorn it into the service layers, I might as well just do it in a microservice and be done with it.

#

It's not so much an issue with the database. I mean, it is, in that I have billions of rows of data (terabytes of data) and it's just not sane to be hitting it for this data all the time.

#

You can only tune/optimize the DB so much, before you have to just stop trying to hit it for all requests.

#

For scale purposes, hitting a keystore in memory is so much faster than trying to let the database burn IO, mem, and CPU to hit giant indexes and burn the planner constantly.

rigid quail Sep 26, 2023, 6:16 AM

#

I'm not questioning the reasoning for caching. I'm questioning your methodology of using the cache. 🙂 But, like I said. I don't have the worldly knowledge to actually say it is a poor methodology. I've just never heard of doing a cache priming batch service that regularly updates a cache. That nestjs-redis package gets you close to using a redis client in a Nestjs manner for your service for sure. 🙂

signal wagon Sep 26, 2023, 6:18 AM

#

Oh yeah, Redis is used for stuff like this all the time.

rigid quail Sep 26, 2023, 6:19 AM

#

Another process I know of for this kind of thing is called a data river.

signal wagon Sep 26, 2023, 6:19 AM

#

Made a quick test in go. This will work fine.

rigid quail Sep 26, 2023, 6:20 AM

#

Cool. Now you just need to set up the microservice.

signal wagon Sep 26, 2023, 6:20 AM

#

You can see how drastically faster it is.

rigid quail Sep 26, 2023, 6:21 AM

#

Sure, but the drag will be on gathering the data to store from the database, no?

signal wagon Sep 26, 2023, 6:21 AM

#

That's already happening.

#

Currently there's a 12-15 second cost to all the set()'s with the data to Redis without pipelining.

#

Once I pipeline it, I expect it'll be sub 1 second
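
A back-of-the-envelope model shows why that expectation is plausible. The figures here are assumptions, not measurements from this thread: 2,000 keys (in line with "a couple of thousand") and a 6 ms round trip, chosen only because 2,000 × 6 ms lands in the reported 12-15 second range. The function names are hypothetical, and the model ignores server processing and payload transfer time, so real pipelined runs will take longer than it suggests.

```typescript
// Rough model: each awaited command costs one network round trip.
function sequentialMs(commands: number, roundTripMs: number): number {
  return commands * roundTripMs;
}

// With pipelining, each batch costs roughly one round trip
// (server processing and payload transfer time are ignored here).
function pipelinedMs(commands: number, roundTripMs: number, batchSize: number): number {
  return Math.ceil(commands / batchSize) * roundTripMs;
}

// Assumed figures, not measurements: 2,000 keys and a 6 ms round trip.
const assumedKeys = 2000;
const assumedRttMs = 6;
const sequentialEstimate = sequentialMs(assumedKeys, assumedRttMs); // 12000 ms
const pipelinedEstimate = pipelinedMs(assumedKeys, assumedRttMs, 1000); // 12 ms
```

Under these assumptions the network wait drops from about 12 seconds to a couple of round trips, which is why the remaining cost tends to be serialization and the database queries themselves.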

rigid quail Sep 26, 2023, 6:22 AM

#

How is the data for the caching being already gathered?

signal wagon Sep 26, 2023, 6:23 AM

#

From the database.

#

There are highly tuned queries.

#

Retrieving a batch of results, then iterating on the results.

#

Pretty basic stuff.

rigid quail Sep 26, 2023, 6:23 AM

#

So, a middleware?

#

Not in the nest sense, but in the integrations sense.

#

You are building a middleware.

signal wagon Sep 26, 2023, 6:24 AM

#

Right now it's queried in the scheduler job in nest.js. But I'll implement the same queries in this new microservice.

#

No, it's a microservice, not a middleware.

rigid quail Sep 26, 2023, 6:25 AM

#

It's a microservice functioning as a middleware. Sure thing. But, a simple one. 🙂 Ok. I get the process now. And yeah, NestJS covers a lot of bases, but not this one. 🙂

signal wagon Sep 26, 2023, 6:25 AM

#

🙂

rigid quail Sep 26, 2023, 6:27 AM

#

If I may ask, what is the overall application being served? Is it something to do with IoT?

signal wagon Sep 26, 2023, 6:29 AM

#

Flight tracking

#

https://airframes.io

rigid quail Sep 26, 2023, 6:37 AM

#

Oh. Cool! I'm a private pilot (too?).

signal wagon Sep 26, 2023, 6:37 AM

#

Well that's not 10,000 calls (i just didn't update that output)

rigid quail Sep 26, 2023, 6:39 AM

#

I can imagine ingesting all that data is a concern for your database too.

signal wagon Sep 26, 2023, 6:40 AM

#

Oh indeed. Spent a lot of time there.

#

That's a Dart service for the aggregator. Highly optimized, low memory, very tuned. Each ingest is broken out into separate service instances, etc.

#

Writes to the DB cluster's primary node, while reads go to the replicas.

#

there we go

#

Pretty substantial savings!

#

That's of course a very minimal struct though... will have to expand it out.

signal wagon Sep 26, 2023, 6:47 AM

#

rigid quail Oh. Cool! I'm a private pilot (too?).

Awesome! It's an awesome hobby.

#

We have a pretty thriving Discord if you're interested.

rigid quail Sep 26, 2023, 7:06 AM

#

Looks like it is more about commercial flights. So, nothing I've been directly involved with as a private pilot. It all sounds new to me. 😄

signal wagon Sep 26, 2023, 7:21 AM

#

rigid quail Looks like it is more about commercial flights. So, nothing I've been directly i...

Not just commercial flights. Any airframes that have transponders.

#

So there are definitely corporate, military, etc

#

Not a lot of little private cessnas of course

signal wagon Sep 26, 2023, 7:37 AM

#

Not too shabby... and that's on my dev system reaching across a VPN to production servers, so deployed in production natively should be even faster.

reef thorn Sep 26, 2023, 7:59 AM

#

signal wagon Not too shabby... and that's on my dev system reaching across a VPN to productio...

did you json marshal those structs with the Go's internal encoding/json package?

signal wagon Sep 26, 2023, 8:37 AM

#

reef thorn did you json marshal those structs with the Go's internal `encoding/json` packag...

yeah

reef thorn Sep 26, 2023, 8:38 AM

#

signal wagon yeah

give go-json a try! it's a drop-in replacement of encoding/json that performs better with large datasets.
also, I'm not sure how frequently you execute those pipelines, but increasing the go-redis connection pool size (the `PoolSize` option) will also decrease the time spent waiting for a busy connection to become available 👍🏼

signal wagon Sep 26, 2023, 8:39 AM

#

reef thorn give [go-json](<https://github.com/goccy/go-json>) a try! it's a drop-in replace...

Really great tips, thanks 🙂

#

And thank you @rigid quail for the back and forth earlier as well

#

Production definitely faster for all things of course. Still need to add more data to the JSON structs (nested details), which will bring the DB time back up, but necessary. And the go-json improvement will benefit there of course.
