#How to handle concurrency problems?

1 messages ยท Page 1 of 1 (latest)

gray jungleBOT
#

<@&987246584574140416> please have a look, thanks.

#

Here is an AI assisted attempt to answer your question ๐Ÿค–. Maybe it helps! In any case, a human is on the way ๐Ÿ‘. To continue talking to the AI, you can use </chatgpt:1108714622413963314>.

#
TJ-Bot
I want each user who accesses the database to get a different row every time they try to fetch a row from the database. I don't want multiple users to get the same row in the database, how do I handle it without an error? Currently I'm using a redis lock b

To handle concurrency problems in a Java application where different users need to access the database and retrieve unique rows, you can implement optimistic locking mechanisms. One common approach is using version numbers or timestamps to track changes to the database rows.

Here's how you can achieve this:

  1. Add a version number or timestamp column to your database table.
  2. When a user fetches a row, include the version number or timestamp in the query.
  3. When the user updates the row, increment the version number or update the timestamp.
  4. Before updating the row, check if the version number or timestamp has changed since it was fetched.
  5. If there is a mismatch, handle it as a concurrency issue by either rejecting the update or retrying the operation.

This approach ensures that each user gets a unique row by detecting conflicts and preventing multiple users from accessing and modifying the same row simultaneously.

By implementing optimistic locking strategies like version numbers or timestamps, you can effectively manage concurrency problems in your Java application and ensure data consistency for your users accessing the database.

potent bramble
#

Row can have the same timestamp as well. Please tag me, also locking mechanism I'm using redis lock but then I want to fetch different rows for each user.

#

I think I can use select for update and then update but then it might lock the rows in the database and if something goes wrong the db will be affected

#

And suppose the database has 1million uses and 32 users taking actions of the latest requests at the same time

odd charm
#

what kind of db

#

and why is redis involved

#

and what do these rows represent

potent bramble
#

Uhh it's postgres

odd charm
#

good

potent bramble
#

Actually the rows represent user data

#

Where we need to like approve or reject whatever the users have entered

#

And then we have like approvers or rejecters who take actionables on the rows based on the latest rows

potent bramble
# odd charm and why is redis involved

So that no two approvers or rejecters can take action on the same case at the same time i use it and throw an error saying it's already being done by someone else

#

What I want to achieve is I don't want two different approver or rejecters to get the same case at the same time

odd charm
#

okay so you misunderstand what redis is for, but thats fine

#

what is your db schema for this?

#

and at what point is someone assigned?

potent bramble
#

wdym db schema? I don't understand

odd charm
#

postgres

#

you have a bunch of create table statements

potent bramble
#

ya I use postgres

#

No not create table

odd charm
#

what are those create table statements

#

someone at some point created a table

potent bramble
#

Just the normal stuff yea

odd charm
#

what does that table look like

potent bramble
#

create_time data1 data1_status Data1_lut data2 data2_status data_2_lut

odd charm
#

what in this wonderful green planet of ours do any of this mean data1 data1_status Data1_lut data2 data2_status data_2_lut

potent bramble
#

hmm well

pastel lion
#

Your title and body don't align.

potent bramble
#

One min can u come dms I'll explain what exactly it is

pastel lion
cosmic zealot
#

no dms, share it here

potent bramble
#

Ok

potent bramble
pastel lion
#

That's not what redis is for.

potent bramble
#

If the key already exists in redis and values don't match with the approver/rejector I don't allow anyone else to touch it

odd charm
#

okay so redis, in that context, can be used for "distributed locks"

#

i.e. locks that work with multiple servers

pastel lion
#

Or you could just use Hibernate's @Version or something equivalent.

potent bramble
#

Yea it's for distributed locks

odd charm
#

in this case you do not need a distributed lock

potent bramble
#

I'm using spring jdbc

odd charm
#

you just need a normal lock

#

or, more likely, no lock at all

pastel lion
#

You just want optimistic concurrency control.

potent bramble
#

I require a lock

odd charm
#

you require a logical lock - in that you want only one user acting on a resource

potent bramble
#

I actually want 2 different people to get two different rows at the same time which have the same timestamp

odd charm
#

you do not neccesarily require a database lock

potent bramble
odd charm
#

no that doesn't matter

#

your users aren't running sql directly on your db

potent bramble
#

ok how do I do that

odd charm
#

describe more of the exact use case

potent bramble
#

But my program is

odd charm
#

yep

potent bramble
#

ok so it's like a kyc system of sorts

pastel lion
#

You're dragging in an extra component for nothing.

odd charm
odd charm
potent bramble
#

Well I want one for jdbc

odd charm
#

i guarentee you do not

#

but okay so you want someone

potent bramble
odd charm
#

so users enter info

potent bramble
#

that is correct

odd charm
#

and a group of admins need to approve or deny each person

potent bramble
#

yes exactly

odd charm
#

but each admin needs to have sole control of each review

potent bramble
#

now those group of admins are many

odd charm
#

yep

potent bramble
#

yep

odd charm
#

so instead of an implicit lock

potent bramble
#

and the problem is each admin gets assigned the same user at times

#

which I'm handling with a redis lock

odd charm
#

i.e. your database's locking functionality

potent bramble
#

but Instead of giving them an error i want to show them a different row

odd charm
#

or a redis lock (which is insane)

potent bramble
#

Well I'm scared of the database locking functionality so I don't touch those at all

odd charm
#

make an explicit row in a table marking that an admin has "claimed" a review

potent bramble
#

to prevent that im using this

odd charm
odd charm
#

if we only need to have a lock during a "claim" process

#

thats a much shorter scope of time

#

after a row has been "claimed" you can then enforce that claim with some backend logic like "cannot submit if unclaimed"

#

and you can manually manage who does and does not have access

potent bramble
#

I don't understand

odd charm
#

imagine the deli counter

potent bramble
#

Actually my problem is now I have an api to fetch a case

#

2 users use the api

odd charm
#

i understand your problem

potent bramble
#

And the same case gets fetched and they try performing actions

#

ya

odd charm
#

okay so imagine the deli counter

potent bramble
#

yes?

odd charm
#

you have 50 people in the line

#

they all want meats

potent bramble
#

correct

odd charm
#

what we need to do is give them each their own number

#

you don't want two people coming up when one number is called

potent bramble
#

yes that's correct

odd charm
#

but once you've given out a number you no longer need them to coordinate

#

you just need to check "is this a number I can serve right now"

potent bramble
#

uhm yea

odd charm
#

okay

#

so instead of locking a row in the database, we give them a number

#

i.e. if you have

#
kyc_document

id | created_at | updated_at | data...
#

then all we need to do is atomically insert a

kyc_document_review

id | kyc_document_id | reviewer_id | assigned_at | completed_at
potent bramble
#

insert?

odd charm
#

yeah, insert

potent bramble
#

no actually we are just updating it right now

odd charm
#

no actually we should be inserting

potent bramble
#

and we don't want to insert anything

odd charm
#

and we want to insert something

potent bramble
#

why insert though?

odd charm
#

how long does it take to review one of these documents?

potent bramble
#

Super fast lol

#

10s at max

odd charm
#

thats super slow

potent bramble
#

It's not only documents it's matching pan nos etc

#

Well it's human verification

odd charm
#

right

#

so how long does it take to insert a row

potent bramble
#

1s or less

#

same with updates

odd charm
#

point is, the longer a lock is open the more chance for shenanigans

potent bramble
#

And we don't need the same case to be audited twice

odd charm
#

and databases do not do well with locks that are open for a long time

potent bramble
#

yea

odd charm
#

so instead of giving someone a row and only accepting updates from that one person

#

you explicitly assign them to a case

potent bramble
#

ok but the point is the system already works like that so how to fix it

potent bramble
#

wait what

odd charm
#

and thats what inserting a row does

potent bramble
#

how?

odd charm
#

you insert data saying "this reviewer is in charge of reviewing this case"

#

so long as you

potent bramble
#

yea I get ur point but I don't want a particular case to be stuck to a particular reviewer

odd charm
#

you can have an unassigning process

#

but this whole conversation is about how you only want one person to do a thing

potent bramble
#

it's about how I want multiple people to do different cases

odd charm
#

i'm saying that process will be much easier to manage if you do it explicitly as opposed to praying to distributed system gods

potent bramble
#

instead of multiple people getting the same case if there's only one user at that time

odd charm
#

right

#

so in the assigning process we take care of that

potent bramble
#

but I don't want it to be explicitly assigned to someone

#

I want first come first serve

odd charm
#

thats when it will be assigned

#

like the deli counter

#

they come up, first come first serve, and get their ticket

#

i am saying that having a database row representing the ticket is all you need

#

yes, we need to figure out assigning a ticket

#

but thats a much easier task than "keep this exclusive for a minute"

#

if you want to have a process where if they ignore it it gets unassigned, now you can do that

potent bramble
#

I guess I'll stick to the redis locks then

odd charm
#

that is the actual worst option

#

its insane you even considered it

potent bramble
#

why?

#

so u want me to do what exactly? just handle a db exception?

odd charm
#

you are using a distributed lock - famously one of the hardest things in computer science to get right - for a non-distributed system

odd charm
potent bramble
#

man i really need to get my fundamentals right ๐Ÿ˜ญ im screwed I need to start everything from beginning, first job and making me forget everything I've learned

odd charm
#

we use the presence of that row to avoid assigning the same document to a different person

#

and in this way lower the time that you need to maintain a database lock

potent bramble
#

What if two people do it at the same time

odd charm
#

now we get into the FOR UPDATE SKIP LOCKED

potent bramble
#

I was gonna use update skip locked but the db is fricking huge

#

Ok waittt

#

U want me to do this on the new table where we insert our rows?

#

Sorry I'm unable to follow

odd charm
#
SELECT job_id, queue, job_type, payload, attempts, enqueued_at, process_at FROM %s
WHERE
    queue = ?
    AND process_at <= ?
ORDER BY process_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
#

here is some SQL from a library that handles background jobs

#

it makes sure that only one worker handles a job at a time

potent bramble
#

Actually I was thinking of using skip locked but idk if it will screw the db

#

Because using that on 1Mil records = idk

#

What I was thinking is me removing the redis lock use update skip locked and then use commit after updating

#

But what if the dataset grows huve

#

And the locks is what I'm afraid of

odd charm
#

locking one row at a time is fine

#

so long as you do it for a short period of time

#

do you currently have 1 million rows?

potent bramble
#

I guess what's a mess shall be let that way then I'll only use it for what I'm creating now

potent bramble
#

For the one in creating it'll become that over time

odd charm
#

you better not work for abound, i'll flip

#

but okay

potent bramble
#

What happens if that person does it and then exits mid way

odd charm
#

what do you want to happen?

potent bramble
#

Honestly I have no idea of what's actually happening with it, I'm reusing existing code and i doubt they delete it ๐Ÿ’€

#

Well ideally it should expire within 24h and be deleted

#

I'll have to check

odd charm
#

okay so what you can do is

#
kyc_document_review

id | kyc_document_id | reviewer_id | assigned_at | unassigned_at
#

have a background job that looks for document reviewers that were assigned 24 hours ago but did not finish

#

and explicitly mark them as unassigned

potent bramble
#

Well there's no need of using a redis lock right on this new table

#

I think select for update should do the job

odd charm
#

there''s no need of a redis lock in your entire system

potent bramble
#

But yea what happens if something goes wrong in between?

#

What happens with that select for update?

odd charm
#

in the query i showed, if all the rows are locked it returns no rows

potent bramble
#

no no

#

That i know

#

But what if suppose I'm person A

#

I get that case and something goes wrong

#

Idk say the server shuts down or something

#

Is that low locked forever?

#

Or do the locks expire?

#

If im person b and I wanna do something will I be able to see that case?

odd charm
#

thats why we only take out the lock for a very brief period

#

and use them to insert an explicit record that you can build functionality on top of

#

because these questions are insanely hard to answer

#

generally no, the db isn't locked forever - but they are tied to a single connection

#

we do sometimes use db locks when doing prod fire fighting

potent bramble
#

no I mean for the row

odd charm
#

like we need to fix some rows in prod so we do

#
BEGIN;

COMMIT;
potent bramble
#

what happens to it

odd charm
#

in the scheme i am suggesting

#

person A gets the case

#

this means that a row was inserted in a table that relates people to cases

#

the server shuts down

#

the server starts up

#

that row is still there and thus person B will not get the case

#

with database locks

#

person A gets the case, server restarts, the locks are lost, person B can claim it

#

thats why we put a column on the assignment record for unassignment

#

we can even do something like this

potent bramble
#

How u do that

odd charm
#
kyc_document_review

id | kyc_document_id | reviewer_id | assigned_at | unassigned_at
#

so if unassigned_at is null, you are still assigned

#

if unassigned_at is not null, it is no longer a valid assignment

#

and you can have any number of processes fill that in

#

and you know what else you can do here? make a conditional index

#
UNIQUE (kyc_document_id) WHERE unassigned_at IS NULL
potent bramble
#

I don't want to assign a case to someone

odd charm
#

this means that the database will automatically enforce that only one person is able to update a case at a time

odd charm
#

and you want to unassign it from someone if they take too long and open it up for someone else to grab

potent bramble
#

wait ur saying I assign it to someone

#

correct?

odd charm
odd charm
potent bramble
#

no but it's not solving my problem

#

2 different people ask for a case at the same time

odd charm
#

yes

potent bramble
#

how to give them different cases

odd charm
#

and lets walk through what will happen

#

person 1 will get case A first

#

then person 2 will try to get case A - lets ignore skip locked as an option for now

#

so when person 2 tries to get case A, that means that this will be executed:

potent bramble
#

Sorry to interrupt but wait ur talking about the no insert part right? Just update the status directly

odd charm
#
INSERT INTO kyc_document_review(kyc_document_id, reviewer_id)
VALUES ('A', '2')
odd charm
#

because it leads to insane issues

potent bramble
#

Oh ok please go ahead

odd charm
#

so now, this sql

#

it will hit this uniqueness constraint

#

UNIQUE (kyc_document_id) WHERE unassigned_at IS NULL

#

and fail

pastel lion
#

I missed a bit, but one thing I was wondering about in the initial solution. Why do you always take a lock, you could have a read only connection as well.

odd charm
#

and you can - worst case - just try again

potent bramble
#

Ahhh ok, so u want me to insert with a constraint

#

So there's no locks involved smart

#

It'll lead to extra table and storage but ok

odd charm
#

it might end up being better to have a tiny lock at the start to not have to deal with running in to that constraint

potent bramble
#

This will work

odd charm
#

but yeah - thats the point of the explicit ticket system

#

you can manage the assignment yourself

#

instead of having the assignment live implicitly in something batshit like a redis lock

potent bramble
odd charm
potent bramble
#

Suppose person A leaves the job

#

Or is on leave

#

How do I allow someone else to use it without a service restart

odd charm
#

UPDATE kyc_document_review
WHERE reviewer_id = (SELECT id from reviewer WHERE email = '[email protected]') AND unassigned_at IS NULL
SET unassigned_at = NOW()

potent bramble
#

Oh ok

odd charm
#
UPDATE kyc_document_review
WHERE ((NOW() - assigned_at) > '10'::minutes) AND unassigned_at IS NULL
SET unassigned_at = NOW()
#

(I forget the exact postgres time query syntax)

#

but yeah, you just set unassigned to now

potent bramble
#

like I don't understand how to remove the locked rows

#

nvm the query syntax

odd charm
potent bramble
#

Even if we update the unassigned at isn't it still locked?

odd charm
#

the row itself is a logical lock

potent bramble
#

what

odd charm
#

you update your code to check "is this person assigned to the case"

#

and that is all

#

the database lock no longer exists after the row is inserted

#

or at all, depending

potent bramble
#

oh wait

#

if we don't put skip locked it'll update the locked row and the lock disappears?

odd charm
#

lets step back

#

you said its first come first serve, right?

potent bramble
#

Yep

#

And 2 ppl come at the same time

odd charm
#

so if person A requests to review a document it kinda doesn't matter which one so long as it hasn't been reviewed yet?

potent bramble
#

I mean it can be random

#

It can be partially reviewed as well

#

But suppose u have to review of a user and there 2 admins

#

I want to only allow one admin to fetch the case

#

Or suppose there's 2 rows and one user comes at 20:00 and other is at 19:00 we are fetching the latest user first, I don't want both the admins to get the same user who came at 20:00 I want one user to get the 20:00 used and other the 19:00 user

odd charm
#
INSERT INTO kyc_document_review(kyc_document_id, reviewer_id)
VALUES (
    SELECT kyc_document.id, ? as reviewer_id
    FROM kyc_document
    WHERE NOT EXISTS (
        SELECT id
        FROM kyc_document_review
        WHERE unassigned_at IS NULL AND completed_at IS NULL
    )
)
RETURNING id;
#

might need some tweaking but

#

this requires no locks

#

you just assign atomically by putting the assignment conditions in the insert query

potent bramble
#

Won't it still return an error?

odd charm
#

nope

potent bramble
#

Saying it already exists

#

Waittt

odd charm
#

how could it, its only inserting a row if one doesn't already exist

potent bramble
odd charm
#

yep

#

thats something your database can manage

potent bramble
#

Even for a database with a million rows?

odd charm
#

yes

potent bramble
#

It's better than a redis lock?

odd charm
#

way better

potent bramble
#

Ok thanks you very much sir

odd charm
#

but you need to design your system better

potent bramble
#

Where do I learn system design?

#

Man i need to start my cs fundaments from scratch

#

Same with java coding

odd charm
#

there are books and stuff

potent bramble
odd charm
#

but i learned what i know the hard way - by making systems that were so fucked up they caused immesurable pain for everyone

odd charm
potent bramble
#

So if someone else wants to approve the same user they can right

#

I mean there'll be a PK for sure

odd charm
#

you don't need it, but its a nice touch to put one

#

especially if you want to be sure no two people are assigned at the same time

potent bramble
#

but like someone else can approve the same doc if needed right

odd charm
#

oh

#

if you have a single column that says "status" and it is like "approved"/"unapproved"/"rejected"

potent bramble
#

yes

odd charm
#

i implore you to more to a status log

potent bramble
#

I have that

#

how?

odd charm
#

and use a trigger to update the root status for efficiency

#

well, like this

#

here's a smidge of code

#

basically you have a log of what the status is

#

and you can even attach who asserted that status

potent bramble
odd charm
#

like in my work we have this pattern a bunch of places

#

one is with payables

#

you want to send $100 to someone, but we have an approval process that people will want to go through

#

so person A makes the payable

#

person B approves it as being valid

#

person C says they have the money, send it

potent bramble
#

Btw the documents consist of multiple stages and 3 queues

odd charm
#

so instead of a status on payable, we have a table record the status in a log

potent bramble
#

Approval rejection and approved

odd charm
#

yep very similar to our system

#

do not just have a status column

#

you will want to know who did what and when

#

a status column we've found is sometimes needed - the query for "latest status" can get slow with a big enough table

#

but you want the full log

potent bramble
#

well im maintaining everything in a single table

odd charm
#

why the fuck are you maintaining everything in a single table

potent bramble
#

atleast that's how it was created a mess

#

because our manager doesn't seem to care and it's my first job and everything that I learned is going down the drain at this job angerysad

odd charm
#

the notion that a million people are giving their personal information to this system is actually scary

potent bramble
#

because nobody knows anything

potent bramble
odd charm
#

and what are you thinking at second

potent bramble
#

I'm not learning anything new, everything is a mess. I'm thinking of starting from scratch

odd charm
#

well you learned something new today

#

maybe a few things

gray jungleBOT
#

@potent bramble

Your question has been closed due to inactivity.

If it was not resolved yet, feel free to just post a message below
to reopen it, or create a new thread.

Note that usually the reason for nobody calling back is that your
question may have been not well asked and hence no one felt confident
enough answering.

When you reopen the thread, try to use your time to improve the quality
of the question by elaborating, providing details, context, all relevant code
snippets, any errors you are getting, concrete examples and perhaps also some
screenshots. Share your attempt, explain the expected results and compare
them to the current results.

Also try to make the information easily accessible by sharing code
or assignment descriptions directly on Discord, not behind a link or
PDF-file; provide some guidance for long code snippets and ensure
the code is well formatted and has syntax highlighting. Kindly read through
https://stackoverflow.com/help/how-to-ask for more.

With enough info, someone knows the answer for sure ๐Ÿ‘