#Securing temp keys used in the client
I see the request has now changed its status from "Pending" to "Unknown". While I know "Pending" means the connection hasn't closed yet, why would there ever be an "Unknown" status? I actually have another "Unknown" request created earlier today, request ID 39037176-b2f6-4d43-9af5-77f18c01aa05 - can you close this connection too please.
BTW, why do your usage logs for "Pending" or "Unknown" statuses only show the most basic fields, i.e. "Request ID", "Endpoint", "Status", "Performed at", "API Key identifier" and "Language"? I can imagine showing "Audio duration" and "Cost" may be hard because the connection has not finished yet, but "Models", "Tags", and the connection params like "Endpointing" are missing too, even though they are known at connection open. Especially "Tags" would be useful (I'm now using it to know which user has opened the connection).
I am confident there is no attack on your account. How are you generating keys? How do you think a bad actor might be taking them?
The labels in our dashboard are ... not always super helpful. I've given the team this feedback before, but unfortunately they're very busy.
Pending requests on streaming are not just open connections, but connections we've not had another log about to determine its next status (basically). If something happens during the connection to cause it to close into an unknown state, we wait X hours before we change it to "unknown".
Thanks @teal burrow, I've tested that Pending basically means an open connection. I'm generating the temp keys on the server as in https://vercel.com/templates/next.js/nextjslive-transcription and the user then uses them to connect to Deepgram directly. Unfortunately, due to https://github.com/orgs/deepgram/discussions/705 (BTW is there any update on this?), once a user has connected there's no way to limit their connection, so a bad actor can just take the temp key, connect, and keep sending audio forever 😦 I think Deepgram has been aware of this vulnerability at least since https://deepgram.com/learn/protecting-api-key was written; see the quote under approach 3: "The first two approaches in this guide are a good stop-gap, but you should avoid sending keys to the client wherever possible." So basically for serverless apps, temp keys are just a "stop-gap" 😦
For a Next.js app, you should introduce a session check so that automated requests can't be used to farm keys. Check out next/auth. We have a plan to implement that for our demos but... Our demo apps are not intended for production... They're intentionally small. We'd expect users to implement their own security around their keys.
You can check details about the request (like IP, geolocation, user agent, etc) to make sure it matches up with something you store on the session.
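A minimal sketch of the check described above, assuming the server stored the IP and user agent when the session was created. The `SessionFingerprint` type and both function names are hypothetical, not part of Next.js or any Deepgram SDK.

```typescript
// Hypothetical shape: details captured at login and stored on the session.
interface SessionFingerprint {
  ip: string;
  userAgent: string;
}

// True only when the current request carries the same details as the session.
function matchesSession(stored: SessionFingerprint, current: SessionFingerprint): boolean {
  return stored.ip === current.ip && stored.userAgent === current.userAgent;
}

// Gate for the endpoint that hands out temp keys: no session, no key.
function mayIssueTempKey(
  stored: SessionFingerprint | undefined,
  current: SessionFingerprint,
): boolean {
  if (stored === undefined) return false;
  return matchesSession(stored, current);
}
```

An exact IP match can reject legitimate users whose address changes (mobile networks, for example), so a real check may want to be more lenient. It also only filters out requests that do not belong to the session; it does not limit what a logged-in user does with the key afterwards.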
Yes of course all my API routes are behind auth, but both free trial and subscription users can be bad actors too. Yesterday one of my free trial users ran up 55 hours of streaming by just using the temp key to make multiple connections that lasted 2-4 hours each. How can I avoid that without imposing some limits on usage per API key? And if I cannot manually close existing connections, I cannot even react to bad actors manually.
For this kind of control over the connection, you'd require a proxy that has an abort controller. Then your keys are never in the browser
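A rough sketch of the abort-controller idea, assuming a proxy that holds one socket to the client and one to Deepgram. `AbortController` is the standard web/Node.js API; the `ClosableSocket` interface and `linkSockets` function are hypothetical stand-ins for whatever WebSocket library the proxy uses.

```typescript
// Hypothetical minimal socket shape; a real proxy would use its
// WebSocket library's own type here.
interface ClosableSocket {
  close(): void;
}

// Ties the client socket and the upstream (Deepgram) socket to one
// AbortController, so a single abort() tears down both sides.
function linkSockets(client: ClosableSocket, upstream: ClosableSocket): AbortController {
  const controller = new AbortController();
  controller.signal.addEventListener(
    "abort",
    () => {
      client.close();
      upstream.close();
    },
    { once: true },
  );
  return controller;
}
```

The server keeps the returned controller and can call `abort()` from a timer, a usage check, or a manual admin action, which is the control that is missing when the browser talks to Deepgram directly.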
Right, but this makes Deepgram unsafe to use in any serverless app 😦 I'm not sure why Deepgram came up with temp keys if they have no usage limits. The demo app should clearly state that this is not a safe approach, even with API route auth.
Ok. This is not an admission of a vulnerability. Client-side services all suffer the same vectors: keys are regularly exposed in the browser, but they are short-lived. The most important thing is to limit the scope of the keys, and limit the connection opportunities. It is the responsibility of all service providers to provide advice on key protection.
This has nothing to do with serverless, and everything to do with how you choose to build your client. If you choose not to use a proxy, and make it easy for keys to be exposed to users with no session check despite the advice, then any service is likely to be vulnerable for you.
The session check should be on the endpoint that provides the keys. It should ensure that it is not a bot farming keys.
But I'm doing session checks. And both trial and paying users are covered. Yet they can connect and keep sending audio forever because there are no limits on the usage once they're connected.
Are you saying they're using your client too much?
You can disconnect from the client
Once the bad actor has the temp key, they don't need my client, they can just make their own websocket connection, and keep it alive forever.
And yeah that's what happened yesterday, the bad actor wasn't using my client, because like you say, my client wouldn't allow for this.
How do you know this?
Our connections don't last forever. We only guarantee connections for around an hour due to how our CD workflow operates. I think our record is a few hours.
What is your suggestion then? Optionally a max $ or time for a key?
It's good that you limit single connection time! But bad actors can start multiple connections around the same time, like what happened to me yesterday... Actually, if they start enough connections, they could reach the concurrent API connection limit and bring down an app altogether? Yes my suggestion is either max time (or dollars) per API key, as in https://github.com/orgs/deepgram/discussions/705, and also probably concurrent connection limit if possible.
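Per this thread, Deepgram does not offer per-key time or cost limits, so the following is only a sketch of what such limits could look like when enforced on your own side (in a proxy, for instance). The `UsageGuard` class, its method names, and the limit values are all hypothetical.

```typescript
interface Limits {
  maxConcurrent: number;   // open connections allowed per user at once
  maxTotalSeconds: number; // total audio seconds allowed per user
}

// In-memory tracker; a real deployment would need shared, persistent storage.
class UsageGuard {
  private limits: Limits;
  private open = new Map<string, number>();
  private usedSeconds = new Map<string, number>();

  constructor(limits: Limits) {
    this.limits = limits;
  }

  // Call before opening a connection; false means refuse it.
  tryOpen(userId: string): boolean {
    const open = this.open.get(userId) ?? 0;
    const used = this.usedSeconds.get(userId) ?? 0;
    if (open >= this.limits.maxConcurrent || used >= this.limits.maxTotalSeconds) {
      return false;
    }
    this.open.set(userId, open + 1);
    return true;
  }

  // Call as audio is relayed; false means the budget is spent and the
  // caller should close the connection.
  recordAudio(userId: string, seconds: number): boolean {
    const used = (this.usedSeconds.get(userId) ?? 0) + seconds;
    this.usedSeconds.set(userId, used);
    return used < this.limits.maxTotalSeconds;
  }

  // Call when a connection closes, to free its concurrency slot.
  close(userId: string): void {
    const open = this.open.get(userId) ?? 0;
    this.open.set(userId, Math.max(0, open - 1));
  }
}
```

Tracking usage per user rather than per key means that requesting fresh temp keys does not reset the budget.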
We already have a concurrency limit for a project
My suggestion for your particular needs is a proxy to be honest
It addresses all issues here
Effectively, you need a flow that requires the requests to Deepgram themselves to be authorized by your own auth, not our auth.
You have your users behind your own auth, but once logged in you have a path that will provide an API key. Users can then use that to make unrestricted requests to Deepgram, in effect making them our end user.
A proxy is going to act as the websocket you would connect to (and if you're using the JS SDK, it supports this, and should you build your proxy correctly then no client-side changes will be required). You'd have your client and your endpoint requiring your users to be authorized with you, and a hidden long-term API key responsible for the connection between the proxy and Deepgram.
The only solution for authorized client-side requests to Deepgram is for Deepgram to become an auth provider and while I myself have asked for us to provide that as part of our product, it is only on our long-term roadmap
The solution I am after is you (our users) being able to use an API key to request a token from a new endpoint on our service. This token will include details about the end-user and be identifiable on our service as a client-side key. We can then compare details about the use vs the token (like user agent, IP, etc), not completely removing, but significantly reducing the vectors involved. We can also make those keys require a reconnection more regularly on the websocket.
It's on our plan, just not in the immediate near-term
Yeah I'm looking into building a serverless websocket proxy now. Next.js doesn't allow this unfortunately. Streaming latency will be worse too 😦
That is why this feature exists. We have customers who have knowingly accepted the vectors for the lowest possible latency - this is a common situation. Trading off security for user experience is a very old story.
That is why any client-side APIs exist
You can use the page router to build a Next.js based proxy - You can use the app router and page router simultaneously, side by side. App router routes take precedence, if you happen to have a clash.
Pray they actually fix this massive gap in functionality of their Express abstraction 😦
Ah right, I'll try page router too, but first https://github.com/apteryxxyz/next-ws
Both page and app router approaches only work for servers though. Serverless websocket proxy is not possible, at least not with Next.js.
Depends where it is hosted I think
If you choose to make the entire app serverless with Vercel or Netlify (e.g.) then you'll have trouble, yes
You could potentially make an HTTP proxy act as the proxy for the websocket, but that would be rough to build
OK, thanks for your help. I really think you should make it possible to impose API key usage limits though. Adding a proxy server will be a (costly) pain, now I will have to worry about scaling etc. Transcription latency will suffer too 😦 The ideas you've mentioned around authentication do not seem to address the issue here, because bad actors can be authenticated just fine, especially free trial ones.
Bad actors can do this with all client-side keys 🤔 the solution is limited scopes and limited time, yes. We do one but not the other - just yet.
I am sharing this entire conversation with my manager and our product team. We have a long-running discussion about improving client auth.
Thank you!
@teal burrow turns out I was right that both request IDs are being abused in an attack. The second one, 39037176-b2f6-4d43-9af5-77f18c01aa05, finished only after 65 hours, generating a $22 charge...
The first request ID, i.e. 1e53f1d1-872b-44b9-be9c-b87317931930, is still in the Unknown state...
Looks like you have a new record...
See #1253453019815612428 message
I see. I'll let David take it from here
Hello, you've mentioned that the JS SDK supports proxying, and potentially no client-side changes would be needed. However, the proxy example in https://deepgram.com/learn/protecting-api-key creates a new WebSocket on the client, rather than using the SDK. So it looks like I can no longer use the JS SDK on the client, right? I guess it's fine, I just want to make sure I'm not missing anything. To use the JS SDK on the client, you have to first createClient, which requires an API key, so that makes no sense with the proxy?
Hmm, the server proxy code in https://deepgram.com/learn/protecting-api-key creates a single DG connection on startup and waits for WebSocket connections from the clients, sending all messages to the same single DG connection. But that means the DG connection would be open 24 hours a day, and how would it tell which transcriptReceived is for which client? I think the correct approach would be to open a new DG connection for each WS connection, and close the DG connection when the WS connection closes.
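The per-client approach described above could be sketched roughly as follows. No real Deepgram SDK calls are used: `UpstreamConnection`, `OpenUpstream`, and `PerClientRelay` are hypothetical, and the factory passed to the relay stands in for whatever code opens a live Deepgram connection with the server-side key.

```typescript
// Hypothetical stand-in for one live connection to Deepgram.
interface UpstreamConnection {
  send(audio: Uint8Array): void;
  close(): void;
}

type TranscriptHandler = (text: string) => void;

// Hypothetical factory: opens one upstream connection and reports its
// transcripts to the given handler.
type OpenUpstream = (onTranscript: TranscriptHandler) => UpstreamConnection;

class PerClientRelay {
  private openUpstream: OpenUpstream;
  private upstreams = new Map<string, UpstreamConnection>();

  constructor(openUpstream: OpenUpstream) {
    this.openUpstream = openUpstream;
  }

  // One upstream per client, so transcripts can only reach their own client.
  onClientConnected(clientId: string, sendToClient: TranscriptHandler): void {
    this.upstreams.set(clientId, this.openUpstream(sendToClient));
  }

  onClientAudio(clientId: string, audio: Uint8Array): void {
    this.upstreams.get(clientId)?.send(audio);
  }

  // Closing the client closes its upstream, so nothing stays open 24 hours a day.
  onClientClosed(clientId: string): void {
    const upstream = this.upstreams.get(clientId);
    if (upstream !== undefined) {
      upstream.close();
      this.upstreams.delete(clientId);
    }
  }

  get openCount(): number {
    return this.upstreams.size;
  }
}
```

Because each upstream is created with its own client's handler, no routing table is needed to decide which transcript belongs to which client.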
Yeah, it's working!!!