#SQL Translate but with 100s of table schema

26 messages · Page 1 of 1 (latest)

open galleon
#

Hey people I have a problem with the SQL translate example.
Does anyone know any technique for e.g: I don't need to pass the schema everytime. It's just the tokens are too much when there's 100s of tables.

If fine tuning is the way , then how can I make the model memorize my table schema ? I've tried but it doesn't work because its not how we are supposed to train right ?. Then what is the perfect way to let the model know my table schema and business process ?

fading river
#

What are you hoping to accomplish? If it's structured data that maybe you want to retrieve with regular english why not use embeddings?

open galleon
#

@fading river . I'm trying to convert natural language to SQL queries which davinci-003 or codex does perfectly. But there's a twist. It requires me to pass the table schema everytime in the request which is not possible when I have 100s of tables right ?. Also what do you mean by embeddings can you explain ?

fading river
#

In your case it may be best to separate each table and only provide what's needed programmatically .Do you explicitly call for the table? What would an example prompt look like?

#

You could also fine tune it. Feed it some examples of good prompts, try some that failed, keep going. It'd be much easier than embeddings

#

Try some as in prompt and see if it passes this time

open galleon
#

Here is an example request but it realtime there would be more tables with huge number of columns

SQL Server tables, with their properties:

Order_Activity(id, ordernumber, machine, material)

Machine(id, description, production_state)

Material(id, description)

I would like to get all the orders that has been scheduled in machine 'M1' in the last 3 days

SELECT

fading river
#

Ok well I guess you could go two ways

#

If you are always going to call for the table. That's pretty easy. You can just programmatically retrieve its reference

open galleon
#

Yeah I've tried fine tuning with some examples. but I cannot get it to memorize the column names it always gets a random column name or gets the query wrong. it is worse compared to davinci-003

fading river
#

Yeah I've noticed the same. Ironically it's not very good with numbers

#

The example you showed me seems ideal though. You can easily accomplish that with a simple search function in any programming language

#

Could probably even do it in a spreadsheet formula

open galleon
fading river
#

Just the table

open galleon
#

Ok I will try that. Thanks for your time and help

fading river
#

No problem. you may not even need to fine tune if you're saying it already works well with the tables you provide

open galleon
fading river
#

In that case it'd be worth fixing the bad answers and refeeding them. Unless some prompt magic can do the trick

open galleon
#

yeah I will try that . I also hope there is some prompt magic omgsocute

unkempt scroll
#

Have you tried Babbage, or another code completion model?

open galleon
#

@unkempt scroll yes but other than davinci -003 or codex, most of them doesn't get the query correctly and accurately

unkempt scroll
#

Or get gpt to fix your table columns and import your data to a new table

open galleon
fading river