#Vector Data Type
That's the right approach.
RAG Chatbot was a great start for me. https://sdk.vercel.ai/docs/guides/rag-chatbot
Sorry, I should have been more specific. I've created many of these systems before, but I am trying to specifically create a vector embeddings field in Payload
Like a way to store vector embeddings
Understood.
Generally it seems possible. There is a "schedule a demo" option on the enterprise AI framework page:
https://payloadcms.com/enterprise/ai-framework?
I would add a table or column that is not managed by Payload, maybe in beforeSchemaInit:
https://payloadcms.com/docs/database/postgres#beforeschemainit. Have you already tried any ideas? It would be great to have a solution.
You can have a vector column with beforeSchemaInit!
Import vector like this:
import { vector } from '@payloadcms/db-postgres/drizzle/pg-core'
Also, if you pass extensions: ['vector'] (the name pgvector registers under in Postgres), Payload will automatically create the extension for you. You can omit it if you are certain the extension is already created in the database some other way.
That being said, you won't have a native way to manage that field from the admin panel.
This is great! I don't need a way to manage it from the admin panel, but could I use the local API to add them?
Hm, doesn't seem super possible
My use case being this:
Whenever a document is changed, I'd like to create several document-chunks, where the document chunks have a vector embedding
const afterChangeHook: CollectionAfterChangeHook<Document> = async ({ doc, req }) => {
  if (!doc.url) return

  // Drop the chunks created for the previous version of this document
  await req.payload.delete({
    collection: 'document-chunk',
    where: {
      document: {
        equals: doc.id,
      },
    },
    req,
  })

  const response = await fetch('http://localhost:3000' + doc.url)
  const buffer = await response.arrayBuffer()
  const pdfLoader = new WebPDFLoader(new Blob([buffer]))
  const pdf = await pdfLoader.load()

  // Await every page; a bare pdf.map(async ...) would leave the promises floating
  await Promise.all(
    pdf.map(async (page) => {
      const embedding = await openai.embeddings.create({
        model: 'text-embedding-3-small',
        input: [page.pageContent],
        encoding_format: 'float',
      })
      await req.payload.create({
        collection: 'document-chunk',
        data: {
          chunk: page.pageContent,
          embedding: embedding.data[0].embedding, // <--- Not possible: embedding is currently typed as a string
          document: doc.id,
        },
        req,
      })
    }),
  )
}
I just use a JSON array to store embeddings.
How would you insert them? Is there a way to let Payload know of the new types introduced by changes in beforeSchemaInit?
import type { CollectionConfig } from 'payload'

export const Embeddings: CollectionConfig = {
  slug: 'embeddings',
  fields: [
    {
      name: 'model',
      type: 'text',
      required: true,
      index: true,
    },
    {
      name: 'vector',
      type: 'json',
      required: true,
    },
    // Rails-style polymorphic relationship
    {
      name: 'embeddableId',
      type: 'number',
      required: true,
      index: true,
    },
    {
      name: 'embeddableType',
      type: 'text',
      required: true,
      index: true,
    },
  ],
}
I save the embeddings returned by the Ollama API.
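A json field is opaque to pgvector's operators unless it is cast, so one option with this layout is to rank in application code. A rough sketch, assuming every stored vector comes from the same model and has the same length. cosineSimilarity, topK and the StoredEmbedding type are hypothetical names for illustration, not part of Payload:

```typescript
// Shape of a row from the `embeddings` collection above (only the fields used here).
type StoredEmbedding = { embeddableId: number; embeddableType: string; vector: number[] }

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero-length vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Return the k stored embeddings most similar to the query, best match first.
function topK(query: number[], docs: StoredEmbedding[], k: number): StoredEmbedding[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ doc }) => doc)
}
```

This scans every row on each query, so it is only reasonable for small collections. Past that, a real vector column with an index is likely the better fit.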
@brazen coral The Local API uses the Payload config, but Drizzle uses payload-generated-schema, so fall back to Drizzle for this.
I haven't tried it, but here is the documentation for payload-generated-schema: https://payloadcms.com/docs/database/postgres#note-for-generated-schema
and for Drizzle access: https://payloadcms.com/docs/database/postgres#access-to-drizzle .
BTW: I like the progress here.
Interesting! Will try with the drizzle access
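For the Drizzle route, one small piece that can be written without touching Payload is serializing the embedding into pgvector's text format ("[0.1,0.2,...]"). A minimal sketch: toVectorLiteral is a hypothetical helper name, and the default of 1536 dimensions is an assumption based on text-embedding-3-small. The commented usage assumes the sql tag re-exported from '@payloadcms/db-postgres/drizzle' and a document_chunks table with an embedding column, neither of which has been verified against your schema:

```typescript
// Hypothetical helper: serialize an embedding into pgvector's text format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[], dimensions = 1536): string {
  if (embedding.length !== dimensions) {
    throw new Error(`Expected ${dimensions} dimensions, got ${embedding.length}`)
  }
  if (!embedding.every((value) => Number.isFinite(value))) {
    throw new Error('Embedding contains a non-finite value')
  }
  return `[${embedding.join(',')}]`
}

// Possible usage inside the afterChange hook (assumed table and column names):
//
//   import { sql } from '@payloadcms/db-postgres/drizzle'
//
//   await req.payload.db.drizzle.execute(
//     sql`UPDATE document_chunks
//         SET embedding = ${toVectorLiteral(vec)}::vector
//         WHERE id = ${chunkId}`,
//   )
```

The idea would be to create the chunk through the Local API without the embedding, then set the vector column with a raw statement like the one in the comment.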
beforeSchemaInit straight up doesn't work for me. It doesn't change anything in the database.
db: postgresAdapter({
  pool: {
    connectionString: process.env.DATABASE_URI || '',
  },
  extensions: ['vector'],
  beforeSchemaInit: [
    ({ schema, adapter }) => {
      return {
        ...schema,
        tables: {
          ...schema.tables,
          document_chunks: {
            ...schema.tables.document_chunks,
            embedding: vector('embedding', { dimensions: 1536 }),
          },
        },
      }
    },
  ],
}),
Had to use afterSchemaInit, things are working now
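For anyone landing here later, a minimal sketch of what the afterSchemaInit variant might look like, using the extendTable helper described in the Payload Postgres docs. The document_chunks table key and the 1536 dimensions are carried over from the snippet above and are assumptions about your schema, not something confirmed in this thread:

```typescript
import { postgresAdapter } from '@payloadcms/db-postgres'
import { vector } from '@payloadcms/db-postgres/drizzle/pg-core'

// Sketch only: pass this as `db` in buildConfig({ ... }).
export const db = postgresAdapter({
  pool: {
    connectionString: process.env.DATABASE_URI || '',
  },
  // pgvector registers itself in Postgres under the name "vector"
  extensions: ['vector'],
  afterSchemaInit: [
    ({ schema, extendTable }) => {
      // Assumption: `document_chunks` is the key Payload generated for the chunk collection
      extendTable({
        table: schema.tables.document_chunks,
        columns: {
          embedding: vector('embedding', { dimensions: 1536 }),
        },
      })
      return schema
    },
  ],
})
```

The column still isn't part of the Payload config, so the Local API and admin panel won't see it. Reads and writes to it would go through Drizzle.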
I do it like this but I'm getting errors. What am I missing?