#question


flint osprey
#

I am making an app that requests a pdf from a website and parses it. Then, I would want to make ChatGPT summarize all the information of the pdf and return it in a structured JSON that is always the same. What would be the best way to do it?

full elmBOT
#

Hi @flint osprey. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

To achieve your goal, you can follow these steps:

  1. Request and Parse the PDF: You can use a function to fetch the content of the PDF file from the website. If the file is not in PDF format, you can convert it to PDF for text extraction. Here's an example of how you can do this in Node.js, as shown in the OpenAI Cookbook (https://cookbook.openai.com/examples/chatgpt/gpt_actions_library/gpt_action_sharepoint_text#retrieving-content-from-o365--sharepoint-items):
const path = require('path');
const pdfParse = require('pdf-parse');

// Fetches a drive item through the Microsoft Graph client and returns its text content.
const getDriveItemContent = async (client, driveId, itemId, name) => {
    try {
        const fileType = path.extname(name).toLowerCase();
        const allowedFileTypes = ['.pdf', '.doc', '.docx', '.odp', '.ods', '.odt', '.pot', '.potm', '.potx', '.pps', '.ppsx', '.ppsxm', '.ppt', '.pptm', '.pptx', '.rtf'];
        // PDF, TXT and CSV are fetched as-is; every other type is converted with ?format=pdf
        const filePath = `/drives/${driveId}/items/${itemId}/content` + ((fileType === '.pdf' || fileType === '.txt' || fileType === '.csv') ? '' : '?format=pdf');
        if (allowedFileTypes.includes(fileType)) {
            // Read the download stream into a single buffer, then extract the text
            const response = await client.api(filePath).getStream();
            const chunks = [];
            for await (const chunk of response) {
                chunks.push(chunk);
            }
            const buffer = Buffer.concat(chunks);
            const pdfContents = await pdfParse(buffer);
            return pdfContents.text;
        } else if (fileType === '.txt') {
            // If the type is txt, it does not need to create a stream and instead just grabs the content
        }
    } catch (error) {
        console.error(error);
    }
};
#
  2. Use ChatGPT to Summarize the Text: Once you have the text from the PDF, you can use ChatGPT to summarize it. According to the OpenAI Help Article (https://help.openai.com/en/articles/9260256-chatgpt-capabilities-overview#h_56a5765450), ChatGPT can understand the contents of a document and support tasks like summarization.

  3. Structure the Summary in a JSON: After getting the summary from ChatGPT, you can structure it in a JSON format. Here's an example of how you can save the result to a JSON file, as shown in the OpenAI Cookbook (https://cookbook.openai.com/examples/parse_pdf_docs_for_rag#processing-all-documents):

import json

json_path = "data/parsed_pdf_docs.json"

# Write the content held in docs to the JSON file
with open(json_path, 'w') as f:
    json.dump(docs, f)

Please note that you need to replace docs with your summarized text.
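
To keep the JSON structure identical on every run, one option is to check the model's reply against a fixed list of fields before saving it. The sketch below is a minimal illustration and not the cookbook's method: the field names (title, summary, key_points) and the helper validate_summary are hypothetical, and the reply string stands in for whatever text the model returns.

```python
import json

# Hypothetical fixed structure: field name -> expected Python type
REQUIRED_FIELDS = {"title": str, "summary": str, "key_points": list}

def validate_summary(raw_reply):
    """Parse the model's reply and check it has exactly the expected fields."""
    data = json.loads(raw_reply)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    if set(data) != set(REQUIRED_FIELDS):
        raise ValueError(f"unexpected fields: {sorted(data)}")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data[field], expected_type):
            raise ValueError(f"{field} should be of type {expected_type.__name__}")
    return data

# Stand-in for a model reply (assumption: the model was asked to answer in JSON)
reply = '{"title": "Annual report", "summary": "Revenue grew.", "key_points": ["growth"]}'
docs = validate_summary(reply)
```

If validation fails, the reply can be rejected or the request retried instead of writing an inconsistent record to the file.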

Remember to handle the moderation of your AI applications as suggested in the OpenAI Platform Docs (https://platform.openai.com/docs/guides/moderation).

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@flint osprey kapa.ai is still learning and improving, please let us know how it did by reacting below

flint osprey
#

can I do it with fine-tuning? @full elm

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Yes, you can use fine-tuning to customize the model for your specific task. Fine-tuning allows you to adapt the model to specific tasks by training it with your own prompt-completion pairs. This can help in improving the model's performance in tasks such as setting the style, tone, format, or other qualitative aspects, improving reliability at producing a desired output, correcting failures to follow complex prompts, handling many edge cases in specific ways, and performing a new skill or task that’s hard to articulate in a prompt (source (https://platform.openai.com/docs/guides/fine-tuning)).

However, it's important to note that fine-tuning is not a recommended way to get the model to absorb specific knowledge (source (https://community.openai.com/t/827772)). It's primarily designed to get a model to behave in a certain way, e.g., adopt a certain style in output or approach a task in a specific way.

Here are the steps involved in fine-tuning:

  1. Prepare and upload training data
  2. Train a new fine-tuned model
  3. Evaluate results and go back to step 1 if needed
  4. Use your fine-tuned model

You can learn more about how fine-tuned model training and usage are billed on the pricing page (https://openai.com/api/pricing).

Remember that fine-tuning GPT-4o (text only) and GPT-4 is available through an experimental access program. If interested, you can visit your fine-tuning dashboard (https://platform.openai.com/finetune), start creating a fine-tuned model, and select ‘gpt-4o-2024-05-13’ or ‘gpt-4-0613’ from the base model drop-down (source (https://help.openai.com/en/articles/7127982-can-i-fine-tune-gpt-4o-or-gpt-4)).

#

Please note that fine-tuning requires additional costs and effort, and it's a fit for customers who want to maximize an established GPT-4o or GPT-4 use case.


flint osprey
#

@full elm is it okay if the content of the user role in a example for fine-tuning has a lot of text?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Yes, it's okay if the content of the 'user' role in a fine-tuning example has a lot of text. However, you should be aware of the token limit of the model you're using. For example, gpt-3.5-turbo has a maximum text length of 4,096 tokens, gpt-4 has a maximum of 8,192 tokens, and gpt-4-32k can handle up to 32,768 tokens (source (https://cookbook.openai.com/examples/question_answering_using_embeddings#why-search-is-better-than-fine-tuning)).

Also, keep in mind that the system message, user message, and assistant message all contribute to the total token count. If the system message is long and repeated across many examples, it can significantly increase costs (source (https://community.openai.com/t/732907)).

In terms of fine-tuning, it's generally more effective to have a smaller amount of high-quality data than a larger amount of low-quality data (source (https://community.openai.com/t/701683)). So, while it's okay to have a lot of text in the 'user' role, it's important to ensure that the text is of high quality and relevant to the task you're fine-tuning the model for.


flint osprey
#

@full elm The job failed due to an invalid training file. Expected file to have JSONL format, where every line is a valid JSON dictionary. Line 1 is not a dictionary.

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

The error message you're seeing indicates that the first line of your training file is not a valid JSON dictionary. In a JSONL (JSON Lines) file, each line should be a separate JSON object.

Here's an example of what a line in a JSONL file might look like for fine-tuning a chat model (source (https://platform.openai.com/docs/guides/fine-tuning/example-format)):

{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Who won the world series in 2020?"}, {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."}]}

Each line in your JSONL file should follow this format. If your first line is not a valid JSON dictionary, you'll need to correct it before you can proceed with fine-tuning.

You can also validate your data using a Python script provided in the OpenAI Cookbook (https://cookbook.openai.com/examples/third_party/gpt_finetuning_with_wandb#save-the-data-to-weigths--biases). This script checks the formatting of your JSONL file and reports any errors.
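
As a rough illustration of what such a check does (a simplified sketch, not the cookbook script itself), the function below reports lines that are empty, not valid JSON, or not a JSON object. The name find_invalid_lines is hypothetical.

```python
import json

def find_invalid_lines(lines):
    """Return (line_number, problem) pairs for lines that break the JSONL rules."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"not valid JSON: {exc.msg}"))
            continue
        # Each line must decode to a dictionary (a JSON object)
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
    return problems

sample = ['{"messages": []}', '[1, 2]', '{']
problems = find_invalid_lines(sample)
```

To check a real file, pass the open file object instead of the sample list, since iterating over a file yields its lines. A file that was saved as one JSON array, or pretty-printed across several lines, will fail on line 1.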

If you're still having trouble, you might find it helpful to post your issue on the OpenAI Community forum (https://community.openai.com/c/fine-tuning/9). The community there may be able to provide further assistance.


flint osprey
#

@full elm so there can not be multiple lines? how do i represent line breaks?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

In a JSONL file, each line should be a separate JSON object. If you want to represent line breaks within the text content of a JSON object, you can use the newline character (\n).

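For example, if you have a conversation where the user says two sentences, and you want to represent this as two separate lines in the 'content' field of the 'user' role, you could do it like this (the object is expanded across several lines here for readability only; in the actual JSONL file the whole object must sit on a single line):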

{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Who won the world series in 2020?\nTell me more about it."
    },
    {
      "role": "assistant",
      "content": "The Los Angeles Dodgers won the World Series in 2020."
    }
  ]
}

In this example, the newline character (\n) is used to represent a line break within the 'content' field of the 'user' role (source (https://community.openai.com/t/678859)).

Remember, each line in the JSONL file should be a valid JSON object. If you're seeing an error message about the first line of your file not being a dictionary, it's likely that the issue is with the overall structure of the line, rather than with the way you're representing line breaks within the text content.
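
One way to avoid writing the escapes by hand is to let a JSON serializer produce each line. The sketch below assumes the examples are built as Python dictionaries; the helper name to_jsonl is hypothetical. json.dumps emits one line per object and escapes line breaks inside strings.

```python
import json

def to_jsonl(examples):
    """Serialize each example as one line; line breaks inside strings are escaped."""
    return "\n".join(json.dumps(example) for example in examples) + "\n"

example = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?\nTell me more about it."},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    ]
}

jsonl_text = to_jsonl([example, example])
```

The resulting text has exactly one line per example, and reading a line back with json.loads restores the original line break in the user message.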


flint osprey
#

@full elm what's training loss when fine tuning?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Training loss is a measure of how well the model is performing on the training data during the fine-tuning process. It's a numerical value that represents the difference between the model's predictions and the actual values in the training data. The lower the training loss, the better the model is at predicting the outcomes for the given inputs in the training set.

During the fine-tuning process, the model learns by adjusting its internal parameters to minimize this loss. This is typically done using optimization algorithms like stochastic gradient descent.

In the context of fine-tuning a language model like GPT-3.5, the training loss could be calculated using a measure like cross-entropy loss, which quantifies the difference between the model's predicted probability distribution and the actual distribution.
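
To make the idea concrete, here is a small sketch of cross-entropy loss over next-token predictions. It uses the textbook definition; the sources above do not spell out exactly how OpenAI computes the reported value, and the probabilities and function names are made up for illustration.

```python
import math

def token_cross_entropy(predicted_probs, correct_token):
    """Loss for one token: negative log of the probability given to the correct token."""
    return -math.log(predicted_probs[correct_token])

def mean_training_loss(predictions, targets):
    """Average the per-token losses over a batch of predictions."""
    losses = [token_cross_entropy(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)

# Hypothetical model outputs over a three-token vocabulary
confident = [{"a": 0.98, "b": 0.01, "c": 0.01}]
unsure = [{"a": 0.40, "b": 0.30, "c": 0.30}]

low_loss = mean_training_loss(confident, ["a"])
high_loss = mean_training_loss(unsure, ["a"])
```

The confident prediction gives a loss close to zero, while the unsure one gives a noticeably higher loss, which is why a falling training loss indicates the model is assigning more probability to the expected tokens.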

For example, in the training logs provided by OpenAI, you might see entries like this (source (https://cookbook.openai.com/examples/third_party/gpt_finetuning_with_wandb#time-to-train-the-model-part-3)):

{
  "object": "fine_tuning.job.event",
  "id": "ftevent-3sRpTRSjK3TfFRZY88HEASpX",
  "created_at": 1693389015,
  "level": "info",
  "message": "Step 89/90: training loss=0.00",
  "data": {
    "step": 89,
    "train_loss": 2.5828578600339824e-06,
    "train_mean_token_accuracy": 1.0
  },
  "type": "metrics"
}

In this example, "train_loss": 2.5828578600339824e-06 indicates the training loss at step 89. The goal is for this value to decrease over time as the model gets better at predicting the outcomes in the training data.


flint osprey
#

@full elm what is validation for when fine tuning?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Validation during fine-tuning is a process used to evaluate the performance of the model on a separate dataset, known as the validation set. This set is different from the training set and is not used during the training process.

The purpose of validation is to check how well the model generalizes to unseen data. It helps in detecting overfitting, which is a situation where the model performs well on the training data but poorly on unseen data. Overfitting typically happens when the model learns the noise in the training data instead of the underlying pattern.

In the context of fine-tuning a language model, after each training epoch (one pass through the entire training dataset), the model's performance is evaluated on the validation set. The loss on the validation set (validation loss) is then compared with the loss on the training set (training loss). If the training loss continues to decrease (indicating the model is learning the training data well) but the validation loss starts to increase (indicating the model is not generalizing well to unseen data), it's a sign of overfitting.

OpenAI recommends maintaining a hold-out set to use for evaluation following training to detect overfitting (source (https://platform.openai.com/docs/guides/optimizing-llm-accuracy/how-to-maximize-correctness-and-consistent-behavior-when-working-with-llms)).
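
A hold-out set can be carved out of your examples before uploading anything. The sketch below is one simple way to do it, assuming the examples are already loaded into a list; the function name split_examples and the 20% fraction are arbitrary choices for illustration.

```python
import random

def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split off a validation portion."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_valid = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]

# Placeholder examples; in practice each item would be one training conversation
examples = [{"id": i} for i in range(10)]
train_set, valid_set = split_examples(examples)
```

Each part can then be written to its own JSONL file, one for training and one for validation, so the validation examples are never seen during training.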

In the fine-tuning process, you can specify a validation file when creating a fine-tuning job. Here's an example of how to do this using the OpenAI API (source (https://cookbook.openai.com/examples/fine-tuned_classification#fine-tuning)):

from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# Upload the training and validation files
train_file = client.files.create(file=open("sport2_prepared_train.jsonl", "rb"), purpose="fine-tune")
valid_file = client.files.create(file=open("sport2_prepared_valid.jsonl", "rb"), purpose="fine-tune")

# Start the job, passing the validation file alongside the training file
fine_tuning_job = client.fine_tuning.jobs.create(training_file=train_file.id, validation_file=valid_file.id, model="babbage-002")

In this example, valid_file is the validation set used for evaluating the model's performance during fine-tuning.

flint osprey
#

@full elm are system instructions necessary for fine tuned models?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

A system message is not strictly required by the API, but if your training examples included one, you should send the same system message when you call the fine-tuned model. According to a community thread, the model may not behave as trained without it (source (https://community.openai.com/t/732907)).

OpenAI's fine-tuning guide also recommends including the instructions that worked best before fine-tuning in every training example. This helps in achieving the best and most general results, especially if you have relatively few training examples (source (https://platform.openai.com/docs/guides/fine-tuning)).

If you want to shorten the instructions or prompts that are repeated in every example to save costs, keep in mind that the model will likely behave as if those instructions were included, and it may be hard to get the model to ignore those "baked-in" instructions at inference time (source (https://platform.openai.com/docs/guides/fine-tuning)).

In the context of the Chat Completions API, system messages can be used to instruct the model how to behave in a conversation, such as adopting a specific tone or following particular guidelines (source (https://help.openai.com/en/articles/8234522-chat-completions-api-system-message-vs-custom-instructions-in-ui)).


flint osprey
#

@full elm i want gpt to return a json array and validate it with zod, but it just lets me use schema if the type is object

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Yes, you're correct. When using Zod with OpenAI's API, the top-level type for the schema must be an object. This is due to the way the API interprets and validates the schema. If you try to use a top-level type other than 'object', you may encounter errors or unexpected behavior (source (https://github.com/openai/openai-node/issues/995)).

If you want to return a JSON array and validate it with Zod, you would need to wrap the array within an object. Here's an example:

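const { z } = require('zod');

// Wrap the array in an object so that the top-level type is 'object'
const MySchema = z.object({
  myArray: z.array(z.string()),
});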

In this example, myArray is an array of strings, and it's wrapped within an object. You can then use MySchema to validate a JSON object that includes an array of strings.

Remember, when using Zod with OpenAI's API, always ensure that your schema is correctly structured and that it aligns with the expected output of your model. If you're encountering issues, you might find it helpful to post your issue on the OpenAI Community forum (https://community.openai.com/c/fine-tuning/9) or OpenAI's GitHub (https://github.com/openai/openai-node/issues). The community there may be able to provide further assistance.


flint osprey
#

@full elm When using openai api in node.js i get the following error: (node:8294) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

The warning you're seeing is due to the use of the Buffer() constructor in your code or in one of the libraries you're using. This constructor has been deprecated in Node.js due to security and usability issues.

You should replace any usage of Buffer() with Buffer.from(), Buffer.alloc(), or Buffer.allocUnsafe(), depending on your use case:

  • Buffer.from(array) or Buffer.from(string) creates a new Buffer containing a copy of the provided bytes.
  • Buffer.alloc(size) creates a new Buffer of a specified size, and fills it with zeros.
  • Buffer.allocUnsafe(size) creates a new Buffer of a specified size, but does not fill it with zeros. This is faster than Buffer.alloc(size), but the resulting Buffer might contain old data that could be sensitive.

If you're seeing this warning but you're not using Buffer() directly in your code, it's likely that one of your dependencies is using it. You might need to update your dependencies to a version where this issue has been fixed.

If you're using OpenAI's Node.js client, please ensure you're using the latest version. The official package is published on npm as openai, so you can update it with:

npm install openai@latest

If the issue persists, you might want to bring it up in the OpenAI Community forum (https://community.openai.com/c/fine-tuning/9) or OpenAI's GitHub (https://github.com/openai/openai-node/issues) for further assistance.


heavy delta
#

nnn

flint osprey
#

@full elm how do I stream a json structured output?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Streaming a JSON structured output means receiving the output incrementally instead of all at once. The provided knowledge sources contain no specific information on streaming structured output, but the Chat Completions API does support streaming in general: when the request sets stream to true, the response is delivered as a sequence of chunks, each carrying a small piece (a delta) of the message content.

By default, the API returns a complete response once the model has finished generating. For example, when using the Chat Completions API, you send a series of messages and the model returns a message in response (source (https://help.openai.com/en/articles/8555517-function-calling-in-the-openai-api#h_63fd8d926e)).

With streaming enabled, the JSON arrives as partial text fragments, so an individual chunk is usually not valid JSON on its own. Accumulate the fragments in your application and parse the result once the stream ends, or use an incremental JSON parser if you need to act on partial data.

Separately, if you're processing large amounts of input data, you may need to break it into smaller pieces, send each piece to the API in its own request, and combine the responses in your application.
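
If the output does arrive in pieces, the pieces are not valid JSON until they have been joined. The sketch below shows the accumulate-then-parse pattern with simulated fragments; no API call is made, the helper name collect_streamed_json is hypothetical, and in a real streamed response the fragments would come from the content delta of each chunk.

```python
import json

def collect_streamed_json(fragments):
    """Join text fragments in arrival order and parse the result as JSON."""
    buffer = []
    for fragment in fragments:
        # Assumption: some chunks may carry no text (None or empty), so skip those
        if fragment:
            buffer.append(fragment)
    return json.loads("".join(buffer))

# Simulated fragments standing in for the text deltas of a streamed reply
simulated_fragments = [
    '{"events": [',
    '{"name": "Launch"',
    None,
    ', "date": "2024-05-01"}',
    "]}",
]

result = collect_streamed_json(simulated_fragments)
```

Parsing only after the last fragment keeps the logic simple; showing partial results to the user while the stream is still running would need an incremental parser instead.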

If you need more specific guidance, you might find it helpful to post your question on the OpenAI Community forum (https://community.openai.com/c/fine-tuning/9). The community there may be able to provide further assistance.


flint osprey
#

@full elm I have a database containing events information, such as name, date, hour, etc. in JSON format. How can I make GPT get the information and answer questions of the user? As it's not normal text, I don't know how embeddings should be.

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

To enable GPT to answer questions using your database of events information, you can use a two-step Search-Ask method. This involves searching your database for relevant information and then asking GPT the question with the retrieved information. Here's a high-level overview of the process:

  1. Search: Search your database for relevant text sections. This can be done by converting your JSON data into a format that can be processed by the model, such as plain text or a structured text format that the model can understand. You can then generate embeddings for these text sections using OpenAI's API and store them for later use.

  2. Ask: Insert the retrieved text sections into a message to GPT and ask it the question. The model can then use the provided information to generate a response.

Here's a more detailed breakdown of the process:

  1. Prepare search data (once per document):

    • Collect: Use your database of events information.
    • Chunk: Convert your JSON data into a format that can be processed by the model. This could involve converting each event into a short, mostly self-contained text section.
    • Embed: Each text section is embedded with the OpenAI API. This involves sending the text to the API and receiving an embedding in return.
    • Store: The embeddings are saved for later use. For large datasets, you might want to use a vector database to store the embeddings.
  2. Search (once per query):

    • Given a user question, generate an embedding for the query from the OpenAI API.
    • Using the embeddings, rank the text sections by relevance to the query.
  3. Ask (once per query):

    • Insert the question and the most relevant sections into a message to GPT.
    • Return GPT's answer.
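
The steps above can be sketched end to end as follows. This is an illustration under loud assumptions: toy_embed is a bag-of-words stand-in for a real embeddings API call so that the example runs offline, the event fields (name, date, hour) are hypothetical, and the final prompt would be sent to the model rather than just built.

```python
import math
from collections import Counter

def event_to_text(event):
    """Turn one JSON event record into a short, self-contained sentence."""
    return f"{event['name']} takes place on {event['date']} at {event['hour']}."

def toy_embed(text):
    """Stand-in for an embeddings API call: a simple bag-of-words count vector."""
    cleaned = text.lower().replace(".", " ").replace("?", " ")
    return Counter(cleaned.split())

def cosine_similarity(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_events(question, events, top_n=1):
    """Return the top_n events whose text is most similar to the question."""
    query_vector = toy_embed(question)
    scored = [(cosine_similarity(query_vector, toy_embed(event_to_text(e))), e) for e in events]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [event for _, event in scored[:top_n]]

def build_prompt(question, relevant_events):
    """Insert the retrieved event texts and the question into one message."""
    context = "\n".join(event_to_text(e) for e in relevant_events)
    return f"Use the events below to answer the question.\n\nEvents:\n{context}\n\nQuestion: {question}"

events = [
    {"name": "Jazz Night", "date": "2024-06-01", "hour": "20:00"},
    {"name": "Book Fair", "date": "2024-06-05", "hour": "10:00"},
]
question = "When is the Jazz Night?"
prompt = build_prompt(question, rank_events(question, events))
```

With real embeddings, the event vectors would be computed once and stored, and only the question would be embedded per query.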
#

You can find more detailed information and examples of this process in the OpenAI Cookbook's guide on Question answering using embeddings-based search (https://cookbook.openai.com/examples/question_answering_using_embeddings).

Please note that this process requires using OpenAI's API to generate embeddings, which may incur additional costs. You can find more information on the pricing page (https://openai.com/api/pricing).

Also, keep in mind that GPT models do not have access to real-time data. They can only provide information based on their training data, which ends in September 2021 for gpt-3.5-turbo and gpt-4 (source (https://cookbook.openai.com/examples/question_answering_using_embeddings#motivating-example-gpt-cannot-answer-questions-about-current-events)). Therefore, for events after this date, you'll need to provide the relevant information to the model as part of the question.


flint osprey
#

@full elm how do i convert the json data?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

To convert JSON data into a format that can be processed by GPT, you can use a programming language like Python or JavaScript that has built-in support for JSON. Here's a basic example in Python:

import json

# Sample JSON data
json_data = '''
[
    {"event_name": "Event 1", "date": "2022-01-01", "hour": "10:00"}, 
    {"event_name": "Event 2", "date": "2022-01-02", "hour": "14:00"}
]
'''

# Load JSON data
data = json.loads(json_data)

# Now you can access the data in the JSON
for event in data:
    print(f"Event Name: {event['event_name']}, Date: {event['date']}, Hour: {event['hour']}")

In this example, json.loads() is used to convert the JSON string into a Python object (a list of dictionaries in this case). You can then iterate over the list and access the properties of each event.

If you want to convert JSON data into a CSV format, you can use Python's built-in csv library. Here's an example from the OpenAI Cookbook (source (https://cookbook.openai.com/examples/chatgpt/gpt_actions_library/gpt_action_sql_database#5-middleware-converts-records-into-base64-encoded-csv-file)):

import json
import csv

# Sample JSON array of objects
json_data = '''
[
    {"account_id": 1, "number_of_users": 10, "total_revenue": 43803.96, "revenue_per_user": 4380.40}, 
    {"account_id": 2, "number_of_users": 12, "total_revenue": 77814.84, "revenue_per_user": 6484.57}
]
'''

# Load JSON data
data = json.loads(json_data)

# Define the CSV file name
csv_file = 'output.csv'

# Write JSON data to CSV
with open(csv_file, 'w', newline='') as csvfile:
    # Create a CSV writer object
    csvwriter = csv.writer(csvfile)
    
    # Write the header (keys of the first dictionary)
    header = data[0].keys()
    csvwriter.writerow(header)
    
    # Write the data rows
    for row in data:
        csvwriter.writerow(row.values())

print(f"JSON data has been written to {csv_file}")


In this example, json.loads() is used to convert the JSON string into a Python object. Then, a CSV file is created and the data from the JSON is written to the CSV file. Each row in the CSV file corresponds to one account record in the JSON data.

flint osprey
#

@full elm i want that example in javascript

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Sure, here's how you can convert JSON data into a readable format in JavaScript:

// Sample JSON data
let json_data = `
[
    {"event_name": "Event 1", "date": "2022-01-01", "hour": "10:00"}, 
    {"event_name": "Event 2", "date": "2022-01-02", "hour": "14:00"}
]
`;

// Parse JSON data
let data = JSON.parse(json_data);

// Now you can access the data in the JSON
data.forEach(event => {
    console.log(`Event Name: ${event.event_name}, Date: ${event.date}, Hour: ${event.hour}`);
});

In this example, JSON.parse() is used to convert the JSON string into a JavaScript object (an array of objects in this case). You can then iterate over the array and access the properties of each event.

If you want to convert JSON data into a CSV format, you can use a library like json2csv in Node.js. Here's an example:

const { Parser } = require('json2csv');

// Sample JSON data
let json_data = [
    {"event_name": "Event 1", "date": "2022-01-01", "hour": "10:00"}, 
    {"event_name": "Event 2", "date": "2022-01-02", "hour": "14:00"}
];

// Define the fields for the CSV
const fields = ['event_name', 'date', 'hour'];

// Create a new json2csv parser
const json2csvParser = new Parser({ fields });

// Convert JSON data to CSV
const csv = json2csvParser.parse(json_data);

console.log(csv);

In this example, the json2csv library is used to convert the JSON data into a CSV format. Each row in the CSV corresponds to an event in the JSON data. Please note that you'll need to install the json2csv library using npm:

npm install json2csv --save


flint osprey
#

@full elm is csv a good option for embeddings?

full elmBOT
#

Let me check the knowledge base for you. One moment! :books:

#

CSV can be used to store embeddings, especially for small to medium-sized datasets. It's a simple and widely supported format that allows you to easily save and load your data, and the provided knowledge sources include examples that store embeddings this way.

However, for larger datasets, a vector database is recommended, including in the provided knowledge sources. Vector databases are designed to handle high-dimensional data like embeddings and can provide more efficient storage and retrieval compared to CSV files. They also often include built-in functionality for nearest neighbor search, which can be useful for finding the most similar embeddings.

#

In conclusion, whether CSV is a good option for storing embeddings depends on the size of your dataset and your specific use case. For small to medium-sized datasets and for prototyping, CSV can be a good option. For larger datasets or production workflows, a vector database is likely a better choice.
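
For the small-dataset case, a minimal sketch of storing embeddings in CSV might look like this. The column names and helper functions are hypothetical, the vectors are made-up numbers, and each vector is stored as a JSON string in a single column so that it survives the round trip.

```python
import csv
import io
import json

def write_embeddings(rows, fileobj):
    """Write (text, vector) pairs as CSV, encoding each vector as a JSON string."""
    writer = csv.writer(fileobj)
    writer.writerow(["text", "embedding"])
    for text, vector in rows:
        writer.writerow([text, json.dumps(vector)])

def read_embeddings(fileobj):
    """Read the CSV back into (text, vector) pairs."""
    reader = csv.DictReader(fileobj)
    return [(row["text"], json.loads(row["embedding"])) for row in reader]

# Made-up vectors; real embeddings have many more dimensions
rows = [("Jazz Night on 2024-06-01", [0.1, 0.2, 0.3]), ("Book Fair on 2024-06-05", [0.4, 0.5, 0.6])]

buffer = io.StringIO(newline="")  # in-memory stand-in for a file opened with newline=""
write_embeddings(rows, buffer)
buffer.seek(0)
loaded = read_embeddings(buffer)
```

Loading the whole file and comparing every vector works for a few thousand rows; beyond that, the nearest neighbor search built into a vector database becomes the more practical choice.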
