Hello, I'm an AI student working on a project for my university: an AI chatbot designed to answer any questions related to the university. This chatbot aims to assist future students interested in joining the institution. Throughout my project, I encountered several difficulties.
I use LangChain with GPT-3.5 as the LLM, Pinecone to store and manage the embedded data, and Streamlit for the user interface. When a user submits a query, I embed it with the same embedding model used for the original data and compare it against the vectors stored in Pinecone. However, retrieval accuracy is a problem in two cases: when several chunks are similar and share the same keywords, and when the answer is scattered across different sections, for example a question asking for all of the university's programs.
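To make the retrieval step concrete, here is a minimal sketch of the idea in plain Python. It is a simplified stand-in, not my actual code: the three-dimensional toy vectors, the chunk names, and the `top_k` helper are all hypothetical, and the real project uses an embedding model and Pinecone rather than an in-memory dictionary.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # avoid dividing by zero for empty/zero vectors
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Return the ids of the k stored chunks most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:k]]

# Hypothetical toy "index": chunk id -> embedding (made-up numbers).
TOY_INDEX = {
    "cs_program": [0.9, 0.1, 0.0],
    "math_program": [0.8, 0.2, 0.1],
    "admissions_deadlines": [0.1, 0.9, 0.2],
    "campus_housing": [0.0, 0.2, 0.9],
}

# A made-up query vector that points toward the "program" direction.
query = [1.0, 0.0, 0.0]
print(top_k(query, TOY_INDEX, k=2))
```

With a small `k`, only the few closest chunks are returned, so a question that needs every program section can never see all of them. Similar chunks that share keywords also end up with nearly identical scores, which is where the wrong one sometimes wins.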
I would greatly appreciate any suggestions for solving these issues. Thank you.