Given the unstructured nature of conversational data, I'm unsure which embedding strategy would be most effective. The options I'm considering are:
Sentence-Level Embeddings: Embed each sentence separately to capture fine-grained detail. Comparisons would then be made sentence against sentence, which means an embedding may miss the broader context found in the surrounding paragraph or document.
Conversation-Level Embeddings: Concatenate the sentences of a conversation and embed the result as a single unit, like a traditional document, to capture the broader context of the entire conversation.
Summarization Before Embedding: Summarizing conversations before embedding to balance detail and context.
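To make the three options concrete, here is a rough sketch of how I picture them. It is only an illustration: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, `naive_summary` is a stand-in for a real summarizer, and the sample conversation and all function names are made up for this example.

```python
import hashlib
import math
import re

DIM = 64  # arbitrary toy dimension, not a recommendation


def toy_embed(text):
    """Placeholder for a real embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9']+", text.lower()):
        # md5 keeps the bucket assignment stable across runs
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    # Vectors are unit-length (or all zeros), so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


def sentence_level(turns):
    """Option 1: one vector per sentence (here, one per turn)."""
    return [toy_embed(t) for t in turns]


def conversation_level(turns):
    """Option 2: join all turns and embed them as one document."""
    return toy_embed(" ".join(turns))


def naive_summary(turns, max_turns=2):
    """Placeholder summarizer: keeps only the first few turns."""
    return " ".join(turns[:max_turns])


def summary_level(turns):
    """Option 3: summarize first, then embed the summary."""
    return toy_embed(naive_summary(turns))


conversation = [
    "Hi, my order arrived damaged.",
    "Sorry to hear that, can you share the order number?",
    "Sure, it is 12345.",
    "Thanks, a replacement will ship tomorrow.",
]

sentence_vecs = sentence_level(conversation)  # several vectors per conversation
conv_vec = conversation_level(conversation)   # one vector per conversation
summ_vec = summary_level(conversation)        # one vector per summary

# With sentence-level vectors, a query is matched against individual turns.
query = toy_embed("damaged order")
best_turn = max(range(len(conversation)),
                key=lambda i: cosine(query, sentence_vecs[i]))
print("closest turn:", conversation[best_turn])
print("similarity to whole conversation:", round(cosine(query, conv_vec), 3))
```

The practical difference I see is in what gets stored and compared: option 1 produces many vectors per conversation, while options 2 and 3 produce one each, with option 3 depending on how good the summarizer is.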
Could you share your recommendations or experiences regarding the best embedding approach for this scenario? Any specific techniques or methodologies that have proven effective in similar applications would be greatly appreciated.