Vector embeddings have been an Overton-window-shifting experience for me, not because they're sufficiently advanced technology indistinguishable from magic, but the opposite.
One problem with pgvector's HNSW index implementation is that it fetches candidates based on distance **before** applying any other filters from the `WHERE` clause. I would recommend checking out Lantern and `tsvector`, which are both faster and more precise.
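For anyone who hasn't hit this: here's a minimal sketch of the failure mode, assuming a hypothetical `documents` table (none of the names below come from the comment). The HNSW scan hands back the nearest candidates first, then the filter prunes them, so a selective `WHERE` can return fewer than `LIMIT` rows even when plenty of matching rows exist.

```python
# Sketch only: assumes Postgres with the pgvector extension and a
# hypothetical documents(id, title, category, embedding vector(3)) table.
import psycopg

QUERY = """
    SELECT id, title
    FROM documents
    WHERE category = %s                  -- applied AFTER the HNSW candidate fetch
    ORDER BY embedding <=> %s::vector    -- pgvector cosine-distance operator
    LIMIT 10;
"""

with psycopg.connect("dbname=example") as conn:
    # If few of the nearest neighbors have category = 'news', this can
    # come back with fewer than 10 rows despite many matches in the table.
    rows = conn.execute(QUERY, ("news", "[0.1,0.2,0.3]")).fetchall()

# One mitigation pgvector itself documents: widen the candidate pool
# before the filter prunes it, e.g.  SET hnsw.ef_search = 200;  (default 40)
```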
I did a similar project to learn more about LLMs: a chatbot powered by an author's diary. I vectorized 40k of his journal entries, and when you ask a question it queries for the top 10 related entries to use as context to answer. I found the process helpful for learning more about RAG and making LLMs more useful. More details here - https://stevebarbera.medium.com/building-rankobot-with-chatgpt-and-laravel-dd69088211d9
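In Python, that retrieve-then-answer loop is only a few lines. This is a hedged sketch, not the linked project (which is Laravel/PHP): the table, columns, and model names are all assumptions, and it presumes the entries were embedded with the same model used for the question.

```python
# Hedged sketch of the top-10 retrieval flow described above; every name
# here (table, columns, models) is an assumption, not taken from the post.
import psycopg
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer(question: str) -> str:
    # 1. Embed the question with the same model used to vectorize the entries.
    q = client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    q_literal = "[" + ",".join(map(str, q)) + "]"  # pgvector input format

    # 2. Fetch the 10 nearest journal entries to use as context.
    with psycopg.connect("dbname=diary") as conn:
        rows = conn.execute(
            "SELECT body FROM entries ORDER BY embedding <=> %s::vector LIMIT 10",
            (q_literal,),
        ).fetchall()
    context = "\n\n".join(body for (body,) in rows)

    # 3. Ask the chat model to answer using only the retrieved entries.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer using only these diary entries:\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return reply.choices[0].message.content
```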
https://medium.com/@adrian.white/cosine-similarity-in-snowflake-ove-eed3b57f4e6f