Kacper-Lukawski t1_j5xp10a wrote on January 26, 2023 at 7:26 AM

Reply to comment by keisukegoda3804 in [D] Efficient retrieval of research information for graduate research by [deleted]

Each vector may have a payload object: https://qdrant.tech/documentation/payload/ Payload attributes can be used to apply additional constraints to the search results: https://qdrant.tech/documentation/filtering/ The unique feature is that the filtering is built into the vector search phase itself, so there is no need to pre- or post-filter the results.
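
To make the idea concrete, here is a minimal pure-Python sketch of a search where the payload condition is checked in the same pass that scores the vectors. It is only an illustration under simplifying assumptions: it does a brute-force scan rather than an index traversal, it is not Qdrant's implementation or API, and the names `Point`, `cosine`, `search_with_filter`, and the sample data are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    """Hypothetical record: an id, a vector, and an arbitrary payload dict."""
    id: int
    vector: list
    payload: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_with_filter(points, query, condition, limit=3):
    """Score only the points whose payload satisfies `condition`.

    The constraint is evaluated during the scan itself, so there is no
    separate pass that collects ids beforehand or trims results afterwards.
    """
    scored = (
        (cosine(p.vector, query), p.id)
        for p in points
        if condition(p.payload)
    )
    return [pid for _, pid in heapq.nlargest(limit, scored)]


points = [
    Point(1, [1.0, 0.0], {"year": 2021}),
    Point(2, [0.9, 0.1], {"year": 2019}),
    Point(3, [0.0, 1.0], {"year": 2022}),
    Point(4, [0.8, 0.2], {"year": 2023}),
]

# Nearest neighbours of [1.0, 0.0] restricted to payloads with year >= 2021.
result = search_with_filter(
    points, [1.0, 0.0], lambda payload: payload["year"] >= 2021, limit=2
)
print(result)
```

Point 2 is close to the query but never enters the ranking, because its payload fails the condition while it is being considered.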

keisukegoda3804 t1_j5xy9x3 wrote on January 26, 2023 at 9:31 AM

Do you happen to know how fast it is compared to other services that build filtering into their vector search (Pinecone, Milvus, etc.)?

Kacper-Lukawski t1_j5yineq wrote on January 26, 2023 at 1:26 PM

I do not know of any benchmark that measures that. It would also be quite challenging to compare against a SaaS like Pinecone (both systems would have to run on the same infrastructure for the results to be comparable). As for Milvus, as far as I know, they use prefiltering for filtered search (https://github.com/milvus-io/milvus/discussions/12927). So they need to store the ids of matching entries somewhere during the vector search phase, possibly even all the ids if your filtering criteria do not exclude anything.
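
A small sketch of the general prefiltering pattern described above may help show where the memory cost comes from. This is not Milvus code; it is a simplified pure-Python illustration with a brute-force scan, and the names `prefilter_ids`, `search_allowed`, and the sample data are hypothetical.

```python
def prefilter_ids(payloads, condition):
    """Step 1: collect the ids of every entry whose payload matches."""
    return {pid for pid, payload in payloads.items() if condition(payload)}


def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


def search_allowed(vectors, query, allowed_ids, limit=2):
    """Step 2: run the vector search, skipping ids outside the allowed set."""
    scored = sorted(
        ((dot(vec, query), pid) for pid, vec in vectors.items() if pid in allowed_ids),
        reverse=True,
    )
    return [pid for _, pid in scored[:limit]]


vectors = {1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0], 4: [0.8, 0.2]}
payloads = {1: {"year": 2021}, 2: {"year": 2019}, 3: {"year": 2022}, 4: {"year": 2023}}
query = [1.0, 0.0]

# A selective filter keeps the id set small.
strict_ids = prefilter_ids(payloads, lambda payload: payload["year"] >= 2021)
strict_hits = search_allowed(vectors, query, strict_ids)

# A filter that excludes nothing forces every id to be held during the search.
all_ids = prefilter_ids(payloads, lambda payload: True)
all_hits = search_allowed(vectors, query, all_ids)

print(len(strict_ids), strict_hits)
print(len(all_ids), all_hits)
```

The id set has to exist before the search starts, and its size depends on how selective the filter is, not on how many results are requested.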