
RingoCatKeeper OP t1_j287d0y wrote

I implemented the cosine similarity calculation myself; for the top-k, you can use .sorted().prefix(k) in Swift (the in-place .sort() returns Void, so the chainable form is .sorted()).
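
Roughly this shape - a minimal sketch assuming the embeddings are plain [Float] arrays (the helper names are illustrative, not the app's actual code):

```swift
import Foundation

// Cosine similarity between two embedding vectors of equal length.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    var dot: Float = 0, normA: Float = 0, normB: Float = 0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}

// Score every stored image against the query, then keep the k best.
// Note it has to be .sorted() here: the in-place .sort() returns Void
// and can't be chained with .prefix(k).
func topMatches(query: [Float], images: [[Float]], k: Int) -> [(index: Int, score: Float)] {
    return images.enumerated()
        .map { (index: $0.offset, score: cosineSimilarity(query, $0.element)) }
        .sorted { $0.score > $1.score }
        .prefix(k)
        .map { $0 }
}
```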

2

Steve132 t1_j28og0o wrote

There's an O(n) algorithm for top-k partitioning that can be much, much faster than .sort() when you have thousands of elements.

QuickSelect. In C++ it's available as std::nth_element; in Swift I couldn't find it directly, but you can implement it in a few lines using .partition(by:) as a subroutine.
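
A sketch of what that could look like (not from any library - the two partition passes are there so runs of equal scores can't stall the loop):

```swift
// QuickSelect via the standard library's partition(by:). After the call, the k
// largest elements occupy a[0..<k] in no particular order. Random pivots give
// O(n) expected time.
func selectTopK<T: Comparable>(_ a: inout [T], _ k: Int) {
    precondition(k >= 0 && k <= a.count)
    var lo = 0, hi = a.count
    while hi - lo > 1 {
        let pivot = a[Int.random(in: lo..<hi)]
        // Pass 1 rearranges to [>= pivot | < pivot]; pass 2 splits the first
        // block into [> pivot | == pivot], guaranteeing progress even when
        // many elements equal the pivot.
        let p1 = a[lo..<hi].partition(by: { $0 < pivot })
        let p2 = a[lo..<p1].partition(by: { $0 <= pivot })
        if k <= p2 {
            hi = p2        // boundary lies inside the "greater" block
        } else if k >= p1 {
            lo = p1        // boundary lies inside the "less" block
        } else {
            return         // boundary falls within the pivot-equal block: done
        }
    }
}
```

Calling selectTopK(&scores, k) and then sorting just scores[0..<k] gives ordered results while only the k-element prefix pays the log factor.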

7

RingoCatKeeper OP t1_j28p4zl wrote

Will certainly check it out!

1

ElectronicCress3132 t1_j29c108 wrote

Btw, one should take care not to implement the worst-case-O(n) algorithm (QuickSelect + median of medians), because its high constant factors slow it down in the average case. QuickSelect with random pivots, or introselect (what the C++ standard library function mentioned above uses), has good average time complexity and rarely hits the worst case.

1

ElectronicCress3132 t1_j29byah wrote

I think the one in the standard library is introselect, which is a hybrid of QuickSelect and median of medians.

1

learn-deeply t1_j287u7z wrote

So it's computing nearest neighbors against all of the images in the index every time a new search is done? That might be slow past, say, 1,000 images.

1

londons_explorer t1_j28cfh3 wrote

It should scale to 1 million images without much slowdown.

1 million images * 512 vector length = 512 million multiplies, which the Neural Engine ought to be able to do in ~100 ms.
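
For a feel of the arithmetic, here's a sketch of the naive CPU version using Accelerate's vDSP_dotpr - it assumes contiguous, unit-normalized Float embeddings so cosine similarity reduces to a dot product; an actual Neural Engine path would go through Core ML instead:

```swift
import Accelerate

// Brute-force pass: one dim-element dot product per stored image, i.e.
// 1M images * 512 = 512M multiply-adds per query.
func scoreAll(query: [Float], database: [Float], dim: Int) -> [Float] {
    let count = database.count / dim
    var scores = [Float](repeating: 0, count: count)
    query.withUnsafeBufferPointer { q in
        database.withUnsafeBufferPointer { db in
            for i in 0..<count {
                // Streaming access pattern: each stored vector is read exactly once.
                vDSP_dotpr(q.baseAddress!, 1,
                           db.baseAddress! + i * dim, 1,
                           &scores[i], vDSP_Length(dim))
            }
        }
    }
    return scores
}
```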

4

learn-deeply t1_j28hirz wrote

Is that calculation taking into account memory (RAM/SSD) access latencies?

1

londons_explorer t1_j28kvqp wrote

There is no latency constraint - it's a pure streaming operation, and the total data to be transferred is 1 gigabyte for the whole set of vectors, which is well within the read performance of Apple's SSDs.

This is also the naive approach - there are probably smarter ones, e.g. doing an approximate search with very low-resolution vectors (say, 3-bit depth), and then a second pass over the high-resolution vectors of only the most promising few thousand results.
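
A sketch of that two-pass idea - the Int8 quantizer here is a crude stand-in for real 3-bit vectors, and it reuses the cosineSimilarity helper sketched earlier in the thread:

```swift
// Pass 1 scores cheap low-precision copies of every vector; pass 2 re-ranks
// only the shortlist at full precision.
func twoPassSearch(query: [Float],
                   coarse: [[Int8]],   // quantized copies of the embeddings
                   full: [[Float]],    // original embeddings
                   shortlist: Int,     // e.g. a few thousand
                   k: Int) -> [Int] {
    // Crude quantizer: map roughly [-1, 1] floats onto a small signed range.
    let coarseQuery = query.map { Int8(clamping: Int(($0 * 4).rounded())) }

    // Pass 1: approximate scores over the whole index.
    let candidates = coarse.indices
        .map { i -> (index: Int, score: Int) in
            var dot = 0
            for j in 0..<coarseQuery.count { dot += Int(coarse[i][j]) * Int(coarseQuery[j]) }
            return (index: i, score: dot)
        }
        .sorted { $0.score > $1.score }
        .prefix(shortlist)

    // Pass 2: exact cosine similarity on the shortlist only.
    return candidates
        .map { (index: $0.index, score: cosineSimilarity(query, full[$0.index])) }
        .sorted { $0.score > $1.score }
        .prefix(k)
        .map { $0.index }
}
```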

3

Steve132 t1_j28oxex wrote

One thing you aren't taking into account is that computing the similarity scores is O(n), but the sorting he's doing is O(n log n), which for 1M elements might dominate, especially since it's not necessarily hardware-optimized.

1

londons_explorer t1_j28ufby wrote

Top-k selection is linear in computational complexity, and I doubt it will dominate, because it only needs to operate on a single score per image rather than a vector of 512 numbers.

1

RingoCatKeeper OP t1_j2885ds wrote

You're right. There is some optimized work by Google called ScaNN, which is much faster for large-scale vector similarity search. However, it's much more complicated to port it to iOS.

1