Submitted by RingoCatKeeper t3_zypzrv in MachineLearning

I built an iOS app called Queryable, which integrates the CLIP model on iOS to search the Photos album offline.

[Image: photo search performance with the help of the CLIP model]

Compared to the built-in search in the iPhone Photos app, CLIP-based album search is overwhelmingly better. With CLIP, you can search for a scene in your mind, a tone, an object, or even an emotion conveyed by the image.

How does it work? CLIP has a Text Encoder and an Image Encoder:

>The Text Encoder encodes any text into a 1x512-dimensional vector
>
>The Image Encoder encodes any image into a 1x512-dimensional vector

We can measure the proximity of a text sentence and an image by computing the cosine similarity between their text vector and image vector.
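
Here "cosine similarity" is the usual normalized dot product:

    cos_sim(t, v) = (t · v) / (‖t‖ · ‖v‖)

so once both vectors are normalized to unit length, it reduces to a plain dot product.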

The (simplified) code is as follows:

    import clip
    import torch
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Load the ViT-B/32 CLIP model
    model, preprocess = clip.load("ViT-B/32", device=device)

    # Calculate image vector & text vector (1x512 each)
    image = preprocess(Image.open("photo-of-a-dog.png")).unsqueeze(0).to(device)
    text = clip.tokenize(["rainy night"]).to(device)
    with torch.no_grad():
        image_feature = model.encode_image(image)
        text_feature = model.encode_text(text)

    # Cosine similarity between the two vectors
    sim = torch.nn.functional.cosine_similarity(image_feature, text_feature)

To use Queryable, you first need to build the index, which traverses your album, computes all the image vectors, and stores them. This happens only ONCE. At search time, only a single CLIP forward pass is needed for the user's text query. Below is a flowchart of how Queryable works:

[Flowchart: how Queryable works]
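
In code, the two phases might look roughly like this (a minimal sketch building on the snippet above; the helper names are illustrative, not the app's actual implementation):

    import clip
    import torch
    from PIL import Image

    model, preprocess = clip.load("ViT-B/32")

    def build_index(photo_paths):
        """One-time pass: encode every photo, keep unit-length vectors."""
        vectors = []
        for path in photo_paths:
            image = preprocess(Image.open(path)).unsqueeze(0)
            with torch.no_grad():
                v = model.encode_image(image)
            vectors.append(v / v.norm())
        return torch.cat(vectors)                # shape: (n_photos, 512)

    def search(query, index, k=12):
        """Per query: one CLIP text forward pass, then one dot product per photo."""
        with torch.no_grad():
            t = model.encode_text(clip.tokenize([query]))
        t = t / t.norm()
        sims = index @ t.squeeze(0)              # cosine similarities, rows are unit-norm
        return sims.topk(k).indices              # indices of the k most similar photos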

On privacy and security: Queryable is designed to be totally offline and will NEVER request network access, thereby avoiding privacy concerns.

As it's a paid app, I'm sharing a few promo codes here:

Requirements:
- Your iOS version needs to be 16.0 or above.
- iPhone XS/XS Max and older may not work; DO NOT BUY on those devices.

9W7KTA39JLET
ALFJK3L6H7NH
9AFYNJX63LNF
F3FRNMTLAA4T
9F4MYLWAHHNT
T7NPKXNXHFRH
3TEMNHYH7YNA
HTNFNWWHA4HA
T6YJEWAEYFMX
49LTJKEFKE7Y

YTHN4AMWW99Y
WHAAXYAM3LFT
WE6R4WNXRLRE
RFFK66KMFXLH
4FHT9X6W6TT4
N43YHHRA9PRY
9MNXPAJWNRKY
PPPRXAY43JW9
JYTNF93XWNP3
W9NEWENJTJ3X

Hope you guys find it useful.

147

Comments


brucebay t1_j27z7xg wrote

Great idea. Hope you will earn more money after people recognize its value.

34

Several-Aide-8291 t1_j28a160 wrote

Overall the app looks good. A few suggestions:

1. Allow the user to mark bad results so that they are ignored next time.

2. Add the ability to scroll; right now it only gives the top 12 results, but in my album there are consistently many more matches.

3. Once I find a photo there is not much I can do with it; adding share/save/edit would enhance the experience.

17

RingoCatKeeper OP t1_j28crrl wrote

I've changed the number of results from 12 to 120 and am now submitting a new version for review.

8

RingoCatKeeper OP t1_j28adm9 wrote

Thanks for your useful advice!

1. Great idea!

2. I will make the number larger (or even configurable) in the next version.

3. Some functions may require network access, but the idea of manually adding results is cool.

2

Evoke_App t1_j288jn8 wrote

Google Photos has the same feature; do you find this has better search capability than Google Photos?

Though offline search is a godsend.

12

RingoCatKeeper OP t1_j288x67 wrote

They're not really comparable. Google runs models on professional GPUs, while this app can only use Apple chips, so there is a big difference in the size of the models that can be run.

Offline search means you don't have to worry about anyone invading your album's privacy, including Google.

17

Evoke_App t1_j28a1v2 wrote

Yep, I think we're in agreement here, I was just wondering what your personal experiences comparing the two are in terms of quality.

7

TheIdesOfMay t1_j28qz2q wrote

Great implementation! What is the run time for calculating the CLIP embeddings per image? And inference latency? Were any low-level model optimisations made for it to run on iOS hardware or am I deeply underestimating the power of these new chips lol

5

RingoCatKeeper OP t1_j28rheo wrote

Indexing speed is ~2,000 photos per minute on an iPhone 12 mini.

Search time also depends on the number of photos; for <10,000 photos it takes less than 1 s.

3

dmart89 t1_j27at4r wrote

This is cool.

3

pridkett t1_j292fr1 wrote

I used one of the codes to start poking around (X6RPT3HALW6R). I was optimistic about it working on M1/M2 Macs too. I downloaded the iPad version onto my M2 iPad Air, started a query, and it crashed after I clicked to have it start indexing the photos.

Currently playing with it on my iPhone. Seems really neat. Would be great if there were a way to synchronize the indexes across devices through iCloud (or even iCloud drive).

I've had similar thoughts about doing something with X-CLIP to search the videos on your phone for when you're looking for a specific one (I take a lot of short videos of my family).

3

RingoCatKeeper OP t1_j293hgk wrote

Synchronizing the indexes across devices is an interesting idea; however, anything involving a network connection is a disaster for an app that reads all your photos. Maybe there's a better way to do this.

On the issue of running on M2, I'll check it out later.

Your project sounds interesting; please let me know when there's a product.

1

pridkett t1_j296ivt wrote

That's why I was suggesting just saving the index to iCloud files. You're not providing the synchronization, nor do you need to provide servers to handle more people. The data stays secure in iCloud.

I also want to add that I really like how you've managed to do this in a way that is privacy-centric. It also has a nice side effect of making things much more scalable: you just need to provide someplace to download the models, which are infrequently needed (likely only on a new device).

2

1995FOREVER t1_j29aokh wrote

So you trust iCloud now?

3

pridkett t1_j2bigzy wrote

I trust iCloud a whole lot more than I trust a random service to store my content. I also trust iCloud more than Google Drive. I also have all my photos in iCloud - so yes, I trust iCloud.

1

dat_cosmo_cat t1_j2a3t0o wrote

> the data stay secure in iCloud

Lmao. Dude really missed the entire point of the project.

1

CallMeInfinitay t1_j27i13b wrote

It's a shame this is only available for iOS 16, sounds useful.

2

RingoCatKeeper OP t1_j27jmic wrote

The major issue was CoreML operator support. Another reason is that requiring iOS 16.0 keeps out some very old iPhones (below the X); otherwise users would pay but find CLIP runs very laggy, which is a bad experience. Of course, I admit that the UI of iOS 16.0 is really ugly.

7

Taenk t1_j27vxn1 wrote

I think you could port this to the M-chip MacBooks as well.

2

beatle5 t1_j285pjr wrote

All the codes seem to be redeemed already. Could you please DM me one if you’re okay with me trying the app out? I’m working on CLIP and cross-modal learning at university and am interested to try out an application that uses large language models.

2

Vendraaa t1_j286gee wrote

If you port it to android as well, I'd like to try it

2

learn-deeply t1_j2875vr wrote

How do you do the top-k neighbor search in iOS? Is there a library for it?

2

RingoCatKeeper OP t1_j287d0y wrote

I implemented the cosine similarity calculation myself; as for the top-k, you can use .sorted().prefix(k) in Swift.

2

Steve132 t1_j28og0o wrote

There's an O(n) algorithm for top-k partitioning that could be much, much faster than a full .sorted() when you have thousands of elements: QuickSelect. In C++ it's available as std::nth_element; in Swift I couldn't find it directly, but you can implement it in a few lines using .partition as a subroutine.
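
For illustration, here's the idea in Python (a minimal sketch; a Swift version would use .partition the same way):

    import random

    def quickselect_top_k(scores, k):
        """Return indices of the k largest scores in expected O(n) time (unordered)."""
        idx = list(range(len(scores)))

        def partition(lo, hi, pivot_pos):
            # Afterwards idx[lo:store] all score strictly higher than the pivot,
            # and the pivot sits at position `store`.
            idx[pivot_pos], idx[hi - 1] = idx[hi - 1], idx[pivot_pos]
            pivot_score = scores[idx[hi - 1]]
            store = lo
            for i in range(lo, hi - 1):
                if scores[idx[i]] > pivot_score:
                    idx[store], idx[i] = idx[i], idx[store]
                    store += 1
            idx[store], idx[hi - 1] = idx[hi - 1], idx[store]
            return store

        lo, hi = 0, len(idx)
        while hi - lo > 1:
            p = partition(lo, hi, random.randrange(lo, hi))
            if p == k:          # the first k slots now hold the k best scores
                break
            elif p < k:         # the top-k boundary lies right of the pivot
                lo = p + 1
            else:               # the boundary lies left of the pivot
                hi = p
        return idx[:k]          # unordered; sort just these k if order matters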

7

RingoCatKeeper OP t1_j28p4zl wrote

Will certainly check it out!

1

ElectronicCress3132 t1_j29c108 wrote

Btw, one should take care not to implement the worst-case O(n) algorithm (which is Quickselect + Median of Medians), because it has high constant factors in the time complexity which slow it down in the average case. QuickSelect + Random Partitioning, or Introselect (the C++ standard library function mentioned) have good average time complexities and rarely hit the worst case.

1

ElectronicCress3132 t1_j29byah wrote

I think the one in the standard library is introselect, which is a hybrid of QuickSelect and median-of-medians.

1

learn-deeply t1_j287u7z wrote

So it's calculating the nearest neighbors against all of the images in the index every time a new search is done? Might be slow past, say, 1,000 images.

1

londons_explorer t1_j28cfh3 wrote

It should scale to 1 million images without much slowdown.

1 million images × 512-dim vectors = 512 million multiplies, which the Neural Engine ought to be able to do in ~100 ms.

4

learn-deeply t1_j28hirz wrote

Is that calculation taking into account memory (RAM/SSD) access latencies?

1

londons_explorer t1_j28kvqp wrote

There is no latency constraint: it's a pure streaming operation, and the total data to be transferred is about 1 gigabyte for the whole set of vectors (1M photos × 512 dims × 2 bytes at fp16 ≈ 1 GB), which is well within the read performance of Apple's SSDs.

This is also the naive approach; there are probably smarter approaches, e.g. doing an approximate search with very low-resolution vectors (say, 3-bit depth) and then a second pass over the high-resolution vectors of only the most promising few thousand results.
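
A minimal sketch of that two-pass idea (int8 instead of 3-bit for simplicity; the quantization scheme and candidate count are illustrative, not tuned):

    import numpy as np

    def quantize(index):
        """Precompute once: a crude low-resolution (int8) copy of the index."""
        return np.round(index * 127).astype(np.int8)

    def two_pass_search(index, index_q, query, n_candidates=2000, k=12):
        """`index`: (n_photos, 512) float32, unit-norm rows; `query`: (512,) unit-norm."""
        # Pass 1: cheap approximate scores against the low-resolution copy
        coarse = index_q @ np.round(query * 127).astype(np.int32)
        survivors = np.argpartition(coarse, -n_candidates)[-n_candidates:]
        # Pass 2: exact full-precision cosine scores for the survivors only
        exact = index[survivors] @ query
        return survivors[np.argsort(-exact)[:k]]   # top-k photo indices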

3

Steve132 t1_j28oxex wrote

One thing you aren't taking into account is that the computation of the similarity scores is O(n), but the sorting he's doing is O(n log n), which for 1M might dominate, especially since it's not necessarily hardware-optimized.

1

londons_explorer t1_j28ufby wrote

Top-k selection is linear in computational complexity, and I doubt it will dominate because it only needs to be done on a single number per image rather than a vector of 512 numbers.

1

RingoCatKeeper OP t1_j2885ds wrote

You're right. There's some optimized work by Google called ScaNN, which is much faster for large-scale vector similarity search. However, it's much more complicated to port to iOS.

1

undefdev t1_j28cf5b wrote

Nice! This seems to work better than iOS's own photo search, thanks!

2

stablebrick t1_j2axj0s wrote

I really hope Apple picks this up and makes it an actual feature, this is great.

2

antonevstigneev t1_j27y20q wrote

how much $ did you earn so far?

1

RingoCatKeeper OP t1_j27yuxu wrote

Currently $6.30 (one purchase).

The product hit the App Store yesterday, and as a non-English speaker it's really hard for me to promote it in English-speaking regions :-(

9

caedin8 t1_j298go4 wrote

You should change your developer name. Seeing Chinese characters on an app listing is a huge red flag for westerners. Come up with some English pen-name

5

RingoCatKeeper OP t1_j299bdb wrote

It's Apple's requirement, I have no choice. Thanks for your advice though.

6

caedin8 t1_j29cotk wrote

Just make a company account and transfer the app to the company account with the western name

9

zero0_one1 t1_j281qfx wrote

You should have one more, I bought it. I was thinking of making something like this myself after Apple's search couldn't find a photo I was looking for.

3

RingoCatKeeper OP t1_j281zop wrote

Super thanks, I'm encouraged

1

zero0_one1 t1_j282noa wrote

One problem: I only get at most 12 results per query.

1

RingoCatKeeper OP t1_j282w8t wrote

That's normal, because it only shows the 12 most similar photos in the current version; I may make it configurable in the next version.

1

RingoCatKeeper OP t1_j28dvnl wrote

In the next version this number will be 120; it has been submitted for review.

1

Evoke_App t1_j28a8h5 wrote

How are you currently promoting it? And is it a one time purchase?

I think people would be more open to it as a free trial and then subscription after that. You'd have recurring income too.

I'm curious because depending on how you're promoting it, I'd be more than happy to help.

1

RingoCatKeeper OP t1_j28aoff wrote

Yes, it's a one-time permanent purchase.

I agree with you on "free trial then subscription"; actually I was going to do the same thing. However, an in-app purchase requires a network connection.

Currently I'm promoting it on Reddit, Product Hunt, and nowhere else. It would be great if you could help me.

1

Evoke_App t1_j28kvoy wrote

Oh, I see. Do you need your app to have a permanent network connection for subscription?

I would imagine that to purchase the sub the customers need to be online, but their data gets logged on a separate server that is permanently online, so it doesn't matter if they go offline; they'll still be charged until they unsubscribe.

And for promotion, I was referring more to writing descriptions for your Product Hunt page, but if I find anyone looking for something like this on Reddit, I'll tag you and bring up your app ;)

1

RingoCatKeeper OP t1_j28mxva wrote

It's not about whether it needs a permanent network connection; it's that it would request network access at all, which pops up a permission window on first request, and that's a privacy and security concern.

1

redpnd t1_j28s2il wrote

I'd encourage you to use an English name on the App Store. It might increase trust. Good luck!

1

RingoCatKeeper OP t1_j28slru wrote

Thanks, I agree. However, it's Apple's requirement from when I registered my developer account ("Fill in your Chinese name below").

1

SweatyBicycle9758 t1_j28gwde wrote

Does this look up with dates of photos taken too?

1

RingoCatKeeper OP t1_j28jbe5 wrote

No, it only searches by content similarity.

1

SweatyBicycle9758 t1_j28jmv2 wrote

Then I would suggest adding that feature too: being able to look up images with a date filter. Honest opinion: personally I wouldn't put money into something Apple already does (of course, based on the comments I see your app does better on similar-context pictures). For someone like me, dates matter more because I can remember them; if that feature is included, I'll definitely take it. Good luck.

2

Final-Rush759 t1_j28q5hl wrote

Great, except I switched to Android.

1

omgpop t1_j28w0m9 wrote

Does not work for me at all on iPhone XS. All photos indexed and the search finds nothing. Want my money back lol. Since there are no settings, there’s nothing to troubleshoot. It simply does not work, search produces 0 results.

1

RingoCatKeeper OP t1_j28wvxw wrote

I'll check it out; I got a notice from another user that the XS Max isn't working, so I guess it's a chip problem. I'm sorry for that. You can get a refund first, and I'll confirm and consider blocking phones older than the iPhone 11.

1

hermlon t1_j295dp1 wrote

This is a really cool idea. I'm currently using the CLIP model for an image retrieval task at university. We're using the Ball Tree for finding the closest images to the text in the vector space. What algorithm are you using for finding the nearest neighbors?

1

RingoCatKeeper OP t1_j295uu0 wrote

I'm using simple cosine similarity between the embedding vectors. There's some optimized work by Google called ScaNN, which is much faster for large-scale vector similarity search. However, it's much more complicated to port to iOS.

1

hermlon t1_j29a939 wrote

So you go through all the images each time and compute the cosine similarity between each one and the text?

1

MammothKindly1605 t1_j29f6um wrote

How do you guarantee privacy?

1

Green_ninjas t1_j29n4ib wrote

All the computation is probably done locally. I don't have the app, but if it runs in airplane mode then it should be running everything on the phone itself.

3

1995FOREVER t1_j29ocu0 wrote

I used RHT3NMLHPFMW. Gonna try it out on my iPad 6. Thanks!

edit: does not work on the iPad 6. I think anything lower than an A13 won't work, since it crashes on the iPhone XS.

1

RingoCatKeeper OP t1_j2b1epn wrote

Apple does not allow developers to restrict by iPhone model; I'm considering how to block purchases on these models.

1

NoThanks93330 t1_j2aam8x wrote

Damn, I need this for Android.

Does anyone know if there is something similar available for Android?

1

unicodemonkey t1_j2auxih wrote

Hi. Thanks for the code, I've used 7HWRPY9RXEWY.
The app does work for me even with a fairly large index (35K photos), and I have some feedback to share:

  • A first-time user can type in a query before being asked to build the index. It might be better to offer indexing right after the first start.
  • The query doesn't get re-run automatically after indexing completes, so the user sees the "no index, no results" response to the initial query until they try searching again.
  • The indexer has to rely on low-res thumbnails when processing photos that have been offloaded to iCloud. Does this affect accuracy? I'm not sure there are enough pixels for CLIP.
  • Such photos don't get redownloaded from iCloud when I'm viewing them in the search results; I just get blurry thumbnails.
  • There's no way to actually do anything useful with a search result. A "Share" button would be a welcome addition, as well as metadata display and a viewer that supports the zoom gesture.
  • I see you've extended the number of search results from 12 to 120, great. Maybe it's possible to load more results dynamically when scrolling instead of a configurable hard limit.
  • I think ranking just by similarity is not intuitive enough, though. Recent photos or favorites are likely to be more important to the user, for example. Just an idea for future improvement: a simple ranking model over CLIP similarity and a number of other features might be useful.
  • It would be nice to have search restricted to a particular album.
  • The model does produce unexpected results at times, e.g. "orange cat" seems to be a fitting description for a gray cat sitting on an orange blanket.

1

RingoCatKeeper OP t1_j2b3dtq wrote

Thanks for your long feedback, I've read it twice.

1. Re-running the initial query is a great idea; I'll try to add it in the next version.

2. The ViT-B/32 CLIP model resizes all input images to 224x224, which is even smaller than those thumbnails, so this does no harm to accuracy (see the sketch after this list).

3. Downloading images from iCloud is easy to implement, but it requires network access. It's a disaster for an app that reads all your photos to have network access, so I made a compromise here.

4. I've tried dynamic scrolling, but it costs more time to fetch results; I'll consider doing it that way.

5. Searching within a few specific albums would be a better experience; I'll definitely figure out how to implement it.
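
On point 2, this is roughly the `preprocess` transform that clip.load("ViT-B/32") returns (reconstructed from the open-source clip repo from memory, so treat the exact constants as approximate):

    from torchvision.transforms import (Compose, Resize, CenterCrop,
                                        ToTensor, Normalize, InterpolationMode)

    # Every image is resized and center-cropped to 224x224 before encoding,
    # then normalized with CLIP's RGB means and standard deviations.
    preprocess = Compose([
        Resize(224, interpolation=InterpolationMode.BICUBIC),
        CenterCrop(224),
        ToTensor(),
        Normalize((0.48145466, 0.4578275, 0.40821073),
                  (0.26862954, 0.26130258, 0.27577711)),
    ])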

Thanks again for your patient feedback!

1

unicodemonkey t1_j2bdf3g wrote

I think network access would be legitimate if used specifically by the iCloud service to display photos. It probably happens in a separate background process that manages the photo library, not in the app itself. But it's up to you to decide, of course.

2

OmarMola69 t1_j29gqod wrote

Guys, I need some help with my project. Can someone contact me at +201012505830?

−4