Philpax t1_jee04jo wrote

It's just difficult to wrangle all of the dependencies; I want to be able to wrap an entire model in a completely isolated black box that I can call into with a C API or similar.

That is, I'd like something like https://github.com/ggerganov/llama.cpp/blob/master/llama.h without having to rewrite the entire model.
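To make the idea concrete, here's a minimal sketch of what such a black-box C API might look like. The names (`model_ctx`, `model_load`, `model_eval`) are hypothetical, not taken from llama.h; the point is that the caller only ever sees an opaque handle and a few plain-C functions, so all of the model's dependencies stay behind the ABI boundary:

```c
#include <stdlib.h>

/* Hypothetical opaque handle: callers never see the model's internals,
 * so all of its dependencies stay hidden behind the C ABI. */
typedef struct model_ctx model_ctx;

struct model_ctx {
    int n_tokens_evaluated; /* stand-in for real model state */
};

/* Load a model from a file path; returns NULL on failure. */
model_ctx *model_load(const char *path) {
    if (path == NULL) return NULL;
    model_ctx *ctx = malloc(sizeof *ctx);
    if (ctx) ctx->n_tokens_evaluated = 0;
    return ctx;
}

/* Feed a batch of token ids to the model; returns 0 on success. */
int model_eval(model_ctx *ctx, const int *tokens, int n_tokens) {
    if (ctx == NULL || tokens == NULL || n_tokens < 0) return -1;
    ctx->n_tokens_evaluated += n_tokens;
    return 0;
}

/* Release the model and everything it owns. */
void model_free(model_ctx *ctx) { free(ctx); }
```

An interface like this is also friendly to the WebAssembly idea above: a C ABI with no external dependencies is exactly the shape of thing that compiles cleanly to a wasm module.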

For my use cases, native would be good, but web would be a nice to have. (With enough magic, a native solution could be potentially compiled to WebAssembly?)

4

Philpax t1_jcrgxbb wrote

As the other commenter said, it's unlikely anyone will advertise a service like this, as LLaMA's license terms don't allow for it. In your situation, I'd just rent a cloud GPU server (Lambda Labs etc.) and test the models you care about. It'll only cost a dollar or two if you're quick.

2

Philpax t1_jb53hvo wrote

You are probably fine, but note that a) people will likely be very angry with you, whether or not the licensing permits it and b) this is a non-trivial problem and even more non-trivial to train.

Good luck, though!

5

Philpax t1_j4uk4ws wrote

Honestly, I'm not convinced it needs a hugely complex language model, as (to me) it seems like a primarily classification task, and not one that would need a deep level of understanding. It'd be a level or two above standard spam filters, maybe?

The two primary NN-in-web solutions I'm aware of are tf.js and ONNX Runtime Web, both of which do CPU inference, though the latter is developing some GPU inference. As you say, it only needs to be done once, so having a button that scans through the transcript, classifies each sentence's probability of being sponsor-read, and then automatically selects segment boundaries from those probabilities seems readily doable. Even if it takes a noticeable amount of time for one user, it's pretty quickly amortised across the entire viewing population.
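The boundary-selection step could be sketched roughly as follows. This is my own illustration, not any existing implementation: given per-sentence probabilities from the classifier, pick the first contiguous run of sentences above a threshold as the sponsor segment.

```c
#include <stddef.h>

/* Hypothetical post-processing step: given per-sentence probabilities
 * that a sentence is sponsor-read, find the first contiguous run above
 * `threshold`. Returns 1 and writes the run's [start, end) sentence
 * indices if one exists; returns 0 if no sentence clears the threshold. */
int find_sponsor_segment(const float *probs, size_t n,
                         float threshold, size_t *start, size_t *end) {
    size_t i = 0;
    while (i < n && probs[i] < threshold) i++;   /* skip non-sponsor run */
    if (i == n) return 0;                        /* nothing above threshold */
    *start = i;
    while (i < n && probs[i] >= threshold) i++;  /* consume sponsor run */
    *end = i;                                    /* exclusive end index */
    return 1;
}
```

A real version would probably also smooth the probabilities first (a single low-probability sentence mid-read shouldn't split the segment), but the core idea is just thresholding and run detection.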

The only real concern I'd have at that point is... is it worth it for the average user over just hitting the right arrow twice and/or manually submitting the timestamps themselves? I suspect that's why it hasn't been done yet.

2

Philpax t1_j37i5s5 wrote

> we are just predicting if the image is of a cat or a dog...

And there's no way automated detection of specific traits could be weaponised, right?

I generally agree that it may be too early for regulation, but that doesn't mean you can abdicate moral responsibility altogether. One should consider the societal impacts of their work. There's a reason why Joseph Redmon quit ML.

3

Philpax t1_j28k3v0 wrote

Not sure if these are supported already (I just skimmed over the website and the list you've put here), but these are things I've wanted while reading papers:

  • Dark mode. Viewing white PDFs before bed is unpleasant.
  • A way to view two-column papers as a single column (putting the right column below the left column?) so that I don't have to keep moving around.

Those are probably the two things that bother me the most. In terms of exciting future work and keeping with the theme, you could also consider doing some automatic summarisation of text / papers (especially for citations), but that's much less necessary.

3