Submitted by imgonnarelph t3_11wqmga in MachineLearning
currentscurrents t1_jd10ab5 wrote
Reply to comment by pier4r in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
llama.cpp uses the Neural Engine, and so does Stable Diffusion. And the speed is not that far off from VRAM, actually.
>Memory bandwidth is increased to 800GB/s, more than 10x the latest PC desktop chip, and M1 Ultra can be configured with 128GB of unified memory.
By comparison, the Nvidia RTX 4090 clocks in at ~1,000 GB/s.
Apple is clearly positioning their devices for AI.
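To see why memory bandwidth is the number people compare here, a rough back-of-envelope sketch (not from the thread; the model size and bandwidth figures below are illustrative assumptions): LLM token generation is typically memory-bandwidth bound, since each generated token streams roughly the whole set of weights from memory once, so tokens/sec is capped at about bandwidth divided by model size in bytes.

```python
# Back-of-envelope upper bound on LLM token throughput, assuming
# generation is memory-bandwidth bound (each token reads all weights once).
# All numbers are illustrative assumptions, not benchmarks.

def tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Theoretical ceiling on tokens/sec: bandwidth / bytes read per token."""
    return bandwidth_gb_s / model_size_gb

# A 30B-parameter model at 4-bit quantization is roughly
# 30e9 params * 0.5 bytes/param ~= 15 GB of weights.
model_gb = 15.0

for name, bw in [("M1 Ultra (800 GB/s)", 800.0),
                 ("RTX 4090 (~1,000 GB/s)", 1000.0)]:
    print(f"{name}: ~{tokens_per_second(bw, model_gb):.0f} tokens/s ceiling")
```

On these assumptions the two land within ~25% of each other, which is the sense in which the speed "is not that far off".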
Straight-Comb-6956 t1_jd2iwp6 wrote
> llama.cpp uses the Neural Engine,
Does it?
mmyjona t1_jdceex2 wrote
No, it doesn't; llama-mps is the one that uses the ANE.
pier4r t1_jd39md4 wrote
> llama.cpp uses the Neural Engine
I've been trying to find confirmation of this but couldn't. I saw some ports, but they weren't from the LLaMA team. Do you have a source?