Thank you for that link. With my old laptop and slow internet connection I'm struggling downloading visual studio and getting everything to work. I do have weights but still figuring out why building fails. Is there any way to download a prebuilt version?

[deleted] t1_jcr18u4 wrote on March 18, 2023 at 10:10 PM

#2,265,900

Replying to timedacorn369 (#2,264,997)

[deleted]

Pale-Dentist330 t1_jcr3e9m wrote on March 18, 2023 at 10:26 PM

#2,265,994

Can you add the steps here?

starstruckmon t1_jcrbf0m wrote on March 18, 2023 at 11:27 PM

#2,266,370

Replying to timedacorn369 (#2,264,997)

You can see some benchmarks here

https://github.com/qwopqwop200/GPTQ-for-LLaMa

simpleuserhere OP t1_jcreufr wrote on March 18, 2023 at 11:53 PM

#2,266,533

Replying to Pale-Dentist330 (#2,265,994)

Hi, please check this branch https://github.com/rupeshs/alpaca.cpp/tree/linux-android-build-support

simpleuserhere OP t1_jcrfjsh wrote on March 18, 2023 at 11:59 PM

#2,266,556

Replying to schorhr (#2,265,695)

Thanks,What error are you getting? With Vs compiler and cmake we can easily build it.

baffo32 t1_jcronvh wrote on March 19, 2023 at 1:08 AM

#2,266,995

Replying to Meddhouib10 (#2,263,987)

- offloading and accelerating (moving some parts to memory mapped disk or gpu ram, this can also make for quicker loading)

- pruning (removing parts of the model that didn’t end up impacting outputs after training)

- further quantization below 4 bits

- distilling to a mixture of experts?

- factoring and distilling parts out into heuristic algorithms?

- finetuning to specific tasks (e.g. distilling/pruning out all information related to non-relevant languages or domains) this would likely make it very small

EDIT:

- numerous techniques published in papers over the past few years

- distilling into an architecture not limited by e.g. a constraint of being feed forward

[deleted] t1_jcrsk06 wrote on March 19, 2023 at 1:39 AM

#2,267,189

Replying to starstruckmon (#2,266,370)

[deleted]

Taenk t1_jcs53iw wrote on March 19, 2023 at 3:23 AM

#2,267,822

Replying to starstruckmon (#2,266,370)

The results for LLaMA-33B quantised to 3bit are rather interesting. That would be an extremely potent LLM capable of running on consumer hardware. Pity that there are no test results for the 2bit version.

Taenk t1_jcs5eon wrote on March 19, 2023 at 3:26 AM

#2,267,838

Replying to legendofbrando (#2,264,240)

A proper port to the neural engine would be especially interesting. There was one by Apple for Stable Diffusion.

360macky t1_jcsanpu wrote on March 19, 2023 at 4:15 AM

#2,268,157

Replying to Taenk (#2,267,838)

Thanks!

starstruckmon t1_jcswg1g wrote on March 19, 2023 at 8:54 AM

#2,269,062

Replying to Taenk (#2,267,822)

I've heard from some experienced testers that the 33B model is shockingly bad compared to even the 13B one. Despite what the benchmarks say. That we should either use the 65B one ( very good apparently ) or stick to 13B/7B. Not because of any technical reason but random luck/chance involved with training these models and the resultant quality.

I wonder if there's any truth to it. If you've tested it yourself, I'd love to hear what you thought.

schorhr t1_jct3v62 wrote on March 19, 2023 at 10:41 AM

#2,269,304

Replying to simpleuserhere (#2,266,556)

Thanks for your reply!

I have not used vs and cmake before, so I am probably making all newbie mistakes. I've sorted out that some paths where not set, and that C:\mingw-32\bin\make.exe doesn't exist but it's now minigw-make.exe.

Now I get the error that

   'C:/MinGW-32/bin/make.exe' '-?'

  failed with:

   C:/MinGW-32/bin/make.exe: invalid option -- ?

And from the few things I've found on-line I gathered it's because the mingw version doesn't support the option, but I should use Vs instead. I am a bit lost. Every time I manage to fix one issue, there's another one. :-)

simpleuserhere OP t1_jct4k2z wrote on March 19, 2023 at 10:51 AM

#2,269,324

Replying to schorhr (#2,269,304)

I have updated readme with Windows build instructions,please check https://github.com/rupeshs/alpaca.cpp#windows

schorhr t1_jct58nc wrote on March 19, 2023 at 11:00 AM

#2,269,351

Replying to simpleuserhere (#2,269,324)

Thanks!

Both of the instructions (for Android which I'm attempting, but also the Windows instructions) result with the > C:/MinGW-32/bin/make.exe: invalid option -- ? error. I can't seem to figure out what make version I should use instead, or how to edit that.

simpleuserhere OP t1_jct9btk wrote on March 19, 2023 at 11:51 AM

#2,269,516

Replying to schorhr (#2,269,351)

For Android build please use Linux ( tested with Ubuntu 20.04)

schorhr t1_jctb6tz wrote on March 19, 2023 at 12:12 PM

#2,269,610

Replying to simpleuserhere (#2,269,516)

Okay. I don't have the capacity right now (old laptop, disk too small to really use a second OS). I appreciate the help! I will once I get a new computer.

Taenk t1_jctdmvi wrote on March 19, 2023 at 12:38 PM

#2,269,741

Replying to starstruckmon (#2,269,062)

I haven’t tried the larger models unfortunately. However I wonder how the model could be „shockingly bad“ despite having almost three times the parameter count.

starstruckmon t1_jcte34d wrote on March 19, 2023 at 12:42 PM

#2,269,768

Replying to Taenk (#2,269,741)

🤷

Sometimes models just come out crap. Like BLOOM which has almost the same number of parameters as GPT3, but is absolute garbage in any practical use case. Like a kid from two smart parents that turns out dumb. Just blind chance.

Or they could be wrong. 🤷

ninjasaid13 t1_jcu1odb wrote on March 19, 2023 at 3:49 PM

#2,270,983

Replying to simpleuserhere (#2,269,516)

I have a problem with

C:\Users\****\source\repos\alpaca.cpp\build&gt;make chat
make: *** No rule to make target 'chat'.  Stop.

and

C:\Users\****\source\repos\alpaca.cpp&gt;make chat
I llama.cpp build info: I UNAME_S:  CYGWIN_NT-10.0 I 
UNAME_P:  unknown I UNAME_M:  x86_64 I CFLAGS:   -I.              
-O3 -DNDEBUG -std=c11   -fPIC -mfma -mf16c -mavx -
mavx2 I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -
std=c++11 -fPIC I LDFLAGS: I CC:       cc (GCC) 
10.2.0 I CXX:      g++ (GCC) 10.2.0
cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -
mfma -mf16c -mavx -mavx2   -c ggml.c -o ggml.o g++ -
I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -c 
utils.cpp -o utils.o g++ -I. -I./examples -O3 -
DNDEBUG -std=c++11 -fPIC chat.cpp ggml.o utils.o -o 
chat chat.cpp: In function 'int main(int, char**)': 
chat.cpp:883:26: error: aggregate 'main(int, 
char**)::sigaction sigint_action' has incomplete type 
and cannot be defined 883 |         struct sigaction 
sigint_action; |                          
~~~~~~~~~~~~ chat.cpp:885:9: error: 'sigemptyset' was 
not declared in this scope 885 |         sigemptyset 
(&amp;sigint_action.sa_mask); |         ~~~~~~~~~~ 
chat.cpp:887:47: error: invalid use of incomplete 
type 'struct main(int, char**)::sigaction' 887 |         
sigaction(SIGINT, &amp;sigint_action, NULL); |                                               
^ chat.cpp:883:16: note: forward declaration of 
'struct main(int, char**)::sigaction' 883 |         
struct sigaction sigint_action; |                
~~~~~~~~ make: *** [Makefile:195: chat] Error 1

using windows.

simpleuserhere OP t1_jcu2ikl wrote on March 19, 2023 at 3:55 PM

#2,271,034

Replying to ninjasaid13 (#2,270,983)

For Windows you need Visual C++ compiler, so install Visual Studio C++ 2019 build tools, follow the instruction here https://github.com/rupeshs/alpaca.cpp#windows

ninjasaid13 t1_jcu9nfv wrote on March 19, 2023 at 4:45 PM

#2,271,383

Replying to simpleuserhere (#2,271,034)

I believe I already have the build.

I still get this error

C:\Users\****\Downloads\alpaca\alpaca.cpp&gt;make chat
I llama.cpp build info: I UNAME_S:  CYGWIN_NT-10.0 I UNAME_P:  
unknown I UNAME_M:  x86_64 I CFLAGS:   -I.              -O3 -
DNDEBUG -std=c11   -fPIC -mfma -mf16c -mavx -mavx2 I 
CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC I 
LDFLAGS: I CC:       cc (GCC) 10.2.0 I CXX:      g++ (GCC) 
10.2.0
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp 
ggml.o utils.o -o chat chat.cpp: In function 'int main(int, 
char**)': chat.cpp:883:26: error: aggregate 'main(int, 
char**)::sigaction sigint_action' has incomplete type and 
cannot be defined 883 |         struct sigaction 
sigint_action; |                          ~~~~~~~~~~~~ 
chat.cpp:885:9: error: 'sigemptyset' was not declared in this 
scope 885 |         sigemptyset (&amp;sigint_action.sa_mask); |         
~~~~~~~~~~ chat.cpp:887:47: error: invalid use of incomplete 
type 'struct main(int, char**)::sigaction' 887 |         
sigaction(SIGINT, &amp;sigint_action, NULL); |                                               
^ chat.cpp:883:16: note: forward declaration of 'struct 
main(int, char**)::sigaction' 883 |         struct sigaction 
sigint_action; |                ~~~~~~~~ make: *** [Makefile:195: chat] Error 1

simpleuserhere OP t1_jcu9x05 wrote on March 19, 2023 at 4:47 PM

#2,271,398

Replying to ninjasaid13 (#2,270,983)

Are you using cygwin?

ninjasaid13 t1_jcuajwh wrote on March 19, 2023 at 4:51 PM

#2,271,418

Replying to simpleuserhere (#2,271,398)

yes I have cygwin.

simpleuserhere OP t1_jcubta9 wrote on March 19, 2023 at 5:00 PM

#2,271,471

Replying to ninjasaid13 (#2,271,418)

I haven't tried cygwin for Alpaca.cpp.

ninjasaid13 t1_jcubyue wrote on March 19, 2023 at 5:01 PM

#2,271,480

Replying to simpleuserhere (#2,271,471)

so it won't work? do I need to install MinGW?

simpleuserhere OP t1_jcuc25e wrote on March 19, 2023 at 5:01 PM

#2,271,484

Replying to ninjasaid13 (#2,271,480)

Yes,

ninjasaid13 t1_jcufsqf wrote on March 19, 2023 at 5:26 PM

#2,271,635

Replying to simpleuserhere (#2,271,484)

I'm getting a new error

C:\Users\ninja\source\repos\alpaca.cpp&gt;make chat
process_begin: CreateProcess(NULL, uname -s, ...) failed. 
process_begin: CreateProcess(NULL, uname -p, ...) failed. 
process_begin: CreateProcess(NULL, uname -m, ...) failed. 
'cc' is not recognized as an internal or external command, 
operable program or batch file. 'g++' is not recognized as an 
internal or external command, operable program or batch file. 
I llama.cpp build info: I UNAME_S: I UNAME_P: I UNAME_M: I 
CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -
mfma -mf16c -mavx -mavx2 I CXXFLAGS: -I. -I./examples -O3 -
DNDEBUG -std=c++11 -fPIC I LDFLAGS: I CC: I CXX:
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp 
ggml.o utils.o -o chat process_begin: CreateProcess(NULL, g++ 
-I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp 
ggml.o utils.o -o chat, ...) failed. make (e=2): The system 
cannot find the file specified. Makefile:195: recipe for 
target 'chat' failed make: *** [chat] Error 2

1stuserhere t1_jcuyofc wrote on March 19, 2023 at 7:30 PM

#2,272,491

How fast is the model on android, u/simpleuserhere?

pkuba208 t1_jcvmhhm wrote on March 19, 2023 at 10:13 PM

#2,273,926

Replying to 1stuserhere (#2,272,491)

Depends on the hardware

Art10001 t1_jcwfyw8 wrote on March 20, 2023 at 2:04 AM

#2,275,859

Replying to baffo32 (#2,266,995)

I heard MoE is bad. I have no sources sadly.

Art10001 t1_jcwg2pl wrote on March 20, 2023 at 2:05 AM

#2,275,864

Replying to Taenk (#2,267,838)

CoreML.

Please, review this link and associated research paper: https://github.com/apple/ml-ane-transformers

Art10001 t1_jcwg5zg wrote on March 20, 2023 at 2:06 AM

#2,275,870

Replying to pkuba208 (#2,273,926)

You can really see how phones defeat 10 year old computers, as revealed by their Geekbench 5 scores.

Art10001 t1_jcwg7bv wrote on March 20, 2023 at 2:06 AM

#2,275,874

Replying to ninjasaid13 (#2,271,635)

Try installing MSYS2.

ninjasaid13 t1_jcwwgt5 wrote on March 20, 2023 at 4:31 AM

#2,276,882

Replying to Art10001 (#2,275,874)

now what?

pkuba208 t1_jcx3d9i wrote on March 20, 2023 at 5:53 AM

#2,277,254

Replying to Art10001 (#2,275,870)

Well... I run this model on a raspberry pi 4B, but you will need AT LEAST 8gb ram

baffo32 t1_jcxqr2i wrote on March 20, 2023 at 11:21 AM

#2,278,278

Replying to Art10001 (#2,275,859)

i visited cvpr last year and people were saying that moe was what mostly was being used; i haven’t tried these things myself though

1stuserhere t1_jcxyj1o wrote on March 20, 2023 at 12:41 PM

#2,278,741

Replying to pkuba208 (#2,273,926)

pixel 6 or 7 (or other modern phones from last 2-3 years)

Art10001 t1_jcy2jck wrote on March 20, 2023 at 1:15 PM

#2,279,001

Replying to ninjasaid13 (#2,276,882)

I was asleep, my apologies for not replying earlier.

Run pacman -Syu then pacman -Sy build-essential then cd to the build directory and follow the instructions

Art10001 t1_jcy2sb5 wrote on March 20, 2023 at 1:17 PM

#2,279,021

Replying to pkuba208 (#2,277,254)

Raspberry Pi 4 is far slower than modern phones.

Also there was somebody else saying it probably actually uses 4/6 GB.

pkuba208 t1_jcy717u wrote on March 20, 2023 at 1:51 PM

#2,279,298

Replying to Art10001 (#2,279,021)

I know, but android uses 3-4gb ram itself. I run it myself, so I know that it uses from 6-7 gb of ram on the smallest model currently with 4bit quantization

pkuba208 t1_jcy7nxg wrote on March 20, 2023 at 1:55 PM

#2,279,333

Replying to 1stuserhere (#2,278,741)

Should be faster than 1 word per second. Judging by the fact, that modern PC's run it at 5 words per second and a raspberry pi 4b runs it at 1 word per second, it should run somewhere near the 2.5 words per second mark

Art10001 t1_jcy7rqs wrote on March 20, 2023 at 1:56 PM

#2,279,344

Replying to pkuba208 (#2,279,298)

Yes, that's why it was tried in a Pixel 7 which has 8 GB of RAM and maybe even swap.

pkuba208 t1_jcy83gf wrote on March 20, 2023 at 1:59 PM

#2,279,366

Replying to Art10001 (#2,279,344)

I use swap too. For now, it can only run on flagships tho. You have to have at least 8gb of ram, because running it directly on let's say 3gb(3gb used by system) ram and 3-5gb SWAP may not even be possible and if it is, then it will be very slow and prone to crashing

Board_Stock t1_jczly8z wrote on March 20, 2023 at 7:33 PM

#2,282,866

hello, I've recently run the alpaca.cpp on my laptop, but I want to give it a context window so that it can remember conversations, and make it voice activated using python. Can someone guide me on this?

ommerike t1_jddjvvn wrote on March 23, 2023 at 4:59 PM

#2,320,462

Is there an APK out there to side load? Would be fun to try on my Pixel 6 Pro without becoming an expert on how to go through the motions of the make stuff...

Comments