Comments


t1_jcptalr wrote

What are the techniques to make such large models run on low resources?

15

t1_jcqwzek wrote

That's amazing!

Thank you for that link. With my old laptop and slow internet connection I'm struggling to download Visual Studio and get everything to work. I do have the weights, but I'm still figuring out why the build fails. Is there any way to download a prebuilt version?

8

t1_jcronvh wrote

- offloading and accelerating (moving some parts to memory-mapped disk or GPU RAM; this can also make for quicker loading)

- pruning (removing parts of the model that didn’t end up impacting outputs after training)

- further quantization below 4 bits

- distilling to a mixture of experts?

- factoring and distilling parts out into heuristic algorithms?

- finetuning to specific tasks (e.g. distilling/pruning out all information related to non-relevant languages or domains); this would likely make it very small

EDIT:

- numerous techniques published in papers over the past few years

- distilling into an architecture not limited by e.g. a constraint of being feed forward
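To make the quantization bullet above concrete, here is a minimal sketch of symmetric round-to-nearest quantization of a weight block to 4-bit values with one shared scale. This is not ggml's actual on-disk format (which packs two values per byte and stores per-block scales for many small blocks), just the core idea of why low-bit weights still reconstruct the original closely:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// A block of weights quantized to signed 4-bit range [-7, 7]
// with a single shared scale.
struct QuantBlock {
    float scale;
    std::vector<int8_t> q;  // each entry holds one 4-bit value
};

QuantBlock quantize4(const std::vector<float>& w) {
    float amax = 0.0f;
    for (float x : w) amax = std::max(amax, std::fabs(x));
    const float scale = amax / 7.0f;  // map [-amax, amax] onto [-7, 7]
    QuantBlock b{scale, {}};
    b.q.reserve(w.size());
    for (float x : w) {
        int v = (int)std::lround(x / (scale > 0.0f ? scale : 1.0f));
        v = std::max(-7, std::min(7, v));  // clamp to the 4-bit range
        b.q.push_back((int8_t)v);
    }
    return b;
}

std::vector<float> dequantize4(const QuantBlock& b) {
    std::vector<float> out;
    out.reserve(b.q.size());
    for (int8_t v : b.q) out.push_back(v * b.scale);
    return out;
}
```

The round-trip error per weight is bounded by half the scale, which is why smaller blocks (each with its own scale) quantize better than one scale for a whole tensor.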

3

t1_jcs53iw wrote

The results for LLaMA-33B quantised to 3-bit are rather interesting. That would be an extremely potent LLM capable of running on consumer hardware. Pity that there are no test results for the 2-bit version.
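The back-of-the-envelope arithmetic for why 3-bit matters can be written down directly (this ignores per-block scale overhead, which real quantized formats add on top):

```cpp
#include <cassert>
#include <cmath>

// Approximate weight storage: params * bits_per_weight / 8 bytes.
// Real quantized files are somewhat larger because each block of
// weights also stores a scale factor.
double weight_bytes(double params, double bits_per_weight) {
    return params * bits_per_weight / 8.0;
}
```

For 33B parameters, `weight_bytes(33e9, 3.0)` gives about 12.4 GB, versus roughly 66 GB at fp16 — the difference between a high-end consumer machine and a multi-GPU server.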

3

t1_jcswg1g wrote

I've heard from some experienced testers that the 33B model is shockingly bad compared even to the 13B one, despite what the benchmarks say, and that we should either use the 65B one (very good, apparently) or stick to 13B/7B. Not for any technical reason, but because of the random luck involved in training these models and the resulting quality.

I wonder if there's any truth to it. If you've tested it yourself, I'd love to hear what you thought.

5

t1_jct3v62 wrote

Thanks for your reply!

I have not used VS and CMake before, so I am probably making all the newbie mistakes. I've sorted out that some paths were not set, and that C:\mingw-32\bin\make.exe doesn't exist but is now mingw32-make.exe.

Now I get the error that

   'C:/MinGW-32/bin/make.exe' '-?'

  failed with:

   C:/MinGW-32/bin/make.exe: invalid option -- ?

And from the few things I've found online, I gathered it's because this MinGW version doesn't support the option and I should use VS instead. I am a bit lost. Every time I manage to fix one issue, there's another one. :-)

2

t1_jct58nc wrote

Thanks!

Both sets of instructions (for Android, which I'm attempting, but also the Windows instructions) end with the `C:/MinGW-32/bin/make.exe: invalid option -- ?` error. I can't seem to figure out which make version I should use instead, or how to change that.

1

t1_jcte34d wrote

🤷

Sometimes models just come out crap. Like BLOOM, which has almost the same number of parameters as GPT-3 but is absolute garbage in any practical use case. Like a kid from two smart parents who turns out dumb. Just blind chance.

Or they could be wrong. 🤷

3

t1_jcu1odb wrote

I have a problem with

C:\Users\****\source\repos\alpaca.cpp\build>make chat
make: *** No rule to make target 'chat'.  Stop.

and

C:\Users\****\source\repos\alpaca.cpp>make chat
I llama.cpp build info:
I UNAME_S:  CYGWIN_NT-10.0
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC
I LDFLAGS:
I CC:       cc (GCC) 10.2.0
I CXX:      g++ (GCC) 10.2.0
cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -mfma -mf16c -mavx -mavx2 -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -c utils.cpp -o utils.o
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp ggml.o utils.o -o chat
chat.cpp: In function 'int main(int, char**)':
chat.cpp:883:26: error: aggregate 'main(int, char**)::sigaction sigint_action' has incomplete type and cannot be defined
  883 |         struct sigaction sigint_action;
chat.cpp:885:9: error: 'sigemptyset' was not declared in this scope
  885 |         sigemptyset (&sigint_action.sa_mask);
chat.cpp:887:47: error: invalid use of incomplete type 'struct main(int, char**)::sigaction'
  887 |         sigaction(SIGINT, &sigint_action, NULL);
chat.cpp:883:16: note: forward declaration of 'struct main(int, char**)::sigaction'
  883 |         struct sigaction sigint_action;
make: *** [Makefile:195: chat] Error 1

Using Windows.
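The sigaction errors above are expected on Windows: `sigaction` and `sigemptyset` are POSIX APIs that MinGW's headers don't provide. Later llama.cpp code handles this by guarding the POSIX path behind a platform check and falling back to the plain ISO C `signal()` call on Windows; a sketch of that pattern (not the project's exact code) looks like:

```cpp
#include <signal.h>

// Flag set from the handler and polled from the main loop; sig_atomic_t
// is the only integer type guaranteed safe to write from a signal handler.
static volatile sig_atomic_t g_interrupted = 0;

static void sigint_handler(int /*signo*/) {
    g_interrupted = 1;
}

void install_sigint_handler() {
#if defined(_WIN32)
    // MinGW/MSVC have no sigaction, but ISO C signal() is available.
    signal(SIGINT, sigint_handler);
#else
    // POSIX: sigaction has better-defined semantics (handler is not reset).
    struct sigaction sa = {};
    sa.sa_handler = sigint_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, nullptr);
#endif
}
```

With a guard like this the same chat.cpp compiles under both MinGW and Unix-like toolchains; the Cygwin environment in the log above defines neither a full POSIX layer for this code path nor `_WIN32`, which is why it hits the POSIX branch and fails.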

1

t1_jcu9nfv wrote

I believe I already have the build.

I still get this error

C:\Users\****\Downloads\alpaca\alpaca.cpp>make chat
I llama.cpp build info:
I UNAME_S:  CYGWIN_NT-10.0
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC
I LDFLAGS:
I CC:       cc (GCC) 10.2.0
I CXX:      g++ (GCC) 10.2.0
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp ggml.o utils.o -o chat
chat.cpp: In function 'int main(int, char**)':
chat.cpp:883:26: error: aggregate 'main(int, char**)::sigaction sigint_action' has incomplete type and cannot be defined
  883 |         struct sigaction sigint_action;
chat.cpp:885:9: error: 'sigemptyset' was not declared in this scope
  885 |         sigemptyset (&sigint_action.sa_mask);
chat.cpp:887:47: error: invalid use of incomplete type 'struct main(int, char**)::sigaction'
  887 |         sigaction(SIGINT, &sigint_action, NULL);
chat.cpp:883:16: note: forward declaration of 'struct main(int, char**)::sigaction'
  883 |         struct sigaction sigint_action;
make: *** [Makefile:195: chat] Error 1
1

t1_jcufsqf wrote

I'm getting a new error

C:\Users\ninja\source\repos\alpaca.cpp>make chat
process_begin: CreateProcess(NULL, uname -s, ...) failed.
process_begin: CreateProcess(NULL, uname -p, ...) failed.
process_begin: CreateProcess(NULL, uname -m, ...) failed.
'cc' is not recognized as an internal or external command, operable program or batch file.
'g++' is not recognized as an internal or external command, operable program or batch file.
I llama.cpp build info:
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC
I LDFLAGS:
I CC:
I CXX:
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp ggml.o utils.o -o chat
process_begin: CreateProcess(NULL, g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp ggml.o utils.o -o chat, ...) failed.
make (e=2): The system cannot find the file specified.
Makefile:195: recipe for target 'chat' failed
make: *** [chat] Error 2
1

t1_jcy7nxg wrote

It should be faster than one word per second. Judging by the fact that modern PCs run it at five words per second and a Raspberry Pi 4B runs it at one word per second, it should land somewhere around the 2.5 words per second mark.

1

t1_jcy83gf wrote

I use swap too. For now, it can only run on flagships, though. You need at least 8 GB of RAM, because running it on, say, 3 GB of RAM (with another 3 GB used by the system) plus 3-5 GB of swap may not even be possible, and if it is, it will be very slow and prone to crashing.

1

t1_jczly8z wrote

Hello, I've recently got alpaca.cpp running on my laptop, but I want to give it a context window so that it can remember conversations, and to make it voice-activated using Python. Can someone guide me on this?
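On the context-window part of this question: a common approach is to keep a running transcript of the conversation, prepend it to each new prompt, and drop the oldest turns once it exceeds the model's context budget. A toy sketch of that idea (trimming by character count; a real version would count tokens, and the class name here is made up for illustration):

```cpp
#include <cassert>
#include <deque>
#include <string>

// Keeps the most recent conversation turns within a size budget,
// dropping the oldest turns first so the prompt fits the context window.
class Transcript {
public:
    explicit Transcript(size_t max_chars) : max_chars_(max_chars) {}

    void add_turn(const std::string& turn) {
        turns_.push_back(turn);
        total_ += turn.size();
        // Evict oldest turns until we fit (always keep the newest one).
        while (total_ > max_chars_ && turns_.size() > 1) {
            total_ -= turns_.front().size();
            turns_.pop_front();
        }
    }

    // Concatenate the surviving turns into the prompt prefix.
    std::string prompt() const {
        std::string out;
        for (const auto& t : turns_) { out += t; out += '\n'; }
        return out;
    }

private:
    size_t max_chars_;
    size_t total_ = 0;
    std::deque<std::string> turns_;
};
```

Each time the user speaks, you would `add_turn()` their line and the model's reply, then feed `prompt()` plus the new question back into the model; the voice-activation part is a separate speech-to-text layer in front of this loop.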

1

t1_jddjvvn wrote

Is there an APK out there to sideload? It would be fun to try on my Pixel 6 Pro without becoming an expert in going through the motions of the make stuff...

1