Comments

MysteryInc152 OP t1_jcputc0 wrote

It uses relative positional encoding, so long context is possible in theory, but because it was trained on 2048 tokens of context, performance gradually declines beyond that. Fine-tuning for more context wouldn't be impossible, though.

You can run it in FP16 (13 GB RAM), or with 8-bit (10 GB) or 4-bit (6 GB) quantization.
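
For a rough sense of where those figures come from, here is a back-of-the-envelope sketch of the memory taken by the weights alone. It assumes roughly 6B parameters (the count mentioned in a reply further down) and 2, 1, and 0.5 bytes per parameter for FP16, 8-bit, and 4-bit; the function name `weight_memory_gb` is made up for illustration. The quoted figures are higher than these estimates, presumably because of activations, runtime overhead, and any layers left unquantized.

```python
# Approximate bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params, precision):
    """Return the memory (in GB, 1 GB = 1e9 bytes) used by the weights alone."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9


# Assumption: ~6B parameters, as stated elsewhere in the thread.
N_PARAMS = 6e9

for precision in ("fp16", "int8", "int4"):
    print(f"{precision}: ~{weight_memory_gb(N_PARAMS, precision):.1f} GB for weights")
```

This is a lower bound only; actual usage depends on sequence length, batch size, and the inference framework.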

36

BalorNG t1_jcqgc4x wrote

It has 6B parameters, but I bet it cannot answer what happened in Tiananmen Square in 1989 :3

71

username001999 t1_jcrn1aq wrote

We Americans live in a country where kids are regularly gunned down in school so we make ourselves feel better by making jokes about how much worse other countries are for events that happened over 30 years ago. Or we don’t even know our own history, like the Kent State Massacre.

−7

relevantmeemayhere t1_jcrotun wrote

Mm, not really.

Bootstrapping is used to determine the standard error of estimates using resampling. From here we can derive tools like confidence intervals, or other interval estimates.

Generally speaking you do not use the bootstrap to tweak the parameters of your model. You use cross validation to do so.
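
To make the distinction concrete, here is a minimal sketch of the bootstrap used the way described above: estimating the standard error of a statistic (here the sample mean) and a percentile confidence interval by resampling with replacement. The function name `bootstrap_standard_error`, the number of resamples, and the synthetic data are all illustrative assumptions, not taken from any particular library.

```python
import random
import statistics


def bootstrap_standard_error(sample, statistic=statistics.mean,
                             n_resamples=2000, seed=0):
    """Estimate the standard error and a 95% percentile interval of `statistic`."""
    rng = random.Random(seed)
    n = len(sample)
    # Recompute the statistic on many resamples drawn with replacement.
    estimates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # The spread of the resampled estimates approximates the standard error.
    std_error = statistics.stdev(estimates)
    # Percentile interval: cut off roughly 2.5% of estimates in each tail.
    lower = estimates[int(0.025 * n_resamples)]
    upper = estimates[int(0.975 * n_resamples) - 1]
    return std_error, (lower, upper)


# Illustrative data: 200 draws from a standard normal distribution.
data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]

se, (lo, hi) = bootstrap_standard_error(data)
print(f"mean = {statistics.mean(data):.3f}, bootstrap SE = {se:.3f}")
print(f"95% percentile interval: ({lo:.3f}, {hi:.3f})")
```

Nothing here tunes a model: the output is an uncertainty estimate for a fixed statistic. Choosing hyperparameters would instead involve holding out data, as in cross-validation.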

10

wyhauyeung1 t1_jcrvnxz wrote

I successfully deployed it on my local PC and ran it. Just wondering, where is the model file stored after install? I could not find any big files under the directory.

3

farmingvillein t1_jcsnx0f wrote

"open source".

That license, lol:

> You will not use, copy, modify, merge, publish, distribute, reproduce, or create derivative works of the Software, in whole or in part, for any commercial, military, or illegal purposes.

> You will not use the Software for any act that may undermine China's national security and national unity, harm the public interest of society, or infringe upon the rights and interests of human beings.

> This license shall be governed and construed in accordance with the laws of People’s Republic of China. Any dispute arising from or in connection with this License shall be submitted to Haidian District People's Court in Beijing.

What a nightmare.

40

xerca t1_jcsnz4j wrote

And derailing any topic that comes out of China into Tiananmen Square is not acting in bad faith? Especially given that the American company "Open"AI is heavily guarding and paywalling their models while this Chinese group is sharing theirs with the world for everyone to use.

Conflating anything that comes out of a country with 1.5 billion people with your incredibly shallow knowledge of history only serves to demonstrate your ignorance.

3

extopico t1_jcsuio8 wrote

What? No it’s not. Pointing out blatant whataboutism is always independently valid.

Why would you even write what you wrote? Is it a required riposte that’s included in your briefing file, or training?

4

BalorNG t1_jcsy0rl wrote

Technically, I'm from Russia.

And, of course, you are able to read every opinion about the "special military operation" here... sometimes even without a VPN. It is just that voicing a "different one" can get you years in prison and your kids into a foster home for reindoctrination. While the programmers who coded it might have a range of diverse opinions on this and other "politically sensitive" subjects, if they want their program to pass inspection in China, they WILL have to do considerable fine-tuning to throw away sensitive data, if the front page of our Russian Google (Yandex) is any indication. If this is a foundational model w/o fine-tuning, that's a different matter tho... but then it will hallucinate nonstop and produce "fakes" anyway...

0

sanxiyn t1_jcw2yoz wrote

On the other hand, a commercial-use restriction is not compatible with the generally accepted definition of open source, for example The Open Source Definition.

> 6) No Discrimination Against Fields of Endeavor. The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

7