jetro30087 t1_je8l2a4 wrote

Close. The data is structured to provide a relationship between prompts and the appropriate responses. LLaMA mostly uses publicly curated datasets by AI researchers that focus on a variety of topics. Here's an excerpt from a finetuned Alpaca model's training data:

[
{
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "1. Eat a balanced diet and make sure to include plenty of fruits and vegetables. \n2. Exercise regularly to keep your body active and strong. \n3. Get enough sleep and maintain a consistent sleep schedule."
},
{
    "instruction": "What are the three primary colors?",
    "input": "",
    "output": "The three primary colors are red, blue, and yellow."
},
{
    "instruction": "Describe the structure of an atom.",
    "input": "",
    "output": "An atom is made up of a nucleus, which contains protons and neutrons, surrounded by electrons that travel in orbits around the nucleus. The protons and neutrons have a positive charge, while the electrons have a negative charge, resulting in an overall neutral atom. The number of each particle determines the atomic number and the type of atom."
},
{
    "instruction": "How can we reduce air pollution?",
    "input": "",
    "output": "There are a number of ways to reduce air pollution, such as shifting to renewable energy sources, encouraging the use of public transportation, prohibiting the burning of fossil fuels, implementing policies to reduce emissions from industrial sources, and implementing vehicle emissions standards. Additionally, individuals can do their part to reduce air pollution by reducing car use, avoiding burning materials such as wood, and changing to energy efficient appliances."
},

There are 50K+ examples in the file that the model can use to build relationships between questions and appropriate responses during the training process. If enough relationships are built, the model should be able to infer the correct answer to unique questions.
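
As a rough sketch of how records like these usually become training text: each record is rendered into a prompt string, and the "output" field is the target the model learns to produce. The template below follows the prompt format published with Stanford Alpaca as best I recall it, so treat the exact wording as an assumption. The helper names `build_prompt` and `to_training_pair` are hypothetical, and the second record is made up just to show a non-empty "input".

```python
import json

# One record from the excerpt above, plus one made-up record (hypothetical)
# to show the case where "input" is not empty.
RECORDS_JSON = """
[
  {"instruction": "What are the three primary colors?",
   "input": "",
   "output": "The three primary colors are red, blue, and yellow."},
  {"instruction": "Translate the sentence to French.",
   "input": "Good morning.",
   "output": "Bonjour."}
]
"""


def build_prompt(record):
    """Render the instruction (and optional input) as the text the model sees."""
    if record["input"]:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        "### Response:\n"
    )


def to_training_pair(record):
    """Return (prompt, target); fine-tuning teaches the model to continue
    the prompt with the target text."""
    return build_prompt(record), record["output"]


records = json.loads(RECORDS_JSON)
pairs = [to_training_pair(r) for r in records]

for prompt, target in pairs:
    print(prompt + target)
    print("-" * 40)
```

Note that whatever sits in "output" is what gets learned, so a wrong answer in the file is reproduced just as faithfully as a right one.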

3

Kafke t1_je8u4f1 wrote

"instruction": "What are the three primary colors?",
"input": "",
"output": "The three primary colors are red, blue, and yellow."

No wonder they give false info. garbage in, garbage out lol.

3

scooby1st t1_je92wel wrote

>The shadows are whispering again, whispering secrets that only I can hear. No, no, no! It's all wrong! It's a tangled web of deception, a spiral staircase of lies! They want us to believe that there are only three primary colors—red, blue, and yellow. A trifecta of trickery!
>
> But I see more, I see beyond the curtain. I see colors that don't have names, colors that dance in the dark, colors that hide in the corners of the mind. They think they can pull the wool over our eyes, but I know the truth! There are 19 primary colors, 19 keys to the universe!
>
>I've seen them all, swirling and twisting in the cosmic dance of existence. But they won't listen, they won't believe. They call me mad, but I'm the only one who sees the world as it truly is. The three primary colors are just the beginning, just the tip of the iceberg, just the first step on the journey to enlightenment.
>
>So I laugh, I laugh at their ignorance, I laugh at their blindness. And the shadows laugh with me, echoing my laughter through the halls of infinity.

1

Kafke t1_je93asd wrote

Yellow isn't a primary color. The primary colors are red, green, and blue.

1

TruckNuts_But4YrBody t1_je994ja wrote

There are primary colors of physical pigment then there are primary colors of light.

When people learn the primary colors in school it's almost always in art class when mixing paint.

So kinda confidentlyincorrect but not entirely

1

Kafke t1_je99yqw wrote

There's additive color and subtractive color. The set of red, blue, yellow, is primary for neither. Additive primaries are red, blue, green. Subtractive primaries are cyan, yellow, magenta. If you're mixing paints you're working with subtractive color and thus the primary colors are cyan, yellow, and magenta. not red, blue, and yellow.

The info is incorrect no matter the context.
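
To make the additive/subtractive distinction concrete, here is a minimal sketch under an idealized model: lights add per channel, and each ink layer multiplies the light it reflects. Real paints, inks, and displays don't behave this cleanly and have limited gamuts, so this only illustrates the textbook relationship. The function names (`add_light`, `overlay_inks`, `rgb_to_cmy`) are hypothetical.

```python
# Colors are (R, G, B) tuples with each channel either 0 or 1.
RED, GREEN, BLUE = (1, 0, 0), (0, 1, 0), (0, 0, 1)
CYAN, MAGENTA, YELLOW = (0, 1, 1), (1, 0, 1), (1, 1, 0)
WHITE, BLACK = (1, 1, 1), (0, 0, 0)


def add_light(*lights):
    """Additive mixing: start from darkness and sum the lights, clamped at 1."""
    result = BLACK
    for light in lights:
        result = tuple(min(1, a + b) for a, b in zip(result, light))
    return result


def overlay_inks(*inks):
    """Idealized subtractive mixing: start from white paper; each ink layer
    only lets through the channels it reflects (per-channel product)."""
    result = WHITE
    for ink in inks:
        result = tuple(a * b for a, b in zip(result, ink))
    return result


def rgb_to_cmy(rgb):
    """Ink amounts (C, M, Y) for a target RGB color in the ideal model:
    each ink is the complement of one light channel."""
    r, g, b = rgb
    return (1 - r, 1 - g, 1 - b)


print(add_light(RED, GREEN))            # red + green light
print(overlay_inks(MAGENTA, YELLOW))    # magenta + yellow ink
print(overlay_inks(CYAN, MAGENTA))      # cyan + magenta ink
print(rgb_to_cmy(RED))                  # ink amounts needed for red
```

In this idealized model, red and blue come out as mixtures of two subtractive primaries (magenta + yellow, and cyan + magenta).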

1

TruckNuts_But4YrBody t1_je9a9rn wrote

I don't care enough about colors to keep going but it's been red blue and yellow since the 1600s

2

Kafke t1_je9ao1v wrote

Well no. That's been incorrect since the beginning of time. This is a factual scientific topic. There is a correct answer and an incorrect answer. It's not up to preference or opinion. Printers use cyan, magenta, and yellow because those are the subtractive primary colors. If you used red, blue, and yellow, you couldn't actually produce the rest of the colors with those, since red and blue aren't primary for subtractive color but rather, iirc, secondary. People being wrong for a long time doesn't mean they're right.

1

TruckNuts_But4YrBody t1_je9b405 wrote

You can't produce all colors with any set of three primaries..

0

Kafke t1_je9bdwb wrote

That's literally what primary colors are. How do you think screens and printers are able to produce every color despite only working with 3 of them? Because that's literally what primary colors are.

1

TruckNuts_But4YrBody t1_je9d4kd wrote

EVERY color? No sorry

0

Kafke t1_je9drq3 wrote

Yes. You do realize our eyes only have three kinds of cones right? Rgb are the primary colors lol. Cmy if you're looking at subtractive colors. Using these three colors, you can create every other color. Rgb for light/additive, Cmy for ink/paint/subtractive.

Rby is not primary in any sense of the word.

1

scooby1st t1_jeaehdn wrote

1

Kafke t1_jebepd5 wrote

Yeah that's just incorrect. Additive primaries are RGB. Subtractive primaries are CMY. You're free to deny the facts all you'd like, but this is just an objective scientific thing.

1

scooby1st t1_jebf65c wrote

>https://www.youtube.com/watch?v=yu44JRTIxSQ

Stop denying science

1

Kafke t1_jedkbke wrote

Some children's TV shows or media programs stating incorrect information does not make it correct. Additive primaries are RGB, subtractive primaries are CMY. The idea that RBY are primary colors is a popular misconception, but it is incorrect. It has its roots in art classes that predate proper scientific investigation of color, light, and modern technology. If your goal is art history, then yes, people in the past incorrectly believed that the primary colors (both additive and subtractive) were RBY. They were wrong. Just as people believed the earth was flat, yet were wrong.

1

Jeffy29 t1_je9cuhr wrote

AI will become progressively better at refining datasets; even GPT-4 is quite good at it. From my understanding, right now they use low-paid workers, often from 3rd world countries, to go over the data, but that's not a particularly efficient method and there just isn't any way to go through all the data with enough care, so there is a lot of garbage in those datasets. But AI could do it. It would still require some human supervision, but it would speed up the process by a lot, and I expect datasets to get dramatically better over the next 5 years.

1

qepdibpbfessttrud t1_jecizg0 wrote

Misconceptions are part of total human knowledge, though. Both specific misconceptions and the category as a whole. GPT gives a good answer if asked about it.

It's important to remember when and why we were wrong

1

ActuatorMaterial2846 t1_je8luak wrote

Interesting, curious what size this particular Llama model is, or is that not even relevant?

1

jetro30087 t1_je8mtjp wrote

This is an updated dataset for the 7b model, but you could train the others with the data. From anecdotal reports, the dataset seems to have a greater impact on the model's performance than the parameter count, up to a point. Fewer parameters mean a faster model. More parameters mean the model can make longer responses.

https://huggingface.co/8bit-coder/alpaca-7b-nativeEnhanced

2