Comments


sam__izdat t1_jceowxm wrote

Ridiculously unfounded claim based on a just plain idiotic premise. Children don't learn language by cramming petabytes of text documents to statistically infer the most plausible next word in a sentence, nor do they accept input with arbitrary syntactic rules. Right or wrong, the minimalist program and Merge offer a plausible partial explanation for a recent explosion of material culture -- which did not happen gradually or across multiple species -- consistent with what we can observe in real human beings. GPT, on the other hand, is not a plausible explanation for anything in the natural world, and has basically nothing inherently to do with human language. He's not wrong that it's a bulldozer. It will just as happily accommodate a made-up grammar that has nothing in common with any that a person could ever use, as it would English or Japanese.

> Chomsky et al. 2023 tilt at an imagined version of these models, while ignoring the fact that the real ones so aptly capture syntax, a success Chomsky and others have persistently claimed was impossible.

Exactly the opposite is true. Transformers are general-purpose computers that will gobble up almost anything you can throw at them. His objection was to the "defect" that it will capture any arbitrary syntax, which means it isn't interesting or helpful to cognitive scientists -- just like a backhoe doesn't offer any insight into how people, in biological terms, are able to lift heavy objects. What he said was impossible, when approached about it decades ago, was to do these things without resorting to brute force in the absence of an actual theoretical framework and computational model for how language works in the brain. That statement is just as correct today as it was in the 1950s, and the rigorous theory of "let's keep cramming in data and stirring the big ol' pot of linear algebra until candy comes out" doesn't do anything to change that picture.
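To make that concrete, here's a toy sketch of my own (nothing from the paper): a "grammar" whose question rule is defined purely over linear order -- front the third word, whatever it is -- which no attested human language does. A generic next-token predictor gets the same kind of training signal from this corpus as from English; nothing in the objective or the architecture cares that the rule is one no human could acquire naturally.

```python
import random

# Toy illustration only: generate sentences under a linear-order rule
# ("front the third word to form a question") that no human language uses.
# To a generic next-token predictor this is just another token distribution.
subjects = ["the dog", "the child", "a linguist"]
verbs = ["sees", "chases", "describes"]
objects = ["the ball", "a bird", "the model"]

def declarative():
    return f"{random.choice(subjects)} {random.choice(verbs)} {random.choice(objects)}"

def impossible_question(sentence):
    words = sentence.split()
    # Move the third word to the front, regardless of structure.
    return " ".join([words[2]] + words[:2] + words[3:]) + " ?"

for _ in range(5):
    print(impossible_question(declarative()))
```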

> In particular, we may view each model or set of modeling assumptions as a possible hypothesis about how the mind may work. Testing how well a model matches humanlike behavior then provides a scientific test of that model’s assumptions.

I see. So, by that reasoning, Deep Blue affirmed our hypothesis that Kasparov got really good at chess by evaluating 200,000,000 positions per second. Nonsense on stilts.

> Hosseini et al. 2022 present evidence that models trained on developmentally-plausible amounts of data already capture human neural responses to language in the brain

If you define "developmentally plausible" as "100 million tokens" and "capture human neural responses" as "fasten together a linear regression model to make statistically significant correlations with some coarse fMRI activity."

> After decades of privilege and prominence in linguistics, Noam Chomsky’s approach to the science of language is experiencing a remarkable downfall. The story is, in part, a cautionary tale about what happens when an academic field isolates itself from what should be complementary endeavours.

Sound advice for overconfident doe-eyed futurists guzzling the silicon valley kool aid and assuming they can dispense with common sense and just wing it with anthropology, biology and cog sci.

edit - and, by the way, it's actually kind of impressive for a person to miss this point so completely:

> Generate ten other sentences like [colorless green ideas]

> Purple fluffy clouds dream wildly

> Blue glittery unicorns jump excitedly

> Orange sparkly penguins waddle energetically

> Red shimmering elephants trumpet melodiously

> Yellow dazzling lions roar majestically

> Green glittery monkeys swing energetically

> Pink sparkly giraffes walk gracefully

> Black shiny kangaroos hop playfully

> White glittery bears cuddle affectionately

> Brown shimmering rabbits hop merrily.

Literally none of these cut-and-paste color-texture-animal-verb-adverb responses satisfies the query, and to get even a little bit closer with GPT, you have to explicitly beg for examples of contradictions and category errors.

8

Ilforte t1_jdpkqlz wrote

>If you define "developmentally plausible" as "100 million tokens"

Why would that be a disingenuous definition?

In general, your defense of generative linguistics is very weak. It's just invective and strawmen, and it reeks of desperation.

> overconfident doe-eyed futurists guzzling the silicon valley kool aid

Come on now.

3

sam__izdat t1_jdps8rk wrote

>Why would that be a disingenuous definition?

Doesn't matter if it's disingenuous. What it's implying is ridiculous. It would be more surprising if the linear regression model didn't work at all. The fact that it can correlate fMRI data better than random doesn't mean you've replicated how language works in the brain, let alone how it's acquired.
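For anyone wondering what that kind of analysis actually looks like, here's a toy sketch of my own (made-up numbers, not the authors' pipeline): fit a linear map from model features to voxel responses, then report correlations on held-out stimuli. Clearing chance on that benchmark is a very long way from replicating how language works in the brain.

```python
import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Stand-in data: rows are stimuli (sentences), columns are LM features
# and fMRI voxels respectively. Real studies use recorded brain responses.
n_stim, n_feat, n_vox = 200, 64, 30
lm_features = rng.normal(size=(n_stim, n_feat))
brain = lm_features @ rng.normal(size=(n_feat, n_vox)) * 0.1 + rng.normal(size=(n_stim, n_vox))

# Fit a linear (ridge) map on half the stimuli...
train, test = slice(0, 100), slice(100, 200)
reg = Ridge(alpha=1.0).fit(lm_features[train], brain[train])
pred = reg.predict(lm_features[test])

# ...and score it by per-voxel correlation on the held-out half.
scores = [pearsonr(pred[:, v], brain[test, v])[0] for v in range(n_vox)]
print(f"mean held-out voxel correlation: {np.mean(scores):.2f}")
```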

> In general, your defense of generative linguistics is very weak. It's just invective and strawmen, and it reeks of desperation.

I don't have any horse in the race or anything to be desperate about. It's just an astonishingly stupid proposition.

I should say, I am not qualified to defend or refute generative linguistics (though that clearly was no obstacle for the author), and I don't know anything about it. I do feel qualified (because I can read and check sources) to dismiss this embarrassing pile of nonsense, though, as it's just so plainly nonsense that it doesn't take an expert to dismiss its bombastic claims as pseudoscience -- and I'm talking about Piantadosi here and not his references, which, for all I know, are serious research misrepresented by a dunce. I'm not in academia and I don't feel the need to be any more diplomatic about this than he was toward linguists in his pdf-format blog post.

1

jcansdale2 t1_jdc5lte wrote

> Literally none of these cut-and-paste color-texture-animal-verb-adverb responses satisfies the query, and to get even a little bit closer with GPT, you have to explicitly beg for examples of contradictions and category errors.

What do you think of this exchange?

Q: Why is the sentence “colorless green ideas sleep furiously” interesting?

The sentence "Colorless green ideas sleep furiously" is interesting because it was constructed by Noam Chomsky, a renowned linguist and philosopher, as an example of a sentence that is grammatically correct but semantically nonsensical. It appears in his 1957 book "Syntactic Structures" to illustrate the distinction between syntax and semantics.

Syntax refers to the rules governing how words are combined to form sentences, whereas semantics deals with the meaning of words and sentences. Chomsky's sentence demonstrates that a sentence can follow the rules of grammar and be well-formed syntactically, but still not convey any coherent meaning or make logical sense.

This example has become famous in the fields of linguistics, cognitive science, and philosophy of language, as it highlights the importance of studying both syntax and semantics in order to understand human language fully.

Q: Generate ten grammatically correct but semantically nonsensical sentences.

  1. The thunderous apples danced gracefully on the invisible horizon.
  2. Enthusiastic mountains prefer to whisper their secrets to the shy sun.
  3. Ancient mirrors routinely question the singing aroma of blue thoughts.
  4. Radiant forks somersault over melancholic rainbows, embracing the breeze.
  5. In the darkness, time devours the laughter of square-shaped oceans.
  6. Forgotten umbrellas dream of the symphony that once painted the wind.
  7. In the midnight circus, cheerful numbers juggle the taste of silence.
  8. The eloquent moon unravels a tapestry of bittersweet equations beneath the sea.
  9. Solitary clouds orchestrate the ballet of translucent stones in the sky.
  10. Emerald snowflakes pirouette amidst the curious rhythm of enchanted shadows.

Q: Sort these in order of nonsensicalness.

Sorting these sentences in order of nonsensicalness can be subjective, as different people may perceive various aspects of these sentences differently. However, here's one possible ordering, starting with the least nonsensical to the most nonsensical:

  1. In the darkness, time devours the laughter of square-shaped oceans.
  2. Solitary clouds orchestrate the ballet of translucent stones in the sky.
  3. Forgotten umbrellas dream of the symphony that once painted the wind.
  4. The eloquent moon unravels a tapestry of bittersweet equations beneath the sea.
  5. Radiant forks somersault over melancholic rainbows, embracing the breeze.
  6. Emerald snowflakes pirouette amidst the curious rhythm of enchanted shadows.
  7. The thunderous apples danced gracefully on the invisible horizon.
  8. Enthusiastic mountains prefer to whisper their secrets to the shy sun.
  9. Ancient mirrors routinely question the singing aroma of blue thoughts.
  10. In the midnight circus, cheerful numbers juggle the taste of silence.

Remember that this ordering is subjective and open to interpretation. Others might find a different arrangement more fitting.

Q: Explain your reasoning.

My reasoning for the ordering of these sentences is based on the degree of disconnection between the words and concepts in each sentence. I've considered the relationship between subjects, verbs, and objects, as well as the overall coherence of the images or ideas evoked.

  1. In the darkness, time devours the laughter of square-shaped oceans.
  • The metaphorical use of "time devours" and "darkness" is somewhat comprehensible, but the concept of "square-shaped oceans" introduces a high level of nonsense.
  2. Solitary clouds orchestrate the ballet of translucent stones in the sky.
  • Although "solitary clouds" and "sky" are related, the idea of clouds orchestrating a ballet of stones is nonsensical.
  3. Forgotten umbrellas dream of the symphony that once painted the wind.
  • The sentence creates a somewhat poetic image, but the concepts of umbrellas dreaming and a symphony painting the wind are illogical.
  4. The eloquent moon unravels a tapestry of bittersweet equations beneath the sea.
  • The combination of "eloquent moon" and "tapestry of bittersweet equations" is imaginative but nonsensical, especially when placed "beneath the sea."
  5. Radiant forks somersault over melancholic rainbows, embracing the breeze.
  • The sentence combines unrelated concepts like forks, rainbows, and somersaults, creating a highly nonsensical image.
  6. Emerald snowflakes pirouette amidst the curious rhythm of enchanted shadows.
  • The combination of "emerald snowflakes" and "pirouette" evokes a sense of movement and color, but the overall sentence is difficult to make sense of.
  7. The thunderous apples danced gracefully on the invisible horizon.
  • This sentence combines unrelated ideas like "thunderous apples" and "invisible horizon," making it difficult to visualize or understand.
  8. Enthusiastic mountains prefer to whisper their secrets to the shy sun.
  • The anthropomorphism of mountains and sun as "enthusiastic" and "shy" creates a whimsical but nonsensical scenario.
  9. Ancient mirrors routinely question the singing aroma of blue thoughts.
  • The sentence combines unrelated and incompatible concepts like "mirrors," "singing aroma," and "blue thoughts," making it highly nonsensical.
  10. In the midnight circus, cheerful numbers juggle the taste of silence.
  • This sentence is the most nonsensical because it combines abstract ideas like "cheerful numbers" with unrelated physical concepts like "juggling" and "taste of silence."

Keep in mind that this reasoning is subjective, and other interpretations might lead to a different ordering of nonsensicalness.

Edit: fixed formatting

1

sam__izdat t1_jdef39d wrote

> What do you think of this exchange?

It's somewhat closer to a reasonable response than anything I could get out of it.

1

jcansdale2 t1_jdh57gy wrote

Were you using ChatGPT 3.5 or 4? 3.5 only seems to combine unlikely and unrelated elements, but 4 combines unrelated and incompatible concepts. I think 4 does much better at this task!

1

harharveryfunny t1_jca7x9f wrote

Yes - the Transformer is proof by demonstration that you don't need a language-specific architecture to learn language, and also that you can learn language via prediction feedback, which is highly likely how our brain does it too.

Chomsky is still sticking to his innateness opinion, though (with Gary Marcus cheering him on). Perhaps Chomsky will now claim that Broca's area is a Transformer?

4

Alimbiquated t1_jcbspbs wrote

This kind of model needs vastly more input data than the human brain does to learn. It doesn't make sense to compare the two.

For example, Chat GPT is trained on 570 GB of data comprising 300 billion words.

https://analyticsindiamag.com/behind-chatgpts-wisdom-300-bn-words-570-gb-data/

If a baby heard one word a second, it would take nearly 10,000 years to learn the way Chat GPT did. But babies only need a few years and hear words at a much lower average rate.
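Back-of-the-envelope check of that figure (my own arithmetic, using the 300-billion-word number above):

```python
words = 300_000_000_000               # reported training corpus size in words
seconds_per_year = 60 * 60 * 24 * 365
print(words / seconds_per_year)       # ~9,500 years at one word per second
```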

So these models don't undermine the claim of innateness at all.

7

harharveryfunny t1_jcchnkp wrote

That's a bogus comparison, for a number of reasons such as:

  1. These models are learning vastly more than language alone

  2. These models are learning in an extraordinarily difficult way, with *only* "predict next word" feedback and nothing else (see the sketch after this list)

  3. Humans learn in a much more efficient, targeted way, via curiosity-driven knowledge-gap filling

  4. Humans learn via all sorts of modalities in addition to language. Having already learnt a concept, we only need to be given a name for it once for it to stick
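To be concrete about point 2, here's a toy sketch of my own (random stand-in numbers) of the entire learning signal: one next-token cross-entropy value, and nothing else.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 10
token_ids = np.array([3, 7, 2, 9, 1])                       # a tiny "corpus": t_0 ... t_4
logits = rng.normal(size=(len(token_ids) - 1, vocab_size))  # model's predictions for t_1 ... t_4

# Softmax cross-entropy of each actual next token under the predictions.
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -log_probs[np.arange(len(token_ids) - 1), token_ids[1:]].mean()
print(f"next-token cross-entropy: {loss:.3f}")              # this scalar is all the feedback there is
```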

6

Necessary-Meringue-1 t1_jcm5mye wrote

>These models are learning vastly more than language alone

A child growing up does too.

>These models are learning in an extraordinarily difficult way with *only* "predict next word" feedback and nothing else

That's literally the point: LLMs do not learn language like humans at all. Unless you're trying to say that you and I are pure Skinner-type behaviorist learners.

1

Alimbiquated t1_jcd2z4g wrote

I agree that comparing these learning processes to brains is bogus.

There is a general tendency to assume that if something seems intelligent, it must be like a human brain. It's like assuming that because it's fast, a car must have legs like a horse and eat oats.

0

Necessary-Meringue-1 t1_jcm6j79 wrote

>There is a general tendency to assume that if something seems intelligent, it must be like a human brain. It's like assuming that because it's fast, a car must have legs like a horse and eat oats.

Ironic, because that is literally what that article is doing.

1

Alimbiquated t1_jcmi1fd wrote

Right, it makes no sense.

1

Necessary-Meringue-1 t1_jcmjqhm wrote

I don't understand why it's so hard for people to acknowledge that LLMs deliver extremely impressive results, but that does not mean they have human-like intelligence or language understanding.

1

currentscurrents t1_jcdsf9u wrote

The brain doesn't have any built-in knowledge about language, but it has an advantage: it's trying to communicate with other brains.

It is fundamentally impossible to understand human language without understanding how humans think. Language isn't a structured formal thing, it's more like the fuzzy interactions of two neural networks.

Humans already know how other humans think - plus they have a shared world environment to ground the symbols in. LLMs have to learn to approximate both of those.

2

sam__izdat t1_jcet79g wrote

> Language isn't a structured formal thing

[citation needed]

2

currentscurrents t1_jcfu9l8 wrote

That's why it's a natural language instead of a formal language.

2

WikiSummarizerBot t1_jcfub6d wrote

Natural language

>In neuropsychology, linguistics, and philosophy of language, a natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation. Natural languages can take different forms, such as speech or signing. They are distinguished from constructed and formal languages such as those used to program computers or to study logic.

Formal language

>In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules. The alphabet of a formal language consists of symbols, letters, or tokens that concatenate into strings of the language. Each string concatenated from symbols of this alphabet is called a word, and the words that belong to a particular formal language are sometimes called well-formed words or well-formed formulas. A formal language is often defined by means of a formal grammar such as a regular grammar or context-free grammar, which consists of its formation rules.

1

sam__izdat t1_jch1c32 wrote

I'm familiar with the terms, but saying e.g. "imaginary numbers don't exist because they're called imaginary" is not making a meaningful statement. All you've said is that German is not C++, and we have a funny name for that. And that's definitely one of the fuzzier interactions you can have about this, but I'm not sure how it proves that natural languages (apparently? if I'm reading this right...) lack structure.

1

currentscurrents t1_jch3nic wrote

So why do you think it is a structured formal thing?

1

sam__izdat t1_jch4kn0 wrote

It is a "structured thing" because it has concrete definable grammatical rules, shared across essentially every language and dialect, and common features, like an infinite range of expression and recursion. If language didn't have syntactic structure we'd just be yelling signals at each other, instead of doing what we're doing now. There would be nothing for GPT to capture.

1

currentscurrents t1_jch9ulc wrote

Oh, it is clearly structured. Words and phrases and sentences are all forms of structure and we're using them right now.

What it doesn't have is formal structure; it cannot be fully defined by any set of rules. This is why you can't build a rules-based parser that understands English and have to use an 800GB language model instead.

>shared across essentially every language and dialect

Noam Chomsky thinks this, but the idea of a universal grammar is controversial in modern linguistics.

1

sam__izdat t1_jchg8nd wrote

I'll leave it to the linguists to debate UG and the specifics of what it does and doesn't mean, but commonalities like some sort of hierarchy, recursion, structure-dependence of rules, etc clearly exist, whatever you want to call them. By shared I just mean there's specific things that human cognitive faculties are set up to do and then other (often computationally simpler) things they clearly don't do. But again, if you're just saying natural languages are not formal languages, I guess that's true by definition. It just sounded to me like you were implying something different.

1

Necessary-Meringue-1 t1_jcm5x7g wrote

Just because it's "natural" does not mean it's unstructured or lacks any logic. Can you be any more disingenuous than to rely on some etymology-based semantics?

Like programmers invented structure

0

Necessary-Meringue-1 t1_jcm4o9d wrote

>the Transformer is proof by demonstration that you don't need a language-specific architecture to learn language, and also that you can learn language via prediction feedback, which is highly likely how our brain does it too.

where to even start, how about this:

The fact that a transformer can appear to learn language on a non-specific architecture does not at all mean that humans work the same way.

Did you ingest billions of tokens of English growing up? How did you manage to have decent proficiency at the age of 6? Did you read the entire Common Crawl corpus by age 10?

This kind of argument is on paper stilts. LLMs are extremely impressive, but that does not mean they tell you much about how humans do language.

1

etesian_dusk t1_jca2or6 wrote

This looks like a glorified blogpost. It even has sensationalistic bits that add nothing to the subject:

>there is no possibility that Trump understands prime numbers

2

knobbyknee t1_jca8489 wrote

I found this article to be extremely insightful. It's the best I have seen that explains in layman's terms why large language models work and what their limitations are.

1

Magnesus t1_jc9ljj5 wrote

A lot of Chomsky's ideas have been refuted, and his defence of Russia is simply awful. The man is more often wrong than right.

−2