9 Comments

Would you agree that our brain likely implements a simple language model -- something that can be relatively easily captured by statistics?

author

I know just enough neuroscience to conclude I have no idea what our brain is doing!


I mean it in a slightly different way. You said that "combining Shannon’s language models with Rosenblatt’s perceptron (1955) and some function approximation gives us our modern language model". That's a pretty simple statistical model, presumably. LLMs have linguistic competence comparable to (and in many ways higher than) that of, say, an average person in any given domain. Wouldn't that imply, or at least suggest, that the linguistic model running on our "wetware" is not any more complex?
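
(To make that concrete: below is a minimal sketch, in Python, of the Shannon-style half of that quote -- estimate next-character statistics from a corpus and sample a continuation. The toy corpus and function names are my own choices, not from the post; a modern LLM swaps the count table for a learned function approximator, which is the perceptron/function-approximation half.)

```python
import random
from collections import defaultdict

def train_char_bigrams(text):
    """Count next-character frequencies, Shannon-style."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def sample_continuation(counts, seed, length=40):
    """Extend the seed by repeatedly sampling the next character."""
    out = seed
    for _ in range(length):
        dist = counts.get(out[-1])
        if not dist:
            break
        chars, weights = zip(*dist.items())
        out += random.choices(chars, weights=weights)[0]
    return out

corpus = "the cat sat on the mat and the dog sat on the log "
model = train_char_bigrams(corpus)
print(sample_continuation(model, "the "))
```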

author

Or would it suggest our narrow and unsophisticated conceptualization of linguistic competence is less complex than what our brains actually do? I.e., doesn't this just confirm that the Turing Test is facile?

Aug 26, 2023 (edited) · Liked by Ben Recht

But isn't linguistic competence the main thing that separates us from animals? (There is also tool use, but arguably language is more fundamental.) The Turing test is the best thing we currently have. Btw, Turing did not think it was that difficult; he predicted it would be done by 2000 on computers with 10^9 bits of storage. Not a bad guess!

I don't see a reason to think our brain does more than what it appears to do. The lesson I take is that human intelligence is a relatively simple phenomenon, amenable to statistical modeling.

author

I disagree with all of this! But fortunately, the class doesn't depend on any functional analysis of human wetware.


Sorry, I cannot agree. We do not predict the next character; rather, we can figure out the continuation given a sufficient piece of text to start with. Figuring out is different from prediction. The game of 20 Questions is different from "heads or tails".

Intelligence is the ability to handle differences; its core algorithm is based on comparisons. All of that is enabled by comparable properties.

In language, references are stacked filters that differentiate the relevant objects from the context. Sentences follow formulas composed of constituents. Note that question words address constituents. Questions provide all the other constituents and ask for an unknown one. We store many sentences in our heads; answering questions is just comparing the provided constituents with those in the stored records.

The above process relies heavily on generalization, which is also based on differences and comparable properties, not on similarities. During specialization, we add differentiating factors to the parent class; during generalization, we ignore those factors.
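
To make the mechanism concrete, here is a minimal sketch in Python; the stored facts, slot names, and matching rule are illustrative assumptions on my part, not a claim about how the brain implements it. It answers a question by comparing the constituents the question supplies against stored records, and it "generalizes" by dropping the differentiating slots.

```python
# Illustrative only: facts stored as constituent slots (hypothetical names).
facts = [
    {"who": "Alice", "did": "wrote", "what": "the report"},
    {"who": "Bob", "did": "wrote", "what": "the memo"},
]

def answer(question):
    """A question supplies known constituents and names the unknown slot."""
    unknown = question.pop("ask")  # e.g. "who"
    for fact in facts:
        # Compare the provided constituents with the stored record...
        if all(fact.get(slot) == value for slot, value in question.items()):
            return fact[unknown]   # ...and read off the missing constituent.
    return None

print(answer({"ask": "who", "did": "wrote", "what": "the memo"}))  # Bob

# Generalization as "ignore the differentiating factors": drop the slots
# that distinguish the records and they collapse into one parent pattern.
parent = {tuple(sorted((k, v) for k, v in f.items() if k not in {"who", "what"}))
          for f in facts}
print(parent)  # {(('did', 'wrote'),)} -- a single generalized record
```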

Differences rule!


Thank you!!
