9 Comments
Chris

I have a counterpoint to Wiener's claim that I wrote up a while back. Basically, there are things that an entirely additive loss function cannot express.

https://weary-travelers.gitlab.io/posts/ideas/non-additive-losses/idea.html
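As a minimal sketch of the distinction (not taken from the linked post; the data below are made up for illustration): an additive loss averages per-example terms, while quantities like the worst-case error or AUC depend on all predictions jointly and cannot be written as such an average.

```python
import numpy as np

# Illustrative sketch: an additive loss has the form
# (1/n) * sum_i loss(y_i, yhat_i) and decomposes over examples.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=100)      # binary labels
scores = rng.normal(size=100) + y     # scores, higher for positives
preds = scores > 0.5

# Additive: average 0-1 loss, a per-example sum.
avg_loss = np.mean(preds != y)

# Non-additive: worst-case loss depends on the whole sample jointly;
# no reweighted per-example average reproduces it.
max_loss = np.max(preds != y)

# Non-additive: AUC is a function of pairs of examples (rankings),
# so it cannot be written as a sum of per-example terms either.
auc = np.mean(scores[y == 1][:, None] > scores[y == 0][None, :])

print(avg_loss, max_loss, auc)
```

The point being that training pipelines built around averaging per-example terms handle the first quantity naturally, but not the latter two.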

Cagatay Candan

Average loss is linked to probability through the law of large numbers.

Consider the many applications where this holds, say in communications: suppose your WiFi router operates at 50 Mbit/s; that is, there are 50 million bit transmissions taking place every second. This is the regime of the law of large numbers and the relative-frequency interpretation of probability. It does not make sense to exert special effort for the 100th or 200th bit in this application. (There is also a feedback mechanism, called ARQ, that requests retransmission of mis-delivered bits. Things work, as you know!) Since all bits are of identical value, which is indeed the case after source coding, you only care about improving the average number of transmission errors per second, which is the probability-of-error metric in communications. Hence, in this regime of repeated trials and experiments, targeting the average behaviour makes sense; the same holds in the insurance business, in the hub optimization of FedEx packages, and so on.
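As a minimal numerical sketch of this regime (the per-bit error probability below is an assumed illustrative value, not a real WiFi figure):

```python
import numpy as np

# Sketch of the law-of-large-numbers regime described above.
rng = np.random.default_rng(0)
n_bits = 50_000_000    # ~one second of transmissions at 50 Mbit/s
p_error = 1e-3         # assumed per-bit error probability

# Total errors in one second of i.i.d. bit transmissions.
n_errors = rng.binomial(n_bits, p_error)

# Empirical average 0-1 loss: essentially equal to p_error.
print(n_errors / n_bits)
```

With n this large, the standard deviation of the empirical rate is about sqrt(p(1-p)/n), roughly 4.5e-6 here, so the average loss is a near-perfect stand-in for the error probability.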

The meaning and connotation of the probability of an airplane accident are not the same for the insurance company, the airline, or the passengers.

I have also read the blog at the link with great interest. The word "habituation" there suits current example-supervised AI training very well, IMO. When I was teaching, I tried to avoid working many examples in class; instead I focused on the main content and directed students to the examples for self-discovery after the theory. Many instructors taught topics through examples, and books differ significantly on this too, from the Schaum's outlines to the yellow perils of mathematics. The blog reminded me of that. After so many years, I cannot say that learning through examples does not work.

Jess Grogan

Thank you for sharing; I found this useful to think about. That said, I don't think it's a good enough counterpoint to Wiener's claim. My reasoning: just because there's a specific mode of human reasoning that an average loss function doesn't exactly replicate doesn't mean that optimizing an average isn't the most effective way for machines to learn the problems we need them to solve in order to help society.

Alex Tolley

You might want to use cheap AI to correct spelling. "Evaluting" stood out like a sore thumb.

Actually, I thought teh slide deck seemed quite understandable, at least at first glance.

Ben Recht

It would be helpful if you told me where the typos occurred. That error doesn't occur in this post.

also: teh.

Alex Tolley

Slide 17.

"teh" is a long-time muscle coordination problem. I usually have to do a "Find and Replace" to correct that error in texts.

Hostile Replicator

If we’re being pedentic, the final paragraph contains a “you’re” that should be a “your”.

Anyway, excited to follow along with these notes and especially the reports of the discussions that arise in your classes!

Ben Recht

Fixed. I'm always happy for people to call out typos! Just tell me where they are.

also: pedentic -> pedantic. :)

Hostile Replicator

Aha! You fell into my trap of pedantically correcting my deliberate typo of pedantic!

(It’s the little things in life)
