I see that Gary Marcus thinks the big AI companies are cheating on the benchmarks to make it look like there is continuing improvement. His latest post on OpenAI's o3 results all but calls them "liars".
As we might expect, evaluating AI performance is becoming as fuzzy as evaluating human performance, especially when cheating is barely detectable.
I'd love to take that class!
I think Canvas courses can be made "Public" in the settings, no?