Been following your course live blog with great interest, thank you so much for sharing! As a recovering academic economist, so much of what you share reads as a remedy for orthodox approaches to economic modelling, and ML's orientation towards learning and action resonates. Wanted to put Alex Imas's Substack on your radar: https://substack.com/@aleximas/note/p-182334603?r=qumb&utm_medium=ios&utm_source=notes-share-action. He shares an application of transformers to learning dynamics embedded in an economic model of the macroeconomy. Integrating ML into economic applications feels like a real step towards a new and more productive paradigm.
For fear of the Streisand effect, I was avoiding commenting, but I was subtweeting that post when I wrote "Economists now throw together random Python notebooks, call them AI, and declare revolution." I was quite dismayed it went as viral as it did.
I could share a more detailed critique if you'd like, but I'd be more interested in getting a sense of how you see ML playing a role in a new kind of economic thinking.
Lol... I left my comment before reading your whole post and felt squeamish after reading that. I agree that economists like to throw quantitative spaghetti at the wall and see what sticks.
I do see so much potential for a different way of thinking about how to deal with randomness/uncertainty through an ML lens as you describe it: learning to inform action via a feedback loop. Specifically, I work in building large portfolios of financial assets. Modern Portfolio Theory was truly an innovation when it was developed 70-odd years ago, but conceptually it really hasn't progressed much since then. A specific problem that comes up: how should we weight stocks and bonds in a portfolio when their correlation is ~0 on average but goes through long spells at 0.3 or -0.3? Traditional mean-variance optimization is really unhelpful here. Practitioners have developed intuitions and "methods" for incorporating this dynamic, but it feels very much like medieval medicine; we should be able to do better.
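To make the instability concrete, here is a tiny sketch (illustrative numbers only, not anyone's actual model) of how much unconstrained two-asset mean-variance weights move when the assumed stock-bond correlation swings between -0.3 and 0.3:

```python
import numpy as np

# Minimal sketch with assumed, illustrative parameters: two-asset
# mean-variance weights as a function of the stock-bond correlation.
mu = np.array([0.07, 0.03])   # assumed expected returns: stocks, bonds
vol = np.array([0.16, 0.06])  # assumed volatilities

def mv_weights(rho, risk_aversion=4.0):
    """Unconstrained mean-variance weights: w = (1/gamma) * Sigma^{-1} mu."""
    cov = np.array([[vol[0] ** 2, rho * vol[0] * vol[1]],
                    [rho * vol[0] * vol[1], vol[1] ** 2]])
    return np.linalg.solve(cov, mu) / risk_aversion

for rho in (-0.3, 0.0, 0.3):
    w = mv_weights(rho)
    print(f"rho={rho:+.1f}  stocks={w[0]:.2f}  bonds={w[1]:.2f}")
```

With these inputs the optimal allocations shift substantially across that correlation range, which is exactly why point estimates of correlation make mean-variance answers so fragile.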
More generally, your points about metrical determinism resonate strongly. There is probably a whole literature to be written exploring how different economic forecasting techniques and frameworks (thinking of IMF macroeconomic forecasts, or earnings forecasts from sell-side bank equity analysts) are rational responses to the success metrics used in those fields.
I'm still just scratching the surface of ML, and many of its perspectives are a breath of fresh air relative to classical statistical inference. One thing that is interesting about transformers is their potential to capture weak/distant relationships across time and assets, much like how words are connected in complex ways in text. That sounds very much like the texture of how financial markets have been connected in my observations over the years.
Lots of good points here. A couple of ML-related replies.
- I think the evidence that transformers capture long-term dependencies better than standard time-series models is very mixed. Transformers caught on in text because they were more efficient to train than recurrent models, not because they provably captured patterns that recurrent models couldn't. I'd recommend this talk by Ludwig Schmidt on this subtle point: https://simons.berkeley.edu/talks/ludwig-schmidt-university-washington-2023-08-18
- When your datasets are relatively small (i.e., not all of the text on the internet), transformers don't outperform simple models. Here are a couple of interesting papers that compare transformers to boring ARMA models and find that the latter often make better predictions:
https://ojs.aaai.org/index.php/AAAI/article/view/26317
https://arxiv.org/abs/2406.16964
- I totally agree with you that portfolio theory remains uncomfortably lacking. The same is true for many of the mathematical finance frameworks developed in the post-war period. I personally think it's the framing of investing as "mathematical risk management" that leaves us functionally impoverished when trying to manage complex dynamic systems like markets. Shoehorning everything into risk management is metrical determinism in action.
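As a toy illustration of the small-data point above (my own sketch, not from either paper): on a short series, even a hand-rolled AR(1) fit by least squares is a hard baseline to beat, which is the kind of "boring" model those comparisons use. The simulated data and parameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a short AR(1) series: x_t = 0.7 * x_{t-1} + noise
n = 200
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.normal(scale=0.5)

train, test = x[:150], x[150:]

# Fit phi by regressing x_t on x_{t-1} over the training window
phi = np.dot(train[1:], train[:-1]) / np.dot(train[:-1], train[:-1])

# One-step-ahead forecasts over the test window, vs. a naive mean forecast
prev = np.concatenate(([train[-1]], test[:-1]))
mse_ar = np.mean((test - phi * prev) ** 2)
mse_naive = np.mean((test - train.mean()) ** 2)
print(f"phi_hat={phi:.2f}  AR(1) MSE={mse_ar:.3f}  mean MSE={mse_naive:.3f}")
```

A transformer has to beat this two-parameter model on 150 observations before its extra machinery earns its keep; that is the bar the papers above find it often fails to clear.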
Happy new year! First off, I wanted to say that I really enjoy your Substack; you have a unique perspective on ML/AI that I always find thought-provoking.
I had one question that I wanted to get your thoughts on, as I have been thinking and wrestling with these kinds of questions myself.
Let's take the question of whether AI-enabled technologies make hiring more or less biased. There's an entire academic and popular literature that will immediately conclude that AI must make things more biased. But there's also an entire academic and popular literature showing that humans are quite biased as well. So it seems impossible to answer the question of whether AI resume-screening tools reduce or exacerbate bias from first principles alone. (I'm using hiring as an example here, but I think the basic structure holds for a lot of AI-in-society questions.)
If it's not possible to answer from first principles, then it seems to me that you really need some sort of RCT to answer the question. But it seems from your posts on statistical fatalism that you're skeptical that RCTs can really answer anything, since the conditions in an RCT do not reflect the real world (e.g., how the resume-screening tool is used in the RCT may not be representative of how companies actually use the tools, the Lucas critique, small effects needing huge samples, etc.).
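The "small effects need huge samples" point is easy to quantify with a back-of-envelope power calculation. The callback rates below are illustrative assumptions, not figures from any study:

```python
import math

def n_per_arm(p1, p2):
    """Approximate per-arm sample size for a two-proportion z-test
    (two-sided alpha = 0.05, power = 0.8, normal approximation)."""
    z_a, z_b = 1.96, 0.84  # z-values for alpha/2 = 0.025 and power = 0.8
    pbar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * pbar * (1 - pbar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Detecting a 1-percentage-point shift in a 10% callback rate
# requires on the order of 15,000 applicants per arm:
print(n_per_arm(0.10, 0.11))
```

An effect that small is still large enough to matter ethically, yet the trial needed to detect it dwarfs what most hiring experiments can recruit.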
From your perspective, is there a convincing (or at least semi-convincing) way of addressing questions like "does introducing algorithmic tool X to real-world setting Y make things better or worse"?