Much of the time I think the 'rush to quantify' is just a panicked kid grasping for the edge of the pool: it's an unfounded conviction that the numbers (and more numbers, always more) will *always* make for better decisions, one that just so happens to alleviate the existential terror of responsibility and trust, to oneself or others. That's not all bad! Some judgement is unfair and unfounded and, look, science! But of course there are diminishing (and negative) returns to data, just like everything else, and at this point I think the correct default response to 'we're going to revolutionize X through Stasi-esque levels of tracking' is 'eh, probs not.'
There's a whole tranche of science-scented people out there, engineers and technologists and the like, who seem to think that an insistence on measuring and tracking an objective marker is uniformly a *replacement* for irrational vibes, a wholly distinct and superior way of examining the world that swapped out the module of their brain that sought the word of God in burnt entrails. Just watching them, it's clear that it's just a new flavor: a way of getting all the relief that the world is unfolding as it should that you get from augury, but with a dollop of superiority on top, because they are doing things with *numbers*, and all the other kids were bad at numbers in school, but they weren't and so are Smart and Good.
Nice entry. As you say, ". . . facilitates posing clear questions and objectives, though crowds out nuance and multiplicity." From National Research Council, 1989, 1996: "Every way of summarizing deaths embodies its own set of values (National Research Council, 1989). For example, reduction in life expectancy treats deaths of young people as more important than deaths of older people, who have less life expectancy to lose. Simply counting fatalities treats deaths of the old and the young as equivalent; it also treats as equivalent deaths that come immediately after mishaps and deaths that follow painful and debilitating disease. Also in the case of delayed illness and death, a simple count of adverse outcomes places no value on what happens to exposed people who may spend years living in daily fear of illness, even if they ultimately do not die from the hazard.
Using number of deaths as the summary indicator of risk implies that it is as important to prevent deaths of people who engage in an activity by choice as it is to prevent deaths of those who bear its effects unwillingly. Thus, the death of a motorcyclist in an accident is given the same weight as the death of the pedestrian hit by the motorcycle. It also implies that it is as important to protect people who have been benefiting from a risky activity or technology as it is to protect those who get no benefit from it. One can easily imagine a range of arguments to justify different kinds of unequal weightings for different kinds of deaths, but to arrive at any selection requires a judgment about which deaths one considers most undesirable. To treat all deaths as equal also involves a judgment. In sum, even so simple and fundamental a choice as how to measure fatalities is value laden. It can present a dilemma in which no single summary measure, no matter how carefully the underlying analysis is done, can satisfy the expectations of all the participants in a risk decision process."
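To make the NRC point concrete, here is a minimal sketch of two summary measures ranking the same pair of hazards in opposite order. Everything in it is an illustrative assumption rather than anything from the report: the ages, the flat life expectancy of 80, and the names `fatality_count` and `years_of_life_lost` are all hypothetical.

```python
# Hypothetical ages at death for two hazards (made-up numbers).
hazard_a = [75, 78, 72]   # three deaths among older people
hazard_b = [20, 25]       # two deaths among younger people

LIFE_EXPECTANCY = 80  # assumed flat value, for illustration only


def fatality_count(ages_at_death):
    """Treats every death as equivalent."""
    return len(ages_at_death)


def years_of_life_lost(ages_at_death):
    """Weights each death by the remaining life expectancy it cut short."""
    return sum(max(LIFE_EXPECTANCY - age, 0) for age in ages_at_death)


for name, hazard in [("A", hazard_a), ("B", hazard_b)]:
    print(name, fatality_count(hazard), years_of_life_lost(hazard))
```

By count, hazard A looks worse (3 deaths against 2); by years of life lost, hazard B looks worse (115 against 15). Neither ranking is the neutral one, which is the quoted passage's point: picking the measure is already a value judgment.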
I have really enjoyed these posts on the violence done to nuance through quantification. Two points in your book seem so important for understanding where we have landed up: that if you can define a product (as opposed to a process), AI will probably get to it sooner rather than later, and that the need to quantify (and what can be quantified) has gone hand-in-hand with computing power. I thought that David Graeber's Malinowski lecture on structural violence in bureaucracies also got at this from a completely different angle... would love to hear your take on that.
Yes, probably no single book had more influence on "The Irrational Decision" than Graeber's "Utopia of Rules," and the 2nd chapter of that book draws heavily from his Malinowski Lecture. I fully endorse both his theory of structural violence and his assessment of why social scientists and institutionalists are uncomfortable engaging with it.
Indeed, this is probably what bugs me most about Nguyen's "The Score." He's asking the same questions Graeber was asking, yet refuses to deal with the long history of gameplay in human societies and the role of structural violence in hierarchical control. The central question of Nguyen's book is far better addressed by structural ethnography than analytic philosophy.
Being real, is there actually anything that's better addressed by analytic philosophy than structural ethnography? I certainly am not so sure
LOL, Zoe. You know I acknowledge and endorse your predilections.
https://press.princeton.edu/books/hardcover/9780691639079/measures-and-men
Kula's book is a good one to add to this list.
This is a great one! I’m curious to think more about the different kinds of work done by quantification vs by abstraction, which I see as the other major process that can be involved in mathematization (and certainly either one of these two can be present without the other).
What are your running examples of abstraction without quantification?
A clean example is encryption of text! And in general a lot of mathematical objects in cryptography involve abstraction without necessarily quantification
I very much agree with the point about needing shared language. One bit of vocabulary I keep wishing were more mainstream is the Type 1 vs Type 2 error intuition: any boundary is simultaneously too strict and too loose, just in different situations. Feels like a useful handle for forcing the trade-offs back into discussions and bringing back politics to anything that gets flattened by an "objective measure"
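A small sketch of that intuition, with made-up numbers: the `cases` list and the `errors_at` helper below are hypothetical, and the only assumption is that some true positives score lower than some true negatives, which is the usual situation.

```python
# Hypothetical (score, truly_positive) pairs. The two classes overlap:
# one true positive scores 0.4, one true negative scores 0.5.
cases = [(0.2, False), (0.4, True), (0.5, False), (0.7, True), (0.9, True)]


def errors_at(threshold):
    """Return (type_1, type_2) error counts when flagging score >= threshold."""
    false_positives = sum(1 for score, positive in cases
                          if score >= threshold and not positive)
    false_negatives = sum(1 for score, positive in cases
                          if score < threshold and positive)
    return false_positives, false_negatives


for threshold in (0.3, 0.45, 0.6):
    print(threshold, errors_at(threshold))
```

A loose boundary (0.3) gives one Type 1 error and no Type 2 errors; a strict one (0.6) gives the reverse; the middle value (0.45) gives one of each. No threshold gets both to zero, so choosing the boundary means choosing who bears which error.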
Congrats on posting a question rather than an answer; imagine if this was more common on the internet!
I offer some suggestions based on your specific question: "why do we always tend to side with “the data?”"
As you are no doubt aware (hence your scare quotes I guess) there is no such thing as “the” data. There is only the data you choose to collect and pay attention to. And often one chooses to declare the data to be a “fact.”
Facts are often taken as primitives of the world. Following Chaim Perelman (The New Rhetoric: A Treatise on Argumentation) I think otherwise: rather than saying “this here is a fact, and thus I do not need to question it or justify it” we could instead say “I do not wish to have to further justify this here thing, so I will give it a special name, and that name is fact”. For machine learning folks, data is always taken to be a fact.
You observe that quantification is based on counting, and of course to count first you need categories (this many apples, this many oranges). Funnily enough, one of the first books on statistics (John Venn’s The Logic of Chance, already in its 3rd edition in 1888) recognised the unavoidability of categorisation (this underpins the poorly named reference class “problem”). So I’d say that the problem of quantification is really better framed as a problem of categorisation. Then it is clear that there are many choices, and that the categorisation is not given by the world, but is a choice you make (categories are in our heads, not objective features of the world). The choice of categories is THE big question (I guess you know Bowker and Star’s work on this, Sorting Things Out: Classification and its Consequences).
Thus I do not think this is actually anything to do with computation or computers. Computers are merely the means. The big choice is what is counted and what is done with it. You can collect numbers on people just fine; it is only when you ordinalize (rank) that certain harms are done … so just don’t be a ranker (if you need a concrete example, think of professors and publication counts).
Regarding the trade-offs (technologies of quantification come with good and bad effects), well this is true of every technology, and the actuarial technologies are no different. The only difference is that disentangling who is harmed and who benefits is trickier; this comes down to how you aggregate (and essentially the reference class problem again). So rather than bemoaning that quantification always seems to win, I would say why not just treat this like we (should!) treat other technologies, and take care to analyse the harms and benefits (rather than naively presuming that just because there are some benefits we are somehow obliged to use it) and pay attention to who is harmed and who benefits (again, as with other technologies, it is usually not the same groups of people).
One final thought: as I argued (contrary to your stance on benchmarks, but oddly aligned with your recapitulation of Meehl on actuarial vs clinical), if you judge your technologies by narrow actuarial means then a narrow actuarial technology will come out on top. As I briefly mention in my "Rhetoric of Machine Learning" https://arxiv.org/abs/2604.06754 this is an example of “self-authentication”, a circular style of reasoning that just reinforces your position: how do we know that (quantifying people; benchmarking algorithms) is good? Because when we do the evaluation (by quantifying people; benchmarking algorithms) it LOOKS good by that measure. If you stay inside this self-vindicating bubble, you will see nothing else.
So my take-aways:
1) The technologies you talk about are still technologies, and like all technologies they are both good and bad — so concluding that because there is some good we must adopt them is plain silly — it is always a tradeoff.
2) The actuarial nature of technologies of quantification means that issues of how to aggregate are front and centre; this has been known for decades in economics and philosophy: the debate between John Rawls and John Harsanyi can be framed in terms of which mathematical aggregation functional to use (the max or the sum), and in ML we can interpolate between the two using CVaR! (see my "Fairness Risk Measures"). Choosing the average as how to aggregate is a particular choice (with consequences); other choices will lead to different conclusions.
3) Perhaps most important: data is not a fact unless we declare it to be one. I think we would be better served using the less well known “capta” (taken) rather than “data” (given). Presuming that data really truly and “objectively” tells you the real state of the world is the single biggest blindspot of the ML community (bigger, I would say, than the “bias bias” that Gigerenzer talked of and that you mentioned). It’s not the computers. It’s not the models. It’s not the data that you were given. It is the data that you chose to measure…. You could choose something else.
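For anyone who wants point 2 in runnable form, here is a rough sketch of how CVaR slides between the average and the max of a set of losses. It uses a simple empirical estimator (the mean of the worst `1 - alpha` fraction of the sample) and is not taken from "Fairness Risk Measures"; the `cvar` function, the `alpha` convention, and the loss values are all assumptions for illustration.

```python
import math


def cvar(losses, alpha):
    """Mean of the worst (1 - alpha) fraction of losses.

    Simple empirical estimator, assumed here for illustration:
    alpha = 0 recovers the plain average; as alpha approaches 1,
    only the single worst loss is left, i.e. the max.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    worst_first = sorted(losses, reverse=True)
    k = max(1, math.ceil((1.0 - alpha) * len(worst_first)))
    return sum(worst_first[:k]) / k


# Hypothetical per-group losses: three groups do fine, one does badly.
group_losses = [1, 2, 2, 9]

for alpha in (0.0, 0.5, 0.75):
    print(alpha, cvar(group_losses, alpha))
```

On these numbers the average is 3.5, CVaR at 0.5 is 5.5, and CVaR at 0.75 is 9, the worst group's loss. The data never changed; only the aggregation did, and with it the verdict on how well the system is doing.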
The caveat is not in the quantification per se (using reductio ad absurdum, imagine what decision making would look like without any quantification), but rather in what's being quantified (the model) and how it's interpreted.
Since quantification is always only a model, there's always simplification and reduction of the reality. In some situations this reduction is insignificant (physics), in the others (economics) it reflects very much who the observer is (the second order observation). Many situations can also be somewhere in between.
It's also common for people to have a wrong understanding of what measurement is or can be good for. They are searching for 'precision', but measurement is only reduction of uncertainty. D. Hubbard wrote an excellent book on this topic: How to Measure Anything.
I feel it makes the problem way cleaner to frame quantification as information compression.
Information gets compressed to different levels via several representations for storage and communication: natural languages, images, formal equations, etc., each serving some purpose in some scenarios. Some representations of information, like numbers, are better than others to be “operated” on (stored, communicated, etc.) at scale. Therefore, in a time when “at scale” operation is needed, quantification-to-number surges.
The decisions about how to quantify are often hidden in processing. The Byte-Pair Encoding (BPE) algorithm has a long lineage in language processing and is the default choice for Large Language Models (LLMs). What goes unspoken is the use of the space as a word separator: the same space that breaks up article references, or statute and caselaw citations. How we tokenize makes a difference.
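A toy illustration of the space issue, hedged accordingly: this is not BPE itself and not any particular library's tokenizer, just the whitespace pre-split that many pipelines apply before merges are learned, so that no merge can cross a space. The `CITATION` pattern is a hypothetical rule that only covers the "volume U.S. page" form.

```python
import re

text = "See Brown v. Board of Education, 347 U.S. 483 (1954)."

# Whitespace pre-split: the citation is cut into separate pieces, and a
# merge step that only works within a piece can never rejoin them.
naive_pieces = text.split()

# Alternative choice: treat a reporter citation as one unit first.
CITATION = r"\d+ U\.S\. \d+"  # hypothetical pattern, illustration only
protected_pieces = re.findall(CITATION + r"|\S+", text)

print(naive_pieces)
print(protected_pieces)
```

The naive split yields `347`, `U.S.`, and `483` as three unrelated pieces; the second version keeps `347 U.S. 483` whole. Both are defensible, but each is a decision about what counts as a unit, made before any statistics are computed.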
I just came across this 1956 paper by C. West Churchman that seems quite relevant: https://projecteuclid.org/ebooks/berkeley-symposium-on-mathematical-statistics-and-probability/Proceedings-of-the-Third-Berkeley-Symposium-on-Mathematical-Statistics-and/chapter/Problems-of-Value-Measurement-for-a-Theory-of-Induction-and/bsmsp/1200511855
"This paper is a philosophical evaluation of current "decision theory" and the pragmatic theory of induction. Its main argument is that there can be no theory without measurement, and that we have no method as yet of performing measurements relative to decisions. Statements made about "rationality" and "optimality" of decisions are premature. In order to perform measurements of values (or preferences) we will have to precommit ourselves to a general decision theory, because measurement is the most intricate and complex of all human decision processes. Indeed, we will be fortunate if we find one decision theory adequate to the task of generating controlled value measurements."
Jacques Ellul's writing is a great fit for this impossibly long bibliography. Here's an excerpt from his book "The Technological System," which was a follow-up to "The Technological Society."
"The computer faces us squarely with the contradiction already announced throughout the technological movement and brought to its complete rigor–between the rational (problems posed because of the computer and the answers given) and the irrational (human attitudes and tendencies). The computer glaringly exposes anything irrational in a human decision, showing that a choice considered reasonable is actually emotional. It does not follow that this is translation into an absolute rationality; but plainly, this conflict introduces man into a cultural universe that is different from anything he has ever known before. Man's central, his–I might say–metaphysical problem is no longer the existence of God and his own existence in terms of that sacred mystery. The problem is now the conflict between that absolute rationality and what has hitherto constituted his person. That is the pivot of all present-day reflection, and, for a long time, it will remain the only philosophical issue."
Aren't there also complementary traps when we don't quantify? It seems that one should add some balance in reading from Kahneman/Tversky (e.g., "Thinking, Fast and Slow".) Without some sort of objective standard (not necessarily reduced to single numbers) how do we ensure that we are avoiding our biases?
The word "bias" carries unearned objective authority. It's funny that you bring up KT as these are two very smart people so captured by an ideology that they can't see how their narrow biases shoehorn their laboratory techniques into finding "bias" everywhere they look. Gigerenzer's assessment of this bias for bias in behavioral economics is worth a read.
https://pure.mpg.de/rest/items/item_3037697_6/component/file_3047156/content
I suppose that today it is needless to say that unquantification is a trap; on the other hand, metrics and quantification have the aura of being "objective" and "reliable" ways to perform decision making.