Metadata
- Author:: readwise.io
- Full Title:: Statistical Rethinking, 2nd Edition
- Category:: 🗞️Articles
- URL:: https://readwise.io/reader/document_raw_content/20664484
- Read date:: 2025-03-23
Highlights
Rather than idealized angels of reason, scientific models are powerful clay robots without intent of their own, bumbling along according to the myopic instructions they embody. Like with Rabbi Judah’s golem, the golems of science are wisely regarded with both awe and apprehension. We absolutely have to use them, but doing so always entails some risk.
Without our guidance and skepticism, pre-manufactured golems may do nothing useful at all. Worse, they might wreck Prague. What researchers need is some unified theory of golem engineering, a set of principles for designing, building, and refining special-purpose statistical procedures. Every major branch of statistical philosophy possesses such a unified theory. But the theory is never taught in introductory—and often not even in advanced—courses. So there are benefits in rethinking statistical inference as a set of strategies, instead of a set of pre-made tools.
The greatest obstacle that I encounter among students and colleagues is the tacit belief that the proper objective of statistical inference is to test null hypotheses.3 This is the proper objective, the thinking goes, because Karl Popper argued that science advances by falsifying hypotheses. Karl Popper (1902–1994) is possibly the most influential philosopher of science, at least among scientists. He did persuasively argue that science works better by developing hypotheses that are, in principle, falsifiable.
But the above is a kind of folk Popperism, an informal philosophy of science common among scientists but not among philosophers of science. Science is not described by the falsification standard, and Popper recognized that.4
Explicitly compare predictions of more than one model, and you can save yourself from some ordinary kinds of folly.
that it is a good practice to design experiments and observations that can differentiate competing hypotheses. But in many cases, the comparison must be probabilistic,
But falsification is always consensual, not logical. In light of the real problems of measurement error and the continuous nature of natural phenomena, scientific communities argue towards consensus about the meaning of evidence.
Bayesian data analysis takes a question in the form of a model and uses logic to produce an answer in the form of probability distributions.
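For reference, the "logic" here is Bayes' theorem, which maps a model (likelihood and prior) to a posterior distribution. This is the standard statement, added for context; it is not part of the highlight:

```latex
\Pr(\theta \mid \text{data}) =
  \frac{\Pr(\text{data} \mid \theta)\,\Pr(\theta)}
       {\int \Pr(\text{data} \mid \theta')\,\Pr(\theta')\,d\theta'}
```

The posterior on the left is the probability-distribution answer; the likelihood and prior on the right encode the question as a model.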
The frequentist approach requires that all probabilities be defined by connection to the frequencies of events in very large samples.
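A minimal sketch of that frequency definition, simulated in Python (an illustrative fair-coin example, not code from the book, which works in R):

```python
import random

random.seed(1)

# Define Pr(heads) as the long-run frequency of heads:
# the observed frequency stabilizes as the sample grows.
for n in [10, 100, 10_000, 1_000_000]:
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"n = {n:>9,}: frequency of heads = {heads / n:.4f}")
```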
Nothing in the real world—excepting controversial interpretations of quantum physics—is actually random. Presumably, if we had more information, we could exactly predict everything. We just use randomness to describe our uncertainty in the face of incomplete knowledge. From the perspective of our golem, the coin toss is “random,”
the Bayesian framework presents a distinct pedagogical advantage: many people find it more intuitive. Perhaps the best evidence for this is that very many scientists interpret non-Bayesian results in Bayesian terms, for example interpreting ordinary p-values as Bayesian posterior probabilities and non-Bayesian confidence intervals as Bayesian ones.
Multilevel models—also known as hierarchical, random effects, varying effects, or mixed effects models—are becoming de rigueur in the biological and social sciences.
I want to convince the reader of something that appears unreasonable: multilevel regression deserves to be the default form of regression.
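To make "varying effects" concrete, here is a generic varying-intercepts regression in standard notation (an illustrative sketch, not a specific model from the book): each cluster j gets its own intercept, and those intercepts share a common distribution, so the model pools information across clusters.

```latex
\begin{aligned}
y_i &\sim \mathrm{Normal}(\mu_i, \sigma) \\
\mu_i &= \alpha_{j[i]} + \beta x_i \\
\alpha_j &\sim \mathrm{Normal}(\bar{\alpha}, \sigma_{\alpha})
  && \text{varying intercepts, one per cluster } j
\end{aligned}
```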
Successful prediction does not require correct causal identification. In fact, as you’ll see later in the book, predictions may actually improve when we use a model that is causally misleading.
Often the point of statistical modeling is to produce understanding that leads to generalization and application. In that case, we need more than just good predictions, in the absence of intervention. We also need an accurate causal understanding.
comparing models on the basis of predictive accuracy—or p-values or anything else—will not necessarily produce it.
Causal inference requires a causal model that is separate from the statistical model. The data are not enough.
it’s a good bet that most animals are not Bayesian, if only because being Bayesian is expensive and depends upon having a good model. Instead, animals use various heuristics that are fit to their environments, past or present. These heuristics take adaptive shortcuts and so may outperform a rigorous Bayesian analysis, once costs of information gathering and processing (and overfitting, Chapter 6) are taken into account.40 Once you already know which information to ignore or attend to, being fully Bayesian is a waste. It’s neither necessary nor sufficient for making good decisions, as real animals demonstrate. But for human animals, Bayesian analysis provides a general way to discover relevant information and process it logically. Just don’t think that it is the only way.
Which assumption should we use, when there is no previous information about the conjectures? The most common solution is to assign an equal number of ways that each conjecture could be correct, before seeing any data. This is sometimes known as the principle of indifference: When there is no reason to say that one conjecture is more plausible than another, weigh all of the conjectures equally. This book neither uses nor endorses “ignorance” priors. As we’ll see in later chapters, the structure of the model and the scientific context always provide information that allows us to do better than ignorance.
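In symbols, indifference over K mutually exclusive conjectures just assigns each the same prior weight (a standard formulation, added here for reference):

```latex
\Pr(\text{conjecture}_k) = \frac{1}{K}, \qquad k = 1, \dots, K
```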
The relative number of ways that a value p can produce the data is usually called a likelihood. It is derived by enumerating all the possible data sequences that could have happened and then eliminating those sequences inconsistent with the data.
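A sketch of that counting definition in Python (the marble-bag setup mirrors the book's "garden of forking data" example; the specific numbers are illustrative): for each conjectured bag composition, enumerate every possible draw sequence and count how many match the observed data.

```python
from itertools import product

observed = ("blue", "white", "blue")  # observed sequence of three draws

# Conjectures: how many of the 4 marbles in the bag are blue.
for n_blue in range(5):
    bag = ["blue"] * n_blue + ["white"] * (4 - n_blue)
    # Enumerate all draw sequences (with replacement), then keep
    # only those consistent with the observed data.
    ways = sum(seq == observed for seq in product(bag, repeat=3))
    print(f"{n_blue} blue marbles: {ways} ways to produce the data")
```

The resulting counts (0, 3, 8, 9, 0) are the relative numbers of ways, and they are proportional to the likelihood of each conjecture.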
Designing a simple Bayesian model benefits from a design loop with three steps. (1) Data story: Motivate the model by narrating how the data might arise. (2) Update: Educate your model by feeding it the data. (3) Evaluate: All statistical models require supervision, leading possibly to model revision.
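A minimal sketch of step (2), the update, via grid approximation in Python (the globe-tossing data story is the book's running example, though the book itself works in R): multiply the prior by the likelihood of the data at each candidate value of p, then renormalize.

```python
import math

# Data story: 6 waters in 9 tosses of a globe, where p is the
# unknown proportion of the globe's surface covered by water.
w, n = 6, 9

grid = [i / 100 for i in range(101)]        # candidate values of p
prior = [1.0] * len(grid)                   # flat prior over the grid
likelihood = [math.comb(n, w) * p**w * (1 - p)**(n - w) for p in grid]

# Update: posterior is proportional to prior times likelihood.
unstd = [pr * lk for pr, lk in zip(prior, likelihood)]
posterior = [u / sum(unstd) for u in unstd]

mode = grid[posterior.index(max(posterior))]
print(f"posterior mode near p = {mode:.2f}")  # close to 6/9, about 0.67
```

Step (3), evaluation, would then check this posterior against the data story, for example by simulating new tosses from it.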