tl;dr: The book is mainly about applying resampling as a tool to assess statistical significance. Its advantage is that it does not require the data to satisfy distributional assumptions (such as normality).

For example, when we build a regressor, we can evaluate the significance of its goodness of fit by target shuffling: randomly shuffling the target vector and refitting the model many times, to obtain the distribution of the goodness of fit under chance alone. If the original fit falls well within that distribution, we can attribute it to chance (and thus there is no statistical significance).
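A minimal sketch of target shuffling, assuming an ordinary least-squares regressor and R² as the goodness-of-fit measure (neither choice comes from the book). The function names, the synthetic data, and the add-one correction in the p-value are illustrative assumptions.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def target_shuffling_p_value(X, y, n_shuffles=1000, seed=0):
    """Compare the observed R^2 with R^2 values obtained on shuffled targets."""
    rng = np.random.default_rng(seed)
    observed = r_squared(X, y)
    # Shuffling y breaks any real link between X and y, so these fits
    # show how good a fit can look by chance alone.
    shuffled = np.array(
        [r_squared(X, rng.permutation(y)) for _ in range(n_shuffles)]
    )
    # Share of shuffled fits at least as good as the observed one
    # (add-one correction so the estimate is never exactly zero).
    p_value = (1 + np.sum(shuffled >= observed)) / (n_shuffles + 1)
    return observed, shuffled, p_value


# Synthetic data (assumption): one target depends on X, the other is pure noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 1))
y_signal = 2.0 * X[:, 0] + rng.normal(size=200)
y_noise = rng.normal(size=200)

obs_signal, _, p_signal = target_shuffling_p_value(X, y_signal)
obs_noise, _, p_noise = target_shuffling_p_value(X, y_noise)

print(f"signal: R^2 = {obs_signal:.3f}, p = {p_signal:.4f}")
print(f"noise:  R^2 = {obs_noise:.3f}, p = {p_noise:.4f}")
```

A small p-value means the observed fit sits far in the tail of the shuffled distribution, so chance is an unlikely explanation. A large one means the fit is indistinguishable from what shuffled targets produce.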

The Normal distribution was originally called the error distribution: raw data is, in general, not normally distributed. Rather, the normal distribution describes the errors that remain after known causes of variability have been removed and all that is left is random noise (e.g., the height distribution of white women in the US aged 30 to 50).

The key ingredient that catapulted Gallup to fame and success was the realization that a small representative sample is more accurate than a large sample that is not representative. There are now a variety of increasingly sophisticated devices that pollsters use to ensure representative results but at the root of all of them lies random sampling. (p. 107)

The key to random sampling is that every member of the population has an equal chance of being selected.
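To make the point about sample size versus representativeness concrete, here is a small simulation sketch. The population, the "wealthy" subgroup, and all percentages are invented for illustration and do not come from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 30% belong to a subgroup ("wealthy") that
# supports candidate A at 70%; everyone else supports A at 40%.
N = 1_000_000
wealthy = rng.random(N) < 0.3
votes_a = rng.random(N) < np.where(wealthy, 0.7, 0.4)
true_share = votes_a.mean()

# Small simple random sample: every member has the same chance of selection.
small_idx = rng.choice(N, size=1_000, replace=False)
small_estimate = votes_a[small_idx].mean()

# Large unrepresentative sample: the subgroup is five times more likely
# to be included (25% versus 5% inclusion probability).
included = rng.random(N) < np.where(wealthy, 0.25, 0.05)
large_estimate = votes_a[included].mean()

print(f"true share of A:           {true_share:.3f}")
print(f"random sample (n=1000):    {small_estimate:.3f}")
print(f"biased sample (n={included.sum()}): {large_estimate:.3f}")
```

Under these assumptions the biased sample is roughly a hundred times larger, yet its estimate is further from the true share, because collecting more responses does not remove the selection bias.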

Notes on terminology: