Yes, this is the part that sounds like parody to me. At least, as a working statistician, I can tell you that the concept of AutoML could not apply to the vast majority of things I work on.
It walks through an example with arsenic data in wells and a problem of estimating how distance, education and some other factors relate to a person’s willingness to travel to a clean well for water.
The real work there is deciding how to standardize the input features, how to rescale them so that regression coefficients are interpretable in meaningful human units, how to read statistics of the fitted model to decide whether adding a feature helps or hurts (since this cannot be deduced from raw accuracy metrics alone), how to interpret deviance residual plots for outlier analysis, and so on.
None of those things has anything to do with changing the architecture of the model, except possibly including or excluding features. In that example there were no hyperparameters to tune, and tuning hyperparameters against raw accuracy would not make sense for the inference problem anyway: the goal was not to optimize prediction but to understand the impact of features that have semantic meaning in the context of policy choices that could be adopted.
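To make the rescaling point concrete, here is a minimal sketch on synthetic data (not the actual wells data; the variable names, the simulated coefficients, and the hand-rolled Newton fitter are all my own assumptions for illustration). It fits the same logistic regression with distance in meters and in 100-meter units, and computes deviance residuals for each fit.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Logistic regression by Newton-Raphson. X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        w = p * (1.0 - p)                      # per-observation weights
        grad = X.T @ (y - p)                   # gradient of the log-likelihood
        hess = X.T @ (X * w[:, None])          # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def deviance_residuals(X, y, beta):
    """Signed square roots of each observation's contribution to the deviance."""
    p = np.clip(sigmoid(X @ beta), 1e-12, 1.0 - 1e-12)
    d = -2.0 * (y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return np.sign(y - p) * np.sqrt(d)

# Synthetic stand-in data: made-up distances and a made-up true relationship.
rng = np.random.default_rng(0)
n = 2000
dist_m = rng.uniform(0.0, 300.0, n)            # hypothetical distance in meters
y = (rng.uniform(size=n) < sigmoid(0.6 - 0.006 * dist_m)).astype(float)

X_m = np.column_stack([np.ones(n), dist_m])            # distance in meters
X_100 = np.column_stack([np.ones(n), dist_m / 100.0])  # distance in 100 m units

beta_m = fit_logistic(X_m, y)
beta_100 = fit_logistic(X_100, y)

p_m = sigmoid(X_m @ beta_m)
p_100 = sigmoid(X_100 @ beta_100)
dev_m = np.sum(deviance_residuals(X_m, y, beta_m) ** 2)
dev_100 = np.sum(deviance_residuals(X_100, y, beta_100) ** 2)

print("slope per meter:     ", beta_m[1])
print("slope per 100 meters:", beta_100[1])
print("max difference in fitted probabilities:", np.max(np.abs(p_m - p_100)))
print("deviance (meters, 100 m):", dev_m, dev_100)
```

The two fits give the same fitted probabilities and the same deviance; only the slope changes by a factor of 100. So any search that scores candidates on predictive fit has no basis for preferring one scaling over the other. The choice is purely about which number a human can read and reason about.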
By way of contrast, running an automated subset selection algorithm to choose the features would be a naive idea with likely bad results in that case, and setting up a framework that optimizes over possible transformations or standardizations of the inputs seems equally dubious compared with expert, context-aware human judgment.
And this is a very trivial example. If you modify a problem like this to address causal inference goals, or add some type of cost optimization on top of it, it becomes more and more complex, but exactly in a way that a tool like AutoML can’t help with.
In other words, making an AutoML that can truly apply to all types of estimation or inference problems is no easier than solving strong AI computer vision and natural language problems entirely, since you need contextual reasoning and creative proposals for inventing features and sleuthing the goodness of fit of a certain model architecture in light of the human-level inference goal you’re trying to reach.