Regarding the lack of transfer: yes, AlphaGo, AlphaZero and most of their variants have boards of fixed size and shape hard-coded into their architecture (just as they have the types of piece moves hard-coded) and need architectural modifications and re-training before they can play on different boards or with different pieces (e.g. AlphaGo can't play Chess or Shogi unmodified). The KataGo paper (the one you linked) is an exception to this; personally, I don't know of others. In any case, general game playing is a hard task and nobody claims AlphaGo has solved it.
Regarding KataGo, its main contribution is a significant reduction in the cost of training an AlphaGo variant while maintaining competitive performance. This is very promising: after Deep Blue, creating a strong chess engine became cheaper and cheaper until engines could run on a smartphone. We are far from that with computer Go players.
However, in the KataGo paper, the major gains are claimed to come from either a) game-playing- or MCTS-specific improvements (playout cap randomisation, forced playouts and policy target pruning) and architecture-specific improvements (global pooling), or b) domain-specific improvements (auxiliary ownership and score targets). Finally, KataGo also uses a few game-specific input features (liberties, pass-alive regions and ladder features).
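To make the first of those concrete, here is a minimal sketch of the idea behind playout cap randomisation, as I understand it from the paper: most self-play moves get only a cheap search (to generate games faster), while a random fraction get a full search whose visit counts are kept as policy training targets. All parameter values below are illustrative, not KataGo's actual settings.

```python
import random

def select_playout_cap(p_full=0.25, full_cap=600, cheap_cap=100):
    """Sketch of playout cap randomisation (illustrative parameters).

    Returns (playout_cap, record_policy_target): with probability p_full
    the move gets a full-budget search and its visit distribution is
    recorded as a policy training target; otherwise a cheap search is
    used just to keep the self-play game moving quickly.
    """
    if random.random() < p_full:
        return full_cap, True   # full search; keep as policy target
    return cheap_cap, False     # cheap search; no policy target recorded
```

The point of the trick is to decouple two conflicting demands on self-play: generating many games (cheap searches) versus generating high-quality policy targets (expensive searches).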
The KataGo paper itself says this very clearly. I quote, snipping for brevity:
Second, our work serves as a case study that there is still a significant efficiency gap between AlphaZero's methods and what is possible from self-play. We find nontrivial further gains from some domain-specific methods (...) We also find that a set of standard game-specific input features still significantly accelerates learning, showing that AlphaZero does not yet obsolete even simple additional tuning.
Finally, "it would obviously work so nobody tried" would make sense if it weren't for the extremely competitive nature of machine learning research, where every novel result is presented as a big breakthrough. Also, if something seems obvious but never makes it to publication, the chances are someone has tried it, it didn't work as expected, and they shelved the paper. We all know what happens to negative results in machine learning.