If you want to stay within the Apple ecosystem without the TV part, you could use an AVR with AirPlay built in. Or get an AirPort Express, which can join a Wi-Fi network and become an AirPlay client, and connect it via optical (mini-TOSLINK) to an AVR. Then control it all from a phone or Mac.
Unfortunately, as of a few years ago, Google TV/Android TV forces ads for useless content onto the home screen, taking up bandwidth and slowing load times. The 'dumb' Chromecasts can still talk to the AVR over HDMI-CEC to turn on power, adjust volume, etc.
GP is most likely using a display that quickly boots into "source" mode (think HDMI input).
It’s actually insane to me.
I'm not, honestly. Think of AVR-integrated radio receivers and hi-fi CD players: a typical appliance-grade (non-raster) VFD/LCD display is sufficient for navigating through radio stations and CD tracks. I will admit that Alexa-style voice control can work quite well for online services like Spotify or Apple Music, but even then I frequently find myself reaching for my phone (and waiting for Amazon's webview-based Echo app to load) for anything nontrivial.
While a good modern TV can show a picture from standby in a few seconds, it "feels wrong" to me to have to turn on an eye-burningly bright main living-room TV just to select a song to play.