The prime problem is latency, which becomes both noticeably larger and unpredictable. Both those aspects cause huge loss of ergonomics. It's quite easy to tell whether a piece of hardware is doing things directly in firmware, or if it's running a full OS with a software stack - the latter has neither instant nor predictable feedback.
A good example would be phones, for anyone who remembers the transition from feature phones to iOS and Android smartphones. That transition is where we lost predictable input latency, and thus ability to develop muscle memory and operate device without looking at it. E.g. texting people while keeping the phone in your pocket was a typical thing teenagers did, that became impossible with modern smartphones.