I would say that both the client side and the media server (SFU) side of this work are challenging in their own ways, if you're trying to support a wide variety of use cases and features, plus sessions with large numbers of participants.
The client-side and server-side code end up being tightly coupled, and you end up with far more client-side code than you might expect if you've only built an application that uses WebRTC in one specific way. For example, handling fast subscription to and unsubscription from batches of tracks is non-trivial, but it's important if you're implementing "grid mode" client views.
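As a rough sketch of what that batching involves: when the visible set of grid tiles changes (say, the user pages through the grid), the client wants to diff the newly visible tracks against what it's already subscribed to and send the SFU one batched update, rather than a message per track. All the names here (`TrackSubscriber`, the `send` callback) are illustrative, not any real SFU's API.

```typescript
type TrackId = string;

// Hypothetical sketch of client-side subscription batching for a grid view.
class TrackSubscriber {
  private active = new Set<TrackId>();

  constructor(
    // Stand-in for a signaling call to the SFU: one message carrying
    // everything to subscribe to and unsubscribe from in this batch.
    private send: (subscribe: TrackId[], unsubscribe: TrackId[]) => void,
  ) {}

  // Called whenever the set of visible grid tiles changes.
  setVisibleTracks(visible: TrackId[]): void {
    const desired = new Set(visible);
    const toSubscribe = [...desired].filter((t) => !this.active.has(t));
    const toUnsubscribe = [...this.active].filter((t) => !desired.has(t));
    if (toSubscribe.length === 0 && toUnsubscribe.length === 0) return;
    this.send(toSubscribe, toUnsubscribe);
    this.active = desired;
  }
}
```

For example, paging from tiles `["a", "b"]` to `["b", "c"]` would produce a single update subscribing to `c` and unsubscribing from `a`, leaving `b` untouched. A real implementation also has to deal with in-flight requests, per-track simulcast layer selection, and the SFU acknowledging (or rejecting) parts of the batch.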
The goal of the approach we're taking here is to support a bunch of different platforms with the same performance and stability, and with feature parity across all of them. Web, iOS, and Android are the three most important platforms, but people are also using WebRTC on Flutter, native Linux, macOS, Windows, Unreal, Unity, and various embedded platforms.