If you read the spec: "The Content Decryption Module (CDM) is a generic term for a part of or add-on to the user agent that provides functionality for one or more Key Systems. Implementations may or may not separate the implementations of CDMs and may or may not treat them as separate from the user agent."
So the CDM isn't necessarily separate from the browser itself - that's left entirely as an implementation decision, and at least one widespread implementation (IE11) integrates the CDM tightly into the browser. Also, even if the CDM is separate, it can render frames directly to the screen without passing them through the browser. In particular, note that:
"Where media rendering is not performed by the UA, for example in the case of a hardware protected media pipeline, then the full set of HTML rendering capabilities, for example CSS Transforms, may not be available. One likely restriction is that video media may be constrained to appear only in rectangular regions with sides parallel to the edges of the window and with normal orientation."
So basically, just like with existing plugins, encrypted content is an opaque rectangle plonked on top of the web page that's not part of the browser's normal rendering pathway.
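To make the "implementation decision" point concrete, here's a rough sketch of what the page's script sees. It assumes the `requestMediaKeySystemAccess` shape from the later EME drafts (older implementations may expose a different, prefixed API). The helper names `basicConfig` and `supportedKeySystems` and the local types are my own, made up for illustration. The page only ever names a key system string; nothing here says whether the CDM behind it is built into the browser or a separate component.

```typescript
// Minimal local types so the sketch compiles without DOM typings (assumption:
// they mirror only the fields used here, not the full EME dictionaries).
interface Capability { contentType: string; }
interface KeySystemConfig {
  initDataTypes: string[];
  videoCapabilities: Capability[];
}

// Same call shape as navigator.requestMediaKeySystemAccess; passed in as a
// parameter so the sketch can also run outside a browser.
type RequestAccess = (keySystem: string, configs: KeySystemConfig[]) => Promise<unknown>;

// Hypothetical helper: one basic config asking for H.264 in MP4 with "cenc" init data.
function basicConfig(): KeySystemConfig[] {
  return [{
    initDataTypes: ["cenc"],
    videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
  }];
}

// Hypothetical helper: returns the candidate key systems the user agent accepts.
async function supportedKeySystems(
  requestAccess: RequestAccess,
  candidates: string[],
): Promise<string[]> {
  const supported: string[] = [];
  for (const keySystem of candidates) {
    try {
      await requestAccess(keySystem, basicConfig());
      supported.push(keySystem);
    } catch {
      // Rejected: the user agent has no CDM for this key system, or the
      // requested configuration is not supported.
    }
  }
  return supported;
}
```

In a browser you would pass something like `(ks, cfg) => navigator.requestMediaKeySystemAccess(ks, cfg)` as `requestAccess`. Either way, the script gets back an opaque handle and never sees decrypted frames, which fits the "opaque rectangle" description above.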