In fact, it's not clear that anyone can use a standardized, open API for decryption modules and still meet content providers' security demands. While some of them were historically willing to use Flash, which did use standard browser APIs, they've taken this as an opportunity to demand more.
It looks like the idea is to implement a system where encrypted content is passed to the browser. The key is then sent through the browser to the CDM, and the CDM can take the content and hand back decrypted frames.
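To make that flow concrete, here is a toy model of it in TypeScript. It is not the real EME API: ToyCdm, ToyBrowser, relayKey and play are hypothetical names, XOR stands in for a real cipher, and in practice the key would probably travel inside an opaque license message rather than as raw bytes.

```typescript
// Toy model only: every name here is hypothetical, and XOR stands in for a real cipher.
type Bytes = number[];

// Symmetric toy "cipher": applying it twice with the same key restores the input.
function xorBytes(data: Bytes, key: Bytes): Bytes {
  const out: Bytes = [];
  for (let i = 0; i < data.length; i++) {
    out.push(data[i] ^ key[i % key.length]);
  }
  return out;
}

// Hypothetical CDM: stores keys privately and exposes only a decrypt operation.
class ToyCdm {
  private keys: { [keyId: string]: Bytes } = {};

  addKey(keyId: string, key: Bytes): void {
    this.keys[keyId] = key;
  }

  decrypt(keyId: string, encrypted: Bytes): Bytes {
    const key = this.keys[keyId];
    if (key === undefined) {
      throw new Error("no key loaded for " + keyId);
    }
    return xorBytes(encrypted, key);
  }
}

// Hypothetical browser: relays the key and the encrypted samples to the CDM
// without inspecting either one.
class ToyBrowser {
  private cdm: ToyCdm;

  constructor(cdm: ToyCdm) {
    this.cdm = cdm;
  }

  relayKey(keyId: string, key: Bytes): void {
    this.cdm.addKey(keyId, key);
  }

  play(keyId: string, encrypted: Bytes): Bytes {
    return this.cdm.decrypt(keyId, encrypted);
  }
}

// Walk through the flow: the server encrypts, the browser relays, the CDM decrypts.
function demoFlow(): Bytes {
  const clearFrame: Bytes = [1, 2, 3, 4];
  const contentKey: Bytes = [0x5a, 0xa5];
  const encryptedFrame = xorBytes(clearFrame, contentKey); // done server-side

  const browser = new ToyBrowser(new ToyCdm());
  browser.relayKey("k1", contentKey);
  return browser.play("k1", encryptedFrame);
}
```

The point of the sketch is the division of labour: the page and the browser only ever move opaque bytes around, and everything that touches the key lives inside the CDM.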
It's entirely possible for content vendors to support multiple CDMs for different browser and OS combinations. The advantage of this is that the CDM is a smaller dependency than something like Silverlight, so you can have a standard HTML5 video player interface across platforms and just swap out CDMs.
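A rough sketch of what "swap out CDMs" could look like on the vendor side: keep one player and pick whichever supported key system the current browser happens to offer. pickKeySystem and playerConfigFor are hypothetical helpers, and the key system strings are only examples. In a real page, the list of available systems would have to be discovered by probing something like navigator.requestMediaKeySystemAccess rather than passed in as an array.

```typescript
// Hypothetical helper: return the first key system the vendor supports
// that the browser also offers, or null if there is no overlap.
function pickKeySystem(vendorSupported: string[], browserAvailable: string[]): string | null {
  for (let i = 0; i < vendorSupported.length; i++) {
    if (browserAvailable.indexOf(vendorSupported[i]) !== -1) {
      return vendorSupported[i];
    }
  }
  return null;
}

// Hypothetical helper: the player stays the same, only the CDM choice varies.
function playerConfigFor(browserAvailable: string[]): string {
  // Example identifiers in vendor preference order; the list is an assumption.
  const vendorSupported = [
    "com.widevine.alpha",
    "com.microsoft.playready",
    "org.w3.clearkey",
  ];
  const chosen = pickKeySystem(vendorSupported, browserAvailable);
  return chosen === null ? "unsupported" : "html5-player+" + chosen;
}
```

A browser with no matching CDM simply falls into the "unsupported" case, which is where a platform without a port would end up.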
The only way it will work is if the restrictions module handles everything, from decryption through decoding to rendering. It would probably even use a hardware DRM scheme and prevent any interaction between the video data and JavaScript or any other website elements.
So the CDM isn't necessarily separate from the browser itself. That's left entirely as an implementation decision, and at least one widespread implementation (IE11) is integrating the CDM tightly into the browser. Also, even if the CDM is separate, it can render frames directly to the screen without passing them through the browser. In particular, note that:
"Where media rendering is not performed by the UA, for example in the case of a hardware protected media pipeline, then the full set of HTML rendering capabilities, for example CSS Transforms, may not be available. One likely restriction is that video media may be constrained to appear only in rectangular regions with sides parallel to the edges of the window and with normal orientation."
So basically, just like with existing plugins, encrypted content is an opaque rectangle plonked on top of the web page that's not part of the browser's normal rendering pathway.
This scheme basically just creates a special class of plugins. These plugins clearly won't be OS-agnostic, because they can't be. That's the whole point of the exercise: to restrict playback to devices that are fully authorised and controlled from top to bottom, with the browser piping streams from the web to trusted plugins running on trusted OSes using trusted hardware (TPM etc).
If I want to watch Netflix or play GTA5 on my Haiku box, I'm SOL as it is, unless there is a business case to be made for doing the port.
This actually makes things easier. For example, Netflix currently uses Silverlight for its streaming, which means that in order to watch Netflix you need something that supports the entire Silverlight stack.
With this proposal, all you need is a modern browser and a compatible CDM, which is a much smaller chunk of code.