They are intentionally similar, ProseMirror has some great ideas. We tried to tackle those ideas from an extensible angle where nodes themselves, which form part of a tree, are the core to everything in Lexical. There are no "marks", instead you just use properties on the given nodes and treat them in an immutable sense. Nodes also present a set of createDOM/updateDOM methods, that allow you to define you DOM element, and its properties, to be passed back to Lexical's DOM reconciler.
You can imagine the EditorState in Lexical as not only the source-of-truth for your editor's data, but also the virtual DOM for your view. Lexical diff's both and applies delta based on dirty node logic to improve performance. Furthermore, any mutations to the DOM outside of Lexical and reverted back to Lexical's EditorState – to preserver source of truth.
Lexical also promotes different types of "text" properties on the TextNode class. Such as if the node is segmented, tokenized etc. This is a far better accessibility experience, as it ensures you can build rich text nodes – like custom emojis, or hashtags, or mentions without bailing out to contentEditable="false" which is what almost every other editor promotes. Which is also a major hinderance in terms of the ability to properly utilizie NVDA, Jaws and VoiceOver to move selection through the characters (in most cases, they skip non contenteditable elements entirely, which is terrible).