"The dictionary" can be implemented in many ways; simple or clever, namespaced or not. A language without any form of local or global name binding would presumably need quotations ala Joy/Factor, which means your stacks probably have to store
objects rather than raw numbers. This tips the scale in the direction of a higher-level language.
The stacks are likewise a description of semantics rather than implementation. Some Forths keep the top few stack elements in registers to reduce overhead. If the return stack isn't user-accessible (r> >r etc.) then, again, you're straying closer to a higher-level functional language than a Forth.
There are many kinds of threaded code. For example, subroutine-threaded code uses native subroutine call instructions in word bodies and does away with the inner interpreter. Threaded code is a natural consequence of disentangling conventional "activation records" into two stacks which grow and shrink independently. You could have a Forth without threaded code, I suppose.
Some Forths attempt to "seal off" or otherwise obscure some parts of their own internal workings from "user programs". This is more common among dialects which try to adhere to ANS specs, like GForth. This isn't a total anathema to Forthiness, but it tends to introduce additional complexity. If a word is useful for implementing a forth kernel, why couldn't it be useful in implementing other functionality, too?
If you're building something higher-level which vaguely resembles Forth, it's probably better to describe it as a concatenative language. A dependent type system doesn't sound very Forthy, imo.