A CPU is, at root, a massive Boolean circuit wrapped in a flip-flop and some persisted state. The binary [sequence corresponding to an] opcode is just an input that determines which inputs go where.
Thus, for an n-bit opcode width, there are 2^n valid opcodes. Only m of them will correspond to intelligible, "I might want to use that some day" instructions. (Others, as you note, will be functionally equivalent versions of the m and ignorable as well.)
And so you will have 2^n - m "undocumented opcodes".
[1] based on reading NAND to Tetris