In the DPO paper linked from the OP page, DPO is described as "a simple RL-free algorithm for training language models from preferences." So as you say, "not technically RL."
Given that, shouldn't the first sentence on the linked page end with "...in a process known as DPO (...)"? Ditto for the title.
It sounds like you're saying that the terms RL and RLHF should subsume DPO because both approaches solve the same problem with similar results. But they're different techniques, and there are established terms for each of them.
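To make the distinction concrete, here is a toy sketch of the DPO objective from the paper, with scalar sequence log-probabilities standing in for real model outputs (the function name and inputs are illustrative, not from any library). The point is that DPO is a direct classification-style loss on preference pairs: no reward model, no rollouts, no policy-gradient estimator, which is why the authors call it "RL-free."

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * implicit reward margin).

    logp_* are the policy's sequence log-probs for the chosen/rejected
    completions; ref_logp_* are the frozen reference model's log-probs.
    """
    # Implicit reward of each completion is its log-prob ratio vs. the reference.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid of the margin: an ordinary supervised-style loss.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss shrinks as the policy raises the chosen completion's probability relative to the rejected one (both measured against the reference), which is the whole training signal; contrast this with RLHF, which first fits a reward model and then optimizes it with an RL algorithm like PPO.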