With the Wikipedia sample, I kinda suspect the central low note is the same phenomenon as binaural beats: the 400 Hz and 800 Hz tones seem to constructively interfere to produce 400 Hz. There is no 400 Hz on the high side, so there is nothing for it to physically interfere with, but we perceive interference anyway.
If that's right, the low note should disappear or turn into a third tone when the higher frequency isn't an exact doubling of the lower one. I haven't hunted down a sample to test that, though.
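A test sample could probably be generated instead of hunted down. Here is a minimal sketch, assuming Python with only the standard library. The function names, the file names, and the 810 Hz detuned value are all made up for illustration, and putting one pure sine tone in each ear is only my guess at a fair comparison, not a reconstruction of the Wikipedia sample.

```python
import math
import struct
import wave


def stereo_tone_frames(left_hz, right_hz, seconds=5.0, rate=44100, amplitude=0.3):
    """Return 16-bit stereo PCM bytes: left_hz in the left ear, right_hz in the right."""
    n_samples = int(seconds * rate)
    peak = int(amplitude * 32767)  # stay well below full scale to avoid clipping
    frames = bytearray()
    for i in range(n_samples):
        t = i / rate
        left = int(peak * math.sin(2 * math.pi * left_hz * t))
        right = int(peak * math.sin(2 * math.pi * right_hz * t))
        # Interleave little-endian signed 16-bit samples: left, then right.
        frames += struct.pack("<hh", left, right)
    return bytes(frames)


def write_stereo_wav(path, left_hz, right_hz, seconds=5.0, rate=44100):
    """Write a stereo WAV file with a different pure tone in each channel."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)   # stereo
        wav.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(stereo_tone_frames(left_hz, right_hz, seconds, rate))


# Hypothetical usage (file names are arbitrary):
# write_stereo_wav("exact_octave.wav", 400, 800)  # exact doubling
# write_stereo_wav("detuned.wav", 400, 810)       # not an exact doubling
```

Listening to both files on headphones and comparing them should show whether the central low note survives the detuning. Speakers would let the two tones mix in the air, which defeats the point.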