Kind of. Appropriately normalizing the primes with some kind of function based on log(log(n)) would probably allow the article's technique to perform well (still struggling on larger primes -- you can't avoid needing a lot of bits to express those gaps), but it's precisely the shifting density which makes the problem hard to learn with any standard algorithm, and I don't think theirs would be an exception since they explicitly rely on gaps drawn from some fixed distribution.