That's not how VLSI chip design works. You can't just take RTL designed for a 5-8 nm process, scale it up to 65 nm, and expect it to still work as intended.
When you design a CPU or GPU, the RTL (core pipelines, schedulers, the various buses) is written from the start for a specific manufacturing process, where the logic is expected to run correctly at specific clock frequencies that are fast enough to keep the pipelines fed with the right timing and hit the expected peak performance. If the fabrication process doesn't deliver what the design assumed, the RTL will perform much worse in practice than expected.
That's why many of Intel's past designs sucked so badly on performance and efficiency: their 10nm manufacturing process fell behind, so they had to backport their newer designs to the aging 14+++++ process, and those CPUs flopped big time.
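
To put rough numbers on the timing argument, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the gate delays are made up (not real process data), the function names are hypothetical, and real timing closure also involves wire delay, setup/hold margins, clock skew and more.

```python
import math

# Illustrative only: all delay figures are invented, not real process data.

def max_clock_ghz(gates_per_stage: int, gate_delay_ps: float) -> float:
    """Highest clock a pipeline stage can run at if its critical path
    is `gates_per_stage` gates deep and each gate takes `gate_delay_ps`."""
    period_ps = gates_per_stage * gate_delay_ps
    return 1000.0 / period_ps  # a 1000 ps period corresponds to 1 GHz

def stages_needed(total_gates: int, gate_delay_ps: float, target_ghz: float) -> int:
    """How many pipeline stages the same logic must be split into
    to still hit `target_ghz` on a process with the given gate delay."""
    period_ps = 1000.0 / target_ghz
    gates_that_fit = math.floor(period_ps / gate_delay_ps)
    return math.ceil(total_gates / gates_that_fit)

STAGE_DEPTH = 20  # hypothetical: one stage, 20 gate delays deep

# Process the RTL was designed for (assumed 10 ps per gate).
designed_for = max_clock_ghz(STAGE_DEPTH, gate_delay_ps=10.0)

# Same RTL on a much slower process (assumed 40 ps per gate).
ported = max_clock_ghz(STAGE_DEPTH, gate_delay_ps=40.0)

# Stages required on the slow process to keep the original clock.
restaged = stages_needed(STAGE_DEPTH, gate_delay_ps=40.0, target_ghz=designed_for)

print(designed_for, ported, restaged)
```

With these made-up numbers, the same stage drops from 5 GHz to 1.25 GHz on the slower process, and keeping the original clock would mean splitting that one stage into four. That is a pipeline redesign, not a port, which is the whole point.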