The trend for power consumption of compute (Megaflops per watt) has generally tracked with Koomey’s law for a doubling every 1.57 years
Then you also have model performance improving with compression. For example, Llama 3.1’s 8B outperforming the original Llama 65B