Autoscaling doesn't always help with hot shards (which I think gp was referring to) because you can have a single shard go over its share of the throughput[0] while still having a low total throughput.
This has largely been resolved: a single shard can now consume more of the throughput than your equation would give it, as long as there is spare capacity elsewhere. AWS refer to it as Adaptive Capacity.
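
To make the difference concrete, here is a rough sketch. It assumes the equation in question is the even split (provisioned throughput divided by shard count), and it models Adaptive Capacity very loosely as "a hot shard may borrow unused capacity while the table-wide total stays under the provisioned figure". The function name, the numbers, and the model itself are illustrative assumptions, not AWS's actual algorithm, and real per-shard hard limits are ignored.

```python
def throttled_shards(provisioned, shard_load, adaptive=False):
    """Return the indices of shards that would be throttled (toy model)."""
    even_share = provisioned / len(shard_load)  # assumed "equation": even split
    total = sum(shard_load)
    throttled = []
    for i, load in enumerate(shard_load):
        if adaptive:
            # Assumption: a shard over its even share is only throttled
            # once the table as a whole exceeds the provisioned throughput.
            over = load > even_share and total > provisioned
        else:
            # Strict even split: any shard over its share is throttled,
            # no matter how idle the other shards are.
            over = load > even_share
        if over:
            throttled.append(i)
    return throttled


# Hypothetical table: 1000 units provisioned across 4 shards (250 each).
# Total load is only 360, but shard 0 alone is at 300.
load = [300, 20, 20, 20]
print(throttled_shards(1000, load))                 # hot shard is throttled
print(throttled_shards(1000, load, adaptive=True))  # hot shard is allowed
```

In the first call the table is at 36% of its provisioned throughput, so autoscaling on total usage would never trigger, yet shard 0 is still over its share. That is the hot shard case described above.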