Thanks for getting into some of the details ...
>> (1) One possibility is they are having capacity and/or infrastructure problems so the model performance is degraded.
> As far as I understand it, scaling issues would result in increased latency or requests being dropped, not model quality being lower.
Yes, many scaling issues would manifest that way, but not all. It seems plausible that Anthropic has ways of coping with capacity limits that lower model quality without showing up in latency or reliability metrics. I need to research this more. (I'll try to think more about your other points later.)