I'm giving up for the night, but https://github.com/Smaug123/whisper/pull/1/files at least contains the setup instructions, which may help others get to this point. Got it working on the GPU, but it's… much, much slower than the CPU? Presumably due to the `aten::repeat_interleave.self_int` CPU fallback.
Also hitting a nice little PyTorch bug:
> File "/Users/patrick/Documents/GitHub/whisper/whisper/decoding.py", line 388, in apply
> logits[:, self.tokenizer.encode(" ") + [self.tokenizer.eot]] = -np.inf
> RuntimeError: dst_.nbytes() >= dst_byte_offset INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/Copy.mm":200, please report a bug to PyTorch.
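
A possible workaround for that assignment, as a sketch only (not verified against this MPS assert): build the suppression mask on the CPU, move it to the logits' device, and apply it with `masked_fill_` instead of assigning through a Python list index. `suppress_tokens` is a hypothetical helper name, not something in the whisper codebase.

```python
from typing import Sequence

import torch


def suppress_tokens(logits: torch.Tensor, token_ids: Sequence[int]) -> torch.Tensor:
    """Set the given vocabulary columns of `logits` to -inf, in place.

    Hypothetical helper: assumes `logits` has shape (batch, n_vocab).
    """
    # Build the boolean mask on the CPU so the list-index write never runs on MPS.
    mask = torch.zeros(logits.shape[-1], dtype=torch.bool)
    mask[list(token_ids)] = True
    # The (n_vocab,) mask broadcasts across the batch dimension.
    return logits.masked_fill_(mask.to(logits.device), float("-inf"))


# Small CPU demonstration; on a Mac the tensor could live on device="mps" instead.
demo = torch.zeros(2, 5)
suppress_tokens(demo, [1, 3])
```

In `decoding.py` the call would presumably take `self.tokenizer.encode(" ") + [self.tokenizer.eot]` as `token_ids`; whether this sidesteps the `Copy.mm` assert would still need checking on an MPS device.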