4-Turbo is a bit worse than 4 for my NLP work. But it's so much cheaper that I'll probably move every pipeline to using that. Depending on the exact problem it can even be comparable in quality/price to 3.5-turbo. However the fact that output tokens are limited to 4096 is a big asterisk on the 128k context.