The MTP (Multi-Token Prediction) loss combined with stable full-task RL is an interesting training approach. I'm curious how much MTP specifically contributes to the 94.62 OmniDocBench score versus the RL component alone. At 0.9B params with vLLM/SGLang support, this looks very deployable. Running PP-DocLayout-V3 for layout analysis before recognition is smart: most OCR failures I've seen come from poor region detection on complex documents rather than from the recognition itself.