Training our models takes several days (2-3) on 8 Titan X GPUs, which is quite a lot of compute. Running on mobile devices is quite challenging: inference is not yet fast enough to support that, and it has only been optimized for x86 AVX2 CPUs. It may be possible with a fair amount of future work!
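For context on the AVX2 dependency, here is a minimal sketch (not the project's actual inference kernels) of the kind of 256-bit SIMD operation such an x86-optimized path relies on: one fused multiply-add processes 8 floats per instruction. Mobile ARM CPUs don't have these instructions, so the same kernels would need a separate NEON (or similar) implementation before on-device inference is practical.

```c
/* Illustrative only: a single AVX2/FMA multiply-accumulate over 8 floats.
 * Build on an AVX2-capable x86 machine with: gcc -O2 -mavx2 -mfma demo.c */
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    float c[8] = {0};

    __m256 va = _mm256_loadu_ps(a);     /* load 8 floats into a 256-bit register */
    __m256 vb = _mm256_loadu_ps(b);
    __m256 vc = _mm256_loadu_ps(c);
    vc = _mm256_fmadd_ps(va, vb, vc);   /* c = a * b + c across all 8 lanes at once */
    _mm256_storeu_ps(c, vc);

    for (int i = 0; i < 8; i++)
        printf("%.1f ", c[i]);
    printf("\n");
    return 0;
}
```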