TensorFlow can train simple neural networks on relatively small datasets without special hardware, but for deep neural networks with large training datasets you really need acceleration: CUDA-capable Nvidia GPUs, Google TPUs, or FPGAs. Until recently, the alternative was to train on clusters of CPUs for weeks.
One of the newer additions to the TensorFlow ecosystem is a JavaScript implementation, TensorFlow.js, which was announced in 2018, ahead of TensorFlow 2.0. I wouldn’t have expected that to improve training or inference speed, but it does, because it can use any GPU the browser can reach (not just CUDA-capable GPUs) via the WebGL API.
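To make the WebGL point concrete, here is a minimal TypeScript sketch of choosing a TensorFlow.js backend with a fallback to the CPU. The `BackendApi` interface and the `pickBackend` helper are hypothetical names introduced for this illustration; they mirror the `setBackend`, `ready`, and `getBackend` functions that the `@tensorflow/tfjs` package exports, and the preference order is an assumption rather than anything the library prescribes.

```typescript
// Hypothetical, minimal slice of the TensorFlow.js API used by this sketch.
// The `tf` namespace from "@tensorflow/tfjs" exports functions with these names.
interface BackendApi {
  setBackend(name: string): Promise<boolean>;
  ready(): Promise<void>;
  getBackend(): string;
}

// Try each backend in order and return the name of the first one that
// initializes. The default order (WebGL first, then CPU) is an assumption.
async function pickBackend(
  api: BackendApi,
  preferred: string[] = ["webgl", "cpu"]
): Promise<string> {
  for (const name of preferred) {
    try {
      // setBackend resolves to false when the backend fails to initialize.
      if (await api.setBackend(name)) {
        await api.ready();
        return api.getBackend();
      }
    } catch {
      // The backend is not registered or threw during setup; try the next one.
    }
  }
  throw new Error("No usable backend among: " + preferred.join(", "));
}
```

In a browser application you would pass the real library, for example `const backend = await pickBackend(tf)` after `import * as tf from "@tensorflow/tfjs"`. Taking the API as a parameter is a design choice for this sketch: it keeps the fallback logic testable without a browser or a GPU.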