For learning purposes, I made a minimal TensorFlow.js re-implementation of Karpathy’s minGPT (Generative Pre-trained Transformer). One nice side effect of having the 300-lines-of-code model in a .ts file is that you can train it on a GPU in the browser.

https://github.com/trekhleb/...

The Python/PyTorch version still seems much more elegant and easier to read, though...

Comments
  • 1
    @retoor So far I've tried training a GPT with ~80M parameters on a single GPU in the browser. Pretty heavy. It would be interesting to see how a 1.5B-parameter model behaves...
  • 0
    @retoor I'm not sure, it probably depends on the model configuration/implementation and the hardware. But in the browser, for that "homemade GPT", I see that training on WebGPU is roughly 100x to 1000x faster than on the CPU.
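
To put the ~80M and 1.5B figures from the comments in perspective, here is a rough sketch of how the parameter count of a GPT-2-style model follows from its configuration. It assumes the standard GPT-2 block layout (fused QKV projection, 4x-wide MLP, two LayerNorms per block) and does not count the output head; the linked repo may differ. The names `GptConfig`, `countGptParams`, and `float32Gigabytes` are made up for illustration, and the two configs are the published GPT-2 small and GPT-2 XL sizes, not the configuration used in the post.

```typescript
// Hypothetical helper types and functions, for illustration only.
interface GptConfig {
  nLayer: number;    // number of transformer blocks
  nEmbd: number;     // embedding width
  vocabSize: number; // token vocabulary size
  blockSize: number; // maximum context length
}

// Parameter count for an assumed GPT-2-style decoder (output head not counted).
function countGptParams(c: GptConfig): number {
  const d = c.nEmbd;
  // Attention: fused QKV projection (3*d*d + 3*d) plus output projection (d*d + d).
  const attention = 4 * d * d + 4 * d;
  // MLP: d -> 4d -> d, both layers with biases.
  const mlp = 8 * d * d + 5 * d;
  // Two LayerNorms per block, each with a weight and a bias vector.
  const layerNorms = 4 * d;
  const perBlock = attention + mlp + layerNorms;
  // Token embedding table plus learned position embedding table.
  const embeddings = (c.vocabSize + c.blockSize) * d;
  // Final LayerNorm before the output head.
  const finalLayerNorm = 2 * d;
  return c.nLayer * perBlock + embeddings + finalLayerNorm;
}

// Memory needed just to store the weights as float32 (4 bytes each), in GB.
function float32Gigabytes(params: number): number {
  return (params * 4) / 1e9;
}

// Published GPT-2 sizes, used here only as reference points.
const gpt2Small: GptConfig = { nLayer: 12, nEmbd: 768, vocabSize: 50257, blockSize: 1024 };
const gpt2Xl: GptConfig = { nLayer: 48, nEmbd: 1600, vocabSize: 50257, blockSize: 1024 };

console.log(countGptParams(gpt2Small)); // 124439808
console.log(countGptParams(gpt2Xl));    // 1557611200
console.log(float32Gigabytes(countGptParams(gpt2Xl)).toFixed(1), "GB of float32 weights");
```

Under these assumptions, the 1.5B-parameter configuration needs about 6.2 GB for the float32 weights alone, before gradients and optimizer state are counted.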