Hi!
I'm building a C++ ML library. I built it because I don't like how Python is the default language for ML. Currently it lets you train neural nets on CPU with very fast training & inference speeds.
Planning to expand to CUDA and ROCm soon! 
But Python isn't actually used for training or inference.
It's intended to be way faster than using a C++ PyTorch wrapper.
How would you achieve this?
How are you comparing this?
Using PyTorch nn.Sequential with the same config
Is your implementation doing the same thing as pytorch? Does it produce equivalent or better models?
The math is the same: standard stochastic gradient descent.
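For anyone following along, the update that plain SGD applies to each parameter can be sketched roughly like this. This is only an illustration, not the library's actual code: `sgd_step` and its flat-buffer layout are hypothetical names and choices made for the example, and it leaves out momentum, weight decay and batching.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the library's API): one plain SGD update,
// w <- w - learning_rate * grad, applied element-wise to a flat buffer.
void sgd_step(std::vector<float>& weights,
              const std::vector<float>& grads,
              float learning_rate) {
    // Every parameter needs exactly one matching gradient entry.
    assert(weights.size() == grads.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        weights[i] -= learning_rate * grads[i];
    }
}
```

If both implementations apply this same rule with the same learning rate and batch size, any remaining difference in the trained models should come from initialization and data ordering rather than from the optimizer.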
There is randomness involved with the initialization of the weight vector
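One way to keep that randomness from muddying a comparison is to seed the initializer so every run starts from the same weights. A minimal sketch, assuming a uniform init in ±1/sqrt(fan_in) (which, as far as I know, is the bound PyTorch's nn.Linear uses by default); `init_linear_weights` is a hypothetical name, not something from the library:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: fills a fan_out x fan_in weight matrix (stored flat)
// with values drawn uniformly from [-1/sqrt(fan_in), 1/sqrt(fan_in)).
// The same seed gives the same weights on a given standard library.
std::vector<float> init_linear_weights(std::size_t fan_in,
                                       std::size_t fan_out,
                                       unsigned seed) {
    assert(fan_in > 0);
    const float bound = 1.0f / std::sqrt(static_cast<float>(fan_in));
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-bound, bound);

    std::vector<float> weights(fan_in * fan_out);
    for (float& w : weights) {
        w = dist(rng);
    }
    return weights;
}
```

Note that the exact values `std::uniform_real_distribution` produces can differ between standard library implementations, so this gives repeatable runs on one toolchain, not bit-identical weights to PyTorch.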
I could try benchmarking them more when I have the time
Training a
Linear(784, 128),
ReLU(),
Linear(128, 10)
model, which has about 100k parameters, on the 50k-sample MNIST training set took 1 second per epoch on my AMD Ryzen 5 7535HS.
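As a quick sanity check on the "100k parameters" figure, the count for that model can be worked out at compile time. The helper names here are made up for the example; the only assumption is that each Linear layer has a bias term, as in PyTorch's default.

```cpp
#include <cstddef>

// Hypothetical helper: parameters in a fully connected layer with bias,
// i.e. an (in x out) weight matrix plus one bias per output.
constexpr std::size_t linear_param_count(std::size_t in, std::size_t out) {
    return in * out + out;
}

// Linear(784, 128) -> ReLU() -> Linear(128, 10); ReLU has no parameters.
constexpr std::size_t mlp_param_count() {
    return linear_param_count(784, 128) + linear_param_count(128, 10);
}

// 784*128 + 128 = 100480, and 128*10 + 10 = 1290.
static_assert(mlp_param_count() == 101770, "expected 101,770 parameters");
```

So the model has 101,770 trainable parameters, which is where the rounded 100k comes from.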