Thank you, quite interesting! But that link required a "pro" subscription, which I don't have, so I did some digging to get more info.
Based on the following, I don't think I agree with your conclusion.
This article was informative:
https://www.techinasia.com/news/indian-ai-startup-...
Key excerpt:
"Google's own tests have demonstrated that CPUs can achieve competitive latencies for large language models, though typically requiring larger batch sizes to approach GPU efficiency"
I think this is a key point: if Google knows about it, then so does Nvidia.
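For the technically curious, here's a tiny sketch of that batching point (my own illustration, not from the article; the 4096 width is a made-up stand-in). The layer's weights get read from memory once per forward pass, so the cost per request drops as the batch grows:

```python
# A toy illustration (mine, not from the article) of why bigger batches help
# CPU inference: the layer's weights are read from memory once per forward
# pass, so the per-request cost shrinks as the batch grows.
import time
import torch

torch.manual_seed(0)
hidden = 4096                                # hypothetical model width
layer = torch.nn.Linear(hidden, hidden)      # stand-in for one transformer layer

def seconds_per_request(batch_size: int, iters: int = 20) -> float:
    x = torch.randn(batch_size, hidden)
    with torch.no_grad():
        layer(x)                             # warm-up
        start = time.perf_counter()
        for _ in range(iters):
            layer(x)
        elapsed = time.perf_counter() - start
    return elapsed / (iters * batch_size)

for bs in (1, 8, 32):
    print(f"batch {bs:>2}: {seconds_per_request(bs) * 1e6:8.1f} us per request")
```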
Here's the reference from the techinasia link above about Google's work:
https://www.theregister.com/2024/10/29/cpu_gen_ai_...
Excerpt from that "theregister" piece:
"Today, most GenAI models are trained and run on GPUs or some other specialized accelerator, but that doesn't mean they have to be. In fact, several chipmakers have suggested that CPUs are more than adequate for many enterprise AI use cases.
Now Google has rekindled the subject of CPU-based inference and fine-tuning, detailing its experience with the advanced matrix extensions baked into Intel's 4th-Gen (Sapphire Rapids) Xeon cores. "
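In case you're wondering what using those matrix extensions actually looks like in practice, here's a hedged sketch (my own, with a hypothetical toy model): recent PyTorch builds route bfloat16 math on CPU through Intel's oneDNN library, which uses the AMX instructions on chips like Sapphire Rapids when they're present. On CPUs without AMX the same code still runs, just without the speedup.

```python
# A hedged sketch (my own, hypothetical model) of CPU inference that can use
# Intel's AMX: PyTorch routes bfloat16 matmuls on CPU through oneDNN, which
# dispatches to the AMX tile instructions on Sapphire Rapids when available.
import torch

model = torch.nn.Sequential(                 # stand-in model; any nn.Module works
    torch.nn.Linear(4096, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 4096),
).eval()

x = torch.randn(8, 4096)
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)                             # matmuls run in bf16 (AMX-eligible)
print(y.dtype)                               # torch.bfloat16
```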
The article also goes on to discuss why CPUs aren't the panacea for AI that one might initially think:
"While CPUs are beginning to catch up with lower-end GPUs in terms of memory bandwidth, they're still no match for high-end accelerators like Nvidia's H100 or AMD's MI300X, which boast multiple terabytes of bandwidth. Pricing also remains a challenge for CPUs. Intel's 6900P-series Granite Rapids parts will set you back between $11,400 and $17,800, and that's before you factor in the cost of memory."
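To put rough numbers on that bandwidth point: token-by-token generation is mostly memory-bandwidth-bound, so a common back-of-envelope is tokens/sec ≈ bandwidth ÷ bytes of weights read per token. The figures below are my own approximations, not from the article:

```python
# Back-of-envelope only; the bandwidth figures are my approximations, not
# from the article. Single-stream token generation is mostly memory-bound,
# so tokens/sec is roughly bandwidth divided by bytes of weights per token.
model_gb = 70                     # e.g. a 70B-parameter model at 8-bit weights
bandwidth_gbps = {
    "big Xeon, 12-ch DDR5 (assumed)": 500,
    "Nvidia H100 SXM (HBM3)": 3350,
    "AMD MI300X (HBM3)": 5300,
}
for name, bw in bandwidth_gbps.items():
    print(f"{name:32s} ~{bw / model_gb:5.1f} tokens/sec")
```

Even this crude math shows why the high-end accelerators keep a large edge for latency-sensitive serving, while a CPU can still be "good enough" for low-traffic or batch inference.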
My conclusion (could be wrong!):
It appears that this isn't some breakthrough by the Indian startup. Rather, they're trying to capitalize on something that's already known: AI inference (not training) can be done acceptably well on CPUs, i.e. with a tolerable performance hit compared to GPUs, if you can't really afford access to high-end GPUs. I gather the Indian company wants to apply this within India, where massive GPU computing capacity is sparse.
It seems the CPU-versus-GPU story is well known; Google has done lots of investigation and even published about it. So this isn't something taking Nvidia by surprise. They're a nimble company: if they see a market in CPU-supported AI (as opposed to GPU), they'll have plans to deal with it.
My takeaway is that it's relatively well known in the field that if you can't afford massive GPU investments, as is the case in India, you can still stay somewhat in the AI game with CPUs, at least on the inference and fine-tuning side. The Indian company wants to make a go of this in India. But from what I've read so far, I saw nothing approaching a knockout blow to Nvidia and GPUs.