NVIDIA’s latest entry, the GeForce RTX 5090, has raised the bar for inference performance on DeepSeek R1, leaving AMD’s RX 7900 XTX trailing behind thanks to its fifth-generation Tensor Cores.
### Seamless Access to DeepSeek’s Models with NVIDIA’s New RTX GPUs Offering Top-Notch Performance
Consumer GPUs have become a genuinely viable option for running high-end LLMs right on your own computer, and both NVIDIA and AMD are racing to optimize their hardware and software stacks for local inference. Just recently, AMD put its RDNA 3 flagship GPU to the test with the DeepSeek R1 model. Not wanting to be outdone, NVIDIA has answered with inference benchmarks of its own using the latest RTX Blackwell GPUs. The results? The GeForce RTX 5090 is clearly leading the pack.
Across a variety of DeepSeek R1 distilled models, the GeForce RTX 5090 shows a noticeable advantage over the Radeon RX 7900 XTX, as well as over NVIDIA’s own previous generation. It processed up to roughly 200 tokens per second in models such as Distill Qwen 7B and Distill Llama 8B, nearly double what AMD’s RX 7900 XTX managed. This positions NVIDIA’s GPUs as leaders in consumer AI performance, making it likely that we’ll see more edge AI on consumer PCs, helped along by the company’s broad “RTX on AI” support.
For those curious about running DeepSeek R1 on NVIDIA’s RTX GPUs, the company has crafted an easy-to-follow blog post, likening it to using an online chatbot. Here’s a sneak peek of how you can dive into it:
> To enable developers to tap into these powerful capabilities and craft their own specialized tools, the hefty 671-billion-parameter DeepSeek-R1 model is now accessible as an NVIDIA NIM microservice preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.
>
> Developers are encouraged to get hands-on with the API, which will soon be available for download as a part of the NIM microservice integrated into the NVIDIA AI Enterprise software platform.
>
> The DeepSeek-R1 NIM microservice encourages easy deployments using widely recognized APIs. Enterprises can further enhance their security and data safety by running the NIM microservice on their chosen accelerated computing setups.
>
> – NVIDIA
Thanks to NVIDIA’s NIM, both developers and enthusiasts can easily test the AI model on local systems. Running it locally keeps your data on your own machine and, if your hardware can keep up, can also deliver better performance than a remote service.
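Since NIM microservices expose widely recognized APIs, querying the hosted DeepSeek-R1 preview typically looks like any OpenAI-style chat-completion call. The sketch below is a minimal, hedged example: the endpoint URL, model identifier, and `NVIDIA_API_KEY` environment variable are assumptions based on NVIDIA's usual API catalog conventions, so check build.nvidia.com for the exact values before relying on them.

```python
import json
import os
import urllib.request

# Assumed endpoint and model name for the DeepSeek-R1 NIM preview;
# verify both on build.nvidia.com before use.
ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "deepseek-ai/deepseek-r1"


def build_request(prompt, api_key):
    """Assemble an OpenAI-style chat-completion request for the NIM service."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.6,
    }
    headers = {
        "Content-Type": "application/json",
        # A placeholder is used when no key is set in the environment.
        "Authorization": f"Bearer {api_key or 'YOUR_API_KEY'}",
    }
    return urllib.request.Request(
        ENDPOINT, data=json.dumps(payload).encode("utf-8"), headers=headers
    )


req = build_request(
    "Summarize what a distilled LLM is in one sentence.",
    os.environ.get("NVIDIA_API_KEY"),
)
# The request is only sent when a real key is available:
# with urllib.request.urlopen(req) as resp: print(resp.read())
print(req.full_url)
```

Because the request follows the common chat-completions shape, the same payload should work largely unchanged against a self-hosted NIM container, which is the deployment path NVIDIA describes for enterprises that want to keep data on their own infrastructure.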