Why it matters: Nvidia has released a demo application that leverages select video cards to run a personal AI chatbot on your PC. Chat with RTX runs on Windows 11 and requires a GeForce RTX 30 or 40 Series GPU (or an RTX Ampere or Ada Generation GPU). It uses retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration to create a personalized GPT large language model based on your own unique content.
Besides local documents, you can feed it videos from the web, including clips from YouTube. Simply plug in the URL of the source video, then ask the chatbot any question about its content.
As the name suggests, retrieval-augmented generation is a technique that enhances the accuracy and reliability of generative AI models using facts gathered from external sources. Nvidia has a full write-up on the subject for those seeking a deeper understanding of the technique.
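To make the idea concrete, here is a minimal sketch of the retrieve-then-augment loop in Python. It is not Nvidia's implementation: Chat with RTX relies on TensorRT-LLM and GPU acceleration, whereas this toy version ranks a few hypothetical notes by simple word overlap and stops at building the prompt a language model would receive. The names `NOTES`, `retrieve` and `build_prompt` are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical personal notes standing in for a user's own content.
NOTES = [
    "The quarterly report is due on the first Friday of March.",
    "The office printer is on the second floor next to the kitchen.",
    "Team lunch is scheduled for Thursday at noon.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Prepend the retrieved context to the question (the 'augmented' step)."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("Where is the printer?", NOTES))
```

A real system would swap the word-count scoring for learned embeddings and pass the finished prompt to a language model, but the shape is the same: find the relevant passages first, then let the model answer with them in view.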
Because it runs locally instead of in the cloud and draws on your own personal data at query time rather than being retrained on it, it should be both fast and contextually relevant. Nvidia also says it delivers secure results – plausible considering it isn’t transmitting sensitive data over the internet.
Tom Warren of The Verge spent some time with a pre-release version of Chat with RTX. Although it was a little rough around the edges, he said he could see it being a valuable tool for journalists or anyone needing to analyze a set of documents.
For example, Warren was able to have the bot summarize Microsoft’s entire Xbox Game Pass strategy using legal documents from Redmond’s court battle with the FTC. Things were a bit buggier on the video side, however, as the app somehow loaded the transcript of a completely different video instead of the intended one. Notably, it was not even a video that Warren had previously queried.
If you have a GeForce RTX 30 or 40 Series GPU and want to give Chat with RTX a spin, head over to Nvidia’s website to grab the installation file.