According to the companies, the system is designed to understand user intent and carry out the full transaction flow without requiring traditional app navigation. The experience will initially roll out with Swiggy as an early partner on the Indus App, where users will be able to order food by simply speaking to an AI assistant.
According to a CNBC report, the companies said similar voice-based commerce capabilities can also be integrated by businesses into their own platforms. As part of an early rollout, a conversational assistant has already been deployed on The Derma Co website, allowing users to browse and purchase products using voice commands.
As part of the collaboration, Sarvam’s technology will also be integrated into Razorpay’s Agent Studio, enabling developers to build multilingual AI agents that can interact with users in languages such as Hindi and Hinglish.
The companies said the move aims to make digital commerce more accessible for India’s multilingual user base, with AI agents handling everything from product discovery to checkout in a single conversational flow.
Razorpay collaborates with Gnani.ai
Last month, Razorpay also announced a partnership with the Indian AI startup Gnani.ai that targets a narrower use case than the broader voice commerce push with Sarvam AI. The companies have introduced an agentic AI collections platform that enables businesses to complete payment transactions during live customer calls. The system allows an AI agent to assess intent, generate payment requests (such as UPI links), and confirm transactions within the same interaction.
Unlike the Sarvam AI collaboration, which is centred on end-to-end conversational commerce (discovery to checkout), this platform is targeted at automating payment collections. It integrates Gnani.ai’s voice AI with Razorpay’s payments infrastructure to handle the full payment workflow in real time, including verification, link generation, tracking, and confirmation, which makes it more focused on financial operations than on general consumer transactions.
What is Sarvam AI?
Sarvam AI is a Bengaluru-based startup focused on building speech, language, and multimodal AI systems tailored for Indian use cases. Instead of a single general-purpose chatbot, the company develops specialised models for tasks such as speech recognition, text-to-speech, translation, and document understanding, with a strong focus on Indian languages and formats.
Its portfolio includes Saaras for speech recognition, Bulbul for text-to-speech, Saarika for transcription, Mayura for translation, and Sarvam-M, a multilingual reasoning model, among others. On the vision side, Sarvam Vision handles OCR and document analysis, while applications like Samvaad enable voice-based interactions.