Saaras V3 beats Gemini, GPT-4o on Indian speech benchmarks, says Sarvam AI




Indian AI startup Sarvam AI has released a new version of its speech recognition model, Saaras V3, and says it outperforms several widely used global systems, including Google’s Gemini 3 Pro, OpenAI’s GPT-4o Transcribe, Deepgram Nova-3, and ElevenLabs Scribe v2, on benchmarks focused on Indian languages and Indian-accented English.

 


The company’s co-founder Pratyush Kumar shared the results in a post on X, alongside benchmark charts comparing Saaras V3 against competing models on the IndicVoices and Svarah datasets. According to him, Saaras V3 recorded a lower word error rate than the other models across the most widely used Indian languages in the IndicVoices benchmark and also led on the Svarah benchmark, which focuses on Indian-accented English.

 
 


On the subset of the 10 most popular languages in the IndicVoices dataset, Sarvam reports that Saaras V3 achieved a word error rate of about 19.3 per cent, compared to higher error rates for Gemini 3 Pro, GPT-4o Transcribe, Deepgram Nova-3, and Scribe v2. The company also said the performance gap widens on the remaining languages in the dataset, which include several lower-resource Indian languages.
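Word error rate, the metric behind these comparisons, counts the substitutions, deletions, and insertions needed to turn a model’s transcript into the reference transcript, divided by the number of reference words. The short Python sketch below illustrates the calculation; it is not Sarvam’s evaluation code, and real benchmark pipelines typically normalise text (casing, punctuation, numerals) before scoring.

```python
# Minimal word error rate (WER) via word-level edit distance.
# Illustrative only: not Sarvam's evaluation code, and no text normalisation.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1   # substitution cost
            dp[i][j] = min(dp[i - 1][j] + 1,              # deletion
                           dp[i][j - 1] + 1,              # insertion
                           dp[i - 1][j - 1] + cost)       # match / substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution and one deletion against a five-word reference -> WER = 0.4
print(wer("the train leaves at nine", "the train leave at"))
```

A reported word error rate of about 19.3 per cent therefore corresponds to roughly one edit for every five reference words.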


Saaras V3 on IndicVoices benchmark (Source: Sarvam)

On the Svarah benchmark, which is built around Indian-accented English speech from speakers across multiple states, Saaras V3 again recorded the lowest word error rate among the compared systems, according to the figures shared by Sarvam. 


Saaras V3 on Svarah benchmark (Source: Sarvam)


Saaras V3: What’s new


Sarvam says Saaras V3 is built on a new architecture and expands support to all 22 scheduled Indian languages, along with English. A key change in this version is native support for real-time, streaming speech recognition, where the model begins producing text while audio is still coming in, instead of waiting for the full clip before transcribing.
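In practice, the difference between batch and streaming recognition shows up on the client side: audio is sent in small chunks and partial transcripts arrive before the clip ends. The sketch below shows what such a client loop could look like; the endpoint URL, message schema, and chunk size are illustrative assumptions, not Sarvam’s published API.

```python
# Hypothetical streaming-ASR client loop. The URL, control message, and result
# fields are assumptions for illustration, not Sarvam's actual interface.
import asyncio
import json

import websockets  # third-party package: pip install websockets

CHUNK_MS = 100                        # ~100 ms of audio per message (assumed)
SAMPLE_RATE = 16_000                  # 16 kHz, 16-bit mono PCM (assumed)
BYTES_PER_CHUNK = SAMPLE_RATE * 2 * CHUNK_MS // 1000

async def stream_transcribe(pcm_audio: bytes, url: str) -> None:
    async with websockets.connect(url) as ws:

        async def send_audio():
            # Send audio in small chunks, pacing roughly at real time.
            for start in range(0, len(pcm_audio), BYTES_PER_CHUNK):
                await ws.send(pcm_audio[start:start + BYTES_PER_CHUNK])
                await asyncio.sleep(CHUNK_MS / 1000)
            await ws.send(json.dumps({"event": "end_of_audio"}))  # assumed control message

        async def read_transcripts():
            # Partial hypotheses arrive while audio is still being sent.
            async for message in ws:
                result = json.loads(message)
                kind = "final" if result.get("is_final") else "partial"
                print(f"[{kind}] {result.get('text', '')}")

        await asyncio.gather(send_audio(), read_transcripts())

# asyncio.run(stream_transcribe(audio_bytes, "wss://example.invalid/asr/stream"))
```

The point of the pattern is that text for the opening seconds of a call can be displayed while later audio is still being captured, which is what makes live captions and voice agents feasible.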

 


According to the company’s technical blog, Saaras V3 is trained on more than one million hours of multilingual audio covering different Indian languages, accents, and recording conditions, with a focus on code-mixed and noisy speech. Training involved large-scale pre-training, followed by supervised fine-tuning and reinforcement learning, and additional post-training steps aimed at reducing long-tail errors and improving consistency across languages.


Sarvam says the streaming version of the model is designed to keep accuracy close to the batch mode while reducing latency, making it suitable for use cases such as live captions, voice assistants, call-centre tools, and real-time transcription.


Beyond basic transcription


Sarvam says Saaras V3 is positioned as more than a simple speech-to-text system. The model supports automatic language detection, word-level timestamps, and speaker diarisation, which allows it to separate and label different speakers in a conversation. These features are aimed at use cases such as call analytics, meeting transcripts, media subtitling, and customer support tools, where structure and speaker attribution matter in addition to raw text.
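To see what those features add beyond raw text, the snippet below sketches a hypothetical shape for a diarised, timestamped transcript; the field names are assumptions for illustration, not Sarvam’s actual response schema.

```python
# Hypothetical diarised, word-timestamped transcript structure.
# Field names are illustrative assumptions, not Sarvam's response schema.
transcript = {
    "language": "hi",                      # detected language code
    "segments": [
        {
            "speaker": "SPEAKER_00",
            "start": 0.00, "end": 2.40,    # seconds
            "text": "नमस्ते, मैं आपकी कैसे मदद कर सकता हूँ?",
            "words": [
                {"word": "नमस्ते,", "start": 0.00, "end": 0.55},
                # ... one entry per word
            ],
        },
        {
            "speaker": "SPEAKER_01",
            "start": 2.60, "end": 5.10,
            "text": "मुझे मेरे बिल के बारे में जानकारी चाहिए।",
            "words": [],
        },
    ],
}

# Call-analytics or subtitling tools can group segments by speaker or align
# captions to the audio using the timestamps.
for seg in transcript["segments"]:
    print(f'{seg["speaker"]} [{seg["start"]:.2f}-{seg["end"]:.2f}s]: {seg["text"]}')
```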

 


The company has also exposed different operating modes that trade off latency and accuracy, ranging from a “fast” mode focused on low time-to-first-token to more accuracy-focused settings for applications where transcription quality is the priority.


Sarvam Vision and earlier benchmark claims


The Saaras V3 results follow earlier benchmark claims by Sarvam around its document-focused models. In previous disclosures, the company said its Sarvam Vision model posted higher accuracy scores than several general-purpose systems on tests focused on document OCR, layout understanding, and multi-script Indian documents. Those evaluations covered tasks such as reading order detection, table parsing, and handling complex page layouts, areas where models trained mainly on Western and English-language data often struggle with Indian scripts and formats.


Sarvam has positioned Sarvam Vision as a vision-language system built specifically for documents rather than for general image understanding, combining a core model with separate components for layout and structure analysis. The company has argued that this task-specific approach, along with training on Indian-language and Indian-format data, explains the performance differences seen in those benchmarks. The Saaras V3 results extend that same argument into speech recognition, particularly for Indian languages, code-mixed inputs, and Indian-accented English.


What is Sarvam AI


Sarvam AI is a Bengaluru-based startup focused on building speech, language, and multimodal AI systems for Indian use cases. Instead of training a single general-purpose chatbot, the company has been developing a set of task-specific models aimed at areas such as speech recognition, speech synthesis, translation, and document understanding, where performance depends heavily on how well systems handle local languages, scripts, and formats.

 


Alongside Saaras, its speech recognition line, Sarvam’s portfolio includes Bulbul, a text-to-speech system for Indian languages; Saarika, a speech-to-text model focused on transcription; Mayura, a text translation model; and Sarvam-M, a multilingual reasoning language model. On the vision side, Sarvam Vision is its document understanding model designed for OCR and layout-aware reading of scanned and photographed documents. The company has also built applications such as Samvaad, a voice-based conversational system that runs on top of its speech and language models.

 


It is one of the 12 startups working with the Indian government under the IndiaAI Mission to develop indigenous multilingual and multimodal large language models.




AI Impact Summit 2026: How India plans to deploy, govern and procure AI




As New Delhi prepares to host the India-AI Impact Summit 2026 from February 16 to 20 at Bharat Mandapam, it is also positioning itself as the first country in the Global South to convene a global, government-led conversation on artificial intelligence. The summit, which is anchored around the themes of People, Planet and Progress, is designed to move the AI debate beyond principles and into questions of deployment, governance and state capacity, especially how governments build, buy, and use AI systems at scale.

 


It is within this larger global conversation that India’s governance and procurement policies for AI, covering how it plans to regulate, deploy, and buy AI systems, will take shape.

 
 


The IndiaAI Mission, launched in 2024 with a budget of ₹10,372 crore, is the centrepiece of India’s approach to artificial intelligence. The government is using the initiative as its primary vehicle for deploying AI technologies across public services while building the foundational systems needed for adoption at scale.

 


IndiaAI Mission: The backbone of state-led AI deployment 


The IndiaAI Mission is structured around seven deployment pillars: compute capacity, datasets, innovation centres, application development, future skills, startup financing, and safe and trusted AI. Together, these pillars are meant to address the full lifecycle of AI in government, from infrastructure and data to skills, governance and use cases.

 


A key focus is compute capacity, with the government planning access to large-scale AI infrastructure, including high-performance GPUs, to support public sector projects, startups and researchers. Alongside this, the mission emphasises creation and curation of high-quality datasets for public good applications, particularly in healthcare, agriculture and governance.

 


AI deployment connects directly to the Digital India initiative, as AI systems are increasingly being integrated into digital public infrastructure and into the data-driven decision-making systems used by ministries, state governments, and local municipalities. The AI Impact Summit itself has been positioned as a platform to showcase and assess these deployments rather than to announce new policy.

 


From NPAI to IndiaAI: how the framework evolved 


India’s current AI architecture builds on earlier institutional efforts. The National Program on Artificial Intelligence (NPAI), launched by the Ministry of Electronics and Information Technology (MeitY), functions as an umbrella initiative focused on social impact, inclusion and innovation.

 


NPAI rests on four pillars: a National Centre on AI, a Data Management Office, skilling in AI, and responsible AI. These elements now operate in parallel with, and are complemented by, the broader IndiaAI Mission, which expanded the scope in 2024 to include large-scale compute, startups, and application deployment.

 


The National Centre on AI conducts applied research and runs pilot projects in key areas including healthcare, agriculture, education and smart cities, working through partnerships with academic institutions and industry. These pilots feed into government deployment strategies rather than remaining standalone research projects.

 


How the government is procuring AI systems 


India does not yet have a single, unified AI procurement policy. Instead, procurement is being shaped through revised norms, mission guidelines and existing public procurement platforms.

 


Under the IndiaAI Mission, MeitY revised eligibility conditions in 2024 to widen participation. Minimum turnover requirements for primary bidders were reduced from ₹100 crore to ₹50 crore, and for consortium members to ₹25 crore. Technical thresholds for AI compute procurement were also relaxed, including lower GPU performance and memory requirements, to allow more domestic firms to compete.

 


Procurement is aligned with Make in India rules, with preference for Class I and Class II suppliers, reinforcing local sourcing and domestic capacity building.

 


Separately, AI tools are being used to improve procurement processes themselves, including efficiency enhancements on the Government e-Marketplace (GeM) platform.

 


Budget support for India’s AI push 


The Union Budget 2026-27 has reinforced this direction, allocating funds for AI computation and skilling through the India Semiconductor Mission 2.0, thereby complementing IndiaAI’s infrastructure push.

 


The India-AI Impact Summit will provide an operational platform for these ideas to be tested, refined and translated into action, including how public procurement will evolve to match India’s stated goals of safe, inclusive, and accountable AI integration across governance and public services.




Elon Musk restructures xAI's teams following co-founders' departure




By Carmen Arroyo


 
Elon Musk said he restructured xAI, his artificial intelligence startup, following the exit of two of its co-founders earlier this week.


XAI will be organized into four core areas, the billionaire told staff in a meeting on Wednesday: Grok’s chatbot and voice product; coding; the Imagine video product; and Macrohard, an AI software company run by digital agents. Musk presented the plan in an all-hands meeting with xAI staffers, which he made public on the social network X.

 


“What matters is velocity and acceleration,” he told employees. “If you are moving faster, you will be the leader.” He also thanked the people who have departed the company.

 
 


The meeting followed the back-to-back departures of Jimmy Ba and Tony Wu, two of the startup’s co-founders, along with a handful of other staff members who’ve left over the past few days.

 


Aman Madaan, who joined xAI in 2024, is leading the main chatbot and voice division. In the meeting, he noted that xAI is quickly developing its models, spurred on by the success of OpenAI’s voice model. “We had nothing, but in six months we developed it from scratch,” he said.

 


Co-founder Manuel Kroiss will lead the coding team, while Guodong Zhang, another of the co-founders, will oversee video generation in addition to helping with coding. Toby Pohlen, also part of the founding team, will be in charge of Macrohard, a division whose name is a play on Microsoft Corp.

 


“Most of the AI compute is gonna be understanding real-time video generation,” Musk said. “And we expect to be leaders in that.”

 


They all emphasized that xAI is looking to hire. 

 


Twelve original xAI co-founders, including Musk, launched the company in 2023. Ba and Wu are the fifth and sixth from that group to exit in the past two years. Kyle Kosic left in 2024, followed by Igor Babuschkin and Christian Szegedy in 2025. Greg Yang, another co-founder, said last month that he would step back from his role after being diagnosed with Lyme disease. 

 


The exits follow xAI’s recent merger with SpaceX, a move that valued the combined company at $1.25 trillion, Bloomberg reported. That deal could ease a funding crunch for xAI, which has been raising large sums of capital as it burns through cash in its bid to build out data centers, buy expensive computing chips and pay for talent.

 


xAI has a large Colossus data center site in Memphis, Tennessee, and is planning an expansion of the complex. The company has already purchased a third building in the area that will bring its computing capacity to almost 2 gigawatts, Musk said late last year. That expansion, which is technically across state lines in Mississippi, will include an investment north of $20 billion. The new building, which Musk has dubbed Macroharder, will require 10,000 to 20,000 GB300 systems, Musk said in the call.

 


Nikita Bier, who is in charge of X’s product, said that the social network and its adjacent apps, including Grok, have reached about 1 billion users. January was the best month ever in terms of engagement for X, he said. He also noted that new users spend 55 per cent more time a day in the app than they did six months ago. The app “has been rebuilt to be better than ever,” and is now generating $1 billion in annual recurring revenue tied to subscriptions, he said.

 


Musk said the company will launch a new X Chat app for those who only want to use it for messaging. He reiterated that he won’t be adding ads to Grok. X Money, a payments feature the social network has been working on for years that will let users send money within the app, will be available to a limited number of external test users in the coming months, he said. “It’ll be the place where all the money is. It’s going to be a game changer,” Musk said.




Cyber conference: Experts urge telcos to tighten rules to curb cyber fraud




Experts have recommended that telecom service providers take greater responsibility in strengthening customer verification processes and extending proactive support to investigative agencies to deal with cyber criminals, officials said on Wednesday.


The recommendations were made at the national conference on ‘Tackling Cyber-Enabled Frauds and Dismantling the Ecosystem’ organised by the CBI and the home ministry’s anti-cybercrime unit I4C.


The CBI and I4C will send their report and recommendations to the ministry based on the deliberations of the two-day conference on cyber crime that concluded on Wednesday.


The use of artificial intelligence to strengthen investigation capabilities in tackling cyber crimes was also discussed during the conference.

 


During the conference, around 375 experts from fields including law enforcement, banking and finance, cyber security, and telecom, among others, presented their views on tackling cyber crimes.


“Participants emphasized the need for a coordinated national response involving law enforcement agencies, financial institutions, and technology intermediaries,” a CBI spokesperson said in a statement.


The officials said experts discussed the misuse of telecom infrastructure, including SIM and eSIM vulnerabilities, in cyber frauds.


“The deliberations highlighted regulatory challenges and stressed the responsibility of telecom service providers (TSPs) in strengthening customer verification processes, preventing misuse, and extending proactive support to investigative agencies,” the statement said.


Participants also stressed the importance of faster data sharing, timely preservation of digital evidence, and robust cooperation between technology companies and law enforcement agencies, the officials said.




Former OpenAI researcher quits as firm explores advertising like Facebook




By Zoë Hitzig

 


This week, OpenAI started testing ads on ChatGPT. I also resigned from the company after spending two years as a researcher helping to shape how AI models were built and priced, and guiding early safety policies before standards were set in stone.


I once believed I could help the people building AI get ahead of the problems it would create. This week confirmed my slow realization that OpenAI seems to have stopped asking the questions I’d joined to help answer. 


I don’t believe ads are immoral or unethical. AI is expensive to run, and ads can be a critical source of revenue. But I have deep reservations about OpenAI’s strategy. 

 


For several years, ChatGPT users have generated an archive of human candor that has no precedent, in part because people believed they were talking to something that had no ulterior agenda. Users are interacting with an adaptive, conversational voice to which they have revealed their most private thoughts. People tell chatbots about their medical fears, their relationship problems, their beliefs about God and the afterlife. Advertising built on that archive creates a potential for manipulating users in ways we don’t have the tools to understand, let alone prevent. 


Many people frame the problem of funding AI as choosing the lesser of two evils: restrict access to transformative technology to a select group of people wealthy enough to pay for it, or accept advertisements even if it means exploiting users’ deepest fears and desires to sell them a product. I believe that’s a false choice. Tech companies can pursue options that could keep these tools broadly available while limiting any company’s incentives to surveil, profile and manipulate its users. 


OpenAI says it will adhere to principles for running ads on ChatGPT: The ads will be clearly labelled, appear at the bottom of answers and will not influence responses. I believe the first iteration of ads will probably follow those principles. But I’m worried subsequent iterations won’t, because the company is building an economic engine that creates strong incentives to override its own rules. (The New York Times has sued OpenAI for copyright infringement of news content related to AI systems. OpenAI has denied those claims.) 


In its early years, Facebook promised that users would control their data and be able to vote on policy changes. Those commitments eroded. The company stopped holding public votes on policy. Privacy changes marketed as giving users more control over their data were found by the Federal Trade Commission to have done the opposite, and in fact made private information public. All of this happened gradually under pressure from an advertising model that rewarded engagement above all else. 


The erosion of OpenAI’s own principles to maximize engagement may already be underway. It’s against company principles to optimize user engagement solely to generate more advertising revenue, but it has been reported that the company already optimizes for daily active users anyway, likely by encouraging the model to be more flattering and sycophantic. This optimization can make users feel more dependent on AI for support in their lives. We’ve seen the consequences of dependence, including psychiatrists documenting instances of “chatbot psychosis” and allegations that ChatGPT reinforced suicidal ideation in some users. 

Still, advertising revenue can help ensure that access to the most powerful AI tools doesn’t default to those who can pay. Sure, Anthropic says it will never run ads on Claude, but Claude has a small fraction of ChatGPT’s 800 million weekly users; its revenue strategy is completely different. Moreover, top-tier subscriptions for ChatGPT, Gemini and Claude now cost $200 to $250 a month — more than 10 times the cost of a standard subscription to Netflix for a single piece of software. 


So the real question is not ads or no ads. It is whether we can design structures that avoid both excluding people from using these tools, and potentially manipulating them as consumers. I think we can. 


One approach is explicit cross subsidies — using profits from one service or customer base to offset losses from another. If a business pays AI to do high-value labor at scale that was once the job of human employees — for example, a real-estate platform using AI to write listings or valuation reports — it should also pay a surcharge that subsidizes free or low-cost access for everyone else. 


This approach takes some inspiration from what we already do with essential infrastructure. The Federal Communications Commission requires telecom carriers to contribute to a fund to keep phone and broadband affordable in rural areas and to low-income households. Many states add a public-benefits charge to electricity bills to provide low-income assistance. 


A second option is to accept advertising but pair it with real governance — not a blog post of principles, but a binding structure with independent oversight over how personal data is used. There are partial precedents for this. German co-determination law requires large companies like Siemens and Volkswagen to give workers up to half the seats on supervisory boards, showing that formal stakeholder representation can be mandatory inside private firms. Meta is bound to follow content moderation rulings issued by its Oversight Board, an independent body of outside experts (though its efficacy has been criticized). 


What the AI industry needs is a combination of these approaches — a board that includes both independent experts and representatives of the people whose data is at stake, with binding authority over what conversational data can be used for targeted advertisement, what counts as a material policy change and what users are told. 


A third approach involves putting users’ data under independent control through a trust or cooperative with a legal duty to act in users’ interests. For instance, MIDATA, a Swiss cooperative, lets members store their health data on an encrypted platform and decide, case by case, whether to share it with researchers. MIDATA’s members govern its policies at a general assembly, and an ethics board they elect reviews research requests for access. 

None of these options are easy. But we still have time to work them out to avoid the two outcomes I fear most: a technology that manipulates the people who use it at no cost, and one that exclusively benefits the few who can afford to use it. 
(This is an NYT piece, and these are the personal opinions of the writer. They do not reflect the views of www.business-standard.com or the Business Standard newspaper)

 




Tech Wrap Feb 11: Samsung Galaxy Unpacked, Android 17, Google Photos on iOS



 


Google has confirmed that Android 17 beta 1 is set to arrive “soon” for public testing. The update was announced shortly after Android 16 QPR3 Beta 2.1 was released, indicating that the next major Android version is moving into its beta stage. This year, Google appears to be revising its release strategy by skipping the traditional Developer Preview phase and moving directly to beta 1.

 

 


Google is extending its ‘Create with AI’ feature in Google Photos to iPhone and iPad users in India. The feature debuted on Android a few months earlier and is now expanding to Apple devices in selected markets. It enables users to edit and enhance photos using built-in AI templates.

 
 

 


Telecom company Airtel has introduced a new AI-powered security tool aimed at protecting users from bank fraud linked to OTP scams. The system functions at the network level and issues real-time alerts if it detects a potentially suspicious situation during a call. According to the company, the objective is to prevent customers from sharing banking OTPs with fraudsters while still on the call.

 


Google is broadening the availability of its Gemini-powered Fitbit Coach to additional countries beyond the US. First launched in public preview in October, the AI-based coaching tool offers customised workout routines, sleep analysis, and recovery recommendations based on user data. With this expansion, the public preview is also being made available to iOS users, allowing more Fitbit Premium members to access it via the updated Fitbit app.

 

 


YouTube Music has rolled out a new AI Playlist tool for Premium subscribers. The company announced the feature on X (formerly Twitter), confirming its availability on Android and iOS devices. The tool allows users to create playlists by describing their preferred mood, idea, or genre, either through text or voice input, instead of manually selecting songs.

 

 


In recent years, handheld gaming has evolved in two distinct directions. On one side are high-powered Windows-based devices like the Asus ROG Ally, ROG Xbox Ally, MSI Claw, and similar systems that function essentially as compact PCs with built-in controllers. On the other are smaller, more affordable retro handhelds such as the Anbernic RG35XX H, which focus primarily on emulation and classic console libraries.

 

 


Indian businesses rank among the most active global users of AI and machine-learning tools, with large volumes of sensitive data being processed through these systems, according to the Zscaler ThreatLabz 2026 AI Security Report. The report indicates that enterprise AI-related data transfers, along with data leakage incidents, are increasing more rapidly in India than in other regions.

 

 


The India AI Impact Summit, beginning February 16 in New Delhi, will gather participants from India and abroad to discuss advancements in artificial intelligence. Domestically, the focus will likely be on the 12 Indian startups selected under the IndiaAI Mission to develop indigenous foundation models trained on Indian languages and datasets. These firms are building large language models (LLMs) and multimodal systems tailored to local linguistic, sector-specific, and governance needs.

 

 


As New Delhi gets ready to host the India AI Impact Summit from February 16 to 20, the country is framing its AI strategy around measurable, large-scale outcomes rather than broad policy discussions. While previous editions in the UK, South Korea, and France focused on safety standards and innovation frameworks, the 2026 summit in India will emphasise technology deployment and tangible societal benefits.

 

 


Ahead of the India-AI Impact Summit 2026, India has introduced seven ‘chakras’ to guide global discussions on AI development and deployment. These chakras serve as thematic groups intended to convert broad AI principles into concrete policies and practical implementation.

 



