IISc Bengaluru team's AI research paper ranks among top 15 at global CVPR conference
A research paper authored by a team from the Indian Institute of Science (IISc) Bengaluru has secured a place among the top 15 submissions at the Computer Vision and Pattern Recognition (CVPR) 2026 conference, held earlier this month in Colorado, USA. CVPR is a premier global event in the field of computer vision, focusing on how computers interpret and process visual information from images and videos.
The paper, titled 'Rethinking Dataset Distillation: Hard Truths about Soft Labels', was written by Priyam Dey, R Venkatesh Babu, Additya Sahdev, Sunny Bhati, and Konda Reddy Mopuri from IISc's Computational and Data Science (CDS) Department. Of the approximately 16,000 submissions the conference received, the paper was ranked within the top 15, according to an IISc announcement earlier this month.
Professor R Venkatesh Babu, Head of the CDS Department, explained the concept of 'dataset distillation', a method for condensing a large dataset into a small number of representative samples that can still train AI models effectively. 'With AI, we have a large amount of data used in training models. You may need a very large and expensive network of training data. In a dataset where so many samples are available, can we get a handful of samples with which we can train the AI model? Then the training cost can come down drastically,' he said.
The research challenges conventional approaches that rely on complex synthetic datasets for training. According to Professor Babu, the team found that simple random samples of the training data can achieve accuracy similar to that of carefully crafted synthetic datasets. 'This is kind of a deviation from what people were doing continuously. We wanted to look back and say, this is not the correct way. Random training data samples also give you the same accuracy,' he added.
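To make the idea of a random-sample baseline concrete, here is a minimal Python sketch. It is an illustration only, not the IISc team's method: the function name sample_per_class, the per_class budget, and the toy labels are all assumptions introduced for this example. It simply picks the same number of real samples at random from each class.

```python
import random
from collections import defaultdict


def sample_per_class(labels, per_class, seed=0):
    """Return sorted indices of a class-balanced random subset.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)  # local generator so results are reproducible

    # Group dataset indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        # Never ask for more samples than the class actually has.
        k = min(per_class, len(pool))
        chosen.extend(rng.sample(pool, k))
    return sorted(chosen)


# Toy stand-in for a real dataset: 1,000 items spread over 10 classes.
labels = [i % 10 for i in range(1000)]
subset = sample_per_class(labels, per_class=5)
print(len(subset), "of", len(labels), "samples kept")
```

A model would then be trained only on the items at the selected indices. Whether such a subset matches a synthetic distilled set in accuracy is the paper's claim as reported here; the sketch does not demonstrate it.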
So far, the research has been applied to image classification tasks, such as sorting a million images into 1,000 categories. Professor Babu said similar techniques could be extended to other domains, including audio processing.
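For a sense of scale, the short Python sketch below works out how much of such a dataset would remain under a fixed per-category budget. The dataset size and category count come from the example above; the budgets of 1, 10, and 50 images per category are hypothetical values chosen for illustration and are not figures from the paper.

```python
TOTAL_IMAGES = 1_000_000  # "a million images", as in the example above
NUM_CLASSES = 1_000       # "1,000 categories"


def retained_fraction(images_per_class, total=TOTAL_IMAGES, classes=NUM_CLASSES):
    """Return (images kept, fraction of the original dataset kept)."""
    kept = images_per_class * classes
    return kept, kept / total


# Hypothetical per-category budgets, for illustration only.
for ipc in (1, 10, 50):
    kept, frac = retained_fraction(ipc)
    print(f"{ipc:>2} per category: {kept:>6,} images ({frac:.1%} of the original)")
```

Even the largest of these illustrative budgets keeps only a small share of the original images, which is why a reduced dataset can cut training cost so sharply.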
The team also highlighted the environmental benefits of reducing data volume. 'The volume of data is so much that we never pay attention to it and feed whatever is available to the machinery. That is what is emitting huge amounts of carbon. Any effort to reduce this amount of data could significantly reduce the carbon footprint,' Professor Babu said. By enabling more efficient training with fewer data points, dataset distillation can lower the energy consumption and associated carbon emissions of AI systems.
The recognition at CVPR underscores the global significance of the IISc team's contribution to making AI more efficient and environmentally sustainable.