Vibhav Vineet

Publications

Complete updated list available on Google Scholar.

Phi-4-reasoning technical report
ArXiv, 2025.

[Paper] [HF model]
Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
Vidhisha Balachandran, Jingya Chen, Lingjiao Chen, Shivam Garg, Neel Joshi, Yash Lara, John Langford, Besmira Nushi, Vibhav Vineet, Yue Wu, Safoora Yousefi.
ArXiv, 2025.

[Paper] [Eureka]
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad, Vibhav Vineet, Yogesh Singh Rawat.
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[Paper] [Project Page]
Unearthing Skill-Level Insights For Understanding Trade-Offs Of Foundation Models
Mazda Moayeri, Vidhisha Balachandran, Varun Chandrasekaran, Safoora Yousefi, Thomas Fel, Soheil Feizi, Besmira Nushi, Neel Joshi, Vibhav Vineet.
The Thirteenth International Conference on Learning Representations (ICLR), 2025.

[Paper] [Code & Dataset]
DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models
Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge.
The Thirteenth International Conference on Learning Representations (ICLR), 2025.

[Paper]
Exposing the Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
Joykirat Singh, Akshay Nambi, Vibhav Vineet.
The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025.

[Paper]
Grounding Task Assistance with Multimodal Cues from a Single Demonstration
Gabriel Herbert Sarch, Balasaravanan Thoravi Kumaravel, Sahithya Ravi, Vibhav Vineet, Andrew D Wilson.
The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025.

[Paper]
MM-GEN: Enhancing Task Performance Through Targeted Multimodal Data Curation
Siddharth Joshi, Besmira Nushi, Vidhisha Balachandran, Varun Chandrasekaran, Vibhav Vineet, Neel Joshi, Baharan Mirzasoleiman.
ArXiv, 2025.

[Paper]
Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models
Jiayu Wang, Yifei Ming, Zhenmei Shi, Vibhav Vineet, Xin Wang, Yixuan Li, Neel Joshi.
Neural Information Processing Systems (NeurIPS), 2024.

[Paper] [Code & Data]
PEEKABOO: Interactive Video Generation via Masked-Diffusion
Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[Paper]
DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets
Yash Jain, Harkirat Behl, Zsolt Kira, Vibhav Vineet.
Neural Information Processing Systems (NeurIPS), 2023.

[Paper] [Code]
Revealing the unseen: Benchmarking video action recognition under occlusion
Shresth Grover, Vibhav Vineet, Yogesh Singh Rawat.
Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS), 2023.

[Paper] [Project Page with Data]
On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes
Rajat Modi, Vibhav Vineet, Yogesh Singh Rawat.
Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS), 2023.

[Paper] [Code & Data]
Efficiently Robustify Pre-Trained Models
Nishant Jain, Harkirat Behl, Yogesh Rawat, Vibhav Vineet.
International Conference on Computer Vision (ICCV), 2023.

[Paper]
YCB Digital Twins for Sim2Real Analysis
Sruthi Sudhakar, Jon Hanzelka, Josh Bobillot, Tanmay Randhavane, Pedro Urbina, Neel Joshi, Vibhav Vineet.
International Conference on Computer Vision (ICCV), 2023.

[Paper]
Robustness Analysis on Foundational Segmentation Models
Madeline Chantry Schiappa, Sachidanand VS, Yunhao Ge, Ondrej Miksik, Yogesh S Rawat, Vibhav Vineet.
arXiv, 2023.

[Paper] [Code & Dataset]
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang.
arXiv, 2023.

[Paper][Project page]
PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining
Garrett Thomas, Ching-An Cheng, Ricky Loynd, Vibhav Vineet, Mihai Jalobeanu, Andrey Kolobov.
7th Conference on Robot Learning (CoRL), 2023.

[Paper]
A Large-Scale Robustness Analysis of Video Action Recognition Models
Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh S Rawat.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[Paper]
Neural-Sim: Learning to Generate Training Data with NeRF
Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet.
European Conference on Computer Vision (ECCV), 2022.

[Paper][Code & Dataset]
MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning
Xiaogang Xu, Hengshuang Zhao, Vibhav Vineet, Ser-Nam Lim, Antonio Torralba.
European Conference on Computer Vision (ECCV), 2022.

[Paper]
Scaling Novel Object Detection with Weakly Supervised Detection Transformers
Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi.
IEEE/CVF winter conference on applications of computer vision (WACV), 2023.

[Paper]
Multi-Modal Robustness Analysis Against Language And Visual Perturbations
Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh Rawat, Vibhav Vineet.
NeurIPS dataset and benchmark track, 2022.

[Paper]
Large-scale Robustness Analysis of Video Action Recognition Models
Madeline C Schiappa, Naman Biyani, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh Rawat.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[Paper]
Causalcity: Complex simulations with agency for causal discovery and reasoning
Daniel McDuff, Yale Song, Jiyoung Lee, Vibhav Vineet, Sai Vemprala, Nicholas Alexander Gyde, Hadi Salman, Shuang Ma, Kwanghoon Sohn, Ashish Kapoor.
Conference on Causal Learning and Reasoning (CLeaR), 2022.

[Paper][Project Page]
DALL-E for Detection: Language-driven Context Image Synthesis for Object Detection
Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Laurent Itti, Vibhav Vineet.
arXiv preprint arXiv:2206.09592 (Preprint), 2022.

[Paper]
Missingness bias in model debugging
Saachi Jain, Hadi Salman, Eric Wong, Pengchuan Zhang, Vibhav Vineet, Sai Vemprala, Aleksander Madry.
International Conference on Learning Representations (ICLR), 2022.

[Paper][Code]
Image Retrieval from Contextual Descriptions
Benno Krojer, Vaibhav Adlakha, Vibhav Vineet, Yash Goyal, Edoardo Ponti, Siva Reddy.
Association for Computational Linguistics (ACL), 2022.

[Paper][Code & Data]
Inferring Articulated Rigid Body Dynamics from RGBD Video
Eric Heiden, Ziang Liu, Vibhav Vineet, Erwin Coumans, Gaurav S Sukhatme.
International Conference on Intelligent Robots and Systems (IROS), 2022.

[Paper][VideoSim Code & Data]
One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning
Sharath Girish, Debadeepta Dey, Neel Joshi, Vibhav Vineet, Shital Shah, Caio Cesar Teodoro Mendes, Abhinav Shrivastava, Yale Song.
arXiv:2203.08130 (Preprint), 2022.

[Paper]
Learning Articulated Rigid Body Dynamics Simulations From Video
Eric Heiden, Ziang Liu, Vibhav Vineet, Erwin Coumans, Gaurav Sukhatme.
ICLR2022 Workshop on the Elements of Reasoning: Objects, Structure and Causality (ICLR workshop), 2022.

[Paper][Code & Dataset]
Taskography: Evaluating robot task planning over large 3D scene graphs
Christopher Agia, Krishna Murthy Jatavallabhula, Mohamed Khodeir, Ondrej Miksik, Vibhav Vineet, Mustafa Mukadam, Liam Paull, Florian Shkurti.
Conference on Robot Learning (CoRL), 2022.

[Paper]
Robust contrastive learning against noisy views
Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[Paper][Code]
Learning to align sequential actions in the wild
Weizhe Liu, Bugra Tekin, Huseyin Coskun, Vibhav Vineet, Pascal Fua, Marc Pollefeys.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[Paper][Code]
3db: A framework for debugging computer vision models
Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry.
NeurIPS, 2022.

[Paper][Project Page with Code]
Benchmarking Spatial Reasoning Abilities of Text-to-Image Generative Models
Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang.
ArXiv, 2022.

[Paper][Project Page with Code]
Prediction of object geometry from acoustic scattering using convolutional neural networks
Ziqi Fan, Vibhav Vineet, Chenshen Lu, TW Wu, Kyla McMullen.
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.

[Paper][Code & Dataset]
Depth completion using a view-constrained deep prior
Pallabi Ghosh, Vibhav Vineet, Larry S Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi.
International Conference on 3D Vision (3DV), 2020.

[Paper]
Learning visuomotor policies for aerial navigation using cross-modal representations
Rogerio Bonatti, Ratnesh Madaan, Vibhav Vineet, Sebastian Scherer, Ashish Kapoor.
International Conference on Intelligent Robots and Systems (IROS), 2020.

[Paper][Code]
Learning to Simulate Realistic LiDARs
Benoit Guillard, Sai Vemprala, Jayesh K. Gupta, Ondrej Miksik, Vibhav Vineet, Pascal Fua, Ashish Kapoor.
International Conference on Intelligent Robots and Systems (IROS), 2022.

[Paper]
RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs
Zhiwei Xu, Thalaiyasingam Ajanthan, Vibhav Vineet, Richard Hartley.
International Conference on 3D Vision (3DV), 2020.

[Paper]
AutoSimulate: (Quickly) Learning Synthetic Data Generation
Harkirat Singh Behl, Atılım Güneş Baydin, Ran Gal, Philip HS Torr, Vibhav Vineet.
European Conference on Computer Vision (ECCV), 2020.

[Paper]
Fast acoustic scattering using convolutional neural networks
Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi.
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020.

[Paper]
Photorealistic image synthesis for object instance detection
Tomáš Hodaň, Vibhav Vineet, Ran Gal, Emanuel Shalev, Jon Hanzelka, Treb Connell, Pedro Urbina, Sudipta N Sinha, Brian Guenter.
International conference on image processing (ICIP), 2019.

[Paper][Project Page with Data]
Learning Controls Using Cross-Modal Representations: Bridging Simulation and Reality for Drone Racing
Rogerio Bonatti, Ratnesh Madaan, Vibhav Vineet, Sebastian Scherer, Ashish Kapoor.
International Conference on Intelligent Robots and Systems (IROS), 2020.

[Paper]
Live Reconstruction of Large-Scale Dynamic Outdoor Worlds
Ondrej Miksik, Vibhav Vineet.
Conference on Computer Vision and Pattern Recognition Workshop (CVPR workshop), 2019.

[Paper]
Privacy-preserving action recognition using coded aperture videos
Zihao W Wang, Vibhav Vineet, Francesco Pittaluga, Sudipta N Sinha, Oliver Cossairt, Sing Bing Kang.
Conference on Computer Vision and Pattern Recognition Workshop (CVPR workshop), 2019.

[Paper]
Playing for data: Ground truth from computer games
Stephan R Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun.
European conference on computer vision (ECCV), 2016.

[Paper][Data]
Dense monocular depth estimation in complex dynamic scenes
Rene Ranftl, Vibhav Vineet, Qifeng Chen, Vladlen Koltun.
Conference on computer vision and pattern recognition (CVPR), 2016.

[Paper]
Feature space optimization for semantic video segmentation
Abhijit Kundu, Vibhav Vineet, Vladlen Koltun.
Conference on computer vision and pattern recognition (CVPR), 2016.

[Paper]
Struck: Structured output tracking with kernels
Sam Hare, Stuart Golodetz, Amir Saffari, Vibhav Vineet, Ming-Ming Cheng, Stephen L Hicks, Philip HS Torr.
IEEE transactions on pattern analysis and machine intelligence (TPAMI), 2015.

[Paper]
Semanticpaint: Interactive 3d labeling and learning at your fingertips
Julien Valentin, Vibhav Vineet, Ming-Ming Cheng, David Kim, Jamie Shotton, Pushmeet Kohli, Matthias Nießner, Antonio Criminisi, Shahram Izadi, Philip Torr.
ACM Transactions on Graphics (TOG), 2015.

[Paper]
Semanticpaint: A framework for the interactive segmentation of 3d scenes
Stuart Golodetz, Michael Sapienza, Julien PC Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W Murray, Shahram Izadi, Philip HS Torr.
arXiv preprint arXiv:1510.03727 (Preprint), 2015.

[Paper]
Incremental dense multi-modal 3d scene reconstruction
Ondrej Miksik, Yousef Amar, Vibhav Vineet, Patrick Pérez, Philip HS Torr.
International Conference on Intelligent Robots and Systems (IROS), 2015.

[Paper]
Semanticpaint: interactive segmentation and learning of 3d worlds
Stuart Golodetz, Michael Sapienza, Julien PC Valentin, Vibhav Vineet, Ming-Ming Cheng, Victor A Prisacariu, Olaf Kähler, Carl Yuheng Ren, Anurag Arnab, Stephen L Hicks, David W Murray, Shahram Izadi, Philip HS Torr.
ACM SIGGRAPH 2015 Emerging Technologies (SIGGRAPH), 2015.

[Paper]
Incremental dense semantic stereo fusion for large-scale semantic scene reconstruction
Vibhav Vineet, Ondrej Miksik, Morten Lidegaard, Matthias Nießner, Stuart Golodetz, Victor A Prisacariu, Olaf Kähler, David W Murray, Shahram Izadi, Patrick Pérez, Philip HS Torr.
IEEE international conference on robotics and automation (ICRA), 2015.

[Paper]
The semantic paintbrush: Interactive 3d mapping and recognition in large outdoor spaces
Ondrej Miksik, Vibhav Vineet, Morten Lidegaard, Ram Prasaath, Matthias Nießner, Stuart Golodetz, Stephen L Hicks, Patrick Pérez, Shahram Izadi, Philip HS Torr.
ACM Conference on Human Factors in Computing Systems (CHI), 2015.

[Paper]
Conditional random fields as recurrent neural networks
Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip HS Torr.
IEEE international conference on computer vision (ICCV), 2015.

[Paper]
ImageSpirit: Verbal guided image parsing
Ming-Ming Cheng, Shuai Zheng, Wen-Yan Lin, Vibhav Vineet, Paul Sturgess, Nigel Crook, Niloy J Mitra, Philip Torr.
ACM Transactions on Graphics (TOG), 2014.

[Paper]
Filter-based mean-field inference for random fields with higher-order terms and product label-spaces
Vibhav Vineet, Jonathan Warrell, Philip HS Torr.
International Journal of Computer Vision (IJCV), 2014.

[Paper]
A tiered move-making algorithm for general non-submodular pairwise energies
Vibhav Vineet, Jonathan Warrell, Philip HS Torr.
arXiv preprint arXiv:1403.6275 (Preprint), 2014.

[Paper]
Distributed non-convex admm-inference in large-scale random fields
Ondrej Miksik, Vibhav Vineet, Patrick Pérez, Philip HS Torr, F Cesson Sévigné.
British Machine Vision Conference (BMVC), 2014.

[Paper]
Dense semantic image segmentation with objects and attributes
Shuai Zheng, Ming-Ming Cheng, Jonathan Warrell, Paul Sturgess, Vibhav Vineet, Carsten Rother, Philip HS Torr.
IEEE conference on computer vision and pattern recognition (CVPR), 2014.

[Paper]
Posefield: An efficient mean-field based method for joint estimation of human pose, segmentation, and depth
Vibhav Vineet, Glenn Sheasby, Jonathan Warrell, Philip HS Torr.
Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), 2013.

[Paper]
Higher order priors for joint intrinsic image, objects, and attributes estimation
Vibhav Vineet, Carsten Rother, Philip Torr.
Neural Information Processing Systems (NIPS), 2013.

[Paper]
Efficient salient region detection with soft image abstraction
Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook.
International Conference on Computer vision (ICCV), 2013.

[Paper]
Improved Initialization and Gaussian Mixture Pairwise Terms for Dense Random Fields with Mean-field Inference
Vibhav Vineet, Jonathan Warrell, Paul Sturgess, Philip HS Torr.
British Machine Vision Conference (BMVC), 2012.

[Paper]
A tiered move-making algorithm for general pairwise MRFs
Vibhav Vineet, Jonathan Warrell, Philip HS Torr.
Conference on Computer Vision and Pattern Recognition (CVPR), 2012.

[Paper]
Fast minimum spanning tree computation
Pawan Harish, PJ Narayanan, Vibhav Vineet, Suryakant Patidar.
GPU Computing Gems Jade Edition (GPU Gems), 2012.

[Paper]
Fast graph cuts for computer vision
PJ Narayanan, Vibhav Vineet, Timo Stich.
GPU Computing Gems Emerald Edition (GPU Gems), 2011.

[Paper]
Human Instance Segmentation from Video using Detector-based Conditional Random Fields
Vibhav Vineet, Jonathan Warrell, Lubor Ladicky, Philip HS Torr.
British Machine Vision Conference (BMVC), 2011.

[Paper]
Solving Multilabel MRFs Using Incremental α-Expansion on the GPUs
Vibhav Vineet, PJ Narayanan.
Asian conference on computer vision (ACCV), 2009.

[Paper]
Fast minimum spanning tree for large graphs on the GPU
Vibhav Vineet, Pawan Harish, Suryakant Patidar, PJ Narayanan.
Conference on High Performance Graphics (HPG), 2009.

[Paper]
Large graph algorithms for massively multithreaded architectures
Pawan Harish, Vibhav Vineet, PJ Narayanan.
Tech. Rep. IIIT/TR/2009/74 (Tech Report), 2009.

[Paper]