Publications
A complete, up-to-date list is available on Google Scholar.
 
  - 
    Phi-4-reasoning technical report
 ArXiv, 2025.
      [Paper] [HF model] 
     
- 
    Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
 Vidhisha Balachandran, Jingya Chen, Lingjiao Chen, Shivam Garg, Neel Joshi, Yash Lara, John Langford, Besmira Nushi, Vibhav Vineet, Yue Wu, Safoora Yousefi.
 ArXiv, 2025.
      [Paper] [Eureka]
     
- 
    HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
 Shehreen Azad, Vibhav Vineet, Yogesh Singh Rawat.
 Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
      [Paper] [Project Page]
     
- 
    Unearthing Skill-Level Insights For Understanding Trade-Offs Of Foundation Models
 Mazda Moayeri, Vidhisha Balachandran, Varun Chandrasekaran, Safoora Yousefi, Thomas Fel, Soheil Feizi, Besmira Nushi, Neel Joshi, Vibhav Vineet.
 The Thirteenth International Conference on Learning Representations (ICLR), 2025.
      [Paper] [Code & Dataset]
     
- 
    DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models
 Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge.
 The Thirteenth International Conference on Learning Representations (ICLR), 2025.
      [Paper]
     
- 
    Exposing the Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
 Joykirat Singh, Akshay Nambi, Vibhav Vineet.
 The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025.
      [Paper]
     
- 
    Grounding Task Assistance with Multimodal Cues from a Single Demonstration
 Gabriel Herbert Sarch, Balasaravanan Thoravi Kumaravel, Sahithya Ravi, Vibhav Vineet, Andrew D Wilson.
 The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025.
      [Paper]
     
- 
    MM-GEN: Enhancing Task Performance Through Targeted Multimodal Data Curation
 Siddharth Joshi, Besmira Nushi, Vidhisha Balachandran, Varun Chandrasekaran, Vibhav Vineet, Neel Joshi, Baharan Mirzasoleiman.
 ArXiv, 2025.
      [Paper]
     
- 
    Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models
 Jiayu Wang, Yifei Ming, Zhenmei Shi, Vibhav Vineet, Xin Wang, Yixuan Li, Neel Joshi.
 Neural Information Processing Systems (NeurIPS), 2024.
      [Paper] [Code & Data]
     
- 
    PEEKABOO: Interactive Video Generation via Masked-Diffusion
 Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl.
 Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
      [Paper]
     
- 
    DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets
 Yash Jain, Harkirat Behl, Zsolt Kira, Vibhav Vineet.
 Neural Information Processing Systems (NeurIPS), 2023.
      [Paper] [Code]
     
- 
    Revealing the unseen: Benchmarking video action recognition under occlusion
 Shresth Grover, Vibhav Vineet, Yogesh Singh Rawat.
 Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS), 2023.
      [Paper] [Project Page with Data]
     
- 
    On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes
 Rajat Modi, Vibhav Vineet, Yogesh Singh Rawat.
 Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS), 2023.
      [Paper] [Code & Data]
     
- 
    Efficiently Robustify Pre-Trained Models
 Nishant Jain, Harkirat Behl, Yogesh Rawat, Vibhav Vineet.
 International Conference on Computer Vision (ICCV), 2023.
      [Paper]
     
- 
    YCB Digital Twins for Sim2Real Analysis
 Sruthi Sudhakar, Jon Hanzelka, Josh Bobillot, Tanmay Randhavane, Pedro Urbina, Neel Joshi, Vibhav Vineet.
 International Conference on Computer Vision (ICCV), 2023.
      [Paper]
     
- 
    Robustness Analysis on Foundational Segmentation Models
 Madeline Chantry Schiappa, Sachidanand VS, Yunhao Ge, Ondrej Miksik, Yogesh S Rawat, Vibhav Vineet.
 arXiv, 2023.
      [Paper] [Code & Dataset]
     
- 
    Controllable Text-to-Image Generation with GPT-4
 Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang.
 arXiv, 2023.
      [Paper][Project page]
     
- 
    PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining
 Garrett Thomas, Ching-An Cheng, Ricky Loynd, Vibhav Vineet, Mihai Jalobeanu, Andrey Kolobov.
 7th Conference on Robot Learning (CoRL), 2023.
      [Paper]
     
- 
    A Large-Scale Robustness Analysis of Video Action Recognition Models
 Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh S Rawat.
 Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
      [Paper]
     
- 
    Neural-Sim: Learning to Generate Training Data with NeRF
 Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet.
 European Conference on Computer Vision (ECCV), 2022.
      [Paper][Code & Dataset]
     
- 
    MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning
 Xiaogang Xu, Hengshuang Zhao, Vibhav Vineet, Ser-Nam Lim, Antonio Torralba.
 European Conference on Computer Vision (ECCV), 2022.
      [Paper]
     
- 
    Scaling Novel Object Detection with Weakly Supervised Detection Transformers
 Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi.
 IEEE/CVF winter conference on applications of computer vision (WACV), 2023.
      [Paper]
     
- 
    Multi-Modal Robustness Analysis Against Language And Visual Perturbations
 Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh Rawat, Vibhav Vineet.
 NeurIPS dataset and benchmark track, 2022.
      [Paper]
     
- 
    Large-scale Robustness Analysis of Video Action Recognition Models
 Madeline C Schiappa, Naman Biyani, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh Rawat.
 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
      [Paper]
     
- 
    Causalcity: Complex simulations with agency for causal discovery and reasoning
 Daniel McDuff, Yale Song, Jiyoung Lee, Vibhav Vineet, Sai Vemprala, Nicholas Alexander Gyde, Hadi Salman, Shuang Ma, Kwanghoon Sohn, Ashish Kapoor.
 Conference on Causal Learning and Reasoning (CLeaR), 2022.
      [Paper][Project Page]
     
- 
    DALL-E for Detection: Language-driven Context Image Synthesis for Object Detection
 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Laurent Itti, Vibhav Vineet.
 arXiv preprint arXiv:2206.09592 (Preprint), 2022.
      [Paper]
     
- 
    Missingness bias in model debugging
 Saachi Jain, Hadi Salman, Eric Wong, Pengchuan Zhang, Vibhav Vineet, Sai Vemprala, Aleksander Madry.
 International Conference on Learning Representations (ICLR), 2022.
      [Paper][Code]
     
- 
    Image Retrieval from Contextual Descriptions
 Benno Krojer, Vaibhav Adlakha, Vibhav Vineet, Yash Goyal, Edoardo Ponti, Siva Reddy.
 Association for Computational Linguistics (ACL), 2022.
      [Paper][Code & Data]
     
- 
    Inferring Articulated Rigid Body Dynamics from RGBD Video
 Eric Heiden, Ziang Liu, Vibhav Vineet, Erwin Coumans, Gaurav S Sukhatme.
 International Conference on Intelligent Robots and Systems (IROS), 2022.
      [Paper][VideoSim Code & Data]
     
- 
    One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning
 Sharath Girish, Debadeepta Dey, Neel Joshi, Vibhav Vineet, Shital Shah, Caio Cesar Teodoro Mendes, Abhinav Shrivastava, Yale Song.
 arXiv:2203.08130  (Preprint), 2022.
      [Paper]
     
- 
    Learning Articulated Rigid Body Dynamics Simulations From Video
 Eric Heiden, Ziang Liu, Vibhav Vineet, Erwin Coumans, Gaurav Sukhatme.
 ICLR2022 Workshop on the Elements of Reasoning: Objects, Structure and Causality  (ICLR workshop), 2022.
      [Paper][Code & Dataset]
     
- 
    Taskography: Evaluating robot task planning over large 3D scene graphs
 Christopher Agia, Krishna Murthy Jatavallabhula, Mohamed Khodeir, Ondrej Miksik, Vibhav Vineet, Mustafa Mukadam, Liam Paull, Florian Shkurti.
 Conference on Robot Learning  (CoRL), 2022.
      [Paper]
     
- 
    Robust contrastive learning against noisy views
 Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song.
 Conference on Computer Vision and Pattern Recognition  (CVPR), 2022.
      [Paper][Code]
     
- 
    Learning to align sequential actions in the wild
 Weizhe Liu, Bugra Tekin, Huseyin Coskun, Vibhav Vineet, Pascal Fua, Marc Pollefeys.
 Conference on Computer Vision and Pattern Recognition  (CVPR), 2022.
      [Paper][Code]
     
- 
    3db: A framework for debugging computer vision models
 Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry.
 NeurIPS, 2022.
      [Paper][Project Page with Code]
     
- 
    Benchmarking Spatial Reasoning Abilities of Text-to-Image Generative Models
 Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang.
 ArXiv, 2022.
      [Paper][Project Page with Code]
     
- 
    Prediction of object geometry from acoustic scattering using convolutional neural networks
 Ziqi Fan, Vibhav Vineet, Chenshen Lu, TW Wu, Kyla McMullen.
 International Conference on Acoustics, Speech and Signal Processing   (ICASSP), 2021.
      [Paper][Code & Dataset]
     
- 
    Depth completion using a view-constrained deep prior
 Pallabi Ghosh, Vibhav Vineet, Larry S Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi.
 International Conference on 3D Vision   (3DV), 2020.
      [Paper]
     
- 
    Learning visuomotor policies for aerial navigation using cross-modal representations
 Rogerio Bonatti, Ratnesh Madaan, Vibhav Vineet, Sebastian Scherer, Ashish Kapoor.
 International Conference on Intelligent Robots and Systems   (IROS), 2020.
      [Paper][Code]
     
- 
    Learning to Simulate Realistic LiDARs
 Benoit Guillard, Sai Vemprala, Jayesh K. Gupta, Ondrej Miksik, Vibhav Vineet, Pascal Fua, Ashish Kapoor.
 International Conference on Intelligent Robots and Systems   (IROS), 2022.
      [Paper]
     
- 
    RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs
 Zhiwei Xu, Thalaiyasingam Ajanthan, Vibhav Vineet, Richard Hartley.
 International Conference on 3D Vision   (3DV), 2020.
      [Paper]
     
- 
    AutoSimulate:(Quickly) Learning Synthetic Data Generation
 Harkirat Singh Behl, Atılım Güneş Baydin, Ran Gal, Philip HS Torr, Vibhav Vineet.
 European Conference on Computer Vision   (ECCV), 2020.
      [Paper]
     
- 
    Fast acoustic scattering using convolutional neural networks
 Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi.
 International Conference on Acoustics, Speech and Signal Processing   (ICASSP), 2020.
      [Paper]
     
- 
    Photorealistic image synthesis for object instance detection
 Tomáš Hodaň, Vibhav Vineet, Ran Gal, Emanuel Shalev, Jon Hanzelka, Treb Connell, Pedro Urbina, Sudipta N Sinha, Brian Guenter.
 International conference on image processing   (ICIP), 2019.
      [Paper][Project Page with Data]
     
- 
    Learning Controls Using Cross-Modal Representations: Bridging Simulation and Reality for Drone Racing
 Rogerio Bonatti, Ratnesh Madaan, Vibhav Vineet, Sebastian Scherer, Ashish Kapoor.
 International Conference on Intelligent Robots and Systems  (IROS), 2020.
      [Paper]
     
- 
    Live Reconstruction of Large-Scale Dynamic Outdoor Worlds
 Ondrej Miksik, Vibhav Vineet.
 Conference on Computer Vision and Pattern Recognition Workshop  (CVPR workshop), 2019.
      [Paper]
     
- 
    Privacy-preserving action recognition using coded aperture videos
 Zihao W Wang, Vibhav Vineet, Francesco Pittaluga, Sudipta N Sinha, Oliver Cossairt, Sing Bing Kang.
 Conference on Computer Vision and Pattern Recognition Workshop  (CVPR workshop), 2019.
      [Paper]
     
- 
    Playing for data: Ground truth from computer games
 Stephan R Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun.
 European conference on computer vision  (ECCV), 2016.
      [Paper][Data]
     
- 
    Dense monocular depth estimation in complex dynamic scenes
 Rene Ranftl, Vibhav Vineet, Qifeng Chen, Vladlen Koltun.
 Conference on computer vision and pattern recognition  (CVPR), 2016.
      [Paper]
     
- 
    Feature space optimization for semantic video segmentation
 Abhijit Kundu, Vibhav Vineet, Vladlen Koltun.
 Conference on computer vision and pattern recognition  (CVPR), 2016.
      [Paper]
     
- 
    Struck: Structured output tracking with kernels
 Sam Hare, Stuart Golodetz, Amir Saffari, Vibhav Vineet, Ming-Ming Cheng, Stephen L Hicks, Philip HS Torr.
 IEEE transactions on pattern analysis and machine intelligence  (TPAMI), 2015.
      [Paper]
     
- 
    Semanticpaint: Interactive 3d labeling and learning at your fingertips
 Julien Valentin, Vibhav Vineet, Ming-Ming Cheng, David Kim, Jamie Shotton, Pushmeet Kohli, Matthias Nießner, Antonio Criminisi, Shahram Izadi, Philip Torr.
 ACM Transactions on Graphics  (TOG), 2015.
      [Paper]
     
- 
    Semanticpaint: A framework for the interactive segmentation of 3d scenes
 Stuart Golodetz, Michael Sapienza, Julien PC Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W Murray, Shahram Izadi, Philip HS Torr.
 arXiv preprint arXiv:1510.03727  (Preprint), 2015.
      [Paper]
     
- 
    Incremental dense multi-modal 3d scene reconstruction
 Ondrej Miksik, Yousef Amar, Vibhav Vineet, Patrick Pérez, Philip HS Torr.
 International Conference on Intelligent Robots and Systems  (IROS), 2015.
      [Paper]
     
- 
    Semanticpaint: interactive segmentation and learning of 3d worlds
 Stuart Golodetz, Michael Sapienza, Julien PC Valentin, Vibhav Vineet, Ming-Ming Cheng, Victor A Prisacariu, Olaf Kähler, Carl Yuheng Ren, Anurag Arnab, Stephen L Hicks, David W Murray, Shahram Izadi, Philip HS Torr.
 ACM SIGGRAPH 2015 Emerging Technologies  (SIGGRAPH), 2015.
      [Paper]
     
- 
    Incremental dense semantic stereo fusion for large-scale semantic scene reconstruction
 Vibhav Vineet, Ondrej Miksik, Morten Lidegaard, Matthias Nießner, Stuart Golodetz, Victor A Prisacariu, Olaf Kähler, David W Murray, Shahram Izadi, Patrick Pérez, Philip HS Torr.
 IEEE international conference on robotics and automation  (ICRA), 2015.
      [Paper]
     
- 
    The semantic paintbrush: Interactive 3d mapping and recognition in large outdoor spaces
 Ondrej Miksik, Vibhav Vineet, Morten Lidegaard, Ram Prasaath, Matthias Nießner, Stuart Golodetz, Stephen L Hicks, Patrick Pérez, Shahram Izadi, Philip HS Torr.
 ACM Conference on Human Factors in Computing Systems  (CHI), 2015.
      [Paper]
     
- 
    Conditional random fields as recurrent neural networks
 Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip HS Torr.
 IEEE international conference on computer vision  (ICCV), 2015.
      [Paper]
     
- 
    ImageSpirit: Verbal guided image parsing
 Ming-Ming Cheng, Shuai Zheng, Wen-Yan Lin, Vibhav Vineet, Paul Sturgess, Nigel Crook, Niloy J Mitra, Philip Torr.
 ACM Transactions on Graphics  (TOG), 2014.
      [Paper]
     
- 
    Filter-based mean-field inference for random fields with higher-order terms and product label-spaces
 Vibhav Vineet, Jonathan Warrell, Philip HS Torr.
 International Journal of Computer Vision  (IJCV), 2014.
      [Paper]
     
- 
    A tiered move-making algorithm for general non-submodular pairwise energies
 Vibhav Vineet, Jonathan Warrell, Philip HS Torr.
 arXiv preprint arXiv:1403.6275  (Preprint), 2014.
      [Paper]
     
- 
    Distributed non-convex admm-inference in large-scale random fields
 Ondrej Miksik, Vibhav Vineet, Patrick Pérez, Philip HS Torr, F Cesson Sévigné.
 British Machine Vision Conference  (BMVC), 2014.
      [Paper]
     
- 
    Dense semantic image segmentation with objects and attributes
 Shuai Zheng, Ming-Ming Cheng, Jonathan Warrell, Paul Sturgess, Vibhav Vineet, Carsten Rother, Philip HS Torr.
 IEEE conference on computer vision and pattern recognition  (CVPR), 2014.
      [Paper]
     
- 
    Posefield: An efficient mean-field based method for joint estimation of human pose, segmentation, and depth
 Vibhav Vineet, Glenn Sheasby, Jonathan Warrell, Philip HS Torr.
 Energy Minimization Methods in Computer Vision and Pattern Recognition  (EMMCVPR), 2013.
      [Paper]
     
- 
    Higher order priors for joint intrinsic image, objects, and attributes estimation
 Vibhav Vineet, Carsten Rother, Philip Torr.
 Neural Information Processing Systems  (NIPS), 2013.
      [Paper]
     
- 
    Efficient salient region detection with soft image abstraction
 Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook.
 International Conference on Computer vision  (ICCV), 2013.
      [Paper]
     
- 
    Improved Initialization and Gaussian Mixture Pairwise Terms for Dense Random Fields with Mean-field Inference.
 Vibhav Vineet, Jonathan Warrell, Paul Sturgess, Philip HS Torr.
 BMVC  (BMVC), 2012.
      [Paper]
     
- 
    A tiered move-making algorithm for general pairwise MRFs
 Vibhav Vineet, Jonathan Warrell, Philip HS Torr.
 Conference on Computer Vision and Pattern Recognition  (CVPR), 2012.
      [Paper]
     
- 
    Fast minimum spanning tree computation
 Pawan Harish, PJ Narayanan, Vibhav Vineet, Suryakant Patidar.
 GPU Computing Gems Jade Edition  (GPU Gems), 2012.
      [Paper]
     
- 
    Fast graph cuts for computer vision
 PJ Narayanan, Vibhav Vineet, Timo Stich.
 GPU Computing Gems Emerald Edition  (GPU Gems), 2011.
      [Paper]
     
- 
    Human Instance Segmentation from Video using Detector-based Conditional Random Fields.
 Vibhav Vineet, Jonathan Warrell, Lubor Ladicky, Philip HS Torr.
 BMVC  (BMVC), 2011.
      [Paper]
     
- 
    Solving Multilabel MRFs Using Incremental α-Expansion on the GPUs
 Vibhav Vineet, PJ Narayanan.
 Asian conference on computer vision  (ACCV), 2009.
      [Paper]
     
- 
    Fast minimum spanning tree for large graphs on the GPU
 Vibhav Vineet, Pawan Harish, Suryakant Patidar, PJ Narayanan.
 Conference on High Performance Graphics  (HPG), 2009.
      [Paper]
     
- 
    Large graph algorithms for massively multithreaded architectures
 Pawan Harish, Vibhav Vineet, PJ Narayanan.
 Tech. Rep. IIIT/TR/2009/74  (Tech Report), 2009.
      [Paper]