Research
I am interested in computer vision and machine learning in general. My research focuses on multimodal large language models, video understanding, reinforcement learning, and world models.
pLSTM: parallelizable Linear Source Transition Mark networks
Korbinian Pöppel ,
Richard Freinschlag ,
Thomas Schmied ,
Wei Lin ,
Sepp Hochreiter ,
Arxiv , 2025
arxiv
/
code
/
video
STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving
Christian Fruhwirth-Reisinger ,
Dusan Malic ,
Wei Lin ,
David Schinagl ,
Samuel Schulter ,
Horst Possegger
Arxiv , 2025
arxiv
/
code
/
video
Teaching VLMs to Localize Specific Objects from In-context Examples
Sivan Doveh ,
Nimrod Shabtay ,
Wei Lin ,
Eli Schwartz ,
Hilde Kuehne ,
Raja Giryes ,
Rogerio Feris ,
Leonid Karlinsky ,
James Glass ,
Assaf Arbelle ,
Shimon Ullman ,
Muhammad Jehanzeb Mirza
ICCV , 2025
arxiv
/
code
/
video
PerLA: Perceptive 3D Language Assistant
Guofeng Mei ,
Wei Lin ,
Luigi Riz ,
Yujiao Wu ,
Fabio Poiesi ,
Yiming Wang
CVPR , 2025
arxiv
/
code
/
video
LiveXiv--A Multi-Modal Live Benchmark Based on Arxiv Papers Content
Nimrod Shabtay ,
Felipe Maia Polo ,
Sivan Doveh ,
Wei Lin ,
Muhammad Jehanzeb Mirza ,
Leshem Choshen ,
Mikhail Yurochkin ,
Yuekai Sun ,
Assaf Arbelle ,
Leonid Karlinsky ,
Raja Giryes
ICLR , 2025
arxiv
/
🤗 Dataset
/
code
/
video
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
Muhammad Jehanzeb Mirza ,
Mengjie Zhao ,
Zhuoyuan Mao ,
Sivan Doveh ,
Wei Lin ,
Paul Gavrikov ,
Michael Dorkenwald ,
Shiqi Yang ,
Saurav Jha ,
Hiromi Wakaki ,
Yuki Mitsufuji ,
Horst Possegger ,
Rogerio Feris ,
Leonid Karlinsky ,
James Glass
Arxiv , 2024
arxiv
/
code
/
video
Comparison Visual Instruction Tuning
Wei Lin ,
Muhammad Jehanzeb Mirza ,
Sivan Doveh ,
Rogerio Feris ,
Raja Giryes ,
Sepp Hochreiter ,
Leonid Karlinsky
In collaboration with the MIT-IBM Watson AI Lab
Arxiv , 2024
arxiv
/
🤗 Dataset
/
code
/
video
An approach for collecting visual instructions that improves the commonality- and difference-spotting capabilities of Large Multimodal Models.
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
*Irene Huang ,
*Wei Lin ,
*Muhammad Jehanzeb Mirza ,
Jacob Hansen,
Sivan Doveh ,
Victor Ion Butoi ,
Roei Herzig ,
Assaf Arbelle ,
Hilde Kuehne ,
Trevor Darrell ,
Chuang Gan ,
Aude Oliva ,
Rogerio Feris ,
Leonid Karlinsky
(*equal contribution)
In collaboration with the MIT-IBM Watson AI Lab
NeurIPS , 2024 Datasets & Benchmarks Track
arxiv
/
🤗 Dataset
/
code
/
video
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
Muhammad Jehanzeb Mirza ,
Leonid Karlinsky ,
Wei Lin ,
Sivan Doveh ,
Jakub Micorek ,
Mateusz Kozinski ,
Hilde Kuehne ,
Horst Possegger
In collaboration with the MIT-IBM Watson AI Lab
ECCV , 2024
arxiv
/
code
/
video
Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection
Christian Fruhwirth-Reisinger ,
Wei Lin ,
Dusan Malic ,
Horst Bischof ,
Horst Possegger
BMVC, 2024 Oral Presentation & Best Poster Award
arxiv
/
code
/
video
Towards Multimodal In-Context Learning for Vision & Language Models
Sivan Doveh ,
Shaked Perek ,
Muhammad Jehanzeb Mirza ,
Wei Lin ,
Amit Alfassy ,
Assaf Arbelle ,
Shimon Ullman ,
Leonid Karlinsky
ECCV 2024 Workshop on Multimodal Agents
arxiv
/
code
/
video
Overlooked Aspects in the Evaluation of Out-Of-Distribution Detection Methods
*Bernhard Lehner ,
*Christian Huber,
Bernhard Moser ,
Claus Hofmann ,
Wei Lin ,
Sepp Hochreiter (*equal contribution)
Arxiv, 2024
arxiv
/
code
/
video
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
Muhammad Jehanzeb Mirza ,
Leonid Karlinsky ,
Wei Lin ,
Mateusz Kozinski ,
Horst Possegger ,
Rogerio Feris ,
Horst Bischof
NeurIPS, 2023
arxiv
/
code
/
video
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
Wei Lin ,
Leonid Karlinsky ,
Nina Shvetsova ,
Horst Possegger ,
Mateusz Kozinski ,
Rameswar Panda ,
Rogerio Feris ,
Hilde Kuehne ,
Horst Bischof
In collaboration with the MIT-IBM Watson AI Lab
ICCV , 2023
arxiv
/
code
/
video
Unsupervised finetuning of vision-language models for zero-shot and few-shot action recognition, with GPT-3 text expansion and video frame captioning.
TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification
Muhammad Jehanzeb Mirza ,
Leonid Karlinsky ,
Wei Lin ,
Horst Possegger ,
Rogerio Feris ,
Horst Bischof
Arxiv, 2023
arxiv
/
code
/
video
MATE: Masked Autoencoders are Online 3D Test-Time Learners
*Muhammad Jehanzeb Mirza ,
*Inkyu Shin ,
*Wei Lin ,
Andreas Schriebl,
Kunyang Sun ,
Jaesung Choe ,
Horst Possegger ,
Mateusz Kozinski ,
In So Kweon ,
Kun-Jin Yoon ,
Horst Bischof
(*equal contribution)
ICCV , 2023
arxiv
/
code
/
video
Video Test-Time Adaptation for Action Recognition
*Wei Lin ,
*Muhammad Jehanzeb Mirza ,
Mateusz Kozinski ,
Horst Possegger ,
Hilde Kuehne ,
Horst Bischof
(*equal contribution)
CVPR , 2023
arxiv
/
🤗 Dataset
/
code
/
video
Test-time adaptation of video action recognition against common distribution shifts.
ActMAD: Activation Matching to Align Distributions for Test-Time-Training
Muhammad Jehanzeb Mirza ,
Pol Jané Soneira ,
Wei Lin ,
Mateusz Kozinski ,
Horst Possegger ,
Horst Bischof
CVPR , 2023
arxiv
/
code
/
video
Unsupervised Class-aware 3D Object Detection in LiDAR Point Clouds
Christian Fruhwirth-Reisinger ,
Wei Lin ,
Dusan Malic ,
David Schinagl ,
Georg Krispel ,
Horst Possegger ,
Horst Bischof
Arxiv, 2023
arxiv
/
code
/
video
CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video
Wei Lin ,
Anna Kukleva ,
Kunyang Sun ,
Horst Possegger ,
Hilde Kuehne ,
Horst Bischof
ECCV , 2022
paper
/
arxiv
/
code
/
video
Unsupervised image-to-video domain adaptation.
Extended Abstract CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video
Wei Lin ,
Anna Kukleva ,
Kunyang Sun ,
Horst Possegger ,
Hilde Kuehne ,
Horst Bischof
ECCV Workshop on Out-of-Distribution Generalization in Computer Vision , 2022
paper
/
code
/
video
AIR-DA: Adversarial Image Reconstruction for Unsupervised Domain Adaptive Object Detection
Kunyang Sun ,
Wei Lin ,
Haoqin Shi,
Zhengming Zhang ,
Yongming Huang ,
Horst Bischof
IEEE Robotics and Automation Letters (RA-L) 2023
paper
/
arxiv
/
code
/
video
TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering
Wei Lin ,
Anna Kukleva ,
Horst Possegger ,
Hilde Kuehne ,
Horst Bischof
Computer Vision Winter Workshop , 2023
arxiv
/
code
/
video
Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions
Stefan Leitner ,
Muhammad Jehanzeb Mirza ,
Wei Lin ,
Jakub Micorek ,
Marc Masana ,
Mateusz Kozinski ,
Horst Possegger ,
Horst Bischof
Intelligent Vehicle Conference , 2023
arxiv
/
code
/
video
Academic Service
Conference Reviewer : ECCV 2022, ISMAR 2023, CVPR 2023, NeurIPS 2023, WACV 2024, CVPR 2024, ECCV 2024, NeurIPS 2024, NeurIPS 2024 Datasets & Benchmarks Track, ICLR 2025, CVPR 2025
Journal Reviewer : TPAMI 2023, TNNLS 2023, IEEE Trans. Multimedia 2023, Pattern Recognition Letters 2024, Trans. Image Processing 2024
Teaching
Deep Learning and Neural Networks I - Exercise
Machine Learning: Supervised Techniques - Exercise
Deep Learning and Neural Networks II - Exercise
Machine Learning: Unsupervised Techniques - Exercise