We present the first systematic framework for assessing cultural competence in Vision-Language Models (VLMs) through multimodal story generation, analyzing five contemporary VLMs with novel evaluation metrics.
@inproceedings{mukherjee2025cultural,
  title     = {Toward Socially Aware Vision-Language Models: Evaluating Cultural Competence Through Multimodal Story Generation},
  author    = {Mukherjee, Arka and Ghosh, Shreya},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
  year      = {2025},
  month     = oct,
}
We introduce mmJEE-Eval, a bilingual multimodal benchmark of 1,460 STEM problems, on which we evaluate 17 VLMs. Models detect 53% of errors but correct only 3.5%, exposing a metacognitive gap between open and closed models.
@article{mukherjee2025mmjee,
  title   = {mmJEE-Eval: A Bilingual Multimodal Benchmark for Evaluating Scientific Reasoning in Vision-Language Models},
  author  = {Mukherjee, Arka and Ghosh, Shreya},
  journal = {Findings of IJCNLP-AACL 2025},
  year    = {2025},
  month   = nov,
}