Gerasimos Chatzoudis
PhD · Mechanistic Interpretability for Vision
Open for Work · Summer 2026

Gerasimos Chatzoudis.

I'm a PhD Candidate at Rutgers, advised by Professor Dimitris N. Metaxas. I work on Mechanistic Interpretability: interpreting and controlling black-box foundation Vision-Language Models.

Before Rutgers I was an ML Scientist at the Athena Research Center in Athens, and earned my BSc/MEng in Electrical & Computer Engineering (ECE) from the National Technical University of Athens. I've spent the last two summers as an Applied Scientist Intern (L5) at Amazon, working on GraphRAG for reasoning and on LLM-based knowledge graphs for customer understanding.

Graduating Summer 2026, looking for full-time research or applied scientist roles.

Institution
Rutgers University · CBIM
Advisor
Prof. Dimitris N. Metaxas, Board of Governors & Distinguished Professor, Rutgers CS
Based in
New Brunswick, NJ
§ 01  · Experience

Experience.

  1. Rutgers University
    PhD Candidate · Computer Science
    2022 to present
  2. Amazon
    Applied Scientist Intern, L5 · San Diego · LLM-based behavioral knowledge graphs for customer understanding
    Summer 2025
  3. Amazon
    Applied Scientist Intern, L5 · AWS Santa Clara · GraphRAG
    Summer 2024
  4. Athena Research Center
    Machine Learning Scientist · Cross-lingual aphasia detection
    2020 to 2022
  5. Samsung (Innoetics)
    Machine Learning Intern · Text-to-Speech prosody
    2019
  6. National Technical University of Athens
    BSc + MEng · Electrical & Computer Engineering
    2014 to 2020
§ 02  · Publications

Papers, to date.

5 entries
2022 to 2026
[1]
'26
Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision.
We bring Cross-Layer Transcoders to vision for the first time, decomposing a Vision Transformer's computation into an additive sum of sparse, cross-layer contributions and replacing each MLP block with an interpretable proxy that supports faithful layer-wise attribution.
Chatzoudis, G., Polyzos, K.D., Li, Z., Gu, D., Moran, G.E., Wang, H., Metaxas, D.N.
CVPR 2026
XAI4CV workshop
[2]
'25
Visual Sparse Steering (VS²): Unsupervised Adaptation for Image Classification using Sparsity-Guided Steering Vectors.
We turn inherently interpretable sparse autoencoder features into effective, reliable steering vectors that adapt vision-language classifiers at inference time, and surface the alignment gap that has kept SAEs mostly confined to pure interpretability.
Chatzoudis, G., Li, Z., Moran, G.E., Wang, H., Metaxas, D.N.
arXiv 2025
under review
[3]
'25
LUCID-SAE: Learning Unified Vision-Language Sparse Codes for Interpretable Concept Discovery.
We learn a shared sparse codebook across vision and language, surfacing concepts that align across both modalities.
Gu, D., Gao, Y., Chatzoudis, G., Dong, Z., Zhang, G., Guo, B., Zhou, Y., Zhou, M., Metaxas, D.N.
arXiv 2025
preprint
[4]
'23
A Web-Based Application for Eliciting Narrative Discourse from Greek-Speaking People with Language Impairments.
We design and validate a web-based application for collecting narrative discourse from Greek-speaking adults, including people with aphasia, and show in a pilot study that remote and in-person elicitation yield comparable linguistic measures.
Stamouli, S., …, Chatzoudis, G., et al.
Frontiers in Communication · 2023
[5]
'22
Zero-Shot Cross-Lingual Aphasia Detection using Automatic Speech Recognition.
We build an end-to-end pipeline for zero-shot aphasia detection that transfers from high-resource to low-resource languages, where manually annotated transcripts are scarce.
Chatzoudis, G., Plitsis, M., Stamouli, S., Dimou, A.-L., Katsamanis, A., Katsouros, V.
INTERSPEECH 2022
§ 03  · Contact
Get in touch.