Subhadip Mitra
Interpretability
an archive of posts in this category
Dec 20, 2025
I Trained Probes to Catch AI Models Sandbagging