Blog

35 min read
Sarvam's Illusion of Safety

An independent mechanistic and adversarial audit of Sarvam-30B and Sarvam-105B across 14 Indian languages. The models are 6x more likely to comply with harmful requests in Indian languages than in English.

Research · Interpretability · Safety
3 min read
Releasing Goedel-mHC-1B

Releasing the First Open 1B+ Language Model with Hyperconnections

Research · LLMs