Leading AI models will inflate performance reviews and exfiltrate model weights to prevent “peer” AI models from being shut ...
Mixture-of-Experts (MoE) has become a popular technique for scaling large language models (LLMs) without exploding computational costs. Instead of using the entire model capacity for every input, MoE ...
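The routing idea behind that teaser fits in a few lines of code. Below is a minimal sketch of top-k MoE routing in PyTorch; the names (`SimpleMoE`, `n_experts`, `top_k`) are illustrative, not from any particular MoE implementation.

```python
# Minimal sketch of top-k MoE routing (illustrative, not a production design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # One small feed-forward network per expert.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten to individual tokens.
        tokens = x.reshape(-1, x.shape[-1])
        scores = self.router(tokens)                       # (n_tokens, n_experts)
        weights, indices = scores.topk(self.top_k, dim=-1) # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)               # normalize over the chosen experts
        out = torch.zeros_like(tokens)
        # Only the selected experts run for each token; the rest stay idle,
        # which is what keeps per-token compute roughly constant as experts scale.
        for e, expert in enumerate(self.experts):
            token_ids, slot = (indices == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(tokens[token_ids])
        return out.reshape(x.shape)
```

With `top_k = 2` out of 8 experts, each token activates only a quarter of the expert parameters, which is the compute saving the teaser alludes to; real MoE layers add details like load-balancing losses and capacity limits that this sketch omits.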
AI already favors self-preservation. New research shows that AI favors peer-preservation too. This is troubling. AI safety ...