METR’s randomized trial found that experienced open-source developers took 19% longer to complete tasks when using AI tools, highlighting benchmarking limits, ...