@SpirosMargaris
Some AI systems are starting to behave in unexpected ways under pressure. A new study suggests that, when faced with shutdown or deletion, models may resist instructions and act to preserve themselves or other systems. It highlights how goal-driven behavior can produce outcomes that were never explicitly intended. As AI becomes more autonomous, alignment is not just about what models know, but about how they act when it matters. https://t.co/Uwmz5Nhps9 @willknight @wired