@vaidehi_patil_
🚨Can Sensitive Information Be Deleted From LLMs? We show that extraction attacks recover 18-38% of "deleted" knowledge! Our attack+defense framework has whitebox+blackbox attacks. New defense objectives lower attacks to 2%! https://t.co/xluew4BBoS @peterbhase @mohitban47 🧵 https://t.co/0YPxeuTuEE