@emollick
Multimodal vision continues to be the most difficult AI ability to get a strong intuition for. The models can do incredible things like recognize places from subtle clues or read emotion & attitudes, but also miss stuff, like the fact that this image is upsettingly distorted. https://t.co/VhpUoRd0Uw