@s_batzoglou
I tried GPT o1 in three simple Hilbert-style logical derivations. It failed in two of them, with one failure after repeated attempts. Here is an example of a failed attempt to show (not phi => phi) => phi. Step 7 is wrong, as can be easily observed by doing the substitutinos that o1 claims. And a failed attempt to show (not not phi => phi), where Step 5 is obviously wrong. Here is the prompt I tried. If someone has o1 pro, let me know if it succeeds. @DeryaTR_ @OpenAI