On Sep 12, OpenAI released o1-preview. See "Introducing OpenAI o1-preview." According to the site,
In our tests, the next model update performs similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology. We also found that it excels in math and coding. In a qualifying exam for the International Mathematics Olympiad (IMO), GPT-4o correctly solved only 13% of problems, while the reasoning model scored 83%. Their coding abilities were evaluated in contests and reached the 89th percentile in Codeforces competitions.
ChatGPT Plus and Team users will be able to access o1 models in ChatGPT starting today.
This is an early release. I have not yet tried it myself, so I don't know how well it performs, though here is a review. It looks like it can solve problems from Jackson, and it should be getting better quickly.
Doing your homework for you is the opposite of what we are all about. I don't know how well it does at explaining concepts. If it does well with concepts, what do we think about it?
Our criticism has been that it just uses words without understanding their meaning. It sounds like this criticism may no longer apply.
So, should we reconsider our policy that prohibits the use of ChatGPT?
Should we set a criterion for when we would consider it good enough? What might that criterion be?