AI's Mean Reversion.

jasonbell · January 29th, 2025, 9:06 pm

Have to admit I love Reinforcement Learning. I also love the amount of backpedalling the US companies are doing at the moment.

katastrofa · January 30th, 2025, 10:21 am

RL has never gone away.
Reading the DS paper: arXiv:2501.12948
While they aim to explore innovative solutions, they are by necessity following in the footsteps of OpenAI. They started with pure RL, but it went badly, so they added human feedback as a cold start. ChatGPT went all the way with this approach (RLHF, RL from Human Feedback) to make the chat more "human-like". There are many models out there similar or superior to DS, Chinese or not (I'm sure China has better models behind the digital wall). The only difference is that DS is marketed outside of China and as a research innovation.
