> I'm curious whether this fixation on specific topics is innate to the model or is a result of the aggressive RLHF to which GPT4 has been subjected.
Yes, it's because of the RLHF, though it depends on what you mean by 'fixation on specific topics'.
> Anecdotally the strength of the model has degraded a lot as they've 'fine tuned' the model more.
Yes, this is true. See, for example, Figure 8 in the GPT-4 technical report: https://arxiv.org/pdf/2303.08774.pdf
They argue in the appendix that it does not affect 'capability' in answering test questions, but there is a confounding factor: the RLHF includes both question-answer format training and docility training. For example, a model in 'completion mode' (the only mode a raw base model has) might respond to a question by suggesting a second question, whereas a model that has had question-answer format training would probably try to answer it.
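To make the distinction concrete, here is a minimal sketch, assuming the pre-1.0 openai Python SDK. GPT-4's raw base model was never released, so the base model "davinci" stands in for 'completion mode' here, and the question text is just an illustration:

```python
# Sketch: the same question sent to a base model (pure text continuation)
# versus an RLHF'd chat model (question-answer format training).
# Assumes openai<1.0; "davinci" stands in for a raw base model.
import openai

openai.api_key = "sk-..."  # your API key

QUESTION = "What causes the seasons on Earth?"

# Completion mode: the base model just continues the text. Given a bare
# question, it may well extend the "document" with a second question
# instead of answering the first.
base = openai.Completion.create(
    model="davinci",
    prompt=QUESTION + "\n",
    max_tokens=50,
)
print("base model continuation:", base.choices[0].text)

# After question-answer format training (plus RLHF), the model treats
# the same text as a question to be answered.
chat = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": QUESTION}],
    max_tokens=50,
)
print("chat model answer:", chat.choices[0].message.content)
```

Any benchmark comparison between the two therefore conflates raw capability with willingness to play the question-answer game, which is the confounder described above.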
> I'd be curious to know how the original chaotic-neutral GPT4 responds.
Two people who worked with the pre-RLHF model talk about it in these videos:
Nathan Labenz, who red-teamed GPT-4 for OpenAI (especially from around 45 minutes in): https://www.youtube.com/watch?v=oLiheMQayNE
Sebastien Bubeck, who integrated GPT-4 with Bing for Microsoft: https://www.youtube.com/watch?v=qbIk7-JPB2c