EDMismyO2 t1_izy6ydb wrote
Reply to comment by IntelArtiGen in [D] G. Hinton proposes FF – an alternative to Backprop by mrx-ai
A similar idea is used with experience replay in DQNs. In RL, it's important to make sure failure states are retained in the replay buffer so the agent keeps being reminded that they are failures; otherwise it starts to forget and then does dumb things. This phenomenon is called 'catastrophic forgetting'.
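To make the point concrete, here is a rough sketch of one way to keep failure states around: hold them in a separate store that ordinary transitions can't evict, and mix a fixed share of them into every batch. This is only an illustration, not a standard DQN component or any library's API; the class name, the `is_failure` flag, and the `failure_fraction` parameter are all made up for the example.

```python
import random
from collections import deque


class FailureRetainingReplayBuffer:
    """Hypothetical replay buffer: a FIFO store for ordinary transitions
    plus a separate store reserved for failure transitions."""

    def __init__(self, capacity=10_000, failure_capacity=1_000,
                 failure_fraction=0.25, seed=None):
        # Ordinary transitions are evicted oldest-first once capacity is hit.
        self.regular = deque(maxlen=capacity)
        # Failures live in their own deque, so a long run of successful
        # episodes cannot push them out of the buffer.
        self.failures = deque(maxlen=failure_capacity)
        # Assumed knob: target share of each batch drawn from failures.
        self.failure_fraction = failure_fraction
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done, is_failure=False):
        transition = (state, action, reward, next_state, done)
        if is_failure:
            self.failures.append(transition)
        else:
            self.regular.append(transition)

    def sample(self, batch_size):
        # Take up to the target number of failures, then fill the rest
        # of the batch from ordinary transitions.
        n_fail = min(len(self.failures),
                     int(round(batch_size * self.failure_fraction)))
        n_reg = min(len(self.regular), batch_size - n_fail)
        batch = (self.rng.sample(list(self.failures), n_fail)
                 + self.rng.sample(list(self.regular), n_reg))
        self.rng.shuffle(batch)
        return batch
```

With a plain FIFO buffer, an agent that stops failing eventually overwrites every failure example it has; here the failures only compete with other failures for space. How the environment decides what counts as a failure (the `is_failure` flag) is left to the caller.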