currentscurrents t1_jaetyg1 wrote on February 28, 2023 at 10:40 PM

Reply to comment by dancingnightly in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152

Can't the reward model be discarded at inference time? I thought it was only used for fine-tuning.