Emergency_Apricot_77 t1_jdgbocg wrote on March 24, 2023 at 4:39 AM

Reply to comment by whyelrond in [N] ChatGPT plugins by Singularian2501

Care to explain more on symbolic approaches via Wolfram?

Emergency_Apricot_77 t1_jah9rb7 wrote on March 1, 2023 at 12:51 PM

Reply to comment by Kaleidophon in [D] backprop through beam sampling ? by SaltyStackSmasher

Why go with BLEU though? OP didn't specifically mention optimizing sequence-level metrics. Can't we still use cross-entropy? Something like the following:

Sample first token, calculate cross-entropy with first token of gold

Sample second token, calculate cross-entropy with second token of gold

Sample third token, calculate cross-entropy with third token of gold

... and so on?

This way we still have a differentiable loss, but with much better alignment between the training and inference scenarios (as opposed to the current setup of teacher-forced training and sampled inference), which I thought was what the OP was going for.
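
A rough sketch of the loop described above, in plain Python so it runs without any framework. `toy_logits` is a made-up stand-in for a real model, and `sampled_prefix_cross_entropy` and the `bos` token id are names invented here for illustration. Note that only the per-step cross-entropy is differentiable with respect to the logits; the sampling step itself is not, so the sampled prefix would have to be treated as a fixed input.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(prefix, vocab_size):
    # Hypothetical stand-in for a model: deterministic logits that
    # depend only on the tokens generated so far.
    seed = sum((i + 1) * t for i, t in enumerate(prefix))
    return [math.sin(seed + k) for k in range(vocab_size)]


def sampled_prefix_cross_entropy(gold, vocab_size, rng, bos=0):
    prefix = [bos]
    losses = []
    for gold_token in gold:
        probs = softmax(toy_logits(prefix, vocab_size))
        # Cross-entropy of this step's distribution against the gold token.
        losses.append(-math.log(probs[gold_token]))
        # Feed the model's own sample forward, not the gold token
        # (this is the difference from teacher forcing).
        sampled = rng.choices(range(vocab_size), weights=probs, k=1)[0]
        prefix.append(sampled)
    return sum(losses) / len(losses), prefix[1:]


rng = random.Random(0)
loss, sampled = sampled_prefix_cross_entropy([2, 1, 3], vocab_size=5, rng=rng)
print(loss, sampled)
```

One caveat with this setup: once the sampled prefix drifts away from the gold sequence, the position-by-position gold token may no longer be a sensible target for that prefix.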

Emergency_Apricot_77 t1_j9b68si wrote on February 20, 2023 at 5:21 PM

Reply to comment by Rockingtits in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics

They literally asked for LARGE language models

Emergency_Apricot_77 OP t1_j0fe4lo wrote on December 16, 2022 at 6:01 AM

Reply to comment by prototypist in [D] Is "natural" text always maximally likely according to language models ? by Emergency_Apricot_77

Thanks for this! The typical decoding paper contains really useful information, similar to what I was looking for.

Emergency_Apricot_77 OP t1_j0c3cii wrote on December 15, 2022 at 3:53 PM

Reply to comment by dojoteef in [D] Is "natural" text always maximally likely according to language models ? by Emergency_Apricot_77

This is VERY similar to what I was looking for. Thanks a LOT for this.

Emergency_Apricot_77 t1_iqurybv wrote on October 3, 2022 at 6:36 AM

Reply to comment by Lone-Pine in [D] Types of Machine Learning Papers by Lost-Parfait568

Who?