askingforhelp1111 t1_j81hia1 wrote
Reply to comment by gingerbread42 in [D] Speed up HuggingFace Inference Pipeline by [deleted]
Thanks for the idea!
askingforhelp1111 t1_j81ggm0 wrote
Reply to comment by coolmlgirl in [D] Speed up HuggingFace Inference Pipeline by [deleted]
Sure, here are a couple of links. Both take 4-9 seconds per inference.
https://huggingface.co/poom-sci/WangchanBERTa-finetuned-sentiment
https://huggingface.co/ayameRushia/bert-base-indonesian-1.5G-sentiment-analysis-smsa
I call each checkpoint like this:

from transformers import pipeline

checkpoint = 'poom-sci/WangchanBERTa-finetuned-sentiment'  # or the other model id above
nlp = pipeline('sentiment-analysis',
               model=checkpoint,
               tokenizer=checkpoint)
Thank you!
askingforhelp1111 t1_j8cmbr6 wrote
Reply to comment by machineko in [D] Speed up HuggingFace Inference Pipeline by [deleted]
Many thanks for the reply; I'd love to read your resources on compression and inference.
I'm keen on cutting costs. We previously ran on a GPU via an AWS EC2 instance, but we have to tighten the company's belt this year, so my manager suggested running on CPU. I'd love to hear your suggestions too (if any).
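One common trick for cheaper CPU inference is PyTorch dynamic quantization, which swaps the model's Linear layers for int8 versions. Below is a minimal sketch on a toy module (the toy layer sizes are just illustrative); the same `torch.quantization.quantize_dynamic` call can be applied to a model loaded with `AutoModelForSequenceClassification.from_pretrained(checkpoint)` before handing it to `pipeline(...)` - results may vary by model, so benchmark speed and accuracy on your own data.

```python
import torch
import torch.nn as nn

# Toy stand-in for a transformer classifier head: dynamic quantization
# targets nn.Linear layers, which dominate transformer compute.
model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 2))
model.eval()

# Weights are quantized to int8 once; activations are quantized
# on the fly at inference time. CPU-only technique.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 768))
print(out.shape)
```

The usual trade-off is a small accuracy drop for a noticeably smaller and faster model on CPU, so it is worth validating on a held-out set before deploying.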