[D] Large Language Models feasible to run on 32 GB RAM / 8 GB VRAM / 24 GB VRAM — submitted by head_robotics on February 20, 2023 at 9:33 AM in MachineLearning (51 comments, 220 points)
halixness wrote on February 21, 2023 at 7:22 AM: So far I have tried BLOOM over Petals (a system for distributed LLM inference); generation took around 30 s per prompt on an 8 GB VRAM GPU. Not bad! (1 point)
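For anyone curious what that setup involves, below is a minimal sketch of a Petals client modeled on the Petals README from around that time. The `bigscience/bloom-petals` checkpoint name and the `DistributedBloomForCausalLM` class are taken from that README and are assumptions here; later Petals versions may expose different names.

```python
# Minimal Petals client sketch (assumptions noted above): only the
# embeddings run locally, while the transformer blocks are served by
# remote peers in the swarm, which is why BLOOM fits on an 8 GB GPU.
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"  # public swarm serving BLOOM

tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME).cuda()

# Tokenize a prompt and generate; each step makes a round trip through
# the swarm, hence latencies on the order of tens of seconds per prompt.
inputs = tokenizer("A quick test prompt:", return_tensors="pt")["input_ids"].cuda()
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

The round trips through remote peers are what dominate the ~30 s latency the comment reports, rather than local compute.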