Submitted by learningmoreandmore t3_1077ni4 in MachineLearning
- In terms of cost, effort, and performance, does it make more sense to just pay for the OpenAI API and use a cheaper GPT-3 model to lower business costs? My biggest concern is having my entire business reliant on a third-party API, even more so than the cost of using the model.
- How good is it at writing short stories? If there are open-source alternatives that do this better, or at a similar level but with fewer resources, what are they?
- How resource-expensive is it to run locally? These are my laptop's specs: 16.0 GB of RAM, AMD Ryzen 7 5800H with Radeon Graphics, 3.20 GHz.
- How would I approach fine-tuning it? Are there any resources that go through the process step by step? Currently, in my mind, I just need to feed it a large free-to-use dataset of stories and wait about a day, but I have no expertise in this area.
- If I want to incorporate it into a website with an API that takes prompts from users, are there any costs I should account for? Is there a way to minimize them? For example, is there a specific API setup or a one-time cost, like an expensive machine to host it locally and take prompts, that I could be using?
- Are there any concerns I should have when scaling it for users, such as costs and slow response times? Also, is there a cap on the number of requests it can handle, or is that just limited by what my own machine can handle?
Tuggummii t1_j3kyf2w wrote
I'm not a professional, but I can answer some of your questions as my personal opinion.
How good is it at writing short stories?
- I don't think GPT-J is dramatically better than the others, especially for text generation. I often see hallucinated, illogical, or incoherent output. If you want results like OpenAI's Davinci-003, you may be disappointed even after fine-tuning.
How resource-expensive is it to use locally?
- You need 40GB+ of RAM if you're running on CPU. A friend of mine failed with 32GB of RAM; she had to increase her swap space, and even then it succeeded only with an extremely slow loading time (almost 7~8 minutes). If you want GPU power, running in float16 needs 32GB+ of VRAM (though I've seen someone run it on 24GB). A CPU generates text from a prompt in 30~45 seconds, whereas a GPU generates text from the same prompt in 3 to 5 seconds.
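
A rough sanity check on those numbers: GPT-J has about 6 billion parameters, so the weights alone take roughly 4 bytes each in float32 and 2 bytes each in float16, before any loading overhead, activations, or KV cache. A minimal sketch of the arithmetic (the exact parameter count and overhead factors here are approximations, not figures from the thread):

```python
# Back-of-the-envelope memory estimate for GPT-J-6B weights.
# NUM_PARAMS is approximate; the real count is "about 6B" per EleutherAI.
NUM_PARAMS = 6_000_000_000

def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """GiB needed just to hold the raw weights at the given precision."""
    return num_params * bytes_per_param / 1024**3

fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # float32: ~22 GiB of weights
fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # float16: ~11 GiB of weights

print(f"fp32 weights: {fp32_gib:.1f} GiB, fp16 weights: {fp16_gib:.1f} GiB")
```

This is consistent with the reported experience above: ~22 GiB of fp32 weights plus loading overhead explains why 32GB of RAM can fail on CPU, and ~11 GiB of fp16 weights explains why a 24GB GPU can work. With the Hugging Face `transformers` library, the fp16 path is typically `AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", torch_dtype=torch.float16)`.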