Comments


YaYaLeB t1_isb71l6 wrote

Very nice :) (Love your Vader)
I had some issues with their script and switched to the diffusers text-to-image example (https://github.com/huggingface/diffusers/tree/main/examples/text_to_image). I've adapted their script and did the same thing for Magic cards and One Piece characters, if that interests you: https://github.com/YaYaB/finetune-diffusion
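For reference, a model fine-tuned with that script loads like any other diffusers checkpoint. A minimal inference sketch (the repo id below is a placeholder, not an actual model):

```python
# Minimal sketch: load a fine-tuned text-to-image checkpoint with diffusers.
# "your-username/your-finetuned-model" is a placeholder repo id.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "your-username/your-finetuned-model",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a one piece character with green hair").images[0]
image.save("sample.png")
```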

2

OnlineGrab OP t1_isc9beb wrote

Thanks, wish I knew about that repository before!

Out of curiosity, did you have to pay to host your demos on Hugging Face? I looked around for some free options with GPUs but only found Google Colab, which isn't very convenient for Gradio apps.

2

YaYaLeB t1_iscmu6c wrote

Nope, you can host your demo without paying (for the moment, I suppose), but you'll be on a CPU (very slow inference). If you upload your model to Hugging Face, feel free to copy-paste the space and modify the text + the model path for your repo (https://huggingface.co/spaces/YaYaB/text-to-onepiece). It is merely a copy-paste of the one made by lambdalabs; I didn't get time to make something more personal ^^
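For anyone copying the space, the core of such a Gradio app is quite small. A sketch assuming a diffusers checkpoint (the model id is a placeholder), keeping in mind that CPU inference takes minutes per image:

```python
# Sketch of a minimal Gradio app for a free (CPU-only) Hugging Face Space.
# The model id is a placeholder; on CPU, expect minutes per image.
import gradio as gr
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("your-username/your-finetuned-model")

def generate(prompt: str):
    # Fewer denoising steps trades quality for speed on CPU
    return pipe(prompt, num_inference_steps=25).images[0]

demo = gr.Interface(fn=generate, inputs="text", outputs="image")
demo.launch()
```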

1

master3243 t1_isharj4 wrote

Impressive, how big is the dataset? Hugging Face says n<2K, which seems incredibly small.

Also, what is an individual sample point? A gundam image and its name?

2

OnlineGrab OP t1_ishhqxx wrote

Thanks! There are 1,565 images in the dataset. The original Pokémon project used an even smaller one (less than 1K images).

Each row is a gundam image + a text description. The original project used BLIP to auto-caption the images, but that didn't really work for this dataset, so instead I asked BLIP to only describe the colors and inserted them into a generic description: "A robot, humanoid, futuristic, <colors>". One could likely get better results with more fine-grained captions.
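To illustrate the idea, here is a sketch of color-only captioning using BLIP's conditional mode, where the model completes a text prefix. The exact prompt and checkpoint are assumptions, not necessarily what was used:

```python
# Sketch: ask BLIP only for colors, then splice them into a generic caption.
# The prompt prefix "the colors are" is an assumption about the approach.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("gundam.png").convert("RGB")

# Conditional captioning: BLIP continues the given text prefix
inputs = processor(image, "the colors are", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)

# BLIP echoes the prefix in its output, so strip it to keep only the colors
decoded = processor.decode(out[0], skip_special_tokens=True)
colors = decoded.removeprefix("the colors are").strip()

caption = f"A robot, humanoid, futuristic, {colors}"
print(caption)
```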

2