MohamedRashad t1_j2e0647 wrote
Reply to comment by Austin_Nguyen_2k in [P] I built a web app tool to paraphrase, grammar check, and summarize text with GPT-3. by Austin_Nguyen_2k
Are you not adding any prompts to the input?
MohamedRashad t1_j2dn8ey wrote
Reply to [P] I built a web app tool to paraphrase, grammar check, and summarize text with GPT-3. by Austin_Nguyen_2k
Do you provide an API?
MohamedRashad t1_iyo5as9 wrote
Reply to [D] PyTorch 2.0 Announcement by joshadel
Nothing about edge hardware support (those functions have been in beta for quite some time now).
MohamedRashad t1_itnxalb wrote
Reply to comment by computing_professor in [N] First RTX 4090 ML benchmarks by killver
The bigger VRAM pool (2x 3090) is a better deal in my opinion, and you get to distribute your training and run more experiments.
MohamedRashad t1_is6htuk wrote
The rule of thumb in ML is that if you can solve your problem algorithmically, with correct equations and solid math, you shouldn't reach for estimators (don't be lazy).
MohamedRashad t1_is68bbw wrote
Reply to [N] First RTX 4090 ML benchmarks by killver
The RTX 3090 is being sold now for as low as $1,000 ... I think it will be the best option for a lot of researchers here.
MohamedRashad OP t1_is2bjp2 wrote
Reply to comment by SnowyNW in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
This is my core question actually and it's a very hard one.
MohamedRashad OP t1_is2b60w wrote
Reply to comment by aiccount in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
Those are some interesting results.
MohamedRashad OP t1_irws3jq wrote
Reply to comment by BaconRaven in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
This would be a nice project to build.
MohamedRashad OP t1_irwnbma wrote
Reply to comment by adam_jc in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
This is amazing (there are also other projects built on the same idea).
Thanks a lot
MohamedRashad OP t1_irwcx4x wrote
Reply to comment by milleniumsentry in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
This is the closest thing to what I want.
Thanks
MohamedRashad OP t1_irw4soy wrote
Reply to comment by _Arsenie_Boca_ in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
This is actually the first idea that came to me when thinking about this problem: backpropagating from the output image until I reach the text representation that produced it, then using a distance function to find the closest words to that representation.
My biggest problem with this idea was the variable length of the input words. With no limit on the number of words that can describe the image, the search space for the best words becomes huge.
What are your thoughts about this (I would love to hear them)?
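A toy sketch of that "optimize a soft prompt, then snap to the nearest word" idea. Everything here is a made-up stand-in: the 2-D vocabulary embeddings and the target vector are hypothetical, whereas a real system would optimize through CLIP or a diffusion model in thousands of dimensions.

```python
import math

# Hypothetical 2-D word embeddings (a real model has thousands of dims).
vocab = {
    "cat":    (1.0, 0.0),
    "dog":    (0.0, 1.0),
    "sunset": (0.7, 0.7),
}

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Pretend this is the text representation recovered by backpropagating
# from the output image (here just a fixed target point).
target = (0.9, 0.1)
soft = [0.0, 0.0]  # randomly initialized continuous "soft prompt"

# Plain gradient descent on the squared distance to the target.
for _ in range(200):
    grad = [2 * (s - t) for s, t in zip(soft, target)]
    soft = [s - 0.1 * g for s, g in zip(soft, grad)]

# Project the optimized vector onto the nearest discrete word.
word = min(vocab, key=lambda w: dist(vocab[w], soft))
print(word)  # prints "cat" -- the embedding closest to the soft prompt
```

The variable-length problem mentioned above is exactly what this toy hides: with one word slot the projection is trivial, but with an unbounded number of slots the discrete search space explodes.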
MohamedRashad OP t1_irvovxd wrote
Reply to comment by Blutorangensaft in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
Maybe you are right (maybe I am overthinking the problem). I will give image captioning another try and see if it works.
MohamedRashad OP t1_irvolp8 wrote
Reply to comment by HoLeeFaak in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
I thought about self-supervision for this task: feed the image whose prompt I want into an image-to-text model, then feed the resulting text into a diffusion model (DALL-E, Stable Diffusion) whose weights are frozen so they don't change.
The output image is compared to the original image, and the loss is backpropagated to the image-to-text model so it learns. The problems with this approach (in my humble opinion) are two:
- Training such a system won't be easy, and I will need a lot of resources I currently don't have.
- Even if I succeed, the resulting model won't generalize well enough.
This is all assuming, of course, that I manage to overcome the non-differentiable parts.
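A minimal numerical sketch of that frozen-decoder loop. Every piece here is a made-up stand-in: a one-weight "captioner", a linear "decoder" in place of the diffusion model, and a continuous code instead of discrete text (which sidesteps the non-differentiable step entirely).

```python
def frozen_decoder(code):
    # Stands in for the frozen diffusion model: code -> "image".
    # Its parameters (3.0, 1.0) never change during training.
    return 3.0 * code + 1.0

def captioner(image, w):
    # Stands in for the trainable image-to-text model: image -> code.
    return w * image

image = 10.0  # the "image" whose prompt we want to recover
w = 0.0       # the captioner's single trainable weight

for _ in range(500):
    code = captioner(image, w)
    recon = frozen_decoder(code)
    # d(loss)/dw via the chain rule *through* the frozen decoder:
    # loss = (recon - image)^2 and recon = 3*w*image + 1
    grad = 2 * (recon - image) * 3.0 * image
    w -= 1e-4 * grad  # only the captioner's weight is updated

print(round(frozen_decoder(captioner(image, w)), 3))  # prints 10.0
```

The gradient flows through the frozen decoder without updating it, which is the whole point of the scheme; the hard part in reality is that text tokens are discrete, so this chain rule breaks at the text step.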
MohamedRashad OP t1_irvng87 wrote
Reply to comment by Blutorangensaft in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
The prompt of Stable Diffusion (for example) is the text that will produce the image I want.
The text I get from an image captioning model doesn't have to be the correct prompt to get the same image back from Stable Diffusion (I hope I am explaining my thinking right).
MohamedRashad OP t1_irvmz2q wrote
Reply to comment by KingsmanVince in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
But in this case, I would need to train an image captioning model on text-to-image data and hope that it provides the correct prompt to recreate the image with the text-to-image model.
I think a better solution is to use backpropagation through text-to-image models to recover the prompt that made the image (an inverse operation of sorts).
MohamedRashad OP t1_irvlt15 wrote
Reply to comment by KingsmanVince in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
Image captioning doesn't have to produce the prompt that made the image.
MohamedRashad OP t1_irvj29w wrote
Reply to comment by ReasonablyBadass in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
I thought about image captioning when I started my search, but what I kept finding were models that summarize the image rather than give me the correct prompts to recreate it.
Submitted by MohamedRashad t3_y14lvd in MachineLearning
MohamedRashad t1_je12hzd wrote
Reply to [R] Build and personalize LLMs on your own data - Take back control with xTuring! by x_ml
Where is the model saved after fine-tuning in the example in the README?