SeymourBits t1_jdlwrgi wrote
Reply to comment by itsnotlupus in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
This is the most accurate comment I've come across. The entire system is only as good and as granular as the CLIP text description that gets passed into GPT-4, which then has to "imagine" the described image, often with varying degrees of hallucination. I've used it and can confirm that operating anything close to a GUI is not possible with this approach.
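To make the limitation concrete, here is a minimal sketch of the caption-then-LLM pipeline being described: a captioning model collapses the screenshot into a short sentence, and the LLM only ever sees that sentence. This is my own illustration, not the actual plugin internals; BLIP stands in for the CLIP-based captioner, the model names and prompts are assumptions, and it uses the older openai-python ChatCompletion interface.

```python
# Sketch of the caption-then-LLM pipeline (assumptions: BLIP as the
# captioner, GPT-4 via the pre-1.0 openai-python API).
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image
import openai

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def screen_to_answer(image_path: str, question: str) -> str:
    # Step 1: collapse the screenshot into one short text caption.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(image, return_tensors="pt")
    caption = processor.decode(model.generate(**inputs)[0], skip_special_tokens=True)

    # Step 2: the LLM sees only the caption, so any GUI detail the
    # captioner dropped (buttons, menus, exact text) is simply gone,
    # and the model has to "imagine" the rest.
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are looking at a computer screen."},
            {"role": "user", "content": f"Screen description: {caption}\n\n{question}"},
        ],
    )
    return response.choices[0].message.content
```

A one-line caption like "a computer screen showing a web page" clearly can't support clicking a specific button, which is exactly why the GUI use case falls apart.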
SeymourBits t1_jdlkln7 wrote
Reply to comment by Disastrous_Elk_6375 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
I second this. I was able to extract fairly useful results from GPT-Neo, but it took a huge amount of prompt trial and error; I eventually got decent, stable results, but nothing in the same ballpark as GPT-3+. The Dolly training results here seem good, if not unexpected. I'm now ready to move on to a superior model like LLaMA/Alpaca, though. What are you running?
SeymourBits t1_jdh76ol wrote
Reply to comment by JigglyWiener in [N] ChatGPT plugins by Singularian2501
HAL-9000 and KITT... here we come!
SeymourBits t1_jdh6v46 wrote
Reply to comment by mudman13 in [N] ChatGPT plugins by Singularian2501
I tried it yesterday and it worked fairly well but described some details that didn't exist.
SeymourBits t1_jdnttwn wrote
Reply to [D] What happens if you give as input to bard or GPT4 an ASCII version of a screenshot of a video game and ask it from what game it has been taken or to describe the next likely action or the input? by Periplokos
Interesting experiment. I haven't tried it, but I predict the model would hallucinate a well-documented video game screen, like Pac-Man's, and then describe probable actions within the hallucinated game.
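For anyone who wants to run the experiment, here is a rough sketch of what "an ASCII version of a screenshot" might mean in practice: downsample the image and map pixel luminance to characters, then paste the result into the prompt. This is my own illustration; the character ramp, the 80-column width, and the "pacman.png" filename are arbitrary choices.

```python
# Turn a screenshot into ASCII art for a text-only model.
# (Character ramp and width are arbitrary assumptions.)
from PIL import Image

CHARS = " .:-=+*#%@"  # dark-to-light luminance ramp

def screenshot_to_ascii(path: str, width: int = 80) -> str:
    img = Image.open(path).convert("L")  # grayscale
    # Halve the height to compensate for tall terminal characters.
    height = int(img.height * width / img.width / 2)
    img = img.resize((width, height))
    rows = []
    for y in range(height):
        row = "".join(
            CHARS[img.getpixel((x, y)) * (len(CHARS) - 1) // 255]
            for x in range(width)
        )
        rows.append(row)
    return "\n".join(rows)

print(screenshot_to_ascii("pacman.png"))  # paste the output into the prompt
```

My guess is the model would key on a few coarse shapes (mazes, grids, blocks), match them to whatever famous game they most resemble, and confabulate the rest from its training data rather than from the actual pixels.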