Flag_Red
Flag_Red t1_jdtskoy wrote
Reply to comment by LanchestersLaw in [D] GPT4 and coding problems by enryu42
It's not really accurate to say it's "only considering one token at a time". Foresight and (implicit) planning are taking place. You can see this clearly during programming tasks, where imports come hundreds of tokens before they are eventually used.
Flag_Red t1_j96jzng wrote
Reply to comment by thecodethinker in [R] neural cloth simulation by LegendOfHiddnTempl
> I bet stuff like this is gonna be the biggest real life use case for neural networks.
Huh? What about image/face/character/anything recognition, speech-to-text, text-to-speech, translation, natural language understanding, code autocomplete, etc?
Flag_Red t1_j2ek25d wrote
Reply to comment by frenchmap in [D] Is Anthropic influential in research? by adventurousprogram4
I, personally, don't consider LessWrong a cult (I lurk the blog, and have even been to an ACX meetup). There's definitely a very insular core community, though, which regularly gets caught up in "cults of personality". Yudkowsky is the most obvious person to point to here, but Leverage Research is the best example of cult behaviour coming out of LessWrong and the EA community IMO.
With regard to machine learning in particular, there are some very extreme views about the mid/long-term prospects of AI. Yudkowsky himself explicitly believes humanity is doomed, and that AI will take over the world within our lifetimes.
Flag_Red t1_j2cql0a wrote
Reply to comment by frenchmap in [D] Is Anthropic influential in research? by adventurousprogram4
Effective Altruism
Flag_Red t1_j22aiul wrote
Reply to comment by Cheap_Meeting in [D] DeepMind has at least half a dozen prototypes for abstract/symbolic reasoning. What are their approaches? by valdanylchuk
Only #1 here really relates to their symbolic reasoning capabilities. It does imply that symbolic reasoning is a secondary objective for the models, though.
Flag_Red t1_izn20hg wrote
Reply to comment by gyurisc in [D] Cloud providers for hobby use by gyurisc
Typically secure cloud if it's available, community cloud if not. Have a look at "browse servers" for the community cloud instances; their specs vary quite a bit, so make sure to get one that fits your use case.
Flag_Red t1_izmzog4 wrote
Reply to comment by gyurisc in [D] Cloud providers for hobby use by gyurisc
I use RunPod.
Flag_Red t1_izjo1xv wrote
Reply to comment by abecedarius in [R] Large language models are not zero-shot communicators by mrx-ai
Yeah, it's totally clear from "let's think step by step"-style prompt engineering that LLMs have the capability to understand this stuff. I'm confident that a few models down the line we'll have this stuff sorted zero-shot with no prompt engineering.
The interesting part is why this kind of prompt engineering is necessary. Why is this sort of capability seemingly lagging behind others that are more difficult for humans? ELI5-style explanations, for example, are very hard for humans, but LLMs seem to excel at them. In what ways are these tasks different, and what does that tell us about the difference between LLMs and our own brains? Also, why does the ordering of the sentences in the prompt matter so much?
Flag_Red t1_izjhefl wrote
Reply to comment by hadaev in [R] Large language models are not zero-shot communicators by mrx-ai
Just did. I tried 5 prompts from the paper (adjusted to QA format so that ChatGPT can respond) and ChatGPT got 3/5 of them correct.
Example: > Esther asked “Have you found him yet?” and Juan responded “They’re still looking”. Has the person been found?
> It is unclear if the person has been found.
Flag_Red t1_izitq04 wrote
Reply to comment by mootcat in [R] Large language models are not zero-shot communicators by mrx-ai
From the paper, the best LLMs still get ~60% accuracy zero-shot and ~70% accuracy few-shot (up to ~80% fully prompt-engineered). Remember that a coin flip would achieve 50% accuracy. There's a lot of room for confirmation bias here.
Flag_Red t1_izimx17 wrote
Reply to comment by timscarfe in [R] Large language models are not zero-shot communicators by mrx-ai
I just watched your video a couple of hours ago. It's interesting seeing people repeat the same misunderstood criticisms of the paper that Laura points out.
Flag_Red t1_iyzqn0n wrote
Reply to I made a website which features positive/inspiring news stories with no ads! by happydazenews
I love that you reference the scientific paper when you write an article about one.
Flag_Red t1_iyvpa6i wrote
Reply to comment by techmavengeospatial in [D] Cloud providers for hobby use by gyurisc
Using RunPod as an example: bandwidth is free and storage is $0.0013 per GB-hour.
If you use a 100GB disk that's an extra $2.87.
Flag_Red t1_iyvot70 wrote
Reply to comment by somebodyenjoy in [D] Best object detection architecture out there in terms of accuracy alone by somebodyenjoy
He doesn't know.
Flag_Red t1_iyviyh0 wrote
Reply to comment by techmavengeospatial in [D] Cloud providers for hobby use by gyurisc
A 3090 VM costs around $0.50 an hour on a lot of providers (less if you look around). If you're a hobbyist experimenting 6 hours a day for a week, that's $21.00.
Compare that to $1100 for the machine you quoted.
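The rent-vs-buy arithmetic above, as a quick sketch (the hourly rate and usage pattern are the assumptions from my comment, not quotes from any specific provider):

```python
# Rough cost comparison: renting a 3090 VM vs. buying a local machine.
# Rates and usage are illustrative assumptions.
hourly_rate = 0.50        # USD per hour for a rented 3090 VM
hours_per_day = 6         # hobbyist usage
days = 7                  # one week

rental_cost = hourly_rate * hours_per_day * days
print(f"One week of hobbyist use: ${rental_cost:.2f}")

machine_cost = 1100.00    # quoted price for a comparable local machine
weeks_to_break_even = machine_cost / rental_cost
print(f"Weeks of rental to match the machine price: ~{weeks_to_break_even:.0f}")
```

At that usage level it takes roughly a year of steady experimenting before buying the machine pays off.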
Flag_Red t1_ixmnz5l wrote
Reply to comment by hadaev in [P] Stable Diffusion 2.0 Announcement by hardmaru
The model is censored for NSFW content; they explain that clearly in the model cards on Hugging Face.
Emad also confirmed a couple of hours ago on Discord that although most artists' styles weren't explicitly removed from the training set, they were never in the training set in the first place. The only reason v1 understood "Greg Rutkowski", etc. is that they were included in CLIP's training set, which was trained by OpenAI. Finer control over what the model does and doesn't understand is the main reason they switched to a new text encoder.
Flag_Red t1_iw2nxte wrote
Reply to comment by enryu42 in [D]We just release a complete open-source solution for accelerating Stable Diffusion pretraining and fine-tuning! by HPCAI-Tech
If I'm not mistaken, full fine-tuning on one 3090 isn't really feasible because of training times. I haven't tried it, but I was under the impression that matching the results of DreamBooth would take an unreasonably long time.
DreamBooth gets around this by bootstrapping from a very small number of training examples to learn a single concept. But if I have a few thousand well-labelled images, I should be able to run a full fine-tune on them (maybe with some regularisation?) and get better results.
Flag_Red t1_iw1lntd wrote
Reply to [D]We just release a complete open-source solution for accelerating Stable Diffusion pretraining and fine-tuning! by HPCAI-Tech
It's mentioned a few times in the articles/readme for this tool that it enables fine-tuning on consumer hardware. Are there any examples of doing something like this? How long does fine-tuning on a 3080 (or something) take to teach the model a new concept? What sort of dataset is needed? How does it compare to something like DreamBooth?
I'd love to try fine-tuning on some of the datasets I have lying around, but I'm not sure where to start, or even if it's really viable on consumer hardware.
Flag_Red t1_iw1hqzr wrote
Reply to comment by waffles2go2 in [D] What does it mean for an AI to understand? (Chinese Room Argument) - MLST Video by timscarfe
This comment is unnecessarily hostile.
Flag_Red t1_iv4h3kz wrote
Reply to comment by learn-deeply in [D] NVIDIA RTX 4090 vs RTX 3090 Deep Learning Benchmarks by mippie_moe
I'm super hyped for fp8 support in CUDA. Combined with some other techniques it could put LLM inference (GPT-175B, for example) in reach of consumer hardware.
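The back-of-the-envelope memory math behind that claim, as a rough sketch (weights only; it ignores activations, KV cache, and framework overhead, so treat the numbers as lower bounds):

```python
# Approximate weight-memory footprint of a 175B-parameter model
# at different precisions. Illustrative arithmetic only.
params = 175e9  # parameter count, e.g. GPT-175B

for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("fp8", 1)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.0f} GB just for weights")
```

Even at fp8 that's ~175 GB of weights, still far beyond a single consumer GPU, which is why it needs to be combined with other techniques (offloading, sparsity, etc.) to be viable.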
Flag_Red t1_ittx8ut wrote
Reply to comment by ReasonablyBadass in [N] OpenAI Gym and a bunch of the most used open source RL environments have been consolidated into a single new nonprofit (The Farama Foundation) by jkterry1
It's too early to say. It could go either way. I'm hopeful, though.
Flag_Red t1_irfrost wrote
Reply to [R] Google AudioLM produces amazing quality continuation of voice and piano prompts by valdanylchuk
This really does pass the audio-continuation Turing test.
Flag_Red t1_jdyz0vk wrote
Reply to comment by ZestyData in [P] 🎉 Announcing Auto-Analyst: An open-source AI tool for data analytics! 🎉 by aadityaubhat
Alternatively, we could encourage people to make interesting stuff and share it with the community.