csreid t1_j91llzp wrote
Reply to comment by Optimal-Asshole in [D] Please stop by [deleted]
>Be the change you want to see in the subreddit.
The change I want to see is just enforcing the rules about beginner questions. I can't do that bc I'm not a mod.
csreid t1_j8p5z30 wrote
Reply to comment by farmingvillein in [R] RWKV-4 14B release (and ChatRWKV) - a surprisingly strong RNN Language Model by bo_peng
But they theoretically support infinite context length. Getting it is a problem to be solved, not a fundamental incompatibility like it is with transformers.
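To make the contrast concrete, here's a toy PyTorch sketch (all dimensions made up): the state you carry between chunks stays a fixed size no matter how long the stream gets, which is what "infinite context in principle" means. Whether useful information actually survives in that state is the part that needs solving.

```python
import torch
import torch.nn as nn

# The fixed-size recurrent state is the whole trick: you can feed the
# network an arbitrarily long stream chunk by chunk, and memory never grows.
rnn = nn.GRU(input_size=16, hidden_size=32, batch_first=True)

hidden = None                         # carried across chunks
for _ in range(1000):                 # could be any number of chunks
    chunk = torch.randn(1, 8, 16)     # (batch, seq_len, features)
    out, hidden = rnn(chunk, hidden)  # hidden stays (1, 1, 32) forever
```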
csreid t1_j8dqcrs wrote
Reply to [D] Quality of posts in this sub going down by MurlocXYZ
I like that /r/science (I think?) has verification and flair to show levels of expertise in certain areas, and strict moderation. I wouldn't hate some verification and a crackdown on low-effort bloom-/doom-posting around AI ("How close are we to Star Trek/Skynet?").
csreid t1_j2yva43 wrote
Reply to [Discussion] If ML is based on data generated by humans, can it truly outperform humans? by groman434
Yes, but I'm less sure about language models at the really high level (eg arriving at novel solutions to hard problems through LLMs).
Most ML in practice isn't about doing better than a person, it's about doing it faster and cheaper. Could a human who studied my viewing habits curate better Netflix recommendations for me? Obviously, but Netflix can't afford to do that for everyone and it would take forever.
There's also ML that's not based on data generated by humans (self-play RL like AlphaZero, for example, generates its own training data). I know we're in the era of LLMs, but that's not all there is.
csreid t1_j109ic5 wrote
Reply to comment by Hyper1on in [D] Will there be a replacement for Machine Learning Twitter? by MrAcurite
>I also think that federation is a terrible way to run a social network
How come? I was a little put off by the federated nature, but it doesn't actually get in your way once you're in. I expected it to be more siloed but it's not. Discoverability actually seems better than Twitter bc people sort themselves into nice buckets. It's a little like if "ML Twitter" were an actual thing rather than just a collection of accounts.
I am also into the idea of opting in to mod/admin policies that suit me, and I've become pretty skeptical of centralizing after this whole fiasco.
csreid t1_j0wsuhz wrote
Reply to comment by tpm319 in [D] Will there be a replacement for Machine Learning Twitter? by MrAcurite
> the same group(s)
What do you mean? I'm not turned off by the groups on discord/slack, I'm turned off by the whole experience. It's like ppl are trying to jam a social network into a chat app (bc they are).
I'm using Tusky for Mastodon on my phone and it's kinda rad. I will probably never use the native Mastodon web interface. I also never used the Twitter web interface, but I'm learning that's maybe weird.
csreid t1_j0vqx9d wrote
Reply to comment by tpm319 in [D] Will there be a replacement for Machine Learning Twitter? by MrAcurite
Slack and Discord are terrible Twitter replacements, cmv (no really, change my view: lots of my former Twitter communities have forked to Discord and I wanna participate)
csreid t1_j0vqr3p wrote
Reply to comment by killver in [D] Will there be a replacement for Machine Learning Twitter? by MrAcurite
I like the idea of something built on mastodon, or at least something that can interop with it. For some reason I'm feeling very wary about siloing all my social media into something that can be bought and completely burned down.
csreid t1_izbhqi8 wrote
Reply to comment by Wahajs in [D] Simple Questions Thread by AutoModerator
You might be able to start with Rasa, which is an open source chatbot framework.
csreid t1_izbhhqs wrote
Reply to comment by augustintherome in [D] Simple Questions Thread by AutoModerator
What you're describing is just called "question answering" in NLP afaik. A language model will take in a source document and a question and spit out either a generated answer to the question or a section of the source text containing the answer.
Check some of the QA models on huggingface to get an idea if you're not already familiar
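If it helps, a minimal sketch with the transformers pipeline API (this just grabs whatever the default QA model happens to be):

```python
from transformers import pipeline

# Extractive QA: the model picks a span of the context as the answer
qa = pipeline("question-answering")

result = qa(
    question="What does a QA model return?",
    context="A QA model takes a source document and a question and returns "
            "either a generated answer or a span of the source text.",
)
print(result["answer"], result["score"])
```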
csreid t1_izbgnyy wrote
Reply to comment by still_tyler in [D] Simple Questions Thread by AutoModerator
> The spatial components make me want to use a CNN, but each input being just a 1x3 vector rather than something bigger makes me think that's not possible?
The point of the convolution is to efficiently capture information from surrounding pixels when considering a single pixel. Back in the pre-DL olden days, computer vision stuff still involved convolutions; they were just handcrafted -- we had a lot of signal processing machinery we could use to eg detect edges and such. In your case, you don't really have anything to convolve over.
You could try just feeding the coordinates into an MLP with the other covariates and it should be able to capture that spatial component.
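Something like this, as a minimal PyTorch sketch (layer sizes and the covariate count are made up):

```python
import torch
import torch.nn as nn

# Coordinates go in as plain features alongside the other covariates;
# the MLP can learn the spatial relationship on its own.
n_covariates = 5  # stand-in; use your actual count

model = nn.Sequential(
    nn.Linear(3 + n_covariates, 64),  # the 1x3 spatial vector + covariates
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1),                 # regression target
)

x = torch.randn(32, 3 + n_covariates)  # a dummy batch of 32 examples
y_hat = model(x)                       # shape (32, 1)
```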
csreid t1_iza2yzk wrote
Reply to comment by hayAbhay in [D] If you had to pick 10-20 significant papers that summarize the research trajectory of AI from the past 100 years what would they be by versaceblues
Was it expected at the time that the embeddings would be so ... idk, semantically significant?
I feel like "king - man + woman = queen" is very unintuitive if you don't know about it already and it would've felt huge
csreid t1_iykq7xn wrote
Reply to comment by whatsafrigger in [R] Statistical vs Deep Learning forecasting methods by fedegarzar
And it's sometimes kinda hard to realize you're doing a bad job, especially if your bunk experiments give good results
I didn't have a ton of guidance when I was writing my thesis (so, my first actual research work) and was so disheartened when I realized my excellent groundbreaking results were actually just from bad experimental setup.
Still published tho! ^^jk
csreid t1_iwrhjh0 wrote
Reply to comment by FetalPositionAlwaysz in [D] Simple Questions Thread by AutoModerator
You don't need to go super huge to practice. You can try building a CNN on CIFAR-10, for example, which will be slower than using some $200/hr AWS box but can be done in a reasonable amount of time on a laptop.
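A minimal PyTorch/torchvision sketch of that (architecture and hyperparameters are arbitrary):

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

# CIFAR-10: 60k 32x32 color images in 10 classes, laptop-friendly
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor()
)
loader = DataLoader(train_set, batch_size=64, shuffle=True)

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32x32 -> 16x16
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for images, labels in loader:  # one epoch
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```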
csreid t1_iwrh9rt wrote
Reply to comment by stjernen in [D] Simple Questions Thread by AutoModerator
>- Is reward a input?
Kind of, in that it comes from the environment.
>- Is reward the process of constant retraining?
I'm not sure what this means.
>- Is reward the process of labeling?
No, probably not, but again I'm not sure what you mean.
>- Can it only be used with mdp?
MDP is part of the mathematical backbone of reinforcement learning, but there's also work on decision processes that don't satisfy the Markov property (a good google term for your card-playing use case would probably be "partially observable Markov decision processes", for example)
>- Can it only be used in ql / dql?
Every bit of reinforcement learning uses a reward, afaik.
>- I dont use cnn and images, can it be done without?
Absolutely! The training process is the same regardless of the underlying design of your q/critic/actor/etc function.
>- Lots of examples out there using «gym», can you do it without?
You can, you just need something which provides an initial state and then takes actions and returns a new state, a reward, and (sometimes) an "end of episode" flag. There's a minimal sketch at the bottom of this comment.
>- Many examples use -100 to 100 as reward, should it not be -1 to 1?
Magnitude of reward isn't super important as long as it's consistent. If you have sparse rewards (eg 0 except on win or loss), it might help to have larger values to help the gradient propagate back through the trajectory, but that's just me guessing. You can always try scaling to -1/1 and see how it goes.
I read "Reinforcement Learning" by Sutton and Barto (2018 edition) over a summer and it was excellent. Well-written, clear, and extremely helpful. I think what you're missing is maybe the Bellman background context.
csreid t1_iwrfqwx wrote
Reply to comment by Ordinary_Style_7641 in [D] Simple Questions Thread by AutoModerator
Yeah, this is a pretty typical use case for NLP.
csreid t1_iwreu2l wrote
Reply to comment by princessdrive in [D] Simple Questions Thread by AutoModerator
Check out Rasa.
There's ML in there, and I think any chatbot would qualify as AI, basically.
csreid t1_iwre4vj wrote
Reply to comment by nomanland21 in [D] Simple Questions Thread by AutoModerator
The obvious answer is recommender systems in basically everything -- online shopping, streaming, social media, etc.
csreid t1_iwrc1qs wrote
Reply to comment by ssharpe42 in [P] Modeling baseball injuries with temporal point processes by ssharpe42
Hell yeah, just needed to scroll down a lil further, ty
csreid t1_iwm0di2 wrote
I've always kicked around the idea of using a Hawkes process to model the concept of "momentum" in sports (which statistically doesn't seem to exist but has tons and tons of people who will chase you with weapons when you tell them that), but I'm lazy.
You wouldn't be willing to open source the code here, would you? 😅
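For what it's worth, the shape of the thing I have in mind, with completely made-up parameters (each event temporarily bumps the intensity of future events, which is the "momentum" part):

```python
import math

def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i))

    mu:    baseline scoring rate
    alpha: how much each event bumps the intensity
    beta:  how fast the "momentum" from an event decays
    """
    return mu + sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in event_times if t_i < t
    )

# Three quick scores in a row -> elevated intensity right after...
print(hawkes_intensity(10.0, [8.0, 8.5, 9.0]))
# ...decaying back toward the baseline later
print(hawkes_intensity(30.0, [8.0, 8.5, 9.0]))
```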
csreid t1_iwlzags wrote
Depending on what you mean by "custom", I still put those things together like Legos and fine-tune.
Also, I do mostly RL things and a lot of that stuff doesn't have good pretrained things (at least not for my purposes).
csreid t1_iwlyxnz wrote
Reply to [Research] Can we possibly get access to large language models (PaLM 540B, etc) like GPT-3 but no cost? by NLP2829
>Also, I can request up to 372GB VRAM, is there any large language model (#parameters > 100B) that I can actually download and run "locally"?
I've never done anything non-trivial with LLMs, but even using 32-bit floats, 100B parameters should take 400 gigs of RAM, right?
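Back-of-envelope, assuming no tricks like quantization or offloading:

```python
params = 100e9           # 100B parameters
bytes_per_param = 4      # fp32
print(params * bytes_per_param / 1e9)  # 400.0 GB, for the weights alone,
                                       # before any activations
```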
csreid t1_isat8n1 wrote
Reply to comment by echoAwooo in Do crickets respond to TV’s and video audio, with their own sounds? by Bony_Geese
Sometimes I wonder what, when I'm old, is going to be the thing that my generation was obviously backwards and awful and ignorant about, but more and more I think it's gonna be that lots of animals are smarter/more aware than we realized and we're going to be severely but fairly judged for the way we treated them.
csreid t1_j91lr4n wrote
Reply to comment by Deep-Station-1746 in [D] Please stop by [deleted]
More people with varied backgrounds and interests in a place is good, especially in a field with as much cross-niche potential as machine learning.