AmalgamDragon
AmalgamDragon t1_ja9rfxe wrote
Reply to comment by IMTrick in Caught between Microsoft's and Google's search war, the ad industry grapples with a 'exciting and terrifying' new reality by marketrent
Yes. Google's search has become increasingly terrible, to the point of being useless for anything that isn't popular, while Microsoft's is still usable.
Google's search used to be the best by far, but they haven't kept their eye on the ball.
AmalgamDragon t1_ja5lz5b wrote
Reply to comment by currentscurrents in [D] Isn't self-supervised learning(SSL) simply a kind of SL? by Linear--
This really comes down to how 'reward' is defined. I think we likely disagree on that definition, with yours being a lot narrower than mine. For example, during the cooking process, there is usually a point before the meal is done where it 'smells good', which is a reward. There's dopamine release as well, which could be triggered when completing some of the steps (I don't know if that's the case or not), but simply observing that a step is complete is rewarding for lots of folks.
> Pure RL will quickly teach you not to touch the burner, but it really struggles with tasks that involve planning or delayed rewards.
Depends on which algorithms you're using, but PPO can handle this quite well.
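To illustrate why a delayed reward isn't a dead end, here is a minimal sketch (this is not PPO itself; the function name and the toy reward sequence are made up for illustration). With discounting, a reward that only arrives at the last step still gives every earlier step a nonzero return, and that return is the kind of signal policy-gradient methods such as PPO build their advantage estimates from.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: nothing happens until the final step pays off.
rewards = [0.0, 0.0, 0.0, 0.0, 1.0]
returns = discounted_returns(rewards, gamma=0.9)

for t, g in enumerate(returns):
    print(f"step {t}: return {g:.4f}")
```

The first step ends up with a return of roughly 0.9^4, so the early actions still get credit for the payoff at the end. How well this works in practice depends on the horizon and on the discount factor.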
AmalgamDragon t1_j9zyyib wrote
Reply to comment by currentscurrents in [D] Isn't self-supervised learning(SSL) simply a kind of SL? by Linear--
> Rewards are sparse in the real world
This doesn't seem true. The only reason we aren't getting negative rewards (e.g. pain, discomfort, etc.) constantly is that we learn to generally avoid them.
AmalgamDragon t1_j56uj5c wrote
Reply to comment by tennismlandguitar in [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar
I recently started using RL in my personal work on automated futures trading. After reviewing the libraries available in the RL space, I did try the one you linked to. Some of the samples were broken. While I did tweak the code to get the samples to work, I found it more straightforward to get up and running using PPO from stable-baselines3.
AmalgamDragon t1_izyv7wv wrote
Reply to [D] Industry folks, what kind of development methodology/cycle do you use? by DisWastingMyTime
Kanban. Trying to do scrum for ML ends up pretty goofy as most backlog items will be spikes.
AmalgamDragon t1_izfizm8 wrote
Reply to comment by VirtualHat in [D] Workflows for quickly iterating over ideas without free access to super computers by [deleted]
Pics or it didn't happen (i.e. please share the details of this system).
AmalgamDragon t1_iwrd8gj wrote
Reply to comment by impossiblefork in [D] Is it legitimate for reviewers to ask you compare with papers that are not peer-reviewed? by Blasphemer666
Here's an upvote for a strong stance against gatekeeping.
AmalgamDragon OP t1_iu2souc wrote
Reply to comment by DaLameLama in [D] Self-supervised/collaborative embedding? by AmalgamDragon
I meant the general approach I laid out in my original post. That said, I'm also not working with image data (or audio or NLP), and generalizing VICReg seems like it's more in theory than in practice at the moment.
AmalgamDragon OP t1_iu2qt5u wrote
Reply to comment by DaLameLama in [D] Self-supervised/collaborative embedding? by AmalgamDragon
Thanks! Your reference to the issue with constant vectors and the same reference in the relevant papers for those methods you mentioned completes my investigation on this (i.e. this isn't an approach worth pursuing).
AmalgamDragon OP t1_iu25s20 wrote
Reply to comment by IntelArtiGen in [D] Self-supervised/collaborative embedding? by AmalgamDragon
> I'm not sure how it would really learn something from the input if you don't define a more useful task. How would this model penalize a "collapse" situation where both models always predict 0 for example or any random value?
Yeah, it may not work well. I haven't been able to track down whether this has already been tried and found wanting.
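The collapse concern in the quote can be made concrete with a toy sketch. Everything here is an assumption for illustration: scalar outputs, the helper names, and the numbers. If the only objective is that two models agree, constant outputs score a perfect zero loss. A hinge on the batch standard deviation, in the spirit of the variance term used by VICReg, is one way to penalize that.

```python
import math


def agreement_loss(a, b):
    """Mean squared difference between two models' scalar outputs."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def variance_penalty(outputs, target_std=1.0, eps=1e-4):
    """Hinge on the batch standard deviation; zero once std >= target_std."""
    mean = sum(outputs) / len(outputs)
    var = sum((x - mean) ** 2 for x in outputs) / (len(outputs) - 1)
    return max(0.0, target_std - math.sqrt(var + eps))


# Hypothetical batches: both models collapsed to 0 vs. outputs that vary.
collapsed_a = [0.0, 0.0, 0.0, 0.0]
collapsed_b = [0.0, 0.0, 0.0, 0.0]
varied_a = [-1.5, -0.5, 0.5, 1.5]
varied_b = [-1.4, -0.6, 0.6, 1.4]

collapsed_agreement = agreement_loss(collapsed_a, collapsed_b)
collapsed_penalty = variance_penalty(collapsed_a)
varied_agreement = agreement_loss(varied_a, varied_b)
varied_penalty = variance_penalty(varied_a)

print("collapsed:", collapsed_agreement, collapsed_penalty)
print("varied:   ", varied_agreement, varied_penalty)
```

The collapsed pair wins on agreement alone (loss of exactly zero) but takes nearly the full variance penalty, while the varied pair pays a small agreement cost and no variance penalty. This only shows that the penalty separates the two cases, not that the overall approach learns anything useful.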
Submitted by AmalgamDragon t3_yf73ll in MachineLearning
AmalgamDragon t1_jbuu2e8 wrote
Reply to comment by CntrldChaos in Microsoft is bringing back classic Taskbar features on Windows 11 — but not because it screwed up by AliTVBG
Microsoft/Windows isn't a startup. They don't need an MVP for their start menu. It's already been around for decades and been used by billions.
> Building out all features to 100% is actually the exact model that failed day in and day out before the MVP and priority based model. You’d know that if you delivered software for a living
I do. I also know you are dead wrong that there is a single best way to deliver software.
Enjoy all your well deserved downvotes.