curiousshortguy
curiousshortguy t1_jaf3aab wrote
Reply to comment by AnOnlineHandle in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Yeah, about 2-3. You can easily shove layers of the network onto disk and then load even larger models that don't fit in VRAM, BUT disk I/O will make inference painfully slow.
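As a rough sketch of what that offloading can look like in practice (the model name, offload folder, and choice of Hugging Face's accelerate-backed loading are my assumptions, not anything specific to Kosmos-1):

```python
# Sketch: load a model larger than VRAM by letting accelerate place layers on
# GPU, then CPU RAM, then disk. Model name and offload folder are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-13b"  # hypothetical choice; any large causal LM works

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",         # fill the GPU first, then CPU RAM, then disk
    offload_folder="offload",  # layers that fit nowhere else are written here
    torch_dtype="auto",
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)  # slow if layers sit on disk
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```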
curiousshortguy t1_jad9s4t wrote
Reply to comment by Beli_Mawrr in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
It is. You can probably do 2 to 8 billion parameters on your average gaming PC, and 16 billion on a high-end one.
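A back-of-the-envelope check on those numbers (my own sketch; it only counts the weights and ignores activations and the KV cache): weight memory is roughly parameter count times bytes per parameter.

```python
# Sketch: estimate weight memory only (activations and KV cache add on top).
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """fp32 = 4 bytes/param, fp16/bf16 = 2, int8 = 1."""
    return n_params * bytes_per_param / 1e9

for n_billion in (2, 8, 16):
    fp16 = weight_memory_gb(n_billion * 1e9, 2)
    int8 = weight_memory_gb(n_billion * 1e9, 1)
    print(f"{n_billion}B params: ~{fp16:.0f} GB fp16, ~{int8:.0f} GB int8")

# 2B fits almost any gaming GPU, 8B needs a 16-24 GB card in fp16,
# and 16B wants a high-end card plus int8 quantization or partial offloading.
```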
curiousshortguy t1_j91fxhr wrote
Reply to [D] Please stop by [deleted]
So many humans fail the Turing test, nobody anticipated that :D
curiousshortguy t1_j6oflky wrote
Reply to comment by lonelyrascal in [R] Question: what is the best approach to find similarity between a set of product titles and user query? by lonelyrascal
How do you use these other features? Do you just vectorize and sum the vectors? Or do you do something else?
I think you can leverage data from current production to create a labeled test dataset.
curiousshortguy t1_j6ak1cj wrote
Reply to [R] Question: what is the best approach to find similarity between a set of product titles and user query? by lonelyrascal
Why are you using Euclidean distance? Use cosine distance. The former cares about vector magnitude, the latter doesn't. As a general rule of thumb when comparing vector embeddings, you don't care about magnitude; at best, it captures document length.
Do you have more than product titles, such as product descriptions? Where do you get the user queries from? Do you use the default BERT tokenizer?
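To make the magnitude point above concrete, here's a tiny sketch (my own numbers): two embeddings pointing in the same direction are far apart in Euclidean distance but identical under cosine similarity.

```python
# Sketch: magnitude dominates Euclidean distance but is ignored by cosine similarity.
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 10 * a  # same direction, 10x the magnitude (think: a much longer document)

euclidean = np.linalg.norm(a - b)  # large, driven purely by magnitude
cosine_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))  # 1.0, direction only

print(f"euclidean: {euclidean:.2f}, cosine similarity: {cosine_sim:.2f}")
```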
curiousshortguy t1_j6ajise wrote
Reply to comment by marcingrzegzhik in [R] Question: what is the best approach to find similarity between a set of product titles and user query? by lonelyrascal
> You can also try using an embedding-based approach, such as using an embedding layer in a neural network. This would enable you to learn more complex relationships between product titles and user queries.
He's already doing that with BERT.
curiousshortguy t1_j61silr wrote
Reply to comment by currentscurrents in [R] Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers by currentscurrents
This is cool, thanks for sharing
curiousshortguy t1_j617zzd wrote
The keyword you want is MLOps, the analogue of DevOps (where GitHub plays the role of code storage). Within MLOps, look for data and model management and versioning. Quite a number of companies offer various aspects of that; see for example this random infographic: https://adataanalyst.com/wp-content/uploads/2021/05/Infra-Tooling3.png
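As one concrete taste of the model-management/versioning side (my choice of tool for illustration, not a recommendation pulled from that infographic), MLflow's Python tracking API logs parameters, metrics, and model artifacts per run:

```python
# Sketch: track a training run with MLflow so parameters, metrics, and the model
# artifact are versioned together. Tool choice is illustrative, not prescriptive.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run():
    clf = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", clf.score(X, y))
    mlflow.sklearn.log_model(clf, "model")  # stores a versioned model artifact
```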
curiousshortguy t1_j4mt461 wrote
Reply to comment by CosmicTardigrades in [D] ChatGPT can't count by CosmicTardigrades
Think of ChatGPT as a multi-task meta-learner where the prompt you give it specifies the task. It's essentially only trained on text generation (with some fine-tuning to make it more conversational). So you need to set up a prompt that makes it generate reasonable answers. It can't think or calculate, but by showing it how to produce a correct answer in the prompt, it can leverage that information to give you better answers.
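A minimal sketch of that kind of prompt setup (the task and the worked examples are made up for illustration):

```python
# Sketch: build a few-shot prompt that shows the model how to produce a correct
# answer before asking the real question. Examples are made up for illustration.
examples = [
    ("How many letters are in 'cat'?", "c-a-t -> 3 letters. Answer: 3"),
    ("How many letters are in 'horse'?", "h-o-r-s-e -> 5 letters. Answer: 5"),
]
question = "How many letters are in 'banana'?"

prompt = "Answer by spelling the word out letter by letter, then counting.\n\n"
for q, a in examples:
    prompt += f"Q: {q}\nA: {a}\n\n"
prompt += f"Q: {question}\nA:"

print(prompt)  # send this to the model instead of the bare question
```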
curiousshortguy t1_j2s1o3e wrote
Don't do a PhD to increase your employability, that's just bad.
When you ask for real impact, what do you actually want? People adopting your ideas? Your ideas moving the field forward? These are very different things.
curiousshortguy t1_j2s1kb8 wrote
Reply to comment by Hobo-Wizzard in [D] life advice to relatively late bloomer ML theory researcher. by notyourregularnerd
Top research labs in Germany will hesitate to take someone that old, and pretty much age-discriminate against older students.
curiousshortguy t1_j1s504a wrote
Reply to comment by Iwannabeaviking in [D] Are reviewer blacklists actually implemented at ML conferences? by XalosXandrez
> whilst some conferences are American yes, not all are
The ones OP mentions (ICML / ICLR / NeurIPS) are though.
> Blacklist is a universal term and is not offensive and trying to say so is stupid so changing the language to appease a small minority of people is stupid
There's enough science to back up that this is probably not stupid:
> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6007773/
But of course, if you don't care about a minority because they're a minority, you can go ahead and do whatever you want to do. You just might be an inconsiderate asshat then.
curiousshortguy t1_j1ppptp wrote
Reply to comment by Iwannabeaviking in [D] Are reviewer blacklists actually implemented at ML conferences? by XalosXandrez
Most ML conferences are American, have an extensive code of conduct plus diversity, inclusion, and accessibility goals, and thus happen in a very American cultural space.
If you want to be part of a community, you can't unsubscribe from the rules and be as rude as you like simply by saying that you're not American. You're not to discriminate against members of the community or use micro-aggressions against them because your culture doesn't acknowledge the societal issues in other countries.
curiousshortguy t1_j1pp77o wrote
Reply to comment by GitGudOrGetGot in [D] Are reviewer blacklists actually implemented at ML conferences? by XalosXandrez
Yes, but not many of those colors are the skin color of a discriminated-against minority that has been subjected to systematic abuse and violence for centuries, mostly justified by having that skin color. If you want to understand the argument, you can't ignore the social and cultural context.
curiousshortguy t1_j1o5a7j wrote
Reply to comment by GitGudOrGetGot in [D] Are reviewer blacklists actually implemented at ML conferences? by XalosXandrez
Associating the color black with something negative
curiousshortguy t1_j0unior wrote
Check out the local timeline of https://sigmoid.social on the Fediverse. It's essentially an ML Mastodon instance. The Max Planck and Helmholtz institutes also run their own instances.
curiousshortguy t1_j0gegtd wrote
I think there's some interest in learning optimal decision trees in the community, as well as robust learning methods under different kinds of adversarial influence. They're less open problems and more areas of potential improvement though.
curiousshortguy t1_iz9eou0 wrote
Reply to comment by SmorgasConfigurator in [Discussion] No-code ML for engineers by kayhai
KNIME is great for that.
curiousshortguy t1_iyx399h wrote
Reply to comment by crouching_dragon_420 in [D] Score 4.5 GNN paper from Muhan Zhang at Peking University was amazingly accepted by NeurIPS 2022 by Even_Stay3387
Supervisors are responsible and accountable for their reviews though. It's a shitty excuse at best.
curiousshortguy t1_iuvswa7 wrote
Reply to comment by DeepGamingAI in [D] What are the benefits of being a reviewer? by Signal-Mixture-4046
>You just described the role of a discriminator in a gan
I disagree. The discriminator is just used in a binary fashion and doesn't add a lot of explanatory value.
Just to clarify, I am not trying to say that OP is unqualified. But I think thinking about it in a purely binary way isn't enough for good peer review and a functioning system.
curiousshortguy t1_iutwnuz wrote
Reply to comment by DeepGamingAI in [D] What are the benefits of being a reviewer? by Signal-Mixture-4046
I think this is a terrible analogy. Reviewing isn't about rejecting. It's about enabling good scholarship and guiding researchers.
Casting the reviewer as a gatekeeper just gives you shitty results, as seen in the last round of reviews at large conferences with uneducated and unqualified reviewers.
curiousshortguy t1_irc2htk wrote
Reply to comment by Ok-Experience5604 in [D] How hard is it to join a lab during Master's? by Ok-Experience5604
> This might be a silly question, but do you think that asking to be engaged for no pay, emphasizing you're very determined to gain experience would generally be a good idea?
Doesn't hurt to ask. A good approach is to see if the lab has a list of research grants on their website. Look at those, and you might be able to infer where funding is available. The same goes for recent papers; they give you good indications. It's even better if you're taking a class and can engage the team through that.
>I am in no position to try and aim for these places, it does make me worry that I would have to compete with an abundance of people with such qualifications at every step. I've recently burned myself trying hard to get chosen for summer research programs.
Don't be discouraged, you only need to be accepted once :D
To be fair, 75% of your success in applications to FAANG and Ivy League places will come via recommendations and network. You have a full 2 years to build that.
curiousshortguy t1_irbnvzk wrote
It really depends on the country; the EU is not quite as homogeneous as it seems from the outside.
Typically, master's students are expected to write a master's thesis, and that is often research-focused. They're usually supervised on a day-to-day basis by a PhD student, with a professor being formally responsible. This happens in the labs, but because every student goes through it, these students are not always listed as lab members (because they aren't).
Sometimes labs also have funding for student researchers who do work that's not part of their thesis. That's much rarer, and in most countries the pay is shit compared to jobs in industry (I guess: welcome to academia).
If you want to get published and don't happen to find a lab that's hiring students: choose a good thesis topic where 1) the lab is actively doing research, 2) you contribute to an ongoing effort in the lab, 3) you are supervised by someone who wants you to succeed (i.e. your project isn't a side project or a very specific niche follow-up), and 4) you make your ambitions clear from the beginning.
Don't expect the supervisors to hand you an idea that will lead end-to-end to a publication. You need to use your own judgement, and your own research on the field, to make a somewhat educated guess.
Unless you're trying to join Ivy League-like places, where a whole pipeline of FAANG engineers currently pushes their high-school kids through internships to get their names on top-tier conference publications before they even graduate high school, you'll end up being a decent candidate with:
- good enough grades
- a decent selection of courses that gives you solid background to understand enough math and theory
- side-projects, interesting term papers, ba/ma thesis projects
- network, get to know your lecturers, engage with them
- recommendation letters are often a requirement. Just having good grades doesn't make you a good candidate for a letter. Wtf is the professor going to write? "He was good in my class" is just a weak recommendation letter, barely better than none at all.
- try to participate in summer schools; there often is funding for them. Typically summer schools target PhD candidates, but you can join them as a master's student as well
- if you're rich/well-funded enough, you can even attend academic conferences on your own/as a student
- don't wait for research student jobs to be advertised. Go ask. That's how they're all gone before they make it to the job boards.
Probably more, but I can't think of more rn.
curiousshortguy t1_iqxws3o wrote
I love that you're making the figures available separately, too. The TOC and the topics you mention indeed go deeper than the average material.
I'm looking forward to the notebooks though; that stuff usually makes things really actionable.
curiousshortguy t1_jalwn6p wrote
Reply to [D] Podcasts about ML research? by Tight-Vacation-9410
Talking Machines has a great hook into machine learning researchers in the NeurIPS community, but it's also very much an ivory-tower view with very little relevance for research outside the privilege bubble.