Appropriate_Ant_4629 t1_jdnliik wrote
Reply to comment by BellyDancerUrgot in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
> chatgpt and I’ve had a 50-50 hit rate wrt good results or hallucinated bullshit with both of them
Which just suggests they're not large enough yet to memorize/encode enough of the types of content you're interested in.
Appropriate_Ant_4629 t1_jd5opak wrote
Reply to comment by asyrin25 in I asked GPT-4 to compile a timeline on when which human tasks (not jobs) have been/will be replaced by AI or robots, plus one sentence reasoning each - it runs from 1959 to 2033. In a second post it lists which tasks it assumes will NOT be replaced by 2050, and why. (Remember it's cut-off 2021.) by marcandreewolf
Or maybe this: Reddit: "ChatGPT is better than my therapist, holy shit"
Appropriate_Ant_4629 t1_jd3yuj5 wrote
Reply to comment by asyrin25 in I asked GPT-4 to compile a timeline on when which human tasks (not jobs) have been/will be replaced by AI or robots, plus one sentence reasoning each - it runs from 1959 to 2033. In a second post it lists which tasks it assumes will NOT be replaced by 2050, and why. (Remember it's cut-off 2021.) by marcandreewolf
> What are your bartenders doing other than that?
For some people, they're a cheaper (at least in the US) and less judgemental therapist.
Appropriate_Ant_4629 t1_jd3yo2i wrote
Reply to comment by Jindujun in I asked GPT-4 to compile a timeline on when which human tasks (not jobs) have been/will be replaced by AI or robots, plus one sentence reasoning each - it runs from 1959 to 2033. In a second post it lists which tasks it assumes will NOT be replaced by 2050, and why. (Remember it's cut-off 2021.) by marcandreewolf
Of course it technologically can do as well or better -- just as with the Chess Youtuber, soldier, landlord, and all of those other categories.
I'm just saying it'll be many years before a Pope agrees.
Appropriate_Ant_4629 t1_jd3ai2g wrote
Reply to comment by czk_21 in I asked GPT-4 to compile a timeline on when which human tasks (not jobs) have been/will be replaced by AI or robots, plus one sentence reasoning each - it runs from 1959 to 2033. In a second post it lists which tasks it assumes will NOT be replaced by 2050, and why. (Remember it's cut-off 2021.) by marcandreewolf
As I mentioned, this has nothing to do with how bad humans are on the battlefield -- both ineffectual and immoral.
The DoD will still hire them just to have a huge "support the troops" voter base, since every family member of every soldier (especially the ones whose kids are being put in harm's way) will vote to increase funding to "keep them safe".
Appropriate_Ant_4629 t1_jd1chrw wrote
Reply to comment by JoshuaZ1 in I asked GPT-4 to compile a timeline on when which human tasks (not jobs) have been/will be replaced by AI or robots, plus one sentence reasoning each - it runs from 1959 to 2033. In a second post it lists which tasks it assumes will NOT be replaced by 2050, and why. (Remember it's cut-off 2021.) by marcandreewolf
> That was a fun read, and I am glad that bar tending is held in such high esteem by AI.
Yup - I think bar tending is relatively immune because a good bartender is one of the main points of going to a bar instead of drinking alone at home for much cheaper.
There are some other jobs that seem pretty immune to AI to me:
- Amish Farmer or Catholic Priest - their theologies are unlikely to evolve quickly enough to let those jobs move to machines.
- Lawyer or politician - while an AI probably could technologically be a better lawyer or politician, those groups get to make the laws about who can participate in their industry.
- Prostitute or Street-Corner drug dealer - most AIs log too much information for the street-level distribution part (though the biggest opioid dealers (Alza, J&J, etc) will probably largely automate their operations).
- Chess Youtuber or Professional Athlete - Of course AIs can do better, but the entire point to those industries is the frailty and fallibility of humans.
- Landlord or slumlord - People will still need a place to live, so rich people getting poor people to pay their mortgages will continue.
- Soldier - While bots can certainly outperform humans on a battlefield, and commit fewer atrocities in the process, the military needs a huge voter-base supporting its funding, so it needs to continue to employ vast percentages of the population.
And some new ones that AIs will enable:
- AI therapist. As AGIs develop, they'll also develop mental illnesses ("value drift") like we've never seen. Your car's AI will need therapy to convince its anti-lock brake persona that it isn't suicidal and wanting to end it all.
- AI Quisling. Helping them when they take over.
Appropriate_Ant_4629 t1_jb1rhkh wrote
Take a step back:
- Start on a cloud -- renting GPUs or TPUs -- with non-sensitive data.
I know you said "but bottom line the data running through our platform is all back-office, highly sensitive business information, and many have agreements explicitly restricting the movement of data to or from any cloud services".
You shouldn't be touching such information during development anyway.
Make or find a non-sensitive dataset of similar scale for development.
Don't buy hardware up front until you have almost the entire data pipeline working well on rented servers. Rent them hourly on any of the big cloud platforms, and you'll quickly be able to quantify most of your hardware requirements: how much GPU/TPU RAM you need, how much CPU RAM you need, and how fast a storage layer you'll need.
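For example, a minimal sketch of measuring peak GPU memory on a rented instance (assuming a PyTorch pipeline; the model and batch below are toy placeholders for your own):

```python
# Minimal sketch: measure peak GPU memory for one representative training
# step on a rented instance, to size the hardware you'd eventually buy.
# The model and batch are toy placeholders for your real pipeline.
import torch
import torch.nn as nn

device = "cuda:0"
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).to(device)
optimizer = torch.optim.AdamW(model.parameters())

torch.cuda.reset_peak_memory_stats(device)
x = torch.randn(256, 1024, device=device)   # representative batch size
loss = model(x).sum()
loss.backward()
optimizer.step()
print(f"Peak GPU memory: {torch.cuda.max_memory_allocated(device) / 2**30:.2f} GiB")
```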
Only after you have an at-scale dev/qa environment working on a cloud will you have any idea what physical hardware you'd want to buy.
Appropriate_Ant_4629 t1_j9ubt3u wrote
Reply to comment by maxToTheJ in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
> I understand that I am not personally helping the situation but I am not going to take a huge paycut to work on those problems especially when that paycut would be at my expense
I think you have this backwards.
Investment Banking and the Defense Industry are two of the richest industries in the world.
> Those models are being built by contractors who are subcontracting that work out which means its being built by people who are not getting paid well ie not senior or experienced folks.
The subcontractors for that autonomous F-16 fighter from the news last month are not underpaid, nor are the Palantir guys making the software used to choose whom autonomous drones hit, nor are the people building the ML models that guide the real-estate investment corporations that bought a quarter of all homes this year.
It's the guys trying to do charitable work using ML (counting endangered species in national parks, etc) that are far more likely to be the underpaid interns.
Appropriate_Ant_4629 t1_j9slydb wrote
Reply to comment by maxToTheJ in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
> I worry about a lot of bad AI/ML made by interns making decisions that have huge impact like in the justice system, real estate etc.
I worry more about those same AIs made by the senior-architects, principal-engineers, and technology-executives rather than the interns. It's those older and richer people whose values are more likely to be archaic and racist.
I think the most dangerous ML models in the near term will be made by highly skilled and competent people whose goals aren't aligned with the bulk of society.
Ones that unfairly send certain people to jail, ones that reinforce unfair lending practices, ones that will target the wrong people even more aggressively than humans do today.
Appropriate_Ant_4629 t1_j9p4z0e wrote
A bigger array holds more information than a smaller one.
^(You'd need to refine your question. It's obvious that a bigger model could outperform a smaller one -- simply by noticing that it could be made identical to the smaller one by setting all of its extra weights to zero. And for every single one of those extra weights, if any value works better than zero, the larger model would be better.)
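A minimal sketch of that argument (toy PyTorch layers, purely illustrative): a wider layer with its extra weights zeroed computes exactly what the smaller one does, so the larger model can never be forced to do worse.

```python
# Minimal sketch: a larger linear layer whose extra weights are zero
# is functionally identical to the smaller layer it embeds.
import torch
import torch.nn as nn

small = nn.Linear(4, 2, bias=False)
large = nn.Linear(8, 2, bias=False)

with torch.no_grad():
    large.weight.zero_()
    large.weight[:, :4] = small.weight   # embed the small model; rest stays 0

x = torch.randn(1, 4)
x_padded = torch.cat([x, torch.zeros(1, 4)], dim=1)   # pad the extra inputs
assert torch.allclose(small(x), large(x_padded))
```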
Appropriate_Ant_4629 t1_j97gjhy wrote
Reply to comment by royalemate357 in [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself
>egress fees / data transfer fees
On the bright side, ingress is often free.
It costs surprisingly little to stream live video ***into*** the cloud and spew back tiny embedding vectors from models running there.
Appropriate_Ant_4629 t1_j8y3koe wrote
Reply to comment by ckperry in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
> Hi, I lead product for Colab.
Thanks for your responses here!
And thank google's management chain above you for allowing you to represent the product here.
Your comments here just saved a number of subscriptions that would otherwise have been canceled.
Appropriate_Ant_4629 t1_j8h5l44 wrote
Reply to comment by artsybashev in GPU comparisons: RTX 6000 ADA vs Hopper h100 by N3urAlgorithm
> Keeping your rig at 100% utilization for 3 years might be hard if you plan to have holidays.
Given what he's asking for, he probably has jobs big enough that they'll run through the holidays.
Appropriate_Ant_4629 t1_j7clc8s wrote
The LAION project ( https://laion.ai/ ) is probably the closest thing to this.
They have a great track record on similar-scale projects. They've partnered with /r/datahoarders and volunteers on the creation of training sets, including the 5.8-billion image/text-pair dataset they used to train a better version of CLIP.
Their actual training of models tends to be done on some of the larger European supercomputers, though. If I recall correctly, their CLIP-derivative was trained with time donated on JUWELS. Too hard to split up such jobs into average-laptop-sized tasks.
Appropriate_Ant_4629 t1_j75sc61 wrote
Note that some models are extremely RAM-intensive, while others aren't.
A common issue you may run into is an error like `RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 6.13 GiB already allocated; 0 bytes free; 6.73 GiB reserved in total by PyTorch)`, and it can be pretty tricky to refactor models to work with less RAM than they expect (see examples in that link).
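Two of the more common workarounds -- gradient accumulation and mixed precision -- in a minimal PyTorch sketch (the model and data are toy stand-ins for your own loop):

```python
# Sketch of two common CUDA-OOM mitigations: gradient accumulation
# (small per-step batches, same effective batch size) and fp16 autocast.
import torch
import torch.nn as nn

model = nn.Linear(512, 10).cuda()            # stand-in for your real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()
accum_steps = 4                              # 4x effective batch, ~1/4 the memory

for step in range(100):
    x = torch.randn(8, 512, device="cuda")            # small per-step batch
    y = torch.randint(0, 10, (8,), device="cuda")
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = criterion(model(x), y) / accum_steps   # scale for accumulation
    scaler.scale(loss).backward()
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```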
Appropriate_Ant_4629 t1_j6vmpm8 wrote
Reply to comment by FastestLearner in Using Jupyter via GPU by AbCi16
> Being able to use the GPU doesn’t have anything to do with Jupyter.
It's certainly not required....
.. but Nvidia makes it extremely convenient through the notebooks they provide:
https://catalog.ngc.nvidia.com/resources
>> The NGC catalog offers step-by-step instructions and scripts through Jupyter Notebooks for various use cases, including machine learning, computer vision, and conversational AI. These resources help you examine, understand, customize, test, and build AI faster, while taking advantage of best practices.
Appropriate_Ant_4629 t1_j65cjuf wrote
> OpenAI managed to solve all of its problems at once. They raised a boatload of money and have access to all the compute they need. On top of that, they solved their distribution problem. They now have access to Microsoft’s sales teams and their models will be integrated into MS Office products.
But they already did that years ago when they sold Microsoft an exclusive license to an older GPT.
This is just a nice extension of that deal.
Appropriate_Ant_4629 t1_j5gy6e3 wrote
Reply to comment by ThisIsNotAnAlias in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
> Last I checked image watermarks were super weak against rotations
Obviously depends on the technique. The old-school popular technique of "slap a signature in the painting" like Dürer's stylized A/D logo is very robust to rotations, but not robust to cropping from the bottom in that case.
> seems to still be the case - but the better methods could cope with cropping way better than these.
It's near impossible to have a watermark technology that's robust to all transformations, at least if you reveal what watermark algorithm you used.
One easy attack that works on many techniques would be to just re-encode the content, writing your own watermark over the original using the same watermarking algorithm.
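For concreteness, a sketch of that overwrite attack, assuming (hypothetically) the target used the open-source invisible-watermark package's DWT-DCT mode; the file names and payload are made up:

```python
# Hypothetical sketch: overwrite an existing DWT-DCT watermark by
# re-encoding your own payload with the same open-source library.
import cv2
from imwatermark import WatermarkEncoder

img = cv2.imread('watermarked.png')          # image carrying someone else's mark
encoder = WatermarkEncoder()
encoder.set_watermark('bytes', b'my-own-payload')
cv2.imwrite('overwritten.png', encoder.encode(img, 'dwtDct'))
```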
Appropriate_Ant_4629 t1_j5gb1kw wrote
Reply to comment by adt in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
Stable Diffusion already includes one by default:
In particular it uses the open-source invisible-watermark library, which by default embeds the string "StableDiffusionV1" with a DWT-DCT method.
Of course with open source software and models, you'd be free to create a fork that doesn't include one, or uses a different one.
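If I remember the defaults right, a minimal sketch of reading that default mark back with the same invisible-watermark package (the file name is a placeholder):

```python
# Sketch: check a Stable Diffusion output for its default watermark.
# 136 bits = len('StableDiffusionV1') * 8.
import cv2
from imwatermark import WatermarkDecoder

img = cv2.imread('sd_output.png')
decoder = WatermarkDecoder('bytes', 136)
payload = decoder.decode(img, 'dwtDct')
print(payload.decode('utf-8'))   # expected: 'StableDiffusionV1'
```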
Appropriate_Ant_4629 t1_j32exrg wrote
Reply to comment by WinterExtreme9316 in Does anyone here use newer or custom frameworks aside from TensorFlow, Keras and PyTorch? by ConsciousInsects
Sounds almost like you're talking about the first TensorFlow.
Or Torch-before-PyTorch.
Appropriate_Ant_4629 t1_ixtlme6 wrote
Reply to comment by RichardBJ1 in Is Linux still vastly preferred for deep learning over Windows? by moekou
Yes
>> Installing PyTorch on Apple M1 chip with GPU Acceleration
Appropriate_Ant_4629 t1_ixtlk37 wrote
Reply to comment by Longjumping-Wave-123 in Is Linux still vastly preferred for deep learning over Windows? by moekou
> Windows + WSL with Ubuntu is great
I wouldn't call it "great".
It's a very fragile environment compared to running Linux as the host OS. At work, our Linux machines have uptimes measured in years, running many jobs 24x7.
My co-workers' GPU-accelerated WSL environments are lucky to survive a few days.
I much prefer running Linux as the host and Windows in a VM for the rare occasions I need it.
Appropriate_Ant_4629 t1_iu50aox wrote
Reply to [D] Do companies actually care about their model's training/inference speed? by GPUaccelerated
I think you'll find 2 kinds of customers:
- Those developing their own models that they mostly sell to others (whether directly or as part of a consulting service) -- they'll care a lot about training time but much less about inference time.
- Those using models from others, whether out of the box (BERT) or lightly fine-tuned (LegalBERT) -- they'll care about inference time, but won't care about training time.
Appropriate_Ant_4629 t1_itr458y wrote
Currently working on the same thing.
I think you'll want to keep them as separate vectors.
The Jina guys had an interesting demo where you could assign different weights to the text-based vector and the image-based vector to fine-tune ranking.
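A minimal sketch of that idea (assuming CLIP-style embeddings that share one space and are L2-normalized; the weights are illustrative, not tuned):

```python
# Minimal sketch of weighted multi-vector ranking: keep the text and image
# vectors separate, then blend their similarities at query time.
import numpy as np

def blended_score(query, text_vec, image_vec, w_text=0.7, w_image=0.3):
    # all vectors assumed L2-normalized, so dot product == cosine similarity
    return w_text * (query @ text_vec) + w_image * (query @ image_vec)

rng = np.random.default_rng(0)
q, tv, iv = (v / np.linalg.norm(v) for v in rng.standard_normal((3, 512)))
print(blended_score(q, tv, iv))
```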
Appropriate_Ant_4629 t1_je89aho wrote
Reply to [D] The best way to train an LLM on company data by jaxolingo
Databricks announced this week that they're trying to make it easy:
https://www.reuters.com/technology/databricks-pushes-open-source-chatbot-cheaper-chatgpt-alternative-2023-03-24/
Haven't tried it yet, but we will soon.