RuairiSpain

RuairiSpain t1_jclph9g wrote

And this is the reason the Fed will keep giving free money to banks and keep interest rates low, so venture capitalists and vultures can keep doing their financial pump-and-dumps of tech and pharma startups.

In the end, this bailout of depositors will be paid for by normal taxpayers. Like you and me.

Peter is not a normal taxpayer; his tax rate is probably around 1% of his income. I checked Peter's philanthropy: he gives $1,000,000 to young people who drop out of school to start businesses, which sounds like early-stage angel investing rather than philanthropy.

His other "philanthropy" is donating big money to Republicans, $20 million! https://www.opensecrets.org/donor-lookup/results?name=peter+thiel&order=desc&sort=D

In better news, his COVID-19 hideaway in New Zealand seems to have lost its planning permission. He's a great American, one who runs off to New Zealand when the shit hits the fan. https://nypost.com/2022/08/19/peter-thiels-plans-for-dream-home-in-new-zealand-are-gone/

19

RuairiSpain t1_jacukww wrote

ChatGPT's underlying model has on the order of 175 billion parameters; to hold that in memory you'd need several 80GB NVIDIA cards, at roughly $30,000 each. As AI models grow they'll need more RAM, and the cloud is the cheapest way for companies to time-share that hardware.

It's not just training the models; serving queries against them also needs those in-memory calculations. I'm not expecting gamers to buy these cards. But scale up the number of users querying OpenAI, Bing x ChatGPT or Google x Bard, plus all the other AI competitors, and there will be big demand for large-RAM GPUs.
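
Rough back-of-the-envelope maths on why (the parameter count and precisions below are my assumptions for illustration, not OpenAI's published deployment details):

```python
def model_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, ignoring activations and KV cache."""
    return n_params * bytes_per_param / 1e9

N_PARAMS = 175e9      # GPT-3-scale model (assumed)
GPU_MEMORY_GB = 80    # e.g. one NVIDIA A100 80GB

for precision, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    need = model_memory_gb(N_PARAMS, nbytes)
    gpus = int(-(-need // GPU_MEMORY_GB))  # ceiling division
    print(f"{precision}: ~{need:.0f} GB of weights -> at least {gpus} x 80GB GPUs")
```

Even at int8 you're looking at multiple cards just to hold the weights, before you serve a single query.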

−3

RuairiSpain t1_j9yf7g4 wrote

The model is huge, though, and needs to sit in GPU memory for the heavy calculations (sparse matrix dot products).

Probably one thing teams are working on is reducing the dimensions of the sparse matrix so it can fit on fewer GPUs. They're also looking at reduced precision for floating-point multiplication; 8-bit floats are probably enough for AI matrix maths. Another option is combining the matrix multiplication AND the activation function (typically ReLU or sigmoid) so the two maths operations can be done in one pass through the GPU. That involves refactoring their maths library.

Or they build custom TPUs with all of this baked into the hardware.
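
A minimal sketch of the reduced-precision and fused matmul+activation ideas in PyTorch (shapes, dtypes and the use of torch.compile are my assumptions, nothing here comes from OpenAI's actual code):

```python
import torch

# Illustrative shapes only.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16  # reduced precision: half the memory traffic of fp32

x = torch.randn(1024, 4096, device=device, dtype=dtype)
w = torch.randn(4096, 4096, device=device, dtype=dtype)

def linear_relu(x, w):
    # Written as two ops: a matmul followed by the activation...
    return torch.relu(x @ w)

# ...but torch.compile (PyTorch 2.x) can fuse the elementwise ReLU into the
# matmul epilogue, so the intermediate result avoids an extra round trip to memory.
fused_linear_relu = torch.compile(linear_relu)
out = fused_linear_relu(x, w)
print(out.shape, out.dtype)
```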

The future is bright 🌞 for AI. Until we hit the next brick wall

2

RuairiSpain t1_j4i4t54 wrote

I left academia in the 1990s. When did paper titles become so vague? "In my day", you had a good idea of what a paper was about just from the title. Reading the first 30-40 papers here, what are the authors trying to do? Be comedians?

I need a more up-to-date buzzword thesaurus of research fields and fashions, so I can interpret the context/semantics of these titles! I feel old 😫

5

RuairiSpain t1_j4gk0re wrote

Search and integration into Office products would be big revenue generators. Killing Google's revenue would be a double whammy for the tech sector: it would destabilise a main competitor and put MS at the front of the tech arms race for the next decade or two.

I foresee Google losing search market share, which is looking more and more likely given their terrible search results and spammed top results. That leaves Google with Android and YouTube, which are dependent on a good search engine for revenue.

If MS can move the needle on Bing market share, it could bring them back into the B2C market.

Imagine ChatGPT integrated into Word, PowerPoint, Excel and SharePoint! It would be a middle manager's wet dream to waste even more time on documents and paperwork 😜

1

RuairiSpain t1_j4giuhl wrote

Do you think ChatGPT will be able to fix the ambiguity in later responses? And improve the partial gibberish that it can add?

I'm not sure people have looked closely at the ChatGPT semantics. Debugging where the model goes wrong when it adds gibberish would be a big step in ML. The first hurdle is to get explainability into the model's results. I've not seen much discussion of this with ChatGPT.

1

RuairiSpain t1_j39q68d wrote

I like the idea. I work for a large enterprise on their ML platform team, providing similar services internally to all dev, ML and analytics teams. I think there is a business in it; it's a competitive space, but the acquisition potential is great (to be bought over and merged into a larger org).

I suggest you check out https://www.gitpod.io, which does more general provisioning of GitOps clusters/Pods in their managed Kubernetes clusters. It's not specifically ML, but we've looked at it for POC ML projects that want basic hosting.

Also check out: https://github.com/ml-tooling/ml-workspace, it's a nice open-source project with lots of packages ready to use.

And the hosted JupyterLab offerings; they'll be your main competition on pricing.

You are going to have a headache with Python version compatibility between your base dependencies, the ones used on GitHub, and the ones needed by Jupyter notebooks. Same with CUDA drivers; I suggest you lock down the AWS node instance types so it's less confusing for end users.
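
Something like this baked into the base image helps surface those mismatches early; it's just an illustrative sanity check assuming a PyTorch-based stack:

```python
import sys
import torch

# Print the versions that most often drift apart between the base image,
# the user's requirements and the CUDA driver on the node.
print("Python:", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA runtime:", torch.version.cuda)
    print("GPU:", torch.cuda.get_device_name(0))
```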

If you are turning it into a business, I'd recommend a tiered approach based on the size of the ML project. Simple POC ML projects with a tiny dataset are a good starting point for most people. But then people want data ingest, cleaning, and ETL to big data and enterprise sources; this gets complex fast (and is where most teams waste time and money). Either keep your focus on POCs and grow it as an ML hosting company for SMEs, or embrace the ETL side and simplify data ingest for larger enterprise companies. The second option is more of a consulting business, but you can charge high fees.

ML ETL space: https://www.ycombinator.com/companies/dagworks-inc

https://www.gathr.one/plans-pricing/

https://www.snowflake.com/en/

Of these 3 ETL companies, I've played with Snowflake and like what they do and their direction. I especially like that they acquired https://streamlit.io/, which is a fun way to deploy Python apps without dealing with infrastructure and DevOps tasks.

My final comment: include data ingest and ETL in your story to customers. ML training and deploying training pipelines are not where data scientists spend their time; 80% is spent on data collection, reshaping and validation.

FYI, I think you'll burn through $75 very quickly on an NVIDIA GPU. I presume you are running these at on-demand rather than spot prices. That monthly price seems generous for an average ML training pipeline.

14

RuairiSpain t1_j1j2qhg wrote

Also, I think a lot of ML research is now highly dependent on GPU parallel computation, which is expensive. My guess is that it will be the academics who collaborate with GCP, AWS and Azure who get near-free usage of GPU clusters. Coming up with a paradigm shift means a fair bit of trial-and-error experimentation. For the time being, cloud providers have been happy to promote ML pipelines to academics, but that may change as costs tighten with the layoffs and recession.

The transformer and self-attention progress has been an interesting achievement. I foresee us stuck in this trend until most ML groups have fully explored the avenues of research on self-attention. Without the advances in GPUs I don't think we'd be where we are now.

What's next? I'd love to see more progress on recommender systems and sparse training data.

I feel there are more gains to be had from the more boring stuff in ML: data wrangling design patterns that help non-data scientists choose the best model and customise it to answer their hypothesis questions. Also, the mechanics and infrastructure around ML at scale in an enterprise are not mature.

There are a lot of pain points for ML and big data teams in the enterprise; they need to skill up on a variety of hardcore DevOps tasks. Once it becomes trivial to spin up an ML pipeline with a cloud infrastructure team supporting that work, then we'll see more commercial successes and collaborations between academia and tech companies.

ML is at a strange evolutionary stage: there are lots of tech companies relying on ML models to give them a USP over their competitors, so their willingness to share their breakthroughs is small. Once the barriers to entry are reduced for enterprise-scale ML modelling, that's when we'll see more adoption of ML systems across whole sectors. Right now, ML commercial research is too expensive for small players to get involved.

3

RuairiSpain t1_izunqwy wrote

Data science is different from development; the agile methodologies don't apply cleanly because the work splits into distinct stages:

  1. A discovery stage, where you home in on the question you want to ask and get sample data that simplifies the actual data you'll work on
  2. Decide which ML algorithm or strategy comes closest to answering your question with the sample data you have
  3. Set up training, validation and test samples, then validate that they are representative of your real data
  4. Run the model and iterate over the process to improve results. Maybe use hyperparameter optimisation to come up with the best results for your loss function (see the sketch after this list)
  5. Present your result for peer review
  6. Refactor your model for performance and deployment
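
A minimal sketch of steps 3 and 4 with scikit-learn (the synthetic dataset, model and parameter grid are placeholders, not a recommendation for any particular problem):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss

# Placeholder data standing in for your cleaned sample set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=42)

# Step 3: hold out a test set; GridSearchCV handles train/validation splits via CV.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Step 4: iterate over hyperparameters against the loss function you care about.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [5, 10, None]},
    scoring="neg_log_loss",
    cv=5,
)
search.fit(X_train, y_train)

print("Best params:", search.best_params_)
print("Test log-loss:", log_loss(y_test, search.best_estimator_.predict_proba(X_test)))
```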

There is a lot of data science preamble before you get to a peer review, so quick feedback loops are different compared to software development. The discovery phase is more about understanding the data and extracting the appropriate features that should be tested. It's mostly about applying stats to your data, which then gives you hints about which ML models to choose from. See this article on stats: https://towardsdatascience.com/10-machine-learning-methods-that-every-data-scientist-should-know-3cc96e0eeee9

The developer stage is more at the tail end, where you look at refactoring the algorithm to make it as fast and explainable as possible. Maybe also add a feedback loop in production to check for model drift; that's where your agile frameworks would potentially be used.
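
A rough illustration of that drift check, using a two-sample KS test on prediction scores; the distributions and threshold here are made up and you'd calibrate them for your own system:

```python
import numpy as np
from scipy.stats import ks_2samp

# Stand-in score distributions; in practice you'd load the model's validation-time
# scores and a recent window of production scores.
rng = np.random.default_rng(0)
reference_scores = rng.beta(2, 5, size=5000)   # scores at training/validation time
production_scores = rng.beta(2, 4, size=5000)  # scores from recent live traffic

result = ks_2samp(reference_scores, production_scores)

# The 0.01 threshold is arbitrary; calibrate it, then alert or trigger retraining.
if result.pvalue < 0.01:
    print(f"Possible drift: KS statistic={result.statistic:.3f}, p={result.pvalue:.4f}")
else:
    print("Score distribution looks stable")
```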

2