ggf31416

ggf31416 t1_jcmql4c wrote

At the speeds these things move, by the time you see them coming it's already too late for any corrective maneuver. It's the same reason you don't use your eyeballs to detect aircraft 100 km away. See https://en.wikipedia.org/wiki/Space_debris#Tracking_and_measurement and "Algorithms to Antenna: Tracking Space Debris with a Radar Network": radar and lidar are what's actually used.
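For a sense of scale, a rough back-of-the-envelope sketch (both numbers are assumptions: ~10 km/s is a typical closing speed for a LEO conjunction, and 10 km is a very generous range for spotting small debris by eye):

```python
# Back-of-the-envelope: how much warning eyesight would give you in LEO.
closing_speed_km_s = 10.0   # assumed typical closing speed for a LEO conjunction
detection_range_km = 10.0   # assumed (very optimistic) visual detection range

reaction_time_s = detection_range_km / closing_speed_km_s
print(f"Time from 'seeing it' to impact: ~{reaction_time_s:.0f} s")  # ~1 s
```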

2

ggf31416 t1_jca7zwz wrote

https://fullstackdeeplearning.com/cloud-gpus/

Your best bet to reach 256 GB of GPU memory in the cloud would be Azure's 4x 80 GB A100 instances; however, your $40k budget will only buy you about 3,000 hours of on-demand compute at best, with spot instances stretching that a bit further.
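A quick sketch of that budget math (the hourly rate is an assumption back-calculated from the numbers above, not a quoted Azure price):

```python
# Back-of-the-envelope cloud budget for a 4x A100 80GB instance.
budget_usd = 40_000
hourly_rate_usd = 13.3   # assumed on-demand price per hour (not a quote)

hours = budget_usd / hourly_rate_usd
print(f"~{hours:.0f} hours of 4x A100 80GB on demand")   # ~3000 hours
```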

If that's not enough for you, then you will have to figure out how to build a server with RTX 6000 Ada cards (48 GB each). RTX 4090s would be cheaper, but there may be legal issues due to the GeForce driver license terms, you would need multiple servers or an aggressive power cap because of power draw, and Nvidia dropped P2P, which may or may not matter depending on how much communication you need between the GPUs (https://discuss.pytorch.org/t/ddp-training-on-rtx-4090-ada-cu118/168366)

3

ggf31416 t1_jac61sd wrote

2060 has 6GB of VRAM, right?

It should be possible to train with that amount https://huggingface.co/docs/transformers/perf_train_gpu_one#optimizer
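A minimal sketch of the memory-saving knobs that page covers (the model name and batch sizes here are placeholders; whether 6 GB is enough still depends on the model and sequence length):

```python
# Sketch: squeezing training into a small GPU with gradient accumulation,
# gradient checkpointing, mixed precision and a lighter optimizer
# (all covered in the HF performance docs linked above).
from transformers import AutoModelForSequenceClassification, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased")  # placeholder model

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,   # keep small on 6 GB
    gradient_accumulation_steps=4,   # simulate a larger effective batch
    gradient_checkpointing=True,     # trade compute for memory
    fp16=True,                       # mixed precision
    optim="adafactor",               # smaller optimizer state than AdamW
)
# Then pass `model` and `args` to a transformers.Trainer together with
# your dataset and call trainer.train().
```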

If you need to train from scratch (most people will just fine-tune), it will take a while: the original training took 90 hours on 8x V100s, each of which should be faster than your GPU. https://www.arxiv-vanity.com/papers/1910.01108/

2

ggf31416 t1_j9clwen wrote

I actually have a 3060 too. In theory a 3060 Ti should be up to 30% faster, but most of the time the 3060 is fast enough, and it's faster than any T4.

For generating a few images with Stable Diffusion the difference might be 15 vs. 20 seconds; for running Whisper on several hours of audio it could be 45 minutes vs. 1 hour. The difference will only matter if the model is optimized to fully use the GPU in the first place.

1

ggf31416 t1_j9a8p88 wrote

The 3070 and 3060 Ti both have 8 GB, and while the 3070 will be a bit faster, most people would agree the difference isn't worth the price if you have a tight budget.

For training, the extra 4 GB on the plain 3060 is quite useful, but for inference only you can run most small and medium models (such as Stable Diffusion) in 8 GB, and the 3060 Ti will be faster.

2

ggf31416 t1_j99y9e1 wrote

https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/

https://lambdalabs.com/gpu-benchmarks

How much VRAM you need will depend mostly on the number of parameters of the model, with some extra for the data. At FP32 precision each parameter needs 4 bytes, at FP16 or BF16 2 bytes, and at FP8 or INT8 only one byte. Almost all models can run at FP16 without noticeable accuracy loss; FP8 sometimes works and sometimes doesn't, depending on the model.
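A quick sketch of that rule of thumb (weights only; activations, gradients and optimizer state add more on top, especially for training):

```python
# Rough VRAM needed just to hold the model weights at different precisions.
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    return n_params * bytes_per_param / 1e9

n_params = 7e9  # example: a 7B-parameter model
for precision, nbytes in [("FP32", 4), ("FP16/BF16", 2), ("FP8/INT8", 1)]:
    print(f"{precision}: ~{weight_memory_gb(n_params, nbytes):.0f} GB")
# FP32: ~28 GB, FP16/BF16: ~14 GB, FP8/INT8: ~7 GB
```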

3

ggf31416 t1_j7waxlu wrote

It will depend on how much preprocessing and augmentation is needed. I don't think text needs much preprocessing or augmentation, but image classification or detection training, for example, has to create a different augmented image on each iteration and will benefit from a more powerful processor.
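To make that concrete, a typical image pipeline does something like this on the CPU for every sample on every epoch (torchvision shown as an example; the specific transforms are arbitrary):

```python
# Example of per-iteration CPU work in image training: each sample is
# re-augmented every time it is loaded, which is where a faster CPU
# (and more DataLoader workers) helps.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
# Typically passed to a Dataset and consumed by a DataLoader with
# num_workers > 0 so several CPU cores augment in parallel.
```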

Note that you can also use cloud services. If you aren't dealing with confidential data, vast.ai is often one of the cheapest; otherwise you can use Lambda Labs, Google Compute Engine, AWS or other services. At least in the case of Google Compute Engine and AWS you have to request access to GPU instances, which may take some time.

2

ggf31416 t1_j741sxn wrote

One possibility is GPU acceleration using the cuML framework, but if you must use a specific framework like sklearn it won't be feasible. https://medium.com/rapids-ai/accelerating-random-forests-up-to-45x-using-cuml-dfb782a31bea
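If switching frameworks is an option, cuML mirrors the sklearn API fairly closely; a minimal sketch with a synthetic dataset (hyperparameters are placeholders, and it requires RAPIDS plus an NVIDIA GPU):

```python
# Sketch: GPU-accelerated random forest with cuML's sklearn-like API.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from cuml.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=100_000, n_features=20, random_state=0)
X, y = X.astype(np.float32), y.astype(np.int32)   # cuML prefers 32-bit types
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, max_depth=16)
clf.fit(X_train, y_train)                # trains on the GPU
preds = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, preds))
```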

There are some alternatives such as Google, AWS, and Gradient, and you may be able to get student credits. Also, even if you don't need a GPU, you can rent an instance with many CPU cores at Vast.ai for cheap (even with the GPU it's cheaper than a CPU-only AWS instance with the same number of cores); for example, the cheapest instance with 16 vCPUs is < $0.20/hour and only needs a credit card. The main issue with vast.ai is that you should save your results before shutting down the instance, because they are tied to the machine, which may become unavailable.

1

ggf31416 t1_j0ypnpp wrote

Training a large model on CPU only is madness: it will take forever and waste a lot of electricity. You need a GPU with CUDA or an equivalent solution fully supported by your framework. See e.g. this benchmark.
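A quick way to confirm your framework actually sees a CUDA GPU before committing to a long run (PyTorch shown as an example):

```python
# Quick check that PyTorch sees a CUDA-capable GPU.
import torch

if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    device = torch.device("cuda")
else:
    print("No CUDA GPU found; training will fall back to CPU (slow).")
    device = torch.device("cpu")
```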

A t2.micro instance may be free during the free trial but is useless for anything resource intensive. You are much better off just using Google Colab or Kaggle notebooks.

If you have to train models very often (like every day) and the 24 GB of an RTX 3090, or better an RTX 4090, is enough, a dedicated computer is the most cost-effective option in the long run. If you can't afford an RTX 3090 and 12 GB is enough, a 3060 with 12 GB will do (for ML we usually want as much VRAM as possible; raw computing power is often not the bottleneck).

Vast.ai is a cost-effective way of renting computing power for non-constant use, much cheaper than AWS or GCP, but beware that, because of how it works, the instance is not fully secure against attacks from the host, so you can't use it with sensitive data.

Any good CUDA GPU will be able to train on a small dataset in less than a day, so take that into account when deciding between purchasing a GPU and cloud computing.

7

ggf31416 t1_iyfacbd wrote

Do you want something that will run on battery, or something powerful that won't last long on battery?

The ones you mentioned belong to the first category. If you want something powerful you will need a GPU, and one from Nvidia will make things much easier than one from AMD.

1