entropyvsenergy t1_iyd6mw0 wrote

Transformers do well with lots of data because the transformer is an extremely flexible and generic architecture. In a fully connected neural network, each input is mapped to the next layer through a weight matrix that is fixed with respect to any particular input. Transformers instead use attention blocks, where the "effective" weight matrices are computed on the fly by the attention operation from query, key, and value vectors, and therefore depend on the inputs themselves. The upshot is that a transformer needs a lot of training data before it outperforms less flexible architectures such as LSTMs or fully connected networks.
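To make the "input-dependent weights" point concrete, here is a minimal sketch of scaled dot-product attention in PyTorch. The shapes and variable names are just illustrative, not from any particular codebase:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(x, W_q, W_k, W_v):
    # Project the input into query, key, and value vectors.
    Q = x @ W_q   # (seq_len, d_k)
    K = x @ W_k   # (seq_len, d_k)
    V = x @ W_v   # (seq_len, d_v)

    # The attention weights are computed from Q and K, so they change
    # whenever the input x changes -- unlike a fixed weight matrix.
    scores = Q @ K.transpose(-2, -1) / K.shape[-1] ** 0.5
    attn = F.softmax(scores, dim=-1)   # (seq_len, seq_len)

    # The output is an input-dependent mixture of the value vectors.
    return attn @ V

# Toy example: 4 tokens with embedding dimension 8.
torch.manual_seed(0)
x = torch.randn(4, 8)
W_q, W_k, W_v = (torch.randn(8, 8) for _ in range(3))
out = scaled_dot_product_attention(x, W_q, W_k, W_v)
print(out.shape)  # torch.Size([4, 8])
```

Note that W_q, W_k, and W_v are still learned, fixed parameters; it's the attention matrix built from them that varies with the input, which is where the extra flexibility (and the extra data requirement) comes from.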

1

entropyvsenergy t1_iw58dge wrote

It's all frameworks now, some better than others. I haven't written one from scratch outside of demos or interviews in years. That said, I've modified neural networks a whole bunch. Usually you can just tweak parameters in a config file, but sometimes you want additional outputs or to fundamentally change the model in some way; even then it's usually a minor tweak codewise, as in the sketch below.
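As a hypothetical example of the "additional outputs" kind of tweak, here is roughly what bolting a second output head onto an existing PyTorch model might look like. The backbone, class name, and dimensions are made up for illustration:

```python
import torch
import torch.nn as nn

class BackboneWithExtraHead(nn.Module):
    """Wraps an existing model and adds a second output head."""

    def __init__(self, backbone: nn.Module, feature_dim: int, num_aux_outputs: int):
        super().__init__()
        self.backbone = backbone  # the original, possibly pretrained, model
        self.aux_head = nn.Linear(feature_dim, num_aux_outputs)  # the new output

    def forward(self, x):
        features = self.backbone(x)               # original forward pass
        return features, self.aux_head(features)  # original + extra output

# Hypothetical usage: a small MLP backbone with a 3-way auxiliary head.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
model = BackboneWithExtraHead(backbone, feature_dim=32, num_aux_outputs=3)
main_out, aux_out = model(torch.randn(2, 16))
print(main_out.shape, aux_out.shape)  # torch.Size([2, 32]) torch.Size([2, 3])
```

The config-file case is even less code: you change a hyperparameter like a hidden size or number of layers and the framework rebuilds the model for you.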

16