Submitted by OkAssociation8879 t3_10nmaoh in MachineLearning
[removed]
I mean, it is kind of a very basic question and it takes like 15 minutes at most if you understand what you are doing. It is similar to leetcode-style questions for SE, it is not something that you will do on the job, but if you are smart, you will pass easily, and if you are not, you will struggle - so a great interview task
It would definitely be an easy question if it were a common one and hence featured on leetcode, where candidates could practice it before the interview.
Someone with 2 years of experience doesn't remember the nitty-gritty maths needed to implement an NN from scratch. This question is more suited to someone fresh out of college, in my opinion.
>It would definitely be an easy question if it were a common one and hence featured on leetcode, where candidates could practice it before the interview.
I mean, if it was on leetcode, it wouldn't make sense to ask it in the interview, because then you will get prepared answers.
>Someone with 2 years of experience doesn't remember the nitty-gritty maths needed to implement an NN from scratch
If you cannot apply chain rule, your math is very weak. If your math is very weak, you probably won't be a great ML engineer. It's not that you need a lot of math, but you need a broad general understanding of what can work and what can't quite often, actually.
Would be interested to know what kind of experience you have?
I think literally none of the ML engineers I work with (myself very much included) could pull a chain rule implementation out in 10 mins.
90% of this job is just finding an existing implementation of something to make work.
You are right. Interviewers generally ask about backpropagation, and they should definitely test me on neural network concepts. But don't you think that asking me to code an entire neural network from scratch was overkill for an interview?
The human brain doesn’t work like this. It’s not a question about “being smart” or simply having learned something previously. In order to perform an implementation of this on the spot in a stressful situation, the relevant theory needs to be very fresh in your memory.
I highly doubt you would be able to reproduce a proof of the Fundamental Theorem of Algebra on the spot, even though it’s a simple concept that many people learn in middle school.
I would probably fail this question because I haven’t worked with deep learning much since I graduated 4 years ago. I majored in math at an Ivy League school and graduated with a pretty good GPA, so I don’t think my math is ‘weak’, either.
This kind of question does not make sense to ask on a live call unless someone claims to be working with deep learning architectures as part of their daily work.
Oh, a fellow mathematician. Look, I graduated from Cambridge 6 years ago, but I could still prove the fundamental theorem of algebra analytically or with Galois theory (I still remember the general ideas of both proofs I think), so I guess it depends on a person. But FTA is also a much more complicated thing to prove than the chain rule, and you don't even need to prove it to know how to use it. And sorry, if you don't remember how to differentiate multivariable functions, then you are an extraordinarily lousy mathematician. And if you know how to differentiate multivariable functions and if you are smart, you should be able to quickly come up with an implementation for backprop even if you don't remember anything else
I never said I didn’t remember how to differentiate multivariate functions - my point was that equating conceptual mathematical knowledge and the ability to implement a specific application of such concepts in a time-constrained and stressful situation is inappropriate.
A lot of things need to come together in answering a question like this - remembering that the chain rule is the key concept in backprop in the first place, knowledge of how to implement matrix algebra in code, knowing the commonly-used loss functions, how to compute their derivatives, and how to represent the differentiation in code, etc. None of these things is complicated on its own; the difficulty arises in bringing everything together in a small amount of time. It’s fair to expect people in the field to intuitively remember what is going on, but an on-the-spot implementation in under 30 minutes requires a level of rigor that is unrealistic for even a competent person who does not have the theory fresh in their memory.
You keep using the term ‘smart’ and I don’t know what you mean by this. Your last statement is just an assertion without argument, one you’ve repeated throughout your comments but I see no reason to believe, given the above.
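For concreteness, here is how the pieces listed above fit together in one small case. This is only a sketch under assumptions that are not from the thread: a single hidden layer with tanh activation, a sigmoid output, binary cross-entropy loss, and plain gradient descent with learning rate $\eta$. The symbols $W_1, b_1, w_2, b_2$ are illustrative names.

$$
\begin{aligned}
z_1 &= W_1 x + b_1, \qquad h = \tanh(z_1), \qquad z_2 = w_2^\top h + b_2, \qquad \hat{y} = \sigma(z_2),\\
L &= -\bigl[\,y \log \hat{y} + (1 - y)\log(1 - \hat{y})\,\bigr],\\
\frac{\partial L}{\partial z_2} &= \hat{y} - y, \qquad
\frac{\partial L}{\partial w_2} = (\hat{y} - y)\,h, \qquad
\frac{\partial L}{\partial b_2} = \hat{y} - y,\\
\frac{\partial L}{\partial z_1} &= (\hat{y} - y)\, w_2 \odot (1 - h \odot h), \qquad
\frac{\partial L}{\partial W_1} = \frac{\partial L}{\partial z_1}\, x^\top, \qquad
\frac{\partial L}{\partial b_1} = \frac{\partial L}{\partial z_1},\\
\theta &\leftarrow \theta - \eta\, \frac{\partial L}{\partial \theta} \quad \text{for each parameter } \theta \in \{W_1, b_1, w_2, b_2\}.
\end{aligned}
$$

Each line is a direct application of the chain rule; the work in an interview setting is mostly in getting the shapes and signs right when turning this into code.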
I think oftentimes with these absurdly complicated interview questions, they’re less interested in the final answer and more interested in whether you have the correct knowledge and problem-solving skills to work through how you would attempt it. Oftentimes with highly competitive positions they’re splitting hairs for the best candidate, and the most extreme questions can nudge their decision one way or another when on paper candidates are fairly equivalent. Super stressful to be put on the spot like that nonetheless.
What does that even mean? Just the matrix multiplication of a perceptron or including the back propagation of an optimizer?
Everything. Loss function, optimizer, backpropagation, update weights
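To give a sense of scale, here is a minimal sketch of what "everything" could look like. It is not the interviewer's expected answer, just one possible shape, and every choice in it is an assumption: pure Python with no libraries, one tanh hidden layer, a sigmoid output with binary cross-entropy, per-sample SGD, and XOR as toy data. All function names (`init_params`, `forward`, `backward`, `sgd_step`, `train`) are made up for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    # Small random weights, zero biases (an arbitrary but common choice).
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(p, x):
    # Hidden layer: h_j = tanh(sum_i W1[j][i] * x[i] + b1[j])
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["W1"], p["b1"])]
    # Output: y_hat = sigmoid(W2 . h + b2)
    y_hat = sigmoid(sum(w * hj for w, hj in zip(p["W2"], h)) + p["b2"])
    return h, y_hat


def loss(y_hat, y):
    # Binary cross-entropy; eps guards against log(0).
    eps = 1e-12
    return -(y * math.log(y_hat + eps) + (1 - y) * math.log(1 - y_hat + eps))


def backward(p, x, y):
    # Backpropagation for a single example via the chain rule.
    h, y_hat = forward(p, x)
    dz2 = y_hat - y  # dL/dz2 for sigmoid output + cross-entropy
    # Through tanh: dL/dz1_j = dz2 * W2[j] * (1 - h_j^2)
    dz1 = [dz2 * w * (1 - hj * hj) for w, hj in zip(p["W2"], h)]
    return {
        "W1": [[d * xi for xi in x] for d in dz1],
        "b1": dz1,
        "W2": [dz2 * hj for hj in h],
        "b2": dz2,
    }


def sgd_step(p, g, lr):
    # Optimizer: plain gradient descent, theta <- theta - lr * grad.
    p["W1"] = [[w - lr * gw for w, gw in zip(row, grow)]
               for row, grow in zip(p["W1"], g["W1"])]
    p["b1"] = [b - lr * gb for b, gb in zip(p["b1"], g["b1"])]
    p["W2"] = [w - lr * gw for w, gw in zip(p["W2"], g["W2"])]
    p["b2"] = p["b2"] - lr * g["b2"]


def mean_loss(p, data):
    return sum(loss(forward(p, x)[1], y) for x, y in data) / len(data)


def train(p, data, lr=0.5, epochs=2000):
    # Per-sample SGD over the whole dataset, repeated for `epochs` passes.
    for _ in range(epochs):
        for x, y in data:
            sgd_step(p, backward(p, x, y), lr)
    return p


# Toy dataset (assumption): XOR on two binary inputs.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
```

Something like `p = train(init_params(2, 4), XOR)` followed by `forward(p, x)[1]` for each input would exercise it end to end. Whether it actually fits XOR depends on the seed and learning rate, which is part of why getting a from-scratch network to "optimize a function" live is harder than writing the update equations.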
Stupid question. There is no time to code a whole NN from scratch that can actually optimize a function.
The interviewer should ask you to use PyTorch/TF to code an NN skeleton instead.
I don't think it is a great question generally, but this also depends on the position and on the level of technical aptitude and seniority expected.
It is not something I would expect most people to rock up with in 15 minutes while someone is looking over their shoulder. It is probably a question that would help me distinguish a kick-ass junior from someone who only has a standard understanding of Keras syntax, but for more senior roles it is likely a bad indicator. Far fewer senior engineering tasks fail because the person couldn't code backprop from scratch than because the wrong architecture was chosen, or they didn't know what part of an existing pipeline to optimise, or where to look for potential bugs given a particular unexpected behaviour, etc.
In general, I think it was more a point of "showing your thought process" than actually getting the code right. I would "abstract" things quite a bit first and then "start coding". But as others said, if they absolutely need to "split hairs" that is a way to do it too.
Best of luck with your interview in any case!
Just have chatGPT open at a side monitor and type in the prompt on a silent keyboard.
marcingrzegzhik t1_j69k6qd wrote
It's definitely a valid interview question, but it's not something you should be asked to do during a live call. It's too much to tackle in the limited time of a call and it's not a fair way to assess your skills. I would suggest asking to review a code sample you've written in the past that demonstrates your knowledge and experience. That would be a better way to assess your skills and it would be much less stressful. Good luck!