Submitted by SimonJDPrince t3_10jlq1q in MachineLearning

I've been writing a new textbook on deep learning for publication by MIT Press late this year. The current draft is at:

https://udlbook.github.io/udlbook/

It contains a lot more detail than most similar textbooks and will likely be useful for all practitioners, people learning about this subject, and anyone teaching it. It's (supposed to be) fairly easy to read and has hundreds of new visualizations.

Most recently, I've added a section on generative models, including chapters on GANs, VAEs, normalizing flows, and diffusion models.

Looking for feedback from the community.

  • If you are an expert, then what is missing?
  • If you are a beginner, then what did you find hard to understand?
  • If you are teaching this, then what can I add to support your course better?

Plus of course any typos or mistakes. It's kind of hard to proofread your own 500-page book!

310

Comments

Philpax t1_j5lqko9 wrote

Awesome! I'll add it to my reading list :)

7

Own_Quality_5321 t1_j5lt463 wrote

Nice. I will have a look and possibly recommend it. Thanks for sharing; that must have been a huge amount of work.

2

promiise t1_j5lu8wl wrote

Nice, thanks for sharing your hard work!

2

taleofbenji t1_j5lw5ie wrote

I love your book and refer to it often. I keep hitting F5 for Chapter 19. :-)

3

aristotle137 t1_j5m25xi wrote

Btw, I absolutely loved your computer vision textbook: clear, comprehensible, and so much fun! Best visualizations in the biz. Also loved your UCL course on the subject, I was there in 2010/2011 -- will definitely check out the next book.

20

arsenyinfo t1_j5m33oc wrote

As a practitioner, I am surprised to see no chapter on finetuning

79

Nhabls t1_j5m4lxa wrote

Obviously I haven't had the time to read through it, and this is a clear nitpick, but I really don't like when sites like this force you to download the files rather than display them in the browser by default.

−7

Comfortable_End5976 t1_j5m4w76 wrote

Having a skim, it looks good, mate. I like your writing style. Please let us know once it's published and we can pick up a physical copy.

10

NihonNoRyu t1_j5md892 wrote

Will you add a section on the forward-forward algorithm?

−1

PabloEs_ t1_j5mdolq wrote

Looks good and fills a gap; imo there is no good DL book out there. What could be better: state results more clearly as theorems with all the needed assumptions.

2

K_is_for_Karma t1_j5mh00k wrote

How recent is your chapter on generative models? I'm starting to pursue research in the area and need to get up to date.

3

profjonathanbriggs t1_j5mljdr wrote

Added to my reading stack. Thanks for this. Will revert with comments.

2

sweetlou357 t1_j5mtytp wrote

Wow this looks like an amazing resource!

2

like_a_tensor t1_j5n7akg wrote

Very nice work! Do you plan to release any solutions to the problems?

1

bacocololo t1_j5ng5fs wrote

Will be pleased to look at it, especially the figures explaining algorithms. Thanks!

1

libai123456 t1_j5nkizf wrote

I really like the book: it provides many beautiful pictures and gives many intuitions behind deep learning algorithms. I really appreciate the work you have done on this book.

1

SimonJDPrince OP t1_j5o0gbn wrote

Yeah -- I feel a bit bad about that, but as someone else pointed out, the title is not actually the same. I should put a link to this book on my website though, so anyone looking for this book can find it.

4

NoRexTreX t1_j5o9hd8 wrote

I can't give example material beyond the Hugging Face documentation, but leveraging pre-trained models is the big thing right now, so if your book doesn't mention it then it's missing the hippest thing. Also AdapterHub.
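
For instance, a minimal sketch of what this looks like in practice using the Hugging Face transformers library (the checkpoint name here is just an illustrative example, not a recommendation from the book):

    # Load a pretrained checkpoint and reuse its learned representations.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Deep learning is fun.", return_tensors="pt")
    outputs = model(**inputs)  # features from the pretrained encoder
    print(outputs.last_hidden_state.shape)  # (batch, tokens, hidden dim)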

5

bacocololo t1_j5obxnl wrote

On page 41, just near Problem 3.9, you write "the" twice. Do you need this type of comment too?

2

SimonJDPrince OP t1_j5ocrdo wrote

I'd say that mine is more internally consistent -- all the notation is consistent across all equations and figures. I have made 275 new figures, whereas he has curated existing figures from papers. Mine is more in-depth on the topics that it covers (only deep learning), but his has much greater breadth. His is more of a reference work, whereas mine is intended mainly for people learning this for the first time. Full credit to Kevin Murphy -- writing a book is much more work than people think, and so completing that monster is quite an achievement.

Thanks for the tip about Hacker News -- that's a good idea.

3

TheMachineTookShape t1_j5ofex4 wrote

What is the most efficient way for someone to tell you about typos, or provide suggestions? I'll try to have a read over the weekend.

2

bythenumbers10 t1_j5pa9iz wrote

When to reach for deep learning over older, simpler methods. Just an executive summary to keep folks from sandblasting soda crackers, or being forced to.

1

AdFew4357 t1_j5pqkqb wrote

I have one minor gripe about deep learning textbooks. I think they are great references, but they should not be used as a way for beginners to get into the field. I genuinely feel that time is better spent having the student go down a rabbit hole of actual papers on the topic of one of the chapters: say, a student reads the chapter on graph neural networks and then proceeds to read everything on graph neural networks, rather than reading the whole book across its different subsections.

1

NeoKov t1_j5wmjkr wrote

As a novice, I'm not understanding why the test loss continues to increase -- in general, but also in Fig. 8.2b -- if anyone can explain. Does the model continue to update and (over)fit throughout testing? I thought it was static after training. And is the testing batch always the same size as the training batch? And they don't occur simultaneously, right? So the test plot is only generated after the training plot.

1

SimonJDPrince OP t1_j5yc4n2 wrote

You are correct -- they don't usually occur simultaneously. Usually, you would train and then test afterwards, but I've shown the test performance as a function of the number of training iterations, just so you can see what happens with generalization.

(Sometimes people do examine curves like this using validation data, though, so they can see when the best time to stop training is.)

The test loss goes back up because the model classifies some of the test examples wrong. With more training iterations, it becomes more certain about its answers (e.g., it pushes the likelihood of its chosen class from 0.9 to 0.99 to 0.999, etc.). For the training data, where everything is classified correctly, that makes the data more likely and decreases the loss. For the cases in the test data that are classified wrong, it makes them less likely, and so the loss starts to go back up.
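
To make that concrete, here is a minimal numerical sketch (the probabilities are illustrative, not taken from the book):

    import numpy as np

    # Cross-entropy loss for a single example is -log p(true class).
    def ce_loss(p_true):
        return -np.log(p_true)

    # Correctly classified training example: confidence in the true class
    # rises with further training, so its loss shrinks toward zero.
    for p in [0.9, 0.99, 0.999]:
        print(f"correct: p(true)={p:<6} loss={ce_loss(p):.4f}")

    # Misclassified test example: confidence in the *wrong* class rises,
    # so p(true class) falls and its loss grows without bound.
    for p in [0.1, 0.01, 0.001]:
        print(f"wrong:   p(true)={p:<6} loss={ce_loss(p):.4f}")

Averaged over the test set, the growing loss on the misclassified examples eventually outweighs the shrinking loss on the correctly classified ones, which is why the curve turns upward.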

Hope this helps. I will try to clarify in the book. It's always helpful to learn where people are getting confused.

1

NeoKov t1_j60xeip wrote

Fig. 8.5 mentions a “brown line” for b), but the line appears to be black.

1

SimonJDPrince OP t1_j648umm wrote

Thanks! Definitely a mistake. If you send your real name to the e-mail address on the website, I'll add you to the acknowledgements in the book.

Let me know if you find any more.

1

LornartheBreton t1_j6i180q wrote

Please let us know when it's published so I can tell my university to buy some copies for its library!

2