Reviewed in the United States on September 11, 2012
(Disclaimer: I have worked with a draft of the book and been allowed to use the instructor's review copy for this review. I have bought the book from Amazon.co.uk, but apparently this Amazon.com review can't be tagged "verified purchase". I don't receive any compensation whatsoever for writing this review. I hope it will help you chose a machine learning textbook.)
Similar textbooks on statistical/probabilistic machine learning (links to book websites, not Amazon pages):
- Barber's Bayesian Reasoning and Machine Learning ("BRML", Cambridge University Press 2012)
- Koller and Friedman's Probabilistic Graphical Models ("PGM", MIT Press 2009)
- Bishop's Pattern Recognition and Machine Learning ("PRML", Springer 2006)
- MacKay's Information Theory, Inference and Learning Algorithms ("ITILA", CUP 2003)
- Hastie, Tibshirani and Friedman's Elements of Statistical Learning ("ESL", Springer 2009)
* Perspective: My perspective is that of a machine learning researcher and student, who has used these books for reference and study, but not as classroom textbooks.
* Audience/prerequisites: they are comparable among all the textbooks mentioned. BRML has lower expected commitment and specialization, PGM requires more scrupulous reading. The books differ in their topics and disciplinary approach, some more statistical (ESL), some more Bayesian (PRML, ITILA), some focused on graphical models (PGM, BRML). K Murphy compares MLAPP to others here. For detailed coverage comparison, read the table of contents on the book websites.
* Main strength: MLAPP stands out for covering more advanced and current research topics: there is a full chapter on Latent Dirichlet Allocation, learning to rank, L1 regularization, deep networks; in the basics, the decision theory part is quite thorough (e.g. will mention Jeffrey's/uninformative priors). The book is "open" and vivid, doesn't shy away from current research and advanced concepts. This seems to be purposeful, as it shows in many aspects:
- quotes liberally from web sources, something usually not done in academic publications
- borrows "the best" from other authors (always with permission and acknowledgment, of course): most importantly the best pictures and diagrams, but also tables, recaps, insightful diagrams. Whereas other books will produce their own pictures and diagrams themselves (eg, PRML has a distinctive clarity and style in its illustrations), MLAPP takes many of its colour illustrations from other people's publications; therefore it can select the most pithy and relevant pictures to make a point. You could think that reproductions may be illegible and require extra effort to interpret because they come from a variety of sources; I have found that the bonus coming from having precisely the right image prevails.
- frequent references to the literature, mentions of extensions and open questions, as well as computational complexity considerations: for instance, the section on HMMs will mention duration modeling and variable-duration Markov models, and a comparison of the expressive power of hierarchical HMMs versus stochastic context-free grammars, complete with relevant citations, and a brief mention of the computational complexity results from the publications. All this connects the material with research and new ideas in a fine way -- which other textbooks don't achieve, I find. For instance, PGM defers references to a literature section at the end of each chapter, resulting in a more self-contained, but more poorly "linked" text.
* Didactic aids: Another distinctive feature is that the author clearly has tried to include didactic aids gathered over the years, such as recaps, comparative tables, diagrams, much in the spirit of the "generative model of generative models" (Roweis and Ghahramani): e.g. table comparing all models discussed, pros and cons of generative vs. discriminative models, recap of operations on HMMs (smoothing, filtering etc), list of parameter estimation methods for CRFs.
* Editorial features: Other editorial features worth mentioning are
- compared to others, helpful mentions of terminology, e.g. jargon, nomenclature, concept names, in bold throughout the text ("you could also devise a variant thus; this is called so-and-so")
- mathematical notation relatively clear and consistent, occasional obscurities. PGM stands out as excruciatingly precise on this aspect.
- boxes/layout: no "skill boxes" or "case study boxes" (PGM), not many roadmap/difficulty indications like ITILA or PGM, examples are present but woven into the text (not separated like PGM or BRML). Layout rather plain and homogeneous, much like PRML.
- sadly lacks list of figures and tables, but has index of code
* Complete accompanying material:
- interesting exercises (yet fewer than PRML, BRML, PGM); solutions, however, are only accessible to instructors (same with BRML, PGM), which in my experience makes them only half as useful for the self-learner. PRML and ITILA have some solutions online resp. in the book.
- accompanying Matlab/Octave source code, which I found more readily usable than BRML's. PGM and PRML have no accompanying source code, even though the toolkit distributed with Koller's online PGM class might qualify as one. I find accompanying code a truly useful tool for learning; there's nothing like trying to implement an algorithm, checking your implementation against a reference, having boilerplate/utility code for the parts of the algorithm you're not interested in re-implementing. Also, code may clarify an algorithm, even when presented in pseudo-code. By the way, MLAPP has rather few pseudo-code boxes (like BRML or PRML, while PGM is very good here).
- MLAPP is not freely available as a PDF (unlike BRML, closest topic-wise, ESL, or ITILA). This will no doubt reduce its diffusion. My own take on the underlying controversy is in favor of distributing the PDF: makes successful books widely popular and cited (think ITILA or Rasmussen and Williams' Gaussian Processes), increases the book's overall value, equips readers with a weightless copy to annotate with e-ink, or consult on the go. I believe PDF versions positively impact sales, too: impact neutral-to-positive to course textbook/university library sales, indifferent to sales in countries with widely different purchase power, positive to all other segments due to enormous diffusion/popularity.
The closest contender to this book I believe is BRML. Both are excellent textbooks and have accompanying source code.
BRML is more accessible, has a free PDF version, and a stronger focus on graphical models.
MLAPP has all the qualities of an excellent graduate textbook (unified presentation, valuable learning aids), and yet is unafraid of discussing detail points (e.g. omnipresent results on complexity), as well as advanced and research topics (LDA, L1 regularization).