The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (Springer, 2009, hardcover).
The authors are the real thing (Tibshirani is the one behind the LASSO regularization technique).
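Since the review name-drops the LASSO, here is a minimal sketch of what it does in practice, using scikit-learn rather than anything from the book itself; the data and parameter choices are my own illustration:

```python
# Hedged sketch: LASSO (L1-penalized least squares) zeroes out irrelevant
# coefficients. Synthetic data and alpha are illustrative choices, not ESL's.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two predictors carry signal; the other eight are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # sparse coefficient vector: most noise entries exactly 0
```

The L1 penalty is what makes coefficients land exactly at zero, which is why the LASSO doubles as a variable-selection method.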
Uses some mathematical statistics without the burdens of measure theory and avoids the obvious but complicated proofs.
I own two copies of this edition, one for the office, one for my house, and the authors generously provide the PDF for travelers like me.
The tragedy is that the material is well selected, and obviously essential for work in the field. It is just very poorly supported by off-handed, sketchy derivations that resemble inside jokes more than explanations.
The book brags of color illustrations. I would have preferred didactic coherence.
Download the free pdf from the Stanford site. If you find a companion text, only then buy the hard copy.
Fortunately, The Elements of Statistical Learning proves the exception. The text is full of the equations necessary to ground the methodology without burdening the reader with long proofs that would tax those of us employing these techniques in the business world.
The visual aspects of the text seem to have been designed with John Tukey or Edward Tufte in mind. Though their frequent use makes the book some seven hundred pages long, reading and comprehension are made much easier.
And, though it’s been almost ten years since the book was published, the techniques described remain, for the most part, at the cutting edge of data science.
I was told by some other analysts I know that this was their bible for data science. I was somewhat skeptical of this kind of hyperbole but was pleasantly surprised that the book matched these high expectations. If you have an undergraduate degree in a mathematically related discipline, The Elements of Statistical Learning will prove to be an invaluable reference to understand the rapidly advancing avalanche of data mining techniques.
I have since discovered textbooks that, although I haven't had the time to read them completely, in my opinion, serve as better resources for the material in this textbook, as well as prerequisite material.
For prerequisite material, I mentioned Plane Answers to Complex Questions by Christensen. Since I wrote my last review, I have also discovered A Primer on Linear Models by Monahan, as well as Foundations of Linear and Generalized Linear Models by Agresti. I've heard that the classic text on Linear Models by Searle is amazing, but have not had time to read it.
For alternatives to Elements of Statistical Learning, my #1 choices by far are the texts by Theodoridis, namely Machine Learning and Pattern Recognition. These texts are huge and give a very realistic idea of the background it would take to learn this material. For something that isn't quite as dense but could serve as an alternative to Elements of Statistical Learning, I prefer Izenman's Modern Multivariate Statistical Techniques, which covers a lot of the same material. It has its own problems as well, but at the very least, Izenman goes out of his way to define precisely, over a chapter or two, the notation that is used.
This textbook's statement that it "expect[s] that the reader will have had at least one elementary course in statistics, covering basic topics including linear regression" still bothers me to this day.
This book is a pedagogical disaster. It is essentially a bunch of topics lumped into one book, with unclear explanations and notation.
I am speaking particularly of the linear regression section. I took two linear models courses in a graduate-level statistics program, and I can say I finally understand the material in here. But the explanations are extremely poor. Note I didn't say merely poor, but beyond poor. The authors attest that they "expect that the reader will have had at least one elementary course in statistics, covering basic topics including linear regression." This seems to suggest that any student who has taken freshman-level/AP-level statistics can handle this material.
Yes, I agree with everyone: you need linear algebra and some mathematical maturity. But you need much more than that. You need a substantial background in linear models: to understand how to find a variance-covariance matrix or the expected value of a random vector, to understand why N - p - 1 is the degrees of freedom for the unbiased estimator of the variance, to understand how one defines differentiation with respect to the beta vector, and perhaps most importantly, to understand that the estimate of the response vector is H * y, with H denoting the projection matrix onto the column space of X (the hat matrix).
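The linear-model facts the reviewer lists (the hat matrix and the N - p - 1 degrees of freedom) can be checked numerically; a minimal NumPy sketch, with all variable names and data my own:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 50, 3                      # N observations, p predictors (plus intercept)
X = np.column_stack([np.ones(N), rng.normal(size=(N, p))])  # design matrix
beta = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta + rng.normal(size=N)  # noise variance 1

# Hat (projection) matrix H = X (X'X)^{-1} X'; y_hat = H y projects y onto col(X).
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = H @ y

# H is symmetric and idempotent, and trace(H) = p + 1, the rank of X.
assert np.allclose(H, H.T) and np.allclose(H @ H, H)
assert np.isclose(np.trace(H), p + 1)

# The unbiased variance estimate divides the residual sum of squares by N - p - 1.
sigma2_hat = np.sum((y - y_hat) ** 2) / (N - p - 1)
print(sigma2_hat)  # close to the true noise variance, up to sampling error
```

The idempotence of H is exactly why the residual degrees of freedom equal N minus trace(H).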
Some may argue, this book isn't about the "how" and that I'm arguing too much about the theory, and that this is more of an "applied" book. Simply put, no matter what math or stats book I read, if I can't figure out the results that come out of a book using what is in the text itself and whatever prerequisites are suggested by the text, then it's a crappy textbook. You can't read this book with an elementary statistics background. Furthermore, I don't know anyone with only an elementary statistics background who could understand taking derivatives with respect to a vector.
I don't recommend this book unless you have an excellent grasp of linear model theory. Introduction to Statistical Learning, contrary to popular belief, is not a good prerequisite for this book. I recommend reading Plane Answers to Complex Questions by Christensen, or any similar book, before reading this text.
In addition to the bound copy, I have the PDF from Hastie's site on my Kindle, but for a reference as good as this one, having the bound version is absolutely worth the money. When I'm reading a math text, I tend to keep one finger in a previous section, so I can refer back to definitions when I need to, and one finger in the index, which just isn't possible with the PDF / Kindle edition.
Top international reviews
My experience with the book so far is very positive. It contains highly relevant machine learning methods/tools that are extensively used in practice but that many other books, most notably PRML, don't touch upon or explain only briefly. Most notably: Support Vector Machines, Random Forests, and Ensemble Learning. Also, the structure of ESL has made a lot more sense to me compared to PRML; it wraps parts of the field into more easily digestible chunks, and therefore makes for a better reference than PRML (just compare the tables of contents). And, as the authors themselves point out, the book would rather the reader understand the intuition, the algorithms, and the cases in which they perform well or badly than the mathematical background/proofs behind them (don't worry, most of the proofs are still presented in ESL). In conclusion, if you can accept the skimming of proofs and some rigour in ESL, this book is excellent, and summarizes a large part of the field in such a way that even a mathematically mediocre computer scientist can somewhat grasp it and apply it to real-world problems. However, if you want the entire picture, you might want to read both ESL and PRML, which will give you some of those Bayesian goodies as well.
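The two headline methods this review singles out, support vector machines and random forests, can be tried in a few lines with scikit-learn; the synthetic dataset and hyperparameters below are my own illustration, not taken from ESL:

```python
# Hedged sketch: fit an RBF-kernel SVM and a random forest on synthetic data
# and compare held-out accuracy. Dataset and settings are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf").fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("SVM accuracy:   ", svm.score(X_te, y_te))
print("Forest accuracy:", forest.score(X_te, y_te))
```

The random forest is itself an instance of ensemble learning: it averages many decorrelated trees, the third topic the review highlights.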
Besides the formulas and tables there are many figures, in color. These figures give a very good idea of how all types of analyses work. After reading a lot of books in this field, one can say that this is the best.
It is the best book for an in-depth understanding of ML. A famous ML MOOC from IIT Madras teaches straight out of this book.
Request to the authors: please add code for the data and model visualisations and for some critical parts of the modelling.
It was delivered from India; be careful when buying so you don't make the same mistake. Although the book was well-meaningly wrapped in padded foil, it arrived badly creased at the corners, and the middle of the cover edge also feels heavily damaged. The pages were stuck together, and there were residues of something visible on the edges that also glued the pages together. The book felt damp when it came out of the packaging.
Written by the titans of the field, it is a comprehensive book that starts from the basics (in the first chapters, probably to introduce them as groundwork for the later treatment, the basic regression and classification techniques are described) and goes on to describe much more complex and advanced concepts, such as the various regularization techniques (Ridge, LASSO), the Benjamini-Hochberg method, SVMs, etc.
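Of the advanced topics this review lists, the Benjamini-Hochberg step-up procedure is compact enough to sketch directly; the implementation and example p-values below are my own illustration, not the book's code:

```python
# Hedged sketch: Benjamini-Hochberg controls the false discovery rate by
# rejecting the k smallest p-values, where k is the largest index with
# p_(k) <= (k/m) * alpha.
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of rejected hypotheses (FDR controlled at alpha)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = (np.arange(1, m + 1) / m) * alpha   # (k/m) * alpha for k = 1..m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])             # largest passing rank
        reject[order[: k + 1]] = True                # reject the k+1 smallest
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
print(benjamini_hochberg(pvals))  # rejects only the two smallest p-values
```

Note the step-up logic: a p-value above its own threshold can still be rejected if some larger-ranked p-value passes, which is what distinguishes BH from a simple per-test cutoff.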