Database Internals: A Deep Dive into How Distributed Data Systems Work 1st Edition
From the Publisher
From the Preface
Who is this book for?
In conversations at technical conferences, I often hear the same question: “How can I learn more about database internals? I don’t even know where to start.” Most of the books on database systems do not go into details of storage engine implementation, and cover the access methods, such as B-Trees, on a rather high level. There are very few books that cover more recent concepts, such as different B-Tree variants and log-structured storage, so I usually recommend reading papers.
Everyone who reads papers knows that it’s not that easy: you often lack context, the wording might be ambiguous, there’s little or no connection between papers, and they’re hard to find. This book contains concise summaries of important database systems concepts and can serve as a guide for those who’d like to dig in deeper, or as a cheat sheet for those already familiar with these concepts.
Not everyone wants to become a database developer, but this book will help people who build software that uses database systems: software developers, reliability engineers, architects, and engineering managers.
If your company depends on any infrastructure component, be it a database, a messaging queue, a container platform, or a task scheduler, you have to read the project change-logs and mailing lists to stay in touch with the community and be up-to-date with the most recent happenings in the project.
Understanding terminology and knowing what’s inside will enable you to yield more information from these sources and use your tools more productively to troubleshoot, identify, and avoid potential risks and bottlenecks. Having an overview and a general understanding of how database systems work will help in case something goes wrong. Using this knowledge, you’ll be able to form a hypothesis, validate it, find the root cause, and present it to other project maintainers.
This book is also for curious minds: for the people who like learning things without immediate necessity, those who spend their free time hacking on something fun, creating compilers, writing homegrown operating systems, text editors, computer games, learning programming languages, and absorbing new information.
The reader is assumed to have some experience with developing backend systems and working with database systems as a user. Having some prior knowledge of different data structures will help to digest material faster.
About the Author
Alex is a data infrastructure engineer, database and storage systems enthusiast, Apache Cassandra committer and PMC member, interested in storage, distributed systems and algorithms.
Database Internals is divided into two parts - the first deals with database storage. Especially good sections put a 9-cell flashlight on how many recent architectures are indeed built to tackle complexity bottom-up; e.g., LSM (log-structured merge) trees nicely complement the "write amplification" of solid-state disks. The discussion on the canonical B-tree and its multiple siblings (especially Bw-tree) is very well done. The functional difference between locks and latches would be enlightening even for experienced database practitioners - locks are used to manage transactions, latches to guard the *physical* storage representation.
The second half of the book, focusing on distributed systems, is more uneven in quality. It is, however, a great start: an economized discussion of about 50 "Best Papers" on leader election, failure/crash detection, replication, and how distributed-systems-friendly "consensus protocols" work better than atomic ones like 2-phase commit. In many ways, distributed systems have veered from monarchy (single, immutable leader deciding everything, including the next leader) to a true republic (leader is still almost omnipotent, but is regularly replaced by the constituents). The comparative analysis of Paxos, ZAB and Raft - with clear sequence diagrams - is very well done.
The quality of writing is good, though it could have been helped by more ruthless editing. The area covered is simply too broad, other than the intersection of SSDs and modern DB architecture, which is very deep and very good. Still, the book easily deserves at least 4 stars for the enthusiasm and for its good attempt to convey distributed systems pedagogy to general practitioners. Pair it with Martin Kleppmann's "Designing Data-Intensive Applications" and Ken Birman's "Guide to Reliable Distributed Systems".
The book covers a lot of core concepts. However, it's not deep enough. I would recommend this as an entry-level distributed systems/DB book where you can learn many, many concepts, but you have to google more to learn them in depth.
It's totally my personal opinion. If the book didn't put "Deep Dive" in the title, I would rate it 4 or 5 stars.
This is the only book I know of that has all of this information relevant to database design in one place. As someone who has read a lot of the resources listed in this book (there are a ton!), it’s nice to see all of this information condensed into a single book.
The book definitely has good parts to it, but the target audience is not clear. For a high-level understanding of the topic, "Designing Data-Intensive Applications" does the job so much better. For those looking for an implementation-level understanding, some chapters might be useful, but overall the book remains a high-level one.
I found this book to be very interesting for the technical detail. There were some concepts that I found difficult to grasp at first.
I won't go into details of what you get in each chapter, but this is more than what you find if you just google around, and to be fair, it does assume a certain level of experience with the subject.
Top international reviews
It's really hard to get an overview of the way databases work, given how diverse and, well, *big* they really are. Decades of practical experience don't mean one has a clear understanding of query processing, optimisation, storage subsystems, transaction processing, concurrency control, etc.
Sometimes, just sometimes, mortals get lucky and somebody writes a survey of a subfield, or an extended overview, of relevant problems. Best example I am aware of: the Red Book aka Readings in Database Systems. It's a vast survey of academic work on databases. But it's more of a collection of paper references than a linear reading.
Database Internals also feels a bit like an extended survey: numerous paper references, no code, mostly conceptual explanations. What stands out is its good linear narration, gradually coming up with definitions and clarifying explanations.
So, what this book is not: an introductory text, a textbook, a theory-centric volume or a practice-centric work.
What this book is: a survey of typical approaches to two major aspects of databases (local storage subsystems and problems of distributed systems). An interested reader will have to follow the references; a casual reader will get familiar with terminology and common concepts in a condensed way.
I would (and definitely will) recommend the book to people who have already worked with databases for at least a few years and are looking for additional insights or an overview of the field.
I found the book informative, but not very effective in building a solid understanding of concepts. I felt the author jumps from idea to (related) idea too frequently in the manner of short paragraphs, and in so doing doesn't see an idea through to the end in enough detail for it to be learned properly. Perhaps the first part was better presented; the second was not.
Whether relational or NoSQL, with a greater focus on relational.