Database Internals: A Deep Dive into How Distributed Data Systems Work 1st Edition



When it comes to choosing, using, and maintaining a database, understanding its internals is essential. But with so many distributed databases and tools available today, it’s often difficult to understand what each one offers and how they differ. With this practical guide, Alex Petrov guides developers through the concepts behind modern database and storage engine internals.

Throughout the book, you’ll explore relevant material gleaned from numerous books, papers, blog posts, and the source code of several open source databases. These resources are listed at the end of parts one and two. You’ll discover that the most significant distinctions among many modern databases reside in subsystems that determine how storage is organized and how data is distributed.

This book examines:

  • Storage engines: Explore storage classification and taxonomy, and dive into B-Tree-based and immutable Log Structured storage engines, with differences and use-cases for each
  • Storage building blocks: Learn how database files are organized to build efficient storage, using auxiliary data structures such as Page Cache, Buffer Pool and Write-Ahead Log
  • Distributed systems: Learn step-by-step how nodes and processes connect and build complex communication patterns
  • Database clusters: Which consistency models are commonly used by modern databases and how distributed storage systems achieve consistency
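As a rough illustration of one building block named in the list above, the write-ahead log, here is a minimal sketch. It is not taken from the book: the `TinyKV` class, the JSON-lines log format, and the file name are assumptions made for this example. It shows only the core idea that a change is appended to a durable log before it is applied in memory, so state can be rebuilt by replaying the log.

```python
import json
import os
import tempfile


class TinyKV:
    """Toy key-value store: every write is logged to disk before it is applied in memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        # Recovery: rebuild the in-memory state by re-applying logged writes in order.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Log first, force the record to stable storage, then apply the change.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def close(self):
        self.log.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyKV(path)
    store.put("a", 1)
    store.put("a", 2)
    store.close()  # simulate a restart: the in-memory dictionary is discarded
    recovered = TinyKV(path)
    state = dict(recovered.data)
    recovered.close()
    return state
```

Calling `demo()` writes two versions of the same key, reopens the store, and returns the state recovered from the log alone. Real engines add checkpoints, log truncation, and checksums, none of which are modeled here.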

From the Publisher

From the Preface

Who is this book for?

In conversations at technical conferences, I often hear the same question: “How can I learn more about database internals? I don’t even know where to start.” Most of the books on database systems do not go into details of storage engine implementation, and cover the access methods, such as B-Trees, on a rather high level. There are very few books that cover more recent concepts, such as different B-Tree variants and log-structured storage, so I usually recommend reading papers.

Everyone who reads papers knows that it’s not that easy: you often lack context, the wording might be ambiguous, there’s little or no connection between papers, and they’re hard to find. This book contains concise summaries of important database systems concepts and can serve as a guide for those who’d like to dig in deeper, or as a cheat sheet for those already familiar with these concepts.

Not everyone wants to become a database developer, but this book will help people who build software that uses database systems: software developers, reliability engineers, architects, and engineering managers.

If your company depends on any infrastructure component, be it a database, a messaging queue, a container platform, or a task scheduler, you have to read the project change-logs and mailing lists to stay in touch with the community and be up-to-date with the most recent happenings in the project.

Understanding terminology and knowing what’s inside will enable you to yield more information from these sources and use your tools more productively to troubleshoot, identify, and avoid potential risks and bottlenecks. Having an overview and a general understanding of how database systems work will help in case something goes wrong. Using this knowledge, you’ll be able to form a hypothesis, validate it, find the root cause, and present it to other project maintainers.

This book is also for curious minds: for the people who like learning things without immediate necessity, those who spend their free time hacking on something fun, creating compilers, writing homegrown operating systems, text editors, computer games, learning programming languages, and absorbing new information.

The reader is assumed to have some experience with developing backend systems and working with database systems as a user. Having some prior knowledge of different data structures will help to digest material faster.

Editorial Reviews

About the Author

Alex is a data infrastructure engineer, database and storage systems enthusiast, Apache Cassandra committer and PMC member, interested in storage, distributed systems and algorithms.

Customer reviews

4.6 out of 5 stars
543 global ratings

Customers say

Customers find the book to be a great resource for understanding database internals, covering many core concepts. However, the writing quality receives mixed feedback, with several customers noting that it is often poor.
AI Generated from the text of customer reviews

36 customers mention content, 28 positive, 8 negative
Customers find the book to be a great resource for understanding database internals, covering many core concepts and being one of the best texts on the subject.
...This is a must read for anyone who uses database; really helps explain how they work
Solid content on how databases are put together
Nice book to read if you want to understand how database systems work underneath....
Delivering on the title is a tall order. The book tries to cover a lot of ground. Some of the content could use better editing....
6 customers mention writing quality, 4 positive, 2 negative
Customers have mixed opinions about the writing quality of the book, with some finding it poor.
...Succinct, detailed, and quality.
While it does contain interesting and useful information, the writing is bland and uninspiring, and the author comes across as disinterested in the...
...The quality of writing is good, though could have been helped with more ruthless editing....
Very knowledgeable and detail oriented but concise at the same time

Top reviews from the United States

  • Reviewed in the United States on November 17, 2020
    Format: Kindle | Verified Purchase
    Can't believe I forgot to write a review for this one!

    Partly it's probably because I usually have less to say (or more precisely it's harder for me to be properly articulate) about things I like than I do about the ones I don't. And boy did I like Database Internals! I'll try my best to explain why, the book and the author surely deserve it.

    Being a back-end engineer, the main reason for picking this one up was to better understand the distributed databases that I may end up in (or have already had) contact with. With that in mind, I planned on just skimming the first part of the book but imagine my surprise when I found myself Googling BW and LSM trees and going through papers comparing this and that algorithm and their impacts on memory, storage and CPU caches in multicore systems. The geek got suckered in! With my curiosity circuits pleasantly warmed by the first part, I moved on to the second part of the book - the main dish - where a similar scenario unfolded: again I swallowed up whatever was served and ended up digging for more and adding scores of books and papers to my to-read list.

    All in all, Database Internals felt a lot like a trip to the zoo or a local museum: chock full of data structures and algorithms used by modern-day databases (and distributed systems in general), the book showcases each item in sufficient detail for you to grasp what it is about, and then provides enough bibliography and reference material to last you a lifetime... or at least a couple of years.
    14 people found this helpful
  • Reviewed in the United States on May 19, 2024
    Format: Paperback | Verified Purchase
    This book took me a few years to get through. It is much more low-level than something like DDIA. Having worked on database code for the past couple of years, that context was crucial in helping me understand the book. The book is great because it covers all of the important ideas in databases and talks about the tradeoffs of using various algorithms. Succinct, detailed, and quality.
    4 people found this helpful
  • Reviewed in the United States on November 3, 2019
    Format: Paperback | Verified Purchase
    This is one of the best texts covering database internals. Databases are used every day, and understanding what happens under the hood is a daunting task. This book takes a pragmatic approach to the topic, starting with the basics and then taking a deeper dive into how the basic data structures and concepts come together. IMHO, this book will appeal to both database developers and engineers who want to understand how databases work. It is a must-have for engineers who really want to get into database development, and a must-have general reference otherwise. I personally liked the book's attention to detail on what really matters when writing a real database. The concepts are equally applicable to SQL and NoSQL databases.
    5.0 out of 5 stars
    One of the Best Books out there
    11 people found this helpful
  • Reviewed in the United States on November 30, 2019
    Format: Paperback | Verified Purchase
    Mastery in systems abstraction comes through a philosophical pivot. While an enthusiastic beginner considers successful "use cases", an experienced traveler - through her implicit awareness of futility against entropy - often only considers failure and just tries her best. As more systems, and more of every system, are being dictated by the twin forces of economics and architectural modernism, a much higher percentage of design and development efforts in software should be dedicated to understanding fundamentals (CPU registers, branch prediction etc.) and essential complexities (multi-node consensus, replication failures etc.). This book is a good start.

    Database Internals is divided into two parts - the first deals with database storage. Especially good sections shine a spotlight on how many recent architectures are indeed built to tackle complexity bottom-up. For example, LSM (log-structured merge) trees nicely complement the "write amplification" of solid-state drives. The discussion of the canonical B-tree and its multiple siblings (especially the Bw-tree) is very well done. The functional difference between locks and latches would be enlightening even for experienced database practitioners - locks are used to manage transactions, latches to guard the *physical* storage representation.

    The second half of the book, which focuses on distributed systems, is more uneven in quality. It is, however, a great starting point: an economical discussion of about 50 "Best Papers" on leader election, failure/crash detection, and replication, and of how distributed-systems-friendly "consensus protocols" work better than atomic ones like 2-phase commit. In many ways, distributed systems have veered from monarchy (a single, immutable leader deciding everything, including the next leader) to a true republic (the leader is still almost omnipotent, but is regularly replaced by the constituents). The comparative analysis of Paxos, ZAB and Raft - with clear sequence diagrams - is very well done.

    The quality of writing is good, though it could have been helped by more ruthless editing. The area covered is simply too broad, other than the intersection of SSDs and modern DB architecture, which is very deep and very good. Still, the book easily deserves at least 4 stars for its enthusiasm and for its good attempt to convey distributed systems pedagogy to general practitioners. Pair it with Martin Kleppmann's "Designing Data Intensive Applications" and Ken Birman's "Guide to Reliable Distributed Systems".
    4.0 out of 5 stars
    Summarized Recent Overview of Storage & Distributed Systems
    81 people found this helpful
  • Reviewed in the United States on April 16, 2021
    Format: Paperback | Verified Purchase
    I've been looking for a book that covers these topics for a long time. Even just working with different databases on a day-to-day basis, it's incredibly helpful to understand how the components of each database actually work. Furthermore, the book spans a very wide array of topics and techniques that are incredibly handy for distributed systems. It's really hard to find this much information in a single book. Usually you'd have to know each of the topics you're interested in and buy an entire book on that topic. This book packs a pretty in-depth view of several topics related to database systems into one book without needless fluff.

    I highly recommend this book not only to people working on distributed data systems, but to anyone working with databases. This is one of the most frequently referenced books I own.
    14 people found this helpful
  • Reviewed in the United States on December 29, 2024
    Format: Paperback | Verified Purchase
    Strong recommend to anyone from a beginner to an expert
    One person found this helpful
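One of the reviews above notes that locks manage transactions while latches guard the physical storage representation. The following is a minimal sketch of that distinction, not code from the book: the `Page` and `LockManager` classes, the transaction ids, and the key name are all hypothetical. The latch is held only for the duration of a single in-memory update, whereas the logical lock belongs to a transaction and is kept until that transaction finishes.

```python
import threading


class Page:
    """In-memory page; the latch protects its physical layout during one operation."""

    def __init__(self):
        self.latch = threading.Lock()
        self.slots = {}

    def write(self, key, value):
        with self.latch:  # held only while this physical update runs
            self.slots[key] = value

    def read(self, key):
        with self.latch:
            return self.slots.get(key)


class LockManager:
    """Logical locks: owned by a transaction and kept until it finishes."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._owners = {}  # key -> id of the transaction holding the lock

    def acquire(self, txn_id, key):
        with self._mutex:
            owner = self._owners.setdefault(key, txn_id)
            return owner == txn_id  # False means another transaction holds the key

    def release_all(self, txn_id):
        with self._mutex:
            for key in [k for k, t in self._owners.items() if t == txn_id]:
                del self._owners[key]


def demo():
    page, locks = Page(), LockManager()
    results = []
    results.append(locks.acquire("T1", "row-7"))  # T1 takes the logical lock
    page.write("row-7", "v1")                     # latch is taken and released inside
    results.append(locks.acquire("T2", "row-7"))  # T2 is refused while T1 is active
    locks.release_all("T1")                       # T1 finishes and drops its locks
    results.append(locks.acquire("T2", "row-7"))  # now T2 can proceed
    results.append(page.read("row-7"))
    return results
```

In this sketch a refused lock request simply returns `False`; a real lock manager would typically queue the waiter, support shared and exclusive modes, and detect deadlocks.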

Top reviews from other countries

  • Vladimir Kazanov
    5.0 out of 5 stars A very good book for developers already working with databases and database-like systems
    Reviewed in the United Kingdom on February 28, 2020
    Format: Paperback | Verified Purchase
    There are two infinitely big and comparably old topics in software engineering: compilers and databases. Both have traditions and history, both are recognised as deep research topics, with developers and academics working on related problems for decades.

    It's really hard to get an overview of the way databases work, given how diverse and, well, *big* they really are. Decades of practical experience don't mean one has a clear understanding of query processing, optimisation, storage subsystems, transaction processing, concurrency control, etc.

    Sometimes, just sometimes, mortals get lucky and somebody writes a survey of a subfield, or an extended overview, of relevant problems. Best example I am aware of: the Red Book aka Readings in Database Systems. It's a vast survey of academic work on databases. But it's more of a collection of paper references than a linear reading.

    Database Internals also feels a bit like an extended survey: numerous paper references, no code, mostly conceptual explanations. What stands out is its good linear narration, gradually coming up with definitions and clarifying explanations.

    So, what this book is not: introductory text, a textbook, theory-centric volume or practise-centric work.

    What this book is: a survey of typical approaches to two major aspects of databases (local storage subsystems and problems of distributed systems). Interested reader will have to follow the references, casual reader will get familiar with terminology and common concepts in a condensed way.

    I would (and definitely will) recommend the book to people already working with databases for at least a few years looking for additional insights or an overview of the field.
  • Clément Grimault
    5.0 out of 5 stars Great book
    Reviewed in Spain on July 2, 2023
    Format: Paperback | Verified Purchase
    Amazing book, in my top 3 technical books. I learned a lot; it goes really deep and explains everything very well. I would suggest having at least a good understanding of database basics (indexes, distributed systems) before starting, though.
  • hailizhang
    5.0 out of 5 stars Great value with true knowledge!
    Reviewed in Canada on August 27, 2025
    Format: Paperback | Verified Purchase
    The book is really good, mint without any scratches. Love it!
  • Amazon Customer
    5.0 out of 5 stars Good read, and came in well wrapped
    Reviewed in India on February 12, 2026
    I read other reviews that said "this book is very technical". That made me think this book would include source code of database internals.
    But it is just theory.
    Because of that, I am a little disappointed.
    If you are an experienced programmer, you can read and absorb this book's content in a single day.

    It is a good book, otherwise.
  • Amazon Customer
    5.0 out of 5 stars Everything you need to know.
    Reviewed in Germany on December 20, 2025
    Format: Paperback | Verified Purchase
    A very good summary of the essential knowledge.