MongoDB Performance Optimization Strategies
Presentation outline
By Kevin Batt, Enterprise Account Manager, Enteros, Inc.
kevin.batt@enteros.com, 408-207-8408
MongoDB 2014-03-13 Enteros, Inc.
Overview
Before going deep into performance optimization, make sure that MongoDB is the right choice for your project: it is a non-relational, document-oriented database.
Map-Reduce
Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For map-reduce operations, MongoDB provides the mapReduce database command. Consider the map-reduce operation on the next slide:
Performance Optimization
Update to MongoDB 2.4 or later, which uses the V8 JavaScript engine and adds features such as security enhancements, text search (beta), and hashed indexes. The switch to V8 improves concurrency by permitting multiple JavaScript operations to run at the same time.
In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further condense or process the results of the aggregation.
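The phases described above can be illustrated with a minimal in-memory sketch in Python. This is not MongoDB's mapReduce command, only a model of the same flow (query, map, reduce, optional finalize). The `orders` documents, their field names, and the `map_reduce` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical input documents; field names are illustrative only.
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]


def map_reduce(docs, query, map_fn, reduce_fn, finalize_fn=None):
    """In-memory model of the query -> map -> reduce -> finalize flow."""
    # Map phase: apply map_fn to every document that matches the query
    # and group the emitted values by key.
    emitted = defaultdict(list)
    for doc in docs:
        if query(doc):
            for key, value in map_fn(doc):
                emitted[key].append(value)

    # Reduce phase: only keys with multiple values are reduced.
    results = {}
    for key, values in emitted.items():
        results[key] = reduce_fn(key, values) if len(values) > 1 else values[0]

    # Optional finalize phase: post-process each reduced value.
    if finalize_fn is not None:
        results = {key: finalize_fn(key, value) for key, value in results.items()}
    return results


totals = map_reduce(
    orders,
    query=lambda d: d["status"] == "A",
    map_fn=lambda d: [(d["cust_id"], d["amount"])],
    reduce_fn=lambda key, values: sum(values),
)
print(totals)  # {'A123': 750, 'B212': 200}
```

In the real database the map, reduce, and finalize functions are written in JavaScript and the results are stored in an output collection; the sketch simply returns a dictionary.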
1. Sharding
Sharding is a method for storing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations. Shard keys should satisfy the following:
• "Distributable": the worst case for a shard key is an auto-incremented value. It causes "hot shard" behavior, where all writes go to a single shard, and that shard becomes the bottleneck. An ideal shard key should be as random as possible.
• An ideal shard key should be the primary field used in your queries.
• An easily divisible shard key makes it easy for MongoDB to distribute content among the shards. Shard keys that have a limited number of possible values can result in chunks that are "unsplittable."
• Unique fields in your collection should be part of the shard key.
See the MongoDB documentation on shard keys.
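The "hot shard" effect of an auto-incremented key can be seen in a small simulation. This is a sketch under simplifying assumptions, not MongoDB's actual chunk placement: four shards, fixed range boundaries, and a hashed placement modelled as an MD5 hash taken modulo the shard count. The names `range_shard`, `hashed_shard`, and `boundaries` are hypothetical.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

NUM_SHARDS = 4


def range_shard(key, boundaries):
    """Range-based placement: pick the shard whose key range contains key."""
    return bisect_right(boundaries, key)


def hashed_shard(key, num_shards=NUM_SHARDS):
    """Simplified hashed placement: hash the key, then map it onto a shard."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % num_shards


# Assume existing keys 0..999 were split evenly into four ranges.
boundaries = [250, 500, 750]

# New documents get auto-incremented keys, all larger than any existing key.
new_keys = range(1000, 2000)

by_range = Counter(range_shard(k, boundaries) for k in new_keys)
by_hash = Counter(hashed_shard(k) for k in new_keys)

print("range: ", dict(by_range))  # range:  {3: 1000}
print("hashed:", dict(by_hash))
```

With range placement every new write lands on the last shard, while the hashed keys spread over all four. The trade-off is that a hashed key no longer keeps neighbouring values together, so range queries on that field touch more shards.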
2. Balancing
Bear in mind that moving chunks from one shard to another is a very expensive operation (adding new shards may significantly slow down performance). One helpful option is to stop the balancer during peak hours ("prime time").
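Deciding when the balancer may run comes down to a time-window check. The sketch below shows only that check, including a window that wraps past midnight; the 23:00 to 06:00 off-peak window and the `in_window` helper are assumptions for illustration. Actually stopping and starting the balancer (for example with the mongo shell helpers `sh.stopBalancer()` and `sh.startBalancer()`) is not shown.

```python
from datetime import time


def in_window(now, start, stop):
    """Return True if `now` is inside [start, stop).

    Handles windows that wrap past midnight (start later than stop).
    """
    if start <= stop:
        return start <= now < stop
    return now >= start or now < stop


# Hypothetical off-peak window in which chunk migrations are acceptable.
BALANCER_START = time(23, 0)
BALANCER_STOP = time(6, 0)

for t in (time(2, 30), time(12, 0), time(23, 15)):
    print(t, in_window(t, BALANCER_START, BALANCER_STOP))
# 02:30:00 True
# 12:00:00 False
# 23:15:00 True
```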
3. Disk Input/Output Operations
In most cases the hardware bottleneck is the disk (not CPU or RAM), especially if you have several shards. As the data grows, the number of I/O operations increases rapidly, so fast disks matter even more when you use sharding. Also keep monitoring free disk space.
4. Locks
MongoDB uses a readers-writer lock that allows concurrent read access to a database but gives exclusive access to a single write operation. When a read lock exists, many read operations may use this lock. However, when a write lock exists, a single write operation holds the lock exclusively, and no other read or write operations may share the lock. Locks are "writer greedy," which means writes have preference over reads. When both a read and a write are waiting for a lock, MongoDB grants the lock to the write.
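The "writer greedy" rule can be modelled with a small deterministic sketch. It is not MongoDB's lock implementation; it only shows the order in which a set of requests that are all waiting at the same moment would be admitted under the rules above. The `grant_order` helper and the request names are hypothetical.

```python
def grant_order(waiting):
    """Return the batches in which waiting requests get the lock.

    `waiting` is a list of (kind, name) tuples in arrival order, where
    kind is "R" (read) or "W" (write). Writers go first, one at a time,
    because a write lock is exclusive; the readers then share one lock.
    """
    writers = [req for req in waiting if req[0] == "W"]
    readers = [req for req in waiting if req[0] == "R"]

    batches = [[writer] for writer in writers]  # exclusive, one per batch
    if readers:
        batches.append(readers)  # all readers hold the lock together
    return batches


queue = [("R", "r1"), ("W", "w1"), ("R", "r2"), ("W", "w2")]
for batch in grant_order(queue):
    print([name for _, name in batch])
# ['w1']
# ['w2']
# ['r1', 'r2']
```

The practical consequence is that a steady stream of writes can delay reads, which is one reason to keep individual write operations short.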
5. Fast Writes
Use Capped Collections for Fast Writes
Capped collections are circular, fixed-size collections that keep documents well-ordered, even without the use of an index. This means that capped collections can receive very high-speed writes and sequential reads. These collections are particularly useful for keeping log files but are not limited to that purpose. Use capped collections where appropriate.
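The circular, insertion-ordered behavior can be sketched in plain Python with a bounded deque. This is only a model: a real capped collection is limited by size in bytes (optionally also by document count), whereas this hypothetical `CappedLog` class caps by document count alone.

```python
from collections import deque


class CappedLog:
    """Toy model of a capped collection: fixed capacity, oldest entries
    are overwritten first, and reads come back in insertion order."""

    def __init__(self, max_docs):
        # A deque with maxlen silently drops the oldest item when full.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self, newest_first=False):
        # No index is needed: the storage order is the insertion order.
        docs = list(self._docs)
        return docs[::-1] if newest_first else docs


log = CappedLog(max_docs=3)
for i in range(5):
    log.insert({"seq": i, "msg": f"event {i}"})

print([d["seq"] for d in log.find()])                   # [2, 3, 4]
print([d["seq"] for d in log.find(newest_first=True)])  # [4, 3, 2]
```

Writes are cheap because nothing is ever moved or reindexed; the cost is that old documents disappear once the capacity is reached.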
6. Fast Reads
Use Natural Order for Fast Reads
To return documents in the order they exist on disk, sort the results using the $natural operator. On a capped collection, this also returns the documents in the order in which they were written. Natural order does not use indexes but can be fast when you want to select the first or last items on disk.
7. Query Performance
Read up on query performance, paying particular attention to indexes and compound indexes.
9. The Size of the Database
As you might expect, MongoDB stores a document such as
{ UserFirstAndLastName: "Mikita Manko", LinkToUsersFacebookPage: "https://www.facebook.com/mikita.manko" }
"as-is". That is, the field names "UserFirstAndLastName" and "LinkToUsersFacebookPage" are stored in every document and take up space. By using the "name shortening" technique you can reduce storage and memory usage (you can get rid of something like 30-40% of unnecessary data):
{ FL: "Mikita Manko", BFL: "https://www.facebook.com/mikita.manko" }
Obviously, this requires a "mapper" in your code: you map the shortened, unreadable names from the database to long ones so that your code can still use readable field names.
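A minimal sketch of such a mapper in Python is shown here. The mapping table and the `to_storage` / `from_storage` helpers are hypothetical; the two field names come from the example document. The size comparison uses JSON text length, which is only a rough proxy, since MongoDB stores documents as BSON.

```python
import json

# Hypothetical mapping between the readable names used in code and the
# short names stored in the database.
LONG_TO_SHORT = {
    "UserFirstAndLastName": "FL",
    "LinkToUsersFacebookPage": "BFL",
}
SHORT_TO_LONG = {short: long for long, short in LONG_TO_SHORT.items()}


def to_storage(doc):
    """Rename readable field names to their short stored form."""
    return {LONG_TO_SHORT.get(key, key): value for key, value in doc.items()}


def from_storage(doc):
    """Rename short stored field names back to readable ones."""
    return {SHORT_TO_LONG.get(key, key): value for key, value in doc.items()}


readable = {
    "UserFirstAndLastName": "Mikita Manko",
    "LinkToUsersFacebookPage": "https://www.facebook.com/mikita.manko",
}
stored = to_storage(readable)
print(stored)

# Rough size comparison (JSON text length, not the BSON size on disk).
print(len(json.dumps(readable)), "->", len(json.dumps(stored)))
```

The saving depends on how long the field names are relative to the values, so it is worth measuring on real documents before committing to unreadable names.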
C. Updates
The most obvious point is to stay current: investigate and install the latest updates.
Enteros UpBeat High Load Capture
Database Root Cause and Spike Analysis for multi-tiered applications
Enteros UpBeat High Load Capture is a software framework for database problem root cause analysis of Oracle, DB2, SQL Server, MySQL, Sybase and MongoDB database-centric multi-tiered applications. The High Load Capture user interface visually correlates performance and system load metrics across multiple IT production infrastructure layers. With second-by-second granularity of data analysis, High Load Capture makes analysis possible for the most transient database performance spikes.
Features
• Multi-threaded, high-precision performance collection engine
• Extensible, dynamically configurable, centrally controlled collection agents
• Comprehensive library of collector agents
• Cross-tier correlation
• Safe, secure agent communication
• Load-sensitive collection controller
Enteros UpBeat High Load Capture
Supported Infrastructure, Database, Application Server, and OS Monitoring
• Database Server OS: Linux, Sun Solaris, HP/UX, AIX, Windows Server
• Client OS: Windows, Linux
• Database: Oracle, Microsoft SQL, IBM DB2, MySQL, Sybase, MongoDB
• Application Server: Oracle (BEA) WebLogic, Oracle OAS, JBOSS, IBM WAS
Enteros, Inc.
http://www.enteros.com
Enteros is an innovative software company specializing in Performance Management and Load Testing Software for Production Databases (RDBMS and NoSQL/Big Data). Enteros solutions enable IT professionals to identify and remediate performance problems in business-critical databases with unprecedented speed, accuracy and scope.
Kevin Batt; kevin.batt@enteros.com; 408-207-8408
