Avoiding Query Pitfalls @nleite
2 Agenda Common Query Pitfalls How to think about and how to solve them Tips and Tricks for Query Optimization
3 Howdy! Disclaimer Who's this guy? Norberto Leite Lead Engineer @nleite MongoDB
https://university.mongodb.com/
M201: MongoDB Performance
6 https://usatthebiglead.files.wordpress.com/2012/10/dez-drop-10-1-12.gif?w=1000 What's the story about query pitfalls?
7 The Power of Query Optimization Query tuning results in: ● Improved performance ● Reduced resource utilization This may lead to: ● Improved stability and predictability ● Smaller hardware footprint Not uncommon to observe efficiency improvements $gte 99%
Acme Games Introduces: ShortFite Tales of a release pre-launch testing
Getting Ready for Launch! MongoDB Certified Developers and DBA Stakeholders
Become a MongoDB Certified Developer / DBA
11 Stakeholder Concerns ● Game nearly complete ● Team starts to do capacity testing
12 Stakeholder Concerns ● Game nearly complete ● Team starts to do capacity testing Indexes support the efficient execution of queries in MongoDB
13 Stakeholder Concerns ● Game nearly complete ● Team starts to do capacity testing Ace Sue … … Bob Indexes support the efficient execution of queries in MongoDB
Stakeholder #1 Concern App being stress tested Concerns over current performance
Stakeholder #1 Concern Developers created index db.games.createIndex({ gamerTag: 1 }) This query takes several seconds to execute: db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) Adding the index on score does not help! db.games.createIndex({ score: -1 })
Stakeholder #1 Concern Developers created index db.games.createIndex({ gamerTag: 1 }) This query takes several seconds to execute: db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) Adding the index on score does not help! db.games.createIndex({ score: -1 }) “Clearly MongoDB is not webscale!”
Blocking Operations
18 Blocking Operation Formally: “An operation which must process all input before it can begin to produce any output.” Opposite of the often desirable “fully pipelined” plan which can stream results back as soon as they are found. Commonly observed when a sort is added to a query
Sorting with blocking
Sorting with blocking
Sorting with blocking
Sorting with blocking
Sorting with blocking
Sorting with blocking
Sorting with blocking
Sorting with blocking
Indexes that support sorting db.games.createIndex({ gamerTag: 1, score:-1})
Sorting without blocking
Sorting without blocking
Sorting without blocking
Sorting without blocking
32 Blocking Stages $sort In aggregation and find $group $bucket $count $facet Are there any other blocking operations?
Aggregation Pipeline Stages?
34 Working with blocking stages For sorting: Add a supporting index Worth the overhead in almost all circumstances For other stages: Do you need the blocking stage? Offload to secondary member
Stakeholder #1 Concern Performance of db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) “Clearly MongoDB is not webscale!”
Stakeholder #1 Concern Performance of db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) db.games.createIndex({ gamerTag: 1, score:-1 })
Stakeholder #1 Concern Performance of db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) db.games.createIndex({ gamerTag: 1, score:-1 }) "That’ll work great!”
Stakeholder #2 Concern The $and version of a query returns quickly: db.games.find({ $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) But the $or version is slow: db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Concern #2: Multiple queries same index The $and version of a query returns quickly: db.games.find({ $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) But the $or version is slow: db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) We just created an index with both those fields… Can it be used?
$or
$and example Query on games: db.games.find({ $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) Matching games: { gamerTag: "Ace", score: 9500 } Non-matching games: { gamerTag: "Ace", score: 500 }, { gamerTag: "Bob", score: 9500 }, { gamerTag: "Bob", score: 500 }
Groups of documents score: {$gt: 9000}gamerTag: "Ace" { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 }
gamerTag: "Ace" $and Venn Diagram (logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
$and Venn Diagram (logical) { gamerTag: "Bob", score: 500 } gamerTag: "Ace" { gamerTag: "Ace", score: 500 } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
gamerTag: "Ace" $and Venn Diagram (logical) score: {$gt: 9000} db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
gamerTag: "Ace" $and Venn Diagram (logical) score: {$gt: 9000} db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
$and Venn Diagram (logical) score: {$gt: 9000}gamerTag: "Ace" db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
$and Venn Diagram (logical) score: {$gt: 9000}gamerTag: "Ace" db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
$and Venn Diagram (logical) score: {$gt: 9000}gamerTag: "Ace" db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
Ace Bob {gamerTag:1 , score:-1} 500 9500 500 9500 $and Index Visualization db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
Ace Bob {gamerTag:1 , score:-1} 500 9500 500 9500 "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } $and Index Visualization db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
Ace Bob 500 9500 500 9500 {gamerTag:1 , score:-1} $and Index Visualization "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
Bob 500 9500 500 9500 Ace {gamerTag:1 , score:-1} $and Index Visualization "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
$and Index Visualization Bob 500 500 9500 Ace 9500 {gamerTag:1 , score:-1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
$or example Query on games: db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "Ace", score: 500 }, { gamerTag: "Bob", score: 9500 } Non-matching games: { gamerTag: "Bob", score: 500 }
gamerTag: "Ace" $or Venn Diagram (logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or Venn Diagram (logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or Venn Diagram (logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
score: {$gt: 9000} $or Venn Diagram (logical) { gamerTag: "Ace", score: 9500 } { gamerTag: "Bob", score: 500 } { gamerTag: "Ace", score: 500 } gamerTag: "Ace" { gamerTag: "Bob", score: 9500 } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
score: {$gt: 9000}gamerTag: "Ace" $or Venn Diagram (logical) { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
score: {$gt: 9000}gamerTag: "Ace" $or Venn Diagram (logical) db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
score: {$gt: 9000}gamerTag: "Ace" $or Venn Diagram (logical) db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
score: {$gt: 9000}gamerTag: "Ace" $or Venn Diagram (logical) db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
score: {$gt: 9000} $or Venn Diagram (logical) gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
score: {$gt: 9000} $or Venn Diagram (logical) gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (single) Index visualization Ace Bob {gamerTag:1 , score:-1} 500 9500 500 9500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Ace Bob {gamerTag:1 , score:-1} 500 9500 500 9500 Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} $or (single) Index visualization db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Ace Bob 500 9500 500 9500 {gamerTag:1 , score:-1} $or (single) Index visualization Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Bob 500 9500 500 9500 {gamerTag:1 , score:-1} Ace $or (single) Index visualization Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Bob 500 9500 {gamerTag:1 , score:-1} Ace 500 9500 $or (single) Index visualization Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
500 Bob 9500 $or (single) Index visualization {gamerTag:1 , score:-1} Ace 500 9500 Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
500 Bob 9500 Actual (Hinted) Index Bounds: "indexBounds" : { "gamerTag" : [ "[MinKey, MaxKey]" ], "score" : [ "[MaxKey, MinKey]" ]} $or (single) Index visualization {gamerTag:1 , score:-1} Ace 500 9500 Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Ace Bob 500 9500 500 9500 {gamerTag:1 , score:-1} So is there anything we can do to improve the performance of this query? $or (single) Index visualization Actual (Hinted) Index Bounds: "indexBounds" : { "gamerTag" : [ "[MinKey, MaxKey]" ], "score" : [ "[MaxKey, MinKey]" ]} Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Recommendations Use multiple indexes! db.data.createIndex({gamerTag: 1}) db.data.createIndex({score: 1})
$or (multiple) Index visualization Ace Bob {gamerTag:1 , score:-1} 500 9500 500 9500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace Bob db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace Bob {gamerTag:1} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace Bob {gamerTag:1} 500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace Bob {gamerTag:1} 500 9500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace Bob {gamerTag:1} 500 9500 {score:1} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace Bob {gamerTag:1} 500 9500 {score:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Ace Bob 500 9500 {score:1}{gamerTag:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Bob 500 9500 {score:1} Ace {gamerTag:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Bob 500 9500 {score:1}{gamerTag:1} Ace "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Bob 500 9500 {gamerTag:1} Ace {score:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
$or (multiple) Index visualization Bob 500 9500 {gamerTag:1} Ace {score:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
Recommendations Use multiple indexes! db.data.createIndex({gamerTag: 1}) db.data.createIndex({score: 1})
Recommendations Use multiple indexes! db.data.createIndex({gamerTag: 1}) db.data.createIndex({score: 1}) We already have the {gamerTag:1, score:-1} index, do we need both of these new ones?
Recommendations Use multiple indexes! db.data.createIndex({gamerTag: 1}) db.data.createIndex({score: 1}) db.games.createIndex({ gamerTag: 1, score:-1 })
Stakeholder #2 Concern db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) Having the right index is critical "Super!!”
93 Stakeholder #3 Concern “Wait wait wait, we can’t even FIND the gamers!” A basic search on gamerTag takes several seconds already: db.games.find({gamerTag: /^Ace$/i}) “This query is SLOWER with the index than it is without it!”
Case Insensitive Queries
Matching games: { gamerTag: "Ace", score: 9500 } Non-matching games: { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 }, { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 } db.games.find({ gamerTag: /^Ace$/ }) //equivalent to db.games.find({ gamerTag: “Ace” }) Case Sensitive
Case Sensitive ace aCe acxyz Ace Ace mdb ACE Bob "indexBounds" : { "gamerTag" : [ "["Ace", "Acf")", "[/^Ace$/, /^Ace$/]" ]} Matching games: { gamerTag: "Ace", score: 9500 } Non-matching games: { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 }, { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }
Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 } Non-matching games: { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 } Case Insensitive db.games.find({ gamerTag: /^Ace$/i }) //equivalent to db.games.find({ gamerTag: { $regex: “^Ace$”, $options: “i” } }) //equivalent to db.games.find({ gamerTag: “Ace”}) .collation({locale:’en’,strength:2})
Case Insensitive db.games.find({ gamerTag: /^Ace$/i }) //equivalent to db.games.find({ gamerTag: { $regex: “^Ace$”, $options: “i” } }) //equivalent to db.games.find({ gamerTag: “Ace”}) .collation({locale:’en’, strength:2}) Would a $text search be the same as well? Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 } Non-matching games: { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }
Case Insensitive ace aCe acxyz Ace Ace mdb ACE Bob "indexBounds" : { "gamerTag" : [ “["", {})", "[/^Ace$/i, /^Ace$/i]" ] } Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 } Non-matching games: { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }
100 Recommendations Case insensitive index! Collations available since 3.4
101 Recommendations Case insensitive index! Collations available since 3.4 db.games.createIndex( { gamerTag: 1}, { collation: { locale: 'en', strength: 2 } } ) Store a transformed (eg toLower()) copy of the string
102 https://docs.mongodb.com/manual/reference/collation/index.html Collations
103 Stakeholder #3 Concern db.games.find({gamerTag: “Ace”}) .collation({locale:'en', strength:2})
104 Stakeholder #3 Concern db.games.find({gamerTag: “Ace”}) .collation({locale:'en', strength:2})
105 Stakeholder #3 Concern db.games.find({gamerTag: “Ace”}) .collation({locale:'en', strength:2}) “Wow, MongoDB can do anything!!!!1!”
Summary
107 Work Smarter Not Harder Understand the business logic Index appropriately Is it the right index to support the query? Be aware of: Blocking Stages Usage of $or Case sensitivity Leverage the Performance Advisor
108 Work Smarter Not Harder Understand the business logic Index appropriately Is it the right index to support the query? Be aware of: Blocking Stages Usage of $or Case sensitivity Leverage the Performance Advisor
109 Countdown to ShortFite Powered by an optimized MongoDB environment, ShortFite is sure to be a hit!
110 The Power of Query Optimization Query tuning results in: ● Improved performance ● Reduced resource utilization This may lead to: ● Improved stability and predictability ● A smaller hardware footprint Not uncommon to observe efficiency improvements $gte 99%
111 Become the HERO! - Plenty of resources out there - http://university.mongodb.com - http://docs.mongodb.com
Norberto Leite Lead Engineer norberto@mongodb.com @nleite
Avoid Query Pitfalls

Avoid Query Pitfalls

  • 1.
  • 2.
    2 Agenda Common Query Pitfalls Howto think about and how to solve them Tips and Tricks for Query Optimization
  • 3.
    3 Howdy! Disclaimer Who's this guy? NorbertoLeite Lead Engineer @nleite MongoDB
  • 4.
  • 5.
  • 6.
  • 7.
    7 The Power ofQuery Optimization Query tuning results in: ● Improved performance ● Reduced resource utilization This may lead to: ● Improved stability and predictability ● Smaller hardware footprint Not uncommon to observe efficiency improvements $gte 99%
  • 8.
    Acme Games Introduces: ShortFite Talesof a release pre-launch testing
  • 9.
  • 10.
  • 11.
    11 Stakeholder Concerns ● Game nearly complete ●Team starts to do capacity testing
  • 12.
    12 Stakeholder Concerns ● Game nearly complete ●Team starts to do capacity testing Indexes support the efficient execution of queries in MongoDB
  • 13.
    13 Stakeholder Concerns ● Game nearly complete ●Team starts to do capacity testing Ace Sue … … Bob Indexes support the efficient execution of queries in MongoDB
  • 14.
    Stakeholder #1 Concern Appbeing stress tested Concerns over current performance
  • 15.
    Stakeholder #1 Concern Developerscreated index db.games.createIndex({ gamerTag: 1 }) This query takes several seconds to execute: db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) Adding the index on score does not help! db.games.createIndex({ score: -1 })
  • 16.
    Stakeholder #1 Concern Developerscreated index db.games.createIndex({ gamerTag: 1 }) This query takes several seconds to execute: db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) Adding the index on score does not help! db.games.createIndex({ score: -1 }) “Clearly MongoDB is not webscale!”
  • 17.
  • 18.
    18 Blocking Operation Formally: “An operationwhich must process all input before it can begin to produce any output.” Opposite of the often desirable “fully pipelined” plan which can stream results back as soon as they are found. Commonly observed when a sort is added to a query
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
    Indexes that supportsorting db.games.createIndex({ gamerTag: 1, score:-1})
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
    32 Blocking Stages $sort In aggregation andfind $group $bucket $count $facet Are there any other blocking operations?
  • 33.
  • 34.
    34 Working with blockingstages For sorting: Add a supporting index Worth the overhead in almost all circumstances For other stages: Do you need the blocking stage? Offload to secondary member
  • 35.
    Stakeholder #1 Concern Performanceof db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) “Clearly MongoDB is not webscale!”
  • 36.
    Stakeholder #1 Concern Performanceof db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) db.games.createIndex({ gamerTag: 1, score:-1 })
  • 37.
    Stakeholder #1 Concern Performanceof db.games.find( { gamerTag: "Ace" } ).sort({score:-1}) db.games.createIndex({ gamerTag: 1, score:-1 }) "That’ll work great!”
  • 38.
    Stakeholder #2 Concern The$and version of a query returns quickly: db.games.find({ $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) But the $or version is slow: db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 39.
    Concern #2: Multiplequeries same index The $and version of a query returns quickly: db.games.find({ $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) But the $or version is slow: db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) We just created an index with both those fields… Can it be used?
  • 40.
  • 41.
    $and example Query ongames: db.games.find({ $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) Matching games: { gamerTag: "Ace", score: 9500 } Non-matching games: { gamerTag: "Ace", score: 500 }, { gamerTag: "Bob", score: 9500 }, { gamerTag: "Bob", score: 500 }
  • 42.
    Groups of documents score:{$gt: 9000}gamerTag: "Ace" { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 }
  • 43.
    gamerTag: "Ace" $and VennDiagram (logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 44.
    $and Venn Diagram(logical) { gamerTag: "Bob", score: 500 } gamerTag: "Ace" { gamerTag: "Ace", score: 500 } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 45.
    gamerTag: "Ace" $and VennDiagram (logical) score: {$gt: 9000} db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 46.
    gamerTag: "Ace" $and VennDiagram (logical) score: {$gt: 9000} db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 47.
    $and Venn Diagram(logical) score: {$gt: 9000}gamerTag: "Ace" db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 48.
    $and Venn Diagram(logical) score: {$gt: 9000}gamerTag: "Ace" db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 49.
    $and Venn Diagram(logical) score: {$gt: 9000}gamerTag: "Ace" db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 50.
    Ace Bob {gamerTag:1 , score:-1} 5009500 500 9500 $and Index Visualization db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 51.
    Ace Bob {gamerTag:1 , score:-1} 5009500 500 9500 "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } $and Index Visualization db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 52.
    Ace Bob 500 9500 5009500 {gamerTag:1 , score:-1} $and Index Visualization "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 53.
    Bob 500 9500 500 9500 Ace {gamerTag:1 ,score:-1} $and Index Visualization "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 54.
    $and Index Visualization Bob 500 5009500 Ace 9500 {gamerTag:1 , score:-1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000.0)" ] } db.games.find( { $and : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]})
  • 55.
    $or example Query ongames: db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "Ace", score: 500 }, { gamerTag: "Bob", score: 9500 } Non-matching games: { gamerTag: "Bob", score: 500 }
  • 56.
    gamerTag: "Ace" $or VennDiagram (logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 57.
    $or Venn Diagram(logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 58.
    $or Venn Diagram(logical) score: {$gt: 9000} { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 59.
    score: {$gt: 9000} $orVenn Diagram (logical) { gamerTag: "Ace", score: 9500 } { gamerTag: "Bob", score: 500 } { gamerTag: "Ace", score: 500 } gamerTag: "Ace" { gamerTag: "Bob", score: 9500 } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 60.
    score: {$gt: 9000}gamerTag:"Ace" $or Venn Diagram (logical) { gamerTag: "Ace", score: 9500 } { gamerTag: "Ace", score: 500 } { gamerTag: "Bob", score: 9500 } { gamerTag: "Bob", score: 500 } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 61.
    score: {$gt: 9000}gamerTag:"Ace" $or Venn Diagram (logical) db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 62.
    score: {$gt: 9000}gamerTag:"Ace" $or Venn Diagram (logical) db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 63.
    score: {$gt: 9000}gamerTag:"Ace" $or Venn Diagram (logical) db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 64.
    score: {$gt: 9000} $orVenn Diagram (logical) gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 65.
    score: {$gt: 9000} $orVenn Diagram (logical) gamerTag: "Ace" db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 66.
    $or (single) Indexvisualization Ace Bob {gamerTag:1 , score:-1} 500 9500 500 9500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 67.
    Ace Bob {gamerTag:1 , score:-1} 5009500 500 9500 Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} $or (single) Index visualization db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 68.
    Ace Bob 500 9500 5009500 {gamerTag:1 , score:-1} $or (single) Index visualization Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 69.
    Bob 500 9500 500 9500 {gamerTag:1 ,score:-1} Ace $or (single) Index visualization Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 70.
    Bob 500 9500 {gamerTag:1 , score:-1} Ace 5009500 $or (single) Index visualization Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 71.
    500 Bob 9500 $or (single) Indexvisualization {gamerTag:1 , score:-1} Ace 500 9500 Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 72.
    500 Bob 9500 Actual (Hinted) IndexBounds: "indexBounds" : { "gamerTag" : [ "[MinKey, MaxKey]" ], "score" : [ "[MaxKey, MinKey]" ]} $or (single) Index visualization {gamerTag:1 , score:-1} Ace 500 9500 Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 73.
    Ace Bob 500 9500 5009500 {gamerTag:1 , score:-1} So is there anything we can do to improve the performance of this query? $or (single) Index visualization Actual (Hinted) Index Bounds: "indexBounds" : { "gamerTag" : [ "[MinKey, MaxKey]" ], "score" : [ "[MaxKey, MinKey]" ]} Expected Index Bounds: "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]" ], "score" : [ "[inf.0, 9000]" ]} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 74.
  • 75.
    $or (multiple) Indexvisualization Ace Bob {gamerTag:1 , score:-1} 500 9500 500 9500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 76.
    $or (multiple) Indexvisualization db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 77.
    $or (multiple) Indexvisualization Ace db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 78.
    $or (multiple) Indexvisualization Ace Bob db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 79.
    $or (multiple) Indexvisualization Ace Bob {gamerTag:1} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 80.
    $or (multiple) Indexvisualization Ace Bob {gamerTag:1} 500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 81.
    $or (multiple) Indexvisualization Ace Bob {gamerTag:1} 500 9500 db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 82.
    $or (multiple) Indexvisualization Ace Bob {gamerTag:1} 500 9500 {score:1} db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 83.
    $or (multiple) Indexvisualization Ace Bob {gamerTag:1} 500 9500 {score:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 84.
    $or (multiple) Indexvisualization Ace Bob 500 9500 {score:1}{gamerTag:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 85.
    $or (multiple) Indexvisualization Bob 500 9500 {score:1} Ace {gamerTag:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 86.
    $or (multiple) Indexvisualization Bob 500 9500 {score:1}{gamerTag:1} Ace "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 87.
    $or (multiple) Indexvisualization Bob 500 9500 {gamerTag:1} Ace {score:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 88.
    $or (multiple) Indexvisualization Bob 500 9500 {gamerTag:1} Ace {score:1} "indexBounds" : { "gamerTag" : [ "["Ace", "Ace"]” ] } "indexBounds" : { "score" : [ "(9000.0, inf.0]” ] } db.games.find({ $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] })
  • 89.
  • 90.
    Recommendations Use multiple indexes! db.data.createIndex({gamerTag:1}) db.data.createIndex({score: 1}) We already have the {gamerTag:1, score:-1} index, do we need both of these new ones?
  • 91.
    Recommendations Use multiple indexes! db.data.createIndex({gamerTag:1}) db.data.createIndex({score: 1}) db.games.createIndex({ gamerTag: 1, score:-1 })
  • 92.
    Stakeholder #2 Concern db.games.find({ $or: [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ] }) Having the right index is critical "Super!!”
  • 93.
    93 Stakeholder #3 Concern “Waitwait wait, we can’t even FIND the gamers!” A basic search on gamerTag takes several seconds already: db.games.find({gamerTag: /^Ace$/i}) “This query is SLOWER with the index than it is without it!”
  • 94.
  • 95.
    Matching games: { gamerTag:"Ace", score: 9500 } Non-matching games: { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 }, { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 } db.games.find({ gamerTag: /^Ace$/ }) //equivalent to db.games.find({ gamerTag: “Ace” }) Case Sensitive
  • 96.
    Case Sensitive ace aCeacxyz Ace Ace mdb ACE Bob "indexBounds" : { "gamerTag" : [ "["Ace", "Acf")", "[/^Ace$/, /^Ace$/]" ]} Matching games: { gamerTag: "Ace", score: 9500 } Non-matching games: { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 }, { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }
  • 97.
    Matching games: { gamerTag:"Ace", score: 9500 }, { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 } Non-matching games: { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 } Case Insensitive db.games.find({ gamerTag: /^Ace$/i }) //equivalent to db.games.find({ gamerTag: { $regex: “^Ace$”, $options: “i” } }) //equivalent to db.games.find({ gamerTag: “Ace”}) .collation({locale:’en’,strength:2})
  • 98.
    Case Insensitive db.games.find({ gamerTag: /^Ace$/i }) //equivalentto db.games.find({ gamerTag: { $regex: “^Ace$”, $options: “i” } }) //equivalent to db.games.find({ gamerTag: “Ace”}) .collation({locale:’en’, strength:2}) Would a $text search be the same as well? Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 } Non-matching games: { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }
  • 99.
    Case Insensitive ace aCeacxyz Ace Ace mdb ACE Bob "indexBounds" : { "gamerTag" : [ “["", {})", "[/^Ace$/i, /^Ace$/i]" ] } Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 } Non-matching games: { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }
  • 100.
  • 101.
    101 Recommendations Case insensitive index! Collationsavailable since 3.4 db.games.createIndex( { gamerTag: 1}, { collation: { locale: 'en', strength: 2 } } ) Store a transformed (eg toLower()) copy of the string
  • 102.
  • 103.
    103 Stakeholder #3 Concern db.games.find({gamerTag:“Ace”}) .collation({locale:'en', strength:2})
  • 104.
    104 Stakeholder #3 Concern db.games.find({gamerTag:“Ace”}) .collation({locale:'en', strength:2})
  • 105.
    105 Stakeholder #3 Concern db.games.find({gamerTag:“Ace”}) .collation({locale:'en', strength:2}) “Wow, MongoDB can do anything!!!!1!”
  • 106.
  • 107.
    107 Work Smarter NotHarder Understand the business logic Index appropriately Is it the right index to support the query? Be aware of: Blocking Stages Usage of $or Case sensitivity Leverage the Performance Advisor
  • 108.
    108 Work Smarter NotHarder Understand the business logic Index appropriately Is it the right index to support the query? Be aware of: Blocking Stages Usage of $or Case sensitivity Leverage the Performance Advisor
  • 109.
    109 Countdown to ShortFite Poweredby an optimized MongoDB environment, ShortFite is sure to be a hit!
  • 110.
    110 The Power ofQuery Optimization Query tuning results in: ● Improved performance ● Reduced resource utilization This may lead to: ● Improved stability and predictability ● A smaller hardware footprint Not uncommon to observe efficiency improvements $gte 99%
  • 111.
    111 Become the HERO! -Plenty of resources out there - http://university.mongodb.com - http://docs.mongodb.com
  • 112.