Quick overview on mongo db

Quick Overview on MongoDB Eman Abdel Ghaffar

Agenda
1. Introduction
2. CRUD
3. Cursors
4. Indexing
5. Schema Design principles
6. Aggregation
7. Map-Reduce

Introduction - ACID
● Relational databases usually guarantee ACID properties, which describe how reliably transactions (both reads and writes) are processed.
● The NoSQL movement trades off strict ACID compliance for other properties, such as high availability and horizontal scalability, and MongoDB is one of the most widely used databases in the field.
● https://dzone.com/articles/how-acid-mongodb

Introduction - ACID
● Atomicity requires that each transaction is executed in its entirety or fails without any change being applied.
● Consistency requires that the database only passes from one valid state to the next, without intermediate points. Any data written to the database must be valid according to all defined rules, including constraints, cascades, and triggers.
● Isolation requires that if transactions are executed concurrently, the result is equivalent to their serial execution.
● Durability means that the result of a committed transaction is permanent, even if the database crashes immediately afterwards or the power is lost.

Introduction - CAP
“It is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees”
● Consistency: every read receives the most recent write or an error.
● Availability: every request receives a (non-error) response, without a guarantee that it contains the most recent write.
● Partition tolerance: the system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

Introduction - MongoDB
● MongoDB is written in C++, is open source, and is licensed under the GNU AGPL (server releases since October 2018 use the Server Side Public License instead).
● The core database server runs via an executable called mongod (mongod.exe on Windows).
● The MongoDB command shell is a JavaScript-based tool for administering the database and manipulating data.
manual/reference/mongo-shell/

CRUD - Create
● Databases and collections are created only when documents are first inserted.
● Every MongoDB document requires an _id; if it is omitted on insert, MongoDB generates an ObjectId for it.
db.collection.insertOne()
db.collection.insertMany()
db.collection.insert()
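The two rules on this slide can be sketched in plain JavaScript without a running server. The `collections` map, `fakeObjectId` and this `insertOne` function are hypothetical stand-ins written for illustration; they only imitate the observable behaviour (lazy creation, automatic `_id`) and are not how the driver or the server is implemented.

```javascript
// Hypothetical in-memory stand-in for a database: collection name -> array of documents.
const collections = new Map();

// Imitates only the shape of an ObjectId (24 hex characters); real ObjectIds are not purely random.
function fakeObjectId() {
  return Array.from({ length: 24 }, () => Math.floor(Math.random() * 16).toString(16)).join("");
}

// Stand-in for db.<collection>.insertOne(doc); an assumption-laden sketch, not the driver API.
function insertOne(collectionName, doc) {
  if (!collections.has(collectionName)) {
    collections.set(collectionName, []); // the collection is created lazily, on the first insert
  }
  // Keep a caller-supplied _id, otherwise generate one.
  const _id = "_id" in doc ? doc._id : fakeObjectId();
  const stored = { ...doc, _id };
  collections.get(collectionName).push(stored);
  return { acknowledged: true, insertedId: _id };
}

const before = collections.has("inventory"); // nothing inserted yet
const result = insertOne("inventory", { item: "paper", qty: 20, status: "A" });
const explicit = insertOne("inventory", { _id: 1, item: "pen", qty: 5, status: "D" });

console.log(before, collections.has("inventory"));
console.log(result.insertedId.length, explicit.insertedId);
```

In the real shell the equivalent calls would be `db.inventory.insertOne({ item: "paper", qty: 20, status: "A" })` and so on; the sample field values here are made up.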

CRUD - Read
db.collection.find(query, projection)
db.inventory.find( {} )
SELECT * FROM inventory
db.inventory.find( { status: "D" } )
SELECT * FROM inventory WHERE status = "D"
db.inventory.find( { status: { $in: [ "A", "D" ] } } )
SELECT * FROM inventory WHERE status IN ("A", "D")
db.inventory.find( { status: "A", qty: { $lt: 30 } } )
SELECT * FROM inventory WHERE status = "A" AND qty < 30
db.inventory.find( { status: "A", $or: [ { qty: { $lt: 30 } }, { item: /^p/ } ] } )
SELECT * FROM inventory WHERE status = "A" AND ( qty < 30 OR item LIKE "p%")
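To make the filter semantics concrete, here is a minimal in-memory sketch that evaluates the five filters above against a small array. The helpers `matchesCondition`, `matches` and `find` are hypothetical and cover only equality, `$lt`, `$in`, `$or` and regular expressions; the sample documents are invented. It is an approximation of how the filters read, not MongoDB's query engine.

```javascript
// Made-up sample documents for an inventory collection.
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "notebook", qty: 50, status: "A" },
  { item: "paper", qty: 100, status: "D" },
  { item: "planner", qty: 75, status: "D" },
  { item: "postcard", qty: 45, status: "A" },
];

// Hypothetical helper: tests one field value against one condition.
function matchesCondition(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === "string" && condition.test(value);
  }
  if (condition !== null && typeof condition === "object") {
    // An operator document such as { $lt: 30 }: every operator must hold.
    return Object.entries(condition).every(([op, arg]) => {
      if (op === "$lt") return value < arg;
      if (op === "$in") return arg.includes(value);
      throw new Error(`operator ${op} is not covered by this sketch`);
    });
  }
  return value === condition; // plain equality
}

// Hypothetical helper: top-level keys are ANDed; $or needs at least one branch to match.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, condition]) =>
    key === "$or"
      ? condition.some((branch) => matches(doc, branch))
      : matchesCondition(doc[key], condition)
  );
}

// Returns only the item names, to keep the output short.
const find = (filter) =>
  inventory.filter((doc) => matches(doc, filter)).map((doc) => doc.item);

console.log(find({}));
console.log(find({ status: "D" }));
console.log(find({ status: { $in: ["A", "D"] } }));
console.log(find({ status: "A", qty: { $lt: 30 } }));
console.log(find({ status: "A", $or: [{ qty: { $lt: 30 } }, { item: /^p/ }] }));
```

The last filter shows why the SQL translation needs parentheses: `status` is combined with the whole `$or` clause by an implicit AND.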

CRUD - Update
● Some Update Operators
○ $currentDate
○ $inc
○ $min
○ $max
○ $mul
○ $rename
○ $set
db.collection.update()
db.collection.findAndModify()
db.collection.updateOne()
db.collection.updateMany()
db.collection.replaceOne()
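As a rough illustration of what some of these operators do to a document, the sketch below applies an update document to a plain object. `applyUpdate` is a hypothetical helper, handles numbers only for `$min` and `$max`, and leaves out `$currentDate`; the field names and values are invented.

```javascript
// Hypothetical helper: applies a few update operators to a shallow copy of a document.
function applyUpdate(doc, update) {
  const out = { ...doc };
  for (const [op, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      if (op === "$set") {
        out[field] = arg;
      } else if (op === "$inc") {
        out[field] = (out[field] ?? 0) + arg; // a missing field starts from 0
      } else if (op === "$mul") {
        out[field] = (out[field] ?? 0) * arg; // a missing field ends up as 0
      } else if (op === "$min") {
        out[field] = field in out ? Math.min(out[field], arg) : arg;
      } else if (op === "$max") {
        out[field] = field in out ? Math.max(out[field], arg) : arg;
      } else if (op === "$rename") {
        if (field in out) {
          out[arg] = out[field]; // arg is the new field name
          delete out[field];
        }
      } else {
        throw new Error(`operator ${op} is not covered by this sketch`);
      }
    }
  }
  return out;
}

const original = { _id: 1, item: "paper", qty: 100, price: 2, lowest: 90, highest: 120, stat: "D" };

// The same update document could be passed to db.collection.updateOne(filter, update).
const update = {
  $inc: { qty: -10 },
  $mul: { price: 3 },
  $min: { lowest: 80 },
  $max: { highest: 110 },
  $set: { reviewed: true },
  $rename: { stat: "status" },
};

const updated = applyUpdate(original, update);
console.log(updated);
```

Note that `$max: { highest: 110 }` leaves `highest` at 120, because the operator only writes when the new value is greater than the stored one.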

CRUD - Delete
● Indexes
○ Delete operations do not drop indexes, even if deleting all documents from a collection.
● Atomicity
○ All write operations in MongoDB are atomic on the level of a single document.
db.collection.remove()
db.collection.deleteOne()
db.collection.deleteMany()

Cursors
● Cursors, found in many database systems, return query result sets iteratively, in batches, for efficiency.
● A query instantiates a cursor, which is then used to retrieve the result set in manageable chunks; successive calls to MongoDB occur as needed to fill the driver’s cursor buffer.
● Returning a huge result all at once would mean:
○ Copying all that data into memory.
○ Transferring it over the wire.
○ Deserializing it on the client side.
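The batching idea can be sketched with a generator. Everything here is a hypothetical stand-in: `makeServer` plays the role of the database, `getBatch` is an invented method name, and the batch size of 4 is arbitrary. The point is only that the client buffers one batch at a time and asks for more on demand.

```javascript
// Hypothetical stand-in for the server: hands out one batch of results per call.
function makeServer(documents, batchSize) {
  let roundTrips = 0;
  return {
    getBatch(offset) {
      roundTrips += 1; // each call stands for one network round trip
      return documents.slice(offset, offset + batchSize);
    },
    get roundTrips() {
      return roundTrips;
    },
  };
}

// The cursor holds only the current batch and fetches the next one when it runs out.
function* cursor(server) {
  let offset = 0;
  while (true) {
    const buffer = server.getBatch(offset);
    if (buffer.length === 0) return; // result set exhausted
    yield* buffer;
    offset += buffer.length;
  }
}

const documents = Array.from({ length: 10 }, (_, i) => ({ _id: i }));
const server = makeServer(documents, 4);

const firstThree = [];
for (const doc of cursor(server)) {
  firstThree.push(doc._id);
  if (firstThree.length === 3) break; // stopping early means later batches are never requested
}
console.log(firstThree, server.roundTrips);

// Draining a fresh cursor still yields every document, just a batch at a time.
const all = [...cursor(makeServer(documents, 4))];
console.log(all.length);
```

A client that only needs the first few results therefore pays for one batch, not for the whole result set.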

Indexing
● Introduction
● Indexing Types
● Indexing Properties

Indexing - Introduction
● Index keys are typically smaller than the documents they catalog, and indexes are typically available in RAM or located sequentially on disk.
● Covered Queries
○ A query is covered when its criteria and its projection include only indexed fields.
○ Results are returned directly from the index without scanning any documents or bringing documents into memory.
● Ensure Indexes Fit in RAM
○ Use the db.collection.totalIndexSize() helper, which returns the index size in bytes.
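A small sketch may help show why a covered query never touches the documents. The "index" below is just a sorted array of entries holding the indexed values plus a document position, and `fetchDocument`, `coveredFind` and `uncoveredFind` are hypothetical helpers; a real index is a B-tree and is searched, not scanned linearly.

```javascript
// Made-up collection; fetchDocument counts how often a full document is read.
const collection = [
  { _id: 1, item: "planner", status: "D", qty: 75 },
  { _id: 2, item: "journal", status: "A", qty: 25 },
  { _id: 3, item: "paper", status: "D", qty: 100 },
  { _id: 4, item: "notebook", status: "A", qty: 50 },
];
let documentsFetched = 0;
const fetchDocument = (position) => {
  documentsFetched += 1;
  return collection[position];
};

// Stand-in for an index on { status: 1, item: 1 }: only the indexed values plus a pointer.
const cmp = (x, y) => (x < y ? -1 : x > y ? 1 : 0);
const index = collection
  .map((doc, position) => ({ status: doc.status, item: doc.item, position }))
  .sort((a, b) => cmp(a.status, b.status) || cmp(a.item, b.item));

// Covered: like find({ status }, { item: 1, _id: 0 }); every requested field is in the index.
// _id has to be excluded from the projection because it is not part of this index.
function coveredFind(status) {
  return index.filter((e) => e.status === status).map((e) => ({ item: e.item }));
}

// Not covered: qty is not in the index, so each match needs a document read.
function uncoveredFind(status) {
  return index
    .filter((e) => e.status === status)
    .map((e) => {
      const doc = fetchDocument(e.position);
      return { item: doc.item, qty: doc.qty };
    });
}

const covered = coveredFind("A");
const fetchedAfterCovered = documentsFetched;
const uncovered = uncoveredFind("D");
console.log(covered, fetchedAfterCovered);
console.log(uncovered, documentsFetched);
```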

Indexing - Index Types
● Single Field
● Compound Index
● Multikey Index
● Geospatial Index
● Text Indexes
● Hashed Indexes

Indexing - Index Properties
● TTL Indexes
○ The TTL index is used for TTL collections, which expire data after a period of time.
● Unique Indexes
○ A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field.
● Partial Indexes
○ A partial index indexes only documents that meet specified filter criteria.
● Case Insensitive Indexes
○ A case insensitive index disregards the case of the index key values.
● Sparse Indexes
○ A sparse index does not index documents that do not have the indexed field.
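Each of these properties is requested through the options document passed to `db.collection.createIndex(keys, options)`. The sketch below writes those argument pairs out as plain data and then mimics, in memory, which documents a partial and a sparse index would contain. The field names (`createdAt`, `email`, `nickname` and so on), the one-hour expiry and the sample documents are assumptions chosen for illustration.

```javascript
// [keys, options] pairs as they would be passed to db.collection.createIndex(keys, options).
const indexSpecs = {
  ttl: [{ createdAt: 1 }, { expireAfterSeconds: 3600 }],
  unique: [{ email: 1 }, { unique: true }],
  partial: [{ item: 1 }, { partialFilterExpression: { qty: { $gt: 0 } } }],
  caseInsensitive: [{ name: 1 }, { collation: { locale: "en", strength: 2 } }],
  sparse: [{ nickname: 1 }, { sparse: true }],
};

for (const [property, [keys, options]] of Object.entries(indexSpecs)) {
  console.log(
    `${property}: db.collection.createIndex(${JSON.stringify(keys)}, ${JSON.stringify(options)})`
  );
}

// In-memory illustration of which documents the partial and sparse indexes would hold.
const docs = [
  { _id: 1, item: "paper", qty: 0, nickname: "p" },
  { _id: 2, item: "pen", qty: 5 },
];
const inPartial = docs.filter((d) => d.qty > 0).map((d) => d._id); // filter: qty > 0
const inSparse = docs.filter((d) => "nickname" in d).map((d) => d._id); // field must exist
console.log(inPartial, inSparse);
```

The partial index skips document 1 because it fails the filter, while the sparse index skips document 2 because it lacks the indexed field.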

Schema Design principles
● Introduction
● Embedding Vs. Referencing
● Model One-to-One Relationships
● Model One-to-Many Relationships

Schema Design principles - Introduction
● The application’s data access patterns should govern schema design, with specific understanding of
○ The read/write ratio of database operations.
○ The types of queries and updates performed by the database.
○ The life-cycle of the data and growth rate of documents.
● When designing a data model, consider how applications will use your database.
○ If your application only uses recently inserted documents, consider using Capped Collections.
data-modeling

Embedding Vs. Referencing
● Embedding provides better performance for read operations, as well as the ability to request and retrieve related data in a single database operation.
● Not all 1:1 or 1:Many relationships should be embedded in a single document.

Embedding Vs. Referencing
● References store the relationships between data by including links or references from one document to another. Prefer them:
○ When embedding would not provide sufficient read performance advantages.
○ Where the object is referenced from many different sources.
○ To represent complex many-to-many relationships.
○ To model large, hierarchical data sets.
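The difference in read cost can be sketched with plain objects. The patron and address documents, the `patron_id` field and the counting `findOne` helper are all hypothetical; the helper simply counts each lookup as one database operation.

```javascript
// Embedded: the address lives inside the patron document.
const embeddedPatrons = [
  { _id: "joe", name: "Joe Bookreader", address: { street: "123 Fake Street", city: "Faketon" } },
];

// Referenced: two collections, linked by a patron_id field on the address.
const patrons = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [{ _id: 1, patron_id: "joe", street: "123 Fake Street", city: "Faketon" }];

// Hypothetical helper: every call counts as one database operation.
let reads = 0;
const findOne = (collection, predicate) => {
  reads += 1;
  return collection.find(predicate);
};

// Embedded: one operation returns the patron together with the address.
reads = 0;
const cityEmbedded = findOne(embeddedPatrons, (d) => d._id === "joe").address.city;
const readsEmbedded = reads;

// Referenced: the patron first, then a second operation to follow the reference.
reads = 0;
const patron = findOne(patrons, (d) => d._id === "joe");
const cityReferenced = findOne(addresses, (d) => d.patron_id === patron._id).city;
const readsReferenced = reads;

console.log(cityEmbedded, readsEmbedded);
console.log(cityReferenced, readsReferenced);
```

Both variants return the same city; the referenced layout costs an extra lookup but lets the address be shared or updated independently.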

One-to-One Relationships - Embedding

One-to-Many Relationships
One-to-Many and One-to-Few

One-to-Many Relationships
One-to-Squillions
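The slides name the three one-to-many flavours without showing the documents. Assuming the usual convention (embed the children for one-to-few, keep an array of child references on the parent for one-to-many, and keep a parent reference on each child for one-to-squillions), the shapes might look like the sketch below. All names and values are invented for illustration.

```javascript
// One-to-few: the handful of children are embedded in the parent.
const person = {
  name: "Kate",
  addresses: [
    { street: "123 Sesame St", city: "Anytown" },
    { street: "123 Avenue Q", city: "New York" },
  ],
};

// One-to-many: children are separate documents; the parent keeps an array of their ids.
const parts = [
  { _id: "p1", name: "grommet" },
  { _id: "p2", name: "widget" },
  { _id: "p3", name: "sprocket" },
];
const product = { name: "smoke shifter", parts: ["p1", "p2"] };
const productParts = parts.filter((part) => product.parts.includes(part._id));

// One-to-squillions: an array on the parent would grow without bound and could
// eventually hit the document size limit, so each child points at its parent instead.
const host = { _id: "h1", name: "goofy.example.com" };
const logMessages = [
  { host: "h1", message: "cpu is on fire" },
  { host: "h1", message: "disk is full" },
  { host: "h2", message: "all good" },
];
const hostLogs = logMessages.filter((m) => m.host === host._id);

console.log(person.addresses.length);
console.log(productParts.map((p) => p.name));
console.log(hostLogs.map((m) => m.message));
```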

Aggregation
● Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result.
● The aggregate command operates on a single collection, logically passing the entire collection into the aggregation pipeline.
● The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline.
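As an illustration, the sketch below writes a three-stage pipeline as it would be passed to `db.orders.aggregate(pipeline)` and then reproduces each stage by hand with array operations. The `orders` collection and its fields are made up, and the hand-written steps only approximate what the server does for this particular pipeline.

```javascript
// Made-up sample documents for an orders collection.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// The pipeline as it would be passed to db.orders.aggregate(pipeline).
// $match comes first so that it could use an index on status.
const pipeline = [
  { $match: { status: "A" } },
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
];

// Stage 1, $match: keep only documents with status "A".
const matched = orders.filter((order) => order.status === "A");

// Stage 2, $group: one output document per cust_id, summing amount.
const totals = new Map();
for (const order of matched) {
  totals.set(order.cust_id, (totals.get(order.cust_id) ?? 0) + order.amount);
}
const grouped = [...totals].map(([_id, total]) => ({ _id, total }));

// Stage 3, $sort: descending by total.
const sorted = [...grouped].sort((a, b) => b.total - a.total);

console.log(pipeline.length, sorted);
```

Each stage consumes the output of the previous one, which is why filtering early keeps the later stages small.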

Aggregation https://docs.mongodb.com/manual/core/aggregation-pipeline-optimization/

Aggregation - Limitations
● If any single document in the result exceeds the BSON Document Size limit (16 megabytes), the command will produce an error.
● The $group stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $group will produce an error; the allowDiskUse option lets it write to temporary files instead.

Map-Reduce
● Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results.
● Map-Reduce is less efficient and more complex than the aggregation pipeline.
● All map-reduce functions in MongoDB are JavaScript and run within the mongod process.
● Map-reduce operations take the documents of a single collection.
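The map and reduce functions below are written in the form `db.collection.mapReduce()` expects: `map` runs with `this` bound to each document and calls `emit(key, value)`, and `reduce` folds the values emitted for one key. The surrounding `emit` and `mapReduce` helpers are hypothetical in-memory stand-ins for what the server provides, and the sample data is invented.

```javascript
// Made-up input documents (already filtered to status "A").
const orders = [
  { cust_id: "A123", amount: 500 },
  { cust_id: "A123", amount: 250 },
  { cust_id: "B212", amount: 200 },
];

// Stand-in for the emit() function the server makes available inside map.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

// map: called once per document, with `this` bound to that document.
function mapFunction() {
  emit(this.cust_id, this.amount);
}

// reduce: combines all values emitted for one key into a single value.
function reduceFunction(key, values) {
  return values.reduce((sum, value) => sum + value, 0);
}

// Hypothetical in-memory driver for the two functions.
function mapReduce(documents, map, reduce) {
  emitted.clear();
  for (const doc of documents) map.call(doc);
  return [...emitted].map(([key, values]) => ({
    _id: key,
    // As in MongoDB, reduce is skipped for keys that have a single value.
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

const results = mapReduce(orders, mapFunction, reduceFunction);
console.log(results);
```

The same totals come out of a `$group` stage with `$sum` in far fewer lines, which is one reason the slide calls map-reduce the more complex option.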
