#MDBlocal Managing Diverse User Needs with MongoDB and SQL Seth Payne Sr. Product Manager, SQL API & Enterprise Tools seth.payne@mongodb.com LONDON
#MDBLocal ● Enable you to query MongoDB using SQL ● Provide an end-to-end review of how the BI Connector makes this possible ● Teach you how to manage schema mappings between MongoDB and SQL tables/columns ● Demo how to connect to and query MongoDB from the MySQL shell Our goals today:
#MDBLocal ● Download BI Connector 2.12 ● Query MongoDB using default schema mapping ● Create a custom schema mapping ● Query MongoDB via SQL using your custom schema When you leave this session:
#MDBLocal A bit about me vs.
#MDBLocal
#MDBLocal My unique role at MongoDB
#MDBLocal Data technology is changing • Key question: • How can I optimize disparate technologies to support a robust and feature-rich data stack to support a broad set of use cases?
#MDBLocal Data technologies are like life • To accomplish big things and solve difficult problems, systems must be able to talk to each other • It’s ok to have strong opinions • MongoDB and the MongoDB Query Language is the best way to interrogate data • SQL is a widely-adopted, powerful query language that empowers both developers and data analysts to gain valuable insights • In the end, it is results that matter most
#MDBLocal Data Access API Change Data Capture (CDC) Extract Transform Load (ETL) MongoDB Cluster Document Data Model Distributed Systems Architecture Cloud | On-Premises Operational Apps and Systems of Record Producers Operational Data Layer Consumers Mainframe Systems CRM ERP Order Management Supply Chain Mgmt Data Lake Marketing Automation Website Social Media Reference Data Third-Party APIs Etc. Batch Load API CallsBatch File Exports Real-Time Data Changes Delta Load MongoDB Change Streams Write Back to Producer Systems (Optional) MongoDB Native Drivers Consuming Operational Apps and Services Internal apps, customer-facing services, and APIs for third-party consumption – across any channel Business Intelligence (BI) and Advanced Analytics Visualization and reporting, data analysis, artificial intelligence, machine learning and more Human Capital Mgmt MongoDB Connectors
#MDBLocal Jane is a MongoDB admin. She recently learned that an analytics team she serves, requires SQL-based access to data sitting in MongoDB. She needs to provide this access and wants to avoid an expensive and error-prone ETL workflow. ● MongoDB Admins need a way to provide their business users access to MongoDB. ● SQL-based applications need to read MongoDB data Problem Statement
#MDBLocal Solution: MongoDB BI Connector
#MDBLocal What is the BI Connector? ● Provides read-only SQL access to any MongoDB standalone or replica set ● Translates incoming SQL queries to MQL aggregation pipelines ○ Pipeline executed on MongoDB cluster ○ Tabular results returned to client via BI Connector ● Supports: ○ ODBC ○ JDBC ○ MySQL
#MDBLocal BI Connector: Maturity and Adoption 1 2.0 - 2.4 2.5 - 2.10 2.11+ Adoption ● Proof of Concept ● Hyper-focus on Tableau ● Expanded SQL function support ● Performance improvements ● Improved usability ● Near 100% coverage of standard SQL functions ● Capable of displacing of RDMS systems ● Performance improvements ● Enterprise management features ● mongotranslate ● Query optimization ● Performance improvements
#MDBLocal August 2019 –MongoDB Atlas BIC SQL Queries ProcessedFull Translation Success Rate
#MDBLocal ● BI Connector is presented as MySQL ● Abstract relational schema defined ○ INFORMATION_SCHEMA ● Mapping between tabular schema and MongoDB document structure ● Schema defined in several ways: ○ Sampling ○ Document Relational Definition Language - mongodrdl We need a relational schema to query
#MDBLocal Default schema: mapping via document sampling Mongosqld connects to MongoDB Documents sampled from namespace(s) Relational schema available to incoming connections MongoDB Document Relational Schema: _id foreign key Database Database Collection Root Table Field Column Arrays & Objects Sub-tables
#MDBLocal > db.sql.findOne() { "_id": ObjectId("5bfabde76f280102ddf27969"), "band": "Slayer", "formation": { "year": ISODate("1982-01-01T00:00:00Z"), "city": "Los Angeles" }, "popular_albums": [ "Show No Mercy!", "Seasons in the Abyss", "Haunting the Chapel", "Divine Intervention" ], "members": [ { "name": "Tom Araya", "dob": ISODate("1961-06-06T07:00:00Z"), "primary_instrument": "Bass/Vocals" }, { "name": "Kerry King", "dob": ISODate("1964-06-03T07:00:00Z"), "primary_instrument": "Guitar" } ]}
#MDBLocal mysql> show tables; +-----------------------------------------+ | Tables_in_sql | +-----------------------------------------+ | bands | | bands_members | | bands_popular_albums | mysql> SELECT * FROM bands -> JOIN bands_popular_albums ON bands._id = bands_popular_albums._id;
Custom Schema Create a tabular schema to meet specific needs of users
#MDBLocal Document Relational Definition Language (DRDL) ● Often, we need to adjust the default schema generated by BIC schema creation logic ● Output generated by mongodrdl ● Utilizes BIC default sampling logic ● Defines both names and data types
#MDBLocal Generate mongodrdl output Edit output to create desired schema Import schema to MongoDB Provide unique name Start mongosqld with named schema
#MDBLocal Generate DRDL Output using mongodrdl mongodrdl -d sql -c bands > bands.drdl
schema: - db: sql tables: - table: bands collection: bands pipeline: [] columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: band MongoType: string SqlName: band SqlType: varchar - Name: formation.city MongoType: string SqlName: formation.city SqlType: varchar - Name: formation.year MongoType: date SqlName: formation.year SqlType: timestamp - table: bands_members collection: bands pipeline: - $unwind: includeArrayIndex: members_idx path: $members preserveNullAndEmptyArrays: false columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: members.dob MongoType: date SqlName: members.dob SqlType: timestamp - Name: members.name MongoType: string SqlName: members.name SqlType: varchar - Name: members.primary_instrument MongoType: string SqlName: members.primary_instrument SqlType: varchar - Name: members_idx MongoType: int SqlName: members_idx SqlType: int Root table: bands Sub-table: bands_members
- table: bands_popular_albums collection: bands pipeline: - $unwind: includeArrayIndex: popular_albums_idx path: $popular_albums preserveNullAndEmptyArrays: false columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: popular_albums MongoType: string SqlName: popular_albums SqlType: varchar - Name: popular_albums_idx MongoType: int SqlName: popular_albums_idx SqlType: int Sub-table: bands_popular_albums
#MDBLocal Current vs Desired mysql> show tables; +-----------------------------------------+ | Tables_in_sql | +-----------------------------------------+ | bands | | bands_members | | bands_popular_albums | mysql> show tables; +-----------------------------------------+ | Tables_in_sql | +-----------------------------------------+ | bands |
#MDBLocal Define new document structure with aggregation [{$unwind: { path: "$popular_albums" }}, {$project: { band: "$band", formed: "$formation.year", city: "$formation.city", popular_album: "$popular_albums" }}] schema: - db: sql tables: - table: bands collection: bands pipeline: [{$unwind: {path: "$popular_albums"}}, {$project: { band: "$band", formed: "$formation.year", city: "$formation.city", popular_album: "$popular_albums" }}] Use aggregation to create document structure: Add pipeline to drdl:
#MDBLocal Edit table/field names and data types schema: - db: sql tables: - table: bands collection: bands pipeline: [{$unwind: { path: "$popular_albums" }}, {$project: { band: "$band", formed: "$formation.year", city: "$formation.city", popular_album: "$popular_albums" }}] columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: band MongoType: string SqlName: band SqlType: varchar - Name: city MongoType: string SqlName: city SqlType: timestamp - Name: popular_album MongoType: string SqlName: album SqlType: varchar - Name: formed MongoType: date SqlName: formed SqlType: timestamp
#MDBLocal Import schema and provide name mongodrdl upload --drdl bands_flat.drdl --schemaSource=bic_schema mongodrdl name-schema --name bands_flat --schemaSource=bic_schema - -schema <unique ID>
#MDBLocal Start mongosqld in “custom mode” mongosqld --schemaSource=bic_schemas / --schemaName=bands_flat / --schemaMode=custom
#MDBLocal Updated schema: mysql> describe bands; +-------+----------------+------+------+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+----------------+------+------+---------+-------+ | _id | varchar(24) | YES | PRI | NULL | | | album | varchar(65535) | YES | | NULL | | | band | varchar(65535) | YES | | NULL | | | city | datetime | YES | | NULL | | +-------+----------------+------+------+---------+-------+
Demo
#MDBLocal ● Get you excited about querying MongoDB using SQL ● Provide an end-to-end review of how the BI Connector enables this ● Teach you how to manage schema mappings between MongoDB and SQL tables/columns ● Demonstrate how to connect to and query MongoDB from the MySQL shell Revisiting our goals today:
#MDBLocal ● Download BI Connector 2.12 ● Query MongoDB using default schema mapping ● Create a custom schema mapping ● Query MongoDB via SQL using your custom schema Now you can:
THANK YOU
THANK YOU
MongoDB .local London 2019: Managing Diverse User Needs with MongoDB and SQL

MongoDB .local London 2019: Managing Diverse User Needs with MongoDB and SQL

  • 1.
    #MDBlocal Managing Diverse UserNeeds with MongoDB and SQL Seth Payne Sr. Product Manager, SQL API & Enterprise Tools seth.payne@mongodb.com LONDON
  • 2.
    #MDBLocal ● Enable youto query MongoDB using SQL ● Provide an end-to-end review of how the BI Connector makes this possible ● Teach you how to manage schema mappings between MongoDB and SQL tables/columns ● Demo how to connect to and query MongoDB from the MySQL shell Our goals today:
  • 3.
    #MDBLocal ● Download BIConnector 2.12 ● Query MongoDB using default schema mapping ● Create a custom schema mapping ● Query MongoDB via SQL using your custom schema When you leave this session:
  • 4.
  • 5.
  • 6.
  • 7.
    #MDBLocal Data technology ischanging • Key question: • How can I optimize disparate technologies to support a robust and feature-rich data stack to support a broad set of use cases?
  • 8.
    #MDBLocal Data technologies arelike life • To accomplish big things and solve difficult problems, systems must be able to talk to each other • It’s ok to have strong opinions • MongoDB and the MongoDB Query Language is the best way to interrogate data • SQL is a widely-adopted, powerful query language that empowers both developers and data analysts to gain valuable insights • In the end, it is results that matter most
  • 9.
    #MDBLocal Data Access API Change Data Capture (CDC) Extract Transform Load (ETL) MongoDB Cluster DocumentData Model Distributed Systems Architecture Cloud | On-Premises Operational Apps and Systems of Record Producers Operational Data Layer Consumers Mainframe Systems CRM ERP Order Management Supply Chain Mgmt Data Lake Marketing Automation Website Social Media Reference Data Third-Party APIs Etc. Batch Load API CallsBatch File Exports Real-Time Data Changes Delta Load MongoDB Change Streams Write Back to Producer Systems (Optional) MongoDB Native Drivers Consuming Operational Apps and Services Internal apps, customer-facing services, and APIs for third-party consumption – across any channel Business Intelligence (BI) and Advanced Analytics Visualization and reporting, data analysis, artificial intelligence, machine learning and more Human Capital Mgmt MongoDB Connectors
  • 10.
    #MDBLocal Jane is aMongoDB admin. She recently learned that an analytics team she serves, requires SQL-based access to data sitting in MongoDB. She needs to provide this access and wants to avoid an expensive and error-prone ETL workflow. ● MongoDB Admins need a way to provide their business users access to MongoDB. ● SQL-based applications need to read MongoDB data Problem Statement
  • 11.
  • 12.
    #MDBLocal What is theBI Connector? ● Provides read-only SQL access to any MongoDB standalone or replica set ● Translates incoming SQL queries to MQL aggregation pipelines ○ Pipeline executed on MongoDB cluster ○ Tabular results returned to client via BI Connector ● Supports: ○ ODBC ○ JDBC ○ MySQL
  • 13.
    #MDBLocal BI Connector: Maturityand Adoption 1 2.0 - 2.4 2.5 - 2.10 2.11+ Adoption ● Proof of Concept ● Hyper-focus on Tableau ● Expanded SQL function support ● Performance improvements ● Improved usability ● Near 100% coverage of standard SQL functions ● Capable of displacing of RDMS systems ● Performance improvements ● Enterprise management features ● mongotranslate ● Query optimization ● Performance improvements
  • 14.
    #MDBLocal August 2019 –MongoDBAtlas BIC SQL Queries ProcessedFull Translation Success Rate
  • 15.
    #MDBLocal ● BI Connectoris presented as MySQL ● Abstract relational schema defined ○ INFORMATION_SCHEMA ● Mapping between tabular schema and MongoDB document structure ● Schema defined in several ways: ○ Sampling ○ Document Relational Definition Language - mongodrdl We need a relational schema to query
  • 16.
    #MDBLocal Default schema: mappingvia document sampling Mongosqld connects to MongoDB Documents sampled from namespace(s) Relational schema available to incoming connections MongoDB Document Relational Schema: _id foreign key Database Database Collection Root Table Field Column Arrays & Objects Sub-tables
  • 17.
    #MDBLocal > db.sql.findOne() { "_id": ObjectId("5bfabde76f280102ddf27969"), "band":"Slayer", "formation": { "year": ISODate("1982-01-01T00:00:00Z"), "city": "Los Angeles" }, "popular_albums": [ "Show No Mercy!", "Seasons in the Abyss", "Haunting the Chapel", "Divine Intervention" ], "members": [ { "name": "Tom Araya", "dob": ISODate("1961-06-06T07:00:00Z"), "primary_instrument": "Bass/Vocals" }, { "name": "Kerry King", "dob": ISODate("1964-06-03T07:00:00Z"), "primary_instrument": "Guitar" } ]}
  • 18.
    #MDBLocal mysql> show tables; +-----------------------------------------+ |Tables_in_sql | +-----------------------------------------+ | bands | | bands_members | | bands_popular_albums | mysql> SELECT * FROM bands -> JOIN bands_popular_albums ON bands._id = bands_popular_albums._id;
  • 19.
    Custom Schema Create atabular schema to meet specific needs of users
  • 20.
    #MDBLocal Document Relational DefinitionLanguage (DRDL) ● Often, we need to adjust the default schema generated by BIC schema creation logic ● Output generated by mongodrdl ● Utilizes BIC default sampling logic ● Defines both names and data types
  • 21.
    #MDBLocal Generate mongodrdl output Edit output to create desired schema Import schemato MongoDB Provide unique name Start mongosqld with named schema
  • 22.
    #MDBLocal Generate DRDL Outputusing mongodrdl mongodrdl -d sql -c bands > bands.drdl
  • 23.
    schema: - db: sql tables: -table: bands collection: bands pipeline: [] columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: band MongoType: string SqlName: band SqlType: varchar - Name: formation.city MongoType: string SqlName: formation.city SqlType: varchar - Name: formation.year MongoType: date SqlName: formation.year SqlType: timestamp - table: bands_members collection: bands pipeline: - $unwind: includeArrayIndex: members_idx path: $members preserveNullAndEmptyArrays: false columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: members.dob MongoType: date SqlName: members.dob SqlType: timestamp - Name: members.name MongoType: string SqlName: members.name SqlType: varchar - Name: members.primary_instrument MongoType: string SqlName: members.primary_instrument SqlType: varchar - Name: members_idx MongoType: int SqlName: members_idx SqlType: int Root table: bands Sub-table: bands_members
  • 24.
    - table: bands_popular_albums collection:bands pipeline: - $unwind: includeArrayIndex: popular_albums_idx path: $popular_albums preserveNullAndEmptyArrays: false columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: popular_albums MongoType: string SqlName: popular_albums SqlType: varchar - Name: popular_albums_idx MongoType: int SqlName: popular_albums_idx SqlType: int Sub-table: bands_popular_albums
  • 25.
    #MDBLocal Current vs Desired mysql>show tables; +-----------------------------------------+ | Tables_in_sql | +-----------------------------------------+ | bands | | bands_members | | bands_popular_albums | mysql> show tables; +-----------------------------------------+ | Tables_in_sql | +-----------------------------------------+ | bands |
  • 26.
    #MDBLocal Define new documentstructure with aggregation [{$unwind: { path: "$popular_albums" }}, {$project: { band: "$band", formed: "$formation.year", city: "$formation.city", popular_album: "$popular_albums" }}] schema: - db: sql tables: - table: bands collection: bands pipeline: [{$unwind: {path: "$popular_albums"}}, {$project: { band: "$band", formed: "$formation.year", city: "$formation.city", popular_album: "$popular_albums" }}] Use aggregation to create document structure: Add pipeline to drdl:
  • 27.
    #MDBLocal Edit table/field namesand data types schema: - db: sql tables: - table: bands collection: bands pipeline: [{$unwind: { path: "$popular_albums" }}, {$project: { band: "$band", formed: "$formation.year", city: "$formation.city", popular_album: "$popular_albums" }}] columns: - Name: _id MongoType: bson.ObjectId SqlName: _id SqlType: objectid - Name: band MongoType: string SqlName: band SqlType: varchar - Name: city MongoType: string SqlName: city SqlType: timestamp - Name: popular_album MongoType: string SqlName: album SqlType: varchar - Name: formed MongoType: date SqlName: formed SqlType: timestamp
  • 28.
    #MDBLocal Import schema andprovide name mongodrdl upload --drdl bands_flat.drdl --schemaSource=bic_schema mongodrdl name-schema --name bands_flat --schemaSource=bic_schema - -schema <unique ID>
  • 29.
    #MDBLocal Start mongosqld in“custom mode” mongosqld --schemaSource=bic_schemas / --schemaName=bands_flat / --schemaMode=custom
  • 30.
    #MDBLocal Updated schema: mysql> describebands; +-------+----------------+------+------+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+----------------+------+------+---------+-------+ | _id | varchar(24) | YES | PRI | NULL | | | album | varchar(65535) | YES | | NULL | | | band | varchar(65535) | YES | | NULL | | | city | datetime | YES | | NULL | | +-------+----------------+------+------+---------+-------+
  • 31.
  • 32.
    #MDBLocal ● Get youexcited about querying MongoDB using SQL ● Provide an end-to-end review of how the BI Connector enables this ● Teach you how to manage schema mappings between MongoDB and SQL tables/columns ● Demonstrate how to connect to and query MongoDB from the MySQL shell Revisiting our goals today:
  • 33.
    #MDBLocal ● Download BIConnector 2.12 ● Query MongoDB using default schema mapping ● Create a custom schema mapping ● Query MongoDB via SQL using your custom schema Now you can:
  • 34.
  • 35.