Data Modeling for Microservices with Cassandra and Spark

Strata + Hadoop World NYC Sept 26-29, 2016
Jeff Carpenter, Choice Hotels International

Agenda
1. IT Transformation – Distribution and Analytics
2. Creating a Data Architecture
3. Data Modeling for Microservices
4. Using Metadata for Diagnostics and Analytics
5. Challenges

IT Capabilities
• Corporate IT
• Guest
• Franchise Relations
• Hotel Management
• Business Intelligence
• Distribution
Diagram callout: This talk

Distribution - Central Reservation System
Diagram: the CRS and the systems around it.
• Systems: CRS, Web and Mobile, External Channels, Customer & Loyalty, Billing, Property Systems, Reporting & Analytics
• Domains: Distribution Domain, Guest Domain, Franchisee Domain, Hotel Management Domain, Business Intelligence Domain

Current Reservation System – By The Numbers
• 25 years
• 6,000 hotels
• 50 transactions / second
• 4,000 distribution channels
• 1 instance

New Systems: Distribution and Data Platforms
• Distribution Platform
• Data Platform
• Diagram labels: History, Realtime data
See: "Choice Hotels's journey to better understand its customers through self-service analytics"
This talk: how we model data and use the self-service platform

Distribution Platform - Architecture Tenets
• Cloud-native
• Microservices
• Open Source Infrastructure
• Extensibility
• Stable, Scalable, Secure

What is a Microservice? (one definition)
Diagram labels:
• Client
• REST API
• Composing Service
• Entity Service
• Message Driven Service
• AMQ, Events
• Persistence, DB
• Data Ownership

How can we design our data architecture & models to be…
• Scalable?
• Extensible?
• Maintainable?
• Analytics-ready?

Our Data Stack
• Non-relational storage
• Long Term Storage
• Logging
• Reporting & Analytics
• Metrics

Data Modeling – Then and Now
Then and now:
• Isolated Systems
• Data Dictionary
• SOA and Canonical Data Model
• Services own data
Conceptual Data Model
• Identifying domains and relationships
Logical Data Model
• Identifying data types and relationships
Physical Models
• Java APIs
• RESTful APIs (JSON)
• Events (JSON)
• Cassandra Schemas

Conceptual Data Model - Domains
• rates
• inventory
• hotels
• reservations
• offers

Conceptual Data Model – Domain Relationships
• Domains: Hotel Management Domain, Guest Domain, Distribution Domain
• Entities: hotels, guest, stay, loyalty, rates, inventory, offers, reservations

Logical Data Model – Identifying Types
Rates Domain services: Composite Rate Service, Rate Plan Service, Rate Service
Rate Plan
• id
• code
• hotelId
• effectiveDates
• Conditions
Rate
• id
• ratePlanId
• productId
• hotelId
• dateSpan
Price
• condition
• amount
Product
• id
• code
• hotelId
• features
• …
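
The types on this slide can be sketched as plain records. The Python below is a minimal illustration, not the talk's actual model: the field names come from the slide, while every type annotation, the DateSpan alias, and the covers helper are assumptions. Note that Rate points at its rate plan, product, and hotel by id rather than embedding them.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List, Tuple

# Assumption: a date span is an inclusive (start, end) pair.
DateSpan = Tuple[date, date]


@dataclass
class RatePlan:
    id: str
    code: str
    hotel_id: str
    effective_dates: DateSpan
    conditions: List[str] = field(default_factory=list)


@dataclass
class Product:
    id: str
    code: str
    hotel_id: str
    features: List[str] = field(default_factory=list)


@dataclass
class Rate:
    id: str
    rate_plan_id: str   # reference by id, not an embedded RatePlan
    product_id: str     # reference by id, not an embedded Product
    hotel_id: str
    date_span: DateSpan


@dataclass
class Price:
    condition: str
    amount: Decimal


def covers(rate: Rate, day: date) -> bool:
    """Hypothetical helper: is `day` inside the rate's date span?"""
    start, end = rate.date_span
    return start <= day <= end


# Illustrative values only; "AZ123" and "NSK" echo the sample event in the deck.
plan = RatePlan("rp-1", "RACK", "AZ123", (date(2016, 1, 1), date(2016, 12, 31)))
rate = Rate("r-1", plan.id, "NSK", "AZ123", (date(2016, 5, 1), date(2016, 5, 31)))
```

Keeping cross-type links as ids is what lets each service own its type without reaching into another service's storage.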

Standardizing Common Data Types
• Instead of a Canonical Data Model, we standardize basic building blocks
  – Feature, Category, Brand
  – Geospatial
  – Financial
  – Time
  – Contact information
Address
• lines[]
• city
• subdivision
• country
• postalCode

Data Types → Microservice Identification
• Hotel Domain: Hotel Service
• Rates Domain: Rates Service
• Inventory Domain: Inventory Service
• Offer Domain: Offer Service
• Reservation Domain: Reservation Service
• Clients: Data Maintenance Apps, Internal / External Client Apps

Physical Data Models
Physical Models
• Java APIs
• RESTful APIs (JSON)
• Events (JSON)
• Cassandra Schemas
JSON = primary definition of the data type owned by each service

Key Data Types → RESTful Resource Paths
• Hotel Service: /hotels
• Rates Service: /rates
• Inventory Service: /inventory
• Offer Service: /offers
• Reservation Service: /reservations

Java and RESTful APIs – common pattern
REST                             Java API
GET /types/<id>                  Type getTypeById()
GET /types?<query parameters>    Type[] searchType(TypeSearchCriteria)
POST /types/ (JSON body)         createType(Type)
PUT /types/ (JSON body)          updateType(Type)
DELETE /types/<id>               deleteType(TypeId)
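
To make the pairing of REST calls and service methods concrete, here is a small in-memory sketch in Python (the deck's own APIs are Java). EntityService, dispatch, and the dictionary-based entities are all hypothetical stand-ins; a real service would sit behind an HTTP framework and a database.

```python
from typing import Any, Dict, List, Optional

Entity = Dict[str, Any]


class EntityService:
    """In-memory stand-in for one entity service (hypothetical, no persistence)."""

    def __init__(self) -> None:
        self._items: Dict[str, Entity] = {}

    def get_by_id(self, entity_id: str) -> Optional[Entity]:
        return self._items.get(entity_id)

    def search(self, criteria: Entity) -> List[Entity]:
        # An entity matches when every criterion equals the stored value.
        return [e for e in self._items.values()
                if all(e.get(k) == v for k, v in criteria.items())]

    def create(self, entity: Entity) -> None:
        if entity["id"] in self._items:
            raise KeyError(f"{entity['id']} already exists")
        self._items[entity["id"]] = dict(entity)

    def update(self, entity: Entity) -> None:
        if entity["id"] not in self._items:
            raise KeyError(f"{entity['id']} not found")
        self._items[entity["id"]] = dict(entity)

    def delete(self, entity_id: str) -> None:
        self._items.pop(entity_id, None)


def dispatch(service: EntityService, method: str, path: str,
             query: Optional[Entity] = None, body: Optional[Entity] = None):
    """Route one REST-style call onto the service, following the mapping above."""
    parts = [p for p in path.split("/") if p]  # "/hotels/AZ123" -> ["hotels", "AZ123"]
    entity_id = parts[1] if len(parts) > 1 else None
    if method == "GET" and entity_id:
        return service.get_by_id(entity_id)
    if method == "GET":
        return service.search(query or {})
    if method == "POST":
        return service.create(body)
    if method == "PUT":
        return service.update(body)
    if method == "DELETE" and entity_id:
        return service.delete(entity_id)
    raise ValueError(f"unsupported call: {method} {path}")
```

Because every type follows the same five operations, the routing logic can be written once and reused per resource path.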

Cassandra Data Modeling (an idealized view)

Cassandra Data Modeling – Access Patterns
Application workflow and the queries (Q1 to Q9) that support it:
• Q1: View hotels near POI
• Q2: View hotel info
• Q3: Show POIs near hotel
• Q4: Shop for rooms at hotel
• Q5: View room details
• Book a room
• Q6 to Q8: View reservation by confirmation number, View hotel reservations for a date, Find reservation by guest name
• Q9: View guest details

Cassandra Data Modeling – Chebotko Diagrams
Legend: K = partition key column, C↑ = clustering column (ascending order)
• Q1 hotels_by_poi: poi_name (K), hotel_id (C↑), name, phone, address
• Q2 hotels: hotel_id (K), name, phone, address
• Q3 pois_by_hotel: hotel_id (K), poi_name (C↑), description
• Q4 available_rooms_by_hotel_date: hotel_id (K), date (C↑), room_number (C↑), is_available
• Q5 amenities_by_room: hotel_id (K), room_id (K), amenity_name (C↑), description
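
In this notation, K marks a partition key column and C↑ a clustering column sorted in ascending order. The toy Python class below imitates that behavior for hotels_by_poi in memory: rows are grouped by poi_name and returned ordered by hotel_id. It is only an analogy for how the table answers Q1, not Cassandra itself; the class, its methods, and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


class HotelsByPoi:
    """Toy model of hotels_by_poi: K = poi_name, C↑ = hotel_id."""

    def __init__(self) -> None:
        # partition key -> {clustering key -> remaining columns}
        self._partitions: Dict[str, Dict[str, Dict[str, Any]]] = defaultdict(dict)

    def insert(self, poi_name: str, hotel_id: str, **columns: Any) -> None:
        # Writing the same primary key again overwrites the row (an upsert).
        self._partitions[poi_name][hotel_id] = dict(columns)

    def select(self, poi_name: str) -> List[Dict[str, Any]]:
        """Read one partition; rows come back in clustering order."""
        rows = self._partitions.get(poi_name, {})
        return [{"poi_name": poi_name, "hotel_id": hid, **rows[hid]}
                for hid in sorted(rows)]


table = HotelsByPoi()
table.insert("Times Square", "NY456", name="Hotel B")
table.insert("Times Square", "AZ123", name="Hotel A")
nearby = table.select("Times Square")  # ordered by hotel_id, not by insert order
```

The query names exactly one partition, which is why each access pattern in this style of modeling gets a table keyed by what the query filters on.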

Cassandra Data Modeling - Physical
hotel keyspace
• hotels_by_poi: poi_name text (K), hotel_id text (C↑), name text, phone text, address *address*
• hotels: hotel_id text (K), name text, phone text, address *address*
• pois_by_hotel: hotel_id text (K), poi_name text (C↑), description text
• available_rooms_by_hotel_date: hotel_id text (K), date date (C↑), room_number smallint (C↑), is_available boolean
• amenities_by_room: hotel_id text (K), room_number smallint (K), amenity_name text (C↑), description text
• *address* (user-defined type): street text, city text, state_or_province text, postal_code text, country text

Cassandra Data Modeling - Schemas

    CREATE KEYSPACE hotel
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

    CREATE TYPE hotel.address (
      street text,
      city text,
      state_or_province text,
      postal_code text,
      country text
    );

    CREATE TABLE hotel.hotels_by_poi (
      poi_name text,
      hotel_id text,
      name text,
      phone text,
      address frozen<address>,
      PRIMARY KEY ((poi_name), hotel_id)
    ) WITH CLUSTERING ORDER BY (hotel_id ASC);

And now… Back to reality

Access Patterns and Denormalization
Access patterns:
• Locate hotel by identifier
• Find hotels within X miles of point Y
• Find hotels by city, state, country
• Find hotels by postal code
• Hotels by amenity
• Find hotels by brand
Keyspace hotel:
• hotels_by_id
• hotels_by_brand
• hotels_by_postal_code
• …
Hotels by this, hotels by that, hotels by something else
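
Denormalizing this way means one logical write fans out to every table that serves an access pattern, and a changed brand or postal code must also remove the stale copy. The Python sketch below shows that bookkeeping in memory for the three named tables. HotelStore and the brand and postal_code field names are assumptions; in Cassandra the fan-out would be issued as writes to the real tables, which the deck later pairs with a logged batch.

```python
from collections import defaultdict
from typing import Any, Dict, List

Hotel = Dict[str, Any]


class HotelStore:
    """Toy stand-in for the hotel keyspace: one dict per denormalized table."""

    def __init__(self) -> None:
        self.hotels_by_id: Dict[str, Hotel] = {}
        self.hotels_by_brand: Dict[str, Dict[str, Hotel]] = defaultdict(dict)
        self.hotels_by_postal_code: Dict[str, Dict[str, Hotel]] = defaultdict(dict)

    def save(self, hotel: Hotel) -> None:
        hotel_id = hotel["hotel_id"]
        old = self.hotels_by_id.get(hotel_id)
        if old is not None:
            # Remove stale copies in case brand or postal code changed.
            self.hotels_by_brand[old["brand"]].pop(hotel_id, None)
            self.hotels_by_postal_code[old["postal_code"]].pop(hotel_id, None)
        row = dict(hotel)
        # The same row is written once per access pattern.
        self.hotels_by_id[hotel_id] = row
        self.hotels_by_brand[row["brand"]][hotel_id] = row
        self.hotels_by_postal_code[row["postal_code"]][hotel_id] = row

    def find_by_brand(self, brand: str) -> List[str]:
        return sorted(self.hotels_by_brand.get(brand, {}))

    def find_by_postal_code(self, postal_code: str) -> List[str]:
        return sorted(self.hotels_by_postal_code.get(postal_code, {}))
```

Each new access pattern adds another table and another write in save, which is the maintenance cost the slide is pointing at.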

Metadata
Request Context
• Requestor
• Tracking ID
• Token
• Locale
Diagram labels: Incoming Request, Service, AMQ, Events, Logs, ELK Stack

Asynchronous events
Event
• Type
  – Create
  – Update
  – Delete
• Request Context
• Old entity
• New entity
Request Context
• Requestor
• Tracking ID
• Token
• Locale
Entity (old/new)
• Id
• …
Sample Inventory Event

    {
      "type": "UPDATE",
      "trackingId": "0da7b794-f2c3-…",
      "requestor": "Legacy CRS",
      "newEntity": {
        "hotelId": "AZ123",
        "productId": "NSK",
        "date": "2016-05-20",
        "consumedCount": "22",
        "totalCount": "25"
      },
      "oldEntity": {
        "hotelId": "AZ123",
        "productId": "NSK",
        "date": "2016-05-20",
        "consumedCount": "20",
        "totalCount": "25"
      }
    }
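
Carrying both the old and the new entity lets any subscriber work out what changed without calling back to the owning service. The Python sketch below builds an event in the shape of the sample and derives the changed fields. build_event and changed_fields are hypothetical helpers, and the tracking ID is generated here rather than taken from the slide.

```python
import uuid
from typing import Any, Dict, Optional, Tuple

Entity = Dict[str, Any]

EVENT_TYPES = ("CREATE", "UPDATE", "DELETE")


def build_event(event_type: str, context: Dict[str, str],
                old_entity: Optional[Entity],
                new_entity: Optional[Entity]) -> Dict[str, Any]:
    """Assemble an event that copies request-context metadata onto the payload."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    event: Dict[str, Any] = {
        "type": event_type,
        "trackingId": context["trackingId"],
        "requestor": context["requestor"],
    }
    if new_entity is not None:
        event["newEntity"] = new_entity
    if old_entity is not None:
        event["oldEntity"] = old_entity
    return event


def changed_fields(event: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each changed field to its (old, new) pair."""
    old = event.get("oldEntity") or {}
    new = event.get("newEntity") or {}
    return {key: (old.get(key), new.get(key))
            for key in sorted(set(old) | set(new))
            if old.get(key) != new.get(key)}


context = {"trackingId": str(uuid.uuid4()), "requestor": "Legacy CRS"}
old = {"hotelId": "AZ123", "productId": "NSK", "date": "2016-05-20",
       "consumedCount": "20", "totalCount": "25"}
new = dict(old, consumedCount="22")
event = build_event("UPDATE", context, old, new)
```

The tracking ID travels with the event, so a log search on that one value can tie the incoming request to everything it triggered downstream.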

Putting It Together – Diagnostics
Diagram labels:
• Incoming Request, Service
• Outputs: Data, History, Logs, Metrics
• Destinations: C* cluster (nodes), Data Platform, ELK Stack, Metrics Store

Putting It Together – Long Term Storage
Diagram labels: Metrics Store, ELK Stack, Data Platform, C* cluster (nodes), Long Term Storage

Separating Active and History Data
Yesterday’s data is ancient history
Timeline labels: Rate + Inventory Data, Time, Now

Data Platform - Cloudera History architecture
Diagram labels:
• History capture: Service, AMQ, Kafka, S3, Other subscribers
• History retrieval: Customer Service Apps, History Service
• Cluster: Spark, nodes, Impala*

Microservice Data Challenges
• No Joins?
• Data Maintenance
• Data Integrity
• Cascading Deletes
• Transactions

Distributed Transactions, Anyone?
Diagram labels:
• Booking Client
• Reservation Service (reservations): Commit the contract
• Inventory Service (inventory): Reserve the inventory
• Data Maintenance Apps: Data synchronization
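
A booking touches two services that each own their data, so no single database transaction can cover both steps. One way to coordinate them, sketched below in Python, is a compensating action: reserve the inventory, try to commit the contract, and release the inventory if the commit fails. The classes and the book function are hypothetical in-memory stand-ins, not the talk's implementation.

```python
from typing import List


class InventoryService:
    """Stand-in for the service that owns inventory counts."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.consumed = 0

    def reserve(self, count: int = 1) -> None:
        if self.consumed + count > self.total:
            raise RuntimeError("sold out")
        self.consumed += count

    def release(self, count: int = 1) -> None:
        self.consumed = max(0, self.consumed - count)


class ReservationService:
    """Stand-in for the service that owns reservations."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail  # simulate an outage for the demo
        self.reservations: List[str] = []

    def commit(self, guest: str) -> int:
        if self.fail:
            raise RuntimeError("reservation store unavailable")
        self.reservations.append(guest)
        return len(self.reservations)  # confirmation number


def book(inventory: InventoryService, reservations: ReservationService,
         guest: str) -> int:
    """Reserve the inventory, then commit the contract; undo step one on failure."""
    inventory.reserve()
    try:
        return reservations.commit(guest)
    except Exception:
        inventory.release()  # compensating action for the earlier reserve
        raise
```

The result is eventually consistent rather than atomic: between the reserve and the release, other readers can briefly see the room as taken.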

Alternatives to Distributed Transactions
Approach                      Example                                          Scope
C* Lightweight Transaction    Updating inventory counts                        Data Tier
C* Logged Batch               Writing to multiple denormalized hotel tables    Data Tier
Retrying failed calls         Data synchronization, reservation processing     Service
Compensating transactions     Verifying reservation processing                 System
Axis labels: Eventual consistency, Strong consistency
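
A Cassandra lightweight transaction is a conditional write: in CQL, an UPDATE with an IF clause applies only when the stored value matches the expected one. The Python sketch below is an in-memory analogue of that compare-and-set for inventory counts, combined with a bounded retry when another writer gets there first. InventoryRow, apply_if, and consume are hypothetical names, and nothing here talks to Cassandra.

```python
class InventoryRow:
    """Stand-in for one inventory row with a conditional update."""

    def __init__(self, consumed: int, total: int) -> None:
        self.consumed = consumed
        self.total = total

    def read(self) -> int:
        return self.consumed

    def apply_if(self, expected: int, new_value: int) -> bool:
        """Write new_value only if the stored count still equals `expected`."""
        if self.consumed != expected:
            return False  # someone else changed the row; nothing is written
        self.consumed = new_value
        return True


def consume(row: InventoryRow, count: int = 1, max_attempts: int = 3) -> bool:
    """Consume inventory; return False when not enough rooms are left."""
    for _ in range(max_attempts):
        current = row.read()
        if current + count > row.total:
            return False
        if row.apply_if(current, current + count):
            return True
        # Condition failed: re-read and try again with the fresh value.
    raise RuntimeError("too much contention; giving up")


row = InventoryRow(consumed=20, total=25)
sold = consume(row, 2)  # mirrors the sample event: consumed count 20 -> 22
```

The condition is what prevents two bookings from both reading 20 and both writing 21, at the cost of an extra round of coordination per write.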

Final Thoughts
• Data Models > Microservices
• Events = Streams
• Use Metadata Everywhere

Now Available!
Cassandra: The Definitive Guide, 2nd Edition
Completely reworked for Cassandra 3.X:
• Data modeling in CQL
• SASI indexes
• Materialized views
• Lightweight transactions
• DataStax drivers
• New chapters on security, deployment, and integration

Contact Info
• @choicehotels
• careers.choicehotels.com
• @jscarp
• jeffreyscarpenter
