Strata + Hadoop World NYC Sept 26-29, 2016. Page 1: Data modeling for microservices with Cassandra and Spark (Jeff Carpenter, Choice Hotels International)
Page 2: Agenda. 1 IT Transformation – Distribution and Analytics; 2 Creating a Data Architecture; 3 Data Modeling for Microservices; 4 Using Metadata for Diagnostics and Analytics; 5 Challenges
Page 3: IT Capabilities. Corporate IT; Guest; Franchise Relations; Hotel Management; Business Intelligence; Distribution. Callout: This talk
Page 4: Distribution - Central Reservation System. CRS; Web and Mobile; External Channels; Customer & Loyalty; Billing; Property Systems; Reporting & Analytics. Domains: Distribution Domain; Guest Domain; Franchisee Domain; Hotel Management Domain; Business Intelligence Domain
Page 5: Current Reservation System – By The Numbers. 25 years; 6,000 hotels; 50 transactions / second; 4,000 distribution channels; 1 instance
Page 6: New Systems: Distribution and Data Platforms. Distribution Platform; Data Platform; History; Realtime data. See: Choice Hotels's journey to better understand its customers through self-service analytics. This Talk: how we model data and use the self-service platform
Page 7: Distribution Platform - Architecture Tenets. Cloud-native; Microservices; Open Source Infrastructure; Extensibility; Stable, Scalable, Secure
Page 8: What is a Microservice? (one definition). Data Ownership; Message Driven Service; Entity Service; Composing Service; Client; REST API; AMQ; Events; DB; Persistence
Page 9: How can we design our data architecture & models to be… • Scalable? • Extensible? • Maintainable? • Analytics-ready?
Page 10: Our Data Stack. Non-relational storage; Long Term Storage; Logging; Reporting & Analytics; Metrics
Page 11: Data Modeling – Then and Now. Isolated Systems; Data Dictionary; SOA and Canonical Data Model; Services own data. Conceptual Data Model: identifying domains and relationships. Logical Data Model: identifying data types and relationships. Physical Models: Java APIs; RESTful APIs (JSON); Events (JSON); Cassandra Schemas
Page 12: Conceptual Data Model - Domains. rates; inventory; hotels; reservations; offers
Page 13: Conceptual Data Model – Domain Relationships. Hotel Management Domain; Guest Domain; Distribution Domain. Entities: hotels; guest; stay; loyalty; rates; inventory; offers; reservations
Page 14: Logical Data Model – Identifying Types. Rates Domain: Composite Rate Service; Rate Plan Service; Rate Service. Rate Plan: id, code, hotelId, effectiveDates, Conditions. Rate: id, ratePlanId, productId, hotelId, dateSpan. Price: condition, amount. Product: id, code, hotelId, features, …
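The types on this slide could be written down roughly as follows. This is only a sketch: the field names follow the slide, but the Python types, the nesting of prices inside a rate, and the `price_for` helper are assumptions made for illustration. Note that a rate points at its rate plan and product by ID only, which is what lets each type be owned by a different service.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Price:
    condition: str   # e.g. "1 person", "2 persons", "extra person"
    amount: Decimal


@dataclass(frozen=True)
class RatePlan:
    id: str
    code: str
    hotel_id: str
    effective_start: date
    effective_end: date


@dataclass(frozen=True)
class Rate:
    id: str
    rate_plan_id: str   # ID reference to a RatePlan owned by another service
    product_id: str     # ID reference to a Product outside the rates domain
    hotel_id: str
    start: date
    end: date
    prices: List[Price] = field(default_factory=list)

    def price_for(self, condition: str) -> Optional[Decimal]:
        """Return the amount for a price point, or None if it is not defined."""
        for price in self.prices:
            if price.condition == condition:
                return price.amount
        return None
```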
Page 15: Standardizing Common Data Types. Instead of a Canonical Data Model, we standardize basic building blocks: Feature, Category, Brand; Geospatial; Financial; Time; Contact information. Address: lines[], city, subdivision, country, postalCode
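A minimal sketch of the Address building block, with the variable-length `lines` list bounded by validation as the speaker notes describe. The limit of four lines and the exact error messages are assumptions; the talk does not state what limit was used.

```python
from dataclasses import dataclass
from typing import List

MAX_ADDRESS_LINES = 4  # assumed limit for illustration; not given in the talk


@dataclass(frozen=True)
class Address:
    lines: List[str]
    city: str
    subdivision: str
    country: str
    postal_code: str

    def __post_init__(self) -> None:
        # Validation replaces the fixed "line1/line2/line3" fields of older systems.
        if not self.lines:
            raise ValueError("address needs at least one line")
        if len(self.lines) > MAX_ADDRESS_LINES:
            raise ValueError(
                f"address has {len(self.lines)} lines; at most {MAX_ADDRESS_LINES} allowed"
            )
```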
Page 16: Data Types → Microservice Identification. Hotel Service (Hotel Domain); Rates Service (Rates Domain); Inventory Service (Inventory Domain); Offer Service (Offer Domain); Reservation Service (Reservation Domain). Data Maintenance Apps; Internal / External Client Apps
Page 17: Physical Data Models. Physical Models: Java APIs; RESTful APIs (JSON); Events (JSON); Cassandra Schemas. JSON = primary definition of the data type owned by each service
Page 18: Key Data Types → RESTful Resource Paths. Hotel Service: /hotels; Rates Service: /rates; Inventory Service: /inventory; Offer Service: /offers; Reservation Service: /reservations
Page 19: Java and RESTful APIs – common pattern
  REST                             Java API
  GET /types/<id>                  Type getTypeById()
  GET /types?<query parameters>    Type[] searchType(TypeSearchCriteria)
  POST /types/ (JSON body)         createType(Type)
  PUT /types/ (JSON body)          updateType(Type)
  DELETE /types/<id>               deleteType(TypeId)
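Because every service follows the same five-operation pattern, the REST paths and Java method names can be generated from a type name. The sketch below shows one way that might look; `Operation` and `crud_operations` are hypothetical names, and the resource path is passed in explicitly rather than derived, since paths such as /inventory are not simple plurals.

```python
from typing import List, NamedTuple


class Operation(NamedTuple):
    http_method: str
    path: str
    java_method: str


def crud_operations(type_name: str, resource_path: str) -> List[Operation]:
    """Expand one data type into the five operations of the common pattern."""
    return [
        Operation("GET", f"{resource_path}/<id>", f"get{type_name}ById"),
        Operation("GET", f"{resource_path}?<query parameters>", f"search{type_name}"),
        Operation("POST", f"{resource_path}/", f"create{type_name}"),
        Operation("PUT", f"{resource_path}/", f"update{type_name}"),
        Operation("DELETE", f"{resource_path}/<id>", f"delete{type_name}"),
    ]
```

For example, `crud_operations("Hotel", "/hotels")` yields entries such as `GET /hotels/<id>` paired with `getHotelById`.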
Page 20: Cassandra Data Modeling (an idealized view)
Page 21: Cassandra Data Modeling – Access Patterns. Workflow steps: View hotels near POI; View hotel info; Show POIs near hotel; Shop for rooms at hotel; View room details; Book a room; View reservation by confirmation number; View hotel reservations for a date; Find reservation by guest name; View guest details. Queries are labeled Q1 through Q9 (Q9 appears three times).
Page 22: Cassandra Data Modeling – Chebotko Diagrams (K = partition key column, C↑ = clustering column, ascending). Queries shown: Q1, Q2, Q3, Q4, Q5.
  hotels_by_poi: poi_name (K), hotel_id (C↑), name, phone, address
  hotels: hotel_id (K), name, phone, address
  pois_by_hotel: hotel_id (K), poi_name (C↑), description
  available_rooms_by_hotel_date: hotel_id (K), date (C↑), room_number (C↑), is_available
  amenities_by_room: hotel_id (K), room_id (K), amenity_name (C↑), description
Page 23: Cassandra Data Modeling - Physical (hotel keyspace)
  hotels_by_poi: poi_name text (K), hotel_id text (C↑), name text, phone text, address *address*
  hotels: hotel_id text (K), name text, phone text, address *address*
  pois_by_hotel: hotel_id text (K), poi_name text (C↑), description text
  available_rooms_by_hotel_date: hotel_id text (K), date date (C↑), room_number smallint (C↑), is_available boolean
  amenities_by_room: hotel_id text (K), room_number smallint (K), amenity_name text (C↑), description text
  *address* (user-defined type): street text, city text, state_or_province text, postal_code text, country text
Page 24: Cassandra Data Modeling - Schemas

    CREATE KEYSPACE hotel
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

    -- User-defined type shared by the hotel tables
    CREATE TYPE hotel.address (
      street text,
      city text,
      state_or_province text,
      postal_code text,
      country text
    );

    -- One partition per point of interest, hotels ordered by hotel_id
    CREATE TABLE hotel.hotels_by_poi (
      poi_name text,
      hotel_id text,
      name text,
      phone text,
      address frozen<address>,
      PRIMARY KEY ((poi_name), hotel_id)
    ) WITH CLUSTERING ORDER BY (hotel_id ASC);
Page 25: And now… Back to reality
Page 26: Access Patterns and Denormalization. Keyspace hotel. Access patterns: Locate hotel by identifier; Find hotels within X miles of point Y; Find hotels by city, state, country; Find hotels by postal code; Hotels by amenity; Find hotels by brand. Tables: hotels_by_id; hotels_by_brand; hotels_by_postal_code; …; Hotels by this; Hotels by that; Hotels by something else
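The speaker notes for this slide mention limiting the number of tables by letting one table serve several queries and filtering the remainder in the service layer. A rough sketch of that idea follows, using an in-memory stand-in for a table partitioned by postal code. The class and function names are hypothetical, and the choice of postal code as the partition key is an assumption for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Hotel:
    hotel_id: str
    brand: str
    postal_code: str
    amenities: FrozenSet[str]


class HotelsByPostalCode:
    """In-memory stand-in for a Cassandra table partitioned by postal_code."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Hotel]] = defaultdict(list)

    def insert(self, hotel: Hotel) -> None:
        self._partitions[hotel.postal_code].append(hotel)

    def select(self, postal_code: str) -> List[Hotel]:
        # Equivalent in spirit to a single-partition read.
        return list(self._partitions.get(postal_code, []))


def find_hotels(
    table: HotelsByPostalCode,
    postal_code: str,
    brand: Optional[str] = None,
    amenity: Optional[str] = None,
) -> List[Hotel]:
    """Read one partition, then apply the extra criteria in the service layer."""
    rows = table.select(postal_code)
    if brand is not None:
        rows = [h for h in rows if h.brand == brand]
    if amenity is not None:
        rows = [h for h in rows if amenity in h.amenities]
    return rows
```

The trade-off is that the service reads a whole partition and discards some rows, which is only reasonable when partitions stay small; a dedicated table or a materialized view is the alternative when they do not.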
Page 27: Metadata. Request Context: Requestor; Tracking ID; Token; Locale. Incoming Request; Service; AMQ; Events; Logs; ELK Stack
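One way to picture the request context is as a small immutable value that travels with every call and is stamped onto each log record, so that a single tracking ID ties together entries from different services. The sketch below is an assumption about shape only: the field names follow the slide, while the JSON log format, the default locale, and the decision to keep the token out of the logs are illustrative choices.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class RequestContext:
    requestor: str
    token: str
    locale: str = "en-US"
    tracking_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def log_fields(self) -> Dict[str, str]:
        # The token is left out on purpose so credentials do not reach the logs.
        return {
            "requestor": self.requestor,
            "trackingId": self.tracking_id,
            "locale": self.locale,
        }


def log_line(context: RequestContext, service: str, message: str) -> str:
    """Render one structured log record carrying the request context."""
    record = {"service": service, "message": message, **context.log_fields()}
    return json.dumps(record, sort_keys=True)
```

Searching the log store for one `trackingId` then returns the full path of a request across services.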
Page 28: Asynchronous events
Event: Type (Create, Update, Delete); Request Context; Old entity; New entity
Request Context: Requestor; Tracking ID; Token; Locale
Entity (old/new): Id; …
Sample Inventory Event:

    {
      "type": "UPDATE",
      "trackingId": "0da7b794-f2c3-…",
      "requestor": "Legacy CRS",
      "newEntity": {
        "hotelId": "AZ123",
        "productId": "NSK",
        "date": "2016-05-20",
        "consumedCount": "22",
        "totalCount": "25"
      },
      "oldEntity": {
        "hotelId": "AZ123",
        "productId": "NSK",
        "date": "2016-05-20",
        "consumedCount": "20",
        "totalCount": "25"
      }
    }
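To make the event shape concrete, here is a minimal sketch of an entity service that owns a data type and publishes a create, update, or delete event carrying the old and new entity. The in-memory dictionary stands in for Cassandra and the `publish` callback stands in for the message broker; `EntityService` and its method names are hypothetical.

```python
from typing import Any, Callable, Dict, Optional

Entity = Dict[str, Any]
Event = Dict[str, Any]


class EntityService:
    """Owns one data type: persists it and publishes an event on every change."""

    def __init__(self, publish: Callable[[Event], None]) -> None:
        self._store: Dict[str, Entity] = {}   # stand-in for the service's tables
        self._publish = publish               # stand-in for the message broker

    def _emit(self, event_type: str, context: Dict[str, str],
              old: Optional[Entity], new: Optional[Entity]) -> None:
        self._publish({
            "type": event_type,
            "trackingId": context["trackingId"],
            "requestor": context["requestor"],
            "oldEntity": old,
            "newEntity": new,
        })

    def create(self, key: str, entity: Entity, context: Dict[str, str]) -> None:
        if key in self._store:
            raise KeyError(f"{key} already exists")
        self._store[key] = dict(entity)
        self._emit("CREATE", context, None, dict(entity))

    def update(self, key: str, entity: Entity, context: Dict[str, str]) -> None:
        old = self._store[key]
        self._store[key] = dict(entity)
        self._emit("UPDATE", context, old, dict(entity))

    def delete(self, key: str, context: Dict[str, str]) -> None:
        old = self._store.pop(key)
        self._emit("DELETE", context, old, None)
```

Carrying both the old and new entity means a subscriber can compute the change without calling back into the owning service.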
Page 29: Putting It Together – Diagnostics. Incoming Request; Service; C* (node, node, node, node); Data; History; Logs; Metrics; Data Platform; ELK Stack; Metrics Store
Page 30: Putting It Together – Long Term Storage. Data Platform; C* (node, node, node, node); ELK Stack; Metrics Store; Long Term Storage
Page 31: Separating Active and History Data. Rate + Inventory Data; Time; Now. Yesterday’s data is ancient history
Page 32: Data Platform - Cloudera History architecture. Service; AMQ; Kafka; S3; Other subscribers; History capture; History retrieval; Customer Service Apps; History Service; Spark (node, node, node, node); Impala*
Page 33: Microservice Data Challenges. No Joins?; Data Maintenance; Data Integrity; Cascading Deletes; Transactions
Page 34: Distributed Transactions, Anyone? Booking Client; Data Maintenance Apps; Reservation Service (reservations): Commit the contract; Inventory Service (inventory): Reserve the inventory; Data synchronization
Page 35: Alternatives to Distributed Transactions
  Approach                     Example                                         Scope
  C* Lightweight Transaction   Updating inventory counts                       Data Tier
  C* Logged Batch              Writing to multiple denormalized hotel tables   Data Tier
  Retrying failed calls        Data synchronization, reservation processing    Service
  Compensating transactions    Verifying reservation processing                System
  Spectrum labels: Eventual consistency; Strong consistency
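The first row of the table, using a lightweight transaction to update inventory counts, amounts to a compare-and-set with a retry loop in the caller. The sketch below simulates that in memory: `update_if` mimics a conditional write of the form `UPDATE ... SET consumed_count = ? WHERE ... IF consumed_count = ?`, and returns whether the condition held. The table layout, the `reserve` helper, and the retry limit are assumptions, not the talk's implementation.

```python
import threading
from typing import Dict, Tuple

Key = Tuple[str, str, str]  # (hotel_id, product_id, date)


class InventoryTable:
    """In-memory stand-in for a table updated with conditional writes."""

    def __init__(self) -> None:
        self._rows: Dict[Key, Tuple[int, int]] = {}  # key -> (consumed, total)
        self._lock = threading.Lock()

    def put(self, key: Key, consumed: int, total: int) -> None:
        with self._lock:
            self._rows[key] = (consumed, total)

    def read(self, key: Key) -> Tuple[int, int]:
        with self._lock:
            return self._rows[key]

    def update_if(self, key: Key, expected_consumed: int, new_consumed: int) -> bool:
        """Apply the write only if consumed still equals the value the caller read."""
        with self._lock:
            consumed, total = self._rows[key]
            if consumed != expected_consumed:
                return False
            self._rows[key] = (new_consumed, total)
            return True


def reserve(table: InventoryTable, key: Key, rooms: int, max_attempts: int = 3) -> bool:
    """Try to consume inventory without overselling; retry if another writer won."""
    for _ in range(max_attempts):
        consumed, total = table.read(key)
        if consumed + rooms > total:
            return False  # not enough rooms left
        if table.update_if(key, consumed, consumed + rooms):
            return True
    return False
```

The condition check keeps two concurrent bookings from both consuming the last room, at the cost of an extra round of coordination per write.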
Page 36: Final Thoughts. Data Models > Microservices; Events = Streams; Use Metadata Everywhere
Page 37: Now Available! Cassandra: The Definitive Guide, 2nd Edition. Completely reworked for Cassandra 3.X: • Data modeling in CQL • SASI indexes • Materialized views • Lightweight transactions • DataStax drivers • New chapters on security, deployment, and integration
Page 38: Contact Info. @choicehotels; careers.choicehotels.com; @jscarp; jeffreyscarpenter

Data Modeling for Microservices with Cassandra and Spark

Editor's Notes

  • #2  Thanks for staying for the last session of the conference. I’m Jeff Carpenter, system architect at Choice Hotels International (CHI). In this session we’re going to talk about why data modeling is important for both transactional and analytics systems, and how we’ve put this into practice at Choice as we build new systems.
  • #3 Overview of what we'll cover
  • #4 Choice Hotels is a technology-centric company. We operate according to a franchise model, and a lot of the value proposition of our IT organization is based on the services we provide to our franchisees. As a result we’re continually innovating and looking to modernize key systems. We’ve divided our IT capabilities into several domains, including guest, franchise relations, hotel property management, and our corporate IT services. This presentation focuses on systems in the distribution domain, such as our reservation system, and the business intelligence (BI) domain, which includes analytics and reporting systems, and in particular on the relationship between the two.
  • #5 Key systems in the distribution domain include the Central Reservation System (CRS), our website, and our mobile apps. Last year we launched new versions of our website and mobile app, and we are currently working on a new reservation system. The reservation system interfaces to many of our other IT systems, so replacing it is a major undertaking. Internal channels like our website and mobile applications allow customers to shop and book rooms, as do external channels. We interface with property systems so our franchisees can tell us about their room types, rates, and inventory. We interface with customer and loyalty systems to credit stays and support reward reservations. Reporting and billing systems pull information about reservations.
  • #6 Our current reservation system is over 25 years old. It is written in C and runs on a large UNIX box with a traditional RDBMS. We’re currently making reservations for over 6000 hotels worldwide, and distributing through over 50 different channels, everything from our own website and mobile apps to GDS and OTA partners. This system is very performant and reliable, servicing over 4000 TPS. However, the system scales vertically, and we need horizontal scalability for future growth.
  • #7 Pulling back the covers a bit, we have the unique opportunity to make major improvements to two important areas of the enterprise at once. We’re replacing our legacy Central Reservation System with a new Distribution Platform, which we hope will have the longevity of the previous one. We’re also modernizing our business intelligence approach with a new data platform. One of the major themes of this talk is the relationship between these two systems, and the role that data modeling plays in it. Specifically, we’re using the data platform to capture changes as they occur in the distribution platform for analytics purposes. There are also some use cases in the distribution system where we need to access that historic data for customer service and diagnostic purposes, so we are actually implementing a limited capability to pull some of that data back out. I’m not going to give a detailed presentation of the self-service data platform, because my colleague Narasimhan from Choice and Avinash from Clairvoyant have already provided a great talk on that earlier today. Instead we’ll focus on how the distribution platform makes use of capabilities of the data platform.
  • #8 Here are some of the tenets of our architecture. We designed for the cloud, to run anywhere, in multiple data centers worldwide. We wanted a microservices architecture based primarily on RESTful APIs and event publishing. We use open source infrastructure as much as feasible. Since this system needs to work for the next 25 years, we want a design which is easily extensible to new features and business areas. The key architectural "-ilities" we repeat again and again are scalability, stability and security.
  • #9 So when I say we have a microservice-based architecture, that could mean a lot of things. We use a mixture of synchronous and asynchronous approaches in order to support shopping and booking hotel rooms and notification of various partners. In our architecture, a typical microservice exposes a RESTful API which allows it to be accessed by clients. The entity service manages the persistence of a specific data type, and publishes events when data is created, updated, or deleted. We have other types of services which compose the entity services, and message-driven services which respond to events and generate other events or interface to external systems. But the bedrock principle I want to call out is data ownership: every data type is owned by a single service, and that service owns its persistence.
  • #10 So, designing a new distribution platform – a greenfield design, so many choices available, the world is new…
  • #11 First, let’s talk about our technology selection. These are a few of the elements in our data stack. We use Cassandra as our primary data store. We use Amazon S3 and Glacier for medium to long term storage. We use the ELK stack for logging. We use Spark, Impala and other technologies in our data platform for reporting and analytics. We use KairosDB as our metrics store and Grafana to construct operational dashboards. For messaging, we use ActiveMQ when message ordering is required and Kafka for fast streaming between systems.
  • #12 With that technology foundation, we recognized at the project outset the important role that data modeling would play in the development of our reservation system, especially since it touches so many other systems. In the past, our enterprise consisted of multiple stovepipe systems, each with its own data model. There were efforts to reconcile these in a corporate data dictionary, which was a massive undertaking. As we started doing more SOA-style work, we began wrapping many of these systems, and developed a canonical data model as a way to enforce a common language across these services. This centralized definition proved difficult to maintain, and when we started work on the new reservation system, we decided to allow each service to own its own data model. We used the classic levels of data modeling (conceptual, logical, and physical) to drive our identification of data types in the system and the microservices that manage each data type, and then the various physical representations we need to drive software development. The next few slides take us through that process.
  • #13 Let’s introduce some of the key data types within the distribution domain. Hotels: descriptive data about the hotels, their products, and policies; quite static. Rates: prices that are charged for the products; these can change many times a day, and could include an automated pricing system. Inventory: constantly changes as rooms are booked, cancelled, etc.; data quality and currency is extremely important here so we don’t oversell our hotels. Reservations: the contract with the customer; generally only changed when initiated by the customer, so changes are infrequent. (Talk to nuances on inventory buckets, rate plans, packages, rules)
  • #14 An important part of our conceptual data model was defining the boundaries and relationships between domains. This includes the distribution domain and its sub-domains, and relationships to other domains. As we see, the inventory and rates domains reference the hotel domain: the inventory and rates are for products at specific hotels. Offers and reservations, in turn, reference hotels, inventory and rates. As we look to relationships outside the distribution domain, a reservation can reference customer and loyalty accounts that are managed by other systems. In these cases our reservation system holds references to those external data types, since the system of record is external. Another interesting case comes when there are data types that form the boundary of relationships between systems. This occurs in the case of the reservation. Reservations are created and managed both by central reservation systems and by property management systems, which reside in the Hotel Management domain. In our case, we have an internal representation of a reservation which forms the basis of our exchange with the property management systems. They also need a copy of the reservation so they can manage the guest stay. While it is possible for a reservation itself to be updated while the guest is on property (for example, adding an extra night), we’ve made a clear boundary so that stay information doesn’t start creeping into our reservation definition.
  • #15 Let’s consider the rates domain, which is a sub-domain within distribution. As we begin to model this in UML, we see there are distinct types for rates and rate plans. Rate plans comprise the rules that hotels use to describe how to get access to a particular set of rates. The rates describe what customers will be charged on a given day, at a given hotel, in association with a rate plan. The rates themselves may consist of multiple price points, for example a one-person rate, a two-person rate, and an extra-person rate. We may have references from these data types to data types outside the rates domain. For example, a rate references the product or products to which it applies, by unique ID. We draw boundaries around portions of the data model to be owned by each service. In this example we derive microservices for rate plans and rates, so that each service represents a bounded context. However, from a deployment perspective, it may make sense to reduce the number of services, especially if rates and rate plans are most frequently accessed together (as they are). Identifying potential services at a fairly low level, and then potential compositions, helps us make sure we maintain an extensible design.
  • #16 The address is a good example of why standardizing common building blocks is important. A problem with some of our historic systems has been inconsistent support for the number of address lines: many were coded with two, some with three. We’re constructing our new services to support addresses with a variable number of lines, and using validation to control how long the list can be.
  • #17 This summarizes the sort of results that arise from identifying services based on the logical data model: we end up with microservices organized around the domains of hotels, rates, inventory, offers, and reservations. Each service is the owner of a specific set of data types, approaching a share-nothing architecture style. We’re then able to build client applications such as our website and mobile apps on top of these services, and build integrations with external partners. We also built data maintenance applications to synchronize data from other systems (our legacy system as well as some other systems that will stay in operation, such as property management systems), verify data accuracy across systems and across service boundaries, and correct data issues caused by defects.
  • #18 Once we’ve identified our services, we can approach the physical data modeling associated with each service as an internal concern of that service. This includes the Java and RESTful APIs, events published by the service, and Cassandra schemas for data storage. These representations are all derived from the logical data model. We decided that the JSON representation of the resource owned by the service was the authoritative definition of the resource from the perspective of external services.
  • #19 Services are organized around the RESTful resource paths they own. Consider these RESTful resource paths as namespaces, which need to be managed as well.
  • #20 Our usage of RESTful APIs helped reinforce the focus around data types. While we do not adhere to some of the strictest definitions of what is RESTful, focusing on resources rather than actions helped keep our APIs clean and relatively free of RPC-style interactions. The common pattern that emerged for both our Java and RESTful APIs was to have simple CRUD operations. The cookie-cutter approach was helpful in being able to generate common templates to kickstart development on each new service.
  • #27 One of the tenets of Cassandra data modeling is to identify access patterns and design tables around them. We followed this pretty strictly at first, but soon ran into cases where adding a table per unique access pattern proved to be too much. Take, for example, hotels and the number of ways in which various clients could search for them. Since the hotel records are quite large, imagine the impact of all of these tables on our cluster size and storage requirements for 6,000+ hotels. We reined this in by designing tables to support multiple queries, using indexes selectively, and doing some filtering at the service layer, which helped keep our computing costs down. We’re also looking to move to Cassandra 3.x in order to take advantage of materialized views, which will allow us to shift some of the processing burden back to the database.
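A rough Python sketch of the "one table, several queries, filter at the service layer" trade-off described in this note. The dictionary stands in for a single Cassandra table partitioned by country. The partition key, field names, and sample rows are all hypothetical.

```python
from typing import Dict, List, Optional

# Stand-in for one hotels table partitioned by country code (hypothetical layout).
HOTELS_BY_COUNTRY: Dict[str, List[dict]] = {
    "US": [
        {"hotel_id": "H1", "city": "New York", "brand": "A"},
        {"hotel_id": "H2", "city": "New York", "brand": "B"},
        {"hotel_id": "H3", "city": "Denver", "brand": "A"},
    ],
}


def find_hotels(country_code: str,
                city: Optional[str] = None,
                brand: Optional[str] = None) -> List[str]:
    """Read one partition, then apply the optional filters in the service."""
    rows = HOTELS_BY_COUNTRY.get(country_code, [])
    return [
        row["hotel_id"]
        for row in rows
        if (city is None or row["city"] == city)
        and (brand is None or row["brand"] == brand)
    ]
```

Here one partition read serves three search variants (by country, by city, by brand) at the cost of discarding some rows in the service, instead of storing a full copy of each hotel record in three separate tables.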
  • #28 Switching gears, the concept of metadata was very important to us: keeping a common request context helps us track interactions between services and events, find key interactions in log files, and so on.
  • #30 Putting this all together in the context of a service: A service receives an incoming request with data, including metadata. The data is written to Cassandra. An event is generated, which is captured as history. The operation, metadata, and data identifiers are logged. The elapsed time for the operation is captured as metrics. Now the operations team is able to configure alarms on service state and metrics.
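The request flow in this note can be sketched roughly as follows in Python. Everything here is an in-memory stand-in: the dictionary plays the role of a Cassandra table and the lists play the roles of the event queue, log, and metrics sink. The class, field, event, and metric names are assumptions for illustration, not the actual service code.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RequestContext:
    """Hypothetical metadata carried with every request."""
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    client_id: str = "unknown"


class ReservationService:
    def __init__(self) -> None:
        self.table: Dict[str, dict] = {}            # stand-in for a Cassandra table
        self.events: List[dict] = []                # stand-in for the event queue
        self.log: List[str] = []                    # stand-in for the log pipeline
        self.metrics: List[Tuple[str, float]] = []  # stand-in for the metrics sink

    def create(self, ctx: RequestContext, reservation_id: str, data: dict) -> None:
        start = time.perf_counter()
        # 1. Write the data.
        self.table[reservation_id] = data
        # 2. Publish an event that carries the request metadata.
        self.events.append({
            "type": "ReservationCreated",
            "id": reservation_id,
            "request_id": ctx.request_id,
            "data": data,
        })
        # 3. Log the operation, metadata, and data identifiers.
        self.log.append(
            f"op=create id={reservation_id} "
            f"request_id={ctx.request_id} client={ctx.client_id}"
        )
        # 4. Record the elapsed time as a metric.
        self.metrics.append(
            ("reservation.create.elapsed_s", time.perf_counter() - start)
        )
```

Because the same `request_id` appears in the event and the log line, one identifier is enough to follow a request across services, history, and log files.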
  • #31 We have policies in place to ensure all of our application, logging and metrics data is captured in appropriate tools for long term storage and archival in S3 and Glacier
  • #32 We have separated the shopping and booking concerns from our analysis and history uses, which means that in the reservation system, data about the past is of little use. As we insert our data into Cassandra, we set the TTL for when it will no longer be needed, which saves us from developing our own cleanup process and reduces our storage footprint. We still need the historic data for analysis and customer service purposes, though, so we store it in a separate data platform, which we feed from the reservation system using asynchronous event processing.
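A small Python sketch of how a per-row TTL might be computed at insert time. The 30-day retention window, the table name, and the column names are assumptions. The only Cassandra feature relied on is the standard CQL `USING TTL` clause, shown here as a statement string rather than executed against a driver.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention window after checkout; the talk does not give the real value.
RETENTION_AFTER_CHECKOUT = timedelta(days=30)

# Hypothetical table and columns; USING TTL takes a value in seconds.
INSERT_CQL = (
    "INSERT INTO reservations_by_confirmation "
    "(confirmation_number, hotel_id, checkout_date) "
    "VALUES (?, ?, ?) USING TTL ?"
)


def ttl_seconds(checkout: datetime, now: datetime) -> int:
    """Seconds until the row is no longer needed by the reservation system."""
    expires_at = checkout + RETENTION_AFTER_CHECKOUT
    remaining = int((expires_at - now).total_seconds())
    # A TTL of 0 means "never expire" in Cassandra, so clamp to at least 1 second.
    return max(remaining, 1)
```

For example, with `now` at midnight UTC on 2016-09-26 and a checkout at midnight UTC on 2016-09-29, the row would be written with a TTL of 33 days, after which Cassandra expires it without any application-level cleanup job.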
  • #33 Let’s talk about how we architected the history features of our platform. Since we’re already capturing the event streams in the data platform, we can reach back into that platform to access data for our customer service applications. These are the applications we use to help answer customer questions about their reservation, including what was changed, when it was changed, and who changed it. We also use these applications to diagnose problems when things go wrong. I’ll refer you to Narasimhan’s talk for the complete architecture of our self-service data platform. What I’ve highlighted are the elements we use. On ingestion, we tie into ActiveMQ event queues published by our services and bridge the events to Kafka to stream them into our Spark cluster backed by S3. To retrieve historical data, our customer service apps call history services, which make SQL queries via Impala to retrieve the data. Our teams work closely together to manage the SLAs for this data retrieval.
  • #34 Here are some of the challenges we’ve encountered in working with this kind of architecture. When data is spread across multiple services, it can be hard to get a picture of the relationships between data types: you can’t just do a join in Cassandra. We’re investigating use of Spark in some environments in order to support ad-hoc searching and exploration. There are also challenges to maintaining data in this environment. Teams have to be conscious that changing data in one service may affect other services. What happens if I delete a hotel using the Hotel Service but don’t delete the inventory and rates? We’ve had to put tighter controls around deletes and manage cascading deletes at the application level to prevent these issues. Thankfully, data deletion is more of a maintenance activity and not a regular operational practice. Another issue comes when we need to commit changes to multiple data types at the same time. Let’s look at an example.
  • #35 One of the challenges of a microservices architecture is keeping changes in sync across service boundaries. One example situation is in booking a reservation. Since the reservation represents our contract with the customer to reserve a specific room at a specific price and with certain conditions, we need to mark a reservation as committed at the same time as we reserve the inventory. This is important so that we don’t accidentally overbook our hotel. Making the situation more complicated, there could be simultaneous bookings and data maintenance activities also trying to access the same inventory. Since these types are split across microservice boundaries, there is no transaction mechanism. In fact, since the data is in different rows (and different tables), Cassandra’s lightweight transactions cannot make the combined change atomic. We solved this with a layered approach: LWTs to protect inventory counts, retries within the booking service, and compensating processes to detect and clean up failures.
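A simplified Python sketch of the layered approach described in this note. The `InventoryStore` is an in-memory stand-in whose `compare_and_set` mimics a conditional (lightweight transaction) update of the form `UPDATE ... SET available = ? WHERE ... IF available = ?`. All names are hypothetical, and the compensation runs inline here for brevity, whereas the talk describes separate processes that detect and clean up failures.

```python
from typing import Callable, Dict


class InventoryStore:
    """In-memory stand-in for an inventory table guarded by conditional writes."""

    def __init__(self, counts: Dict[str, int]) -> None:
        self.counts = dict(counts)

    def read(self, key: str) -> int:
        return self.counts[key]

    def compare_and_set(self, key: str, expected: int, new: int) -> bool:
        # Applies the write only if the current value still matches `expected`.
        if self.counts[key] != expected:
            return False
        self.counts[key] = new
        return True


def adjust_inventory(store: InventoryStore, key: str, delta: int,
                     max_retries: int = 3) -> bool:
    """Layers 1 and 2: conditional write, retried when a concurrent writer wins."""
    for _ in range(max_retries):
        current = store.read(key)
        if current + delta < 0:
            return False  # sold out: never go below zero
        if store.compare_and_set(key, current, current + delta):
            return True
    return False


def book(store: InventoryStore, key: str,
         commit_reservation: Callable[[], None]) -> bool:
    """Reserve inventory, then commit the reservation; compensate on failure."""
    if not adjust_inventory(store, key, -1):
        return False
    try:
        commit_reservation()
    except Exception:
        # Layer 3: compensation releases the inventory taken above.
        adjust_inventory(store, key, +1)
        return False
    return True
```

The conditional write keeps the count from being overdrawn by simultaneous bookings, while the compensation step covers the case where the inventory change succeeds but the reservation commit in the other service does not.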
  • #36 Thankfully we have a variety of tools in our toolbox for guaranteeing consistency. Some of these are provided by Cassandra, but some of them are architectural approaches.
  • #37 Use data modeling to identify key types and bounded contexts, and let that drive the microservice design. Events are great for decoupling services and systems, and you can leverage the streams for history as well. Use metadata across services and infrastructure to allow a common thread of debugging and performance analysis.