I am designing a system to receive data from various types of sensors, convert it, and then persist it for later use by various front-end and analytics services.

I'm trying to design every service to be as independent as possible, but I'm having some trouble. The team has decided on a DTO we would like to use. The outward-facing services (the sensor data recipients) will each receive the data in their own unique way, convert it to a JSON object (the DTO), and send it off to the message broker. Consumers of the messages will then know exactly how to read the sensor data.
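For illustration, here is a minimal sketch of that publishing step in Python, assuming the pika AMQP client and a made-up DTO shape (the field names and queue name are hypothetical):

```python
import json

import pika  # pip install pika

# Hypothetical shape of the agreed-upon DTO.
dto = {
    "sensor_id": "s-1",
    "temperature": 21.5,
    "recorded_at": "2018-02-20T00:43:00Z",
}

# Each outward-facing service converts its sensor-specific input into this
# JSON DTO and hands it to the message broker.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="sensor-readings", durable=True)
channel.basic_publish(
    exchange="",
    routing_key="sensor-readings",
    body=json.dumps(dto),
)
connection.close()
```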

The problem is that I'm using the same DTO in a few different services, so an update has to be implemented in multiple locations. Obviously, we've designed it in such a way that a few extra or missing fields in the DTO here and there are not much of an issue until the services have been updated, but it still bugs me and makes me feel like I'm making a mistake. It could easily turn into a headache.

Am I going about architecting the system wrong? If not, what are some ways around this, or at least to ease my worries?

  • What kind of DTOs are you sharing, and what protocol are you using between services? It's okay to share, for instance, the proto file for gRPC or the Avro schema for Kafka and generate the DTOs in both services, but I wouldn't use a shared library between two projects. Commented Feb 20, 2018 at 0:43
  • Encoded JSON strings and AMQP. I would prefer not to use anything language-specific. Commented Feb 20, 2018 at 3:03

5 Answers

My advice? Do not share these DTOs among the applications in any kind of library. Or at least don't do this right now.

I know, this seems very counter-intuitive. You are duplicating code, right? But a DTO is not a business rule, so you can be more flexible here.

The service that sends the DTO needs to be rigid about its message contract, like a REST API: it can't change the DTO in a way that would break the other services already consuming the information from it.

When a new field is added to the DTO, you only update the other services that consume this DTO if they need the new field; otherwise, forget it. Using JSON as the content type gives you the flexibility to create and send new attributes without breaking the code of services that don't map these new fields in their current versions of the DTO.
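A minimal sketch of this tolerant-reader idea in Python (the DTO fields here are hypothetical):

```python
import json
from dataclasses import dataclass, fields

@dataclass
class SensorReadingDTO:
    sensor_id: str
    temperature: float

    @classmethod
    def from_json(cls, payload: str) -> "SensorReadingDTO":
        data = json.loads(payload)
        known = {f.name for f in fields(cls)}
        # Silently drop attributes this service doesn't map yet.
        return cls(**{k: v for k, v in data.items() if k in known})

# A producer already sending a newer DTO with an extra field:
msg = '{"sensor_id": "s-1", "temperature": 21.5, "battery_pct": 87}'
print(SensorReadingDTO.from_json(msg))  # battery_pct is ignored, nothing breaks
```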

But if this situation is really bothering you, you can follow the Rule of Three:

There are two "Rules of Three" in reuse: (a) It is three times as difficult to build reusable components as single use components, and (b) a reusable component should be tried out in three different applications before it will be sufficiently general to accept into a reuse library.

So, try to wait a bit more before sharing this DTO among the services.

  • Very much appreciated. This is really one of very few major concerns I have going forward. Not enough to keep me up at night, but enough to worry me. Commented Feb 20, 2018 at 3:04
  • Duplicated DTOs (in different and very independent services) don't violate DRY. That's it. Commented Feb 20, 2018 at 11:19
  • Presumably then there's no reason not to copy the DTO source code directly from one project to another, as a one-off operation, although then any parts not required in the new project should probably be deleted. Commented Feb 25, 2018 at 17:44
  • Even a whole service might be deleted without causing major problems to the whole system. Ideally. Commented Feb 25, 2018 at 18:27
  • To clarify the essence of the answer, when developing microservices, each service should be developed as if it did not know about the other services, except for the actual contracts it may require. Commented Feb 26, 2018 at 12:02

When it comes to microservices, services' development life cycles should be independent too.*

Different SDLCs and different dev teams

In a real MS system, there could be several teams involved in the development of the ecosystem, each in charge of one or more services. In turn, these teams might be located in different offices, cities, countries... Perhaps they don't even know each other, which makes sharing knowledge or code very hard (if possible at all). But this can be convenient, because shared code also implies shared reasoning, and it's important to recall that whatever makes sense for one team doesn't have to make sense for another. For example, given the DTO Customer, it could look different depending on the service in play, because customers are interpreted (or seen) differently by each service.

Different needs, different technologies

Isolated SDLCs also allow teams to choose the stack that best suits their needs. Imposing DTOs implemented in a specific technology limits the teams' capacity to choose.

DTOs are neither business rules nor service contracts

What are DTOs, really? Plain objects with no goal other than moving data from one side to another. Bags of getters and setters. That's not the kind of "knowledge" worth reusing, mainly because there's no knowledge in them at all. Their volatility also makes them bad candidates for coupling.

Contrary to what Dherik has stated, it must be possible for a service to change its DTOs without making other services change at the same time. Services should be tolerant readers, tolerant writers, and failure tolerant. Otherwise, they introduce coupling in a way that makes the service architecture pointless. Once more, and contrary to Dherik's answer, if three services need exactly the same DTOs, it's likely something went wrong during service decomposition.

Different business, different interpretations

While there could be (and there will be) cross-cutting concepts among services, it does not mean we have to impose a canonical model to force all services to interpret them in the same way.

Case study

Say our company has three departments, Customer Service, Sales and Shipping. Say each of these releases one or more services.

Customer Service, due to its domain language, implements services around the concept of customers, where customers are persons. For instance, customers are modeled as name, last name, age, gender, email, phone, etc.

Now say Sales and Shipping model their services according to their respective domain languages as well. In these languages the concept of customer appears too, but with a subtle difference: to them, customers are not (necessarily) persons. For Sales, a customer is a document number, a credit card, and a billing address; for Shipping, a full name and a shipping address.

If we force Sales and Shipping to adopt the canonical data model of Customer Service, we force them to deal with unnecessary data that could end up introducing needless complexity, since they would have to maintain the whole representation and keep the customer data in sync with Customer Service.
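In code, the point is simply that each context owns its own minimal representation. A sketch with Python dataclasses and illustrative fields:

```python
from dataclasses import dataclass

# Sales only needs what it takes to bill a customer.
@dataclass
class SalesCustomer:
    document_number: str
    credit_card_token: str
    billing_address: str

# Shipping only needs what it takes to deliver a parcel.
@dataclass
class ShippingCustomer:
    full_name: str
    shipping_address: str
```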

* This is where the strengths of this architecture lie.

  • Thank you! The case studies are actually what helped me determine whether to share DTOs or not. Now I am sure as to why I did not want to share them. Commented Nov 13, 2019 at 10:11
  • "if three services need exactly the very same DTOs, it's likely something went wrong during the services decomposition". I totally agree Commented Nov 4, 2020 at 6:07

I'm trying to design every service to be as independent as possible

You should be publishing events. Events are a certain type of message that represents a solid fact about something that happened at a particular point in time.

Each service should have a very well-defined responsibility, and should be responsible for publishing the events related to that responsibility.

Furthermore, you want your events to represent business-related events, not technical events. E.g. prefer an OrderCancelled event to an OrderUpdated with status: "CANCELLED".

That way, when a service needs to react to a cancelled order, it just needs to listen to that particular type of message, which carries only data relevant to that event. E.g. an OrderCancelled probably just needs an order_id; whichever service needs to react to it has already stored whatever it needs to know about the order in its own data store.

But if the service only had OrderUpdated events to listen to, it would need to interpret the flow of events, and it would depend on delivery order to correctly conclude when an order was cancelled.

In your case, however, as you are publishing sensor data, it could make sense to have a service listen to the raw events and publish a new stream of "business events", e.g. TemperatureThresholdExceeded, TemperatureStabilised.
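A minimal sketch of such a translating service, assuming plain dict events and a made-up threshold:

```python
from typing import Iterable, Iterator

THRESHOLD_C = 30.0  # hypothetical alert threshold

def to_business_events(readings: Iterable[dict]) -> Iterator[dict]:
    """Turn raw sensor readings into business events."""
    exceeded = False
    for r in readings:
        if r["temperature"] > THRESHOLD_C and not exceeded:
            exceeded = True
            yield {"type": "TemperatureThresholdExceeded", "sensor_id": r["sensor_id"]}
        elif r["temperature"] <= THRESHOLD_C and exceeded:
            exceeded = False
            yield {"type": "TemperatureStabilised", "sensor_id": r["sensor_id"]}

readings = [
    {"sensor_id": "s-1", "temperature": 28.0},
    {"sensor_id": "s-1", "temperature": 31.2},  # crosses the threshold
    {"sensor_id": "s-1", "temperature": 29.5},  # back under it
]
for event in to_business_events(readings):
    print(event)
```

Downstream services then subscribe to the business events and never need to re-derive the threshold logic themselves.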

And be careful about creating too many microservices. Microservices can be a great way of encapsulating complexity, but if you don't discover suitable service boundaries, the complexity ends up in the service integration instead. And that is a nightmare to maintain.

It's better to have too few, too large services, than to have too many, too small services.

Seven Years Later

Background

I'm revisiting this question I asked several years ago. Since then I've gotten a lot more experience developing distributed applications in Python, TypeScript, and C#, and have gotten my hands on a plethora of other languages I haven't had a chance to deploy anything with, but have had the pleasure of using nonetheless.

Since then, the world has had some time to reckon with isolated microservices, and the pendulum has had time to swing back toward monolithic architectures a bit. The idea of Polylith is an elegant example of this.

Preferred Solution

It depends™

I think, generally, it makes sense to have a core business object module or library that is common to all services. Here you can maintain the shape--and maybe even some behaviors--of the core business domain.

Data transfer and contracts should use these business objects whenever possible. These business objects should make up the majority of your database schema or shape. Most services should know what to do with them and be able to use them. They are the ground-truth of your application. Ideally, DTOs themselves should not be shared unless they become mutual between services (e.g., contracts).

That said, services may need to compose their own objects that encapsulate these core objects. Think wrapper classes that have one or more of these objects as class properties. This is useful for API request/response models, for models for messages sent between services over message queues, etc. These wrappers include the core business objects, but also add context about what exactly is being done with them, or more information about their parents or children.
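A small sketch of the wrapper idea, assuming Python dataclasses and hypothetical names:

```python
from dataclasses import dataclass

# Core business object, shared as a model across services.
@dataclass
class SensorReading:
    sensor_id: str
    temperature: float
    recorded_at: str  # ISO 8601 timestamp

# Service-specific wrapper: the core object plus message-level context.
@dataclass
class ReadingIngestedMessage:
    reading: SensorReading  # the shared core object as a property
    source_gateway: str     # context only this pipeline cares about
    schema_version: int = 1
```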

Technical Hurdles

For stacks written all in the same language, like Polylith, uv workspaces, or pnpm workspaces, all you need is a dedicated module to house these files. It's easy to import and rely on them in all your other services.

That said, most stacks are composed of at least two languages: the "backend" (Python, Rust, Go, C#, Java) and the frontend (TypeScript plus the latest JavaScript UI framework). However, there may be even more. Different languages excel at different things, and we shouldn't resist the urge to have a multilingual project if we need one (within reason, given business limitations).

Technical Answers

The answer: JSON Schema and code generation (or an OpenAPI spec), and/or gRPC contracts. Everyone loves to hate codegen, but in my experience it works very well for basic business object models.

If you have a small team, start with code and generate an OpenAPI spec or JSON Schema from it. If you have a lot of services or an intense business model, you will need to take your business model more seriously: start with a JSON Schema and generate the classes from it. If you have a big team, changing this schema should be a big deal, and you should involve developers and other stakeholders before you do.
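A minimal sketch of the schema-first direction, assuming the jsonschema package and an illustrative sensor-reading model:

```python
from jsonschema import validate  # pip install jsonschema

# A hand-written JSON Schema acting as the source of truth (illustrative).
SENSOR_READING_SCHEMA = {
    "type": "object",
    "properties": {
        "sensor_id": {"type": "string"},
        "temperature": {"type": "number"},
        "recorded_at": {"type": "string", "format": "date-time"},
    },
    "required": ["sensor_id", "temperature"],
    "additionalProperties": True,  # stay tolerant of newer producers
}

# Any service, in any language with a JSON Schema validator, can check
# incoming messages against the same shared schema file.
validate(
    instance={"sensor_id": "s-1", "temperature": 21.5},
    schema=SENSOR_READING_SCHEMA,
)
```

From the same schema file, codegen tools (for example, datamodel-code-generator for Python or quicktype for several languages) can produce the typed classes per language, which is the codegen step described above.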

TL;DR

Share your core business models via JSON Schema and codegen as the ground source of truth. Services can use these models inside other models as they need to.

  • Please provide arguments for not sharing DTOs. And arguments in general. Commented Apr 22 at 10:47
  • @Basilevs Thanks for the feedback. While not specific to DTOs, my answer does indeed provide arguments for and against sharing things, and about what to share. At the time I asked this question, I think I was being too specific with the term "DTO." I believe I lacked a fundamental understanding of what a "business model" is versus what implementation-specific models are. Commented Apr 26 at 19:25
  • In the fourth paragraph you suggest sharing the domain, but not DTOs. No arguments were provided. Later you suggest sharing a JSON Schema, which is equivalent to a DTO. The more I reread, the more confused I get. Commented Apr 26 at 21:14

The most frequently cited reason for not sharing DTOs is that different bounded contexts could have different interpretations. However, there should only be a single source of truth, holding data that contains every field needed to serve all its consumers. I would say that if you want to share DTOs, understanding what a complete DTO is (from a business perspective) is a prerequisite.

Another frequently cited reason is that the services are no longer loosely coupled, as a change triggered by one service forces a change in the other. I cannot imagine what type of change could trigger this. Deletion of fields should never be done as long as there are consumers using them, and new fields can be ignored if the consumer is not interested. Also keep in mind that versioning the shared library provides good flexibility, because each service is free to choose its version of the DTO.
