- Tags:: 📚Books, Python, 📖 Domain Driven Design

DDD basics

A value object is any domain object that is uniquely identified by the data it holds; we usually make them immutable: OrderLine is a value object (p. 43)

We use the term entity to describe a domain object that has a long-lived identity (p. 45)
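A minimal sketch of both, close to the book’s OrderLine/Batch (details trimmed):

```python
from dataclasses import dataclass

# Value object: identified by the data it holds, so a frozen dataclass
# gives us immutability and structural equality for free.
@dataclass(frozen=True)
class OrderLine:
    orderid: str
    sku: str
    qty: int

# Entity: long-lived identity, so equality hangs off the reference,
# not the current attribute values.
class Batch:
    def __init__(self, reference: str, sku: str, qty: int):
        self.reference = reference
        self.sku = sku
        self.available_quantity = qty

    def __eq__(self, other):
        if not isinstance(other, Batch):
            return NotImplemented
        return other.reference == self.reference

    def __hash__(self):
        return hash(self.reference)
```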

An aggregate is just a domain object that contains other domain objects and lets us treat the whole collection as a single unit. (p. 176)

With an aggregate, we are looking to define consistency boundaries at a proper granularity.

Not everything has to be an object: Domain Service Functions : operations that don’t have a natural home in an entity or value object (p. 47)

Not to be confused with the service layer (see “Application service vs. domain service” below).

We have one final concept to cover: exceptions can be used to express domain concepts too. In our conversations with domain experts, we’ve learned about the possibility that an order cannot be allocated because we are out of stock, and we can capture that by using a domain exception (p. 49)
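E.g., the book raises OutOfStock from a standalone allocate() domain service (a sketch; Batch internals elided, so it also illustrates the “domain service function” idea above):

```python
class OutOfStock(Exception):
    """Domain exception: names a business outcome, not a technical failure."""

def allocate(line, batches):
    # Domain service function: this operation has no natural home on
    # either OrderLine or Batch.
    try:
        batch = next(b for b in batches if b.can_allocate(line))
        batch.allocate(line)
        return batch.reference
    except StopIteration:
        raise OutOfStock(f'Out of stock for sku {line.sku}')
```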

Dependency Inversion Principle

  1. High-level modules should not depend on low-level modules. Both should depend on abstractions.
  2. Abstractions should not depend on details. Instead, details should depend on abstractions
Link to original
(p. 21)

Note that:

Depends on doesn’t mean imports or calls, necessarily, but rather a more general idea that one module knows about or needs another module.
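A sketch of the principle with hypothetical names: both the high-level code and the low-level detail depend on the abstraction.

```python
import abc

# The abstraction both sides depend on
class AbstractRepository(abc.ABC):
    @abc.abstractmethod
    def get(self, reference): ...

# High-level module: knows only about the abstraction
def fetch_batch(reference: str, repo: AbstractRepository):
    return repo.get(reference)

# Low-level detail: depends on (implements) the abstraction
class InMemoryRepository(AbstractRepository):
    def __init__(self, data: dict):
        self._data = data

    def get(self, reference):
        return self._data[reference]
```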

The authors don’t like a layered architecture for achieving the Dependency Inversion Principle; they prefer an onion architecture, with the business logic at the center, so that the domain model has no dependencies. This is the opposite of what Django does (the model depends on the ORM).

See Confusing layered architecture.

Building fakes for your abstractions is an excellent way to get design feedback: if it’s hard to fake, the abstraction is probably too complicated. (p. 72).
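The book’s FakeRepository is exactly this kind of feedback: a trivial in-memory stand-in (sketch):

```python
class FakeRepository:
    """In-memory fake: if this were hard to write, the repository
    abstraction would probably be too complicated."""
    def __init__(self, batches):
        self._batches = set(batches)

    def add(self, batch):
        self._batches.add(batch)

    def get(self, reference):
        return next(b for b in self._batches if b.reference == reference)

    def list(self):
        return list(self._batches)
```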

Repository pattern

Very interesting: on page 60 they invert the usual dependency so the ORM depends on the model (instead of the other way around), using SQLAlchemy’s imperative mapping, so your model is not contaminated by ORM annotations:

from sqlalchemy import Table, Column, Integer, String, ForeignKey
from sqlalchemy.orm import registry
 
mapper_registry = registry()
 
user_table = Table(
    'user',
    mapper_registry.metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('fullname', String(50)),
    Column('nickname', String(12))
)
 
class User:
    pass
 
mapper_registry.map_imperatively(User, user_table)

An ORM already buys you some decoupling. Changing foreign keys might be hard, but it should be pretty easy to swap between MySQL and Postgres if you ever need to. (p. 74)

The argument for the right-hand side of the graph seems a bit weak to me:

Our example code isn’t complex enough to give more than a hint of what the right-hand side of the graph looks like, but the hints are there. Imagine, for example, if we decide one day that we want to change allocations to live on the OrderLine instead of on the Batch object: if we were using Django, say, we’d have to define and think through the database migration before we could run any tests. As it is, because our model is just plain old Python objects, we can change a set() to being a new attribute, without needing to think about the database until later. (p. 77)

Services

Application service vs. domain service

An application service belongs to the service layer and simply orchestrates. A domain service belongs to the domain model and contains business logic that does not belong inside a single entity of the domain (p. 113).

…the service layer forms an API for our system that we can drive in multiple ways (p. 122)

Ideally, the service layer API should not depend directly on domain objects, but on primitives, so that it is fully decoupled from the domain. E.g., going from the first signature to the second:

# Before: the service depends on a domain object
def allocate(line: OrderLine, repo: AbstractRepository, session) -> str: ...

# After: only primitives cross the service-layer boundary
def allocate(
    orderid: str, sku: str, qty: int, repo: AbstractRepository, session
) -> str: ...

And:

In an ideal world, you’ll have all the services you need to be able to test entirely against the service layer, rather than hacking state via repositories or the database. This pays off in your end-to-end tests as well (p. 131)

Testing (TDD)

Functional core, imperative shell

We’re going to separate what we want to do from how to do it. (…) Instead of saying, “Given this actual filesystem, when I run my function, check what actions have happened,” we say, “Given this abstraction of a filesystem, what abstraction of filesystem actions will happen?” (…) We’ll create a “core” of code that has no dependencies on external state and then see how it responds when we give it input from the outside world (this kind of approach was characterized by Gary Bernhardt as Functional Core, Imperative Shell (p. 85)

def test_when_a_file_has_been_renamed_in_the_source():
    src_hashes = {'hash1': 'fn1'}
    dst_hashes = {'hash1': 'fn2'}
    # determine_actions() is the pure functional core (no filesystem I/O)
    actions = determine_actions(src_hashes, dst_hashes, '/src', '/dst')
    assert list(actions) == [('MOVE', '/dst/fn2', '/dst/fn1')]

The original explanation is on this screencast: Functional Core, Imperative Shell.
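A sketch of the matching functional core: a pure function over the two hash dicts, with no filesystem access (names follow the book’s sync example, but details are simplified):

```python
def determine_actions(src_hashes, dst_hashes, src_folder, dst_folder):
    # Pure decisions in, abstract actions out: the imperative shell
    # applies them to the real filesystem later.
    for sha, filename in src_hashes.items():
        if sha not in dst_hashes:
            yield 'COPY', f'{src_folder}/{filename}', f'{dst_folder}/{filename}'
        elif dst_hashes[sha] != filename:
            yield ('MOVE', f'{dst_folder}/{dst_hashes[sha]}',
                   f'{dst_folder}/{filename}')
    for sha, filename in dst_hashes.items():
        if sha not in src_hashes:
            yield 'DELETE', f'{dst_folder}/{filename}'
```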

Against mocks

In the stubs vs. mocks debate, they prefer stubs:

Bob loves using lists to build simple test doubles, even though his coworkers get mad. It means we can write tests like assert foo not in database. (p. 90)

We avoid using mocks in this book and in our production code too. We’re not going to enter into a Holy War, but our instinct is that mocking frameworks, particularly monkeypatching, are a code smell. Instead, we like to clearly identify the responsibilities in our codebase, and to separate those responsibilities into small, focused objects that are easy to replace with a test double. (p. 91)

Patching out the dependency you’re using makes it possible to unit test the code, but it does nothing to improve the design. Using mock.patch won’t let your code work with a --dry-run flag, nor will it help you run against an FTP server. For that, you’ll need to introduce abstractions. (p. 91)

However…

The disadvantage is that we have to make our stateful components explicit and pass them around. David Heinemeier Hansson, the creator of Ruby on Rails, famously described this as “test-induced design damage.” (p. 91)

That DHH idea is in 🗣️ Writing software - DHH. They also refer to the mythical 🗞 Mocks Aren’t Stubs.

Not sure this is a good tradeoff:

Designing for testability really means designing for extensibility. We trade off a little more complexity for a cleaner design that admits novel use cases (p. 92)

The tests act as a record of our design choices and serve to explain the system to us when we return to the code after a long absence (p. 93)

What kind of tests to write

Aim for one end-to-end test per feature

The objective is to demonstrate that the feature works, and that all the moving parts are glued together correctly (…) Ideally, your application will be structured such that all errors that bubble up to your entrypoints (e.g., Flask) are handled in the same way. This means you need to test only the happy path for each feature, and to reserve one end-to-end test for all unhappy paths (and many unhappy path unit tests, of course) (p. 131)

Write the bulk of your tests against the service layer

…it’s important to understand the trade-off between coupling and design feedback (p. 122)

These edge-to-edge tests [service-layer tests] offer a good trade-off between coverage, runtime, and efficiency. Each test tends to cover one code path of a feature and use fakes for I/O. This is the place to exhaustively cover all the edge cases and the ins and outs of your business logic (p. 131)

…split our tests into two broad categories: tests about web stuff, which we implement end to end; and tests about orchestration stuff, which we can test against the service layer in memory (p. 111)

If we restrict ourselves to testing only against the service layer, we won’t have any tests that directly interact with “private” methods or attributes on our model objects, which leaves us freer to refactor them (p. 122)

Note that service-layer tests are still considered unit tests, so we would still have a good number of them relative to integration and e2e tests (“a healthy-looking test pyramid”). But note also that service tests do not mock the domain, only the infrastructure dependencies.

Maintain a small core of tests written against your domain model

Most of the time, when we are adding a new feature or fixing a bug, we don’t need to make extensive changes to the domain model. In these cases, we prefer to write tests against services because of the lower coupling and higher coverage. (p. 123)

Don’t be afraid to delete these tests if the functionality is later covered by tests at the service layer (p. 131)

Unit of Work

The UoW acts as a single entrypoint to our persistent storage: it keeps track of what objects were loaded and of their latest state, and provides a transaction (pp. 134–136). If you are used to Django, this is something it provides automatically: an HTTP request starts a transaction and rolls it back if something goes wrong (Database transactions | Django documentation | Django).
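The SqlAlchemy implementation below subclasses an abstract base that carries the context-manager protocol; a sketch close to the book’s:

```python
import abc

class AbstractUnitOfWork(abc.ABC):
    def __enter__(self):
        return self

    def __exit__(self, *args):
        # Rolling back after a successful commit is a no-op, so
        # unconditionally rolling back here is a safe default.
        self.rollback()

    @abc.abstractmethod
    def commit(self): ...

    @abc.abstractmethod
    def rollback(self): ...
```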

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session

# config and repository are the book's own modules
DEFAULT_SESSION_FACTORY = sessionmaker(bind=create_engine(
    config.get_postgres_uri(),
))

class SqlAlchemyUnitOfWork(AbstractUnitOfWork):

    def __init__(self, session_factory=DEFAULT_SESSION_FACTORY):
        self.session_factory = session_factory

    def __enter__(self):
        self.session = self.session_factory()  # type: Session
        self.batches = repository.SqlAlchemyRepository(self.session)
        return super().__enter__()

    def __exit__(self, *args):
        super().__exit__(*args)
        self.session.close()

    def commit(self):
        self.session.commit()

    def rollback(self):
        self.session.rollback()

In fact:

Your ORM probably already has some perfectly good abstractions around atomicity (…). We’ve made it look easy, but you have to think quite carefully about things like rollbacks, multithreading, and nested transactions. Perhaps just sticking to what Django or Flask-SQLAlchemy gives you. (p. 150)

The UoW is also a realization of the “don’t mock what you don’t own” idea (Don’t mock what you don’t own · testdouble/contributing-tests Wiki · GitHub) from mockist (London-school) TDD:

 …test doubles should be primarily used to facilitate TDD to invent focused, usable interfaces between the thing you’re testing and the code it will depend on. (…) The prescription implied by “don’t mock what you don’t own” is to introduce your own shim/wrapper/adapter around it. This effectively cordons off the dependency to a single place in your codebase and contextualizes it in the consistent and easy-to-use style you’re trying to promote within your codebase. If there’s anything awkward about how one needs to invoke the dependency (maybe a chaining API, multi-step invocation, repetitive default configuration, etc.), it can be swept under the rug into that common adapter.

Related to this, from London school TDD · testdouble/contributing-tests Wiki · GitHub:

if a “dependency is hard to mock, then it’s definitely hard to use for the object that’ll actually be using it.”

Dependency Injection

In Python, we could even avoid passing dependencies as arguments to the services: we could simply import a default dependency and monkeypatch it in tests. However, that makes things less explicit and more brittle (mock paths are easy to break).

If we go the route of explicit dependencies, then every entrypoint needs to initialize dependencies and pass them to the services as arguments.

But we can do better: we can do that in one place only. In the book it is called “bootstrap”, but the usual name for this pattern is Composition Root.

Automatic wiring

If we do the manual wiring, depending on the amount of abstractions of your code, you will need to do more and more of this kind of stuff (dependencies of dependencies):

import os

# Service and ApiClient stand in for your own classes (hypothetical here)
if __name__ == "__main__":
    main(
        service=Service(
            api_client=ApiClient(
                api_key=os.getenv("API_KEY"),
                timeout=int(os.getenv("TIMEOUT")),
            ),
        ),
    )

In such case, a Dependency Injector such as Dependency injection and inversion of control in Python — Dependency Injector 4.40.0 documentation may come in handy:

from unittest import mock

from dependency_injector import containers, providers
from dependency_injector.wiring import Provide, inject

# ApiClient and Service are assumed defined elsewhere (they come from
# the library's own example)
 
class Container(containers.DeclarativeContainer):
 
    config = providers.Configuration()
 
    api_client = providers.Singleton(
        ApiClient,
        api_key=config.api_key,
        timeout=config.timeout,
    )
 
    service = providers.Factory(
        Service,
        api_client=api_client,
    )
 
@inject
def main(service: Service = Provide[Container.service]) -> None:
    ...
 
if __name__ == "__main__":
    container = Container()
    container.config.api_key.from_env("API_KEY", required=True)
    container.config.timeout.from_env("TIMEOUT", as_=int, default=5)
    container.wire(modules=[__name__])
 
    main()  # <-- dependency is injected automatically
 
    with container.api_client.override(mock.Mock()):
        main()  # <-- overridden dependency is injected automatically

Validation

Syntax validation

Validate in entrypoints!

We tend to validate these rules at the edge of the system. Our rule of thumb is that a message handler should always receive only a message that is well-formed and contains all required information (p. 371)

“Schema” library to make validations:

import json
from dataclasses import dataclass

from schema import And, Schema, Use

@dataclass
class Allocate(Command):  # Command is the app's base class for commands

    _schema = Schema({
        'orderid': str,
        'sku': str,
        'qty': And(Use(int), lambda n: n > 0),
    }, ignore_extra_keys=True)

    orderid: str
    sku: str
    qty: int

    @classmethod
    def from_json(cls, data):
        data = json.loads(data)
        return cls(**cls._schema.validate(data))

Validate as little as possible! Follow the Robustness Principle or Postel’s Law:

Be conservative in what you do, be liberal in what you accept from others.

Link to original

Semantic validation

A request can be syntactically valid but make no sense (e.g., “buying -5 items”). We can check for this at the service layer:

def allocate(event, uow):
    line = model.OrderLine(event.orderid, event.sku, event.qty)
    with uow:
        ensure.product_exists(uow, event)    # <--- HERE
        product = uow.products.get(line.sku)
        product.allocate(line)
        uow.commit()

Don’t confuse this with your business logic checks:

Second, we should try to avoid putting all our business logic into these precondition checks. As a rule of thumb, if a rule can be tested inside our domain model, then it should be tested in the domain model (p. 379)

Pragmatic validation

A request that is syntactically valid and makes sense, but that you can’t fulfill for some reason (e.g., “buying a million diamonds”). That check goes into the domain model: it is part of your business logic.

Aggregates

From 📖 Domain Driven Design:

An aggregate is a cluster of associated objects that we treat as a unit for the purpose of data changes.

Link to original

We want to draw a boundary around a small number of objects — the smaller, the better, for performance — that have to be consistent with one another, and we need to give this boundary a good name. (p. 158)

…the only repositories we are allowed should be repositories that return aggregates. (p. 162)
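The book’s aggregate for allocation is Product; a sketch (batch internals elided):

```python
class Product:
    """Aggregate root: the single entrypoint for changing its batches,
    so invariants are enforced inside one consistency boundary."""
    def __init__(self, sku, batches, version_number=0):
        self.sku = sku
        self.batches = batches
        self.version_number = version_number  # used for optimistic locking

    def allocate(self, line):
        batch = next(b for b in self.batches if b.can_allocate(line))
        batch.allocate(line)
        self.version_number += 1
        return batch.reference
```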

Concurrency handling

  • Calculated updates: push the arithmetic into the database so the read-modify-write race disappears, e.g., UPDATE accounts SET balance = balance - 100 WHERE user_id = 1;
  • Version numbers (optimistic concurrency handling)
  • Row-level locking with SELECT … FOR UPDATE (pessimistic concurrency handling)
  • Changing the transaction isolation level to SERIALIZABLE (may have performance problems)
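A sketch of the version-number (optimistic) approach as a bare compare-and-swap UPDATE (sqlite3 and the table layout are just for illustration):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE products (sku TEXT PRIMARY KEY, qty INTEGER, version INTEGER)'
)
conn.execute("INSERT INTO products VALUES ('CHAIR', 10, 1)")

def allocate(conn, sku, qty):
    current_qty, version = conn.execute(
        'SELECT qty, version FROM products WHERE sku = ?', (sku,)
    ).fetchone()
    # Compare-and-swap: the UPDATE matches zero rows if a concurrent
    # writer bumped the version first, in which case the caller retries.
    cursor = conn.execute(
        'UPDATE products SET qty = ?, version = ? WHERE sku = ? AND version = ?',
        (current_qty - qty, version + 1, sku, version),
    )
    return cursor.rowcount == 1
```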

Other notes

If your app is essentially a simple CRUD wrapper around a database and isn’t likely to be anything more than that in the foreseeable future, you don’t need these patterns. Go ahead and use Django, and save yourself a lot of bother. (Location 3545)

The big idea is “messaging.”…The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. (Location 3559)

it’s not the obvious features that make a mess of our codebases: it’s the goop around the edge. It’s reporting, and permissions, and workflows that touch a zillion objects. (Location 3580)

“Make the change easy; then make the easy change”: (Location 4153)

It’s OK to trade performance for consistency on the read side, because stale data is essentially unavoidable. (Location 5326)

In CQS we follow one simple rule: functions should either modify state or answer questions, but never both. (Location 5345)
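A sketch of the rule with a hypothetical class:

```python
class Tray:
    def __init__(self):
        self._items = []

    def add(self, item) -> None:
        # Command: modifies state, answers nothing
        self._items.append(item)

    def count(self) -> int:
        # Query: answers a question, no side effects
        return len(self._items)
```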

This justification for CQRS is related to the justification for the Domain Model pattern. If you’re building a simple CRUD app, reads and writes are going to be closely related, so you don’t need a domain model or CQRS. But the more complex (Location 5524)

…by keeping a totally separate, denormalized data store for our view model? (Location 5576)

Keeping the read model up to date is the challenge! Database views (materialized or otherwise) and triggers are a common solution, (Location 5592)

But the entire thrust of our book is about what to do when your app is no longer a simple CRUD app. At that point, Django starts hindering more than it helps. (Location 7386)