Suppose I have a SQLAlchemy 2.0 model representing cars. It incorporates a simple validation routine (Reference) for the string attributes:

    from sqlalchemy.orm import DeclarativeBase
    from sqlalchemy.orm import Mapped
    from sqlalchemy.orm import mapped_column
    from sqlalchemy.orm import validates


    class Base(DeclarativeBase):
        pass


    class Car(Base):
        __tablename__ = 'car'

        id: Mapped[int] = mapped_column(primary_key=True)
        make: Mapped[str] = mapped_column()   # Tesla
        model: Mapped[str] = mapped_column()  # S
        color: Mapped[str] = mapped_column()  # Black

        # Ensure that the input data is not null
        @validates('make', 'model', 'color')  # <============
        def validate_string(self, key, value):
            if not value:
                raise ValueError(f'The {key} must be entered')
            return value

Now, let's consider that I need to add 30 additional string attributes to the model and validate them. Given the current validation routine setup, I'll have to add 30 additional attribute names in the validates decorator:

        @validates('make', 'model', 'color', 'str_attr_4', 'str_attr_5', 'str_attr_6', ..., 'str_attr_33')

I find it very tedious and I'm sure there's a smarter way to proceed.

I have a solution in mind, but I'm struggling to implement it: since we already know the type of each attribute thanks to the typing hints provided in Mapped, it would be great if something like the following were possible:

    class Car(Base):
        __tablename__ = 'car'

        id: Mapped[int] = mapped_column(primary_key=True)
        make: Mapped[str] = mapped_column()   # Tesla
        model: Mapped[str] = mapped_column()  # S
        color: Mapped[str] = mapped_column()  # Black
        str_attr_4: Mapped[str] = mapped_column()
        str_attr_5: Mapped[str] = mapped_column()
        str_attr_6: Mapped[str] = mapped_column()
        ...
        str_attr_33: Mapped[str] = mapped_column()

        # Ensure that the input data is not null
        @validates(*[attr_name for attr_name in dir(self) if attr_name is mapped to str])  # <============
        def validate_string(self, key, value):
            if not value:
                raise ValueError(f'The {key} must be entered')
            return value

I made several attempts and a lot of research to construct the *[attr_name for attr_name in dir(self) if attr_name is mapped to str] expression, but I couldn't get it to work.

Do you have any suggestions on how to achieve this, or perhaps other ideas to simplify the validation routine?

  • When the method decorator is executed, the class Car itself is still being declared, so self is not available. That is just how Python works. I think you would need a metaclass to do something like this. If you combine mapped_column and dataclasses (to get a well-defined, typed constructor), I think you can accomplish something like this. Commented Feb 22, 2024 at 3:29
  • I mean you could try to hack something together yourself with a metaclass (difficult and maybe incompatible with SQLAlchemy) or try to use the SQLAlchemy 2.0 dataclasses integration to accomplish something like this. declarative-dataclass-mapping Commented Feb 22, 2024 at 3:32
  • Thank you for your help @IanWilson. You're completely right. I received the following answer from Miguel Grinberg that supports yours: "I'm not sure there is a clean way to do this. You are trying to iterate on a class that does not exist yet (...)." Plus the following advice: "I also want to note that the common thing to do is to validate data at the point of entry, so that the server only works with sanitized and validated information. Normally you validate forms and API payloads, first thing once they are received. Validating on assignment to a model feels too late (...)." Commented Feb 22, 2024 at 8:29

1 Answer

This is possible, but tricky to implement. The TL;DR is that you need to register this event listener after Base is defined but before your model declaration(s).

    import inspect
    import typing

    import sqlalchemy as sa
    from sqlalchemy import orm

    ...

    class Base(orm.DeclarativeBase):
        pass


    @sa.event.listens_for(Base, 'instrument_class', propagate=True)
    def receive_instrument_class(mapper: orm.Mapper, class_) -> None:
        # Select the class (or classes) which require the validator.
        if class_.__name__ == 'Car':
            # Find columns of type str.
            annotations = inspect.get_annotations(class_)
            string_columns = [
                k
                for k, v in annotations.items()
                if isinstance(v, typing._GenericAlias) and v.__args__ == (str,)
            ]
            # Apply the validates decorator to the validation method.
            class_.validate_string = orm.validates(*string_columns)(class_.validate_string)

Note that orm.validates only runs when a value is explicitly provided: if you omit a value when constructing a model, the validator will not be called for that attribute. For the use case in the question (making sure every string attribute has been entered) it therefore probably isn't the right choice. Consider validating at the system boundary, as mentioned in your comment.
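As a rough illustration of validating at the boundary, here is a minimal standard-library sketch. The names `REQUIRED_STRING_FIELDS` and `validate_car_payload` are hypothetical and not part of SQLAlchemy; the point is only that a check run on the incoming payload also catches fields that were omitted entirely, which an on-assignment validator would never see.

```python
# Hypothetical list of fields that must be present and non-empty.
REQUIRED_STRING_FIELDS = ('make', 'model', 'color')


def validate_car_payload(payload: dict) -> dict:
    """Return a cleaned copy of the payload, or raise ValueError listing every problem."""
    errors = []
    cleaned = {}
    for name in REQUIRED_STRING_FIELDS:
        value = payload.get(name)  # None when the field was omitted
        if not isinstance(value, str) or not value.strip():
            errors.append(f'The {name} must be entered')
        else:
            cleaned[name] = value.strip()
    if errors:
        raise ValueError('; '.join(errors))
    return cleaned
```

The cleaned dict could then be passed to the model constructor, for example `Car(**validate_car_payload(data))`, so the model only ever receives checked values. In practice a library such as pydantic or marshmallow would typically replace this hand-written function.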

There are two elements to making this work:

  • getting the names of the columns to be validated
  • applying the validates decorator with the list of names

I'll deal with these in reverse order, as the specifics of applying the validator limit the column data that we can access.

The validates decorator works by storing its arguments as an attribute on the decorated method. The attribute names to be validated and the validating method are then stored in the mapper's validators attribute just after the class is instrumented. This is why we have to apply the decorator in the instrument_class listener: any later in the mapping process and it will have no effect.
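The timing issue can be illustrated with a toy model of that mechanism. This is a simplified sketch, not SQLAlchemy's actual code: `toy_validates`, `collect_validators` and the `_toy_validated_names` attribute are all hypothetical names chosen for the example.

```python
def toy_validates(*names):
    """Toy stand-in for a validates-style decorator (not SQLAlchemy's implementation)."""
    def decorator(fn):
        # Record the attribute names on the function itself.
        fn._toy_validated_names = names  # hypothetical attribute name
        return fn
    return decorator


def collect_validators(cls):
    """Mimic the one-off collection step: map each attribute name to its validator."""
    validators = {}
    for attr in vars(cls).values():
        for name in getattr(attr, '_toy_validated_names', ()):
            validators[name] = attr
    return validators


class Car:
    @toy_validates('make', 'model')
    def validate_string(self, key, value):
        if not value:
            raise ValueError(f'The {key} must be entered')
        return value


# Collection happens once and takes a snapshot of the names.
validators = collect_validators(Car)

# Re-decorating afterwards updates the function attribute,
# but the mapping that was already collected does not change.
Car.validate_string = toy_validates('make', 'model', 'color')(Car.validate_string)
```

Here `validators` still only knows about `make` and `model`, even though the method now carries `color` as well. That is the same reason the real decorator has to be applied before the mapper collects its validators.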

At this point in the mapping process, there are no columns defined in the mapper or in the model class's __dict__. However, the model class's __annotations__ attribute maps column names to type hints, so we can use this to identify columns annotated as Mapped[str] (more complex type declarations are left as an exercise for the reader).

(There may be better ways to work with the type declarations. I had a look at this Q&A, but it didn't look like there was a stable way to do it; typing isn't an area I'm very familiar with.)

Here's a complete example:

    import inspect
    import typing

    import sqlalchemy as sa
    from sqlalchemy import orm
    from sqlalchemy.orm import Mapped, mapped_column


    class Base(orm.DeclarativeBase):
        pass


    @sa.event.listens_for(Base, 'instrument_class', propagate=True)
    def receive_instrument_class(mapper: orm.Mapper, class_) -> None:
        print('Instrument class')
        if class_.__name__ == 'Car':
            # Find columns of type str.
            annotations = inspect.get_annotations(class_)
            string_columns = [
                k
                for k, v in annotations.items()
                if isinstance(v, typing._GenericAlias) and v.__args__ == (str,)
            ]
            # Apply the validates decorator to the validation method.
            class_.validate_string = orm.validates(*string_columns)(class_.validate_string)


    class Car(Base):
        __tablename__ = 'car'

        id: Mapped[int] = mapped_column(primary_key=True)
        make: Mapped[str]
        model: Mapped[str]
        color: Mapped[str]
        str_attr_4: Mapped[str]
        str_attr_5: Mapped[str]
        str_attr_6: Mapped[str]
        ...
        str_attr_33: Mapped[str]

        def validate_string(self, key, value):
            if '🤡' in value:
                raise ValueError(f'Value {repr(value)} for {key} is not allowed')
            return value


    engine = sa.create_engine('sqlite://', echo=True)
    Base.metadata.create_all(engine)
    Session = orm.sessionmaker(engine)

    with Session.begin() as s:
        car = Car(make='Tesla', model='S', color='Black', str_attr_6='El🤡n')
        s.add(car)
2 Comments

Impressive... It works just fine. Nonetheless, it is a bit complex. I think I'll definitely validate my data on the application-side. Thank you very much for your help
Yes - it was fun trying to make it work, but the typing stuff in particular is dubious. In production I would probably use marshmallow, pydantic or similar to handle the validation.
