
The standard library in 3.7 can recursively convert a dataclass into a dict (example from the docs):

from dataclasses import dataclass, asdict
from typing import List

@dataclass
class Point:
    x: int
    y: int

@dataclass
class C:
    mylist: List[Point]

p = Point(10, 20)
assert asdict(p) == {'x': 10, 'y': 20}

c = C([Point(0, 0), Point(10, 4)])
tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
assert asdict(c) == tmp

I am looking for a way to turn a dict back into a dataclass when there is nesting. Something like C(**tmp) only works if the fields of the dataclass are simple types and not themselves dataclasses. I am aware of jsonpickle, but it comes with a prominent security warning.
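
For concreteness, here is a minimal sketch of the failure mode (standard library only): the outer constructor succeeds, but the nested dicts are never converted back into Point instances.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Point:
    x: int
    y: int

@dataclass
class C:
    mylist: List[Point]

tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
c = C(**tmp)

# The outer class is constructed without error, because dataclasses
# do not validate types at runtime...
assert c.mylist == [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]

# ...but the nested values are still plain dicts, not Point instances.
assert not isinstance(c.mylist[0], Point)
```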


EDIT:

Answers have suggested the following libraries:

  • dacite
  • mashumaro (I used it for a while; it works well, but I quickly ran into tricky corner cases)
  • pydantic (works very well, excellent documentation and fewer corner cases)
  • The question this is marked as a duplicate of is indeed asking the same, but the answer given there does not work for this particular example. I've left a comment there and still looking for a more general answer. Commented Nov 21, 2018 at 9:51
  • Could you make that difference explicit here? It looks like you may have to add an elif to that if that checks for various hints. I'm not sure how you would generalize it to arbitrary type hints though (Dict and Tuple in addition to List, for example) Commented Nov 21, 2018 at 14:40
  • asdict is losing information. It would not be possible to do this in the general case. Commented Nov 26, 2018 at 18:44
  • Specifically, asdict doesn't store any information about what class the dict was produced from. Given class A: x: int and class B: x: int, should {'x': 5} be used to create an instance of A or B? You seem to be making the assumption that the list of attribute names uniquely defines a class, and that there is an existing mapping of names to data classes that could be used to select the correct class. Commented Nov 26, 2018 at 18:50
  • I would recommend you check out this library. Commented Nov 28, 2018 at 8:48
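
The information loss described in the comments is easy to reproduce with the standard library alone (classes A and B mirror the example in the comment):

```python
from dataclasses import dataclass, asdict

@dataclass
class A:
    x: int

@dataclass
class B:
    x: int

# Both classes produce an identical dict, so the dict alone cannot
# tell you which class it should be turned back into.
assert asdict(A(5)) == asdict(B(5)) == {'x': 5}
```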

18 Answers


I'm the author of dacite, a tool that simplifies the creation of data classes from dictionaries.

The library exposes a single function, from_dict. Here is a quick example of its usage:

from dataclasses import dataclass
from dacite import from_dict

@dataclass
class User:
    name: str
    age: int
    is_active: bool

data = {
    'name': 'john',
    'age': 30,
    'is_active': True,
}

user = from_dict(data_class=User, data=data)
assert user == User(name='john', age=30, is_active=True)

Moreover, dacite supports the following features:

  • nested structures
  • (basic) type checking
  • optional fields (i.e. typing.Optional)
  • unions
  • collections
  • value casting and transformation
  • remapping of field names

... and it's well tested - 100% code coverage!

To install dacite, simply use pip (or pipenv):

$ pip install dacite 

11 Comments

awesome! How can we propose to add this functionality to the python standard library? :-)
I can't understand why Python added dataclasses but not the ability to create them from a dictionary, including nested classes.
It's also pretty slow, unfortunately, though the validation is quite a nice feature.
@rv.kvetch - you can try this branch github.com/konradhalas/dacite/tree/feature/… it has multiple performance improvements
As of today, I get the opposite result to the claim that pydantic dataclasses are faster than dacite (above comments and the link to their benchmark). I ran the benchmark code linked above as is, and dacite was about 200 ms quicker, at 0.4 s versus 0.6 s for pydantic.

All it takes is a five-liner:

import dataclasses

def dataclass_from_dict(klass, d):
    if isinstance(d, list):
        (inner,) = klass.__args__
        return [dataclass_from_dict(inner, i) for i in d]
    try:
        fieldtypes = {f.name: f.type for f in dataclasses.fields(klass)}
        return klass(**{f: dataclass_from_dict(fieldtypes[f], d[f]) for f in d})
    except (TypeError, KeyError):
        return d  # Not a dataclass field

Sample usage:

from dataclasses import dataclass, asdict

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Line:
    a: Point
    b: Point

line = Line(Point(1, 2), Point(3, 4))
assert line == dataclass_from_dict(Line, asdict(line))

Full code, including to/from json, here at gist: https://gist.github.com/gatopeich/1efd3e1e4269e1e98fae9983bb914f22

15 Comments

This should be the accepted answer. Five lines of code, no external dependencies. +1
Ok, tested this intensively, and now I know why it is not. This code is buggy. Nevertheless, it is the best concept to this specific question and debugged works really well.
Shouldn't the except be catching something specific like a KeyError?
Also breaks for Optional fields.
it doesn't handle List

Below is the CPython implementation of asdict – or specifically, the internal recursive helper function _asdict_inner that it uses:

# Source: https://github.com/python/cpython/blob/master/Lib/dataclasses.py

def _asdict_inner(obj, dict_factory):
    if _is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            value = _asdict_inner(getattr(obj, f.name), dict_factory)
            result.append((f.name, value))
        return dict_factory(result)
    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        # [large block of author comments]
        return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
    elif isinstance(obj, (list, tuple)):
        # [ditto]
        return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)(
            (_asdict_inner(k, dict_factory), _asdict_inner(v, dict_factory))
            for k, v in obj.items()
        )
    else:
        return copy.deepcopy(obj)

asdict simply calls the above with some assertions, and dict_factory=dict by default.

How can this be adapted to create an output dictionary with the required type-tagging, as mentioned in the comments?


1. Adding type information

My attempt involved creating a custom return wrapper inheriting from dict:

class TypeDict(dict):
    def __init__(self, t, *args, **kwargs):
        super(TypeDict, self).__init__(*args, **kwargs)
        if not isinstance(t, type):
            raise TypeError("t must be a type")
        self._type = t

    @property
    def type(self):
        return self._type

Looking at the original code, only the first clause needs to be modified to use this wrapper, as the other clauses only handle containers of dataclasses:

# only use dict for now; easy to add back later

def _todict_inner(obj):
    if is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            value = _todict_inner(getattr(obj, f.name))
            result.append((f.name, value))
        return TypeDict(type(obj), result)
    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        return type(obj)(*[_todict_inner(v) for v in obj])
    elif isinstance(obj, (list, tuple)):
        return type(obj)(_todict_inner(v) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_todict_inner(k), _todict_inner(v))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

Imports:

# thanks to Patrick Haugh
from dataclasses import dataclass, fields, is_dataclass
from typing import *

# deepcopy
import copy

Functions used:

# copy of the internal function _is_dataclass_instance
def is_dataclass_instance(obj):
    return is_dataclass(obj) and not isinstance(obj, type)

# the adapted version of asdict
def todict(obj):
    if not is_dataclass_instance(obj):
        raise TypeError("todict() should be called on dataclass instances")
    return _todict_inner(obj)

Tests with the example dataclasses:

c = C([Point(0, 0), Point(10, 4)])
print(c)

cd = todict(c)
print(cd)
# {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
print(cd.type)
# <class '__main__.C'>

Results are as expected.


2. Converting back to a dataclass

The recursive routine used by asdict can be re-used for the reverse process, with some relatively minor changes:

def _fromdict_inner(obj):
    # reconstruct the dataclass using the type tag
    if is_dataclass_dict(obj):
        result = {}
        for name, data in obj.items():
            result[name] = _fromdict_inner(data)
        return obj.type(**result)
    # exactly the same as before (without the tuple clause)
    elif isinstance(obj, (list, tuple)):
        return type(obj)(_fromdict_inner(v) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_fromdict_inner(k), _fromdict_inner(v))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

Functions used:

def is_dataclass_dict(obj):
    return isinstance(obj, TypeDict)

def fromdict(obj):
    if not is_dataclass_dict(obj):
        raise TypeError("fromdict() should be called on TypeDict instances")
    return _fromdict_inner(obj)

Test:

c = C([Point(0, 0), Point(10, 4)])
cd = todict(c)
cf = fromdict(cd)

print(c)
# C(mylist=[Point(x=0, y=0), Point(x=10, y=4)])
print(cf)
# C(mylist=[Point(x=0, y=0), Point(x=10, y=4)])

Again as expected.

4 Comments

TL;DR, +1 for the comprehensiveness for the answer.
+0: +1 for trying it, but -1 because it's basically a bad idea in the first place.
@wim I'd agree tbh - can't see it as much more than a theoretical exercise (which at least shows that dataclass plays well with existing object types).
I'm going to accept this as it's the most comprehensive answer that helps future users understand the core of the issue. I ended up with something closer to @Martijn's suggestion as I did indeed want JSON. Thank you everyone for your answers

Using no additional modules, you can make use of the __post_init__ function to automatically convert the dict values to the correct type. This function is called after __init__.

from dataclasses import dataclass, asdict

@dataclass
class Bar:
    fee: str
    far: str

@dataclass
class Foo:
    bar: Bar

    def __post_init__(self):
        if isinstance(self.bar, dict):
            self.bar = Bar(**self.bar)

foo = Foo(bar=Bar(fee="La", far="So"))
d = asdict(foo)
print(d)  # {'bar': {'fee': 'La', 'far': 'So'}}

o = Foo(**d)
print(o)  # Foo(bar=Bar(fee='La', far='So'))

This solution has the added benefit of working with non-dataclass objects: as long as a field's string representation can be converted back, it is fair game. For example, it can be used to accept str fields while keeping them as IPv4Address objects internally.
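
The same idea can be extended by hand to collection fields. Here is a sketch (the Polygon class is hypothetical, not from the original answer), rather than a general solution:

```python
from dataclasses import dataclass, asdict, field
from typing import List

@dataclass
class Point:
    x: int
    y: int

@dataclass
class Polygon:
    points: List[Point] = field(default_factory=list)

    def __post_init__(self):
        # Convert any plain dicts in the list into Point instances,
        # leaving items that are already Points untouched.
        self.points = [
            Point(**p) if isinstance(p, dict) else p
            for p in self.points
        ]

poly = Polygon(points=[Point(0, 0), Point(10, 4)])
d = asdict(poly)  # {'points': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}

# Round-trip works because __post_init__ rebuilds the nested Points.
assert Polygon(**d) == poly
```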

4 Comments

This is the simplest solution and doesn't rely on custom functions or external libraries so much more portable. It also works well with collections and nesting multiple classes with collections. Should be marked as the correct answer.
Good solution! But how do you manage typing properly? If you add d2 = {'bar': {'fee': 'La', 'far': 'So'}} and o2 = Foo(**d2) you will get "dict[str, str]" is incompatible with "Bar" until you change to Foo to bar: Bar|dict; but if you do, and add fee = o2.bar.fee, you will get Argument of type "dict[str, str]" cannot be assigned to parameter "bar" of type "Bar" in function "__init__" Cannot access member "fee" for type "dict[Unknown, Unknown]" Member "fee" is unknown
A solution to the typing issue in my previous comment is not to define __post_init__ but to define __init__: def __init__(self, bar:Bar|dict) -> None: if isinstance(bar, dict): self.bar = Bar(**bar) else: self.bar = bar
This doesn't work for nested dataclass fields. OP specifically asked for it.

You can use mashumaro for creating a dataclass object from a dict according to the schema. A Mixin from this library adds convenient from_dict and to_dict methods to dataclasses:

from dataclasses import dataclass
from typing import List
from mashumaro import DataClassDictMixin

@dataclass
class Point(DataClassDictMixin):
    x: int
    y: int

@dataclass
class C(DataClassDictMixin):
    mylist: List[Point]

p = Point(10, 20)
tmp = {'x': 10, 'y': 20}
assert p.to_dict() == tmp
assert Point.from_dict(tmp) == p

c = C([Point(0, 0), Point(10, 4)])
tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
assert c.to_dict() == tmp
assert C.from_dict(tmp) == c

2 Comments

wow, this is great. If the dependencies on msgpack and pyyaml were optional, I could see this being included in the standard library at some point. It's such a no-brainer to add serialization to dataclasses, it's probably one of the most common reasons to use them in the first place.
@GiorgioBalestrieri Better late than never, but since version 3.0 msgpack and pyyaml are optional.

A possible solution that I haven't seen mentioned yet is to use dataclasses-json. This library provides conversions of dataclass instances to/from JSON, but also to/from dict (like dacite and mashumaro, which were suggested in earlier answers).

dataclasses-json requires decorating the classes with @dataclass_json in addition to @dataclass. The decorated classes then get a couple of member functions for conversions to/from JSON and to/from dict:

  • from_dict(...)
  • from_json(...)
  • to_dict(...)
  • to_json(...)

Here is a slightly modified version of the original code in the question. I've added the required @dataclass_json decorators and asserts for the conversion from dicts to instances of Point and C:

from dataclasses import dataclass, asdict
from dataclasses_json import dataclass_json
from typing import List

@dataclass_json
@dataclass
class Point:
    x: int
    y: int

@dataclass_json
@dataclass
class C:
    mylist: List[Point]

p = Point(10, 20)
assert asdict(p) == {'x': 10, 'y': 20}
assert p == Point.from_dict({'x': 10, 'y': 20})

c = C([Point(0, 0), Point(10, 4)])
tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
assert asdict(c) == tmp
assert c == C.from_dict(tmp)

1 Comment

Probably the simplest implementation for those looking for the easiest implementation.

If your goal is to produce JSON from and to existing, predefined dataclasses, then just write custom encoder and decoder hooks. Do not use dataclasses.asdict() here, instead record in JSON a (safe) reference to the original dataclass.

jsonpickle is not safe because it stores references to arbitrary Python objects and passes in data to their constructors. With such references I can get jsonpickle to reference internal Python data structures and create and execute functions, classes and modules at will. But that doesn't mean you can't handle such references safely. Just verify that you only import (not call), and then verify that the object is an actual dataclass type, before you use it.

The framework can be made generic enough but still limited only to JSON-serialisable types plus dataclass-based instances:

import dataclasses
import importlib
import sys

def dataclass_object_dump(ob):
    datacls = type(ob)
    if not dataclasses.is_dataclass(datacls):
        raise TypeError(f"Expected dataclass instance, got '{datacls!r}' object")
    mod = sys.modules.get(datacls.__module__)
    if mod is None or not hasattr(mod, datacls.__qualname__):
        raise ValueError(f"Can't resolve '{datacls!r}' reference")
    ref = f"{datacls.__module__}.{datacls.__qualname__}"
    fields = (f.name for f in dataclasses.fields(ob))
    return {**{f: getattr(ob, f) for f in fields}, '__dataclass__': ref}

def dataclass_object_load(d):
    ref = d.pop('__dataclass__', None)
    if ref is None:
        return d
    try:
        modname, hasdot, qualname = ref.rpartition('.')
        module = importlib.import_module(modname)
        datacls = getattr(module, qualname)
        if not dataclasses.is_dataclass(datacls) or not isinstance(datacls, type):
            raise ValueError
        return datacls(**d)
    except (ModuleNotFoundError, ValueError, AttributeError, TypeError):
        raise ValueError(f"Invalid dataclass reference {ref!r}") from None

This uses JSON-RPC-style class hints to name the dataclass, and on loading this is verified to still be a data class with the same fields. No type checking is done on the values of the fields (as that's a whole different kettle of fish).

Use these as the default and object_hook arguments to json.dump[s]() and json.load[s]():

>>> print(json.dumps(c, default=dataclass_object_dump, indent=4))
{
    "mylist": [
        {
            "x": 0,
            "y": 0,
            "__dataclass__": "__main__.Point"
        },
        {
            "x": 10,
            "y": 4,
            "__dataclass__": "__main__.Point"
        }
    ],
    "__dataclass__": "__main__.C"
}
>>> json.loads(json.dumps(c, default=dataclass_object_dump), object_hook=dataclass_object_load)
C(mylist=[Point(x=0, y=0), Point(x=10, y=4)])
>>> json.loads(json.dumps(c, default=dataclass_object_dump), object_hook=dataclass_object_load) == c
True

or create instances of the JSONEncoder and JSONDecoder classes with those same hooks.

Instead of using fully qualifying module and class names, you could also use a separate registry to map permissible type names; check against the registry on encoding, and again on decoding to ensure you don't forget to register dataclasses as you develop.
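
A minimal sketch of such a registry, with illustrative helper names (register, dump_hook, load_hook) rather than a drop-in API:

```python
import dataclasses
import json

# Registry of permissible dataclasses, keyed by a short, stable name.
_DATACLASS_REGISTRY = {}

def register(cls):
    """Class decorator: allow cls to be (de)serialised by name."""
    if not dataclasses.is_dataclass(cls):
        raise TypeError(f"{cls!r} is not a dataclass")
    _DATACLASS_REGISTRY[cls.__qualname__] = cls
    return cls

def dump_hook(ob):
    # Refuse to encode anything that was not explicitly registered.
    cls = type(ob)
    name = cls.__qualname__
    if _DATACLASS_REGISTRY.get(name) is not cls:
        raise TypeError(f"{name!r} is not a registered dataclass")
    field_names = (f.name for f in dataclasses.fields(ob))
    return {**{f: getattr(ob, f) for f in field_names}, '__dataclass__': name}

def load_hook(d):
    name = d.pop('__dataclass__', None)
    if name is None:
        return d  # plain dict, leave it alone
    try:
        return _DATACLASS_REGISTRY[name](**d)
    except KeyError:
        raise ValueError(f"Unregistered dataclass reference {name!r}") from None

@register
@dataclasses.dataclass
class Point:
    x: int
    y: int

p = Point(10, 20)
s = json.dumps(p, default=dump_hook)
assert json.loads(s, object_hook=load_hook) == p
```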

Comments


I know there are probably tons of JSON serialization libraries out there by now, and to be honest I might have stumbled upon this question a bit late. However, a newer (and well-tested) option is the dataclass-wizard library. It recently (as of two weeks ago, in any case) moved to Production/Stable status with the v0.18.0 release.

It has pretty solid support for typing generics from the typing module, as well as other niche use cases such as dataclasses in Union types and patterned dates and times. Other nice-to-have features that I have personally found quite useful, such as automatic key casing transforms (i.e. camelCase to snake_case) and implicit type casts (i.e. string to annotated int), are implemented as well.

Ideal usage is with the JSONWizard Mixin class, which provides useful class methods such as:

  • from_json
  • from_dict / from_list
  • to_dict
  • to_json / list_to_json

Here's a pretty self-explanatory usage that has been tested in Python 3.7+ with the included __future__ import:

from __future__ import annotations
from dataclasses import dataclass
from dataclass_wizard import JSONWizard

@dataclass
class C(JSONWizard):
    my_list: list[Point]

@dataclass
class Point(JSONWizard):
    x: int
    y: int

# Serialize Point instance
p = Point(10, 20)
tmp = {'x': 10, 'y': 20}
assert p.to_dict() == tmp
assert Point.from_dict(tmp) == p

c = C([Point(0, 0), Point(10, 4)])

# default case transform is 'camelCase', though this can be overridden
# with a custom Meta config supplied for the main dataclass.
tmp = {'myList': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
assert c.to_dict() == tmp
assert C.from_dict(tmp) == c

NB: It's worth noting that, technically, you only need to sub-class the main dataclass, i.e. the model being serialized; the nested dataclasses can be left alone if desired.

If a class inheritance model is not desired altogether, the other option is to use exported helper functions such as fromdict, asdict to convert dataclass instance to/from Python dict objects as needed.

3 Comments

how does your library differ from dataclasses-json? the latter seems to be earlier and has more stars.
I do certainly like dataclasses-json which is a similar library that can be used to achieve the same result. I actually added another answer that explains similarities and usage between the two. I believe the main difference is that dataclasses-json relies on external libraries such as marshmallow to generate a schema; whereas the dataclass-wizard only requires Python stdlib. As a result, it's slightly faster; if curious, you can check out some benchmarks I added here.
An important thing to note is that dataclasses-json supports a few additional features that this library doesn't support currently. For example, converting a dataclass to JSON schema. However, I do plan to add support for this in a future release, as I agree this can be a common use case.

I really think the concept presented by gatopeich in this answer is the best approach for this question.

I've fixed and civilized his code. This is a corrected function to load a dataclass back from a dictionary:

import dataclasses
import typing as t

def dataclass_from_dict(cls: type, src: t.Mapping[str, t.Any]) -> t.Any:
    field_types_lookup = {
        field.name: field.type
        for field in dataclasses.fields(cls)
    }

    constructor_inputs = {}
    for field_name, value in src.items():
        try:
            constructor_inputs[field_name] = dataclass_from_dict(
                field_types_lookup[field_name], value)
        except TypeError:
            # A TypeError from the fields() call in the recursive call
            # indicates that the field is not a dataclass; this is how we
            # break the recursion. If not a dataclass, no loading is needed.
            constructor_inputs[field_name] = value
        except KeyError:
            # Similar: field not defined on the dataclass, pass as plain value.
            constructor_inputs[field_name] = value

    return cls(**constructor_inputs)

Then you can test it with the following:

from dataclasses import dataclass, asdict

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Line:
    a: Point
    b: Point

p1, p2 = Point(1, 1), Point(2, 2)
line = Line(p1, p1)
assert line == dataclass_from_dict(Line, asdict(line))

2 Comments

Worth noting that this doesn't cast optional fields (e.g. a: Point | None = None).
Works as far as my use-case is concerned. If you're trying to deserialise enum values, you can add field_types_lookup[field_name](value) if issubclass(field_types_lookup[field_name], Enum) else value to the TypeError case.

A possible alternative is the lightweight chili library:

from dataclasses import dataclass, asdict
from typing import List
from chili import init_dataclass

@dataclass
class Point:
    x: int
    y: int

@dataclass
class C:
    mylist: List[Point]

p = Point(10, 20)
assert asdict(p) == {'x': 10, 'y': 20}

c = C([Point(0, 0), Point(10, 4)])
tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
assert asdict(c) == tmp
assert c == init_dataclass(tmp, C)

Chili supports almost the entire typing module, including custom Generic types. You can read more here: https://github.com/kodemore/chili

Installation can be done through pip or poetry by simply running:

pip install chili

or

poetry add chili

It has only one dependency, which is typing_extensions.

2 Comments

This is fantastic. Under-rated :)
Note that this doesn't work with frozen dataclasses.

Here is my own implementation based on the methods already proposed. I find it cleaner and more robust. Notably, it deals with by-product field values (i.e. init=False), InitVar fields and tp.Generic type hints:

import typing as tp
from dataclasses import InitVar, dataclass, field, fields, is_dataclass

ValueT = tp.TypeVar("ValueT")

def dataclass_from_dict(cls: tp.Type[ValueT], src: tp.Mapping[str, tp.Any]) -> ValueT:
    kwargs = {}
    fields_lookup = {field.name: field for field in fields(cls)}
    for field_name, value in src.items():
        try:
            field = fields_lookup[field_name]
            if not field.init:
                continue
        except KeyError:
            annotations = {
                k: v for c in cls.mro()[:-1][::-1]
                for k, v in tp.get_type_hints(c).items()}
            field = annotations[field_name]
            assert isinstance(field, InitVar)
        if tp.get_origin(field) is tp.Union and type(None) in tp.get_args(field):
            field_type = tp.get_args(field.type)[0]
        else:
            field_type = tp.get_origin(field.type) or field.type
        if is_dataclass(field_type) and value is not None:
            kwargs[field_name] = dataclass_from_dict(field_type, value)
        elif isinstance(value, dict):
            _, subfield_type = tp.get_args(field.type)
            kwargs[field_name] = {}
            for key, data in value.items():
                if isinstance(data, dict):
                    kwargs[field_name][key] = dataclass_from_dict(
                        subfield_type, data)
                else:
                    kwargs[field_name][key] = data
        else:
            kwargs[field_name] = value
    return cls(**kwargs)

1 Comment

Now fix it to work with an invalid dict (for example, when the dataclass has a new version and loads a dict from an old version), to fall back to the default value or default_factory when a value is not correct, and to work with from __future__ import annotations, because type hints may be strings in that case; then your answer will be the most correct.

Validobj does just that. Compared to other libraries, it provides a simpler interface (just one function at the moment) and emphasizes informative error messages. For example, given a schema like

import dataclasses
from typing import Optional, List

@dataclasses.dataclass
class User:
    name: str
    phone: Optional[str] = None
    tasks: List[str] = dataclasses.field(default_factory=list)

One gets an error like

>>> import validobj
>>> validobj.parse_input({
...     'phone': '555-1337-000', 'address': 'Somewhereville', 'nme': 'Zahari'}, User
... )
Traceback (most recent call last):
...
WrongKeysError: Cannot process value into 'User' because fields do not match.
The following required keys are missing: {'name'}.
The following keys are unknown: {'nme', 'address'}.
Alternatives to invalid value 'nme' include:
  - name
All valid options are:
  - name
  - phone
  - tasks

for a typo on a given field.

Comments


A simple solution that supports lists as well (and can be extended for other generic types):

from dataclasses import dataclass, asdict, fields, is_dataclass
from types import GenericAlias

def asdataclass(klass, d):
    if not is_dataclass(klass):
        return d
    values = {}
    for f in fields(klass):
        if isinstance(f.type, GenericAlias) and f.type.__origin__ == list:
            values[f.name] = [asdataclass(f.type.__args__[0], d2) for d2 in d[f.name]]
        else:
            values[f.name] = asdataclass(f.type, d[f.name])
    return klass(**values)

@dataclass
class Point:
    x: int
    y: int

@dataclass
class C:
    mylist: list[Point]
    title: str = ""

c = C([Point(0, 0), Point(10, 4)])
assert c == asdataclass(C, asdict(c))

Based on https://stackoverflow.com/a/54769644/871166

1 Comment

This fails if the class contains a field that does not exist in the dictionary. It can easily be fixed by checking to make sure the field name exists in the dictionary's keys, and returning None if it does not.
from dataclasses import dataclass, is_dataclass

@dataclass
class test2:
    a: str = 'name'
    b: int = 222

@dataclass
class test:
    a: str = 'name'
    b: int = 222
    t: test2 = None

a = test(a=2222222222, t=test2(a="ssss"))
print(a)

def dataclass_from_dict(schema: any, data: dict):
    data_updated = {
        key: (
            data[key]
            if not is_dataclass(schema.__annotations__[key])
            else dataclass_from_dict(schema.__annotations__[key], data[key])
        )
        for key in data.keys()
    }
    return schema(**data_updated)

print(dataclass_from_dict(test, {'a': 1111111, 't': {'a': 'nazwa'}}))

1 Comment

Could you add some explanations for your code and structure? Helps put your solution to OP's question into context of the error

undictify is a library which could be of help. Here is a minimal usage example:

import json
from dataclasses import dataclass
from typing import List, NamedTuple, Optional, Any
from undictify import type_checked_constructor

@type_checked_constructor(skip=True)
@dataclass
class Heart:
    weight_in_kg: float
    pulse_at_rest: int

@type_checked_constructor(skip=True)
@dataclass
class Human:
    id: int
    name: str
    nick: Optional[str]
    heart: Heart
    friend_ids: List[int]

tobias_dict = json.loads('''
    {
        "id": 1,
        "name": "Tobias",
        "heart": {
            "weight_in_kg": 0.31,
            "pulse_at_rest": 52
        },
        "friend_ids": [2, 3, 4, 5]
    }''')
tobias = Human(**tobias_dict)

Comments


I would like to suggest using the Composite Pattern to solve this, the main advantage is that you could continue adding classes to this pattern and have them behave the same way.

from dataclasses import dataclass
from typing import List

@dataclass
class CompositeDict:
    def as_dict(self):
        retval = dict()
        for key, value in self.__dict__.items():
            if key in self.__dataclass_fields__.keys():
                if type(value) is list:
                    retval[key] = [item.as_dict() for item in value]
                else:
                    retval[key] = value
        return retval

@dataclass
class Point(CompositeDict):
    x: int
    y: int

@dataclass
class C(CompositeDict):
    mylist: List[Point]

c = C([Point(0, 0), Point(10, 4)])
tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
assert c.as_dict() == tmp

As a side note, you could employ a factory pattern within the CompositeDict class to handle other cases such as nested dicts, tuples and the like, which would save much boilerplate.
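
One possible sketch of that extension (the recursive convert helper is hypothetical, not part of the original answer), dispatching on the value's container type to cover nested composites, lists, tuples and dicts:

```python
from dataclasses import dataclass
from typing import Any, List

@dataclass
class CompositeDict:
    def as_dict(self) -> dict:
        def convert(value: Any) -> Any:
            # Dispatch on the value's type, recursing into containers.
            if isinstance(value, CompositeDict):
                return value.as_dict()
            if isinstance(value, (list, tuple)):
                return type(value)(convert(v) for v in value)
            if isinstance(value, dict):
                return {k: convert(v) for k, v in value.items()}
            return value

        return {
            key: convert(value)
            for key, value in self.__dict__.items()
            if key in self.__dataclass_fields__
        }

@dataclass
class Point(CompositeDict):
    x: int
    y: int

@dataclass
class C(CompositeDict):
    mylist: List[Point]

c = C([Point(0, 0), Point(10, 4)])
assert c.as_dict() == {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
```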

2 Comments

This solution is poor, it's too much complicated, you should rather use external library like "dacite".
@jurass if the OP wanted to use a library he wouldn't ask this question
from validated_dc import ValidatedDC
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Foo(ValidatedDC):
    foo: int

@dataclass
class Bar(ValidatedDC):
    bar: Union[Foo, List[Foo]]

foo = {'foo': 1}
instance = Bar(bar=foo)
print(instance.get_errors())  # None
print(instance)  # Bar(bar=Foo(foo=1))

list_foo = [{'foo': 1}, {'foo': 2}]
instance = Bar(bar=list_foo)
print(instance.get_errors())  # None
print(instance)  # Bar(bar=[Foo(foo=1), Foo(foo=2)])

validated_dc:
https://github.com/EvgeniyBurdin/validated_dc

And see a more detailed example:
https://github.com/EvgeniyBurdin/validated_dc/blob/master/examples/detailed.py

Comments


I came across the apischema library. It provides both serialization and deserialization of dataclasses.

Here is an example:

from dataclasses import dataclass, field
from uuid import UUID, uuid4

from apischema import deserialize, serialize

# Define a schema with standard dataclasses
@dataclass
class Resource:
    id: UUID
    name: str
    tags: set[str] = field(default_factory=set)

# Get some data
uuid = uuid4()
data = {"id": str(uuid), "name": "wyfo", "tags": ["some_tag"]}

# Deserialize data
resource = deserialize(Resource, data)
assert resource == Resource(uuid, "wyfo", {"some_tag"})

# Serialize objects
assert serialize(Resource, resource) == data

This even works with Python 3.12 generic (data)classes. It uses dataclass type hints to validate the input and then build the object from it.

1 Comment

I'll admit, apischema is impressively fast! That said, using the example above, I found that dataclass-wizard currently performs close to 2x faster (see benchmarks). With the planned v1 release, performance is expected to improve slightly as well. It's worth noting that Dataclass Wizard does not support generic (data)classes yet. However, there's an open issue for this, and I plan to add support in a future update. (Disclaimer: I'm the author of dataclass-wizard.)
