mashumaro
Fast and well tested serialization library


In Python, you often need to dump and load objects based on the schema you have. It can be a dataclass model, a list of third-party generic classes or whatever. Mashumaro not only lets you save and load things in different ways, but it also does it super quickly.

Key features

  • 🚀 One of the fastest libraries
  • ☝️ Mature and time-tested
  • 👶 Easy to use out of the box
  • ⚙️ Highly customizable
  • 🎉 Built-in support for JSON, YAML, TOML, MessagePack
  • 📦 Built-in support for almost all Python types including typing-extensions
  • 📝 JSON Schema generation

Introduction

This library provides two fundamentally different approaches to converting your data to and from various formats. Each of them is useful in different situations:

  • Codecs
  • Mixins

Codecs are represented by a set of decoder / encoder classes and decode / encode functions for each supported format. You can use them to convert data of any Python built-in or third-party type to JSON, YAML, TOML, MessagePack or a basic form accepted by other serialization formats. For example, you can convert a list of datetime objects to a JSON array of string-represented datetimes and vice versa.

Mixins are primarily for dataclass models. They are represented by mixin classes that add methods for converting to and from JSON, YAML, TOML, MessagePack or a basic form accepted by other serialization formats. If you have a root dataclass model, this is the easiest way to make it serializable: all you have to do is inherit a particular mixin class.

In addition to serialization functionality, this library also provides a JSON Schema builder that can be used in places where interoperability matters.

Installation

Use pip to install:

$ pip install mashumaro

The current version of mashumaro supports Python versions 3.8–3.13.

It's not recommended to use any version of Python that has reached its end of life and is no longer receiving security updates or bug fixes from the Python development team. For convenience, there is a table below that outlines the last version of mashumaro that can be installed on unmaintained versions of Python.

Python Version | Last Version of mashumaro | Python EOL
3.7            | 3.9.1                     | 2023-06-27
3.6            | 3.1.1                     | 2021-12-23

Changelog

This project follows the principles of Semantic Versioning. The changelog is available on the GitHub Releases page.

Supported data types

There is built-in support for:

  • generic types from the standard typing module
  • standard generic types on PEP 585 compatible Python (3.9+)
  • special primitives from the typing module
  • standard interpreter types from the types module
  • enumerations based on classes from the standard enum module
  • common built-in types
  • built-in datetime oriented types (see more details)
  • pathlike types
  • other less popular built-in types
  • backported types from typing-extensions
  • arbitrary types

Usage example

Suppose we're developing a financial application and we operate with currencies and stocks:

from dataclasses import dataclass
from enum import Enum

class Currency(Enum):
    USD = "USD"
    EUR = "EUR"

@dataclass
class CurrencyPosition:
    currency: Currency
    balance: float

@dataclass
class StockPosition:
    ticker: str
    name: str
    balance: int

Now we want a dataclass for a portfolio that will be serialized to and from JSON. We inherit DataClassJSONMixin, which adds this functionality:

from mashumaro.mixins.json import DataClassJSONMixin

...

@dataclass
class Portfolio(DataClassJSONMixin):
    currencies: list[CurrencyPosition]
    stocks: list[StockPosition]

Let's create a portfolio instance and check the from_json and to_json methods:

portfolio = Portfolio(
    currencies=[
        CurrencyPosition(Currency.USD, 238.67),
        CurrencyPosition(Currency.EUR, 361.84),
    ],
    stocks=[
        StockPosition("AAPL", "Apple", 10),
        StockPosition("AMZN", "Amazon", 10),
    ]
)

portfolio_json = portfolio.to_json()
assert Portfolio.from_json(portfolio_json) == portfolio

If we need to serialize something other than a root dataclass, we can use codecs. In the following example we create a JSON decoder and encoder for a list of currencies:

from mashumaro.codecs.json import JSONDecoder, JSONEncoder

...

decoder = JSONDecoder(list[CurrencyPosition])
encoder = JSONEncoder(list[CurrencyPosition])

currencies = [
    CurrencyPosition(Currency.USD, 238.67),
    CurrencyPosition(Currency.EUR, 361.84),
]
currencies_json = encoder.encode(currencies)
assert decoder.decode(currencies_json) == currencies

How does it work?

This library works by taking the schema of the data and generating a specific decoder and encoder for exactly that schema, taking into account the specifics of the serialization format. This is much faster than inspecting data types on every decode or encode call at runtime.
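The idea can be illustrated with a hypothetical sketch (not mashumaro's actual code): given a schema, we generate a decoding function once, so calls to it involve no type inspection at all:

```python
from datetime import datetime

def compile_decoder(field_types):
    # Generate source code for a decoder specialized to this exact schema
    lines = ["def decode(data):", "    return {"]
    for name, typ in field_types.items():
        if typ is datetime:
            lines.append(f"        {name!r}: datetime.fromisoformat(data[{name!r}]),")
        else:
            lines.append(f"        {name!r}: {typ.__name__}(data[{name!r}]),")
    lines.append("    }")
    namespace = {"datetime": datetime}
    exec("\n".join(lines), namespace)  # compile once, reuse many times
    return namespace["decode"]

# The compiled decoder contains no branching on types at call time
decode = compile_decoder({"created": datetime, "count": int})
row = decode({"created": "2023-01-01T00:00:00", "count": "5"})
```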

These specific decoders and encoders are generated by codecs and mixins:

  • When using codecs, these methods are compiled during the creation of the decoder or encoder.
  • When using serialization mixins, these methods are compiled during import time (or at runtime in some cases) and are set as attributes to your dataclasses. To minimize the import time, you can explicitly enable lazy compilation.

Benchmark

  • macOS 14.0 Sonoma
  • Apple M1
  • 16GB RAM
  • Python 3.12.0

The benchmark was made with pyperf using the GitHub Issue model. Please note that the following charts use a logarithmic scale, as it is convenient for displaying very large ranges of values.

Note

Benchmark results may vary depending on the specific configuration and parameters used for serialization and deserialization. However, we have made an attempt to use the available options that can speed up and smooth out the differences in how libraries work.

To run the benchmark in your environment:

git clone git@github.com:Fatal1ty/mashumaro.git
cd mashumaro
python3 -m venv env && source env/bin/activate
pip install -e .
pip install -r requirements-dev.txt
./benchmark/run.sh

Supported serialization formats

This library has built-in support for multiple popular formats:

There are preconfigured codecs and mixin classes. However, you're free to override some settings if necessary.

Important

With codecs, you can choose between convenience and efficiency. When you need to decode or encode typed data more than once, it's highly recommended to create and reuse a decoder or encoder specifically for that data type. For one-time use with default settings it may be convenient to use the global functions that create a disposable decoder or encoder under the hood. Remember that you should not use these convenient global functions more than once for the same data type if performance is important to you.

Basic form

Basic form denotes a Python object consisting only of basic data types supported by most serialization formats. These types are: str, int, float, bool, list, dict.

This is also a starting point you can play with for a comprehensive transformation of your data.

Efficient decoder and encoder can be used as follows:

from mashumaro.codecs import BasicDecoder, BasicEncoder
# or from mashumaro.codecs.basic import BasicDecoder, BasicEncoder

decoder = BasicDecoder(<shape_type>, ...)
decoder.decode(...)

encoder = BasicEncoder(<shape_type>, ...)
encoder.encode(...)

Convenient functions are recommended to be used as follows:

import mashumaro.codecs.basic as basic_codec

basic_codec.decode(..., <shape_type>)
basic_codec.encode(..., <shape_type>)

Mixin can be used as follows:

from mashumaro import DataClassDictMixin
# or from mashumaro.mixins.dict import DataClassDictMixin

@dataclass
class MyModel(DataClassDictMixin):
    ...

MyModel.from_dict(...)
MyModel(...).to_dict()

Tip

You don't need to inherit DataClassDictMixin along with other serialization mixins because it's a base class for them.

JSON

JSON is a lightweight data-interchange format. You can choose between the standard json library for compatibility and the third-party orjson dependency for better performance.

json library

Efficient decoder and encoder can be used as follows:

from mashumaro.codecs.json import JSONDecoder, JSONEncoder

decoder = JSONDecoder(<shape_type>, ...)
decoder.decode(...)

encoder = JSONEncoder(<shape_type>, ...)
encoder.encode(...)

Convenient functions can be used as follows:

from mashumaro.codecs.json import json_decode, json_encode

json_decode(..., <shape_type>)
json_encode(..., <shape_type>)

Convenient function aliases are recommended to be used as follows:

import mashumaro.codecs.json as json_codec

json_codec.decode(..., <shape_type>)
json_codec.encode(..., <shape_type>)

Mixin can be used as follows:

from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class MyModel(DataClassJSONMixin):
    ...

MyModel.from_json(...)
MyModel(...).to_json()

orjson library

In order to use the orjson library, it must be installed manually or using an extra option for mashumaro:

pip install mashumaro[orjson]

The following data types will be handled by the orjson library by default:

Efficient decoder and encoder can be used as follows:

from mashumaro.codecs.orjson import ORJSONDecoder, ORJSONEncoder

decoder = ORJSONDecoder(<shape_type>, ...)
decoder.decode(...)

encoder = ORJSONEncoder(<shape_type>, ...)
encoder.encode(...)

Convenient functions can be used as follows:

from mashumaro.codecs.orjson import json_decode, json_encode

json_decode(..., <shape_type>)
json_encode(..., <shape_type>)

Convenient function aliases are recommended to be used as follows:

import mashumaro.codecs.orjson as json_codec

json_codec.decode(..., <shape_type>)
json_codec.encode(..., <shape_type>)

Mixin can be used as follows:

from mashumaro.mixins.orjson import DataClassORJSONMixin

@dataclass
class MyModel(DataClassORJSONMixin):
    ...

MyModel.from_json(...)
MyModel(...).to_json()
MyModel(...).to_jsonb()

YAML

YAML is a human-friendly data serialization language for all programming languages. In order to use this format, the pyyaml package must be installed. You can install it manually or using an extra option for mashumaro:

pip install mashumaro[yaml]

Efficient decoder and encoder can be used as follows:

from mashumaro.codecs.yaml import YAMLDecoder, YAMLEncoder

decoder = YAMLDecoder(<shape_type>, ...)
decoder.decode(...)

encoder = YAMLEncoder(<shape_type>, ...)
encoder.encode(...)

Convenient functions can be used as follows:

from mashumaro.codecs.yaml import yaml_decode, yaml_encode

yaml_decode(..., <shape_type>)
yaml_encode(..., <shape_type>)

Convenient function aliases are recommended to be used as follows:

import mashumaro.codecs.yaml as yaml_codec

yaml_codec.decode(..., <shape_type>)
yaml_codec.encode(..., <shape_type>)

Mixin can be used as follows:

from mashumaro.mixins.yaml import DataClassYAMLMixin

@dataclass
class MyModel(DataClassYAMLMixin):
    ...

MyModel.from_yaml(...)
MyModel(...).to_yaml()

TOML

TOML is a config file format for humans. In order to use this format, the tomli and tomli-w packages must be installed. In Python 3.11+, tomli is included in the standard library as tomllib and is used for this format. You can install the missing packages manually or using an extra option for mashumaro:

pip install mashumaro[toml]

The following data types will be handled by the tomli/tomli-w libraries by default:

Fields with value None will be omitted on serialization because TOML doesn't support null values.

Efficient decoder and encoder can be used as follows:

from mashumaro.codecs.toml import TOMLDecoder, TOMLEncoder

decoder = TOMLDecoder(<shape_type>, ...)
decoder.decode(...)

encoder = TOMLEncoder(<shape_type>, ...)
encoder.encode(...)

Convenient functions can be used as follows:

from mashumaro.codecs.toml import toml_decode, toml_encode

toml_decode(..., <shape_type>)
toml_encode(..., <shape_type>)

Convenient function aliases are recommended to be used as follows:

import mashumaro.codecs.toml as toml_codec

toml_codec.decode(..., <shape_type>)
toml_codec.encode(..., <shape_type>)

Mixin can be used as follows:

from mashumaro.mixins.toml import DataClassTOMLMixin

@dataclass
class MyModel(DataClassTOMLMixin):
    ...

MyModel.from_toml(...)
MyModel(...).to_toml()

MessagePack

MessagePack is an efficient binary serialization format. In order to use this format, the msgpack package must be installed. You can install it manually or using an extra option for mashumaro:

pip install mashumaro[msgpack]

The following data types will be handled by the msgpack library by default:

Efficient decoder and encoder can be used as follows:

from mashumaro.codecs.msgpack import MessagePackDecoder, MessagePackEncoder

decoder = MessagePackDecoder(<shape_type>, ...)
decoder.decode(...)

encoder = MessagePackEncoder(<shape_type>, ...)
encoder.encode(...)

Convenient functions can be used as follows:

from mashumaro.codecs.msgpack import msgpack_decode, msgpack_encode

msgpack_decode(..., <shape_type>)
msgpack_encode(..., <shape_type>)

Convenient function aliases are recommended to be used as follows:

import mashumaro.codecs.msgpack as msgpack_codec

msgpack_codec.decode(..., <shape_type>)
msgpack_codec.encode(..., <shape_type>)

Mixin can be used as follows:

from mashumaro.mixins.msgpack import DataClassMessagePackMixin

@dataclass
class MyModel(DataClassMessagePackMixin):
    ...

MyModel.from_msgpack(...)
MyModel(...).to_msgpack()

Customization

Customization options of mashumaro are extensive and will most likely cover your needs. When it comes to non-standard data types and non-standard serialization support, you can do the following:

  • Turn an existing regular or generic class into a serializable one by inheriting the SerializableType class
  • Write different serialization strategies for an existing regular or generic type that is not under your control using SerializationStrategy class
  • Define serialization / deserialization methods:
    • for a specific dataclass field by using field options
    • for a specific data type used in the dataclass by using Config class
  • Alter input and output data with serialization / deserialization hooks
  • Separate serialization scheme from a dataclass in a reusable manner using dialects
  • Choose from predefined serialization engines for the specific data types, e.g. datetime and NamedTuple

SerializableType interface

If you have a custom class or hierarchy of classes whose instances you want to serialize with mashumaro, the first option is to implement SerializableType interface.

User-defined types

Let's look at this not very practical example:

from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializableType

class Airport(SerializableType):
    def __init__(self, code, city):
        self.code, self.city = code, city

    def _serialize(self):
        return [self.code, self.city]

    @classmethod
    def _deserialize(cls, value):
        return cls(*value)

    def __eq__(self, other):
        return (self.code, self.city) == (other.code, other.city)

@dataclass
class Flight(DataClassDictMixin):
    origin: Airport
    destination: Airport

JFK = Airport("JFK", "New York City")
LAX = Airport("LAX", "Los Angeles")

input_data = {
    "origin": ["JFK", "New York City"],
    "destination": ["LAX", "Los Angeles"]
}
my_flight = Flight.from_dict(input_data)
assert my_flight == Flight(JFK, LAX)
assert my_flight.to_dict() == input_data

You can see how Airport instances are seamlessly created from lists of two strings and serialized into them.

By default, the _deserialize method will get the raw input data without any prior transformation. This should be enough in many cases, especially when you need to perform non-standard transformations yourself, but let's extend our example:

class Itinerary(SerializableType):
    def __init__(self, flights):
        self.flights = flights

    def _serialize(self):
        return self.flights

    @classmethod
    def _deserialize(cls, flights):
        return cls(flights)

@dataclass
class TravelPlan(DataClassDictMixin):
    budget: float
    itinerary: Itinerary

input_data = {
    "budget": 10_000,
    "itinerary": [
        {
            "origin": ["JFK", "New York City"],
            "destination": ["LAX", "Los Angeles"]
        },
        {
            "origin": ["LAX", "Los Angeles"],
            "destination": ["SFO", "San Francisco"]
        }
    ]
}

If we pass the flight list as is into Itinerary._deserialize, our itinerary will contain something we may not expect: list[dict] instead of list[Flight]. The solution is quite simple. Instead of calling Flight._deserialize yourself, just use annotations:

class Itinerary(SerializableType, use_annotations=True):
    def __init__(self, flights):
        self.flights = flights

    def _serialize(self) -> list[Flight]:
        return self.flights

    @classmethod
    def _deserialize(cls, flights: list[Flight]):
        return cls(flights)

my_plan = TravelPlan.from_dict(input_data)
assert isinstance(my_plan.itinerary.flights[0], Flight)
assert isinstance(my_plan.itinerary.flights[1], Flight)
assert my_plan.to_dict() == input_data

Here we add annotations to the only argument of the _deserialize method, as well as to the return value of the _serialize method. The latter is needed for correct serialization.

Important

It is important to pass use_annotations=True explicitly when defining a class, because enabling annotations implicitly could break compatibility with old code that wasn't aware of this feature. It will be enabled by default in a future major release.

User-defined generic types

The great thing to note about using annotations in SerializableType is that they work seamlessly with generic and variadic generic types. Let's see how this can be useful:

from datetime import date
from typing import TypeVar
from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializableType

KT = TypeVar("KT")
VT = TypeVar("VT")

class DictWrapper(dict[KT, VT], SerializableType, use_annotations=True):
    def _serialize(self) -> dict[KT, VT]:
        return dict(self)

    @classmethod
    def _deserialize(cls, value: dict[KT, VT]) -> 'DictWrapper[KT, VT]':
        return cls(value)

@dataclass
class DataClass(DataClassDictMixin):
    x: DictWrapper[date, str]
    y: DictWrapper[str, date]

input_data = {
    "x": {"2022-12-07": "2022-12-07"},
    "y": {"2022-12-07": "2022-12-07"}
}
obj = DataClass.from_dict(input_data)
assert obj == DataClass(
    x=DictWrapper({date(2022, 12, 7): "2022-12-07"}),
    y=DictWrapper({"2022-12-07": date(2022, 12, 7)})
)
assert obj.to_dict() == input_data

You can see that the formatted date is deserialized to a date object before being passed to DictWrapper._deserialize, in a key or value position according to the generic parameters.

If you have generic dataclass types, you can use SerializableType for them as well, but it's not necessary since they're supported out of the box.

SerializationStrategy

If you want to add support for a custom third-party type that is not under your control, you can write the serialization and deserialization logic in a SerializationStrategy class. The strategy is reusable, which makes it well suited for widely used third-party types. SerializationStrategy is also a good choice when you want strategies that are slightly different from each other, because you can accept the differentiator in the __init__ method.

Third-party types

To demonstrate how SerializationStrategy works, let's write a simple strategy for datetime serialization in different formats. In this example we use the same strategy class for two dataclass fields, but with different string representations of the date and time.

from dataclasses import dataclass, field
from datetime import datetime
from mashumaro import DataClassDictMixin, field_options
from mashumaro.types import SerializationStrategy

class FormattedDateTime(SerializationStrategy):
    def __init__(self, fmt):
        self.fmt = fmt

    def serialize(self, value: datetime) -> str:
        return value.strftime(self.fmt)

    def deserialize(self, value: str) -> datetime:
        return datetime.strptime(value, self.fmt)

@dataclass
class DateTimeFormats(DataClassDictMixin):
    short: datetime = field(
        metadata=field_options(
            serialization_strategy=FormattedDateTime("%d%m%Y%H%M%S")
        )
    )
    verbose: datetime = field(
        metadata=field_options(
            serialization_strategy=FormattedDateTime("%A %B %d, %Y, %H:%M:%S")
        )
    )

formats = DateTimeFormats(
    short=datetime(2019, 1, 1, 12),
    verbose=datetime(2019, 1, 1, 12),
)
dictionary = formats.to_dict()
# {'short': '01012019120000', 'verbose': 'Tuesday January 01, 2019, 12:00:00'}
assert DateTimeFormats.from_dict(dictionary) == formats

Similarly to SerializableType, SerializationStrategy could also take advantage of annotations:

from dataclasses import dataclass
from datetime import datetime
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializationStrategy

class TsSerializationStrategy(SerializationStrategy, use_annotations=True):
    def serialize(self, value: datetime) -> float:
        return value.timestamp()

    def deserialize(self, value: float) -> datetime:
        # value will be converted to float before being passed to this method
        return datetime.fromtimestamp(value)

@dataclass
class Example(DataClassDictMixin):
    dt: datetime

    class Config:
        serialization_strategy = {
            datetime: TsSerializationStrategy(),
        }

example = Example.from_dict({"dt": "1672531200"})
print(example)
# Example(dt=datetime.datetime(2023, 1, 1, 3, 0))
print(example.to_dict())
# {'dt': 1672531200.0}

Here the passed string value "1672531200" will be converted to float before being passed to deserialize method thanks to the float annotation.

Important

As with SerializableType, use_annotations will default to True in a future major release.

Third-party generic types

To create a generic version of a serialization strategy, follow these steps:

  • Inherit the Generic[...] type with the number of parameters matching the number of parameters of the target generic type
  • Write generic annotations for the serialize method's return type and for the deserialize method's argument type
  • Use the origin type of the target generic type in the serialization_strategy config section (typing.get_origin might be helpful)

There is no need to add use_annotations=True here because it's enabled implicitly for generic serialization strategies.

For example, there is a third-party multidict package that has a generic MultiDict type. A generic serialization strategy for it might look like this:

from dataclasses import dataclass
from datetime import date
from pprint import pprint
from typing import Generic, List, Tuple, TypeVar
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializationStrategy

from multidict import MultiDict

T = TypeVar("T")

class MultiDictSerializationStrategy(SerializationStrategy, Generic[T]):
    def serialize(self, value: MultiDict[T]) -> List[Tuple[str, T]]:
        return [(k, v) for k, v in value.items()]

    def deserialize(self, value: List[Tuple[str, T]]) -> MultiDict[T]:
        return MultiDict(value)


@dataclass
class Example(DataClassDictMixin):
    floats: MultiDict[float]
    date_lists: MultiDict[List[date]]

    class Config:
        serialization_strategy = {
            MultiDict: MultiDictSerializationStrategy()
        }

example = Example(
    floats=MultiDict([("x", 1.1), ("x", 2.2)]),
    date_lists=MultiDict(
        [("x", [date(2023, 1, 1), date(2023, 1, 2)]),
         ("x", [date(2023, 2, 1), date(2023, 2, 2)])]
    ),
)
pprint(example.to_dict())
# {'date_lists': [['x', ['2023-01-01', '2023-01-02']],
#                 ['x', ['2023-02-01', '2023-02-02']]],
#  'floats': [['x', 1.1], ['x', 2.2]]}
assert Example.from_dict(example.to_dict()) == example

Field options

In some cases, creating a new class just for one little thing can be excessive. Moreover, you may need to deal with third-party classes that you are not allowed to change. You can use the dataclasses.field function to configure some serialization aspects through its metadata parameter. The next section describes all the supported options to use in the metadata mapping.

If you don't want to remember the names of the options you can use field_options helper function:

from dataclasses import dataclass, field
from mashumaro import field_options

@dataclass
class A:
    x: int = field(metadata=field_options(...))

serialize option

This option allows you to change the serialization method. When using this option, the serialization behaviour depends on what type of value the option has. It could be either Callable[[Any], Any] or str.

A value of type Callable[[Any], Any] is a generic way to specify any callable object like a function, a class method, a class instance method, an instance of a callable class or even a lambda function to be called for serialization.

A value of type str sets a specific engine for serialization. Keep in mind that the available engines depend on the data type this option is used with. At the moment, these are the serialization engines to choose from:

Applicable data types | Supported engines | Description
NamedTuple, namedtuple | as_list, as_dict | How to pack named tuples. By default the as_list engine is used, which means your named tuple class instance will be packed into a list of its values. You can pack it into a dictionary using the as_dict engine.
Any | omit | Skip the field during serialization

Tip

You can pass a field value as is without changes on serialization using pass_through.

Example:

from datetime import datetime
from dataclasses import dataclass, field
from typing import NamedTuple
from mashumaro import DataClassDictMixin

class MyNamedTuple(NamedTuple):
    x: int
    y: float

@dataclass
class A(DataClassDictMixin):
    dt: datetime = field(
        metadata={
            "serialize": lambda v: v.strftime('%Y-%m-%d %H:%M:%S')
        }
    )
    t: MyNamedTuple = field(metadata={"serialize": "as_dict"})

deserialize option

This option allows you to change the deserialization method. When using this option, the deserialization behaviour depends on what type of value the option has. It could be either Callable[[Any], Any] or str.

A value of type Callable[[Any], Any] is a generic way to specify any callable object like a function, a class method, a class instance method, an instance of a callable class or even a lambda function to be called for deserialization.

A value of type str sets a specific engine for deserialization. Keep in mind that the available engines depend on the data type this option is used with. At the moment, these are the deserialization engines to choose from:

Applicable data types | Supported engines | Description
datetime, date, time | ciso8601, pendulum | How to parse a datetime string. By default, the native fromisoformat of the corresponding class is used for datetime, date and time fields. It's the fastest way in most cases, but you can choose an alternative.
NamedTuple, namedtuple | as_list, as_dict | How to unpack named tuples. By default the as_list engine is used, which means your named tuple class instance will be created from a list of its values. You can unpack it from a dictionary using the as_dict engine.

Tip

You can pass a field value as is without changes on deserialization using pass_through.

Example:

from datetime import datetime
from dataclasses import dataclass, field
from typing import List, NamedTuple
from mashumaro import DataClassDictMixin
import ciso8601
import dateutil.parser

class MyNamedTuple(NamedTuple):
    x: int
    y: float

@dataclass
class A(DataClassDictMixin):
    x: datetime = field(
        metadata={"deserialize": "pendulum"}
    )

@dataclass
class B(DataClassDictMixin):
    x: datetime = field(
        metadata={"deserialize": ciso8601.parse_datetime_as_naive}
    )

@dataclass
class C(DataClassDictMixin):
    dt: List[datetime] = field(
        metadata={
            "deserialize": lambda l: list(map(dateutil.parser.isoparse, l))
        }
    )

@dataclass
class D(DataClassDictMixin):
    x: MyNamedTuple = field(metadata={"deserialize": "as_dict"})

serialization_strategy option

This option is useful when you want to change the serialization logic for a dataclass field depending on some defined parameters using a reusable serialization scheme. You can find an example in the SerializationStrategy chapter.

Tip

You can pass a field value as is without changes on serialization / deserialization using pass_through.

alias option

This option can be used to assign field aliases:

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, field_options

@dataclass
class DataClass(DataClassDictMixin):
    a: int = field(metadata=field_options(alias="FieldA"))
    b: int = field(metadata=field_options(alias="#invalid"))

x = DataClass.from_dict({"FieldA": 1, "#invalid": 2})  # DataClass(a=1, b=2)

Config options

If inheritance is not an empty word for you, you'll fall in love with the Config class. You can register serialize and deserialize methods, define code generation options and other things in just one place, or in different classes in different ways if you need flexibility. Inheritance always comes first.

There is a base class BaseConfig that you can inherit for the sake of convenience, but it's not mandatory.

In the following example you can see how the debug flag is changed from class to class: ModelA will have debug mode enabled but ModelB will not.

from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

class BaseModel(DataClassDictMixin):
    class Config(BaseConfig):
        debug = True

class ModelA(BaseModel):
    a: int

class ModelB(BaseModel):
    b: int

    class Config(BaseConfig):
        debug = False

The next section describes all the supported options to use in the config.

debug config option

If you enable the debug option the generated code for your data class will be printed.

code_generation_options config option

Some users may need functionality that comes with extra cost, such as valuable CPU time spent executing additional instructions. Since not everyone needs such instructions, they can be enabled by adding a constant to the list, so the fastest basic behavior of the library always remains the default. The following table provides a brief overview of all the available constants described below.

Constant | Description
TO_DICT_ADD_OMIT_NONE_FLAG | Adds an omit_none keyword-only argument to to_* methods
TO_DICT_ADD_BY_ALIAS_FLAG | Adds a by_alias keyword-only argument to to_* methods
ADD_DIALECT_SUPPORT | Adds a dialect keyword-only argument to from_* and to_* methods
ADD_SERIALIZATION_CONTEXT | Adds a context keyword-only argument to to_* methods

serialization_strategy config option

You can register a custom SerializationStrategy, or serialize and deserialize methods, for specific types in just one place. It is configured using a dictionary with types as keys. The value can be either a SerializationStrategy instance or a dictionary with serialize and deserialize keys that have the same meaning as in the field options.

from dataclasses import dataclass
from datetime import datetime, date
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig
from mashumaro.types import SerializationStrategy

class FormattedDateTime(SerializationStrategy):
    def __init__(self, fmt):
        self.fmt = fmt

    def serialize(self, value: datetime) -> str:
        return value.strftime(self.fmt)

    def deserialize(self, value: str) -> datetime:
        return datetime.strptime(value, self.fmt)

@dataclass
class DataClass(DataClassDictMixin):

    x: datetime
    y: date

    class Config(BaseConfig):
        serialization_strategy = {
            datetime: FormattedDateTime("%Y"),
            date: {
                # you can use specific str values for datetime here as well
                "deserialize": "pendulum",
                "serialize": date.isoformat,
            },
        }

instance = DataClass.from_dict({"x": "2021", "y": "2021"})
# DataClass(x=datetime.datetime(2021, 1, 1, 0, 0), y=Date(2021, 1, 1))
dictionary = instance.to_dict()
# {'x': '2021', 'y': '2021-01-01'}

Note that you can register different methods for multiple logical types that are based on the same type, using NewType and Annotated; see Extending existing types for details.

It's also possible to define a generic (de)serialization method for a generic type by registering a method for its origin type. Although this technique is widely used when working with third-party generic types using generic strategies, it can also be applied in simple scenarios:

from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class C(DataClassDictMixin):
    ints: list[int]
    floats: list[float]

    class Config:
        serialization_strategy = {
            list: {  # origin type for list[int] and list[float] is list
                "serialize": lambda x: list(map(str, x)),
            }
        }

assert C([1], [2.2]).to_dict() == {'ints': ['1'], 'floats': ['2.2']}

aliases config option

Sometimes it's better to define the field aliases in one place. You can mix aliases here with aliases in the field options, but the latter always take precedence.

from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

@dataclass
class DataClass(DataClassDictMixin):
    a: int
    b: int

    class Config(BaseConfig):
        aliases = {
            "a": "FieldA",
            "b": "FieldB",
        }

DataClass.from_dict({"FieldA": 1, "FieldB": 2})  # DataClass(a=1, b=2)

serialize_by_alias config option

When this option is enabled, all fields with aliases will be serialized by their aliases by default. You can combine this config option with the by_alias keyword argument.

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, field_options
from mashumaro.config import BaseConfig

@dataclass
class DataClass(DataClassDictMixin):
    field_a: int = field(metadata=field_options(alias="FieldA"))

    class Config(BaseConfig):
        serialize_by_alias = True

DataClass(field_a=1).to_dict()  # {'FieldA': 1}

allow_deserialization_not_by_alias config option

When using aliases, the deserializer by default requires the input keys to match the defined aliases. If you need the flexibility to deserialize both aliased and unaliased keys, you can set the allow_deserialization_not_by_alias config option to enable this behavior.

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig


@dataclass
class AliasedDataClass(DataClassDictMixin):
    foo: int = field(metadata={"alias": "alias_foo"})
    bar: int = field(metadata={"alias": "alias_bar"})

    class Config(BaseConfig):
        allow_deserialization_not_by_alias = True


alias_dict = {"alias_foo": 1, "alias_bar": 2}
t1 = AliasedDataClass.from_dict(alias_dict)

no_alias_dict = {"foo": 1, "bar": 2}
# This would raise `mashumaro.exceptions.MissingField`
# if allow_deserialization_not_by_alias was False
t2 = AliasedDataClass.from_dict(no_alias_dict)
assert t1 == t2

omit_none config option

When this option is enabled, all fields with None values will be skipped during serialization by default. You can combine this config option with the omit_none keyword argument.

from dataclasses import dataclass
from typing import Optional
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

@dataclass
class DataClass(DataClassDictMixin):
    x: Optional[int] = 42

    class Config(BaseConfig):
        omit_none = True

DataClass(x=None).to_dict()  # {}

omit_default config option

When this option is enabled, all fields whose values are equal to their defaults or their default_factory results will be skipped during serialization.

from dataclasses import dataclass, field
from typing import List, Optional, Tuple
from mashumaro import DataClassDictMixin, field_options
from mashumaro.config import BaseConfig

@dataclass
class Foo:
    foo: str

@dataclass
class DataClass(DataClassDictMixin):
    a: int = 42
    b: Tuple[int, ...] = field(default=(1, 2, 3))
    c: List[Foo] = field(default_factory=lambda: [Foo("foo")])
    d: Optional[str] = None

    class Config(BaseConfig):
        omit_default = True

DataClass(a=42, b=(1, 2, 3), c=[Foo("foo")]).to_dict()  # {}

namedtuple_as_dict config option

Dataclasses are a great way to declare and use data models. But it's not the only way. Python has a typed version of namedtuple called NamedTuple which looks similar to dataclasses:

from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: int

The same model as a dataclass looks like this:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

At first glance, both options look interchangeable. But imagine that you need to create a bunch of instances of the Point class. Due to how dataclasses work, you will have higher memory consumption compared to named tuples. In such cases it may be more appropriate to use named tuples.
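The memory difference can be observed with the standard library alone: a regular dataclass instance carries a per-instance __dict__, while a named tuple stores its fields in the tuple itself (a sketch that ignores __slots__-based dataclasses, which narrow the gap):

```python
import sys
from dataclasses import dataclass
from typing import NamedTuple

class PointNT(NamedTuple):
    x: int
    y: int

@dataclass
class PointDC:
    x: int
    y: int

nt, dc = PointNT(1, 2), PointDC(1, 2)

assert not hasattr(nt, "__dict__")  # fields live in the tuple itself
assert hasattr(dc, "__dict__")      # each instance carries a dict

# the dataclass instance plus its __dict__ is larger than the named tuple
assert sys.getsizeof(nt) < sys.getsizeof(dc) + sys.getsizeof(dc.__dict__)
```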

By default, all named tuples are packed into lists. But with namedtuple_as_dict option you have a drop-in replacement for dataclasses:

from dataclasses import dataclass
from typing import List, NamedTuple
from mashumaro import DataClassDictMixin

class Point(NamedTuple):
    x: int
    y: int

@dataclass
class DataClass(DataClassDictMixin):
    points: List[Point]

    class Config:
        namedtuple_as_dict = True

obj = DataClass.from_dict({"points": [{"x": 0, "y": 0}, {"x": 1, "y": 1}]})
print(obj.to_dict())  # {"points": [{"x": 0, "y": 0}, {"x": 1, "y": 1}]}

If you want to serialize only certain named tuple fields as dictionaries, you can use the corresponding serialization and deserialization engines.

allow_postponed_evaluation config option

PEP 563 solved the problem of forward references by postponing the evaluation of annotations, so you can write the following code:

from __future__ import annotations
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class A(DataClassDictMixin):
    x: B

@dataclass
class B(DataClassDictMixin):
    y: int

obj = A.from_dict({'x': {'y': 1}})

You don't need to write anything special here; forward references work out of the box. If a field of a dataclass has a forward reference in its type annotations, building of the from_* and to_* methods of this dataclass will be postponed until they are first called. However, if for some reason you don't want the evaluation to be postponed, you can disable it using the allow_postponed_evaluation option:

from __future__ import annotations
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class A(DataClassDictMixin):
    x: B

    class Config:
        allow_postponed_evaluation = False

# UnresolvedTypeReferenceError: Class A has unresolved type reference B
# in some of its fields

@dataclass
class B(DataClassDictMixin):
    y: int

In this case you will get UnresolvedTypeReferenceError regardless of whether class B is declared below or not.

dialect config option

This option is described below in the Dialects section.

orjson_options config option

This option changes default options for orjson.dumps encoder which is used in DataClassORJSONMixin. For example, you can tell orjson to handle non-str dict keys as the built-in json.dumps encoder does. See orjson documentation to read more about these options.

import orjson
from dataclasses import dataclass
from typing import Dict
from mashumaro.config import BaseConfig
from mashumaro.mixins.orjson import DataClassORJSONMixin

@dataclass
class MyClass(DataClassORJSONMixin):
    x: Dict[int, int]

    class Config(BaseConfig):
        orjson_options = orjson.OPT_NON_STR_KEYS

assert MyClass({1: 2}).to_json() == '{"1":2}'

discriminator config option

This option is described in the Class level discriminator section.

lazy_compilation config option

By using this option, the compilation of the from_* and to_* methods will be deferred until they are called for the first time. This reduces import time and, in certain instances, may speed up deserialization by leveraging data that becomes available after the class has been created.

Caution

If you need to save a reference to a from_* or to_* method, you should do it after the method is compiled. To be safe, you can always use a lambda function:

from_dict = lambda x: MyModel.from_dict(x)
to_dict = lambda x: x.to_dict()

sort_keys config option

When set, the keys on serialized dataclasses will be sorted in alphabetical order.

Unlike the sort_keys option in the standard library's json.dumps function, this option acts at class creation time and has no effect on the performance of serialization.

from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

@dataclass
class SortedDataClass(DataClassDictMixin):
    foo: int
    bar: int

    class Config(BaseConfig):
        sort_keys = True

t = SortedDataClass(1, 2)
assert t.to_dict() == {"bar": 2, "foo": 1}

forbid_extra_keys config option

When set, the deserialization of dataclasses will fail if the input dictionary contains keys that are not present in the dataclass.

from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

@dataclass
class DataClass(DataClassDictMixin):
    a: int

    class Config(BaseConfig):
        forbid_extra_keys = True

DataClass.from_dict({"a": 1, "b": 2})  # ExtraKeysError: Extra keys: {'b'}

It plays well with aliases and allow_deserialization_not_by_alias options.

Passing field values as is

In some cases you need to pass a field value as is, without any changes, during serialization / deserialization. There is a predefined pass_through object that can be used as a serialization_strategy or as serialize / deserialize options:

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, pass_through

class MyClass:
    def __init__(self, some_value):
        self.some_value = some_value

@dataclass
class A1(DataClassDictMixin):
    x: MyClass = field(
        metadata={
            "serialize": pass_through,
            "deserialize": pass_through,
        }
    )

@dataclass
class A2(DataClassDictMixin):
    x: MyClass = field(
        metadata={
            "serialization_strategy": pass_through,
        }
    )

@dataclass
class A3(DataClassDictMixin):
    x: MyClass

    class Config:
        serialization_strategy = {
            MyClass: pass_through,
        }

@dataclass
class A4(DataClassDictMixin):
    x: MyClass

    class Config:
        serialization_strategy = {
            MyClass: {
                "serialize": pass_through,
                "deserialize": pass_through,
            }
        }

my_class_instance = MyClass(42)

assert A1.from_dict({'x': my_class_instance}).x == my_class_instance
assert A2.from_dict({'x': my_class_instance}).x == my_class_instance
assert A3.from_dict({'x': my_class_instance}).x == my_class_instance
assert A4.from_dict({'x': my_class_instance}).x == my_class_instance

a1_dict = A1(my_class_instance).to_dict()
a2_dict = A2(my_class_instance).to_dict()
a3_dict = A3(my_class_instance).to_dict()
a4_dict = A4(my_class_instance).to_dict()

assert a1_dict == a2_dict == a3_dict == a4_dict == {"x": my_class_instance}

Extending existing types

There are situations where you might want some values of the same type to be treated as their own type. You can create new logical types with NewType, Annotated or TypeAliasType and register serialization strategies for them:

from typing import Mapping, NewType, Annotated
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

SessionID = NewType("SessionID", str)
AccountID = Annotated[str, "AccountID"]

type DeviceID = str

@dataclass
class Context(DataClassDictMixin):
    account_sessions: Mapping[AccountID, SessionID]
    account_devices: list[DeviceID]

    class Config:
        serialization_strategy = {
            AccountID: {
                "deserialize": lambda x: ...,
                "serialize": lambda x: ...,
            },
            SessionID: {
                "deserialize": lambda x: ...,
                "serialize": lambda x: ...,
            },
            DeviceID: {
                "deserialize": lambda x: ...,
                "serialize": lambda x: ...,
            }
        }

Although using NewType is usually the most reliable way to avoid logical errors, you have to pay for it with noticeable overhead. If you create dataclass instances manually, type checkers will force you to wrap values in your "NewType" callable, which leads to performance degradation:

$ python -m timeit -s "from typing import NewType; MyInt = NewType('MyInt', int)" "MyInt(42)"
10000000 loops, best of 5: 31.1 nsec per loop

$ python -m timeit -s "from typing import NewType; MyInt = NewType('MyInt', int)" "42"
50000000 loops, best of 5: 4.35 nsec per loop

However, when you create dataclass instances using the from_* method provided by one of the mixins, or using one of the decoders, there will be no performance degradation, because the value won't be wrapped in the callable in the generated code. Therefore, if performance matters more to you than catching logical errors with type checkers, and you are actively creating or changing dataclasses manually, you should take a closer look at Annotated.
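The runtime difference is easy to demonstrate with the standard library alone: NewType creates a real callable that has to execute, whereas Annotated attaches only static metadata and leaves values untouched:

```python
from typing import Annotated, NewType, get_args

SessionID = NewType("SessionID", str)
AccountID = Annotated[str, "AccountID"]

# NewType is a function call at runtime; it returns the value unchanged
s = SessionID("abc")
assert s == "abc" and type(s) is str

# Annotated is erased at runtime: there is nothing to call and no overhead
assert get_args(AccountID) == (str, "AccountID")
a: AccountID = "xyz"  # plain assignment, no wrapper involved
assert type(a) is str
```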

Field aliases

In some cases it's better to have different names for a field in your dataclass and in its serialized view. For example, a third-party legacy API you are working with might operate with camel case style, but you stick to snake case style in your code base. Or you want to load data with keys that are invalid identifiers in Python. Aliases can solve this problem.

There are multiple ways to assign an alias:

  • Using Alias(...) annotation in a field type
  • Using alias parameter in field metadata
  • Using aliases parameter in a dataclass config

By default, aliases only affect deserialization, but this can be extended to serialization as well, e.g. with the serialize_by_alias config option or the by_alias keyword argument.

Here is an example with Alias annotation in a field type:

from dataclasses import dataclass
from typing import Annotated
from mashumaro import DataClassDictMixin
from mashumaro.types import Alias

@dataclass
class DataClass(DataClassDictMixin):
    foo_bar: Annotated[int, Alias("fooBar")]

obj = DataClass.from_dict({"fooBar": 42})  # DataClass(foo_bar=42)
obj.to_dict()  # {"foo_bar": 42}  # no aliases on serialization by default

The same with field metadata:

from dataclasses import dataclass, field
from mashumaro import field_options

@dataclass
class DataClass:
    foo_bar: str = field(metadata=field_options(alias="fooBar"))

And with a dataclass config:

from dataclasses import dataclass
from mashumaro.config import BaseConfig

@dataclass
class DataClass:
    foo_bar: str

    class Config(BaseConfig):
        aliases = {"foo_bar": "fooBar"}

Tip

If you want to deserialize all the fields by its names along with aliases, there is a config option for that.

Dialects

Sometimes you need different serialization and deserialization methods depending on the data source where entities of the dataclass are stored, or on the API the entities are sent to or received from. There is a special Dialect type that may contain all the differences from the default serialization and deserialization methods. You can create different dialects and use each of them for the same dataclass depending on the situation.

Suppose we have the following dataclass with a field of type date:

@dataclass
class Entity(DataClassDictMixin):
    dt: date

By default, a field of date type serializes to a string in ISO 8601 format, so the serialized entity will look like {'dt': '2021-12-31'}. But what if we have, for example, two sensitive legacy Ethiopian and Japanese APIs that use two different formats for dates โ€” dd/mm/yyyy and yyyyๅนดmmๆœˆddๆ—ฅ? Instead of creating two similar dataclasses we can have one dataclass and two dialects:

from dataclasses import dataclass
from datetime import date, datetime
from mashumaro import DataClassDictMixin
from mashumaro.config import ADD_DIALECT_SUPPORT
from mashumaro.dialect import Dialect
from mashumaro.types import SerializationStrategy

class DateTimeSerializationStrategy(SerializationStrategy):
    def __init__(self, fmt: str):
        self.fmt = fmt

    def serialize(self, value: date) -> str:
        return value.strftime(self.fmt)

    def deserialize(self, value: str) -> date:
        return datetime.strptime(value, self.fmt).date()

class EthiopianDialect(Dialect):
    serialization_strategy = {
        date: DateTimeSerializationStrategy("%d/%m/%Y")
    }

class JapaneseDialect(Dialect):
    serialization_strategy = {
        date: DateTimeSerializationStrategy("%Yๅนด%mๆœˆ%dๆ—ฅ")
    }

@dataclass
class Entity(DataClassDictMixin):
    dt: date

    class Config:
        code_generation_options = [ADD_DIALECT_SUPPORT]

entity = Entity(date(2021, 12, 31))
entity.to_dict(dialect=EthiopianDialect)  # {'dt': '31/12/2021'}
entity.to_dict(dialect=JapaneseDialect)   # {'dt': '2021ๅนด12ๆœˆ31ๆ—ฅ'}
Entity.from_dict({'dt': '2021ๅนด12ๆœˆ31ๆ—ฅ'}, dialect=JapaneseDialect)

serialization_strategy dialect option

This dialect option has the same meaning as the similar config option but for the dialect scope. You can register custom SerializationStrategy, serialize and deserialize methods for the specific types.

serialize_by_alias dialect option

This dialect option has the same meaning as the similar config option but for the dialect scope.

omit_none dialect option

This dialect option has the same meaning as the similar config option but for the dialect scope.

omit_default dialect option

This dialect option has the same meaning as the similar config option but for the dialect scope.

namedtuple_as_dict dialect option

This dialect option has the same meaning as the similar config option but for the dialect scope.

no_copy_collections dialect option

By default, all collection data types are serialized as a copy to prevent mutation of the original collection. As an example, if a dataclass contains a field of type list[str], then it will be serialized as a copy of the original list, so you can safely mutate it after. The downside is that copying is always slower than using a reference to the original collection. In some cases we know beforehand that mutation doesn't take place or is even desirable, so we can benefit from avoiding unnecessary copies by setting no_copy_collections to a sequence of origin collection data types. This is applicable only for collections containing elements that do not require conversion.

from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig
from mashumaro.dialect import Dialect

class NoCopyDialect(Dialect):
    no_copy_collections = (list, dict, set)

@dataclass
class DataClass(DataClassDictMixin):
    simple_list: list[str]
    simple_dict: dict[str, str]
    simple_set: set[str]

    class Config(BaseConfig):
        dialect = NoCopyDialect

obj = DataClass(["foo"], {"bar": "baz"}, {"foobar"})
data = obj.to_dict()

assert data["simple_list"] is obj.simple_list
assert data["simple_dict"] is obj.simple_dict
assert data["simple_set"] is obj.simple_set

This option is enabled for list and dict in the default dialects used by the mixins and codecs for some of the supported formats.

Changing the default dialect

You can change the default serialization and deserialization methods not only with the serialization_strategy config option but also with the dialect config option. If you have multiple dataclasses without a common parent class, the default dialect can help you reduce the number of lines of code:

@dataclass
class Entity(DataClassDictMixin):
    dt: date

    class Config:
        dialect = JapaneseDialect

entity = Entity(date(2021, 12, 31))
entity.to_dict()  # {'dt': '2021ๅนด12ๆœˆ31ๆ—ฅ'}
assert Entity.from_dict({'dt': '2021ๅนด12ๆœˆ31ๆ—ฅ'}) == entity

Default dialect can also be set when using codecs:

from mashumaro.codecs import BasicDecoder, BasicEncoder

@dataclass
class Entity:
    dt: date

decoder = BasicDecoder(Entity, default_dialect=JapaneseDialect)
encoder = BasicEncoder(Entity, default_dialect=JapaneseDialect)

entity = Entity(date(2021, 12, 31))
encoder.encode(entity) # {'dt': '2021ๅนด12ๆœˆ31ๆ—ฅ'}
assert decoder.decode({'dt': '2021ๅนด12ๆœˆ31ๆ—ฅ'}) == entity

Discriminator

There is a special Discriminator class that allows you to customize how a union of dataclasses or a class hierarchy will be deserialized. It has the following parameters that affect the class selection rules:

  • field โ€” optional name of the input dictionary key (also known as tag) by which all the variants can be distinguished
  • include_subtypes โ€” allow to deserialize subclasses
  • include_supertypes โ€” allow to deserialize superclasses
  • variant_tagger_fn โ€” a custom function used to generate tag values associated with a variant

By default, each variant that you want to discriminate by tags should have a class-level attribute containing the associated tag value. This attribute should have the name defined by the field parameter. The tag value could take the following forms:

  • without annotations: type = 42
  • annotated as ClassVar: type: ClassVar[int] = 42
  • annotated as Final: type: Final[int] = 42
  • annotated as Literal: type: Literal[42] = 42
  • annotated as StrEnum: type: ResponseType = ResponseType.OK

Note

Keep in mind that by default only Final, Literal and StrEnum fields are processed during serialization.

However, it is possible to use a discriminator without the class-level attribute. You can provide a custom function that generates one or many variant tag values. This function takes a class as its only argument and returns either a single value of a basic type, such as str or int, or a list of such values to associate multiple tags with a variant. A common practice is to use the class name as a single tag value:

variant_tagger_fn = lambda cls: cls.__name__

Next, we will look at different use cases, as well as their pros and cons.

Subclasses distinguishable by a field

Often you have a base dataclass and multiple subclasses that are easily distinguishable from each other by the value of a particular field. For example, there may be different events, messages or requests with a discriminator field "event_type", "message_type" or just "type". You could list all of them within a Union type, but that would be too verbose and impractical. Moreover, deserialization of the union would be slow, since we would need to iterate over each variant in the list until we find the right one.

We can improve subclass deserialization using Discriminator as annotation within Annotated type. We will use field parameter and set include_subtypes to True.

Important

The discriminator field should be accessible from the __dict__ attribute of a specific descendant, i.e. defined at the level of that descendant. A descendant class without a discriminator field will be ignored, but its descendants won't.

Suppose we have a hierarchy of client events distinguishable by a class attribute "type":

from dataclasses import dataclass
from ipaddress import IPv4Address
from mashumaro import DataClassDictMixin

@dataclass
class ClientEvent(DataClassDictMixin):
    pass

@dataclass
class ClientConnectedEvent(ClientEvent):
    type = "connected"
    client_ip: IPv4Address

@dataclass
class ClientDisconnectedEvent(ClientEvent):
    type = "disconnected"
    client_ip: IPv4Address

We use base dataclass ClientEvent for a field of another dataclass:

from typing import Annotated, List
# or from typing_extensions import Annotated
from mashumaro.types import Discriminator


@dataclass
class AggregatedEvents(DataClassDictMixin):
    list: List[
        Annotated[
            ClientEvent, Discriminator(field="type", include_subtypes=True)
        ]
    ]

Now we can deserialize events based on "type" value:

events = AggregatedEvents.from_dict(
    {
        "list": [
            {"type": "connected", "client_ip": "10.0.0.42"},
            {"type": "disconnected", "client_ip": "10.0.0.42"},
        ]
    }
)
assert events == AggregatedEvents(
    list=[
        ClientConnectedEvent(client_ip=IPv4Address("10.0.0.42")),
        ClientDisconnectedEvent(client_ip=IPv4Address("10.0.0.42")),
    ]
)

Subclasses without a common field

In rare cases you have to deal with subclasses that don't have a common field name they can be distinguished by. Since Discriminator can be initialized without the field parameter, you can use it with only include_subtypes enabled. The drawback is that we will have to go through all the subclasses until we find the suitable one. It's almost like using a Union type, but with subclass support.

Suppose we're making a brunch. We have some ingredients:

@dataclass
class Ingredient(DataClassDictMixin):
    name: str

@dataclass
class Hummus(Ingredient):
    made_of: Literal["chickpeas", "beet", "artichoke"]
    grams: int

@dataclass
class Celery(Ingredient):
    pieces: int

Let's create a plate:

@dataclass
class Plate(DataClassDictMixin):
    ingredients: List[
        Annotated[Ingredient, Discriminator(include_subtypes=True)]
    ]

And now we can put our ingredients on the plate:

plate = Plate.from_dict(
    {
        "ingredients": [
            {
                "name": "hummus from the shop",
                "made_of": "chickpeas",
                "grams": 150,
            },
            {"name": "celery from my garden", "pieces": 5},
        ]
    }
)
assert plate == Plate(
    ingredients=[
        Hummus(name="hummus from the shop", made_of="chickpeas", grams=150),
        Celery(name="celery from my garden", pieces=5),
    ]
)

In some cases it's necessary to fall back to the base class if there is no suitable subclass. We can set include_supertypes to True:

@dataclass
class Plate(DataClassDictMixin):
    ingredients: List[
        Annotated[
            Ingredient,
            Discriminator(include_subtypes=True, include_supertypes=True),
        ]
    ]

plate = Plate.from_dict(
    {
        "ingredients": [
            {
                "name": "hummus from the shop",
                "made_of": "chickpeas",
                "grams": 150,
            },
            {"name": "celery from my garden", "pieces": 5},
            {"name": "cumin"}  # <- new unknown ingredient
        ]
    }
)
assert plate == Plate(
    ingredients=[
        Hummus(name="hummus from the shop", made_of="chickpeas", grams=150),
        Celery(name="celery from my garden", pieces=5),
        Ingredient(name="cumin"),  # <- unknown ingredient added
    ]
)

Class level discriminator

It may often be more convenient to specify a Discriminator once at the class level and use that class without Annotated type for subclass deserialization. Depending on the Discriminator parameters, it can be used as a replacement for subclasses distinguishable by a field as well as for subclasses without a common field. The only difference is that you can't use include_supertypes=True because it would lead to a recursion error.

The reworked example looks like this:

from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig
from mashumaro.types import Discriminator

@dataclass
class ClientEvent(DataClassDictMixin):
    class Config(BaseConfig):
        discriminator = Discriminator(  # <- add discriminator
            field="type",
            include_subtypes=True,
        )

@dataclass
class ClientConnectedEvent(ClientEvent):
    type = "connected"
    client_ip: IPv4Address

@dataclass
class ClientDisconnectedEvent(ClientEvent):
    type = "disconnected"
    client_ip: IPv4Address

@dataclass
class AggregatedEvents(DataClassDictMixin):
    list: List[ClientEvent]  # <- use base class here

And now we can deserialize events based on "type" value as we did earlier:

events = AggregatedEvents.from_dict(
    {
        "list": [
            {"type": "connected", "client_ip": "10.0.0.42"},
            {"type": "disconnected", "client_ip": "10.0.0.42"},
        ]
    }
)
assert events == AggregatedEvents(
    list=[
        ClientConnectedEvent(client_ip=IPv4Address("10.0.0.42")),
        ClientDisconnectedEvent(client_ip=IPv4Address("10.0.0.42")),
    ]
)

What's more interesting is that you can now deserialize subclasses simply by calling the superclass from_* method, which is very useful:

disconnected_event = ClientEvent.from_dict(
    {"type": "disconnected", "client_ip": "10.0.0.42"}
)
assert disconnected_event == ClientDisconnectedEvent(IPv4Address("10.0.0.42"))

The same is applicable for subclasses without a common field:

@dataclass
class Ingredient(DataClassDictMixin):
    name: str

    class Config:
        discriminator = Discriminator(include_subtypes=True)

...

celery = Ingredient.from_dict({"name": "celery from my garden", "pieces": 5})
assert celery == Celery(name="celery from my garden", pieces=5)

Working with union of classes

Deserialization of a union of types distinguishable by a particular field will be much faster using Discriminator, because there is no need to traverse all the classes and attempt to deserialize each of them. Usually this approach can be used when you have multiple classes without a common superclass or when you only need to deserialize some of the subclasses. In the following example we will use include_supertypes=True to deserialize two subclasses out of three:

from dataclasses import dataclass
from typing import Annotated, Literal, Union
# or from typing_extensions import Annotated
from mashumaro import DataClassDictMixin
from mashumaro.types import Discriminator

@dataclass
class Event(DataClassDictMixin):
    pass

@dataclass
class Event1(Event):
    code: Literal[1] = 1
    ...

@dataclass
class Event2(Event):
    code: Literal[2] = 2
    ...

@dataclass
class Event3(Event):
    code: Literal[3] = 3
    ...

@dataclass
class Message(DataClassDictMixin):
    event: Annotated[
        Union[Event1, Event2],
        Discriminator(field="code", include_supertypes=True),
    ]

event1_msg = Message.from_dict({"event": {"code": 1, ...}})
event2_msg = Message.from_dict({"event": {"code": 2, ...}})
assert isinstance(event1_msg.event, Event1)
assert isinstance(event2_msg.event, Event2)

# raises InvalidFieldValue:
Message.from_dict({"event": {"code": 3, ...}})

Again, it's not necessary to have a common superclass. If you have a union of dataclasses without a field they can be distinguished by, you can still use Discriminator, but deserialization will be almost the same as for a Union type without Discriminator, except that subclasses can also be deserialized with include_subtypes=True.

Important

When both include_subtypes and include_supertypes are enabled, all subclasses are tried first, and superclasses only at the end.

In the following example you can see how priority works โ€” first we try to deserialize ChickpeaHummus, and if it fails, then we try Hummus:

@dataclass
class Hummus(DataClassDictMixin):
    made_of: Literal["chickpeas", "artichoke"]
    grams: int

@dataclass
class ChickpeaHummus(Hummus):
    made_of: Literal["chickpeas"]

@dataclass
class Celery(DataClassDictMixin):
    pieces: int

@dataclass
class Plate(DataClassDictMixin):
    ingredients: List[
        Annotated[
            Union[Hummus, Celery],
            Discriminator(include_subtypes=True, include_supertypes=True),
        ]
    ]

plate = Plate.from_dict(
    {
        "ingredients": [
            {"made_of": "chickpeas", "grams": 100},
            {"made_of": "artichoke", "grams": 50},
            {"pieces": 4},
        ]
    }
)
assert plate == Plate(
    ingredients=[
        ChickpeaHummus(made_of='chickpeas', grams=100),  # <- subclass
        Hummus(made_of='artichoke', grams=50),  # <- superclass
        Celery(pieces=4),
    ]
)

Using a custom variant tagger function

Sometimes it is impractical to have a class-level attribute with a tag value, especially when you have a lot of classes. We can use a custom tagger function instead. This method is applicable to all scenarios of using the discriminator, but for demonstration purposes, let's focus on just one of them.

Suppose we want to use the middle part of Client*Event as a tag value:

from dataclasses import dataclass
from ipaddress import IPv4Address
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig
from mashumaro.types import Discriminator


def client_event_tagger(cls):
    # not the best way of doing it, it's just a demo
    return cls.__name__[6:-5].lower()

@dataclass
class ClientEvent(DataClassDictMixin):
    class Config(BaseConfig):
        discriminator = Discriminator(
            field="type",
            include_subtypes=True,
            variant_tagger_fn=client_event_tagger,
        )

@dataclass
class ClientConnectedEvent(ClientEvent):
    client_ip: IPv4Address

@dataclass
class ClientDisconnectedEvent(ClientEvent):
    client_ip: IPv4Address

We can now deserialize subclasses as we did earlier without the variant tagger:

disconnected_event = ClientEvent.from_dict(
    {"type": "disconnected", "client_ip": "10.0.0.42"}
)
assert disconnected_event == ClientDisconnectedEvent(IPv4Address("10.0.0.42"))

If we need to associate multiple tags with a single variant, we can return a list of tags:

def client_event_tagger(cls):
    name = cls.__name__[6:-5]
    return [name.lower(), name.upper()]

Code generation options

Add omit_none keyword argument

If you want control over whether to skip None values during serialization, you can add an omit_none parameter to the to_* methods using the code_generation_options list. The default value of the omit_none parameter depends on whether the omit_none config option or the omit_none dialect option is enabled.

from dataclasses import dataclass
from typing import Optional
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig, TO_DICT_ADD_OMIT_NONE_FLAG

@dataclass
class Inner(DataClassDictMixin):
    x: Optional[int] = None
    # "x" won't be omitted since there is no TO_DICT_ADD_OMIT_NONE_FLAG here

@dataclass
class Model(DataClassDictMixin):
    x: Inner
    a: Optional[int] = None
    b: Optional[str] = None  # will be omitted

    class Config(BaseConfig):
        code_generation_options = [TO_DICT_ADD_OMIT_NONE_FLAG]

Model(x=Inner(), a=1).to_dict(omit_none=True)  # {'x': {'x': None}, 'a': 1}

Add by_alias keyword argument

If you want control over whether to serialize fields by their aliases, you can add a by_alias parameter to the to_* methods using the code_generation_options list. The default value of the by_alias parameter depends on whether the serialize_by_alias config option is enabled.

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, field_options
from mashumaro.config import BaseConfig, TO_DICT_ADD_BY_ALIAS_FLAG

@dataclass
class DataClass(DataClassDictMixin):
    field_a: int = field(metadata=field_options(alias="FieldA"))

    class Config(BaseConfig):
        code_generation_options = [TO_DICT_ADD_BY_ALIAS_FLAG]

DataClass(field_a=1).to_dict()  # {'field_a': 1}
DataClass(field_a=1).to_dict(by_alias=True)  # {'FieldA': 1}

Add dialect keyword argument

Support for dialects is disabled by default for performance reasons. You can enable it using the ADD_DIALECT_SUPPORT constant:

from dataclasses import dataclass
from datetime import date
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig, ADD_DIALECT_SUPPORT

@dataclass
class Entity(DataClassDictMixin):
    dt: date

    class Config(BaseConfig):
        code_generation_options = [ADD_DIALECT_SUPPORT]

Add context keyword argument

Sometimes you need to pass a "context" object to the serialization hooks so they can take it into account. For example, you might want an option to remove sensitive data from the serialization result. You can add a context parameter to the to_* methods that will be passed to the __pre_serialize__ and __post_serialize__ hooks. The type of this context, as well as its mutability, is up to you.

from dataclasses import dataclass
from typing import Dict, Optional
from uuid import UUID
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig, ADD_SERIALIZATION_CONTEXT

class BaseModel(DataClassDictMixin):
    class Config(BaseConfig):
        code_generation_options = [ADD_SERIALIZATION_CONTEXT]

@dataclass
class Account(BaseModel):
    id: UUID
    username: str
    name: str

    def __pre_serialize__(self, context: Optional[Dict] = None):
        return self

    def __post_serialize__(self, d: Dict, context: Optional[Dict] = None):
        if context and context.get("remove_sensitive_data"):
            d["username"] = "***"
            d["name"] = "***"
        return d

@dataclass
class Session(BaseModel):
    id: UUID
    key: str
    account: Account

    def __pre_serialize__(self, context: Optional[Dict] = None):
        return self

    def __post_serialize__(self, d: Dict, context: Optional[Dict] = None):
        if context and context.get("remove_sensitive_data"):
            d["key"] = "***"
        return d


foo = Session(
    id=UUID('03321c9f-6a97-421e-9869-918ff2867a71'),
    key="VQ6Q9bX4c8s",
    account=Account(
        id=UUID('4ef2baa7-edef-4d6a-b496-71e6d72c58fb'),
        username="john_doe",
        name="John"
    )
)
assert foo.to_dict() == {
    'id': '03321c9f-6a97-421e-9869-918ff2867a71',
    'key': 'VQ6Q9bX4c8s',
    'account': {
        'id': '4ef2baa7-edef-4d6a-b496-71e6d72c58fb',
        'username': 'john_doe',
        'name': 'John'
    }
}
assert foo.to_dict(context={"remove_sensitive_data": True}) == {
    'id': '03321c9f-6a97-421e-9869-918ff2867a71',
    'key': '***',
    'account': {
        'id': '4ef2baa7-edef-4d6a-b496-71e6d72c58fb',
        'username': '***',
        'name': '***'
    }
}

Generic dataclasses

Along with user-defined generic types implementing the SerializableType interface, generic and variadic generic dataclasses can also be used. There are two applicable scenarios for them.

Generic dataclass inheritance

If you have a generic dataclass and want to serialize and deserialize its instances depending on the concrete types, you can use inheritance for that:

from dataclasses import dataclass
from datetime import date
from typing import Generic, Mapping, Tuple, TypeVar, TypeVarTuple
# or from typing_extensions import TypeVarTuple
from mashumaro import DataClassDictMixin

KT = TypeVar("KT")
VT = TypeVar("VT", date, str)
Ts = TypeVarTuple("Ts")

@dataclass
class GenericDataClass(Generic[KT, VT, *Ts]):
    x: Mapping[KT, VT]
    y: Tuple[*Ts, KT]

@dataclass
class ConcreteDataClass(
    GenericDataClass[str, date, *Tuple[float, ...]],
    DataClassDictMixin,
):
    pass

ConcreteDataClass.from_dict({"x": {"a": "2021-01-01"}, "y": [1, 2, "a"]})
# ConcreteDataClass(x={'a': datetime.date(2021, 1, 1)}, y=(1.0, 2.0, 'a'))

You can override a TypeVar field with a concrete type or another TypeVar. Partial specification of concrete types is also allowed. If a generic dataclass is inherited without overriding types, the types of its fields remain untouched.

Generic dataclass in a field type

Another approach is to specify concrete types in the field type hints. This can help to have different versions of the same generic dataclass:

from dataclasses import dataclass
from datetime import date
from typing import Generic, TypeVar
from mashumaro import DataClassDictMixin

T = TypeVar('T')

@dataclass
class GenericDataClass(Generic[T], DataClassDictMixin):
    x: T

@dataclass
class DataClass(DataClassDictMixin):
    date: GenericDataClass[date]
    str: GenericDataClass[str]

instance = DataClass(
    date=GenericDataClass(x=date(2021, 1, 1)),
    str=GenericDataClass(x='2021-01-01'),
)
dictionary = {'date': {'x': '2021-01-01'}, 'str': {'x': '2021-01-01'}}
assert DataClass.from_dict(dictionary) == instance

GenericSerializableType interface

There is a generic alternative to SerializableType called GenericSerializableType. It makes it possible to decide yourself how to serialize and deserialize input data depending on the types provided:

from dataclasses import dataclass
from datetime import date
from typing import Dict, TypeVar
from mashumaro import DataClassDictMixin
from mashumaro.types import GenericSerializableType

KT = TypeVar("KT")
VT = TypeVar("VT")

class DictWrapper(Dict[KT, VT], GenericSerializableType):
    __packers__ = {date: lambda x: x.isoformat(), str: str}
    __unpackers__ = {date: date.fromisoformat, str: str}

    def _serialize(self, types) -> Dict[KT, VT]:
        k_type, v_type = types
        k_conv = self.__packers__[k_type]
        v_conv = self.__packers__[v_type]
        return {k_conv(k): v_conv(v) for k, v in self.items()}

    @classmethod
    def _deserialize(cls, value, types) -> "DictWrapper[KT, VT]":
        k_type, v_type = types
        k_conv = cls.__unpackers__[k_type]
        v_conv = cls.__unpackers__[v_type]
        return cls({k_conv(k): v_conv(v) for k, v in value.items()})

@dataclass
class DataClass(DataClassDictMixin):
    x: DictWrapper[date, str]
    y: DictWrapper[str, date]

input_data = {
    "x": {"2022-12-07": "2022-12-07"},
    "y": {"2022-12-07": "2022-12-07"},
}
obj = DataClass.from_dict(input_data)
assert obj == DataClass(
    x=DictWrapper({date(2022, 12, 7): "2022-12-07"}),
    y=DictWrapper({"2022-12-07": date(2022, 12, 7)}),
)
assert obj.to_dict() == input_data

As you can see, the code turns out to be massive compared to the alternative, but in rare cases such flexibility can be useful. Think twice about whether it's really worth using.

Serialization hooks

In some cases you need to prepare input / output data or do some extraordinary actions at different stages of the deserialization / serialization lifecycle. You can do this with different types of hooks.

Before deserialization

For doing something with a dictionary that will be passed to deserialization you can use __pre_deserialize__ class method:

@dataclass
class A(DataClassJSONMixin):
    abc: int

    @classmethod
    def __pre_deserialize__(cls, d: Dict[Any, Any]) -> Dict[Any, Any]:
        return {k.lower(): v for k, v in d.items()}

print(A.from_dict({"ABC": 123}))    # A(abc=123)
print(A.from_json('{"ABC": 123}'))  # A(abc=123)

After deserialization

For doing something with a dataclass instance that was created as a result of deserialization you can use __post_deserialize__ class method:

@dataclass
class A(DataClassJSONMixin):
    abc: int

    @classmethod
    def __post_deserialize__(cls, obj: 'A') -> 'A':
        obj.abc = 456
        return obj

print(A.from_dict({"abc": 123}))    # A(abc=456)
print(A.from_json('{"abc": 123}'))  # A(abc=456)

Before serialization

For doing something before serialization you can use __pre_serialize__ method:

@dataclass
class A(DataClassJSONMixin):
    abc: int
    counter: ClassVar[int] = 0

    def __pre_serialize__(self) -> 'A':
        self.counter += 1
        return self

obj = A(abc=123)
obj.to_dict()
obj.to_json()
print(obj.counter)  # 2

Note that you can add an additional context argument using the corresponding code generation option.

After serialization

For doing something with a dictionary that was created as a result of serialization you can use __post_serialize__ method:

@dataclass
class A(DataClassJSONMixin):
    user: str
    password: str

    def __post_serialize__(self, d: Dict[Any, Any]) -> Dict[Any, Any]:
        d.pop('password')
        return d

obj = A(user="name", password="secret")
print(obj.to_dict())  # {"user": "name"}
print(obj.to_json())  # '{"user": "name"}'

Note that you can add an additional context argument using the corresponding code generation option.

JSON Schema

You can build JSON Schema not only for dataclasses but also for any other supported data types. There is support for the following standards:

  • JSON Schema Draft 2020-12
  • OpenAPI Specification 3.1.0

Building JSON Schema

For simple one-time cases it's recommended to start with the configurable build_json_schema function. It returns a JSONSchema object that can be serialized to JSON or to a dict:

from dataclasses import dataclass, field
from typing import List
from uuid import UUID

from mashumaro.jsonschema import build_json_schema


@dataclass
class User:
    id: UUID
    name: str = field(metadata={"description": "User name"})


print(build_json_schema(List[User]).to_json())
The result:
{
    "type": "array",
    "items": {
        "type": "object",
        "title": "User",
        "properties": {
            "id": {
                "type": "string",
                "format": "uuid"
            },
            "name": {
                "type": "string",
                "description": "User name"
            }
        },
        "additionalProperties": false,
        "required": [
            "id",
            "name"
        ]
    }
}

Additional validation keywords (see below) can be added using annotations:

from typing import Annotated, List
from mashumaro.jsonschema import build_json_schema
from mashumaro.jsonschema.annotations import Maximum, MaxItems

print(
    build_json_schema(
        Annotated[
            List[Annotated[int, Maximum(42)]],
            MaxItems(4)
        ]
    ).to_json()
)
The result:
{
    "type": "array",
    "items": {
        "type": "integer",
        "maximum": 42
    },
    "maxItems": 4
}

The $schema keyword can be added by setting with_dialect_uri to True:

print(build_json_schema(str, with_dialect_uri=True).to_json())
The result:
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "string"
}

By default, the Draft 2020-12 dialect is used, but you can change it to another one by setting the dialect parameter:

from mashumaro.jsonschema import OPEN_API_3_1

print(
    build_json_schema(
        str, dialect=OPEN_API_3_1, with_dialect_uri=True
    ).to_json()
)
The result:
{
    "$schema": "https://spec.openapis.org/oas/3.1/dialect/base",
    "type": "string"
}

All dataclass JSON Schemas may or may not be placed in the definitions section, depending on the all_refs parameter, whose default value comes from the dialect used (False for Draft 2020-12, True for OpenAPI Specification 3.1.0):

print(build_json_schema(List[User], all_refs=True).to_json())
The result:
{
    "type": "array",
    "$defs": {
        "User": {
            "type": "object",
            "title": "User",
            "properties": {
                "id": {
                    "type": "string",
                    "format": "uuid"
                },
                "name": {
                    "type": "string"
                }
            },
            "additionalProperties": false,
            "required": [
                "id",
                "name"
            ]
        }
    },
    "items": {
        "$ref": "#/$defs/User"
    }
}

The definitions section can be omitted from the final document by setting the with_definitions parameter to False:

print(
    build_json_schema(
        List[User], dialect=OPEN_API_3_1, with_definitions=False
    ).to_json()
)
The result:
{
    "type": "array",
    "items": {
        "$ref": "#/components/schemas/User"
    }
}

The reference prefix can be changed using the ref_prefix parameter:

print(
    build_json_schema(
        List[User],
        all_refs=True,
        with_definitions=False,
        ref_prefix="#/components/responses",
    ).to_json()
)
The result:
{
    "type": "array",
    "items": {
        "$ref": "#/components/responses/User"
    }
}

The omitted definitions can be found later in the Context object that you create and pass to the function, but it may be easier to use JSONSchemaBuilder for that. For example, you might find it handy to build an OpenAPI Specification step by step, passing your models to the builder and getting all the registered definitions later. This builder has reasonable defaults but can be customized if necessary.

from mashumaro.jsonschema import JSONSchemaBuilder, OPEN_API_3_1

builder = JSONSchemaBuilder(OPEN_API_3_1)

@dataclass
class User:
    id: UUID
    name: str

@dataclass
class Device:
    id: UUID
    model: str

print(builder.build(List[User]).to_json())
print(builder.build(List[Device]).to_json())
print(builder.get_definitions().to_json())
The result:
{
    "type": "array",
    "items": {
        "$ref": "#/components/schemas/User"
    }
}
{
    "type": "array",
    "items": {
        "$ref": "#/components/schemas/Device"
    }
}
{
    "User": {
        "type": "object",
        "title": "User",
        "properties": {
            "id": {
                "type": "string",
                "format": "uuid"
            },
            "name": {
                "type": "string"
            }
        },
        "additionalProperties": false,
        "required": [
            "id",
            "name"
        ]
    },
    "Device": {
        "type": "object",
        "title": "Device",
        "properties": {
            "id": {
                "type": "string",
                "format": "uuid"
            },
            "model": {
                "type": "string"
            }
        },
        "additionalProperties": false,
        "required": [
            "id",
            "model"
        ]
    }
}

JSON Schema constraints

Apart from the required keywords that are added automatically for certain data types, you're free to use additional validation keywords. They are represented by the corresponding classes in mashumaro.jsonschema.annotations:

Number constraints: Minimum, Maximum, ExclusiveMinimum, ExclusiveMaximum, MultipleOf

String constraints: MinLength, MaxLength, Pattern

Array constraints: MinItems, MaxItems, UniqueItems, Contains, MinContains, MaxContains

Object constraints: MaxProperties, MinProperties, DependentRequired

Extending JSON Schema

Using a Config class it is possible to override some parts of the schema. Currently, you can do the following:

  • override some field schemas using the "properties" key
  • change additionalProperties using the "additionalProperties" key

from dataclasses import dataclass
from mashumaro.jsonschema import build_json_schema

@dataclass
class FooBar:
    foo: str
    bar: int

    class Config:
        json_schema = {
            "properties": {
                "foo": {
                    "type": "string",
                    "description": "bar"
                }
            },
            "additionalProperties": True,
        }

print(build_json_schema(FooBar).to_json())
The result:
{
    "type": "object",
    "title": "FooBar",
    "properties": {
        "foo": {
            "type": "string",
            "description": "bar"
        },
        "bar": {
            "type": "integer"
        }
    },
    "additionalProperties": true,
    "required": [
        "foo",
        "bar"
    ]
}

You can also change the "additionalProperties" key to a specific schema by passing it a JSONSchema instance instead of a bool value.

JSON Schema and custom serialization methods

Mashumaro provides different ways to override the default serialization methods for dataclass fields or specific data types. In order for these overrides to be reflected in the schema, you need to make sure the methods have return type annotations.

from dataclasses import dataclass, field
from mashumaro.config import BaseConfig
from mashumaro.jsonschema import build_json_schema

def str_as_list(s: str) -> list[str]:
    return list(s)

def int_as_str(i: int) -> str:
    return str(i)

@dataclass
class FooBar:
    foo: str = field(metadata={"serialize": str_as_list})
    bar: int

    class Config(BaseConfig):
        serialization_strategy = {
            int: {
                "serialize": int_as_str
            }
        }

print(build_json_schema(FooBar).to_json())
The result:
{
    "type": "object",
    "title": "FooBar",
    "properties": {
        "foo": {
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "bar": {
            "type": "string"
        }
    },
    "additionalProperties": false,
    "required": [
        "foo",
        "bar"
    ]
}


mashumaro's Issues

Feature Request: Optional Serde By Alias

Hello, I have a feature request that would be very beneficial to us. We are moving models in and out of a document store where attribute names contribute to the size of the document. Because of this we want to shorten attributes when saving the document and rehydrate them coming out.

But there are places in our application where we don't have access to the deserialized dataclass and instead just have a dict. We want to abstract this so we don't have to know what the aliased fields are when working directly against a dict, and instead would like to have the ability to deserialize the dict based on either the alias or the unaliased keys.

FEATURE REQUEST: MIXED ALIAS DESERIALIZATION

Here is an example of a case where I'd like to build a new user entity from a dict. Then I'd like to deserialize it into my UserEntity. But while building my new user dict I don't want to have to know what the aliased fields are:

@dataclass
class UserEntity(DataClassDictMixin):
  name: str = field(metadata=field_options(alias="n"))
  age: str = field(metadata=field_options(alias="a"))

new_user = { "name": "Bill", "age": 30 }
new_user_ent = UserEntity.from_dict(new_user)

This doesn't work with DataClassDictMixin as far as I'm aware, so we created a helper mixin that translates unaliased fields to aliased fields before deserializing, in the __pre_deserialize__ method:

ALLOW_UNALIASED_DESERIALIZATION = True

@classmethod
def __pre_deserialize__(cls, d: Dict[Any, Any]) -> Dict[Any, Any]:
    if cls.ALLOW_UNALIASED_DESERIALIZATION:
        alias_revmap = cls.get_alias_revmap()
        for field, alias in alias_revmap.items():
            if field in d:
                d[alias] = d[field]
                d.pop(field)
    return d

@classmethod
def get_alias_map(cls) -> Dict[str, str]:
    """
    If aliases are defined. Returns a map of the aliased fields to the non-aliased field names
    """
    if not hasattr(cls, "__alias_map__"):
        cls.__alias_map__ = {}
        for key, field in cls.__dataclass_fields__.items():
            alias = field.metadata.get("alias")
            if alias:
                cls.__alias_map__[alias] = key
    return cls.__alias_map__

@classmethod
def get_alias_revmap(cls) -> Dict[str, str]:
    """
    If aliases are defined, returns a map of the non-aliased fields to the alias field names
    """
    if not hasattr(cls, "__alias_revmap__"):
        alias_map = cls.get_alias_map()
        cls.__alias_revmap__ = {
            v: k for k, v in alias_map.items()
        }
    return cls.__alias_revmap__

FEATURE REQUEST: OPTIONALLY DISABLE ALIAS SERIALIZATION

When serializing entities, aliases are not used by default. Once again we have a situation where we'd like to optionally opt out of alias serialization. This can be done with the TO_DICT_ADD_BY_ALIAS_FLAG, but perhaps an oversight is that you cannot make alias serialization the default and then opt out of it. If it is optional, it's only opt-in.

@dataclass
class MyEntity(DataClassDictMixin):
    a: str = field(metadata=field_options(alias="FieldA"))
    b: str = field(metadata=field_options(alias="FieldB"))

    class Config(BaseConfig):
        serialize_by_alias = True
        code_generation_options = [TO_DICT_ADD_BY_ALIAS_FLAG]


def scratch():
    me = MyEntity(
        a="Hello",
        b="World"
    )
    print(me.to_dict())
    print(me.to_dict(by_alias=False))


if __name__ == "__main__":
    scratch()

In this sample I would expect print(me.to_dict()) to use aliases during serialization, and print(me.to_dict(by_alias=False)) to opt out of the default. But even though I set the serialize_by_alias flag, it still uses the unaliased fields during serialization because I've also added TO_DICT_ADD_BY_ALIAS_FLAG.

I tried removing TO_DICT_ADD_BY_ALIAS_FLAG entirely, but then you no longer have access to by_alias in to_dict() and it throws this exception:

to_dict() got an unexpected keyword argument 'by_alias'

I tried to get around this by overriding the to_dict method in both my helper mixin and my entity class, but that code never gets hit:

    def to_dict(self, **kwargs) -> dict:
        print("HERE")
        return super().to_dict(**kwargs)

Whether as an instance method or a class method.


Thank you for taking the time to read through my feature request.

Python 3.8 support

There is an issue with the InitVar test for me, trying to port it to Python 3.8.

datetime parsing does not handle generic ISO-8601 strings

Currently the generated from_dict code calls datetime.fromisoformat():

elif origin_type in (datetime.datetime, datetime.date, datetime.time):
    return f'{value_name} if use_datetime else ' \
           f'datetime.{origin_type.__name__}.' \
           f'fromisoformat({value_name})'

fromisoformat is only designed to invert the strings generated by datetime.isoformat():

This does not support parsing arbitrary ISO 8601 strings - it is only intended as the inverse operation of datetime.isoformat(). A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.

According to mashumaro's documentation:

use_datetime: False  # False - load datetime oriented objects from ISO 8601 formatted string, True - keep untouched

I believe it should be within mashumaro's scope to handle generic ISO 8601 strings.

to_dict and from_dict do not support None as value for an Optional nested in Tuple

  • mashumaro version: 3.0
  • Python version: 3.6.9 (PyEnv, CPython)
  • Operating System: Ubuntu 20.04

Description

I'm trying to set an optional value in a Tuple (in the example, Tuple[Optional[int], int]) and export it to YAML, but it results in a TypeError. By looking at the generated code (see below), we can see that the Optional is ignored by the builder.

What I Did

from dataclasses import dataclass, field
from typing import Tuple, Optional

from mashumaro.mixins.yaml import DataClassYAMLMixin

@dataclass
class Foo(DataClassYAMLMixin):
    bar: Tuple[Optional[int], int] = field(default_factory=lambda: (None, 42))

print(Foo().to_dict())

Output:

Traceback (most recent call last):
  File "test2.py", line 17, in <module>
    print(Foo().to_dict())
  File "<string>", line 7, in to_dict
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Generated code for to_dict:

__main__.Foo:
def to_dict(self):
    kwargs = {}
    value = getattr(self, 'bar')
    if value is None:
        kwargs['bar'] = None
    else:
        kwargs['bar'] = [int(value[0]), int(value[1])]
        # should have been [int(value[0]) if value[0] is not None else None, int(value[1])]
    return kwargs
setattr(cls, 'to_dict', to_dict)

EDIT: from_dict also suffers from the same issue

Removing use_enum & use_datetime from from_dict() in favor of auto detection?

Hi,
why does the from_dict() function have use_enum & use_datetime as options, instead of an automatic isinstance({value_name}, {type_name(origin_type)}) check?

I think an isinstance check right at the beginning, near the value is None check, would also be useful for other special types like UUID, IPv4Address or Path, where otherwise the current from_dict function would fail if the input already contains a valid instance of the origin type.

Are you open to a pull request that removes the two options? Do you think it would even be possible? Or is it not in line with your design goals?

TestCase for CodeBuilder.defaults property

It seems to me as if the defaults property of the CodeBuilder is populated from
self.cls.__dict__ (via the namespace property).

If I look at an actual class, I see:
cls.__dataclass_fields__ holds a mapping from field names to dataclass Field instances, each of which has either a default or a default_factory.

Maybe the defaults property should read values from cls.__dataclass_fields__ just like a normally instantiated dataclass does? (Or at least not raise a MissingError, as cls(**kwargs) would fill them in afterwards, and only break if the default is a NoneType.)

field serde specification does not work?

Following the README, I'm trying to add a custom serde for a numpy array:

from dataclasses import dataclass
from dataclasses import field
import pickle

from mashumaro import DataClassMessagePackMixin
import numpy as np

@dataclass
class Numpy(DataClassMessagePackMixin):
    data: np.ndarray = field(metadata={'serialize': pickle.dumps, 'deserialize': pickle.loads})

It does not seem to work, as mashumaro complains:

UnserializableField: Field "data" of type numpy.ndarray in __main__.Numpy is not serializable

Python 3.8.10, mashumaro 2.6.2

P.S. Also consider adding a mashumaro.__version__ attribute; currently I could not figure out how to check the mashumaro version without trying to pip install it again.

String fields neither validated nor cast

I am declaring a class with a str field. I expect that after parsing it will contain only str data. The possibilities are: cast any other type to str, or raise a validation error during parsing. Neither of these happens.

E.G, this code works and obj.id will be a list though it is declared as str

@dataclass
class MashumaroTodo(DataClassDictMixin):
    id: str


obj = MashumaroTodo.from_dict({"id": ["invalid data"]})

generate schema from models?

Is there any way to generate a data dict of the model structure (field + type)? This would allow generating TypeScript definitions, for example, like a poor man's OpenAPI-generator-style tool.

Support Union type

I read in the TODO list that the support of Union types is planned. Is there any timeline?

Example use case:

@dataclass 
class CustomShape(DataClassYAMLMixin):
    name: str
    num_corners: int

@dataclass
class ShapeCollection(DataClassYAMLMixin):
    shapes: List[Union[str, CustomShape]]
# shapes.yaml
shapes:
  - triangle
  - name: square
    num_corners: 4

Expected parsed structure:

with open(file="shapes.yaml", mode='r') as f:
    shapes: ShapeCollection = ShapeCollection.from_yaml(data=f)

print(shapes)
# ShapeCollection(shapes=['triangle', CustomShape(name='square', num_corners=4)])

(btw: congrats on the project, it gets the most out of the combination of dataclasses and YAML)

Alias option won't work with slots enabled

Hi!
I just noticed that if I enable slots (to reduce memory usage), the alias option doesn't work:

@dataclass(slots=True)
class DataClass(DataClassJSONMixin):
    a: int = field(metadata=field_options(alias="FieldA"))
    b: int = field(metadata=field_options(alias="#invalid"))

x = DataClass.from_dict({"FieldA": 1, "#invalid": 2})  # DataClass(a=1, b=2)
print(x)
x.to_dict()  # {"a": 1, "b": 2}  # no aliases on serialization by default

I get this error:

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    x = DataClass.from_dict({"FieldA": 1, "#invalid": 2})  # DataClass(a=1, b=2)
  File "<string>", line 26, in from_dict
TypeError: DataClass.__init__() missing 2 required positional arguments: 'a' and 'b'

I don't know if there is a workaround; it's probably worth mentioning this in the docs.

Using `InitVar` with mashumaro

First up, thanks for the library. It's proving really useful for something I started working on.

I've run into one problem. I'd like some properties of the dataclass to be ignored for most purposes. InitVar appears ideal for this use case: https://docs.python.org/3/library/dataclasses.html#init-only-variables

If a field is an InitVar, it is considered a pseudo-field called an init-only field. As it is not a true field, it is not returned by the module-level fields() function. Init-only fields are added as parameters to the generated __init__() method, and are passed to the optional __post_init__() method. They are not otherwise used by dataclasses.

However, adding an InitVar seems to trip up mashumaro. It complains that it's not serializable. I would expect InitVars to be ignored, because they are explicitly not intended to be part of the final serialization.

@dataclass
class Organization:
    api_token: InitVar[str] = None

raise UnserializableField(fname, ftype, parent)
mashumaro.exceptions.UnserializableField: Field "api_token" of type dataclasses.InitVar in models.Organization is not serializable
  • Are there any workarounds at the moment?
  • Does the above make sense? Should InitVars be ignored by mashumaro's validation?

Possible to serialize a top-level list/array?

JSON allows an array at the top level (instead of an object). It would be nice, if we have e.g. a List[MyDataClass], to be able to serialize it directly without a wrapper. Is this possible in mashumaro?

Currently I'm working around it as follows:

json_str = f"[{','.join(item.to_json() for item in my_list)}]"
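
A cleaner variant of the same workaround is to let the stdlib json module build the enclosing array; a hedged sketch where dataclasses.asdict stands in for mashumaro's to_dict (Item is a hypothetical model):

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Item:
    name: str


my_list = [Item("a"), Item("b")]
# Dump each item's dict form and let json.dumps produce the top-level array.
payload = json.dumps([asdict(item) for item in my_list])
```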

Problem with the type_name function, specifically the __qualname__ attribute

Hello. Recently, while adapting my project code to the mashumaro library, I encountered a strange bug. Here is the code to reproduce it:

from enum import Enum
from typing import Set
from dataclasses import dataclass
from mashumaro import DataClassJSONMixin

class testing:
    
    @staticmethod
    def pets():
        
        class PetType(Enum):
            CAT = 'CAT'
            MOUSE = 'MOUSE'

        @dataclass(unsafe_hash=True)
        class Pet(DataClassJSONMixin):
            name: str
            age: int
            pet_type: PetType

        @dataclass
        class Person(DataClassJSONMixin):
            first_name: str
            second_name: str
            age: int
            pets: Set[Pet]


        tom = Pet(name='Tom', age=5, pet_type=PetType.CAT)
        jerry = Pet(name='Jerry', age=3, pet_type=PetType.MOUSE)
        john = Person(first_name='John', second_name='Smith', age=18, pets={tom, jerry})

        dump = john.to_json()
        person = Person.from_json(dump)

testing.pets()

I based this code on an official example. Here is the runtime error:

Traceback (most recent call last):
  File "/data/user/0/ru.iiec.pydroid3/files/accomp_files/iiec_run/iiec_run.py", line 31, in <module>
    start(fakepyfile,mainpyfile)
  File "/data/user/0/ru.iiec.pydroid3/files/accomp_files/iiec_run/iiec_run.py", line 30, in start
    exec(open(mainpyfile).read(),  __main__.__dict__)
  File "<string>", line 36, in <module>
  File "<string>", line 16, in pets
  File "/data/user/0/ru.iiec.pydroid3/files/arm-linux-androideabi/lib/python3.8/site-packages/mashumaro/serializer/base/dict.py", line 21, in __init_subclass__
    raise exc
  File "/data/user/0/ru.iiec.pydroid3/files/arm-linux-androideabi/lib/python3.8/site-packages/mashumaro/serializer/base/dict.py", line 13, in __init_subclass__
    builder.add_from_dict()
  File "/data/user/0/ru.iiec.pydroid3/files/arm-linux-androideabi/lib/python3.8/site-packages/mashumaro/serializer/base/metaprogramming.py", line 323, in add_from_dict
    self.compile()
  File "/data/user/0/ru.iiec.pydroid3/files/arm-linux-androideabi/lib/python3.8/site-packages/mashumaro/serializer/base/metaprogramming.py", line 191, in compile
    exec(self.lines.as_text(), globals(), self.__dict__)
  File "<string>", line 43
    kwargs['pet_type'] = value if use_enum else __main__.testing.pets.<locals>.PetType(value)
                                                                      ^
SyntaxError: invalid syntax

[Program finished]

The problem seems to be the __qualname__ attribute used in the type_name function in helpers.py. I tried to strip the closure names (the name.<locals> part) that __qualname__ contains, but failed.

Serialization strategy for TypeVar does not work in BaseConfig

I find that when I define a TypeVar, variables of that type are always serialized as the first type listed in the TypeVar definition.

In an attempt to fix that I tried to set a global serialization strategy; however, that strategy was never selected. Rather, the selection also seems to be based on the first type listed in the TypeVar. Quite a bit of the code in metaprogramming.py seems to handle TypeVars, so I wonder if I am missing something else?

The basic code is:

Numeric = TypeVar('Numeric', float, int)  # <== here float is first in the list, seems to be selected as the key for serialization strategy.

class NumericStrategy(SerializationStrategy):
    def serialize(self, value):
        return int(value)

    def deserialize(self, value):
        return int(value)

@dataclass
class foo(DataClassYAMLMixin):
    bar: Numeric = 1
    class Config(BaseConfig):
            serialization_strategy = {
                Numeric: NumericStrategy(),  # <== this is the line I think should activate the strategy
                int: NumericStrategy(),
                #float: NumericStrategy(),  # <== it seems that only this line would activate the strategy
            }

Mashumaro Doesn't Understand Imports In Imported File

my_enum.py:

from enum import Enum

class MyEnum(str, Enum):
  a = "A"
  b = "B"

my_base_class.py

from my_enum import MyEnum
class MyBaseClass(DataClassDictMixin):
  my_enum: MyEnum

my_class.py

from my_base_class import MyBaseClass

class MyClass(DataClassDictMixin, MyBaseClass):
  hello: str

main.py

from my_class import MyClass
print("hello")

Gives an error like:

name 'MyEnum' is not defined
Traceback (most recent call last):
name 'MyEnum' is not defined

  ...

  File "....", line 5, in <module>
    class MyClass(MyBaseClass):
  File "..../.pyenv/versions/3.7.7/lib/python3.7/abc.py", line 126, in __new__
    cls = super().__new__(mcls, name, bases, namespace, **kwargs)
  File "..../lib/python3.7/site-packages/mashumaro/serializer/base/dict.py", line 23, in __init_subclass__
    raise exc
  File "..../lib/python3.7/site-packages/mashumaro/serializer/base/dict.py", line 19, in __init_subclass__
    builder.add_to_dict()
  File "..../lib/python3.7/site-packages/mashumaro/serializer/base/metaprogramming.py", line 430, in add_to_dict
    for fname, ftype in self.field_types.items():
  File "..../lib/python3.7/site-packages/mashumaro/serializer/base/metaprogramming.py", line 169, in field_types
    return self.__get_field_types()
  File "..../lib/python3.7/site-packages/mashumaro/serializer/base/metaprogramming.py", line 144, in __get_field_types
    for fname, ftype in typing.get_type_hints(self.cls, globalns).items():
  File "..../.pyenv/versions/3.7.7/lib/python3.7/typing.py", line 982, in get_type_hints
    value = _eval_type(value, base_globals, localns)
  File "..../.pyenv/versions/3.7.7/lib/python3.7/typing.py", line 263, in _eval_type
    return t._evaluate(globalns, localns)
  File "..../.pyenv/versions/3.7.7/lib/python3.7/typing.py", line 468, in _evaluate
    eval(self.__forward_code__, globalns, localns),
  File "<string>", line 1, in <module>
  
NameError: name 'MyEnum' is not defined

When stepping through the debugger it looks like MyEnum is not in the globalns or localns when it tries to build a builder for the subclass.

How can I fix this without needing to import all parent class dependencies as imports to the subclasses?

Validation Fails on List[Optional[int]] types

Description

Mashumaro seems to fail on List[Optional[int]] types for values that contain None, for example [2, None].
It appears that mashumaro is trying to convert each element to int and raises upon failure.

Is this possibly a bug?
Thanks for the great library by the way!

Here is a minimal working example:

from __future__ import annotations

from dataclasses import dataclass
from typing import List, Union

from mashumaro import DataClassJSONMixin


@dataclass
class MyModel(DataClassJSONMixin):
    my_list: List[Union[int, None]]


@dataclass
class MyModel2(DataClassJSONMixin):
    my_list: List[Union[int, None, str]]


if __name__ == '__main__':
    # Works fine
    model_1_a = MyModel.from_dict({'my_list': [120, 1]})
    print(model_1_a)
    # prints: MyModel(my_list=[120, 1])

    # prints:
    # Field "my_list" of type typing.List[typing.Union[int, NoneType]] in __main__.MyModel has invalid value [None, 1]
    # The underlying error appears to be:
    # TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
    try:
        model_1_b = MyModel.from_dict({'my_list': [None, 1]})
    except Exception as error:
        print(error)

    model_2_a = MyModel2.from_dict({'my_list': [None, 1]})
    print(model_2_a)
    # prints: MyModel2(my_list=[None, 1])

Temporary solution

A temporary workaround, for those who may have also encountered this, is to define your own deserializer using field(metadata={'deserialize': ...}):

from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Union, Optional

from mashumaro import DataClassJSONMixin


def _deserialize_list_optional_int(items: List[Optional[int]]) -> List[Optional[int]]:
    return [
        int(item) if item is not None else None
        for item in items
    ]


@dataclass
class MyModel(DataClassJSONMixin):
    my_list: List[Union[int, None]] = field(metadata={
        'deserialize': _deserialize_list_optional_int})


if __name__ == '__main__':
    # Works fine
    model_1_a = MyModel.from_dict({'my_list': [120, 1]})
    print(model_1_a)
    # prints: MyModel(my_list=[120, 1])

    model_1_b = MyModel.from_dict({'my_list': [None, 1]})
    print(model_1_b)

PEP-563 breaks SerializationStrategy

  1. The SerializationStrategy example with DateTimeFormats works fine in py3.7:
    https://github.com/Fatal1ty/mashumaro#user-defined-classes

  2. However, the introduction of PEP 563 via from __future__ import annotations
    https://www.python.org/dev/peps/pep-0563/
    breaks usage of SerializationStrategy with this error trace:

Traceback (most recent call last):
  File ".../src/verify/code/mashu/mashu_pep563.py", line 22, in <module>
    class DateTimeFormats(DataClassDictMixin):
  File "/usr/lib/python3.7/site-packages/mashumaro/serializer/base/dict.py", line 19, in __init_subclass__
    raise exc
  File "/usr/lib/python3.7/site-packages/mashumaro/serializer/base/dict.py", line 15, in __init_subclass__
    builder.add_to_dict()
  File "/usr/lib/python3.7/site-packages/mashumaro/serializer/base/metaprogramming.py", line 163, in add_to_dict
    for fname, ftype in self.fields.items():
  File "/usr/lib/python3.7/site-packages/mashumaro/serializer/base/metaprogramming.py", line 53, in fields
    for fname, ftype in typing.get_type_hints(self.cls).items():
  File "/usr/lib/python3.7/typing.py", line 973, in get_type_hints
    value = _eval_type(value, base_globals, localns)
  File "/usr/lib/python3.7/typing.py", line 260, in _eval_type
    return t._evaluate(globalns, localns)
  File "/usr/lib/python3.7/typing.py", line 466, in _evaluate
    is_argument=self.__forward_is_argument__)
  File "/usr/lib/python3.7/typing.py", line 139, in _type_check
    raise TypeError(f"{msg} Got {arg!r:.100}.")
TypeError: Forward references must evaluate to types. Got <__main__.FormattedDateTime object at 0x7faec3540f60>.

mashumaro explodes when used as a vendored install

I have a project where I am installing mashumaro in a "vendor" subdirectory (via pdistx) for use by my code; because of the environment the code runs in, installing mashumaro globally is not an option, and neither is updating PYTHONPATH or similar. This means that the import line in my code ends up looking like: from vendor.mashumaro import DataClassJSONMixin

Setting up the vendor directory (see reproduction steps below) and running the following script:

from dataclasses import dataclass
from vendor.mashumaro import DataClassJSONMixin

@dataclass
class TestClassTwo(DataClassJSONMixin):
    field1: int

@dataclass
class TestClassOne(DataClassJSONMixin):
    field1: TestClassTwo

...results in the following stack trace:

Traceback (most recent call last):
  File "D:\Users\username\Documents\testdir\test2.py", line 10, in <module>
    class TestClassOne(DataClassJSONMixin):
  File "D:\Users\username\Documents\testdir\vendor\mashumaro\serializer\base\dict.py", line 14, in __init_subclass__
    builder.add_from_dict()
  File "D:\Users\username\Documents\testdir\vendor\mashumaro\serializer\base\metaprogramming.py", line 229, in add_from_dict
    self._from_dict_set_value(fname, ftype, metadata, alias)
  File "D:\Users\username\Documents\testdir\vendor\mashumaro\serializer\base\metaprogramming.py", line 250, in _from_dict_set_value
    unpacked_value = self._unpack_field_value(fname=fname, ftype=ftype, parent=self.cls, metadata=metadata)
  File "D:\Users\username\Documents\testdir\vendor\mashumaro\serializer\base\metaprogramming.py", line 800, in _unpack_field_value
    raise UnserializableField(fname, ftype, parent)
vendor.mashumaro.exceptions.UnserializableField: Field "field1" of type TestClassTwo in TestClassOne is not serializable

This appears to happen because of the following hardcoded class name in meta/helpers.py:

DataClassDictMixinPath = "mashumaro.serializer.base.dict.DataClassDictMixin"

...which is no longer correct when the code is run in this way (the correct path would be vendor.mashumaro.serializer.base.dict.DataClassDictMixin).

I'm not sure what the best way to fix this is, but in the extremely short term I'm able to patch the class name for my specific use case. It would be great if I could vendor an install without having to edit it afterwards!

Reproduction steps:

  1. install pdistx if you don't already have it installed
  2. Create an empty directory for the test
  3. In that directory, create a requirements.txt file, as such:
mashumaro==2.9
msgpack==1.0.3
pyyaml==6.0
typing-extensions==4.0.0
  4. Execute pdistx vendor -r requirements.txt vendor. This will create a 'vendor' subdirectory with mashumaro and its dependencies.
  5. Create a test script using the script at the beginning of this issue, execute it, and observe the stack trace.

Thanks for the excellent tool!

How to omit a field on `from_dict`

I am deserializing a dictionary from a MongoDB query, and I want to disregard the _id field on the incoming record. Is there a way to do that?
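
If the library offers no option for this in your version, one hedged workaround is to filter the key out before calling from_dict; a minimal stdlib sketch (the record contents are illustrative):

```python
# Drop the Mongo-specific key before handing the record to from_dict.
record = {"_id": "507f1f77bcf86cd799439011", "name": "todo"}
clean = {k: v for k, v in record.items() if k != "_id"}
```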

Inconsistent ordering of Optional[Union[...]] types

We have a NodeConfig class that has a 'unique_key' attribute defined: unique_key: Optional[Union[str, List[str]]] = None

I have two identical tests in two different test directories ('test' and 'tests'). One of them fails because "unique_key": "id" is converted to "unique_key": ["i", "d"], which implies that the List is processed first. The other one does not fail.

When I dump out the compiled code, I see a difference in the order:

raise InvalidFieldValue('unique_key',typing.Union[str, typing.List[str], None],value,cls)

raise InvalidFieldValue('unique_key',typing.Union[typing.List[str], str, None],value,cls)
I'm not sure where to look in the code for the order of the Union types.

Support for dataclasses.field default_factory argument

Hi @Fatal1ty,

Thanks for making this library, I've found it to be easy to use and very performant.

Today I ran into an issue where deserialization of a subclass inheriting a field with default_factory fails. Here's a minimal example:

from dataclasses import dataclass, field
from typing import List

from mashumaro import DataClassJSONMixin

@dataclass()
class A(DataClassJSONMixin):
    foo: List[str] = field(default_factory=list)
 
@dataclass()
class B(A):
    pass

print(A())  # A(foo=[])
print(B())  # B(foo=[])
print(A.from_dict({}))  # A(foo=[])
print(B.from_dict({}))  # Exception
~/.pyenv/versions/3.8.2/lib/python3.8/site-packages/mashumaro/serializer/base/metaprogramming.py in from_dict(cls, d, use_bytes, use_enum, use_datetime)

MissingField: Field "foo" of type typing.List[str] is missing in __main__.B instance

I think this occurs because when checking for default values of ancestors, only field.default is extracted, ignoring field.default_factory:

d[field.name] = field.default

broken serialization for subclass of MutableMapping

Commit 501b648 broke mashumaro for our project (dbt). The problem appears to be that Python reports None as the origin for one of our classes, which inherits from a class that also inherits from MutableMapping.

There is a test at https://github.com/gshank/mashumaro/blob/broken_config/tests/test_mutable_mapping.py. This test passes on Python 3.6 but is broken in Python 3.8.

I made an attempt to fix it but just went around in circles. Putting back the 'is_dataclass' calls that were removed in the breaking commit fixed this test, but broke two other tests: tests/test_data_types.py::test_dataclass_field_without_mixin and tests/test_data_types.py::test_serializable_type_dataclass.

cannot install with conda for python 3.9

I'm trying to install mashumaro for python3.9 with conda. I use the following command:

conda create -n test -c conda-forge python=3.9 mashumaro

and I get the following errors:

Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                                                                    

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package python conflicts for:
mashumaro -> backports-datetime-fromisoformat -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.6,<3.7|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|>=3.8,<3.9.0a0|>=3.7|>=3.9,<3.10.0a0|>=3.5,<3.6.0a0']
mashumaro -> python[version='>=3.6']
python=3.9

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.34=0
  - feature:|@/linux-64::__glibc==2.34=0

Your installed version is: 2.34

This also fails for Python 3.10. However, I could install mashumaro for Python 3.6, 3.7 and 3.8 without this problem, and I could install Python 3.9 on its own without issue.

Please add a Changelog

I couldn't find any information in the README about breaking changes in the 3.0 release, and there does not appear to be a changelog. Could you add a CHANGELOG.md file that contains the most important updates as well as a list of breaking changes for every release?

I like the format proposed by https://keepachangelog.com

Error when trying pip install

Hello,

I encounter this issue when using pip install:

long_description=open('README.md').read(),
UnicodeDecodeError: 'cp950' codec can't decode byte 0xe3 in position 13: illegal multibyte sequence

Please make this package installable on all computers.

Thank you so much!
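
The likely fix on the packaging side is to pass an explicit encoding when reading the README, instead of relying on the platform's default codec (cp950 here). A hedged sketch, using a temporary stand-in file in place of the real README.md:

```python
import tempfile
from pathlib import Path

# A stand-in file plays the role of README.md for this demo.
with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    readme.write_text("mashumaro こんにちは", encoding="utf-8")
    # open('README.md').read() would use the locale codec and can fail;
    # passing encoding='utf-8' explicitly is the portable fix.
    long_description = readme.read_text(encoding="utf-8")
```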

to_dict() fails for dictionaries with tuple keys

We have some dictionaries with tuple keys. from_dict works okay with a pre-constructed dictionary, but to_dict() fails with the error: TypeError: unhashable type: 'list'

#!/usr/bin/env python
from typing import Dict, Tuple
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class TestClass(DataClassDictMixin):
    name: str
    patches: Dict[Tuple[str, str], str]

dct = {
    'name': 'testing',
    'patches': {
        ('one', 'name'): 'test1',
        ('two', 'order'): 'test2',
        ('three', 'change'): 'test3',
    }
}

obj = TestClass.from_dict(dct)
print(obj)

new_dct = obj.to_dict()
print(new_dct)

__slots__ are ignored

Hey there! First, thanks for this great library, I am really enjoying using it!

Sometimes I like to use __slots__ with dataclasses to prevent myself from accidentally setting a value on a (typo-ed) new attribute instead of an existing one.

However, if I use the DataClassDictMixin from mashumaro, it seems like __slots__ is ignored, i.e. if I set a value on a non-existing attribute I do not get an AttributeError; instead a new attribute is created and the value is set on it, as if __slots__ did not exist.

Here's a short pytest case to showcase the problem:

import pytest
from dataclasses import dataclass
from mashumaro import DataClassDictMixin


@dataclass
class Foo:
    __slots__ = ["number"]
    number: int


@dataclass
class Bar(DataClassDictMixin):
    __slots__ = ["number"]
    number: int


def test_slots():

    foo = Foo(1)
    with pytest.raises(AttributeError) as err:
        foo.new_attribute = 2
    assert str(err.value) == "'Foo' object has no attribute 'new_attribute'"

    bar = Bar(1)
    bar.new_attribute = 2
    # -> should also fail with "'Bar' object has no attribute 'new_attribute'",
    # but it doesn't
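
A likely explanation, assuming DataClassDictMixin does not itself define __slots__ in the version used here: this is standard Python behavior, not something mashumaro does deliberately. __slots__ only suppresses the instance __dict__ when every class in the MRO defines __slots__; inheriting from any class without it reintroduces __dict__. A minimal stdlib sketch of the effect (Mixin is a hypothetical stand-in for the mixin base):

```python
class Mixin:
    # No __slots__ here, like a typical mixin base class. Subclass
    # instances therefore get a __dict__ regardless of their own slots.
    pass


class WithSlots:
    __slots__ = ("number",)


class Combined(Mixin):
    # Declares slots, but inherits __dict__ from Mixin, so assigning
    # arbitrary attributes silently succeeds.
    __slots__ = ("number",)
```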

Deserialize json to union of different classes by parameter

Hi!
I'm trying to understand whether it is possible to deserialize a nested dataclass based on some value in the main dataclass.
I don't know how to explain it more clearly; my code will probably say more than I can:

I have this JSON:

{
    "pointList": [{
		"r": {some data for pointType 1},
		"x": 1,
		"y": 1,
		"pointType": 1
	}, {
		"p": {some data for pointType 4},
		"x": 2,
		"y": 2,
		"pointType": 4
	}
    ]
}

So I defined a dataclass for each pointType I have. Serialization worked like a charm, but how can I deserialize such JSON, choosing the correct dataclass for each point?

class BaseRequest(DataClassJSONMixin):
    class Config(BaseConfig):
        code_generation_options = [TO_DICT_ADD_OMIT_NONE_FLAG]

@dataclass(slots=True)
class MapPoint(BaseRequest):
    x: int
    y: int
    pointType: int

@dataclass(slots=True)
class Point4(MapPoint):
    r: Point4Data = None
    pointType: int = 4

@dataclass(slots=True)
class Point1(MapPoint):
    p: Point1Data = None
    pointType: int = 1

@dataclass(slots=True)
class MapData(BaseRequest):
    pointList: List[Union[Point1,Point4]]

For example, could I make some dict where I can set the class for each pointType and pass it to the deserialization function? Or is this impossible and I want too much? :)
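
The dict-of-classes idea from the question can be done by hand as a pre-processing step: map each pointType value to its class and dispatch per item. A hedged sketch with simplified plain dataclasses standing in for the mashumaro models:

```python
from dataclasses import dataclass


@dataclass
class Point1:
    x: int
    y: int


@dataclass
class Point4:
    x: int
    y: int


# Discriminator table: pointType value -> concrete class.
POINT_TYPES = {1: Point1, 4: Point4}


def load_point(raw: dict):
    # Pop the discriminator, look up the class, construct from the rest.
    data = dict(raw)
    cls = POINT_TYPES[data.pop("pointType")]
    return cls(**data)
```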

Datamapper example produces error on list items

Great work on this package. It seems to be the only package that supports the schema changes I want to perform on JSON coming in from an external API.

However, the provided example in /examples/json_remapping.py does not work for list items. Suppose I have dataclasses set up as below (note the companies attribute):

@dataclass
class Company(DataClassJSONMixin):
    id: int
    name: str

    __remapping__ = {
        "ID": "id",
        "NAME": "name",
    }

@dataclass
class User(DataClassJSONMixin):
    id: int
    username: str
    email: str
    companies: List[Company]

    __remapping__ = {
        "ID": "id",
        "USERNAME": "username",
        "EMAIL": "email",
        "COMPANIES": ("companies", Company.__remapping__),
    }

We receive AttributeError: 'list' object has no attribute 'items' because the remapper does not seem to be prepared for list items:

def remapper(d: Dict[str, Any], rules: RemappingRules) -> Dict[str, Any]:
    result = {}
    for key, value in d.items():
        mapped_key = rules.get(key, key)
        if isinstance(mapped_key, tuple):
            value = remapper(value, mapped_key[1])
            result[mapped_key[0]] = value
        else:
            result[mapped_key] = value
    return result

How should I go about changing the remapper to also support list items?
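
One possible extension, sketched here with RemappingRules replaced by a plain Dict[str, Any] so it runs standalone: when the remapped value is a list, apply the nested rules to each element instead of to the list itself.

```python
from typing import Any, Dict


def remapper(d: Dict[str, Any], rules: Dict[str, Any]) -> Dict[str, Any]:
    result = {}
    for key, value in d.items():
        mapped_key = rules.get(key, key)
        if isinstance(mapped_key, tuple):
            new_key, nested_rules = mapped_key
            if isinstance(value, list):
                # Apply the nested rules per element for list values.
                value = [remapper(item, nested_rules) for item in value]
            else:
                value = remapper(value, nested_rules)
            result[new_key] = value
        else:
            result[mapped_key] = value
    return result
```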

to_dict() should take an options object or pass through kwargs

  • I want to interact with outside data types that need camelKeys, but want to use snake_keys for Python objects.
  • I want these rules to be applied recursively, so that whenever we convert a child DataClassDictMixin the behavior carries through.

I could get most of this behavior by overloading to_dict and intercepting the input or output. The problem is that in some scenarios I want the camelCase behavior and in some I don't. An example would be JSON serialization where it's enabled, and dict serialization where it's not. If I add a flag to to_dict to control the behavior, it won't pass through to nested instances. We could convert the method to take an options dict, or we could pass through kwargs to recursive invocations.
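
In the meantime, the recursive key conversion can be applied as a post-processing step on the plain dict that to_dict() returns; a hedged stdlib sketch:

```python
def to_camel(name: str) -> str:
    # user_name -> userName; names without underscores pass through.
    head, *rest = name.split("_")
    return head + "".join(part.title() for part in rest)


def camelize(obj):
    # Recurse through nested dicts and lists, renaming only dict keys.
    if isinstance(obj, dict):
        return {to_camel(k): camelize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [camelize(v) for v in obj]
    return obj
```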

Nested class serialization breaks with future annotations import

Minimal code to reproduce:

from __future__ import annotations # <-- Brings troubles
from dataclasses import dataclass
from mashumaro import DataClassDictMixin


@dataclass
class Root(DataClassDictMixin):
    @dataclass
    class Nested(DataClassDictMixin):
        x: int

    nested: Nested


cfg = Root.from_dict({"nested": {"x": 1}}) 

print(cfg)
Traceback [python==3.9.5 / mashumaro==2.9.1]:
Traceback (most recent call last):
  File "bug.py", line 15, in <module>
    cfg = Root.from_dict({"nested": {"x": 1}})
  File "<path-to-venv>/lib/python3.9/site-packages/mashumaro/serializer/base/dict.py", line 50, in from_dict
    builder.add_from_dict()
  File "<path-to-venv>/lib/python3.9/site-packages/mashumaro/serializer/base/metaprogramming.py", line 282, in add_from_dict
    for fname, ftype in self.field_types.items():
  File "<path-to-venv>/lib/python3.9/site-packages/mashumaro/serializer/base/metaprogramming.py", line 178, in field_types
    return self.__get_field_types()
  File "<path-to-venv>/lib/python3.9/site-packages/mashumaro/serializer/base/metaprogramming.py", line 152, in __get_field_types
    raise UnresolvedTypeReferenceError(self.cls, name) from None
mashumaro.exceptions.UnresolvedTypeReferenceError: Class Root has unresolved type reference Nested in some of its fields

The code runs fine without the from __future__ import annotations line.

Note: with DataClassYAMLMixin you get a rather cryptic error message:

Traceback (most recent call last):
  File "bug-yml.py", line 14, in <module>
    cfg = Root.from_yaml("""
  File "<path-to-venv>/lib/python3.9/site-packages/mashumaro/serializer/yaml.py", line 51, in from_yaml
    return cls.from_dict(
  File "<string>", line 10, in from_dict
TypeError: __init__() missing 1 required positional argument: 'nested'

Member variables of forward-referenced class types are ignored during serialization

Member variables of forward-referenced class types are ignored during serialization.

For example, consider the following case:

from __future__ import annotations
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class A(DataClassDictMixin):
    a:B

@dataclass
class B(DataClassDictMixin):
    b:int

a = A(B(1))
print(a.to_dict())

This gives us the expected:

{'a': {'b': 1}}

On the other hand, running the following case does not give the expected results.

@dataclass
class Base(DataClassDictMixin):
    pass

@dataclass
class A1(Base):
    a:B1

@dataclass
class B1(Base):
    b:int


a = A1(B1(1))
print(a.to_dict())

Result:

{}

At this time, no exception or error message will be generated.

Expected:

{'a': {'b': 1}}

Thanks for the excellent tool!

Broken serialization when using Dict + serialization_strategy

I'm hitting a weird bug. When I use a Dict field type with a serialization_strategy, the serialize method is passed the entire dict rather than individual items. The result is broken serialization, as every value contains all values.

Confusing, right? I'll try to explain with an example.

from dataclasses import dataclass
from decimal import Decimal
from typing import Dict

from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig
from mashumaro.types import RoundedDecimal


@dataclass()
class Foo(DataClassDictMixin):
    bar: Dict[str, Decimal]

    class Config(BaseConfig):
        serialization_strategy = {
            Decimal: RoundedDecimal(),
        }


foo = Foo(bar={"a": 1, "b": 2})
print(foo.to_dict())
assert foo.to_dict() == {"bar": {"a": "1", "b": "2"}}

This fails, because foo.to_dict() does not return {"bar": {"a": "1", "b": "2"}} but it returns {'bar': {'a': "{'a': 1, 'b': 2}", 'b': "{'a': 1, 'b': 2}"}}. See it there? Every value in the returned dict contains all values of the dict.

Skip default value members on serialization

Hi!

Is there a way to skip variables that have default values during serialization/dict conversion?

Example:

@dataclass
class MyClass(DataClassJSONMixin):
    name: str = None
    soul: str = None


c = MyClass()
c.name='itseme'

print(c.to_json())

output: {"name": "itseme", "soul": null}
desired output : {"name": "itseme"}

thanks!
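
If the library has no such option in your version, a hedged workaround is to post-filter the serialized dict before dumping; a minimal sketch where dataclasses.asdict stands in for to_dict:

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MyClass:
    name: Optional[str] = None
    soul: Optional[str] = None


def to_json_skip_none(obj) -> str:
    # Drop keys whose value is None, then dump the rest.
    return json.dumps({k: v for k, v in asdict(obj).items() if v is not None})
```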

Remapping json output and input

Is the function of remapping of the input and output of the JSON is supported?

I see dict_param but it does not work as expected.

@dataclass
class User(DataClassJSONMixin):
    id: int
    username: str
    email: str

assert User.from_json(json.dumps({"ID":1})) == User(id=1) # FAILED

self-referencing/forward-references dataclasses are not supported

Mashumaro does not currently support self-referencing classes; the code generation fails when it attempts to reflect on the field's type (which is a forward reference).

For example, consider the following case:

import dataclasses
from typing import Optional

@dataclasses.dataclass
class Node:
    value: str
    next: Optional['Node'] = None

@dataclasses.dataclass
class LinkedList:
    head: Optional[Node] = None

a = Node("A")
b = Node("B")
c = Node("C")

a.next = b
b.next = c

linked_list = LinkedList(head=a)

print("list", dataclasses.asdict(linked_list))
print("A", dataclasses.asdict(a))
print("B", dataclasses.asdict(b))
print("C", dataclasses.asdict(c))

This gives us the expected:

list {'head': {'value': 'A', 'next': {'value': 'B', 'next': {'value': 'C', 'next': None}}}}
A {'value': 'A', 'next': {'value': 'B', 'next': {'value': 'C', 'next': None}}}
B {'value': 'B', 'next': {'value': 'C', 'next': None}}
C {'value': 'C', 'next': None}

The equivalent classes using mashumaro:

import dataclasses
from typing import Optional

from mashumaro import DataClassJSONMixin

@dataclasses.dataclass
class Node(DataClassJSONMixin):
    value: str
    next: Optional['Node'] = None

@dataclasses.dataclass
class LinkedList(DataClassJSONMixin):
    head: Optional[Node] = None

a = Node("A")
b = Node("B")
c = Node("C")

a.next = b
b.next = c

linked_list = LinkedList(head=a)

print("list", linked_list.to_dict())
print("A", a.to_dict())
print("B", b.to_dict())
print("C", c.to_dict())

throws an error during Mashumaro's code generation:

Traceback (most recent call last):
  File "mashumaro_test.py", line 7, in <module>
    class Node(DataClassJSONMixin):
  File "mashumaro/serializer/base/dict.py", line 19, in __init_subclass__
    raise exc
  File "mashumaro/serializer/base/dict.py", line 15, in __init_subclass__
    builder.add_to_dict()
  File "mashumaro/serializer/base/metaprogramming.py", line 201, in add_to_dict
    for fname, ftype in self.fields.items():
  File "mashumaro/serializer/base/metaprogramming.py", line 78, in fields
    return self.__get_fields()
  File "mashumaro/serializer/base/metaprogramming.py", line 69, in __get_fields
    for fname, ftype in typing.get_type_hints(self.cls).items():
  File "/usr/lib/python3.9/typing.py", line 1410, in get_type_hints
    value = _eval_type(value, base_globals, localns)
  File "/usr/lib/python3.9/typing.py", line 279, in _eval_type
    ev_args = tuple(_eval_type(a, globalns, localns, recursive_guard) for a in t.__args__)
  File "/usr/lib/python3.9/typing.py", line 279, in <genexpr>
    ev_args = tuple(_eval_type(a, globalns, localns, recursive_guard) for a in t.__args__)
  File "/usr/lib/python3.9/typing.py", line 277, in _eval_type
    return t._evaluate(globalns, localns, recursive_guard)
  File "/usr/lib/python3.9/typing.py", line 533, in _evaluate
    eval(self.__forward_code__, globalns, localns),
  File "<string>", line 1, in <module>
NameError: name 'Node' is not defined
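The traceback boils down to timing: mashumaro's `__init_subclass__` hook calls `typing.get_type_hints` while the class is still being created, before the name `Node` has been bound in the module namespace, so the forward reference cannot be evaluated yet. A small stdlib demo showing that the same call succeeds once the class statement has completed:

```python
import dataclasses
import typing
from typing import Optional

@dataclasses.dataclass
class Node:
    value: str
    next: Optional["Node"] = None

# After the class statement completes, the name "Node" exists in the module
# namespace, so the forward reference resolves fine:
hints = typing.get_type_hints(Node)
assert hints["next"] == Optional[Node]

# mashumaro's __init_subclass__ hook runs *during* class creation, before
# "Node" is bound, which is why the same call raises NameError there.
```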

Serializing any type via cattrs-like structure/unstructure

Hi, thanks for a great library!

A lot of field types are supported; is there a reason they can't be serialized on their own?
It would be nice to be able to do this:

mashumaro.structure({"test": 4}, collections.Counter[str])
mashumaro.unstructure(my_list_of_named_tuple_instances)
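The codec API mentioned in the Introduction addresses this use case in newer mashumaro releases. As a toy illustration of what a standalone `structure` function involves (names hypothetical, not mashumaro's API), here is a stdlib sketch for the `Counter` case:

```python
import collections
import typing

def structure(value, tp):
    """Hypothetical standalone decoder: build an instance of ``tp`` from basic data."""
    origin = typing.get_origin(tp) or tp
    if isinstance(origin, type) and issubclass(origin, collections.Counter):
        return collections.Counter(value)
    raise NotImplementedError(tp)

counter = structure({"test": 4}, collections.Counter[str])
assert counter["test"] == 4
```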

Inconsistent checks for invalid value type for str field type

If a class based on DataClassDictMixin has a field of type str, it will construct instances from data that contains other types for that field, including numbers, lists, and dicts. However, fields of other types, e.g. int, do not accept incompatible types. Not sure if this is intentional and I'm missing something here, but it seems like unexpected/undesirable behaviour when you want the input data to be validated.

The following example only throws an error on the very last line:

from dataclasses import dataclass
from mashumaro import DataClassDictMixin


@dataclass
class StrType(DataClassDictMixin):
    a: str

StrType.from_dict({'a': 1})
StrType.from_dict({'a': [1, 2]})
StrType.from_dict({'a': {'b': 1}})


@dataclass
class IntType(DataClassDictMixin):
    a: int

IntType.from_dict({'a': 'blah'})
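The asymmetry mirrors the built-in constructors: `str()` accepts almost anything (it just produces a textual representation), while `int()` rejects non-numeric strings. Assuming mashumaro casts field values with the field's type, str fields coerce silently while int fields raise. A stdlib demo of that asymmetry:

```python
# str() happily "converts" anything to its textual representation...
assert str(1) == "1"
assert str([1, 2]) == "[1, 2]"
assert str({"b": 1}) == "{'b': 1}"

# ...while int() rejects non-numeric input outright.
raised = False
try:
    int("blah")
except ValueError:
    raised = True
assert raised
```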

Handling defaults in dataclasses

QUESTION

When it comes to default values in dataclasses, is there any documentation on how they should be handled properly? Unfortunately, I haven't found anything related.

e.g. I have my schema below:

@dataclass
class LogLevel(DataClassYAMLMixin):
    level: str = field(default="INFO")

@dataclass
class Config(DataClassYAMLMixin):
    logging: LogLevel

yaml file:

logging:


There is no log level defined in the YAML file, only the 'logging' section, so I expected the dataclass to be created using defaults, but it is not:

Class: MyServiceClass, Mq(mq=MqSetup(setup=1, setup2='neco')), Config(logging=None)

instead of:

Class: MyServiceClass, Mq(mq=MqSetup(setup=1, setup2='neco')), Config(logging=LogLevel(level='INFO'))

Can this use case be handled by the mashumaro library, please?
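YAML parses an empty `logging:` section as null, so the deserializer receives `{"logging": None}` rather than a missing key, and the None wins over the field default. One workaround sketch: drop None values before deserializing so defaults apply (plain dataclasses here to show the idea; note `Config.logging` is given a `default_factory`, an adjustment to the schema above, so that an absent key falls back to `LogLevel()`):

```python
import dataclasses

@dataclasses.dataclass
class LogLevel:
    level: str = "INFO"

@dataclasses.dataclass
class Config:
    logging: LogLevel = dataclasses.field(default_factory=LogLevel)

def drop_none(data):
    """Remove keys whose value is None so dataclass defaults apply instead."""
    return {k: v for k, v in data.items() if v is not None}

# YAML "logging:\n" loads as {"logging": None}
parsed = {"logging": None}
config = Config(**drop_none(parsed))
assert config.logging == LogLevel(level="INFO")
```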

mypy raises error on writing msgpack to binary file

The line of code that saves a msgpack payload to a binary file causes mypy to raise a typing error. How can I fix the typing here? Save/load works fine, by the way.

@attr.s(auto_attribs=True)
class Phrase(DataClassMessagePackMixin):
    some_data: str
....

with open(base_data_path / "my_data", "wb") as f:
    f.write(phrase.to_msgpack())

Here, mypy raises typing error:
error: Argument 1 to "write" of "IO" has incompatible type "Union[str, bytes, bytearray]"; expected "bytes"

mypy --version

mypy 0.930
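Judging by the error message, the declared return type of `to_msgpack()` covers `str`, `bytes`, and `bytearray`, so mypy cannot assume `bytes` at the call site. One way to silence this is `typing.cast`, which narrows the type for the checker and is a no-op at runtime. A self-contained demo with a stub that mimics the reported signature (the stub is an assumption, not mashumaro's code):

```python
from typing import Union, cast

def to_msgpack_stub() -> Union[str, bytes, bytearray]:
    """Stub with the same return annotation mypy reported for to_msgpack()."""
    return b"\x82"

# typing.cast narrows the union for the type checker; it is a no-op at runtime.
payload: bytes = cast(bytes, to_msgpack_stub())
assert isinstance(payload, bytes)
```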

Custom IntEnum dumping

Hi, first of all, great package, congrats!
My question is the following: I would like an IntEnum class to be dumped to YAML as its name.
I tried the following:

class Example(IntEnum, SerializableType):
    A = 1
    B = 2

    def _serialize(self):
        return {"value": self.name}

    @classmethod
    def _deserialize(cls, value):
        return Example.__members__[value["value"]]

However I get this exception: TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

Is there any other way to achieve this?
Thanks!
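The conflict arises because IntEnum is built with the EnumMeta metaclass, which apparently clashes with the metaclass behind SerializableType. One direction that avoids mixing the bases entirely is to keep the enum plain and do the name-based round trip at the boundary; a stdlib sketch (helper names are hypothetical):

```python
from enum import IntEnum

class Example(IntEnum):
    A = 1
    B = 2

def serialize_by_name(member: IntEnum) -> dict:
    """The name-based representation the custom hooks were after."""
    return {"value": member.name}

def deserialize_by_name(enum_cls, data: dict) -> IntEnum:
    # Enum classes support lookup by member name via indexing.
    return enum_cls[data["value"]]

assert serialize_by_name(Example.A) == {"value": "A"}
assert deserialize_by_name(Example, {"value": "B"}) is Example.B
```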

`TypeError` using PEP-585 standard collection annotation types for Python <3.9

Summary

In release v2.8 of mashumaro, one of the updates includes support for PEP 585 compliance.

This implies supporting type annotations using the standard collection types list, tuple, ... instead of the corresponding generics typing.List, typing.Tuple, ...

However, on Python 3.7 or Python 3.8 a TypeError is raised.

Reproduce the Error

The following code snippet

from __future__ import annotations
from mashumaro import DataClassJSONMixin

# This works fine
x: list[int] = [1, 2, 3]
print(x)

# This raises TypeError on Python <3.9
class MyClass(DataClassJSONMixin):
    arg1: list[int]

runs without issues on Python 3.9, but produces an error on Python 3.7 and Python 3.8:

TypeError: 'type' object is not subscriptable

Environment

  • Operating System: Mac OS X
  • mashumaro Version: 2.10.1
  • Python Versions: 3.7.6, 3.8.1, 3.9.1
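With `from __future__ import annotations`, the annotation is stored as a string, so the module itself imports fine; the error only appears when something (here, mashumaro's code generation via `typing.get_type_hints`) evaluates that string at runtime, and on Python < 3.9 `list[int]` is not valid at runtime. A version-gated demo:

```python
import sys

# The annotation string is harmless until it is evaluated.
annotation = "list[int]"

if sys.version_info >= (3, 9):
    # PEP 585: builtin collections are subscriptable at runtime.
    assert eval(annotation) == list[int]
else:
    try:
        eval(annotation)
    except TypeError:
        pass  # "'type' object is not subscriptable"
```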

Provide options to to_dict and from_dict calls.

We have callouts that we use in our 'from_dict' and 'to_dict' calls that sometimes need to run transformations and sometimes must not. It's difficult to tell them which case applies to a particular call without the ability to pass options along. One way to do that would be to allow an 'options' dictionary on to_dict and from_dict that is passed through to the pre/post serialize/deserialize calls.
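Until such a parameter exists, one way to get call-specific context into pre/post hooks is a `contextvars.ContextVar` that the hooks read. A sketch with a dummy hook (all names here are hypothetical, not mashumaro's API):

```python
import contextvars
from contextlib import contextmanager

# Hypothetical channel for passing per-call options to serialization hooks.
_options: contextvars.ContextVar = contextvars.ContextVar("options", default={})

@contextmanager
def serialization_options(**options):
    """Temporarily expose options to any hooks that run inside the block."""
    token = _options.set(options)
    try:
        yield
    finally:
        _options.reset(token)

def post_serialize_hook(d: dict) -> dict:
    """Dummy hook: applies a transformation only when the caller opted in."""
    if _options.get().get("uppercase_keys"):
        return {k.upper(): v for k, v in d.items()}
    return d

assert post_serialize_hook({"a": 1}) == {"a": 1}
with serialization_options(uppercase_keys=True):
    assert post_serialize_hook({"a": 1}) == {"A": 1}
```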
