medcat.storage.jsonserialiser

Attributes

T

_def_registry

Classes

Serialiser

The abstract serialiser base class.

AvailableSerialisers

Describes the available serialisers.

TypeHandler

Protocol for handlers that encode and decode objects of a registered type.

TypeRegistry

TypeBasedHandler

Type handler base class tied to a single class (type_cls).

NumpyArrayHandler

Type handler for numpy.ndarray.

SetHandler

Type handler for set.

DateTimeHandler

Type handler for datetime.datetime.

DataClassHandler

Type handler for dataclass instances.

JsonSerialiser

Serialiser subclass for the JSON format.

Module Contents

class medcat.storage.jsonserialiser.Serialiser

Bases: abc.ABC

The abstract serialiser base class.

This class is responsible for both serialising and deserialising.

RAW_FILE = 'raw_dict.dat'

abstract property ser_type: AvailableSerialisers

The serialiser type.

Return type:

AvailableSerialisers

abstract serialise(raw_parts, target_file)

Serialise the raw attributes / objects.

Parameters:
  • raw_parts (dict[str, Any]) – The raw objects to serialise.

  • target_file (str) – The file name to write to.

Return type:

None

abstract deserialise(target_file)

Deserialise data written to the specified file.

Parameters:

target_file (str) – The file to read from.

Returns:

dict[str, Any] – The deserialised raw attributes / objects.

Return type:

dict[str, Any]
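
The serialise / deserialise contract above can be sketched as follows. This is a minimal illustration, not the medcat implementation: SketchSerialiser and SketchJsonSerialiser are hypothetical stand-ins, and storing the raw parts as plain JSON is an assumption.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any


class SketchSerialiser(ABC):
    """Hypothetical stand-in mirroring the two abstract methods above."""

    @abstractmethod
    def serialise(self, raw_parts: dict[str, Any], target_file: str) -> None:
        """Write the raw attributes to target_file."""

    @abstractmethod
    def deserialise(self, target_file: str) -> dict[str, Any]:
        """Read the raw attributes back from target_file."""


class SketchJsonSerialiser(SketchSerialiser):
    """Illustrative subclass that stores the raw parts as JSON (assumption)."""

    def serialise(self, raw_parts: dict[str, Any], target_file: str) -> None:
        with open(target_file, "w", encoding="utf-8") as f:
            json.dump(raw_parts, f)

    def deserialise(self, target_file: str) -> dict[str, Any]:
        with open(target_file, encoding="utf-8") as f:
            return json.load(f)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "raw_dict.dat")  # same value as RAW_FILE
    ser = SketchJsonSerialiser()
    ser.serialise({"name": "example", "count": 3}, path)
    restored = ser.deserialise(path)
```

Because both methods are abstract, instantiating SketchSerialiser directly would raise a TypeError; only subclasses that implement both can be created.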

classmethod get_ser_type_file(folder)
Parameters:

folder (str)

Return type:

str

save_ser_type_file(folder)

Save the serialiser type into the specified folder.

Parameters:

folder (str) – The folder to use.

Return type:

None

classmethod get_manually_serialised_path(folder)
Parameters:

folder (str)

Return type:

Optional[str]

check_ser_type(folder)

Check that the folder contains data serialised by this serialiser.

Parameters:

folder (str) – Target folder.

Raises:

TypeError – If the folder was not serialised by this serialiser.

Return type:

None
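
A rough sketch of the check described above, assuming the serialiser type is kept as a short text marker inside the folder. The marker file name and both helper functions are hypothetical; only the TypeError on mismatch comes from the description above.

```python
import os
import tempfile

SER_TYPE_FILE = ".serialised_by"  # hypothetical file name, not taken from medcat


def save_ser_type(folder: str, ser_type: str) -> None:
    """Record which serialiser wrote this folder."""
    with open(os.path.join(folder, SER_TYPE_FILE), "w", encoding="utf-8") as f:
        f.write(ser_type)


def check_ser_type(folder: str, expected: str) -> None:
    """Raise TypeError if the folder was written by a different serialiser."""
    with open(os.path.join(folder, SER_TYPE_FILE), encoding="utf-8") as f:
        found = f.read().strip()
    if found != expected:
        raise TypeError(f"Folder was serialised by {found!r}, expected {expected!r}")


with tempfile.TemporaryDirectory() as folder:
    save_ser_type(folder, "json")
    check_ser_type(folder, "json")  # matching type: no exception
    try:
        check_ser_type(folder, "dill")
        mismatch_detected = False
    except TypeError:
        mismatch_detected = True
```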

serialise_all(obj, target_folder, overwrite=False)

Serialise the entire object into the target folder.

This finds the serialisable parts (attributes) of the object and calls the same method on them recursively. It also finds the raw attributes (if any) and serialises them.

Parameters:
  • obj (Serialisable) – The object to serialise.

  • target_folder (str) – The target folder.

  • overwrite (bool) – Whether to allow overwriting. Defaults to False.

Raises:

IllegalSchemaException – If there are multiple parts with the same name, or if a file already exists.

Return type:

None
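
The recursion described above might look roughly like this. The sketch makes several assumptions: Part is a hypothetical stand-in for Serialisable, each part is written to its own sub-folder, raw attributes are stored as JSON, and a built-in FileExistsError replaces IllegalSchemaException.

```python
import json
import os
import tempfile
from typing import Any, Optional

RAW_FILE = "raw_dict.dat"  # same value as Serialiser.RAW_FILE


class Part:
    """Hypothetical serialisable object: raw attributes plus named sub-parts."""

    def __init__(self, raw: dict[str, Any],
                 parts: Optional[dict[str, "Part"]] = None) -> None:
        self.raw = raw
        self.parts = parts or {}


def serialise_all(obj: Part, target_folder: str, overwrite: bool = False) -> None:
    os.makedirs(target_folder, exist_ok=True)
    raw_path = os.path.join(target_folder, RAW_FILE)
    if os.path.exists(raw_path) and not overwrite:
        # The documented method raises IllegalSchemaException here.
        raise FileExistsError(raw_path)
    with open(raw_path, "w", encoding="utf-8") as f:
        json.dump(obj.raw, f)
    for name, part in obj.parts.items():
        # Assumed layout: one sub-folder per serialisable part, handled recursively.
        serialise_all(part, os.path.join(target_folder, name), overwrite)


with tempfile.TemporaryDirectory() as root:
    model = Part({"version": 1}, {"vocab": Part({"size": 10})})
    target = os.path.join(root, "model")
    serialise_all(model, target)
    written = sorted(
        os.path.relpath(os.path.join(d, f), target)
        for d, _, files in os.walk(target)
        for f in files
    )
    try:
        serialise_all(model, target)  # second write without overwrite=True
        refused = False
    except FileExistsError:
        refused = True
```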

classmethod deserialise_manually(folder_path, man_cls_path, **init_kwargs)
Parameters:
  • folder_path (str)

  • man_cls_path (str)

Return type:

medcat.storage.serialisables.Serialisable

deserialise_all(folder_path, ignore_folders_prefix=set(), ignore_folders_suffix=set(), **kwargs)

Deserialise contents of folder.

Additional initialisation keyword arguments can be provided if needed.

This loads both the raw attributes for this object as well as the serialisable parts / attributes recursively.

Parameters:
  • folder_path (str) – The folder path.

  • ignore_folders_prefix (set[str]) – The prefixes of folders to ignore.

  • ignore_folders_suffix (set[str]) – The suffixes of folders to ignore.

Returns:

Serialisable – The resulting object.

Return type:

medcat.storage.serialisables.Serialisable
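
As an illustration of the two ignore parameters, the helper below selects which sub-folders would be visited. folders_to_load is a hypothetical function written for this sketch; how medcat actually applies the prefixes and suffixes is assumed to be a simple startswith / endswith match on the folder name.

```python
import os
import tempfile
from typing import Iterable


def folders_to_load(folder_path: str,
                    ignore_folders_prefix: Iterable[str] = (),
                    ignore_folders_suffix: Iterable[str] = ()) -> list[str]:
    """Return sub-folder names that match no ignored prefix or suffix."""
    prefixes = tuple(ignore_folders_prefix)
    suffixes = tuple(ignore_folders_suffix)
    kept = []
    for name in sorted(os.listdir(folder_path)):
        if not os.path.isdir(os.path.join(folder_path, name)):
            continue  # plain files are not sub-parts
        if any(name.startswith(p) for p in prefixes):
            continue
        if any(name.endswith(s) for s in suffixes):
            continue
        kept.append(name)
    return kept


with tempfile.TemporaryDirectory() as root:
    for name in ("cdb", "vocab", "tmp_cache", "old.bak"):
        os.mkdir(os.path.join(root, name))
    kept = folders_to_load(root, {"tmp_"}, {".bak"})
```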

__slots__ = ()
class medcat.storage.jsonserialiser.AvailableSerialisers

Bases: enum.Enum

Describes the available serialisers.

dill
json
write_to(file_path)
Parameters:

file_path (str)

Return type:

None

classmethod from_file(file_path)
Parameters:

file_path (str)

Return type:

AvailableSerialisers
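
A small sketch of how write_to and from_file could round-trip an enum member through a file. SketchSerialisers is a hypothetical mirror of AvailableSerialisers, and storing the member name as text is an assumption about the on-disk format.

```python
import os
import tempfile
from enum import Enum, auto


class SketchSerialisers(Enum):
    """Hypothetical mirror of AvailableSerialisers with the same two members."""
    dill = auto()
    json = auto()

    def write_to(self, file_path: str) -> None:
        # Assumption: the member name is what gets stored.
        with open(file_path, "w", encoding="utf-8") as f:
            f.write(self.name)

    @classmethod
    def from_file(cls, file_path: str) -> "SketchSerialisers":
        with open(file_path, encoding="utf-8") as f:
            return cls[f.read().strip()]  # look the member up by name


with tempfile.TemporaryDirectory() as folder:
    marker = os.path.join(folder, "ser_type.txt")  # hypothetical file name
    SketchSerialisers.json.write_to(marker)
    loaded = SketchSerialisers.from_file(marker)
```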

__new__(value)
static _generate_next_value_(name, start, count, last_values)

Generate the next value when not given.

Parameters:
  • name – The name of the member.

  • start – The initial start value, or None.

  • count – The number of existing members.

  • last_values – The values assigned to earlier members.

classmethod _missing_(value)
__repr__()
__str__()
__dir__()

Returns all members and all public methods

__format__(format_spec)

Returns format using actual value type unless __str__ has been overridden.

__hash__()
__reduce_ex__(proto)
property name

The name of the Enum member.

property value

The value of the Enum member.

medcat.storage.jsonserialiser.T
class medcat.storage.jsonserialiser.TypeHandler

Bases: Protocol, Generic[T]

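Protocol for a handler that encodes and decodes objects of a single registered type T. The text below is the generic description inherited from typing.Protocol.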

Protocol classes are defined as:

class Proto(Protocol):
    def meth(self) -> int:
        ...

Such classes are primarily used with static type checkers that recognize structural subtyping (static duck-typing), for example:

class C:
    def meth(self) -> int:
        return 0

def func(x: Proto) -> int:
    return x.meth()

func(C())  # Passes static type check

See PEP 544 for details. Protocol classes decorated with @typing.runtime_checkable act as simple-minded runtime protocols that check only the presence of given attributes, ignoring their type signatures. Protocol classes can be generic; they are defined as:

class GenProto(Protocol[T]):
    def meth(self) -> T:
        ...

type_name: str
type_cls: type
should_encode(obj)
Parameters:

obj (Any)

Return type:

bool

encode(obj)

Encode an object of the registered type.

Parameters:

obj (T)

Return type:

Any

decode(obj)

Decode an object of the registered type.

Parameters:

obj (Any)

Return type:

T
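
To make the three methods concrete, here is a hedged sketch of a class that satisfies the protocol structurally. SketchTypeHandler restates the members listed above, and ComplexHandler is a hypothetical handler invented for this example; neither is part of medcat.

```python
from typing import Any, Protocol, TypeVar

T = TypeVar("T")


class SketchTypeHandler(Protocol[T]):
    """Hypothetical restatement of the TypeHandler members listed above."""
    type_name: str
    type_cls: type

    def should_encode(self, obj: Any) -> bool: ...
    def encode(self, obj: T) -> Any: ...
    def decode(self, obj: Any) -> T: ...


class ComplexHandler:
    """Satisfies the protocol structurally; no inheritance is needed."""
    type_name = "complex"
    type_cls = complex

    def should_encode(self, obj: Any) -> bool:
        return isinstance(obj, complex)

    def encode(self, obj: complex) -> Any:
        return [obj.real, obj.imag]  # JSON-friendly pair of floats

    def decode(self, obj: Any) -> complex:
        return complex(obj[0], obj[1])


def round_trip(handler: SketchTypeHandler[complex], value: complex) -> complex:
    """Encode and immediately decode a value with the given handler."""
    return handler.decode(handler.encode(value))


result = round_trip(ComplexHandler(), 1 + 2j)
```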

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)
class medcat.storage.jsonserialiser.TypeRegistry
_type_key = '__type__'
_data_key = 'data'
__init__()
Return type:

None

handlers: dict[str, TypeHandler]
register(handler)

Register a new type handler.

Parameters:

handler (TypeHandler)

Return type:

None

encode(obj)

Encode an object using the registered handler.

Parameters:

obj (Any)

Return type:

Any

decode(obj)

Decode an object using the registered handler.

Parameters:

obj (Any)

Return type:

Any
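
One plausible reading of register / encode / decode is sketched below. The key names '__type__' and 'data' come from the attributes above; wrapping the payload in a dict with those two keys, and everything else in SketchRegistry and BytesSketchHandler, is an assumption made for illustration.

```python
from typing import Any


class BytesSketchHandler:
    """Hypothetical handler that stores bytes as a hex string."""
    type_name = "bytes"
    type_cls = bytes

    def should_encode(self, obj: Any) -> bool:
        return isinstance(obj, bytes)

    def encode(self, obj: bytes) -> Any:
        return obj.hex()

    def decode(self, obj: Any) -> bytes:
        return bytes.fromhex(obj)


class SketchRegistry:
    _type_key = "__type__"  # key names taken from TypeRegistry above
    _data_key = "data"

    def __init__(self) -> None:
        self.handlers: dict[str, Any] = {}

    def register(self, handler: Any) -> None:
        self.handlers[handler.type_name] = handler

    def encode(self, obj: Any) -> Any:
        for handler in self.handlers.values():
            if handler.should_encode(obj):
                # Assumed wire format: a tagged dict naming the handler.
                return {self._type_key: handler.type_name,
                        self._data_key: handler.encode(obj)}
        raise TypeError(f"No handler registered for {type(obj).__name__}")

    def decode(self, obj: Any) -> Any:
        if isinstance(obj, dict) and self._type_key in obj:
            handler = self.handlers[obj[self._type_key]]
            return handler.decode(obj[self._data_key])
        return obj  # untagged values pass through unchanged


registry = SketchRegistry()
registry.register(BytesSketchHandler())
encoded = registry.encode(b"\x01\xff")
decoded = registry.decode(encoded)
```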

class medcat.storage.jsonserialiser.TypeBasedHandler

Bases: TypeHandler[T]

Type handler base class tied to a single class (type_cls).

should_encode(obj)
Parameters:

obj (Any)

Return type:

bool

type_name: str
type_cls: type
encode(obj)

Encode an object of the registered type.

Parameters:

obj (T)

Return type:

Any

decode(obj)

Decode an object of the registered type.

Parameters:

obj (Any)

Return type:

T
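
A sketch of what a type-based handler could look like, assuming should_encode is a plain isinstance check against type_cls. That behaviour is an assumption, and TypeBasedSketch and TupleSketchHandler are hypothetical names used only here.

```python
from typing import Any


class TypeBasedSketch:
    """Hypothetical base: subclasses set type_name / type_cls and the codecs."""
    type_name: str
    type_cls: type

    def should_encode(self, obj: Any) -> bool:
        # Assumed behaviour: a plain isinstance check against type_cls.
        return isinstance(obj, self.type_cls)


class TupleSketchHandler(TypeBasedSketch):
    type_name = "tuple"
    type_cls = tuple

    def encode(self, obj: tuple) -> Any:
        return list(obj)

    def decode(self, obj: Any) -> tuple:
        return tuple(obj)


handler = TupleSketchHandler()
checks = [handler.should_encode((1, 2)), handler.should_encode([1, 2])]
```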

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)
class medcat.storage.jsonserialiser.NumpyArrayHandler

Bases: TypeBasedHandler[numpy.ndarray]

Type handler for numpy.ndarray.

type_name = 'ndarray'
type_cls
_dtype_key = 'dtype'
_data_key = 'data'
_shape_key = 'shape'
encode(obj)

Encode numpy ndarray.

Parameters:

obj (numpy.ndarray)

Return type:

Any

decode(obj)

Decode to numpy ndarray.

Parameters:

obj (Any)

Return type:

numpy.ndarray

should_encode(obj)
Parameters:

obj (Any)

Return type:

bool

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)
class medcat.storage.jsonserialiser.SetHandler

Bases: TypeBasedHandler[set]

Type handler for set.

type_name = 'set'
type_cls
encode(obj)

Encode set.

Parameters:

obj (set)

Return type:

Any

decode(obj)

Decode to set.

Parameters:

obj (Any)

Return type:

set
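
JSON has no set type, so a handler has to go through some JSON-native form. The sketch below assumes a list is used; encode_set and decode_set are hypothetical free functions standing in for the two methods above.

```python
import json
from typing import Any


def encode_set(obj: set) -> Any:
    # Assumption: the elements are stored as a JSON array.
    return list(obj)


def decode_set(obj: Any) -> set:
    return set(obj)


original = {"a", "b", "c"}
text = json.dumps(encode_set(original))
restored = decode_set(json.loads(text))
```

Element order in the list is not meaningful, since it is discarded again when the set is rebuilt.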

should_encode(obj)
Parameters:

obj (Any)

Return type:

bool

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)
class medcat.storage.jsonserialiser.DateTimeHandler

Bases: TypeBasedHandler[datetime.datetime]

Type handler for datetime.datetime.

type_name = 'datetime'
type_cls
encode(obj)

Encode an object of the registered type.

Parameters:

obj (datetime.datetime)

Return type:

Any

decode(obj)

Decode an object of the registered type.

Parameters:

obj (Any)
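
One common way to store a datetime in JSON is as an ISO 8601 string. The sketch below assumes that representation; the actual encoding used by DateTimeHandler is not stated above, and both function names are hypothetical.

```python
import json
from datetime import datetime


def encode_datetime(obj: datetime) -> str:
    # Assumption: ISO 8601 text is used as the stored form.
    return obj.isoformat()


def decode_datetime(obj: str) -> datetime:
    return datetime.fromisoformat(obj)


stamp = datetime(2024, 5, 17, 9, 30, 15)
text = json.dumps(encode_datetime(stamp))
restored = decode_datetime(json.loads(text))
```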

should_encode(obj)
Parameters:

obj (Any)

Return type:

bool

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)
class medcat.storage.jsonserialiser.DataClassHandler

Bases: TypeHandler[T]

Type handler for dataclass instances.

type_name = 'dataclass'
type_cls
_cls_key = 'class-path'
_data_key = 'data'
should_encode(obj)
Parameters:

obj (Any)

Return type:

bool

encode(obj)

Encode an object of the registered type.

Parameters:

obj (T)

Return type:

Any

decode(obj)

Decode an object of the registered type.

Parameters:

obj (Any)

Return type:

T
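
The 'class-path' and 'data' keys suggest that an encoded dataclass records where its class lives alongside its field values. The sketch below follows that reading as an assumption. Span and DataClassSketchHandler are hypothetical, and a lookup table stands in for resolving the class from its path.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


@dataclass
class Span:
    """Hypothetical dataclass used only for this example."""
    start: int
    end: int


class DataClassSketchHandler:
    type_name = "dataclass"
    _cls_key = "class-path"  # key names taken from DataClassHandler above
    _data_key = "data"

    def __init__(self, known: dict[str, type]) -> None:
        # Assumption: a lookup table replaces importing the class by its path.
        self.known = known

    def should_encode(self, obj: Any) -> bool:
        # is_dataclass is also true for the class itself, so exclude types.
        return is_dataclass(obj) and not isinstance(obj, type)

    def encode(self, obj: Any) -> Any:
        cls = type(obj)
        path = f"{cls.__module__}.{cls.__qualname__}"
        return {self._cls_key: path, self._data_key: asdict(obj)}

    def decode(self, obj: Any) -> Any:
        cls = self.known[obj[self._cls_key]]
        return cls(**obj[self._data_key])


span_path = f"{Span.__module__}.{Span.__qualname__}"
handler = DataClassSketchHandler({span_path: Span})
encoded = handler.encode(Span(3, 8))
decoded = handler.decode(encoded)
```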

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)
medcat.storage.jsonserialiser._def_registry
class medcat.storage.jsonserialiser.JsonSerialiser

Bases: medcat.storage.serialisers.Serialiser

Serialiser subclass for the JSON format. It provides the serialise and deserialise methods that are abstract on Serialiser.

This class is responsible for both serialising and deserialising.

ser_type

The serialiser type.

serialise(raw_parts, target_file)

Serialise the raw attributes / objects.

Parameters:
  • raw_parts (dict[str, Any]) – The raw objects to serialise.

  • target_file (str) – The file name to write to.

Return type:

None

deserialise(target_file)

Deserialise data written to the specified file.

Parameters:

target_file (str) – The file to read from.

Returns:

dict[str, Any] – The deserialised raw attributes / objects.

Return type:

dict[str, Any]
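
A hedged sketch of how non-JSON types could be carried through a JSON file using the standard json hooks. Whether JsonSerialiser uses json.dump's default argument and json.load's object_hook is an assumption; to_jsonable and from_jsonable are hypothetical helpers, and the '__type__' / 'data' keys are borrowed from TypeRegistry.

```python
import json
import os
import tempfile
from typing import Any

TYPE_KEY, DATA_KEY = "__type__", "data"  # key names from TypeRegistry above


def to_jsonable(obj: Any) -> Any:
    # json.dump calls this only for objects it cannot serialise natively.
    if isinstance(obj, set):
        return {TYPE_KEY: "set", DATA_KEY: sorted(obj)}
    raise TypeError(f"Cannot serialise {type(obj).__name__}")


def from_jsonable(dct: dict) -> Any:
    # json.load calls this for every decoded JSON object, innermost first.
    if dct.get(TYPE_KEY) == "set":
        return set(dct[DATA_KEY])
    return dct


with tempfile.TemporaryDirectory() as folder:
    target_file = os.path.join(folder, "raw_dict.dat")
    raw_parts = {"names": {"a", "b"}, "count": 2}
    with open(target_file, "w", encoding="utf-8") as f:
        json.dump(raw_parts, f, default=to_jsonable)
    with open(target_file, encoding="utf-8") as f:
        restored = json.load(f, object_hook=from_jsonable)
```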

RAW_FILE = 'raw_dict.dat'
classmethod get_ser_type_file(folder)
Parameters:

folder (str)

Return type:

str

save_ser_type_file(folder)

Save the serialiser type into the specified folder.

Parameters:

folder (str) – The folder to use.

Return type:

None

classmethod get_manually_serialised_path(folder)
Parameters:

folder (str)

Return type:

Optional[str]

check_ser_type(folder)

Check that the folder contains data serialised by this serialiser.

Parameters:

folder (str) – Target folder.

Raises:

TypeError – If the folder was not serialised by this serialiser.

Return type:

None

serialise_all(obj, target_folder, overwrite=False)

Serialise the entire object into the target folder.

This finds the serialisable parts (attributes) of the object and calls the same method on them recursively. It also finds the raw attributes (if any) and serialises them.

Parameters:
  • obj (Serialisable) – The object to serialise.

  • target_folder (str) – The target folder.

  • overwrite (bool) – Whether to allow overwriting. Defaults to False.

Raises:

IllegalSchemaException – If there are multiple parts with the same name, or if a file already exists.

Return type:

None

classmethod deserialise_manually(folder_path, man_cls_path, **init_kwargs)
Parameters:
  • folder_path (str)

  • man_cls_path (str)

Return type:

medcat.storage.serialisables.Serialisable

deserialise_all(folder_path, ignore_folders_prefix=set(), ignore_folders_suffix=set(), **kwargs)

Deserialise contents of folder.

Additional initialisation keyword arguments can be provided if needed.

This loads both the raw attributes for this object as well as the serialisable parts / attributes recursively.

Parameters:
  • folder_path (str) – The folder path.

  • ignore_folders_prefix (set[str]) – The prefixes of folders to ignore.

  • ignore_folders_suffix (set[str]) – The suffixes of folders to ignore.

Returns:

Serialisable – The resulting object.

Return type:

medcat.storage.serialisables.Serialisable

__slots__ = ()