medcat.utils.regression.checking

Attributes

logger

UNKNOWN_METADATA

Exceptions

MalformedRegressionCaseException

Inappropriate argument value (of correct type).

Classes

CAT

This is a collection of serialisable model parts.

TranslationLayer

The translation layer for translating between CUIs, names, type_ids and child CUIs.

OptionSet

The targeting option set.

FinalTarget

The final target.

TargetedPhraseChanger

The target phrase changer.

MedCATTrainerExportConverter

Used to convert an MCT export to the format required for regression.

EditGetter

Base class for protocol classes.

MultiDescriptor

The descriptor of results over multiple different results (parts).

ResultDescriptor

The overarching result descriptor that handles multiple phrases.

Finding

Describes whether or how the finding verified.

BasicSpellChecker

RegressionCase

A regression case that has a name, defines options, filters and phrases.

MetaData

The metadata for the regression suite.

RegressionSuite

The regression checker.

Functions

partial_substitute(phrase, placeholder, name, nr)

Substitute all but 1 of the many placeholders present in the phrase.

pick_random_edits(edit_gen, num_to_pick, orig_len, ...)

get_ontology_and_version(model_card)

Attempt to get ontology (and its version) from a model card dict.

fix_np_float64(d)

Fix numpy.float64 in dictionary for yaml saving purposes.
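
The summary of partial_substitute above can be illustrated with a minimal sketch. It assumes that nr is the zero-based index of the one placeholder occurrence that is left in place; that reading, and the name partial_substitute_sketch, are assumptions for illustration rather than the library's implementation.

```python
def partial_substitute_sketch(phrase: str, placeholder: str, name: str, nr: int) -> str:
    """Replace every occurrence of `placeholder` with `name`, except occurrence `nr`.

    Assumption: `nr` is a zero-based index into the placeholder occurrences.
    """
    parts = phrase.split(placeholder)
    num_placeholders = len(parts) - 1
    if not 0 <= nr < num_placeholders:
        raise ValueError(f"nr={nr} is out of range for {num_placeholders} placeholder(s)")
    out = parts[0]
    for i, part in enumerate(parts[1:]):
        # Keep the placeholder only at position `nr`; substitute everywhere else.
        out += (placeholder if i == nr else name) + part
    return out


print(partial_substitute_sketch("{X} with {X} and {X}", "{X}", "fever", 1))
```

Keeping exactly one placeholder in the phrase leaves a single position that can still be targeted later.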

Module Contents

class medcat.utils.regression.checking.CAT(cdb, vocab=None, config=None, model_load_path=None)

Bases: medcat.storage.serialisables.AbstractSerialisable

This is a collection of serialisable model parts.

__init__(cdb, vocab=None, config=None, model_load_path=None)
Return type:

None

cdb
vocab = None
config = None
_trainer: medcat.trainer.Trainer | None = None
_pipeline
usage_monitor
_recreate_pipe(model_load_path=None)
Parameters:

model_load_path (Optional[str])

Return type:

medcat.pipeline.pipeline.Pipeline

classmethod get_init_attrs()
Return type:

list[str]

classmethod ignore_attrs()
Return type:

list[str]

__call__(text)
Parameters:

text (str)

Return type:

Optional[medcat.tokenizing.tokens.MutableDocument]

_ensure_not_training()

Method to ensure config is not set to train.

config.components.linking.train should only be True while training and not during inference. This also corrects the setting if necessary.

Return type:

None

get_entities(text: str, only_cui: Literal[False] = False) medcat.data.entities.Entities
get_entities(text: str, only_cui: Literal[True] = True) medcat.data.entities.OnlyCUIEntities
get_entities(text: str, only_cui: bool = False) dict | medcat.data.entities.Entities | medcat.data.entities.OnlyCUIEntities

Get the entities recognised and linked within the provided text.

This will run the text through the pipeline and annotate the recognised and linked entities.

Parameters:
  • text (str) – The text to use.

  • only_cui (bool, optional) – Whether to only output the CUIs rather than the entire context. Defaults to False.

Returns:

Union[dict, Entities, OnlyCUIEntities] – The entities found and linked within the text.

_mp_worker_func(texts_and_indices)
Parameters:

texts_and_indices (list[tuple[str, str, bool]])

Return type:

list[tuple[str, str, Union[dict, medcat.data.entities.Entities, medcat.data.entities.OnlyCUIEntities]]]

_generate_batches_by_char_length(text_iter, batch_size_chars, only_cui)
Parameters:
  • text_iter (Union[Iterator[str], Iterator[tuple[str, str]]])

  • batch_size_chars (int)

  • only_cui (bool)

Return type:

Iterator[list[tuple[str, str, bool]]]

_generate_batches(text_iter, batch_size, batch_size_chars, only_cui)
Parameters:
  • text_iter (Union[Iterator[str], Iterator[tuple[str, str]]])

  • batch_size (int)

  • batch_size_chars (int)

  • only_cui (bool)

Return type:

Iterator[list[tuple[str, str, bool]]]

_generate_simple_batches(text_iter, batch_size, only_cui)
Parameters:
  • text_iter (Union[Iterator[str], Iterator[tuple[str, str]]])

  • batch_size (int)

  • only_cui (bool)

Return type:

Iterator[list[tuple[str, str, bool]]]

_mp_one_batch_per_process(executor, batch_iter, external_processes)
Parameters:
  • executor (concurrent.futures.ProcessPoolExecutor)

  • batch_iter (Iterator[list[tuple[str, str, bool]]])

  • external_processes (int)

Return type:

Iterator[tuple[str, Union[dict, medcat.data.entities.Entities, medcat.data.entities.OnlyCUIEntities]]]

get_entities_multi_texts(texts, only_cui=False, n_process=1, batch_size=-1, batch_size_chars=1000000)

Get entities from multiple texts (potentially in parallel).

If n_process > 1, n_process - 1 new processes will be created and data will be processed on those as well as the main process in parallel.

Parameters:
  • texts (Union[Iterable[str], Iterable[tuple[str, str]]]) – The input text. Either an iterable of raw text or one in the format of (text_index, text).

  • only_cui (bool) – Whether to only return CUIs rather than other information like start/end and annotated value. Defaults to False.

  • n_process (int) – Number of processes to use. Defaults to 1.

  • batch_size (int) – The number of texts to batch at a time. A batch of the specified size will be given to each worker process. Defaults to -1 and in this case the character count will be used instead.

  • batch_size_chars (int) – The maximum number of characters to process in a batch. Each process will be given a batch of texts with a total number of characters not exceeding this value. Defaults to 1,000,000 characters. Set to -1 to disable.

Yields:

Iterator[tuple[str, Union[dict, Entities, OnlyCUIEntities]]] – The results in the format of (text_index, entities).

Return type:

Iterator[tuple[str, Union[dict, medcat.data.entities.Entities, medcat.data.entities.OnlyCUIEntities]]]
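
The character-based batching described for batch_size_chars can be sketched as follows. This is a simplified illustration, not the code behind _generate_batches_by_char_length: the function name batch_by_char_length is hypothetical, and letting a single over-long text form its own batch is an assumption.

```python
from typing import Iterable, Iterator


def batch_by_char_length(texts: Iterable[str], max_chars: int) -> Iterator[list[str]]:
    """Group texts into batches whose total character count stays within max_chars."""
    batch: list[str] = []
    batch_chars = 0
    for text in texts:
        # Close the current batch if adding this text would exceed the limit.
        if batch and batch_chars + len(text) > max_chars:
            yield batch
            batch, batch_chars = [], 0
        # Assumption: a text longer than max_chars still gets a batch of its own.
        batch.append(text)
        batch_chars += len(text)
    if batch:
        yield batch


for one_batch in batch_by_char_length(["aaaa", "bbb", "cc", "dddddd"], max_chars=7):
    print(one_batch)
```

Batching by characters rather than by number of texts keeps the work per process roughly even when document lengths vary widely.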

_get_entity(ent, doc_tokens, cui)
Return type:

medcat.data.entities.Entity

get_addon_output(ent)

Get the addon output for the entity.

This includes a key-value pair for each addon that provides some. Sometimes same-type addons may combine their output under the same key.

Parameters:

ent (MutableEntity) – The entity in question.

Raises:

ValueError – If unable to merge multiple addon output.

Returns:

dict[str, dict] – All the addon output.

Return type:

dict[str, dict]

_doc_to_out_entity(ent, doc_tokens, only_cui)
Parameters:
Return type:

tuple[int, Union[medcat.data.entities.Entity, str]]

_doc_to_out(doc, only_cui, out_with_text=False)
Parameters:
Return type:

Union[medcat.data.entities.Entities, medcat.data.entities.OnlyCUIEntities]

property trainer

The trainer object.

save_model_pack(target_folder, pack_name=DEFAULT_PACK_NAME, serialiser_type='dill', make_archive=True, only_archive=False, add_hash_to_pack_name=True, change_description=None)

Save model pack.

The resulting model pack name will have the hash of the model pack in its name if (and only if) the default model pack name is used.

Parameters:
  • target_folder (str) – The folder to save the pack in.

  • pack_name (str, optional) – The model pack name. Defaults to DEFAULT_PACK_NAME.

  • serialiser_type (Union[str, AvailableSerialisers], optional) – The serialiser type. Defaults to ‘dill’.

  • make_archive (bool) – Whether to make the archive (.zip file). Defaults to True.

  • only_archive (bool) – Whether to clear the non-compressed folder. Defaults to False.

  • add_hash_to_pack_name (bool) – Whether to add the hash to the pack name. This is only relevant if pack_name is specified. Defaults to True.

  • change_description (Optional[str]) – If provided, this description will be added to the model description. Defaults to None.

Returns:

str – The final model pack path.

Return type:

str

_get_hash()
Return type:

str

_versioning(change_description)
Parameters:

change_description (Optional[str])

Return type:

str

classmethod attempt_unpack(zip_path)

Attempt to unpack the ZIP to a folder and get the model pack path.

If the folder already exists, no unpacking is done.

Parameters:

zip_path (str) – The ZIP path

Returns:

str – The model pack path

Return type:

str
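
The "unpack only if the folder does not already exist" behaviour can be sketched with the standard library. This is an illustration under assumptions: the helper name unpack_if_needed is hypothetical, and deriving the folder name by dropping the ".zip" suffix is an assumption, not a documented detail of attempt_unpack.

```python
import os
import zipfile


def unpack_if_needed(zip_path: str) -> str:
    """Unpack zip_path next to itself unless the target folder already exists."""
    # Assumption: the target folder is the ZIP path without its ".zip" suffix.
    if zip_path.endswith(".zip"):
        folder = zip_path[: -len(".zip")]
    else:
        folder = zip_path + "_unpacked"
    if not os.path.isdir(folder):
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(folder)
    # If the folder was already there, nothing is unpacked and it is reused as-is.
    return folder
```

Reusing an existing folder avoids repeated extraction, but it also means a stale folder is not refreshed when the archive changes.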

classmethod load_model_pack(model_pack_path)

Load the model pack from file.

Parameters:

model_pack_path (str) – The model pack path.

Raises:

ValueError – If the saved data does not represent a model pack.

Returns:

CAT – The loaded model pack.

Return type:

CAT

classmethod load_cdb(model_pack_path)

Loads the concept database from the provided model pack path.

Parameters:

model_pack_path (str) – path to model pack, zip or dir.

Returns:

CDB – The loaded concept database

Return type:

medcat.cdb.CDB

get_model_card(as_dict: Literal[True]) medcat.data.model_card.ModelCard
get_model_card(as_dict: Literal[False]) str

Get the model card as either a (nested) dict or a JSON string.

Parameters:

as_dict (bool) – Whether to return as dict. Defaults to False.

Returns:

Union[str, ModelCard] – The model card.

__eq__(other)
Parameters:

other (Any)

Return type:

bool

add_addon(addon)
Parameters:

addon (medcat.components.addons.addons.AddonComponent)

Return type:

None

get_strategy()
Return type:

SerialisingStrategy

classmethod include_properties()
Return type:

list[str]

class medcat.utils.regression.checking.TranslationLayer(cui2info, name2info, cui2children, separator, whitespace=' ')

The translation layer for translating:
  • CUIs to names
  • names to CUIs
  • type_ids to CUIs
  • CUIs to child CUIs

The idea is to decouple these translations from the CDB instance in case something changes there.

Parameters:
  • cui2info (dict[str, CUIInfo]) – The map from CUI to names

  • name2info (dict[str, NameInfo]) – The map from name to CUIs

  • cui2type_ids (dict[str, set[str]]) – The map from CUI to type_ids

  • cui2children (dict[str, set[str]]) – The map from CUI to child CUIs

  • separator (str)

  • whitespace (str)

__init__(cui2info, name2info, cui2children, separator, whitespace=' ')
Return type:

None

cui2info
name2info
separator
whitespace = ' '
type_id2cuis: dict[str, set[str]]
cui2children
get_names_of(cui, only_prefnames)

Get the preprocessed names of a CUI.

This method preprocesses the names by replacing the separator (generally ~) with the appropriate whitespace (` `).

If the concept is not in the underlying CDB, an empty list is returned.

Parameters:
  • cui (str) – The concept in question.

  • only_prefnames (bool) – Whether to only return a preferred name.

Returns:

list[str] – The list of names.

Return type:

list[str]
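
The name preprocessing described for get_names_of can be sketched as follows. The plain dict cui2names and the function names_of are hypothetical stand-ins for the CUI lookup, and sorting the output is a choice made here only to keep the result deterministic.

```python
def names_of(cui2names: dict[str, set[str]], cui: str,
             separator: str = "~", whitespace: str = " ") -> list[str]:
    """Return the names of a CUI with the separator replaced by whitespace."""
    # An unknown CUI yields an empty list rather than an error.
    raw_names = cui2names.get(cui, set())
    return sorted(name.replace(separator, whitespace) for name in raw_names)


example = {"C0000001": {"heart~attack", "mi"}}  # hypothetical CUI and names
print(names_of(example, "C0000001"))
print(names_of(example, "C9999999"))
```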

get_preferred_name(cui)

Get the preferred name of a concept.

If no preferred name is found, the random ‘first’ name is selected.

Parameters:

cui (str) – The concept ID.

Returns:

str – The preferred name.

Return type:

str

get_first_name(cui)

Get the preprocessed (and potentially arbitrary) first name of the given concept.

If the concept does not exist, the CUI itself is returned.

PS: The “first” name may not be consistent across runs since it relies on set order.

Parameters:

cui (str) – The concept ID.

Returns:

str – The first name.

Return type:

str

get_direct_children(cui)

Get the direct children of a concept.

This means only the children, but not grandchildren.

If the underlying CDB doesn’t list children for this CUI, an empty list is returned.

Parameters:

cui (str) – The concept in question.

Returns:

list[str] – The (potentially empty) list of direct children.

Return type:

list[str]

get_direct_parents(cui)

Get the direct parent(s) of a concept.

PS: This method can be quite CPU-heavy, since it relies on running through all the parent-children relationships; the child->parent(s) relationship isn’t normally kept track of.

Parameters:

cui (str) – The concept in question.

Returns:

list[str] – The (potentially empty) list of direct parents.

Return type:

list[str]
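
The cost noted above comes from having only a parent-to-children map available. A minimal sketch of the reverse lookup, assuming a plain dict shaped like cui2children (the function name direct_parents is hypothetical):

```python
def direct_parents(cui2children: dict[str, set[str]], cui: str) -> list[str]:
    """Find every CUI that lists `cui` as a direct child."""
    # Every parent entry has to be inspected, so the cost grows with the
    # total number of parent-children relationships.
    return [parent for parent, children in cui2children.items() if cui in children]


hierarchy = {"A": {"B", "C"}, "D": {"B"}, "B": {"E"}}  # hypothetical CUIs
print(sorted(direct_parents(hierarchy, "B")))
```

If many parent lookups are needed, building the inverted child-to-parents map once would avoid repeating the full scan.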

get_children_of(found_cuis, cui, depth=1)

Get the children of the specified CUI in the listed CUIs (if they exist).

Parameters:
  • found_cuis (Iterable[str]) – The list of CUIs to look in

  • cui (str) – The target parent CUI

  • depth (int) – The depth to carry out the search for

Returns:

list[str] – The list of children found

Return type:

list[str]
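
A depth-limited search of this kind can be sketched as a level-by-level walk over the children map. The function name children_among and the exact traversal are assumptions for illustration; the library's method may order or deduplicate results differently.

```python
from typing import Iterable


def children_among(found_cuis: Iterable[str], cui2children: dict[str, set[str]],
                   cui: str, depth: int = 1) -> list[str]:
    """Return descendants of `cui` (up to `depth` levels) that occur in found_cuis."""
    found = set(found_cuis)
    seen = {cui}
    frontier = [cui]
    result: list[str] = []
    for _ in range(depth):
        next_frontier: list[str] = []
        for parent in frontier:
            # sorted() only makes the output order deterministic in this sketch.
            for child in sorted(cui2children.get(parent, ())):
                if child in seen:
                    continue
                seen.add(child)
                next_frontier.append(child)
                if child in found:
                    result.append(child)
        frontier = next_frontier
    return result


hierarchy = {"A": {"B"}, "B": {"C"}}  # hypothetical CUIs
print(children_among(["B", "C", "Z"], hierarchy, "A", depth=2))
```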

classmethod from_CDB(cdb)

Construct a TranslationLayer object from a concept database (CDB).

This translation layer will refer to the same dicts that the CDB refers to. While there is no obvious reason these should be modified, it’s something to keep in mind.

Parameters:

cdb (CDB) – The CDB

Returns:

TranslationLayer – The subsequent TranslationLayer

Return type:

TranslationLayer

class medcat.utils.regression.checking.OptionSet(/, **data)

Bases: pydantic.BaseModel

The targeting option set.

This describes all the target placeholders and concepts needed.

Parameters:

data (Any)

options: list[TargetPlaceholder]
allow_any_combinations: bool = False
classmethod from_dict(section)

Construct an OptionSet instance from a dict.

The assumed structure is:

    {
        'placeholders': [
            {
                'placeholder': <e.g. '{DIAGNOSIS}'>,
                'cuis': <the CUI>,
                'prefname-only': 'true'
            },
            <potentially more>
        ],
        'any-combination': <True or False>
    }

The prefname-only key is optional.

Parameters:

section (dict[str, Any]) – The dict to parse

Raises:
Returns:

OptionSet – The resulting OptionSet

Return type:

OptionSet
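
A concrete dict following the structure described for from_dict might look like the one below. The CUI values are made up, and giving 'cuis' as a list is an assumption. The helper missing_keys is hypothetical and only checks the shape of the dict; it does not reproduce the validation done by from_dict.

```python
section = {
    "placeholders": [
        # Hypothetical CUIs; 'prefname-only' is optional.
        {"placeholder": "{DIAGNOSIS}", "cuis": ["C0000001"], "prefname-only": "true"},
        {"placeholder": "{SYMPTOM}", "cuis": ["C0000002", "C0000003"]},
    ],
    "any-combination": False,
}


def missing_keys(section: dict) -> list[str]:
    """List the expected keys that are absent from a section dict."""
    problems = []
    if "placeholders" not in section:
        problems.append("placeholders")
    for i, placeholder in enumerate(section.get("placeholders", [])):
        for key in ("placeholder", "cuis"):
            if key not in placeholder:
                problems.append(f"placeholders[{i}].{key}")
    return problems


print(missing_keys(section))
```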

to_dict()

Convert the OptionSet to a dict.

Returns:

dict – The dict representation

Return type:

dict

_get_all_combinations(cur_opts, other_opts, translation)
Return type:

Iterator[tuple[PhraseChanger, str]]

estimate_num_of_subcases()

Get the number of distinct subcases.

This includes ones that can be calculated without the knowledge of the underlying CDB. I.e. it doesn’t take into account the number of names involved per CUI, only what is described in the option set itself.

If any combination is allowed, then the answer is the combination of the number of target concepts per option. If any combination is not allowed, then the answer is simply the number of target concepts for an option (they should all have the same number).

Returns:

int – The number of subcases.

Return type:

int
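
The counting rule described above can be sketched as follows. It reads "the combination of the number of target concepts per option" as their product, which is an interpretation; the function name estimate_subcases is hypothetical and the library method may scale the result differently.

```python
from math import prod


def estimate_subcases(cuis_per_option: list[int], allow_any_combinations: bool) -> int:
    """Estimate the number of subcases from the number of target CUIs per option."""
    if not cuis_per_option:
        return 0
    if allow_any_combinations:
        # Assumption: every CUI of one option can pair with every CUI of the others.
        return prod(cuis_per_option)
    # Otherwise the options are expected to have the same number of CUIs.
    return cuis_per_option[0]


print(estimate_subcases([2, 3], allow_any_combinations=True))
print(estimate_subcases([3, 3], allow_any_combinations=False))
```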

get_preprocessors_and_targets(translation)

Get the targeted phrase changers.

Parameters:

translation (TranslationLayer) – The translation layer.

Yields:

Iterator[TargetedPhraseChanger] – The target phrase changers.

Return type:

Iterator[TargetedPhraseChanger]

model_config: ClassVar[pydantic.config.ConfigDict]

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_fields: ClassVar[Dict[str, pydantic.fields.FieldInfo]]

Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo] objects.

This replaces Model.__fields__ from Pydantic V1.

model_computed_fields: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]]

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

__class_vars__: ClassVar[set[str]]

The names of the class variables defined on the model.

__private_attributes__: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]]

Metadata about the private attributes of the model.

__signature__: ClassVar[inspect.Signature]

The synthesized __init__ [Signature][inspect.Signature] of the model.

__pydantic_complete__: ClassVar[bool] = False

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__: ClassVar[pydantic_core.CoreSchema]

The core schema of the model.

__pydantic_custom_init__: ClassVar[bool]

Whether the model has a custom __init__ method.

__pydantic_decorators__: ClassVar[pydantic._internal._decorators.DecoratorInfos]

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__: ClassVar[pydantic._internal._generics.PydanticGenericMetadata]

Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.

__pydantic_parent_namespace__: ClassVar[Dict[str, Any] | None] = None

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__: ClassVar[None | Literal['model_post_init']]

The name of the post-init method for the model, if defined.

__pydantic_root_model__: ClassVar[bool] = False

Whether the model is a [RootModel][pydantic.root_model.RootModel].

__pydantic_serializer__: ClassVar[pydantic_core.SchemaSerializer]

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator]

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_extra__: dict[str, Any] | None

A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.

__pydantic_fields_set__: set[str]

The names of fields explicitly set during instantiation.

__pydantic_private__: dict[str, Any] | None

Values of private attributes set on the model instance.

__slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')
__init__(/, **data)

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or `None` if `config.extra` is not set to `"allow"`.

Return type:

dict[str, Any] | None

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

Return type:

set[str]

classmethod model_construct(_fields_set=None, **values)

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

!!! note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == ‘allow’, then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == ‘ignore’ (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == ‘forbid’ does not result in an error if extra values are passed, but they will be ignored.

Parameters:
  • _fields_set (set[str] | None) – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the [model_fields_set][pydantic.BaseModel.model_fields_set] attribute. Otherwise, the field names from the values argument will be used.

  • values (Any) – Trusted or pre-validated data dictionary.

Returns:

A new instance of the `Model` class with validated data.

Return type:

typing_extensions.Self

model_copy(*, update=None, deep=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy

Returns a copy of the model.

Parameters:
  • update (dict[str, Any] | None) – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep (bool) – Set to True to make a deep copy of the model.

Returns:

New model instance.

Return type:

typing_extensions.Self

model_dump(*, mode='python', include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters:
  • mode (Literal['json', 'python'] | str) – The mode in which to_python should run. If mode is ‘json’, the output will only contain JSON serializable types. If mode is ‘python’, the output may contain non-JSON-serializable Python objects.

  • include (IncEx | None) – A set of fields to include in the output.

  • exclude (IncEx | None) – A set of fields to exclude from the output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

Return type:

dict[str, Any]

model_dump_json(*, indent=None, include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json

Generates a JSON representation of the model using Pydantic’s to_json method.

Parameters:
  • indent (int | None) – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include (IncEx | None) – Field(s) to include in the JSON output.

  • exclude (IncEx | None) – Field(s) to exclude from the JSON output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to serialize using field aliases.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

Return type:

str

classmethod model_json_schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, schema_generator=GenerateJsonSchema, mode='validation')

Generates a JSON schema for a model class.

Parameters:
  • by_alias (bool) – Whether to use attribute aliases or not.

  • ref_template (str) – The reference template.

  • schema_generator (type[pydantic.json_schema.GenerateJsonSchema]) – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications

  • mode (pydantic.json_schema.JsonSchemaMode) – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

Return type:

dict[str, Any]

classmethod model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Parameters:

params (tuple[type[Any], Ellipsis]) – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where `params` are passed to `cls` as type variables.

Raises:

TypeError – Raised when trying to generate concrete names for non-generic models.

Return type:

str

model_post_init(__context)

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

__context (Any)

Return type:

None

classmethod model_rebuild(*, force=False, raise_errors=True, _parent_namespace_depth=2, _types_namespace=None)

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Parameters:
  • force (bool) – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors (bool) – Whether to raise errors, defaults to True.

  • _parent_namespace_depth (int) – The depth level of the parent namespace, defaults to 2.

  • _types_namespace (dict[str, Any] | None) – The types namespace, defaults to None.

Returns:
  • Returns `None` if the schema is already “complete” and rebuilding was not required.

  • If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.

Return type:

bool | None

classmethod model_validate(obj, *, strict=None, from_attributes=None, context=None)

Validate a pydantic model instance.

Parameters:
  • obj (Any) – The object to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • from_attributes (bool | None) – Whether to extract data from object attributes.

  • context (Any | None) – Additional context to pass to the validator.

Raises:

ValidationError – If the object could not be validated.

Returns:

The validated model instance.

Return type:

typing_extensions.Self

classmethod model_validate_json(json_data, *, strict=None, context=None)

Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing

Validate the given JSON data against the Pydantic model.

Parameters:
  • json_data (str | bytes | bytearray) – The JSON data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Raises:

ValidationError – If json_data is not a JSON string or the object could not be validated.

Return type:

typing_extensions.Self

classmethod model_validate_strings(obj, *, strict=None, context=None)

Validate the given object with string data against the Pydantic model.

Parameters:
  • obj (Any) – The object containing string data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Return type:

typing_extensions.Self

classmethod __get_pydantic_core_schema__(source, handler, /)

Hook into generating the model’s CoreSchema.

Parameters:
  • source (type[BaseModel]) – The class we are generating a schema for. This will generally be the same as the cls argument if this is a classmethod.

  • handler (pydantic.annotated_handlers.GetCoreSchemaHandler) – A callable that calls into Pydantic’s internal CoreSchema generation logic.

Returns:

A `pydantic-core` `CoreSchema`.

Return type:

pydantic_core.CoreSchema

classmethod __get_pydantic_json_schema__(core_schema, handler, /)

Hook into generating the model’s JSON schema.

Parameters:
  • core_schema (pydantic_core.CoreSchema) – A pydantic-core CoreSchema. You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema ({‘type’: ‘nullable’, ‘schema’: current_schema}), or just call the handler with the original schema.

  • handler (pydantic.annotated_handlers.GetJsonSchemaHandler) – Call into Pydantic’s internal JSON schema generation. This will raise a pydantic.errors.PydanticInvalidForJsonSchema if JSON schema generation fails. Since this gets called by BaseModel.model_json_schema you can override the schema_generator argument to that function to change JSON schema generation globally for a type.

Returns:

A JSON schema, as a Python object.

Return type:

pydantic.json_schema.JsonSchemaValue

classmethod __pydantic_init_subclass__(**kwargs)

This is intended to behave just like __init_subclass__, but is called by ModelMetaclass only after the class is actually fully initialized. In particular, attributes like model_fields will be present when this is called.

This is necessary because __init_subclass__ will always be called by type.__new__, and it would require a prohibitively large refactor to the ModelMetaclass to ensure that type.__new__ was called in such a manner that the class would already be sufficiently initialized.

This will receive the same kwargs that would be passed to the standard __init_subclass__, namely, any kwargs passed to the class definition that aren’t used internally by pydantic.

Parameters:

**kwargs (Any) – Any keyword arguments passed to the class definition that aren’t used internally by pydantic.

Return type:

None

classmethod __class_getitem__(typevar_values)
Parameters:

typevar_values (type[Any] | tuple[type[Any], Ellipsis])

Return type:

type[BaseModel] | pydantic._internal._forward_ref.PydanticRecursiveRef

__copy__()

Returns a shallow copy of the model.

Return type:

typing_extensions.Self

__deepcopy__(memo=None)

Returns a deep copy of the model.

Parameters:

memo (dict[int, Any] | None)

Return type:

typing_extensions.Self

__getattr__(item)
Parameters:

item (str)

Return type:

Any

_check_frozen(name, value)
Parameters:
  • name (str)

  • value (Any)

Return type:

None

__getstate__()
Return type:

dict[Any, Any]

__setstate__(state)
Parameters:

state (dict[Any, Any])

Return type:

None

__eq__(other)
Parameters:

other (Any)

Return type:

bool

classmethod __init_subclass__(**kwargs)

This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs.

```py
from pydantic import BaseModel

class MyModel(BaseModel, extra='allow'): ...
```

However, this may be deceiving, since the _actual_ calls to __init_subclass__ will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way ModelMetaclass.__new__ works.)

Parameters:

**kwargs (typing_extensions.Unpack[pydantic.config.ConfigDict]) – Keyword arguments passed to the class definition, which set model_config

Note

You may want to override __pydantic_init_subclass__ instead, which behaves similarly but is called after the class is fully initialized.

__iter__()

So dict(model) works.

Return type:

TupleGenerator
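
The reason dict(model) works is that dict() accepts any iterable of key-value pairs. A small standalone sketch of the same idea, using a hypothetical plain class instead of a pydantic model:

```python
class Point:
    """Hypothetical class used only to illustrate the iteration protocol."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __iter__(self):
        # Yield (field_name, value) tuples, which dict() can consume directly.
        yield from self.__dict__.items()


print(dict(Point(1, 2)))
```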

__repr__()
Return type:

str

__repr_args__()
Return type:

pydantic._internal._repr.ReprArgs

__repr_name__
__repr_str__
__pretty__
__rich_repr__
__str__()
Return type:

str

property __fields__: dict[str, pydantic.fields.FieldInfo]
Return type:

dict[str, pydantic.fields.FieldInfo]

property __fields_set__: set[str]
Return type:

set[str]

dict(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

Return type:

Dict[str, Any]

json(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, encoder=PydanticUndefined, models_as_dict=PydanticUndefined, **dumps_kwargs)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

  • encoder (Callable[[Any], Any] | None)

  • models_as_dict (bool)

  • dumps_kwargs (Any)

Return type:

str

classmethod parse_obj(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod parse_raw(b, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • b (str | bytes)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod parse_file(path, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • path (str | pathlib.Path)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod from_orm(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod construct(_fields_set=None, **values)
Parameters:
  • _fields_set (set[str] | None)

  • values (Any)

Return type:

typing_extensions.Self

copy(*, include=None, exclude=None, update=None, deep=False)

Returns a copy of the model.

Warning

This method is deprecated; use model_copy instead.

If you need include or exclude, use:

```py
data = self.model_dump(include=include, exclude=exclude, round_trip=True)
data = {**data, **(update or {})}
copied = self.model_validate(data)
```

Parameters:
  • include (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to include in the copied model.

  • exclude (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to exclude in the copied model.

  • update (Dict[str, Any] | None) – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep (bool) – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.

Return type:

typing_extensions.Self

classmethod schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE)
Parameters:
  • by_alias (bool)

  • ref_template (str)

Return type:

Dict[str, Any]

classmethod schema_json(*, by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, **dumps_kwargs)
Parameters:
  • by_alias (bool)

  • ref_template (str)

  • dumps_kwargs (Any)

Return type:

str

classmethod validate(value)
Parameters:

value (Any)

Return type:

typing_extensions.Self

classmethod update_forward_refs(**localns)
Parameters:

localns (Any)

Return type:

None

_iter(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_copy_and_set_values(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

classmethod _get_value(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_calculate_keys(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

class medcat.utils.regression.checking.FinalTarget(/, **data)

Bases: pydantic.BaseModel

The final target.

This involves the final phrase (which (potentially) has other placeholder replaced in it), the placeholder to be replaced, and the CUI and specific name being used.

Parameters:

data (Any)
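To make the role of the four fields concrete, here is a minimal sketch. It uses a hypothetical stand-in class, FinalTargetSketch, that declares the same field names as FinalTarget so that it runs without MedCAT installed. The placeholder syntax, the CUI and the phrase are made-up values, and the final str.replace call only illustrates the idea; it is not MedCAT's own substitution logic.

```python
from pydantic import BaseModel


class FinalTargetSketch(BaseModel):
    """Hypothetical stand-in with the same four fields as FinalTarget."""
    placeholder: str
    cui: str
    name: str
    final_phrase: str


target = FinalTargetSketch(
    placeholder="[CONCEPT]",   # assumed placeholder syntax
    cui="C0000001",            # made-up CUI
    name="example name",       # made-up concept name
    final_phrase="Patient presents with [CONCEPT].",
)

# Illustration only: fill the remaining placeholder with the chosen name.
text = target.final_phrase.replace(target.placeholder, target.name)
print(text)
```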

placeholder: str
cui: str
name: str
final_phrase: str
model_config: ClassVar[pydantic.config.ConfigDict]

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, pydantic.fields.FieldInfo]]

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

model_computed_fields: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]]

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

__class_vars__: ClassVar[set[str]]

The names of the class variables defined on the model.

__private_attributes__: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]]

Metadata about the private attributes of the model.

__signature__: ClassVar[inspect.Signature]

The synthesized __init__ signature (inspect.Signature) of the model.

__pydantic_complete__: ClassVar[bool] = False

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__: ClassVar[pydantic_core.CoreSchema]

The core schema of the model.

__pydantic_custom_init__: ClassVar[bool]

Whether the model has a custom __init__ method.

__pydantic_decorators__: ClassVar[pydantic._internal._decorators.DecoratorInfos]

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__: ClassVar[pydantic._internal._generics.PydanticGenericMetadata]

Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.

__pydantic_parent_namespace__: ClassVar[Dict[str, Any] | None] = None

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__: ClassVar[None | Literal['model_post_init']]

The name of the post-init method for the model, if defined.

__pydantic_root_model__: ClassVar[bool] = False

Whether the model is a pydantic.root_model.RootModel.

__pydantic_serializer__: ClassVar[pydantic_core.SchemaSerializer]

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator]

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_extra__: dict[str, Any] | None

A dictionary containing extra values, if pydantic.config.ConfigDict.extra is set to 'allow'.

__pydantic_fields_set__: set[str]

The names of fields explicitly set during instantiation.

__pydantic_private__: dict[str, Any] | None

Values of private attributes set on the model instance.

__slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')
__init__(/, **data)

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None
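A minimal sketch of the validation behaviour described above, assuming pydantic v2. FinalTargetSketch is a hypothetical stand-in with the documented FinalTarget field names; leaving out required fields raises ValidationError, and the error entries name the missing fields.

```python
from pydantic import BaseModel, ValidationError


class FinalTargetSketch(BaseModel):
    """Hypothetical stand-in with the same four fields as FinalTarget."""
    placeholder: str
    cui: str
    name: str
    final_phrase: str


try:
    # 'name' and 'final_phrase' are deliberately left out.
    FinalTargetSketch(placeholder="[CONCEPT]", cui="C0000001")
except ValidationError as err:
    # Each error entry carries the location of the offending field.
    missing = sorted(e["loc"][0] for e in err.errors())
    print(missing)
```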

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or `None` if `config.extra` is not set to `"allow"`.

Return type:

dict[str, Any] | None

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

Return type:

set[str]

classmethod model_construct(_fields_set=None, **values)

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

Note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == 'allow', then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == 'ignore' (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == 'forbid' does not result in an error if extra values are passed, but they will be ignored.

Parameters:
  • _fields_set (set[str] | None) – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the model_fields_set attribute. Otherwise, the field names from the values argument will be used.

  • values (Any) – Trusted or pre-validated data dictionary.

Returns:

A new instance of the `Model` class with validated data.

Return type:

typing_extensions.Self
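The following sketch shows what "no other validation is performed" implies, assuming pydantic v2. FinalTargetSketch is a hypothetical stand-in with the documented field names: a wrongly typed value is stored as given, and only the fields that were passed are recorded as set.

```python
from pydantic import BaseModel


class FinalTargetSketch(BaseModel):
    """Hypothetical stand-in with the same four fields as FinalTarget."""
    placeholder: str
    cui: str
    name: str
    final_phrase: str


# cui is declared as str, but model_construct() stores the int unchanged.
unchecked = FinalTargetSketch.model_construct(placeholder="[CONCEPT]", cui=123)

print(type(unchecked.cui).__name__)
print(sorted(unchecked.model_fields_set))
```

Because of this, model_construct() is best reserved for data that has already been validated elsewhere.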

model_copy(*, update=None, deep=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy

Returns a copy of the model.

Parameters:
  • update (dict[str, Any] | None) – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep (bool) – Set to True to make a deep copy of the model.

Returns:

New model instance.

Return type:

typing_extensions.Self
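A small sketch of model_copy with update, assuming pydantic v2 and a hypothetical stand-in class. It creates a variant of a target with a different name while leaving the original untouched. As noted above, the update values are not validated.

```python
from pydantic import BaseModel


class FinalTargetSketch(BaseModel):
    """Hypothetical stand-in with the same four fields as FinalTarget."""
    placeholder: str
    cui: str
    name: str
    final_phrase: str


original = FinalTargetSketch(
    placeholder="[CONCEPT]",
    cui="C0000001",            # made-up CUI
    name="example name",
    final_phrase="History of [CONCEPT].",
)

# Same target, different name; the original instance is not modified.
variant = original.model_copy(update={"name": "other name"})

print(original.name)
print(variant.name)
```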

model_dump(*, mode='python', include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters:
  • mode (Literal['json', 'python'] | str) – The mode in which to_python should run. If mode is 'json', the output will only contain JSON-serializable types. If mode is 'python', the output may contain non-JSON-serializable Python objects.

  • include (IncEx | None) – A set of fields to include in the output.

  • exclude (IncEx | None) – A set of fields to exclude from the output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a pydantic_core.PydanticSerializationError.

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

Return type:

dict[str, Any]
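The include and exclude parameters can be sketched as follows, assuming pydantic v2 and a hypothetical stand-in class with the documented field names. Both take a set of field names in the simplest case.

```python
from pydantic import BaseModel


class FinalTargetSketch(BaseModel):
    """Hypothetical stand-in with the same four fields as FinalTarget."""
    placeholder: str
    cui: str
    name: str
    final_phrase: str


target = FinalTargetSketch(
    placeholder="[CONCEPT]",
    cui="C0000001",            # made-up CUI
    name="example name",
    final_phrase="History of [CONCEPT].",
)

# Keep only the concept identification.
concept_only = target.model_dump(include={"cui", "name"})

# Keep everything except the phrase.
without_phrase = target.model_dump(exclude={"final_phrase"})

print(concept_only)
print(sorted(without_phrase))
```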

model_dump_json(*, indent=None, include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json

Generates a JSON representation of the model using Pydantic’s to_json method.

Parameters:
  • indent (int | None) – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include (IncEx | None) – Field(s) to include in the JSON output.

  • exclude (IncEx | None) – Field(s) to exclude from the JSON output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to serialize using field aliases.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a pydantic_core.PydanticSerializationError.

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

Return type:

str

classmethod model_json_schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, schema_generator=GenerateJsonSchema, mode='validation')

Generates a JSON schema for a model class.

Parameters:
  • by_alias (bool) – Whether to use attribute aliases or not.

  • ref_template (str) – The reference template.

  • schema_generator (type[pydantic.json_schema.GenerateJsonSchema]) – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications

  • mode (pydantic.json_schema.JsonSchemaMode) – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

Return type:

dict[str, Any]
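As an illustration of the returned schema, the sketch below inspects a few keys for a hypothetical stand-in class with the documented field names (assuming pydantic v2). All four fields are required strings, so they appear under "required" and are typed as "string".

```python
from pydantic import BaseModel


class FinalTargetSketch(BaseModel):
    """Hypothetical stand-in with the same four fields as FinalTarget."""
    placeholder: str
    cui: str
    name: str
    final_phrase: str


schema = FinalTargetSketch.model_json_schema()

print(schema["type"])
print(sorted(schema["required"]))
print(schema["properties"]["cui"]["type"])
```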

classmethod model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Parameters:

params (tuple[type[Any], Ellipsis]) – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where `params` are passed to `cls` as type variables.

Raises:

TypeError – Raised when trying to generate concrete names for non-generic models.

Return type:

str

model_post_init(__context)

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

__context (Any)

Return type:

None
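A minimal sketch of a cross-field check in model_post_init, assuming pydantic v2. CheckedTargetSketch is a hypothetical class, not part of MedCAT, and the rule that the placeholder must occur in the phrase is an assumption made for illustration.

```python
from typing import Any

from pydantic import BaseModel


class CheckedTargetSketch(BaseModel):
    """Hypothetical model that checks two fields against each other."""
    placeholder: str
    final_phrase: str

    def model_post_init(self, __context: Any) -> None:
        # Both fields are populated by the time this hook runs.
        if self.placeholder not in self.final_phrase:
            raise ValueError("placeholder does not occur in final_phrase")


ok = CheckedTargetSketch(placeholder="[CONCEPT]", final_phrase="History of [CONCEPT].")

try:
    CheckedTargetSketch(placeholder="[CONCEPT]", final_phrase="No slot here.")
    rejected = False
except ValueError:
    rejected = True

print(rejected)
```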

classmethod model_rebuild(*, force=False, raise_errors=True, _parent_namespace_depth=2, _types_namespace=None)

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Parameters:
  • force (bool) – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors (bool) – Whether to raise errors, defaults to True.

  • _parent_namespace_depth (int) – The depth level of the parent namespace, defaults to 2.

  • _types_namespace (dict[str, Any] | None) – The types namespace, defaults to None.

Returns:
  • Returns `None` if the schema is already “complete” and rebuilding was not required.

  • If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.

Return type:

bool | None

classmethod model_validate(obj, *, strict=None, from_attributes=None, context=None)

Validate a pydantic model instance.

Parameters:
  • obj (Any) – The object to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • from_attributes (bool | None) – Whether to extract data from object attributes.

  • context (Any | None) – Additional context to pass to the validator.

Raises:

ValidationError – If the object could not be validated.

Returns:

The validated model instance.

Return type:

typing_extensions.Self

classmethod model_validate_json(json_data, *, strict=None, context=None)

Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing

Validate the given JSON data against the Pydantic model.

Parameters:
  • json_data (str | bytes | bytearray) – The JSON data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Raises:

ValidationError – If json_data is not a JSON string or the object could not be validated.

Return type:

typing_extensions.Self
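A round trip through model_dump_json and model_validate_json can be sketched as below, assuming pydantic v2 and a hypothetical stand-in class with the documented field names. Input that is not valid JSON raises ValidationError.

```python
import json

from pydantic import BaseModel, ValidationError


class FinalTargetSketch(BaseModel):
    """Hypothetical stand-in with the same four fields as FinalTarget."""
    placeholder: str
    cui: str
    name: str
    final_phrase: str


target = FinalTargetSketch(
    placeholder="[CONCEPT]",
    cui="C0000001",            # made-up CUI
    name="example name",
    final_phrase="History of [CONCEPT].",
)

payload = target.model_dump_json()
restored = FinalTargetSketch.model_validate_json(payload)
print(restored == target)

try:
    FinalTargetSketch.model_validate_json("{not valid json")
    bad_json_rejected = False
except ValidationError:
    bad_json_rejected = True
```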

classmethod model_validate_strings(obj, *, strict=None, context=None)

Validate the given object with string data against the Pydantic model.

Parameters:
  • obj (Any) – The object containing string data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Return type:

typing_extensions.Self
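model_validate_strings is most useful when every value arrives as text, for example from a CSV row or an environment variable. The sketch below assumes pydantic v2; SubstitutionSketch and its nr field are hypothetical and serve only to show a string being parsed into an int.

```python
from pydantic import BaseModel


class SubstitutionSketch(BaseModel):
    """Hypothetical model with one non-string field."""
    placeholder: str
    nr: int


# All values are strings; nr is parsed into an int during validation.
parsed = SubstitutionSketch.model_validate_strings(
    {"placeholder": "[CONCEPT]", "nr": "2"}
)

print(type(parsed.nr).__name__)
```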

classmethod __get_pydantic_core_schema__(source, handler, /)

Hook into generating the model’s CoreSchema.

Parameters:
  • source (type[BaseModel]) – The class we are generating a schema for. This will generally be the same as the cls argument if this is a classmethod.

  • handler (pydantic.annotated_handlers.GetCoreSchemaHandler) – A callable that calls into Pydantic’s internal CoreSchema generation logic.

Returns:

A `pydantic-core` `CoreSchema`.

Return type:

pydantic_core.CoreSchema

classmethod __get_pydantic_json_schema__(core_schema, handler, /)

Hook into generating the model’s JSON schema.

Parameters:
  • core_schema (pydantic_core.CoreSchema) – A pydantic-core CoreSchema. You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema ({'type': 'nullable', 'schema': current_schema}), or just call the handler with the original schema.

  • handler (pydantic.annotated_handlers.GetJsonSchemaHandler) – Call into Pydantic’s internal JSON schema generation. This will raise a pydantic.errors.PydanticInvalidForJsonSchema if JSON schema generation fails. Since this gets called by BaseModel.model_json_schema you can override the schema_generator argument to that function to change JSON schema generation globally for a type.

Returns:

A JSON schema, as a Python object.

Return type:

pydantic.json_schema.JsonSchemaValue

classmethod __pydantic_init_subclass__(**kwargs)

This is intended to behave just like __init_subclass__, but is called by ModelMetaclass only after the class is actually fully initialized. In particular, attributes like model_fields will be present when this is called.

This is necessary because __init_subclass__ will always be called by type.__new__, and it would require a prohibitively large refactor to the ModelMetaclass to ensure that type.__new__ was called in such a manner that the class would already be sufficiently initialized.

This will receive the same kwargs that would be passed to the standard __init_subclass__, namely, any kwargs passed to the class definition that aren’t used internally by pydantic.

Parameters:

**kwargs (Any) – Any keyword arguments passed to the class definition that aren’t used internally by pydantic.

Return type:

None

classmethod __class_getitem__(typevar_values)
Parameters:

typevar_values (type[Any] | tuple[type[Any], Ellipsis])

Return type:

type[BaseModel] | pydantic._internal._forward_ref.PydanticRecursiveRef

__copy__()

Returns a shallow copy of the model.

Return type:

typing_extensions.Self

__deepcopy__(memo=None)

Returns a deep copy of the model.

Parameters:

memo (dict[int, Any] | None)

Return type:

typing_extensions.Self
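The difference between the shallow and deep copy only shows up with mutable field values. The sketch below assumes pydantic v2 and uses a hypothetical model, NamesSketch, with a list field: the shallow copy shares the list with the original, while the deep copy does not.

```python
import copy

from pydantic import BaseModel


class NamesSketch(BaseModel):
    """Hypothetical model with a mutable field, for illustration only."""
    cui: str
    names: list[str]


original = NamesSketch(cui="C0000001", names=["first name"])
shallow = copy.copy(original)       # calls __copy__
deep = copy.deepcopy(original)      # calls __deepcopy__

# The shallow copy shares the list object with the original.
shallow.names.append("second name")

print(original.names)
print(deep.names)
```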

__getattr__(item)
Parameters:

item (str)

Return type:

Any

_check_frozen(name, value)
Parameters:
  • name (str)

  • value (Any)

Return type:

None

__getstate__()
Return type:

dict[Any, Any]

__setstate__(state)
Parameters:

state (dict[Any, Any])

Return type:

None

__eq__(other)
Parameters:

other (Any)

Return type:

bool

classmethod __init_subclass__(**kwargs)

This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs.

```py
from pydantic import BaseModel

class MyModel(BaseModel, extra='allow'):
    ...
```

However, this may be deceiving, since the _actual_ calls to __init_subclass__ will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way ModelMetaclass.__new__ works.)

Parameters:

**kwargs (typing_extensions.Unpack[pydantic.config.ConfigDict]) – Keyword arguments passed to the class definition, which set model_config

Note

You may want to override __pydantic_init_subclass__ instead, which behaves similarly but is called after the class is fully initialized.

__iter__()

So dict(model) works.

Return type:

TupleGenerator

__repr__()
Return type:

str

__repr_args__()
Return type:

pydantic._internal._repr.ReprArgs

__repr_name__
__repr_str__
__pretty__
__rich_repr__
__str__()
Return type:

str

property __fields__: dict[str, pydantic.fields.FieldInfo]
Return type:

dict[str, pydantic.fields.FieldInfo]

property __fields_set__: set[str]
Return type:

set[str]

dict(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

Return type:

Dict[str, Any]

json(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, encoder=PydanticUndefined, models_as_dict=PydanticUndefined, **dumps_kwargs)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

  • encoder (Callable[[Any], Any] | None)

  • models_as_dict (bool)

  • dumps_kwargs (Any)

Return type:

str

classmethod parse_obj(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod parse_raw(b, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • b (str | bytes)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod parse_file(path, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • path (str | pathlib.Path)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod from_orm(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod construct(_fields_set=None, **values)
Parameters:
  • _fields_set (set[str] | None)

  • values (Any)

Return type:

typing_extensions.Self

copy(*, include=None, exclude=None, update=None, deep=False)

Returns a copy of the model.

Warning

This method is deprecated; use model_copy instead.

If you need include or exclude, use:

```py
data = self.model_dump(include=include, exclude=exclude, round_trip=True)
data = {**data, **(update or {})}
copied = self.model_validate(data)
```

Parameters:
  • include (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to include in the copied model.

  • exclude (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to exclude in the copied model.

  • update (Dict[str, Any] | None) – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep (bool) – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.

Return type:

typing_extensions.Self

classmethod schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE)
Parameters:
  • by_alias (bool)

  • ref_template (str)

Return type:

Dict[str, Any]

classmethod schema_json(*, by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, **dumps_kwargs)
Parameters:
  • by_alias (bool)

  • ref_template (str)

  • dumps_kwargs (Any)

Return type:

str

classmethod validate(value)
Parameters:

value (Any)

Return type:

typing_extensions.Self

classmethod update_forward_refs(**localns)
Parameters:

localns (Any)

Return type:

None

_iter(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_copy_and_set_values(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

classmethod _get_value(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_calculate_keys(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

class medcat.utils.regression.checking.TargetedPhraseChanger(/, **data)

Bases: pydantic.BaseModel

The target phrase changer.

It includes the phrase changer (for preprocessing) along with the relevant concept and the placeholder it will replace.

Parameters:

data (Any)

changer: PhraseChanger
placeholder: str
cui: str
onlyprefnames: bool
model_config: ClassVar[pydantic.config.ConfigDict]

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, pydantic.fields.FieldInfo]]

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

model_computed_fields: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]]

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

__class_vars__: ClassVar[set[str]]

The names of the class variables defined on the model.

__private_attributes__: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]]

Metadata about the private attributes of the model.

__signature__: ClassVar[inspect.Signature]

The synthesized __init__ signature (inspect.Signature) of the model.

__pydantic_complete__: ClassVar[bool] = False

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__: ClassVar[pydantic_core.CoreSchema]

The core schema of the model.

__pydantic_custom_init__: ClassVar[bool]

Whether the model has a custom __init__ method.

__pydantic_decorators__: ClassVar[pydantic._internal._decorators.DecoratorInfos]

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__: ClassVar[pydantic._internal._generics.PydanticGenericMetadata]

Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.

__pydantic_parent_namespace__: ClassVar[Dict[str, Any] | None] = None

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__: ClassVar[None | Literal['model_post_init']]

The name of the post-init method for the model, if defined.

__pydantic_root_model__: ClassVar[bool] = False

Whether the model is a pydantic.root_model.RootModel.

__pydantic_serializer__: ClassVar[pydantic_core.SchemaSerializer]

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator]

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_extra__: dict[str, Any] | None

A dictionary containing extra values, if pydantic.config.ConfigDict.extra is set to 'allow'.

__pydantic_fields_set__: set[str]

The names of fields explicitly set during instantiation.

__pydantic_private__: dict[str, Any] | None

Values of private attributes set on the model instance.

__slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')
__init__(/, **data)

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or `None` if `config.extra` is not set to `"allow"`.

Return type:

dict[str, Any] | None

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

Return type:

set[str]

classmethod model_construct(_fields_set=None, **values)

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

Note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == 'allow', then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == 'ignore' (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == 'forbid' does not result in an error if extra values are passed, but they will be ignored.

Parameters:
  • _fields_set (set[str] | None) – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the model_fields_set attribute. Otherwise, the field names from the values argument will be used.

  • values (Any) – Trusted or pre-validated data dictionary.

Returns:

A new instance of the `Model` class with validated data.

Return type:

typing_extensions.Self

model_copy(*, update=None, deep=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy

Returns a copy of the model.

Parameters:
  • update (dict[str, Any] | None) – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep (bool) – Set to True to make a deep copy of the model.

Returns:

New model instance.

Return type:

typing_extensions.Self

model_dump(*, mode='python', include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters:
  • mode (Literal['json', 'python'] | str) – The mode in which to_python should run. If mode is 'json', the output will only contain JSON-serializable types. If mode is 'python', the output may contain non-JSON-serializable Python objects.

  • include (IncEx | None) – A set of fields to include in the output.

  • exclude (IncEx | None) – A set of fields to exclude from the output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a pydantic_core.PydanticSerializationError.

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

Return type:

dict[str, Any]

model_dump_json(*, indent=None, include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json

Generates a JSON representation of the model using Pydantic’s to_json method.

Parameters:
  • indent (int | None) – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include (IncEx | None) – Field(s) to include in the JSON output.

  • exclude (IncEx | None) – Field(s) to exclude from the JSON output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to serialize using field aliases.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, and "error" raises a pydantic_core.PydanticSerializationError.

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

Return type:

str

classmethod model_json_schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, schema_generator=GenerateJsonSchema, mode='validation')

Generates a JSON schema for a model class.

Parameters:
  • by_alias (bool) – Whether to use attribute aliases or not.

  • ref_template (str) – The reference template.

  • schema_generator (type[pydantic.json_schema.GenerateJsonSchema]) – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications

  • mode (pydantic.json_schema.JsonSchemaMode) – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

Return type:

dict[str, Any]

classmethod model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Parameters:

params (tuple[type[Any], Ellipsis]) – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where `params` are passed to `cls` as type variables.

Raises:

TypeError – Raised when trying to generate concrete names for non-generic models.

Return type:

str

model_post_init(__context)

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

__context (Any)

Return type:

None

classmethod model_rebuild(*, force=False, raise_errors=True, _parent_namespace_depth=2, _types_namespace=None)

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Parameters:
  • force (bool) – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors (bool) – Whether to raise errors, defaults to True.

  • _parent_namespace_depth (int) – The depth level of the parent namespace, defaults to 2.

  • _types_namespace (dict[str, Any] | None) – The types namespace, defaults to None.

Returns:
  • Returns `None` if the schema is already “complete” and rebuilding was not required.

  • If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.

Return type:

bool | None

classmethod model_validate(obj, *, strict=None, from_attributes=None, context=None)

Validate a pydantic model instance.

Parameters:
  • obj (Any) – The object to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • from_attributes (bool | None) – Whether to extract data from object attributes.

  • context (Any | None) – Additional context to pass to the validator.

Raises:

ValidationError – If the object could not be validated.

Returns:

The validated model instance.

Return type:

typing_extensions.Self

classmethod model_validate_json(json_data, *, strict=None, context=None)

Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing

Validate the given JSON data against the Pydantic model.

Parameters:
  • json_data (str | bytes | bytearray) – The JSON data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Raises:

ValidationError – If json_data is not a JSON string or the object could not be validated.

Return type:

typing_extensions.Self

classmethod model_validate_strings(obj, *, strict=None, context=None)

Validate the given object with string data against the Pydantic model.

Parameters:
  • obj (Any) – The object containing string data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Return type:

typing_extensions.Self

classmethod __get_pydantic_core_schema__(source, handler, /)

Hook into generating the model’s CoreSchema.

Parameters:
  • source (type[BaseModel]) – The class we are generating a schema for. This will generally be the same as the cls argument if this is a classmethod.

  • handler (pydantic.annotated_handlers.GetCoreSchemaHandler) – A callable that calls into Pydantic’s internal CoreSchema generation logic.

Returns:

A `pydantic-core` `CoreSchema`.

Return type:

pydantic_core.CoreSchema

classmethod __get_pydantic_json_schema__(core_schema, handler, /)

Hook into generating the model’s JSON schema.

Parameters:
  • core_schema (pydantic_core.CoreSchema) – A pydantic-core CoreSchema. You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema ({'type': 'nullable', 'schema': current_schema}), or just call the handler with the original schema.

  • handler (pydantic.annotated_handlers.GetJsonSchemaHandler) – Call into Pydantic’s internal JSON schema generation. This will raise a pydantic.errors.PydanticInvalidForJsonSchema if JSON schema generation fails. Since this gets called by BaseModel.model_json_schema you can override the schema_generator argument to that function to change JSON schema generation globally for a type.

Returns:

A JSON schema, as a Python object.

Return type:

pydantic.json_schema.JsonSchemaValue

classmethod __pydantic_init_subclass__(**kwargs)

This is intended to behave just like __init_subclass__, but is called by ModelMetaclass only after the class is actually fully initialized. In particular, attributes like model_fields will be present when this is called.

This is necessary because __init_subclass__ will always be called by type.__new__, and it would require a prohibitively large refactor to the ModelMetaclass to ensure that type.__new__ was called in such a manner that the class would already be sufficiently initialized.

This will receive the same kwargs that would be passed to the standard __init_subclass__, namely, any kwargs passed to the class definition that aren’t used internally by pydantic.

Parameters:

**kwargs (Any) – Any keyword arguments passed to the class definition that aren’t used internally by pydantic.

Return type:

None

classmethod __class_getitem__(typevar_values)
Parameters:

typevar_values (type[Any] | tuple[type[Any], Ellipsis])

Return type:

type[BaseModel] | pydantic._internal._forward_ref.PydanticRecursiveRef

__copy__()

Returns a shallow copy of the model.

Return type:

typing_extensions.Self

__deepcopy__(memo=None)

Returns a deep copy of the model.

Parameters:

memo (dict[int, Any] | None)

Return type:

typing_extensions.Self

__getattr__(item)
Parameters:

item (str)

Return type:

Any

_check_frozen(name, value)
Parameters:
  • name (str)

  • value (Any)

Return type:

None

__getstate__()
Return type:

dict[Any, Any]

__setstate__(state)
Parameters:

state (dict[Any, Any])

Return type:

None

__eq__(other)
Parameters:

other (Any)

Return type:

bool

classmethod __init_subclass__(**kwargs)

This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs.

```py
from pydantic import BaseModel

class MyModel(BaseModel, extra='allow'):
    ...
```

However, this may be deceiving, since the _actual_ calls to __init_subclass__ will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way ModelMetaclass.__new__ works.)

Parameters:

**kwargs (typing_extensions.Unpack[pydantic.config.ConfigDict]) – Keyword arguments passed to the class definition, which set model_config

Note

You may want to override __pydantic_init_subclass__ instead, which behaves similarly but is called after the class is fully initialized.

__iter__()

So dict(model) works.

Return type:

TupleGenerator
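
The remark that dict(model) works follows from the dict constructor accepting any iterable of key/value pairs. A minimal sketch of that mechanism, using a plain class as a hypothetical stand-in (PairIterable is not part of pydantic or medcat):

```python
class PairIterable:
    """Hypothetical stand-in that yields (field name, value) pairs."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __iter__(self):
        # dict() accepts any iterable of 2-item pairs, so yielding
        # (name, value) tuples is enough for dict(obj) to work.
        yield from self._fields.items()


demo = PairIterable(name="suite", parts=[])
as_dict = dict(demo)
```

Because the class defines no keys() method, dict() falls back to iterating over the object and unpacking each yielded pair.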

__repr__()
Return type:

str

__repr_args__()
Return type:

pydantic._internal._repr.ReprArgs

__repr_name__
__repr_str__
__pretty__
__rich_repr__
__str__()
Return type:

str

property __fields__: dict[str, pydantic.fields.FieldInfo]
Return type:

dict[str, pydantic.fields.FieldInfo]

property __fields_set__: set[str]
Return type:

set[str]

dict(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

Return type:

Dict[str, Any]

json(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, encoder=PydanticUndefined, models_as_dict=PydanticUndefined, **dumps_kwargs)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

  • encoder (Callable[[Any], Any] | None)

  • models_as_dict (bool)

  • dumps_kwargs (Any)

Return type:

str

classmethod parse_obj(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod parse_raw(b, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • b (str | bytes)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod parse_file(path, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • path (str | pathlib.Path)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod from_orm(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod construct(_fields_set=None, **values)
Parameters:
  • _fields_set (set[str] | None)

  • values (Any)

Return type:

typing_extensions.Self

copy(*, include=None, exclude=None, update=None, deep=False)

Returns a copy of the model.

Warning

This method is now deprecated; use model_copy instead.

If you need include or exclude, use:

```py
data = self.model_dump(include=include, exclude=exclude, round_trip=True)
data = {**data, **(update or {})}
copied = self.model_validate(data)
```

Parameters:
  • include (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to include in the copied model.

  • exclude (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to exclude in the copied model.

  • update (Dict[str, Any] | None) – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep (bool) – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.

Return type:

typing_extensions.Self

classmethod schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE)
Parameters:
  • by_alias (bool)

  • ref_template (str)

Return type:

Dict[str, Any]

classmethod schema_json(*, by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, **dumps_kwargs)
Parameters:
  • by_alias (bool)

  • ref_template (str)

  • dumps_kwargs (Any)

Return type:

str

classmethod validate(value)
Parameters:

value (Any)

Return type:

typing_extensions.Self

classmethod update_forward_refs(**localns)
Parameters:

localns (Any)

Return type:

None

_iter(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_copy_and_set_values(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

classmethod _get_value(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_calculate_keys(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

medcat.utils.regression.checking.partial_substitute(phrase, placeholder, name, nr)

Substitute all but 1 of the many placeholders present in the phrase.

First, the first nr placeholders are replaced. Then the next (1) placeholder is replaced with a temporary one. After that, the rest of the placeholders are replaced. Finally, the temporary placeholder is restored to its original form.

Example

If we have phrase = "some [PH] and [PH] we [PH]", placeholder = "[PH]", and name = 'NAME', we get the following for each value of nr:

0: "some [PH] and NAME we NAME"
1: "some NAME and [PH] we NAME"
2: "some NAME and NAME we [PH]"

Parameters:
  • phrase (str) – The phrase in question.

  • placeholder (str) – The placeholder to replace.

  • name (str) – The name to substitute for the placeholder.

  • nr (int) – The number of the target to keep.

Raises:

IncompatiblePhraseException – If the number of placeholders in the phrase is 1 or the number to be kept is too high; or the phrase has the temporary placeholder.

Returns:

str – The partially substituted phrase.

Return type:

str
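
The four steps described above can be sketched with plain str.replace calls. This is an illustrative re-implementation under stated assumptions, not the library's code: the function name partial_substitute_sketch, the temp marker and the use of ValueError (instead of IncompatiblePhraseException) are all hypothetical.

```python
def partial_substitute_sketch(phrase: str, placeholder: str, name: str,
                              nr: int, temp: str = "###TEMP###") -> str:
    """Replace every placeholder with `name` except the one at index `nr`."""
    count = phrase.count(placeholder)
    # Assumed error conditions, mirroring the documented ones.
    if count <= 1 or not 0 <= nr < count or temp in phrase:
        raise ValueError("phrase is incompatible with partial substitution")
    # 1. Replace the first `nr` placeholders with the name.
    phrase = phrase.replace(placeholder, name, nr)
    # 2. Protect the next placeholder with a temporary marker.
    phrase = phrase.replace(placeholder, temp, 1)
    # 3. Replace all remaining placeholders with the name.
    phrase = phrase.replace(placeholder, name)
    # 4. Restore the protected placeholder.
    return phrase.replace(temp, placeholder)


results = [partial_substitute_sketch("some [PH] and [PH] we [PH]", "[PH]", "NAME", nr)
           for nr in range(3)]
```

The temporary marker is what keeps the one placeholder to be kept apart from the ones substituted in step 3, which is why a phrase that already contains the marker has to be rejected.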

class medcat.utils.regression.checking.MedCATTrainerExportConverter(mct_export, use_only_existing_name=False)

Used to convert an MCT export to the format required for regression.

Parameters:
TEMP_PLACEHOLDER = '##[SWAPME-{}-{}]##'
__init__(mct_export, use_only_existing_name=False)
Parameters:
Return type:

None

mct_export
use_only_existing_name = False
_get_placeholder(cui, nr)
Parameters:
  • cui (str)

  • nr (int)

Return type:

str

convert()

Converts the MedCATtrainer export into regression suite dict.

I.e. this should produce a dict in the same format as one read from a regression suite YAML.

Returns:

dict – The Regression-suite compatible dict.

Return type:

dict

_iter_docs()
Return type:

Iterator[tuple[str, str, Iterator[tuple[int, int, str, str]]]]

_iter_anns_backwards(doc)
Parameters:

doc (medcat.data.mctexport.MedCATTrainerExportDocument)

Return type:

Iterator[tuple[int, int, str, str]]
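
The name _iter_anns_backwards suggests that annotations are visited from the end of the document towards its start. A common reason for doing so, assumed here rather than confirmed by the source, is that replacing a later span first leaves the character offsets of earlier spans valid. The sketch below shows that idea with a simplified, hypothetical annotation shape of (start, end, cui); the helper name and the argument order passed to the template are assumptions as well.

```python
# Same template string as the documented TEMP_PLACEHOLDER attribute.
TEMP_PLACEHOLDER = "##[SWAPME-{}-{}]##"


def replace_spans_backwards(text, anns):
    """Swap each annotated span for a placeholder, last span first.

    anns is a list of (start, end, cui) tuples (hypothetical shape).
    """
    ordered = sorted(anns, key=lambda ann: ann[0])
    # Walk from the last annotation to the first so that earlier
    # offsets are untouched by replacements made further right.
    for nr in range(len(ordered) - 1, -1, -1):
        start, end, cui = ordered[nr]
        text = text[:start] + TEMP_PLACEHOLDER.format(cui, nr) + text[end:]
    return text


text = "Patient has diabetes and hypertension."
anns = [(12, 20, "C1"), (25, 37, "C2")]
result = replace_spans_backwards(text, anns)
```

Processing the spans front to back would instead require shifting every later offset by the length difference of each replacement.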

medcat.utils.regression.checking.pick_random_edits(edit_gen, num_to_pick, orig_len, edit_distance, rng_seed)
Parameters:
  • edit_gen (Union[list[str], set[str], Iterator[str]])

  • num_to_pick (int)

  • orig_len (int)

  • edit_distance (int)

  • rng_seed (int)

Return type:

Iterator[str]

class medcat.utils.regression.checking.EditGetter

Bases: Protocol

Base class for protocol classes.

Protocol classes are defined as:

class Proto(Protocol):
    def meth(self) -> int:
        ...

Such classes are primarily used with static type checkers that recognize structural subtyping (static duck-typing), for example:

class C:
    def meth(self) -> int:
        return 0

def func(x: Proto) -> int:
    return x.meth()

func(C())  # Passes static type check

See PEP 544 for details. Protocol classes decorated with @typing.runtime_checkable act as simple-minded runtime protocols that check only the presence of given attributes, ignoring their type signatures. Protocol classes can be generic, they are defined as:

class GenProto(Protocol[T]):
    def meth(self) -> T:
        ...
__call__(word, use_diacritics=False, return_ordered=False)
Parameters:
  • word (str)

  • use_diacritics (bool)

  • return_ordered (bool)

Return type:

Union[Iterator[str], set[str], list[str]]
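
Because EditGetter is a Protocol, any callable with a matching signature satisfies it structurally, without inheriting from it. A minimal sketch, assuming Python 3.9+; EditGetterLike is a local mirror of the documented signature, and deletion_edits and count_edits are hypothetical helpers, none of them part of medcat:

```python
from typing import Iterator, Protocol, Union


class EditGetterLike(Protocol):
    """Local mirror of the documented __call__ signature (hypothetical name)."""

    def __call__(self, word: str, use_diacritics: bool = False,
                 return_ordered: bool = False
                 ) -> Union[Iterator[str], set[str], list[str]]:
        ...


def deletion_edits(word: str, use_diacritics: bool = False,
                   return_ordered: bool = False
                   ) -> Union[Iterator[str], set[str], list[str]]:
    """Toy edit generator: every string made by deleting one character."""
    edits = {word[:i] + word[i + 1:] for i in range(len(word))}
    # use_diacritics is accepted only for signature compatibility.
    return sorted(edits) if return_ordered else edits


def count_edits(getter: EditGetterLike, word: str) -> int:
    """Accepts any callable that matches the protocol."""
    return len(list(getter(word, return_ordered=True)))


n_edits = count_edits(deletion_edits, "cat")
```

A static type checker accepts deletion_edits as an EditGetterLike purely because the parameter names, defaults and return type line up.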

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)
class medcat.utils.regression.checking.MultiDescriptor(/, **data)

Bases: pydantic.BaseModel

The descriptor of results over multiple different results (parts).

The idea is that this would likely be used with a regression suite and it would incorporate all the different regression cases it describes.

Parameters:

data (Any)

name: str

The name of the collection being checked

parts: list[ResultDescriptor] = []

The parts kept track of

property findings: dict[Finding, int]

The total findings.

Returns:

dict[Finding, int] – The total count of each finding across all parts.

Return type:

dict[Finding, int]
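
Totalling findings over several parts amounts to summing per-part counts keyed by finding. The sketch below shows one way to do that with collections.Counter; DemoFinding and total_findings are hypothetical stand-ins, and the member names are illustrative rather than taken from the real Finding enum.

```python
from collections import Counter
from enum import Enum, auto


class DemoFinding(Enum):
    """Hypothetical stand-in for Finding; member names are illustrative."""
    IDENTICAL = auto()
    FAIL = auto()


def total_findings(per_part):
    """Sum a list of {finding: count} dicts into one dict."""
    totals = Counter()
    for part_counts in per_part:
        # Counter.update adds counts instead of overwriting them.
        totals.update(part_counts)
    return dict(totals)


parts = [
    {DemoFinding.IDENTICAL: 3, DemoFinding.FAIL: 1},
    {DemoFinding.IDENTICAL: 2},
]
totals = total_findings(parts)
```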

iter_examples(strictness_threshold)

Iterate over all relevant examples.

Only examples that are not in the strictness matrix for the specified threshold will be used.

Parameters:

strictness_threshold (Strictness) – The threshold of avoidance.

Yields:

Iterable[tuple[FinalTarget, tuple[Finding, Optional[str]]]] – The examples

Return type:

Iterable[tuple[medcat.utils.regression.targeting.FinalTarget, tuple[Finding, Optional[str]]]]

_get_part_report(part, allowed_findings, total_findings, hide_empty, examples_strictness, phrases_separately, phrase_max_len)
Parameters:
  • part (ResultDescriptor)

  • allowed_findings (set[Finding])

  • total_findings (dict[Finding, int])

  • hide_empty (bool)

  • examples_strictness (Optional[Strictness])

  • phrases_separately (bool)

  • phrase_max_len (int)

Return type:

tuple[str, int, int, int]

calculate_report(phrases_separately=False, hide_empty=False, examples_strictness=Strictness.STRICTEST, strictness=Strictness.NORMAL, phrase_max_len=80)

Calculate some of the major parts of the report.

Parameters:
  • phrases_separately (bool) – Whether to include per-phrase information

  • hide_empty (bool) – Whether to hide empty cases

  • examples_strictness (Optional[Strictness]) – What level of strictness to show for examples. Set to None to disable examples. Defaults to Strictness.STRICTEST.

  • strictness (Strictness) – The strictness of the success / fail overview. Defaults to Strictness.NORMAL.

  • phrase_max_len (int) – The maximum length of the phrase in examples. Defaults to 80.

Returns:

tuple[int, int, int, str, int] – The total number of examples, the total successes, the total failures, the delegated part, and the number of empty cases.

Return type:

tuple[int, int, int, str, int]

get_report(phrases_separately, hide_empty=False, examples_strictness=Strictness.STRICTEST, strictness=Strictness.NORMAL, phrase_max_len=80)

Get the report associated with this descriptor

Parameters:
  • phrases_separately (bool) – Whether to include per-phrase information

  • hide_empty (bool) – Whether to hide empty cases

  • examples_strictness (Optional[Strictness]) – What level of strictness to show for examples. Set to None to disable examples. Defaults to Strictness.STRICTEST.

  • strictness (Strictness) – The strictness of the success / fail overview. Defaults to Strictness.NORMAL.

  • phrase_max_len (int) – The maximum length of the phrase in examples. Defaults to 80.

Returns:

str – The report string

Return type:

str

model_dump(**kwargs)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters:
  • mode – The mode in which to_python should run. If mode is ‘json’, the output will only contain JSON serializable types. If mode is ‘python’, the output may contain non-JSON-serializable Python objects.

  • include – A set of fields to include in the output.

  • exclude – A set of fields to exclude from the output.

  • context – Additional context to pass to the serializer.

  • by_alias – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults – Whether to exclude fields that are set to their default value.

  • exclude_none – Whether to exclude fields that have a value of None.

  • round_trip – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, and "error" raises a pydantic_core.PydanticSerializationError.

  • serialize_as_any – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

Return type:

dict

model_config: ClassVar[pydantic.config.ConfigDict]

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, pydantic.fields.FieldInfo]]

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

model_computed_fields: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]]

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

__class_vars__: ClassVar[set[str]]

The names of the class variables defined on the model.

__private_attributes__: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]]

Metadata about the private attributes of the model.

__signature__: ClassVar[inspect.Signature]

The synthesized __init__ signature (inspect.Signature) of the model.

__pydantic_complete__: ClassVar[bool] = False

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__: ClassVar[pydantic_core.CoreSchema]

The core schema of the model.

__pydantic_custom_init__: ClassVar[bool]

Whether the model has a custom __init__ method.

__pydantic_decorators__: ClassVar[pydantic._internal._decorators.DecoratorInfos]

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__: ClassVar[pydantic._internal._generics.PydanticGenericMetadata]

Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.

__pydantic_parent_namespace__: ClassVar[Dict[str, Any] | None] = None

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__: ClassVar[None | Literal['model_post_init']]

The name of the post-init method for the model, if defined.

__pydantic_root_model__: ClassVar[bool] = False

Whether the model is a pydantic.root_model.RootModel.

__pydantic_serializer__: ClassVar[pydantic_core.SchemaSerializer]

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator]

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_extra__: dict[str, Any] | None

A dictionary containing extra values, if extra (pydantic.config.ConfigDict.extra) is set to 'allow'.

__pydantic_fields_set__: set[str]

The names of fields explicitly set during instantiation.

__pydantic_private__: dict[str, Any] | None

Values of private attributes set on the model instance.

__slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')
__init__(/, **data)

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or `None` if `config.extra` is not set to `"allow"`.

Return type:

dict[str, Any] | None

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

Return type:

set[str]

classmethod model_construct(_fields_set=None, **values)

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

Note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == ‘allow’, then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == ‘ignore’ (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == ‘forbid’ does not result in an error if extra values are passed, but they will be ignored.

Parameters:
  • _fields_set (set[str] | None) – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the model_fields_set (pydantic.BaseModel.model_fields_set) attribute. Otherwise, the field names from the values argument will be used.

  • values (Any) – Trusted or pre-validated data dictionary.

Returns:

A new instance of the `Model` class with validated data.

Return type:

typing_extensions.Self

model_copy(*, update=None, deep=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy

Returns a copy of the model.

Parameters:
  • update (dict[str, Any] | None) – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep (bool) – Set to True to make a deep copy of the model.

Returns:

New model instance.

Return type:

typing_extensions.Self

model_dump_json(*, indent=None, include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json

Generates a JSON representation of the model using Pydantic’s to_json method.

Parameters:
  • indent (int | None) – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include (IncEx | None) – Field(s) to include in the JSON output.

  • exclude (IncEx | None) – Field(s) to exclude from the JSON output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to serialize using field aliases.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, and "error" raises a pydantic_core.PydanticSerializationError.

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

Return type:

str

classmethod model_json_schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, schema_generator=GenerateJsonSchema, mode='validation')

Generates a JSON schema for a model class.

Parameters:
  • by_alias (bool) – Whether to use attribute aliases or not.

  • ref_template (str) – The reference template.

  • schema_generator (type[pydantic.json_schema.GenerateJsonSchema]) – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications

  • mode (pydantic.json_schema.JsonSchemaMode) – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

Return type:

dict[str, Any]

classmethod model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Parameters:

params (tuple[type[Any], Ellipsis]) – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where `params` are passed to `cls` as type variables.

Raises:

TypeError – Raised when trying to generate concrete names for non-generic models.

Return type:

str

model_post_init(__context)

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

__context (Any)

Return type:

None

classmethod model_rebuild(*, force=False, raise_errors=True, _parent_namespace_depth=2, _types_namespace=None)

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Parameters:
  • force (bool) – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors (bool) – Whether to raise errors, defaults to True.

  • _parent_namespace_depth (int) – The depth level of the parent namespace, defaults to 2.

  • _types_namespace (dict[str, Any] | None) – The types namespace, defaults to None.

Returns:
  • Returns `None` if the schema is already “complete” and rebuilding was not required.

  • If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.

Return type:

bool | None

classmethod model_validate(obj, *, strict=None, from_attributes=None, context=None)

Validate a pydantic model instance.

Parameters:
  • obj (Any) – The object to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • from_attributes (bool | None) – Whether to extract data from object attributes.

  • context (Any | None) – Additional context to pass to the validator.

Raises:

ValidationError – If the object could not be validated.

Returns:

The validated model instance.

Return type:

typing_extensions.Self

classmethod model_validate_json(json_data, *, strict=None, context=None)

Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing

Validate the given JSON data against the Pydantic model.

Parameters:
  • json_data (str | bytes | bytearray) – The JSON data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Raises:

ValidationError – If json_data is not a JSON string or the object could not be validated.

Return type:

typing_extensions.Self
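The two entry points above differ only in what they accept: `model_validate` takes an already-parsed Python object, while `model_validate_json` takes raw JSON text. A small sketch, assuming pydantic v2; `CaseSummary` is a hypothetical model, not one defined by this module.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class CaseSummary(BaseModel):
    """Hypothetical model used only for this illustration."""

    name: str
    passed: int
    note: Optional[str] = None


# Validate an already-parsed Python object (here, a dict).
from_dict = CaseSummary.model_validate({"name": "case-1", "passed": 3})

# Validate raw JSON text directly, without calling json.loads first.
from_json = CaseSummary.model_validate_json('{"name": "case-1", "passed": 3}')

print(from_dict == from_json)

# Invalid input raises ValidationError instead of returning a partial model.
try:
    CaseSummary.model_validate({"name": "case-2", "passed": "many"})
except ValidationError as err:
    print(err.error_count(), "validation error(s)")
```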

classmethod model_validate_strings(obj, *, strict=None, context=None)

Validate the given object with string data against the Pydantic model.

Parameters:
  • obj (Any) – The object containing string data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Return type:

typing_extensions.Self
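String-only input typically comes from sources such as environment variables or query parameters. The following sketch, assuming pydantic v2, shows string values being parsed into typed fields; `RunInfo` and its field names are hypothetical.

```python
from datetime import date

from pydantic import BaseModel


class RunInfo(BaseModel):
    """Hypothetical model used only for this illustration."""

    run_date: date
    num_cases: int
    verbose: bool


# Every value is a string; each one is parsed into the declared field type.
raw = {"run_date": "2024-05-01", "num_cases": "12", "verbose": "true"}
info = RunInfo.model_validate_strings(raw)
print(info)
```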

classmethod __get_pydantic_core_schema__(source, handler, /)

Hook into generating the model’s CoreSchema.

Parameters:
  • source (type[BaseModel]) – The class we are generating a schema for. This will generally be the same as the cls argument if this is a classmethod.

  • handler (pydantic.annotated_handlers.GetCoreSchemaHandler) – A callable that calls into Pydantic’s internal CoreSchema generation logic.

Returns:

A `pydantic-core` `CoreSchema`.

Return type:

pydantic_core.CoreSchema

classmethod __get_pydantic_json_schema__(core_schema, handler, /)

Hook into generating the model’s JSON schema.

Parameters:
  • core_schema (pydantic_core.CoreSchema) – A pydantic-core CoreSchema. You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema ({‘type’: ‘nullable’, ‘schema’: current_schema}), or just call the handler with the original schema.

  • handler (pydantic.annotated_handlers.GetJsonSchemaHandler) – Call into Pydantic’s internal JSON schema generation. This will raise a pydantic.errors.PydanticInvalidForJsonSchema if JSON schema generation fails. Since this gets called by BaseModel.model_json_schema you can override the schema_generator argument to that function to change JSON schema generation globally for a type.

Returns:

A JSON schema, as a Python object.

Return type:

pydantic.json_schema.JsonSchemaValue

classmethod __pydantic_init_subclass__(**kwargs)

This is intended to behave just like __init_subclass__, but is called by ModelMetaclass only after the class is actually fully initialized. In particular, attributes like model_fields will be present when this is called.

This is necessary because __init_subclass__ will always be called by type.__new__, and it would require a prohibitively large refactor to the ModelMetaclass to ensure that type.__new__ was called in such a manner that the class would already be sufficiently initialized.

This will receive the same kwargs that would be passed to the standard __init_subclass__, namely, any kwargs passed to the class definition that aren’t used internally by pydantic.

Parameters:

**kwargs (Any) – Any keyword arguments passed to the class definition that aren’t used internally by pydantic.

Return type:

None

classmethod __class_getitem__(typevar_values)
Parameters:

typevar_values (type[Any] | tuple[type[Any], Ellipsis])

Return type:

type[BaseModel] | pydantic._internal._forward_ref.PydanticRecursiveRef

__copy__()

Returns a shallow copy of the model.

Return type:

typing_extensions.Self

__deepcopy__(memo=None)

Returns a deep copy of the model.

Parameters:

memo (dict[int, Any] | None)

Return type:

typing_extensions.Self

__getattr__(item)
Parameters:

item (str)

Return type:

Any

_check_frozen(name, value)
Parameters:
  • name (str)

  • value (Any)

Return type:

None

__getstate__()
Return type:

dict[Any, Any]

__setstate__(state)
Parameters:

state (dict[Any, Any])

Return type:

None

__eq__(other)
Parameters:

other (Any)

Return type:

bool

classmethod __init_subclass__(**kwargs)

This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs.

```py
from pydantic import BaseModel

class MyModel(BaseModel, extra='allow'):
    ...
```

However, this may be deceiving, since the _actual_ calls to __init_subclass__ will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way ModelMetaclass.__new__ works.)

Parameters:

**kwargs (typing_extensions.Unpack[pydantic.config.ConfigDict]) – Keyword arguments passed to the class definition, which set model_config

Note

You may want to override __pydantic_init_subclass__ instead, which behaves similarly but is called after the class is fully initialized.

__iter__()

So dict(model) works.

Return type:

TupleGenerator

__repr__()
Return type:

str

__repr_args__()
Return type:

pydantic._internal._repr.ReprArgs

__repr_name__
__repr_str__
__pretty__
__rich_repr__
__str__()
Return type:

str

property __fields__: dict[str, pydantic.fields.FieldInfo]
Return type:

dict[str, pydantic.fields.FieldInfo]

property __fields_set__: set[str]
Return type:

set[str]

dict(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

Return type:

Dict[str, Any]

json(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, encoder=PydanticUndefined, models_as_dict=PydanticUndefined, **dumps_kwargs)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

  • encoder (Callable[[Any], Any] | None)

  • models_as_dict (bool)

  • dumps_kwargs (Any)

Return type:

str

classmethod parse_obj(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod parse_raw(b, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • b (str | bytes)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod parse_file(path, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • path (str | pathlib.Path)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod from_orm(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod construct(_fields_set=None, **values)
Parameters:
  • _fields_set (set[str] | None)

  • values (Any)

Return type:

typing_extensions.Self

copy(*, include=None, exclude=None, update=None, deep=False)

Returns a copy of the model.

Warning

Deprecated: this method is now deprecated; use model_copy instead.

If you need include or exclude, use:

```py
data = self.model_dump(include=include, exclude=exclude, round_trip=True)
data = {**data, **(update or {})}
copied = self.model_validate(data)
```

Parameters:
  • include (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to include in the copied model.

  • exclude (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to exclude in the copied model.

  • update (Dict[str, Any] | None) – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep (bool) – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.

Return type:

typing_extensions.Self

classmethod schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE)
Parameters:
  • by_alias (bool)

  • ref_template (str)

Return type:

Dict[str, Any]

classmethod schema_json(*, by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, **dumps_kwargs)
Parameters:
  • by_alias (bool)

  • ref_template (str)

  • dumps_kwargs (Any)

Return type:

str

classmethod validate(value)
Parameters:

value (Any)

Return type:

typing_extensions.Self

classmethod update_forward_refs(**localns)
Parameters:

localns (Any)

Return type:

None

_iter(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_copy_and_set_values(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

classmethod _get_value(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_calculate_keys(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

class medcat.utils.regression.checking.ResultDescriptor(/, **data)

Bases: SingleResultDescriptor

The overarching result descriptor that handles multiple phrases.

This class keeps track of the results on a per-phrase basis and can be used to get the overall report and/or iterate over examples.

Parameters:

data (Any)

per_phrase_results: dict[str, SingleResultDescriptor]
report(target, finding)

Report a test case and its successfulness

Parameters:
  • target (FinalTarget) – The final target configuration

  • finding (tuple[Finding, Optional[str]]) – To what extent the concept was recognised

Return type:

None

iter_examples(strictness_threshold)

Iterate over suitable examples.

The strictness threshold determines which examples are included. Any finding that counts as “correct enough” according to the strictness matrix for this threshold is withheld from the examples.

In simpler terms, if the finding is NOT in the strictness matrix for this strictness, the example is recorded.

NOTE: To disable example keeping, set the threshold to Strictness.ANYTHING.

Parameters:

strictness_threshold (Strictness) – The strictness threshold.

Yields:

Iterable[tuple[FinalTarget, tuple[Finding, Optional[str]]]] – Each item pairs a final target with its finding; together these carry the placeholder, phrase, finding, CUI, and name.

Return type:

Iterable[tuple[medcat.utils.regression.targeting.FinalTarget, tuple[Finding, Optional[str]]]]
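The filtering rule for iter_examples can be sketched in isolation. Everything below is an assumption made for illustration: `MiniFinding`, `DemoStrictness`, `DEMO_STRICTNESS_MATRIX` and `iter_kept_examples` are hypothetical stand-ins, the targets are plain strings rather than FinalTarget objects, and the matrix contents are invented. Only the rule itself (yield an example when its finding is not accepted at the given threshold) comes from the description above.

```python
from enum import Enum, auto
from typing import Iterable, Iterator, Optional, Tuple


class MiniFinding(Enum):
    """Cut-down, hypothetical stand-in for Finding."""

    IDENTICAL = auto()
    BIGGER_SPAN_RIGHT = auto()
    FAIL = auto()


class DemoStrictness(Enum):
    """Hypothetical levels; only ANYTHING is named in the text above."""

    STRICT = auto()
    LENIENT = auto()
    ANYTHING = auto()


# Hypothetical strictness matrix: findings treated as "correct enough".
DEMO_STRICTNESS_MATRIX = {
    DemoStrictness.STRICT: {MiniFinding.IDENTICAL},
    DemoStrictness.LENIENT: {MiniFinding.IDENTICAL, MiniFinding.BIGGER_SPAN_RIGHT},
    DemoStrictness.ANYTHING: set(MiniFinding),
}

Example = Tuple[str, Tuple[MiniFinding, Optional[str]]]


def iter_kept_examples(
    examples: Iterable[Example], threshold: DemoStrictness
) -> Iterator[Example]:
    """Yield only the examples whose finding is NOT accepted at this threshold."""
    accepted = DEMO_STRICTNESS_MATRIX[threshold]
    for target, (finding, found_cui) in examples:
        if finding not in accepted:
            yield target, (finding, found_cui)


recorded = [
    ("target-1", (MiniFinding.IDENTICAL, "C001")),
    ("target-2", (MiniFinding.BIGGER_SPAN_RIGHT, "C001")),
    ("target-3", (MiniFinding.FAIL, None)),
]
for kept in iter_kept_examples(recorded, DemoStrictness.STRICT):
    print(kept)
```

With the ANYTHING threshold every finding is accepted, so nothing is yielded, which matches the note about disabling example keeping.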

get_report(phrases_separately=False)

Get the report associated with this descriptor

Parameters:

phrases_separately (bool) – Whether to output descriptor for each phrase separately

Returns:

str – The report string

Return type:

str

model_dump(**kwargs)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters:
  • mode – The mode in which to_python should run. If mode is ‘json’, the output will only contain JSON serializable types. If mode is ‘python’, the output may contain non-JSON-serializable Python objects.

  • include – A set of fields to include in the output.

  • exclude – A set of fields to exclude from the output.

  • context – Additional context to pass to the serializer.

  • by_alias – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults – Whether to exclude fields that are set to their default value.

  • exclude_none – Whether to exclude fields that have a value of None.

  • round_trip – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings – How to handle serialization errors. False/”none” ignores them, True/”warn” logs errors, “error” raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

Return type:

dict
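The filtering parameters listed above are easiest to compare side by side. This sketch assumes pydantic v2 and uses a hypothetical `PartResult` model on a plain BaseModel; it does not show anything specific to the override on ResultDescriptor.

```python
from typing import Optional

from pydantic import BaseModel


class PartResult(BaseModel):
    """Hypothetical model used only for this illustration."""

    name: str
    passed: int = 0
    comment: Optional[str] = None


result = PartResult(name="part-1", passed=4)

# All fields, including defaults and None values.
full = result.model_dump()
# Drop fields whose value is None.
no_none = result.model_dump(exclude_none=True)
# Keep only the named fields.
only_name = result.model_dump(include={"name"})
# Keep only fields that were passed explicitly to the constructor.
explicit = result.model_dump(exclude_unset=True)

for dumped in (full, no_none, only_name, explicit):
    print(dumped)
```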

name: str

The name of the part that was checked

findings: dict[Finding, int]

The description of failures

examples: list[tuple[medcat.utils.regression.targeting.FinalTarget, tuple[Finding, str | None]]] = []

The examples of non-perfect alignment.

report_success(target, found)

Report a test case and its successfulness.

Parameters:
  • target (FinalTarget) – The target configuration

  • found (tuple[Finding, Optional[str]]) – Whether or not the check was successful

Return type:

None

json(**kwargs)
Return type:

str

model_config: ClassVar[pydantic.config.ConfigDict]

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_fields: ClassVar[Dict[str, pydantic.fields.FieldInfo]]

Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo] objects.

This replaces Model.__fields__ from Pydantic V1.

model_computed_fields: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]]

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

__class_vars__: ClassVar[set[str]]

The names of the class variables defined on the model.

__private_attributes__: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]]

Metadata about the private attributes of the model.

__signature__: ClassVar[inspect.Signature]

The synthesized __init__ [Signature][inspect.Signature] of the model.

__pydantic_complete__: ClassVar[bool] = False

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__: ClassVar[pydantic_core.CoreSchema]

The core schema of the model.

__pydantic_custom_init__: ClassVar[bool]

Whether the model has a custom __init__ method.

__pydantic_decorators__: ClassVar[pydantic._internal._decorators.DecoratorInfos]

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__: ClassVar[pydantic._internal._generics.PydanticGenericMetadata]

Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.

__pydantic_parent_namespace__: ClassVar[Dict[str, Any] | None] = None

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__: ClassVar[None | Literal['model_post_init']]

The name of the post-init method for the model, if defined.

__pydantic_root_model__: ClassVar[bool] = False

Whether the model is a [RootModel][pydantic.root_model.RootModel].

__pydantic_serializer__: ClassVar[pydantic_core.SchemaSerializer]

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator]

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_extra__: dict[str, Any] | None

A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.

__pydantic_fields_set__: set[str]

The names of fields explicitly set during instantiation.

__pydantic_private__: dict[str, Any] | None

Values of private attributes set on the model instance.

__slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')
__init__(/, **data)

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or `None` if `config.extra` is not set to `”allow”`.

Return type:

dict[str, Any] | None

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

Return type:

set[str]

classmethod model_construct(_fields_set=None, **values)

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

Note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == ‘allow’, then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == ‘ignore’ (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == ‘forbid’ does not result in an error if extra values are passed, but they will be ignored.

Parameters:
  • _fields_set (set[str] | None) – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the [model_fields_set][pydantic.BaseModel.model_fields_set] attribute. Otherwise, the field names from the values argument will be used.

  • values (Any) – Trusted or pre-validated data dictionary.

Returns:

A new instance of the `Model` class with validated data.

Return type:

typing_extensions.Self
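To make the "no validation" point concrete, the sketch below (assuming pydantic v2, with a hypothetical `Count` model) contrasts the normal constructor with model_construct.

```python
from pydantic import BaseModel, ValidationError


class Count(BaseModel):
    """Hypothetical model used only for this illustration."""

    label: str
    value: int = 0


# The normal constructor validates and rejects a non-numeric value.
try:
    Count(label="a", value="not a number")
    rejected = False
except ValidationError:
    rejected = True

# model_construct trusts its input: the bad value is stored unchanged.
unchecked = Count.model_construct(label="a", value="not a number")

# Defaults are still applied for fields that are not passed.
defaulted = Count.model_construct(label="b")

print(rejected, repr(unchecked.value), defaulted.value)
```

Because nothing is checked, model_construct is only appropriate for data that has already been validated elsewhere.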

model_copy(*, update=None, deep=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy

Returns a copy of the model.

Parameters:
  • update (dict[str, Any] | None) – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep (bool) – Set to True to make a deep copy of the model.

Returns:

New model instance.

Return type:

typing_extensions.Self
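The difference between a shallow and a deep copy matters as soon as a field holds a mutable value. A short sketch, assuming pydantic v2 and a hypothetical `Settings` model:

```python
from typing import List

from pydantic import BaseModel


class Settings(BaseModel):
    """Hypothetical model used only for this illustration."""

    name: str
    tags: List[str] = []


original = Settings(name="base", tags=["a"])

# Shallow copy with one field replaced; the update is not validated.
shallow = original.model_copy(update={"name": "variant"})
# Deep copy: nested containers are duplicated as well.
deep = original.model_copy(deep=True)

# The shallow copy shares the list object with the original ...
shallow.tags.append("b")

# ... so the original sees the change, while the deep copy does not.
print(original.tags, deep.tags)
```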

model_dump_json(*, indent=None, include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json

Generates a JSON representation of the model using Pydantic’s to_json method.

Parameters:
  • indent (int | None) – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include (IncEx | None) – Field(s) to include in the JSON output.

  • exclude (IncEx | None) – Field(s) to exclude from the JSON output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to serialize using field aliases.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/”none” ignores them, True/”warn” logs errors, “error” raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

Return type:

str

classmethod model_json_schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, schema_generator=GenerateJsonSchema, mode='validation')

Generates a JSON schema for a model class.

Parameters:
  • by_alias (bool) – Whether to use attribute aliases or not.

  • ref_template (str) – The reference template.

  • schema_generator (type[pydantic.json_schema.GenerateJsonSchema]) – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications

  • mode (pydantic.json_schema.JsonSchemaMode) – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

Return type:

dict[str, Any]
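For orientation, this is roughly what the generated schema looks like for a two-field model. The sketch assumes pydantic v2; `PhraseSpec` is a hypothetical model.

```python
from pydantic import BaseModel


class PhraseSpec(BaseModel):
    """Hypothetical model used only for this illustration."""

    phrase: str
    repeats: int = 1


schema = PhraseSpec.model_json_schema()

# Fields without a default are listed under "required".
print(schema["required"])
# Each field has an entry under "properties" with its JSON type.
print(schema["properties"]["phrase"]["type"])
print(schema["properties"]["repeats"]["default"])
```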

classmethod model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Parameters:

params (tuple[type[Any], Ellipsis]) – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where `params` are passed to `cls` as type variables.

Raises:

TypeError – Raised when trying to generate concrete names for non-generic models.

Return type:

str

model_post_init(__context)

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

__context (Any)

Return type:

None

classmethod model_rebuild(*, force=False, raise_errors=True, _parent_namespace_depth=2, _types_namespace=None)

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Parameters:
  • force (bool) – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors (bool) – Whether to raise errors, defaults to True.

  • _parent_namespace_depth (int) – The depth level of the parent namespace, defaults to 2.

  • _types_namespace (dict[str, Any] | None) – The types namespace, defaults to None.

Returns:
  • Returns `None` if the schema is already “complete” and rebuilding was not required.

  • If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.

Return type:

bool | None

classmethod model_validate(obj, *, strict=None, from_attributes=None, context=None)

Validate a pydantic model instance.

Parameters:
  • obj (Any) – The object to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • from_attributes (bool | None) – Whether to extract data from object attributes.

  • context (Any | None) – Additional context to pass to the validator.

Raises:

ValidationError – If the object could not be validated.

Returns:

The validated model instance.

Return type:

typing_extensions.Self

classmethod model_validate_json(json_data, *, strict=None, context=None)

Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing

Validate the given JSON data against the Pydantic model.

Parameters:
  • json_data (str | bytes | bytearray) – The JSON data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Raises:

ValidationError – If json_data is not a JSON string or the object could not be validated.

Return type:

typing_extensions.Self

classmethod model_validate_strings(obj, *, strict=None, context=None)

Validate the given object with string data against the Pydantic model.

Parameters:
  • obj (Any) – The object containing string data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Return type:

typing_extensions.Self

classmethod __get_pydantic_core_schema__(source, handler, /)

Hook into generating the model’s CoreSchema.

Parameters:
  • source (type[BaseModel]) – The class we are generating a schema for. This will generally be the same as the cls argument if this is a classmethod.

  • handler (pydantic.annotated_handlers.GetCoreSchemaHandler) – A callable that calls into Pydantic’s internal CoreSchema generation logic.

Returns:

A `pydantic-core` `CoreSchema`.

Return type:

pydantic_core.CoreSchema

classmethod __get_pydantic_json_schema__(core_schema, handler, /)

Hook into generating the model’s JSON schema.

Parameters:
  • core_schema (pydantic_core.CoreSchema) – A pydantic-core CoreSchema. You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema ({‘type’: ‘nullable’, ‘schema’: current_schema}), or just call the handler with the original schema.

  • handler (pydantic.annotated_handlers.GetJsonSchemaHandler) – Call into Pydantic’s internal JSON schema generation. This will raise a pydantic.errors.PydanticInvalidForJsonSchema if JSON schema generation fails. Since this gets called by BaseModel.model_json_schema you can override the schema_generator argument to that function to change JSON schema generation globally for a type.

Returns:

A JSON schema, as a Python object.

Return type:

pydantic.json_schema.JsonSchemaValue

classmethod __pydantic_init_subclass__(**kwargs)

This is intended to behave just like __init_subclass__, but is called by ModelMetaclass only after the class is actually fully initialized. In particular, attributes like model_fields will be present when this is called.

This is necessary because __init_subclass__ will always be called by type.__new__, and it would require a prohibitively large refactor to the ModelMetaclass to ensure that type.__new__ was called in such a manner that the class would already be sufficiently initialized.

This will receive the same kwargs that would be passed to the standard __init_subclass__, namely, any kwargs passed to the class definition that aren’t used internally by pydantic.

Parameters:

**kwargs (Any) – Any keyword arguments passed to the class definition that aren’t used internally by pydantic.

Return type:

None

classmethod __class_getitem__(typevar_values)
Parameters:

typevar_values (type[Any] | tuple[type[Any], Ellipsis])

Return type:

type[BaseModel] | pydantic._internal._forward_ref.PydanticRecursiveRef

__copy__()

Returns a shallow copy of the model.

Return type:

typing_extensions.Self

__deepcopy__(memo=None)

Returns a deep copy of the model.

Parameters:

memo (dict[int, Any] | None)

Return type:

typing_extensions.Self

__getattr__(item)
Parameters:

item (str)

Return type:

Any

_check_frozen(name, value)
Parameters:
  • name (str)

  • value (Any)

Return type:

None

__getstate__()
Return type:

dict[Any, Any]

__setstate__(state)
Parameters:

state (dict[Any, Any])

Return type:

None

__eq__(other)
Parameters:

other (Any)

Return type:

bool

classmethod __init_subclass__(**kwargs)

This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs.

```py
from pydantic import BaseModel

class MyModel(BaseModel, extra='allow'):
    ...
```

However, this may be deceiving, since the _actual_ calls to __init_subclass__ will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way ModelMetaclass.__new__ works.)

Parameters:

**kwargs (typing_extensions.Unpack[pydantic.config.ConfigDict]) – Keyword arguments passed to the class definition, which set model_config

Note

You may want to override __pydantic_init_subclass__ instead, which behaves similarly but is called after the class is fully initialized.

__iter__()

So dict(model) works.

Return type:

TupleGenerator

__repr__()
Return type:

str

__repr_args__()
Return type:

pydantic._internal._repr.ReprArgs

__repr_name__
__repr_str__
__pretty__
__rich_repr__
__str__()
Return type:

str

property __fields__: dict[str, pydantic.fields.FieldInfo]
Return type:

dict[str, pydantic.fields.FieldInfo]

property __fields_set__: set[str]
Return type:

set[str]

dict(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

Return type:

Dict[str, Any]

classmethod parse_obj(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod parse_raw(b, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • b (str | bytes)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod parse_file(path, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • path (str | pathlib.Path)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod from_orm(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod construct(_fields_set=None, **values)
Parameters:
  • _fields_set (set[str] | None)

  • values (Any)

Return type:

typing_extensions.Self

copy(*, include=None, exclude=None, update=None, deep=False)

Returns a copy of the model.

Warning

Deprecated: this method is now deprecated; use model_copy instead.

If you need include or exclude, use:

```py
data = self.model_dump(include=include, exclude=exclude, round_trip=True)
data = {**data, **(update or {})}
copied = self.model_validate(data)
```

Parameters:
  • include (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to include in the copied model.

  • exclude (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to exclude in the copied model.

  • update (Dict[str, Any] | None) – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep (bool) – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.

Return type:

typing_extensions.Self

classmethod schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE)
Parameters:
  • by_alias (bool)

  • ref_template (str)

Return type:

Dict[str, Any]

classmethod schema_json(*, by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, **dumps_kwargs)
Parameters:
  • by_alias (bool)

  • ref_template (str)

  • dumps_kwargs (Any)

Return type:

str

classmethod validate(value)
Parameters:

value (Any)

Return type:

typing_extensions.Self

classmethod update_forward_refs(**localns)
Parameters:

localns (Any)

Return type:

None

_iter(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_copy_and_set_values(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

classmethod _get_value(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_calculate_keys(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

class medcat.utils.regression.checking.Finding

Bases: enum.Enum

Describes whether or how the finding was verified.

The idea is that we know where we expect the entity to be recognised and the enum constants describe how the recognition compared to the expectation.

In essence, we want to know the relative positions of the two pairs of numbers (character numbers):
  • Expected Start, Expected End

  • Recognised Start, Recognised End

We can model this as 4 numbers on the number line, and we want to know their positions relative to each other. For example, if the expected positions are marked with * and recognised positions with #, we may have something like:

___*__#_______#*______________

which would indicate that a partial, but smaller, span was recognised.
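The number-line comparison can be sketched as a small standalone function. This is an illustrative assumption, not MedCAT's implementation: `SpanRelation` and `classify_span` are hypothetical names, only the span-related outcomes are modelled (CUI and parent/child checks are left out), and treating touching spans as non-overlapping is a choice made for this sketch.

```python
from enum import Enum, auto
from typing import Optional


class SpanRelation(Enum):
    """Hypothetical stand-in covering only the span-related outcomes."""

    IDENTICAL = auto()
    BIGGER_SPAN_RIGHT = auto()
    BIGGER_SPAN_LEFT = auto()
    BIGGER_SPAN_BOTH = auto()
    SMALLER_SPAN = auto()
    PARTIAL_OVERLAP = auto()


def classify_span(
    exp_start: int, exp_end: int, start: int, end: int
) -> Optional[SpanRelation]:
    """Compare a recognised span (start, end) with the expected one."""
    if end < start:
        raise ValueError("recognised span ends before it starts")
    if start >= exp_end or end <= exp_start:
        return None  # no overlap at all (assumption: touching does not count)
    if start == exp_start and end == exp_end:
        return SpanRelation.IDENTICAL
    if start <= exp_start and end >= exp_end:
        # Recognised span covers the expected one and extends past it.
        if start == exp_start:
            return SpanRelation.BIGGER_SPAN_RIGHT
        if end == exp_end:
            return SpanRelation.BIGGER_SPAN_LEFT
        return SpanRelation.BIGGER_SPAN_BOTH
    if start >= exp_start and end <= exp_end:
        # Recognised span lies entirely inside the expected one.
        return SpanRelation.SMALLER_SPAN
    # One end is inside the expected span, the other is outside.
    return SpanRelation.PARTIAL_OVERLAP


print(classify_span(10, 20, 12, 18))
print(classify_span(10, 20, 15, 25))
```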

IDENTICAL

The CUI and the span recognised are identical to what was expected.

BIGGER_SPAN_RIGHT

The CUI is the same, but the recognised span is longer on the right.

If we use the notation from the class doc string, e.g: _*#__*__#

BIGGER_SPAN_LEFT

The CUI is the same, but the recognised span is longer on the left.

If we use the notation from the class doc string, e.g: _#_*__*#_

BIGGER_SPAN_BOTH

The CUI is the same, but the recognised span is longer on both sides.

If we use the notation from the class doc string, e.g: _#__*__*__#_

SMALLER_SPAN

The CUI is the same, but the recognised span is smaller.

If we use the notation from the class doc string, e.g:

_*_#_#_*_ (neither start nor end match)

_*#_#_*__ (start matches, but end is before expected)

_*__#_#*_ (end matches, but start is after expected)

PARTIAL_OVERLAP

The CUI is the same, but the span overlaps partially.

If we use the notation from the class doc string, e.g:

_*_#__*_#_ (starts between expected start and end, but ends beyond)

_#_*_#_*__ (starts before expected start, but ends between expected start and end)

FOUND_DIR_PARENT

The recognised CUI is a parent of the expected CUI but the span is an exact match.

FOUND_DIR_GRANDPARENT

The recognised CUI is a grandparent of the expected CUI but the span is an exact match.

FOUND_ANY_CHILD

The recognised CUI is a child of the expected CUI but the span is an exact match.

FOUND_CHILD_PARTIAL

The recognised CUI is a child yet the match is only partial (smaller/bigger/partial).

FOUND_OTHER

Found another CUI in the same span.

FAIL

The concept was not recognised in any meaningful way.

has_correct_cui()

Whether the finding found the correct concept.

Returns:

bool – Whether the correct concept was found.

Return type:

bool
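Going by the member descriptions above, the findings that state "the CUI is the same" (plus IDENTICAL) are the ones where the correct concept was found. A minimal sketch of that grouping follows; `FindingSketch` is a hypothetical mirror of the enum written for illustration, and the exact membership test used by the real class is an assumption drawn from those descriptions.

```python
from enum import Enum, auto


class FindingSketch(Enum):
    """Hypothetical mirror of the documented members (illustration only)."""
    IDENTICAL = auto()
    BIGGER_SPAN_RIGHT = auto()
    BIGGER_SPAN_LEFT = auto()
    BIGGER_SPAN_BOTH = auto()
    SMALLER_SPAN = auto()
    PARTIAL_OVERLAP = auto()
    FOUND_DIR_PARENT = auto()
    FOUND_DIR_GRANDPARENT = auto()
    FOUND_ANY_CHILD = auto()
    FOUND_CHILD_PARTIAL = auto()
    FOUND_OTHER = auto()
    FAIL = auto()

    def has_correct_cui(self) -> bool:
        # Looked up at call time, after _SAME_CUI has been defined below.
        return self in _SAME_CUI


# Members whose description says the recognised CUI equals the expected one.
_SAME_CUI = frozenset({
    FindingSketch.IDENTICAL,
    FindingSketch.BIGGER_SPAN_RIGHT,
    FindingSketch.BIGGER_SPAN_LEFT,
    FindingSketch.BIGGER_SPAN_BOTH,
    FindingSketch.SMALLER_SPAN,
    FindingSketch.PARTIAL_OVERLAP,
})

print(FindingSketch.SMALLER_SPAN.has_correct_cui())
print(FindingSketch.FOUND_ANY_CHILD.has_correct_cui())
```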

classmethod determine(exp_cui, exp_start, exp_end, tl, found_entities, strict_only=False, check_children=True, check_parent=True, check_grandparent=True)

Determine the finding type based on the input.

Parameters:
  • exp_cui (str) – Expected CUI.

  • exp_start (int) – Expected span start.

  • exp_end (int) – Expected span end.

  • tl (TranslationLayer) – The translation layer.

  • found_entities (dict[int, Entity]) – The entities found by the model.

  • strict_only (bool) – Whether to use a strict-only mode (either identical or fail). Defaults to False.

  • check_children (bool) – Whether to check the children. Defaults to True.

  • check_parent (bool) – Whether to check for parent(s). Defaults to True.

  • check_grandparent (bool) – Whether to check for grandparent(s). Defaults to True.

Returns:

tuple[Finding, Optional[str]] – The type of finding determined, and the alternative.

Return type:

tuple[Finding, Optional[str]]

__new__(value)
_generate_next_value_(start, count, last_values)

Generate the next value when not given.

name: the name of the member

start: the initial start value or None

count: the number of existing members

last_value: the last value assigned or None

classmethod _missing_(value)
__repr__()
__str__()
__dir__()

Returns all members and all public methods

__format__(format_spec)

Returns format using actual value type unless __str__ has been overridden.

__hash__()
__reduce_ex__(proto)
name()

The name of the Enum member.

value()

The value of the Enum member.

class medcat.utils.regression.checking.BasicSpellChecker(cdb_vocab, config, data_vocab=None)
Parameters:
__init__(cdb_vocab, config, data_vocab=None)
Parameters:
vocab
config
data_vocab = None
P(word)

Probability of word.

Parameters:

word (str) – The word in question.

Returns:

float – The probability.

Return type:

float

__contains__(word)
fix(word)

Most probable spelling correction for word.

Parameters:

word (str) – The word.

Returns:

Optional[str] – Fixed word, or None if no fixes were applied.

Return type:

Optional[str]

candidates(word)

Generate possible spelling corrections for word.

Parameters:

word (str) – The word.

Returns:

Iterable[str] – The list of candidate words.

Return type:

Iterable[str]

known(words)

The subset of words that appear in the dictionary of WORDS.

Parameters:

words (Iterable[str]) – The words.

Returns:

set[str] – The set of candidates.

Return type:

set[str]

edits1(word)

All edits that are one edit away from word.

Parameters:

word (str) – The word.

Returns:

set[str] – The set of all edits

Return type:

set[str]

classmethod raw_edits1(word: str, use_diacritics: bool = False, return_ordered: Literal[False] = False) set[str]
classmethod raw_edits1(word: str, use_diacritics: bool = False, return_ordered: Literal[True] = True) list[str]
classmethod raw_edits1(word: str, use_diacritics: bool = False, return_ordered: bool = False) set[str] | list[str]
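The three overloads above let a type checker infer `set[str]` or `list[str]` from the literal value of `return_ordered`. The sketch below shows that pattern on a Norvig-style one-edit generator (deletes, transposes, replaces, inserts). It is an assumption that the real method works this way: the function name `one_edit_away` and the ASCII-only `LETTERS` alphabet are hypothetical, `use_diacritics` is not modelled, and de-duplicating the ordered result is a choice made for this example.

```python
from __future__ import annotations

from typing import Literal, overload

LETTERS = "abcdefghijklmnopqrstuvwxyz"  # assumption: plain ASCII alphabet


@overload
def one_edit_away(word: str, return_ordered: Literal[False] = False) -> set[str]: ...
@overload
def one_edit_away(word: str, return_ordered: Literal[True]) -> list[str]: ...
@overload
def one_edit_away(word: str, return_ordered: bool) -> set[str] | list[str]: ...


def one_edit_away(word, return_ordered=False):
    """All strings reachable from `word` with a single character edit."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    all_edits = deletes + transposes + replaces + inserts
    if return_ordered:
        # dict.fromkeys drops duplicates while keeping first-seen order
        return list(dict.fromkeys(all_edits))
    return set(all_edits)


print("cat" in one_edit_away("cta"))
```

An ordered list is useful when a reproducible sample of edits is needed, since iteration order over a set of strings is not stable between interpreter runs.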
edits2(word)

All edits that are two edits away from word.

Parameters:

word (str) – The word to start from.

Returns:

Iterator[str] – All 2-away edits.

Return type:

Iterator[str]
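Because the return type is an iterator, two-away edits can be produced lazily by applying a one-edit step to each of its own results. The sketch below shows that composition with a deliberately tiny one-step function (single-character deletions) so the output stays readable. Both `two_steps` and `deletions` are hypothetical names, and whether the real method composes its one-edit step in exactly this way is an assumption.

```python
from typing import Callable, Iterable, Iterator


def two_steps(word: str,
              one_step: Callable[[str], Iterable[str]]) -> Iterator[str]:
    """Lazily yield everything reachable by applying `one_step` twice."""
    for first in one_step(word):
        # Duplicates are possible; callers can wrap the result in set().
        yield from one_step(first)


def deletions(word: str) -> list:
    """Toy one-step edit: drop a single character."""
    return [word[:i] + word[i + 1:] for i in range(len(word))]


print(sorted(set(two_steps("abc", deletions))))  # ['a', 'b', 'c']
```

Generating lazily matters here because the number of two-away edits grows roughly with the square of the one-away count, so materialising them all up front can be wasteful when only a sample is needed.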

classmethod raw_edits2(word, use_diacritics=False, return_ordered=False)
Parameters:
  • word (str)

  • use_diacritics (bool)

  • return_ordered (bool)

Return type:

Iterator[str]

edits3(word)

All edits that are three edits away from word.

medcat.utils.regression.checking.logger
class medcat.utils.regression.checking.RegressionCase(/, **data)

Bases: pydantic.BaseModel

A regression case that has a name, defines options, filters and phrases.

Parameters:

data (Any)

name: str
options: medcat.utils.regression.targeting.OptionSet
phrases: list[str]
report: medcat.utils.regression.results.ResultDescriptor
check_specific_for_phrase(cat, target, translation)

Checks whether the specific target along with the specified phrase is able to be identified using the specified model.

Parameters:
Raises:

MalformedRegressionCaseException – If there are too many placeholders in phrase.

Returns:

tuple[Finding, Optional[str]] – How the target was (or wasn’t) identified.

Return type:

tuple[medcat.utils.regression.results.Finding, Optional[str]]
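The placeholder check mentioned under Raises can be pictured with the following standalone sketch. It is not the library's code: `MalformedCaseError` and `fill_placeholder` are hypothetical stand-ins, and rejecting a phrase with zero placeholders (rather than only one with too many) is an extra assumption made for this example.

```python
from __future__ import annotations


class MalformedCaseError(ValueError):
    """Hypothetical stand-in for MalformedRegressionCaseException."""


def fill_placeholder(phrase: str, placeholder: str,
                     name: str) -> tuple[str, int, int]:
    """Replace the single placeholder and return (text, start, end)."""
    count = phrase.count(placeholder)
    if count != 1:
        raise MalformedCaseError(
            f"expected exactly 1 placeholder, found {count}")
    start = phrase.index(placeholder)
    text = phrase.replace(placeholder, name)
    # The expected span is where the concept name now sits in the text.
    return text, start, start + len(name)


text, start, end = fill_placeholder(
    "Patient has [DIAGNOSIS] today", "[DIAGNOSIS]", "diabetes")
print(text, text[start:end])
```

The returned offsets are the kind of expected start and end that a span comparison needs in order to judge what the model recognised.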

estimate_num_of_diff_subcases()
Return type:

int

get_distinct_cases(translation, edit_distance, use_diacritics)

Gets the various distinct sub-case iterators.

The sub-cases are those that can be determined without the translation layer. However, the translation layer is included here since it streamlines the operation.

Parameters:
  • translation (TranslationLayer) – The translation layer.

  • edit_distance (tuple[int, int, int]) – The edit distance(s) to try.

  • use_diacritics (bool) – Whether to use diacritics for edit distance.

Yields:

Iterator[Iterator[FinalTarget]] – The iterator of iterators of different sub cases.

Return type:

Iterator[Iterator[medcat.utils.regression.targeting.FinalTarget]]

_get_subcases(phrase, changer, translation, edit_distance, use_diacritics)
Parameters:
Return type:

Iterator[medcat.utils.regression.targeting.FinalTarget]

to_dict()

Converts the RegressionCase to a dict for serialisation.

Returns:

dict – The dict representation

Return type:

dict

classmethod from_dict(name, in_dict)

Construct the regression case from a dict.

The expected structure:

{
    'targeting': {
        [
            # the placeholder to be replaced
            'placeholder': '[DIAGNOSIS]',
            'cuis': ['cui1', 'cui2'],
            'prefname-only': 'false',  # optional
        ]
    },
    'phrases': ['phrase %s']  # possibly multiple
}

Parameters:
  • name (str) – The name of the case

  • in_dict (dict) – The dict describing the case

Raises:
  • ValueError – If the input dict does not have the ‘targeting’ section

  • ValueError – If there are no phrases defined

Returns:

RegressionCase – The constructed regression case.

Return type:

RegressionCase
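The two documented ValueError conditions can be illustrated with a small pre-check on the input dict. This is a hedged sketch only: `check_case_dict` is a hypothetical helper, it does not inspect the contents of the 'targeting' section, and accepting a single string for 'phrases' is an assumption made for convenience.

```python
def check_case_dict(name: str, in_dict: dict) -> list:
    """Validate the top-level shape of a case dict and return its phrases."""
    if "targeting" not in in_dict:
        raise ValueError(f"Case '{name}' has no 'targeting' section")
    phrases = in_dict.get("phrases", [])
    if isinstance(phrases, str):
        # Assumption: a lone string is treated as a one-element list.
        phrases = [phrases]
    if not phrases:
        raise ValueError(f"Case '{name}' defines no phrases")
    return list(phrases)


case = {
    "targeting": {"placeholder": "[DIAGNOSIS]", "cuis": ["cui1", "cui2"]},
    "phrases": ["The patient presents with [DIAGNOSIS]"],
}
print(check_case_dict("example-case", case))
```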

__hash__()
Return type:

int

__eq__(other)
Parameters:

other (Any)

Return type:

bool

model_config: ClassVar[pydantic.config.ConfigDict]

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_fields: ClassVar[Dict[str, pydantic.fields.FieldInfo]]

Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo] objects.

This replaces Model.__fields__ from Pydantic V1.

model_computed_fields: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]]

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

__class_vars__: ClassVar[set[str]]

The names of the class variables defined on the model.

__private_attributes__: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]]

Metadata about the private attributes of the model.

__signature__: ClassVar[inspect.Signature]

The synthesized __init__ [Signature][inspect.Signature] of the model.

__pydantic_complete__: ClassVar[bool] = False

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__: ClassVar[pydantic_core.CoreSchema]

The core schema of the model.

__pydantic_custom_init__: ClassVar[bool]

Whether the model has a custom __init__ method.

__pydantic_decorators__: ClassVar[pydantic._internal._decorators.DecoratorInfos]

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__: ClassVar[pydantic._internal._generics.PydanticGenericMetadata]

Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.

__pydantic_parent_namespace__: ClassVar[Dict[str, Any] | None] = None

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__: ClassVar[None | Literal['model_post_init']]

The name of the post-init method for the model, if defined.

__pydantic_root_model__: ClassVar[bool] = False

Whether the model is a [RootModel][pydantic.root_model.RootModel].

__pydantic_serializer__: ClassVar[pydantic_core.SchemaSerializer]

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator]

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_extra__: dict[str, Any] | None

A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.

__pydantic_fields_set__: set[str]

The names of fields explicitly set during instantiation.

__pydantic_private__: dict[str, Any] | None

Values of private attributes set on the model instance.

__slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')
__init__(/, **data)

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or `None` if `config.extra` is not set to `"allow"`.

Return type:

dict[str, Any] | None

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

Return type:

set[str]

classmethod model_construct(_fields_set=None, **values)

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

!!! note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == ‘allow’, then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == ‘ignore’ (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == ‘forbid’ does not result in an error if extra values are passed, but they will be ignored.

Parameters:
  • _fields_set (set[str] | None) – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the [model_fields_set][pydantic.BaseModel.model_fields_set] attribute. Otherwise, the field names from the values argument will be used.

  • values (Any) – Trusted or pre-validated data dictionary.

Returns:

A new instance of the `Model` class with validated data.

Return type:

typing_extensions.Self

model_copy(*, update=None, deep=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy

Returns a copy of the model.

Parameters:
  • update (dict[str, Any] | None) – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep (bool) – Set to True to make a deep copy of the model.

Returns:

New model instance.

Return type:

typing_extensions.Self

model_dump(*, mode='python', include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters:
  • mode (Literal['json', 'python'] | str) – The mode in which to_python should run. If mode is ‘json’, the output will only contain JSON serializable types. If mode is ‘python’, the output may contain non-JSON-serializable Python objects.

  • include (IncEx | None) – A set of fields to include in the output.

  • exclude (IncEx | None) – A set of fields to exclude from the output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

Return type:

dict[str, Any]

model_dump_json(*, indent=None, include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json

Generates a JSON representation of the model using Pydantic’s to_json method.

Parameters:
  • indent (int | None) – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include (IncEx | None) – Field(s) to include in the JSON output.

  • exclude (IncEx | None) – Field(s) to exclude from the JSON output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to serialize using field aliases.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

Return type:

str

classmethod model_json_schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, schema_generator=GenerateJsonSchema, mode='validation')

Generates a JSON schema for a model class.

Parameters:
  • by_alias (bool) – Whether to use attribute aliases or not.

  • ref_template (str) – The reference template.

  • schema_generator (type[pydantic.json_schema.GenerateJsonSchema]) – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications

  • mode (pydantic.json_schema.JsonSchemaMode) – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

Return type:

dict[str, Any]

classmethod model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Parameters:

params (tuple[type[Any], Ellipsis]) – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where `params` are passed to `cls` as type variables.

Raises:

TypeError – Raised when trying to generate concrete names for non-generic models.

Return type:

str

model_post_init(__context)

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

__context (Any)

Return type:

None

classmethod model_rebuild(*, force=False, raise_errors=True, _parent_namespace_depth=2, _types_namespace=None)

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Parameters:
  • force (bool) – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors (bool) – Whether to raise errors, defaults to True.

  • _parent_namespace_depth (int) – The depth level of the parent namespace, defaults to 2.

  • _types_namespace (dict[str, Any] | None) – The types namespace, defaults to None.

Returns:
  • Returns `None` if the schema is already “complete” and rebuilding was not required.

  • If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.

Return type:

bool | None

classmethod model_validate(obj, *, strict=None, from_attributes=None, context=None)

Validate a pydantic model instance.

Parameters:
  • obj (Any) – The object to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • from_attributes (bool | None) – Whether to extract data from object attributes.

  • context (Any | None) – Additional context to pass to the validator.

Raises:

ValidationError – If the object could not be validated.

Returns:

The validated model instance.

Return type:

typing_extensions.Self

classmethod model_validate_json(json_data, *, strict=None, context=None)

Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing

Validate the given JSON data against the Pydantic model.

Parameters:
  • json_data (str | bytes | bytearray) – The JSON data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Raises:

ValidationError – If json_data is not a JSON string or the object could not be validated.

Return type:

typing_extensions.Self

classmethod model_validate_strings(obj, *, strict=None, context=None)

Validate the given object with string data against the Pydantic model.

Parameters:
  • obj (Any) – The object containing string data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Return type:

typing_extensions.Self

classmethod __get_pydantic_core_schema__(source, handler, /)

Hook into generating the model’s CoreSchema.

Parameters:
  • source (type[BaseModel]) – The class we are generating a schema for. This will generally be the same as the cls argument if this is a classmethod.

  • handler (pydantic.annotated_handlers.GetCoreSchemaHandler) – A callable that calls into Pydantic’s internal CoreSchema generation logic.

Returns:

A `pydantic-core` `CoreSchema`.

Return type:

pydantic_core.CoreSchema

classmethod __get_pydantic_json_schema__(core_schema, handler, /)

Hook into generating the model’s JSON schema.

Parameters:
  • core_schema (pydantic_core.CoreSchema) – A pydantic-core CoreSchema. You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema ({‘type’: ‘nullable’, ‘schema’: current_schema}), or just call the handler with the original schema.

  • handler (pydantic.annotated_handlers.GetJsonSchemaHandler) – Call into Pydantic’s internal JSON schema generation. This will raise a pydantic.errors.PydanticInvalidForJsonSchema if JSON schema generation fails. Since this gets called by BaseModel.model_json_schema you can override the schema_generator argument to that function to change JSON schema generation globally for a type.

Returns:

A JSON schema, as a Python object.

Return type:

pydantic.json_schema.JsonSchemaValue

classmethod __pydantic_init_subclass__(**kwargs)

This is intended to behave just like __init_subclass__, but is called by ModelMetaclass only after the class is actually fully initialized. In particular, attributes like model_fields will be present when this is called.

This is necessary because __init_subclass__ will always be called by type.__new__, and it would require a prohibitively large refactor to the ModelMetaclass to ensure that type.__new__ was called in such a manner that the class would already be sufficiently initialized.

This will receive the same kwargs that would be passed to the standard __init_subclass__, namely, any kwargs passed to the class definition that aren’t used internally by pydantic.

Parameters:

**kwargs (Any) – Any keyword arguments passed to the class definition that aren’t used internally by pydantic.

Return type:

None

classmethod __class_getitem__(typevar_values)
Parameters:

typevar_values (type[Any] | tuple[type[Any], Ellipsis])

Return type:

type[BaseModel] | pydantic._internal._forward_ref.PydanticRecursiveRef

__copy__()

Returns a shallow copy of the model.

Return type:

typing_extensions.Self

__deepcopy__(memo=None)

Returns a deep copy of the model.

Parameters:

memo (dict[int, Any] | None)

Return type:

typing_extensions.Self

__getattr__(item)
Parameters:

item (str)

Return type:

Any

_check_frozen(name, value)
Parameters:
  • name (str)

  • value (Any)

Return type:

None

__getstate__()
Return type:

dict[Any, Any]

__setstate__(state)
Parameters:

state (dict[Any, Any])

Return type:

None

classmethod __init_subclass__(**kwargs)

This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs.

```py
from pydantic import BaseModel

class MyModel(BaseModel, extra='allow'): ...
```

However, this may be deceiving, since the _actual_ calls to __init_subclass__ will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way ModelMetaclass.__new__ works.)

Parameters:

**kwargs (typing_extensions.Unpack[pydantic.config.ConfigDict]) – Keyword arguments passed to the class definition, which set model_config

Note

You may want to override __pydantic_init_subclass__ instead, which behaves similarly but is called after the class is fully initialized.

__iter__()

So dict(model) works.

Return type:

TupleGenerator

__repr__()
Return type:

str

__repr_args__()
Return type:

pydantic._internal._repr.ReprArgs

__repr_name__
__repr_str__
__pretty__
__rich_repr__
__str__()
Return type:

str

property __fields__: dict[str, pydantic.fields.FieldInfo]
Return type:

dict[str, pydantic.fields.FieldInfo]

property __fields_set__: set[str]
Return type:

set[str]

dict(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

Return type:

Dict[str, Any]

json(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, encoder=PydanticUndefined, models_as_dict=PydanticUndefined, **dumps_kwargs)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

  • encoder (Callable[[Any], Any] | None)

  • models_as_dict (bool)

  • dumps_kwargs (Any)

Return type:

str

classmethod parse_obj(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod parse_raw(b, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • b (str | bytes)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod parse_file(path, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • path (str | pathlib.Path)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod from_orm(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod construct(_fields_set=None, **values)
Parameters:
  • _fields_set (set[str] | None)

  • values (Any)

Return type:

typing_extensions.Self

copy(*, include=None, exclude=None, update=None, deep=False)

Returns a copy of the model.

!!! warning “Deprecated”

This method is now deprecated; use model_copy instead.

If you need include or exclude, use:

```py
data = self.model_dump(include=include, exclude=exclude, round_trip=True)
data = {**data, **(update or {})}
copied = self.model_validate(data)
```

Parameters:
  • include (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to include in the copied model.

  • exclude (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to exclude in the copied model.

  • update (Dict[str, Any] | None) – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep (bool) – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.

Return type:

typing_extensions.Self

classmethod schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE)
Parameters:
  • by_alias (bool)

  • ref_template (str)

Return type:

Dict[str, Any]

classmethod schema_json(*, by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, **dumps_kwargs)
Parameters:
  • by_alias (bool)

  • ref_template (str)

  • dumps_kwargs (Any)

Return type:

str

classmethod validate(value)
Parameters:

value (Any)

Return type:

typing_extensions.Self

classmethod update_forward_refs(**localns)
Parameters:

localns (Any)

Return type:

None

_iter(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_copy_and_set_values(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

classmethod _get_value(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_calculate_keys(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

medcat.utils.regression.checking.UNKNOWN_METADATA = 'Unknown'
medcat.utils.regression.checking.get_ontology_and_version(model_card)

Attempt to get ontology (and its version) from a model card dict.

If no ontology is found, ‘Unknown’ is returned. The version is always returned as the first source ontology. That is, unless the specified location does not exist in the model card, in which case ‘Unknown’ is returned.

The ontology is assumed to be described at:

model_card[‘Source Ontology’][0] (or model_card[‘Source Ontology’] if it’s a string instead of a list)

The ontology version is read from:

model_card[‘Source Ontology’][0] (or model_card[‘Source Ontology’] if it’s a string instead of a list)

Currently, only SNOMED-CT, UMLS and ICD are supported / found.

Parameters:

model_card (dict) – The input model card.

Returns:

tuple[str, str] – The ontology (if found) or ‘Unknown’; and the version (if found) or ‘Unknown’

Return type:

tuple[str, str]
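The lookup described above can be sketched as follows. This is a minimal illustration under assumptions, not the module's implementation: `guess_ontology_and_version` is a hypothetical name, and detecting the ontology by a case-insensitive substring match on the first 'Source Ontology' entry is a guess at how the three supported ontologies might be recognised.

```python
from __future__ import annotations

UNKNOWN = "Unknown"
KNOWN_ONTOLOGIES = ("SNOMED-CT", "UMLS", "ICD")


def guess_ontology_and_version(model_card: dict) -> tuple[str, str]:
    """Return (ontology, version), falling back to 'Unknown' for each."""
    source = model_card.get("Source Ontology")
    if isinstance(source, list) and source:
        first = str(source[0])
    elif isinstance(source, str):
        first = source
    else:
        # The documented location is missing or has an unexpected type.
        return UNKNOWN, UNKNOWN
    upper = first.upper()
    for ontology in KNOWN_ONTOLOGIES:
        # Match on the leading token so "SNOMED" alone also counts.
        if ontology.split("-")[0] in upper:
            return ontology, first
    # Unrecognised ontology, but the entry itself is kept as the version.
    return UNKNOWN, first


print(guess_ontology_and_version({"Source Ontology": ["SNOMED-CT UK 2023"]}))
```

Returning the first entry as the version in every branch where it exists mirrors the statement that the version is always the first source ontology.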

class medcat.utils.regression.checking.MetaData(/, **data)

Bases: pydantic.BaseModel

The metadata for the regression suite.

This should define which ontology (e.g. UMLS or SNOMED) as well as which version was used when generating the regression suite.

The metadata may contain further information as well; this may include the annotator(s) involved when converting from an MCT export, or other relevant data.

Parameters:

data (Any)

ontology: str
ontology_version: str
extra: dict
regr_suite_creation_date: str
classmethod from_modelcard(model_card)

Generate a MetaData object from a model card.

This involves reading ontology info and version from the model card.

It must be noted that the model card should be provided as a dict not a string.

Parameters:

model_card (dict) – The CAT modelcard

Returns:

MetaData – The resulting MetaData

Return type:

MetaData

classmethod unknown()
Return type:

MetaData

model_config: ClassVar[pydantic.config.ConfigDict]

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_fields: ClassVar[Dict[str, pydantic.fields.FieldInfo]]

Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo] objects.

This replaces Model.__fields__ from Pydantic V1.

model_computed_fields: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]]

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

__class_vars__: ClassVar[set[str]]

The names of the class variables defined on the model.

__private_attributes__: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]]

Metadata about the private attributes of the model.

__signature__: ClassVar[inspect.Signature]

The synthesized __init__ [Signature][inspect.Signature] of the model.

__pydantic_complete__: ClassVar[bool] = False

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__: ClassVar[pydantic_core.CoreSchema]

The core schema of the model.

__pydantic_custom_init__: ClassVar[bool]

Whether the model has a custom __init__ method.

__pydantic_decorators__: ClassVar[pydantic._internal._decorators.DecoratorInfos]

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__: ClassVar[pydantic._internal._generics.PydanticGenericMetadata]

Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.

__pydantic_parent_namespace__: ClassVar[Dict[str, Any] | None] = None

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__: ClassVar[None | Literal['model_post_init']]

The name of the post-init method for the model, if defined.

__pydantic_root_model__: ClassVar[bool] = False

Whether the model is a [RootModel][pydantic.root_model.RootModel].

__pydantic_serializer__: ClassVar[pydantic_core.SchemaSerializer]

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator]

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_extra__: dict[str, Any] | None

A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.

__pydantic_fields_set__: set[str]

The names of fields explicitly set during instantiation.

__pydantic_private__: dict[str, Any] | None

Values of private attributes set on the model instance.

__slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')
__init__(/, **data)

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or `None` if `config.extra` is not set to `"allow"`.

Return type:

dict[str, Any] | None

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

Return type:

set[str]

classmethod model_construct(_fields_set=None, **values)

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

Note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == ‘allow’, then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == ‘ignore’ (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == ‘forbid’ does not result in an error if extra values are passed, but they will be ignored.

Parameters:
  • _fields_set (set[str] | None) – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the [model_fields_set][pydantic.BaseModel.model_fields_set] attribute. Otherwise, the field names from the values argument will be used.

  • values (Any) – Trusted or pre-validated data dictionary.

Returns:

A new instance of the `Model` class with validated data.

Return type:

typing_extensions.Self

model_copy(*, update=None, deep=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy

Returns a copy of the model.

Parameters:
  • update (dict[str, Any] | None) – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep (bool) – Set to True to make a deep copy of the model.

Returns:

New model instance.

Return type:

typing_extensions.Self

model_dump(*, mode='python', include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters:
  • mode (Literal['json', 'python'] | str) – The mode in which to_python should run. If mode is ‘json’, the output will only contain JSON serializable types. If mode is ‘python’, the output may contain non-JSON-serializable Python objects.

  • include (IncEx | None) – A set of fields to include in the output.

  • exclude (IncEx | None) – A set of fields to exclude from the output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/”none” ignores them, True/”warn” logs errors, “error” raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

Return type:

dict[str, Any]

model_dump_json(*, indent=None, include=None, exclude=None, context=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False, warnings=True, serialize_as_any=False)

Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json

Generates a JSON representation of the model using Pydantic’s to_json method.

Parameters:
  • indent (int | None) – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include (IncEx | None) – Field(s) to include in the JSON output.

  • exclude (IncEx | None) – Field(s) to exclude from the JSON output.

  • context (Any | None) – Additional context to pass to the serializer.

  • by_alias (bool) – Whether to serialize using field aliases.

  • exclude_unset (bool) – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults (bool) – Whether to exclude fields that are set to their default value.

  • exclude_none (bool) – Whether to exclude fields that have a value of None.

  • round_trip (bool) – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings (bool | Literal['none', 'warn', 'error']) – How to handle serialization errors. False/”none” ignores them, True/”warn” logs errors, “error” raises a [PydanticSerializationError][pydantic_core.PydanticSerializationError].

  • serialize_as_any (bool) – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

Return type:

str

classmethod model_json_schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, schema_generator=GenerateJsonSchema, mode='validation')

Generates a JSON schema for a model class.

Parameters:
  • by_alias (bool) – Whether to use attribute aliases or not.

  • ref_template (str) – The reference template.

  • schema_generator (type[pydantic.json_schema.GenerateJsonSchema]) – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications

  • mode (pydantic.json_schema.JsonSchemaMode) – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

Return type:

dict[str, Any]

classmethod model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Parameters:

params (tuple[type[Any], Ellipsis]) – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where `params` are passed to `cls` as type variables.

Raises:

TypeError – Raised when trying to generate concrete names for non-generic models.

Return type:

str

model_post_init(__context)

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Parameters:

__context (Any)

Return type:

None

classmethod model_rebuild(*, force=False, raise_errors=True, _parent_namespace_depth=2, _types_namespace=None)

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Parameters:
  • force (bool) – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors (bool) – Whether to raise errors, defaults to True.

  • _parent_namespace_depth (int) – The depth level of the parent namespace, defaults to 2.

  • _types_namespace (dict[str, Any] | None) – The types namespace, defaults to None.

Returns:
  • Returns `None` if the schema is already “complete” and rebuilding was not required.

  • If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.

Return type:

bool | None

classmethod model_validate(obj, *, strict=None, from_attributes=None, context=None)

Validate a pydantic model instance.

Parameters:
  • obj (Any) – The object to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • from_attributes (bool | None) – Whether to extract data from object attributes.

  • context (Any | None) – Additional context to pass to the validator.

Raises:

ValidationError – If the object could not be validated.

Returns:

The validated model instance.

Return type:

typing_extensions.Self

classmethod model_validate_json(json_data, *, strict=None, context=None)

Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing

Validate the given JSON data against the Pydantic model.

Parameters:
  • json_data (str | bytes | bytearray) – The JSON data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Raises:

ValidationError – If json_data is not a JSON string or the object could not be validated.

Return type:

typing_extensions.Self

classmethod model_validate_strings(obj, *, strict=None, context=None)

Validate the given object with string data against the Pydantic model.

Parameters:
  • obj (Any) – The object containing string data to validate.

  • strict (bool | None) – Whether to enforce types strictly.

  • context (Any | None) – Extra variables to pass to the validator.

Returns:

The validated Pydantic model.

Return type:

typing_extensions.Self

classmethod __get_pydantic_core_schema__(source, handler, /)

Hook into generating the model’s CoreSchema.

Parameters:
  • source (type[BaseModel]) – The class we are generating a schema for. This will generally be the same as the cls argument if this is a classmethod.

  • handler (pydantic.annotated_handlers.GetCoreSchemaHandler) – A callable that calls into Pydantic’s internal CoreSchema generation logic.

Returns:

A `pydantic-core` `CoreSchema`.

Return type:

pydantic_core.CoreSchema

classmethod __get_pydantic_json_schema__(core_schema, handler, /)

Hook into generating the model’s JSON schema.

Parameters:
  • core_schema (pydantic_core.CoreSchema) – A pydantic-core CoreSchema. You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema ({‘type’: ‘nullable’, ‘schema’: current_schema}), or just call the handler with the original schema.

  • handler (pydantic.annotated_handlers.GetJsonSchemaHandler) – Call into Pydantic’s internal JSON schema generation. This will raise a pydantic.errors.PydanticInvalidForJsonSchema if JSON schema generation fails. Since this gets called by BaseModel.model_json_schema you can override the schema_generator argument to that function to change JSON schema generation globally for a type.

Returns:

A JSON schema, as a Python object.

Return type:

pydantic.json_schema.JsonSchemaValue

classmethod __pydantic_init_subclass__(**kwargs)

This is intended to behave just like __init_subclass__, but is called by ModelMetaclass only after the class is actually fully initialized. In particular, attributes like model_fields will be present when this is called.

This is necessary because __init_subclass__ will always be called by type.__new__, and it would require a prohibitively large refactor to the ModelMetaclass to ensure that type.__new__ was called in such a manner that the class would already be sufficiently initialized.

This will receive the same kwargs that would be passed to the standard __init_subclass__, namely, any kwargs passed to the class definition that aren’t used internally by pydantic.

Parameters:

**kwargs (Any) – Any keyword arguments passed to the class definition that aren’t used internally by pydantic.

Return type:

None

classmethod __class_getitem__(typevar_values)
Parameters:

typevar_values (type[Any] | tuple[type[Any], Ellipsis])

Return type:

type[BaseModel] | pydantic._internal._forward_ref.PydanticRecursiveRef

__copy__()

Returns a shallow copy of the model.

Return type:

typing_extensions.Self

__deepcopy__(memo=None)

Returns a deep copy of the model.

Parameters:

memo (dict[int, Any] | None)

Return type:

typing_extensions.Self

__getattr__(item)
Parameters:

item (str)

Return type:

Any

_check_frozen(name, value)
Parameters:
  • name (str)

  • value (Any)

Return type:

None

__getstate__()
Return type:

dict[Any, Any]

__setstate__(state)
Parameters:

state (dict[Any, Any])

Return type:

None

__eq__(other)
Parameters:

other (Any)

Return type:

bool

classmethod __init_subclass__(**kwargs)

This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs.

```py
from pydantic import BaseModel

class MyModel(BaseModel, extra='allow'):
    ...
```

However, this may be deceiving, since the _actual_ calls to __init_subclass__ will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way ModelMetaclass.__new__ works.)

Parameters:

**kwargs (typing_extensions.Unpack[pydantic.config.ConfigDict]) – Keyword arguments passed to the class definition, which set model_config

Note

You may want to override __pydantic_init_subclass__ instead, which behaves similarly but is called after the class is fully initialized.

__iter__()

So dict(model) works.

Return type:

TupleGenerator
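
The reason dict(model) works is that dict() accepts any iterable of key/value pairs. Below is a minimal sketch of that mechanism using a plain class. `Point` is a made-up stand-in that does not use pydantic, so it only illustrates the idea rather than the library's implementation.

```python
class Point:
    """Plain stand-in class (not a pydantic model)."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __iter__(self):
        # Yield (field_name, value) tuples, which is what dict() consumes.
        yield from self.__dict__.items()


as_dict = dict(Point(1, 2))
print(as_dict)
```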

__repr__()
Return type:

str

__repr_args__()
Return type:

pydantic._internal._repr.ReprArgs

__repr_name__
__repr_str__
__pretty__
__rich_repr__
__str__()
Return type:

str

property __fields__: dict[str, pydantic.fields.FieldInfo]
Return type:

dict[str, pydantic.fields.FieldInfo]

property __fields_set__: set[str]
Return type:

set[str]

dict(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

Return type:

Dict[str, Any]

json(*, include=None, exclude=None, by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, encoder=PydanticUndefined, models_as_dict=PydanticUndefined, **dumps_kwargs)
Parameters:
  • include (IncEx | None)

  • exclude (IncEx | None)

  • by_alias (bool)

  • exclude_unset (bool)

  • exclude_defaults (bool)

  • exclude_none (bool)

  • encoder (Callable[[Any], Any] | None)

  • models_as_dict (bool)

  • dumps_kwargs (Any)

Return type:

str

classmethod parse_obj(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod parse_raw(b, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • b (str | bytes)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod parse_file(path, *, content_type=None, encoding='utf8', proto=None, allow_pickle=False)
Parameters:
  • path (str | pathlib.Path)

  • content_type (str | None)

  • encoding (str)

  • proto (pydantic.deprecated.parse.Protocol | None)

  • allow_pickle (bool)

Return type:

typing_extensions.Self

classmethod from_orm(obj)
Parameters:

obj (Any)

Return type:

typing_extensions.Self

classmethod construct(_fields_set=None, **values)
Parameters:
  • _fields_set (set[str] | None)

  • values (Any)

Return type:

typing_extensions.Self

copy(*, include=None, exclude=None, update=None, deep=False)

Returns a copy of the model.

Deprecated

This method is now deprecated; use model_copy instead.

If you need include or exclude, use:

```py
data = self.model_dump(include=include, exclude=exclude, round_trip=True)
data = {**data, **(update or {})}
copied = self.model_validate(data)
```

Parameters:
  • include (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to include in the copied model.

  • exclude (pydantic._internal._utils.AbstractSetIntStr | pydantic._internal._utils.MappingIntStrAny | None) – Optional set or mapping specifying which fields to exclude in the copied model.

  • update (Dict[str, Any] | None) – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep (bool) – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.

Return type:

typing_extensions.Self

classmethod schema(by_alias=True, ref_template=DEFAULT_REF_TEMPLATE)
Parameters:
  • by_alias (bool)

  • ref_template (str)

Return type:

Dict[str, Any]

classmethod schema_json(*, by_alias=True, ref_template=DEFAULT_REF_TEMPLATE, **dumps_kwargs)
Parameters:
  • by_alias (bool)

  • ref_template (str)

  • dumps_kwargs (Any)

Return type:

str

classmethod validate(value)
Parameters:

value (Any)

Return type:

typing_extensions.Self

classmethod update_forward_refs(**localns)
Parameters:

localns (Any)

Return type:

None

_iter(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_copy_and_set_values(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

classmethod _get_value(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

_calculate_keys(*args, **kwargs)
Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Any

medcat.utils.regression.checking.fix_np_float64(d)

Fix numpy.float64 in dictionary for yaml saving purposes.

These types of objects are unable to be cleanly serialized using yaml. So we need to convert them to the corresponding floats.

The changes will be made within the dictionary itself as well as dictionaries within, recursively.

Parameters:

d (dict) – The input dict

Return type:

None
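
The recursive, in-place conversion described above could look roughly like the sketch below. It is not MedCAT's implementation. It relies on the fact that numpy.float64 is a subclass of Python's float, and uses a hypothetical `FakeFloat64` stand-in so that the example runs without numpy. `to_plain_floats` is likewise an illustrative name.

```python
class FakeFloat64(float):
    """Stand-in for numpy.float64 so the sketch runs without numpy."""


def to_plain_floats(d: dict) -> None:
    """Replace float-subclass values with plain floats, in place and recursively."""
    for key, value in d.items():
        if isinstance(value, dict):
            # Descend into nested dictionaries.
            to_plain_floats(value)
        elif isinstance(value, float) and type(value) is not float:
            # Re-assigning an existing key does not change the dict's size,
            # so this is safe while iterating.
            d[key] = float(value)


data = {"score": FakeFloat64(0.5), "nested": {"f1": FakeFloat64(0.25)}, "n": 3}
to_plain_floats(data)
print(type(data["score"]), type(data["nested"]["f1"]))
```

Mutating the dict in place matches the documented None return type: the caller keeps using the same object afterwards.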

class medcat.utils.regression.checking.RegressionSuite(cases, metadata, name)

The regression checker. This is used to check a bunch of regression cases at once against a model.

Parameters:
  • cases (list[RegressionCase]) – The list of regression cases

  • metadata (MetaData) – The metadata for the regression suite

  • use_report (bool) – Whether or not to use the report functionality. Defaults to False.

  • name (str) – The name of the regression suite.

__init__(cases, metadata, name)
Parameters:
Return type:

None

cases: list[RegressionCase]
report
metadata
get_all_distinct_cases(translation, edit_distance, use_diacritics)

Gets all the distinct cases for this regression suite.

While distinct cases can be determined without the translation layer, including it here simplifies the process.

Parameters:
  • translation (TranslationLayer) – The translation layer.

  • edit_distance (tuple[int, int, int]) – The edit distance(s) to try. Defaults to (0, 0, 0).

  • use_diacritics (bool) – Whether to use diacritics for edit distance.

Yields:

Iterator[tuple[RegressionCase, Iterator[FinalTarget]]] – The generator of the regression case along with its corresponding sub-cases.

Return type:

Iterator[tuple[RegressionCase, Iterator[medcat.utils.regression.targeting.FinalTarget]]]

estimate_total_distinct_cases()
Return type:

int

iter_subcases(translation, show_progress=True, edit_distance=(0, 0, 0), use_diacritics=False)

Iterate over all the sub-cases.

Each sub-case presents a unique target (phrase, concept, name) on the corresponding regression case.

Parameters:
  • translation (TranslationLayer) – The translation layer.

  • show_progress (bool) – Whether to show progress. Defaults to True.

  • edit_distance (tuple[int, int, int]) – The edit distance(s) to try. Defaults to (0, 0, 0).

  • use_diacritics (bool) – Whether to use diacritics for edit distance.

Yields:

Iterator[tuple[RegressionCase, FinalTarget]] – The generator of the regression case along with each of the final target sub-cases.

Return type:

Iterator[tuple[RegressionCase, medcat.utils.regression.targeting.FinalTarget]]
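
Judging by the return types, iter_subcases yields flat (case, target) pairs, whereas get_all_distinct_cases yields each case together with an iterator of its targets. The sketch below shows that flattening with plain strings standing in for RegressionCase and FinalTarget. It is an assumption about how the two shapes relate, not MedCAT's code, and `flatten_subcases` is a hypothetical name.

```python
from typing import Iterable, Iterator, Tuple


def flatten_subcases(
    grouped: Iterable[Tuple[str, Iterable[str]]],
) -> Iterator[Tuple[str, str]]:
    """Turn (case, targets) groups into one (case, target) pair per target."""
    for case, targets in grouped:
        for target in targets:
            yield case, target


# Strings stand in for the real case and target objects.
grouped_cases = [
    ("case-a", iter(["target-1", "target-2"])),
    ("case-b", iter(["target-3"])),
]
pairs = list(flatten_subcases(grouped_cases))
print(pairs)
```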

check_model(cat, translation, edit_distance=(0, 0, 0), use_diacritics=False)

Checks the model and generates a report.

Parameters:
  • cat (CAT) – The model to check against

  • translation (TranslationLayer) – The translation layer

  • edit_distance (tuple[int, int, int]) – The edit distance of the names. Defaults to (0, 0, 0).

  • use_diacritics (bool) – Whether to use diacritics for edit distance.

Returns:

MultiDescriptor – A report description

Return type:

medcat.utils.regression.results.MultiDescriptor

__str__()
Return type:

str

__repr__()
Return type:

str

to_dict()

Converts the RegressionSuite to a dict for serialisation.

Returns:

dict – The dict representation

Return type:

dict

to_yaml()

Converts the RegressionSuite to a YAML string.

Returns:

str – The YAML representation

Return type:

str

__eq__(other)
Parameters:

other (object)

Return type:

bool

classmethod from_dict(in_dict, name)

Construct a RegressionSuite from a dict.

Most of the per-case parsing is left to RegressionCase. This method just assumes that each key in the dict is a case name and each value describes a RegressionCase.

Parameters:
  • in_dict (dict) – The input dict.

  • name (str) – The name of the regression suite.

Returns:

RegressionSuite – The built regression suite

Return type:

RegressionSuite
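
The dict layout assumed by from_dict (one key per case name, one value describing that case) can be sketched with plain dataclasses. `CaseSketch` and `SuiteSketch` are hypothetical stand-ins, and the "phrases" key in the sample input is illustrative only; the real case schema is defined by RegressionCase.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class CaseSketch:
    """Stand-in for a regression case: a name plus its raw description."""
    name: str
    spec: Dict[str, Any]


@dataclass
class SuiteSketch:
    """Stand-in for a regression suite built from a name-to-case mapping."""
    name: str
    cases: List[CaseSketch]

    @classmethod
    def from_dict(cls, in_dict: Dict[str, Dict[str, Any]], name: str) -> "SuiteSketch":
        # Each key is treated as a case name, each value as that case's description.
        cases = [CaseSketch(case_name, spec) for case_name, spec in in_dict.items()]
        return cls(name=name, cases=cases)


suite = SuiteSketch.from_dict(
    {"case-one": {"phrases": ["..."]}, "case-two": {"phrases": ["..."]}},
    name="demo-suite",
)
print([case.name for case in suite.cases])
```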

classmethod from_yaml(file_name)

Constructs a RegressionSuite from a YAML file.

The from_dict method is used for the construction from the dict.

Parameters:

file_name (str) – The file name

Returns:

RegressionSuite – The constructed regression suite

Return type:

RegressionSuite

classmethod from_mct_export(file_name)
Parameters:

file_name (str)

Return type:

RegressionSuite

exception medcat.utils.regression.checking.MalformedRegressionCaseException(*args)

Bases: ValueError

Inappropriate argument value (of correct type).

Parameters:

args (object)
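
Since the exception derives from ValueError, code that already handles ValueError will also catch it. The sketch below illustrates this with a stand-in class. `MalformedCaseSketch`, `load_case` and the "phrases" check are hypothetical and only mirror the documented base class, not the conditions under which MedCAT actually raises the exception.

```python
class MalformedCaseSketch(ValueError):
    """Stand-in with the same base class as the documented exception."""


def load_case(raw: dict) -> dict:
    """Hypothetical loader that rejects a case description without phrases."""
    if "phrases" not in raw:
        raise MalformedCaseSketch("Regression case has no phrases")
    return raw


try:
    load_case({})
except ValueError as err:
    # The subclass is caught by a handler for its base class.
    caught = err

print(type(caught).__name__, caught)
```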

__init__(*args)

Initialize self. See help(type(self)) for accurate signature.

Parameters:

args (object)

Return type:

None

class __cause__

exception cause

class __context__

exception context

__delattr__()

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__()

Return self==value.

__format__()

Default object formatter.

__ge__()

Return self>=value.

__getattribute__()

Return getattr(self, name).

__gt__()

Return self>value.

__hash__()

Return hash(self).

__le__()

Return self<=value.

__lt__()

Return self<value.

__ne__()

Return self!=value.

__new__()

Create and return a new object. See help(type) for accurate signature.

__reduce__()
__reduce_ex__()

Helper for pickle.

__repr__()

Return repr(self).

__setattr__()

Implement setattr(self, name, value).

__setstate__()
__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

class __suppress_context__
class __traceback__
class args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.