medcat2.utils.postprocessing

Classes

MutableDocument

The mutable parts of the document.

MutableEntity

The mutable part of an entity.

Functions

create_main_ann(doc)

Creates annotation in the spacy ents list

Module Contents

class medcat2.utils.postprocessing.MutableDocument

Bases: Protocol

The mutable parts of the document.

Represents parts of the document that can / should be changed by the various components.

property base: BaseDocument

The base document.

Return type:

BaseDocument

property final_ents: list[MutableEntity]

The linked entities associated with the document.

This should be set by the linker.

Return type:

list[MutableEntity]

property all_ents: list[MutableEntity]

All entities recognised by NER.

This should be set by the NER component.

Return type:

list[MutableEntity]

__iter__()

Return type:

Iterator[MutableToken]

__getitem__(index: int) → MutableToken
__getitem__(index: slice) → MutableEntity

get_tokens(start_index, end_index)

Get the tokens that span the specified character indices.

Parameters:
  • start_index (int) – The starting character index.

  • end_index (int) – The ending character index.

Returns:

list[MutableToken] – The list of tokens.

Return type:

list[MutableToken]
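
The character-based lookup described above can be sketched as follows. This is only an illustration, not the medcat2 implementation: SketchToken, tokenize and tokens_in_span are hypothetical names, and treating the range as half-open with overlap semantics is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SketchToken:
    """Hypothetical stand-in for a token; not part of medcat2."""
    text: str
    char_start: int  # index of the first character
    char_end: int    # index one past the last character


def tokenize(text: str) -> list[SketchToken]:
    """Whitespace tokenizer that records character offsets."""
    tokens, pos = [], 0
    for word in text.split():
        start = text.index(word, pos)
        pos = start + len(word)
        tokens.append(SketchToken(word, start, pos))
    return tokens


def tokens_in_span(tokens: list[SketchToken], start_index: int,
                   end_index: int) -> list[SketchToken]:
    """Return tokens overlapping [start_index, end_index) (assumed semantics)."""
    return [
        tok for tok in tokens
        if tok.char_start < end_index and tok.char_end > start_index
    ]


tokens = tokenize("Patient has kidney failure today")
# Characters 12..26 cover the words "kidney" and "failure".
selected = tokens_in_span(tokens, 12, 26)
print([tok.text for tok in selected])
```

An implementing class may well treat the end index or partially covered tokens differently, so check the concrete tokenizer before relying on boundary behaviour.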

set_addon_data(path, val)

Used to add arbitrary data to the document.

This is generally used by addons to keep track of their data.

NB! The path used needs to be registered using the register_addon_path class method.

Parameters:
  • path (str) – The data ID / path.

  • val (Any) – The value to be added.

Return type:

None

get_addon_data(path)

Get data added to the document.

See set_addon_data for details.

Parameters:

path (str) – The data ID / path.

Returns:

Any – The stored value.

Return type:

Any

classmethod register_addon_path(path, def_val=None, force=True)

Register a custom/arbitrary data path.

This can be used to store arbitrary data along with the document for use in an addon (e.g. MetaCAT).

PS: If using this, it is important to use paths namespaced to the component you’re using in order to avoid conflicts.

Parameters:
  • path (str) – The path to be used. Should be prefixed by the component name (e.g. meta_cat_id for an ID tied to the meta_cat addon).

  • def_val (Any) – Default value. Defaults to None.

  • force (bool) – Whether to forcefully add the value. Defaults to True.

Return type:

None
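
The way register_addon_path, set_addon_data and get_addon_data fit together might look roughly like the sketch below. SketchDocument is a hypothetical class written for illustration. The error types, the fallback to the default value and the meaning given to force are all assumptions, not documented medcat2 behaviour.

```python
from typing import Any


class SketchDocument:
    """Hypothetical class; shows one way the three methods could interact."""

    # Registered paths and their default values, shared by all instances.
    _addon_defaults: dict[str, Any] = {}

    def __init__(self) -> None:
        self._addon_data: dict[str, Any] = {}

    @classmethod
    def register_addon_path(cls, path: str, def_val: Any = None,
                            force: bool = True) -> None:
        # Assumption: without force, re-registering a path is an error.
        if path in cls._addon_defaults and not force:
            raise ValueError(f"Path already registered: {path}")
        cls._addon_defaults[path] = def_val

    def set_addon_data(self, path: str, val: Any) -> None:
        # Assumption: writing to an unregistered path is rejected.
        if path not in self._addon_defaults:
            raise KeyError(f"Unregistered addon path: {path}")
        self._addon_data[path] = val

    def get_addon_data(self, path: str) -> Any:
        # Assumption: unset paths fall back to the registered default.
        if path not in self._addon_defaults:
            raise KeyError(f"Unregistered addon path: {path}")
        return self._addon_data.get(path, self._addon_defaults[path])


# The path is namespaced with the addon name, as recommended above.
SketchDocument.register_addon_path("meta_cat_id", def_val=-1)
doc = SketchDocument()
default_value = doc.get_addon_data("meta_cat_id")
doc.set_addon_data("meta_cat_id", 7)
stored_value = doc.get_addon_data("meta_cat_id")
print(default_value, stored_value)
```

Registering on the class while storing values per instance keeps the set of valid paths global and the data local to each document, which is one plausible reading of why registration is a class method.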

class medcat2.utils.postprocessing.MutableEntity

Bases: Protocol

The mutable part of an entity.

This represents the changeable part of an entity, that is, the parts that should be changed by the various components.

property base: BaseEntity

The base / static entity part.

Return type:

BaseEntity

property detected_name: str

The detected name (if any) for this entity.

This should be set by the NER component.

Return type:

str

set_addon_data(path, val)

Used to add arbitrary data to the entity.

This is generally used by addons to keep track of their data.

NB! The path used needs to be registered using the register_addon_path class method.

Parameters:
  • path (str) – The data ID / path.

  • val (Any) – The value to be added.

Return type:

None

get_addon_data(path)

Get data added to the entity.

See set_addon_data for details.

Parameters:

path (str) – The data ID / path.

Returns:

Any – The stored value.

Return type:

Any

property link_candidates: list[str]

The candidates for the detected name (if any) for this entity.

This should be set by the NER component.

Return type:

list[str]

property context_similarity: float

The context similarity of the linked entity.

This should be set by the linker component.

Return type:

float

property confidence: float

The confidence for the linked entity.

NOTE: This seems to be unused!

Return type:

float

property cui: str

The CUI of the linked entity.

This should be set by the linker component.

Return type:

str

property id: int

The ID of the entity within the document.

This counts all the entities recognised, not just ones that were successfully linked.

This should be set by the NER.

Return type:

int
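
A small sketch can illustrate why IDs in the linked list need not be contiguous. SketchEntity, the zero-based numbering and the placeholder CUI strings are assumptions made for this example. Only the idea that NER numbers every recognised entity and the linker keeps a subset comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchEntity:
    """Hypothetical stand-in for an entity; not part of medcat2."""
    id: int
    detected_name: str
    cui: Optional[str] = None  # stays None when linking fails


# NER step (assumed): every recognised entity gets the next ID.
names = ["kidney failure", "tuesday", "diabetes"]
all_ents = [SketchEntity(id=i, detected_name=n) for i, n in enumerate(names)]

# Linking step (assumed): only some entities resolve to a CUI.
# The CUI strings below are placeholders, not real concept identifiers.
fake_links = {"kidney failure": "CUI_A", "diabetes": "CUI_B"}
final_ents = []
for ent in all_ents:
    if ent.detected_name in fake_links:
        ent.cui = fake_links[ent.detected_name]
        final_ents.append(ent)

# The linked entities keep the IDs they were given during NER.
final_ids = [ent.id for ent in final_ents]
print(final_ids)
```

Here the middle entity is never linked, so the linked list skips its ID. Code that indexes the linked entities by position should therefore not assume that position and id match.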

classmethod register_addon_path(path, def_val=None, force=True)

Register a custom/arbitrary data path.

This can be used to store arbitrary data along with the entity for use in an addon (e.g. MetaCAT).

PS: If using this, it is important to use paths namespaced to the component you’re using in order to avoid conflicts.

Parameters:
  • path (str) – The path to be used. Should be prefixed by the component name (e.g. meta_cat_id for an ID tied to the meta_cat addon).

  • def_val (Any) – Default value. Defaults to None.

  • force (bool) – Whether to forcefully add the value. Defaults to True.

Return type:

None

__iter__()

Return type:

Iterator[MutableToken]

__len__()

Return type:

int

medcat2.utils.postprocessing.create_main_ann(doc)

Creates annotation in the spaCy ents list from all the annotations for this document.

Parameters:

doc (Doc) – spaCy document.

Return type:

None