medcat.utils.postprocessing

Classes

MutableDocument

The mutable parts of the document.

MutableEntity

The mutable part of an entity.

Functions

create_main_ann(doc)

Creates annotations in the spaCy ents list.

Module Contents

class medcat.utils.postprocessing.MutableDocument

Bases: Protocol

The mutable parts of the document.

Represents parts of the document that can / should be changed by the various components.

property base: BaseDocument

The base document.

Return type:

BaseDocument

property linked_ents: list[MutableEntity]

The linked entities associated with the document.

This should be set by the linker.

Return type:

list[MutableEntity]

property ner_ents: list[MutableEntity]

All entities recognised by NER.

This should be set by the NER component.

Return type:

list[MutableEntity]

__iter__()

Return type:

Iterator[MutableToken]

__getitem__(index: int) → MutableToken

__getitem__(index: slice) → MutableEntity

__len__()

Return type:

int

get_tokens(start_index, end_index)

Get the tokens that span the specified character indices.

Parameters:
  • start_index (int) – The starting character index.

  • end_index (int) – The ending character index.

Returns:

list[MutableToken] – The list of tokens.

Return type:

list[MutableToken]
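
As a rough illustration of selecting tokens by character span, the sketch below uses a hypothetical ToyToken type with explicit character offsets. It assumes a token is selected whenever its character range overlaps the half-open range [start_index, end_index); the actual selection rule used by an implementation may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyToken:
    """Hypothetical token carrying its character offsets."""
    text: str
    char_start: int  # inclusive
    char_end: int    # exclusive


def tokenize(text: str) -> list[ToyToken]:
    # Whitespace tokenisation, recording each token's character offsets.
    tokens: list[ToyToken] = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append(ToyToken(word, start, end))
        pos = end
    return tokens


def get_tokens(tokens: list[ToyToken], start_index: int, end_index: int) -> list[ToyToken]:
    # Assumption: a token is selected when its character range overlaps
    # the half-open range [start_index, end_index).
    return [t for t in tokens if t.char_start < end_index and t.char_end > start_index]


text = "patient has chronic kidney disease"
selected = get_tokens(tokenize(text), 12, 26)
selected_texts = [t.text for t in selected]
```

Here the characters 12 to 26 cover "chronic kidney", so the two corresponding tokens are selected.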

set_addon_data(path, val)

Adds arbitrary data to the document.

This is generally used by addons to keep track of their data.

NB! The path used needs to be registered using the register_addon_path class method.

Parameters:
  • path (str) – The data ID / path.

  • val (Any) – The value to be added.

Return type:

None

has_addon_data(path)

Checks whether the addon data for a specific path has been set.

Parameters:

path (str) – The path to check.

Returns:

bool – Whether the addon data has been set.

Return type:

bool

get_addon_data(path)

Gets data previously added to the document.

See set_addon_data for details.

Parameters:

path (str) – The data ID / path.

Returns:

Any – The stored value.

Return type:

Any

get_available_addon_paths()

Gets the available addon data paths for this document.

This will only include paths that have values set.

Returns:

list[str] – List of available addon data paths.

Return type:

list[str]

classmethod register_addon_path(path, def_val=None, force=True)

Register a custom/arbitrary data path.

This can be used to store arbitrary data along with the document for use in an addon (e.g. MetaCAT).

PS: If using this, it is important to use paths namespaced to the component you’re using in order to avoid conflicts.

Parameters:
  • path (str) – The path to be used. Should be prefixed by the component name (e.g. meta_cat_id for an ID tied to the meta_cat addon).

  • def_val (Any) – Default value. Defaults to None.

  • force (bool) – Whether to forcefully add the value. Defaults to True.

Return type:

None
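
The register / set / has / get workflow described above can be sketched with a toy class. ToyDocument and the path my_addon_summary are hypothetical; the error handling, the meaning of force, and the fallback to the default value are assumptions made for illustration, not documented MedCAT behaviour.

```python
from typing import Any


class ToyDocument:
    """Hypothetical stand-in that mimics the documented addon-data methods."""

    # Registered paths and their default values, shared by all instances.
    _addon_defaults: dict[str, Any] = {}

    def __init__(self) -> None:
        self._addon_data: dict[str, Any] = {}

    @classmethod
    def register_addon_path(cls, path: str, def_val: Any = None, force: bool = True) -> None:
        # Assumption: re-registering an existing path requires force=True.
        if path in cls._addon_defaults and not force:
            raise ValueError(f"Path already registered: {path}")
        cls._addon_defaults[path] = def_val

    def set_addon_data(self, path: str, val: Any) -> None:
        # Assumption: setting an unregistered path is an error.
        if path not in self._addon_defaults:
            raise KeyError(f"Unregistered addon path: {path}")
        self._addon_data[path] = val

    def has_addon_data(self, path: str) -> bool:
        return path in self._addon_data

    def get_addon_data(self, path: str) -> Any:
        # Assumption: fall back to the registered default when nothing is set.
        if path in self._addon_data:
            return self._addon_data[path]
        return self._addon_defaults[path]

    def get_available_addon_paths(self) -> list[str]:
        # Only paths that have a value set are reported.
        return list(self._addon_data)


# Namespace the path with the (hypothetical) component name "my_addon".
ToyDocument.register_addon_path("my_addon_summary", def_val="")

doc = ToyDocument()
before = doc.has_addon_data("my_addon_summary")
doc.set_addon_data("my_addon_summary", "two entities found")
after = doc.has_addon_data("my_addon_summary")
value = doc.get_addon_data("my_addon_summary")
paths = doc.get_available_addon_paths()
```

Registration happens once at class level, while the stored values live on each instance, which is why the path must be registered before any instance sets data under it.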

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)

class medcat.utils.postprocessing.MutableEntity

Bases: Protocol

The mutable part of an entity.

This represents the changeable part of an entity, that is, the parts that should be changed by the various components.

property base: BaseEntity

The base / static entity part.

Return type:

BaseEntity

property detected_name: str

The detected name (if any) for this entity.

This should be set by the NER component.

Return type:

str

set_addon_data(path, val)

Used to add arbitrary data to the entity.

This is generally used by addons to keep track of their data.

NB! The path used needs to be registered using the register_addon_path class method.

Parameters:
  • path (str) – The data ID / path.

  • val (Any) – The value to be added.

Return type:

None

has_addon_data(path)

Checks whether the addon data for a specific path has been set.

Parameters:

path (str) – The path to check.

Returns:

bool – Whether the addon data has been set.

Return type:

bool

get_addon_data(path)

Gets data previously added to the entity.

See set_addon_data for details.

Parameters:

path (str) – The data ID / path.

Returns:

Any – The stored value.

Return type:

Any

get_available_addon_paths()

Gets the available addon data paths for this entity.

This will only include paths that have values set.

Returns:

list[str] – List of available addon data paths.

Return type:

list[str]

The candidates for the detected name (if any) for this entity.

This should be set by the NER component.

Return type:

list[str]

property context_similarity: float

The context similarity of the linked entity.

This should be set by the linker component.

Return type:

float

property confidence: float

The confidence for the linked entity.

NOTE: This seems to be unused!

Return type:

float

property cui: str

The CUI of the linked entity.

This should be set by the linker component.

Return type:

str

property id: int

The ID of the entity within the document.

This counts all the entities recognised, not just ones that were successfully linked.

This should be set by the NER.

Return type:

int
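
A typical use of the linker-set properties is to filter entities by context similarity. The sketch below uses a hypothetical ToyEntity dataclass that exposes a few of the properties documented above; the CUIs, names, and scores are invented placeholder values.

```python
from dataclasses import dataclass


@dataclass
class ToyEntity:
    """Hypothetical stand-in exposing a few of the documented properties."""
    id: int
    detected_name: str
    cui: str
    context_similarity: float


def filter_by_similarity(ents: list[ToyEntity], min_similarity: float) -> list[ToyEntity]:
    # Keep entities whose linker-assigned similarity meets the threshold.
    return [e for e in ents if e.context_similarity >= min_similarity]


# Placeholder values for illustration only.
ents = [
    ToyEntity(id=0, detected_name="kidney disease", cui="C0000001", context_similarity=0.91),
    ToyEntity(id=1, detected_name="cold", cui="C0000002", context_similarity=0.32),
]
kept = filter_by_similarity(ents, min_similarity=0.5)
kept_cuis = [e.cui for e in kept]
```

Note that id keeps its original value after filtering, consistent with it counting all recognised entities rather than only the linked ones.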

classmethod register_addon_path(path, def_val=None, force=True)

Register a custom/arbitrary data path.

This can be used to store arbitrary data along with the entity for use in an addon (e.g. MetaCAT).

PS: If using this, it is important to use paths namespaced to the component you’re using in order to avoid conflicts.

Parameters:
  • path (str) – The path to be used. Should be prefixed by the component name (e.g. meta_cat_id for an ID tied to the meta_cat addon).

  • def_val (Any) – Default value. Defaults to None.

  • force (bool) – Whether to forcefully add the value. Defaults to True.

Return type:

None

__iter__()

Return type:

Iterator[MutableToken]

__len__()

Return type:

int

__slots__ = ()
_is_protocol = True
_is_runtime_protocol = False
classmethod __init_subclass__(*args, **kwargs)
classmethod __class_getitem__(params)

medcat.utils.postprocessing.create_main_ann(doc)

Creates annotations in the spaCy ents list from all the annotations for this document.

Parameters:

doc (Doc) – spaCy document.

Return type:

None
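
This page does not state how the final annotations are chosen when candidates overlap. One plausible strategy, shown here purely as an assumption and not as the function's actual implementation, is greedy longest-first selection of non-overlapping spans. ToySpan and select_main_annotations are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySpan:
    """Hypothetical annotation covering tokens [start, end)."""
    start: int
    end: int
    label: str


def select_main_annotations(candidates: list[ToySpan]) -> list[ToySpan]:
    # Assumed strategy: visit longer spans first and keep a span only if
    # none of its tokens is already covered by a previously kept span.
    used_tokens: set[int] = set()
    kept: list[ToySpan] = []
    for span in sorted(candidates, key=lambda s: s.end - s.start, reverse=True):
        span_tokens = set(range(span.start, span.end))
        if used_tokens.isdisjoint(span_tokens):
            used_tokens |= span_tokens
            kept.append(span)
    # Return the kept spans in document order.
    return sorted(kept, key=lambda s: s.start)


candidates = [
    ToySpan(0, 1, "chronic"),
    ToySpan(0, 3, "chronic kidney disease"),
    ToySpan(1, 3, "kidney disease"),
    ToySpan(4, 5, "dialysis"),
]
main = select_main_annotations(candidates)
main_labels = [s.label for s in main]
```

With these candidates, the three-token span wins over the shorter spans it overlaps, and the unrelated single-token span is kept as well.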