medcat.components.addons.meta_cat.mctokenizers.bert_tokenizer
Attributes
- FAKE_TOKENIZER_PATH
Classes
- TokenizerWrapperBase: Helper class that provides a standard way to create an ABC using inheritance.
- TokenizerWrapperBERT: Wrapper around a huggingface BERT tokenizer so that it works with the MetaCAT models.
Module Contents
- class medcat.components.addons.meta_cat.mctokenizers.bert_tokenizer.TokenizerWrapperBase(hf_tokenizer=None)
Bases: abc.ABC
Helper class that provides a standard way to create an ABC using inheritance.
- Parameters:
hf_tokenizer (Optional[tokenizers.Tokenizer])
- name: str
- __init__(hf_tokenizer=None)
- Parameters:
hf_tokenizer (Optional[tokenizers.Tokenizer])
- Return type:
None
- hf_tokenizers = None
- __call__(text: str) dict
- __call__(text: list[str]) list[dict]
- abstract save(dir_path)
- Parameters:
dir_path (str)
- Return type:
None
- abstract classmethod load(dir_path, model_variant='', **kwargs)
- Parameters:
dir_path (str)
model_variant (Optional[str])
- Return type:
tokenizers.Tokenizer
- abstract get_size()
- Return type:
int
- abstract token_to_id(token)
- Parameters:
token (str)
- Return type:
Union[int, list[int]]
- abstract get_pad_id()
- Return type:
Union[Optional[int], list[int]]
- ensure_tokenizer()
- Return type:
tokenizers.Tokenizer
- __slots__ = ()
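The abstract methods listed above (save, load, get_size, token_to_id, get_pad_id) define the contract a concrete wrapper has to fulfil. The following is a minimal, self-contained sketch of that contract, not the MedCAT implementation: ToyWrapperBase and ToyWhitespaceWrapper are hypothetical stand-ins, the JSON vocabulary file is an assumption made for illustration, and the real base class also handles the wrapped Hugging Face tokenizer, which is left out here.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Optional


class ToyWrapperBase(ABC):
    """Hypothetical stand-in that mirrors the abstract methods listed above."""

    name: str

    @abstractmethod
    def save(self, dir_path: str) -> None: ...

    @classmethod
    @abstractmethod
    def load(cls, dir_path: str, model_variant: Optional[str] = "", **kwargs): ...

    @abstractmethod
    def get_size(self) -> int: ...

    @abstractmethod
    def token_to_id(self, token: str) -> int: ...

    @abstractmethod
    def get_pad_id(self) -> Optional[int]: ...


class ToyWhitespaceWrapper(ToyWrapperBase):
    """Hypothetical concrete wrapper backed by a plain dict vocabulary."""

    name = "toy-whitespace-tokenizer"

    def __init__(self, vocab: dict[str, int]) -> None:
        # Assumption: the vocabulary always contains an "[UNK]" entry.
        self.vocab = vocab

    def save(self, dir_path: str) -> None:
        os.makedirs(dir_path, exist_ok=True)
        with open(os.path.join(dir_path, "vocab.json"), "w") as f:
            json.dump(self.vocab, f)

    @classmethod
    def load(cls, dir_path: str, model_variant: Optional[str] = "", **kwargs):
        # model_variant is accepted for signature parity but unused in this toy.
        with open(os.path.join(dir_path, "vocab.json")) as f:
            return cls(json.load(f))

    def get_size(self) -> int:
        return len(self.vocab)

    def token_to_id(self, token: str) -> int:
        # Unknown tokens fall back to the "[UNK]" id.
        return self.vocab.get(token, self.vocab["[UNK]"])

    def get_pad_id(self) -> Optional[int]:
        return self.vocab.get("[PAD]")


# Round trip: save to a temporary directory, then load it back.
wrapper = ToyWhitespaceWrapper({"[PAD]": 0, "[UNK]": 1, "fever": 2})
with tempfile.TemporaryDirectory() as tmp_dir:
    wrapper.save(tmp_dir)
    loaded = ToyWhitespaceWrapper.load(tmp_dir)
```

Because every abstract method is overridden, ToyWhitespaceWrapper can be instantiated, whereas instantiating ToyWrapperBase directly raises TypeError.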
- medcat.components.addons.meta_cat.mctokenizers.bert_tokenizer.FAKE_TOKENIZER_PATH = """# /fake-path-not-exist#/"""
- class medcat.components.addons.meta_cat.mctokenizers.bert_tokenizer.TokenizerWrapperBERT(hf_tokenizers=None)
Bases: medcat.components.addons.meta_cat.mctokenizers.tokenizers.TokenizerWrapperBase
Wrapper around a huggingface BERT tokenizer so that it works with the MetaCAT models.
- Parameters:
hf_tokenizers (Optional[transformers.models.bert.tokenization_bert_fast.BertTokenizerFast]) – A huggingface Fast BERT tokenizer.
- name = 'bert-tokenizer'
- __init__(hf_tokenizers=None)
- Parameters:
hf_tokenizers (Optional[transformers.models.bert.tokenization_bert_fast.BertTokenizerFast])
- Return type:
None
- __call__(text: str) dict
- __call__(text: list[str]) list[dict]
- save(dir_path)
- Parameters:
dir_path (str)
- Return type:
None
- classmethod load(dir_path, model_variant='', **kwargs)
- Parameters:
dir_path (str)
model_variant (Optional[str])
- Return type:
- classmethod create_new(model_variant)
- Parameters:
model_variant (Optional[str])
- Return type:
- get_size()
- Return type:
int
- token_to_id(token)
- Parameters:
token (str)
- Return type:
Union[int, list[int]]
- get_pad_id()
- Return type:
Optional[int]
- hf_tokenizers = None
- ensure_tokenizer()
- Return type:
tokenizers.Tokenizer
- __slots__ = ()
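Both wrapper classes list two __call__ signatures: a single str yields one dict, and a list[str] yields a list[dict]. The sketch below shows one common way to express such a pair with typing.overload and a runtime type check. It is an illustration only: ToyCallable, _encode_one and the dictionary keys "tokens" and "n_tokens" are hypothetical and do not describe what the MedCAT wrappers actually return.

```python
from typing import Union, overload


class ToyCallable:
    """Hypothetical class showing the str / list[str] call overloads."""

    @overload
    def __call__(self, text: str) -> dict: ...

    @overload
    def __call__(self, text: list[str]) -> list[dict]: ...

    def __call__(self, text: Union[str, list[str]]) -> Union[dict, list[dict]]:
        # Dispatch on the runtime type; the overloads above only guide type checkers.
        if isinstance(text, str):
            return self._encode_one(text)
        if isinstance(text, list):
            return [self._encode_one(item) for item in text]
        raise TypeError(f"Unsupported input type: {type(text).__name__}")

    def _encode_one(self, text: str) -> dict:
        # Toy encoding: whitespace split, with hypothetical output keys.
        tokens = text.split()
        return {"tokens": tokens, "n_tokens": len(tokens)}


toy = ToyCallable()
single = toy("chest pain")
batch = toy(["chest pain", "no fever reported"])
```

With this pattern a type checker can infer dict for single and list[dict] for batch, while the runtime behaviour lives in one implementation.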