medcat.components.addons.relation_extraction.modernbert.config ============================================================== .. py:module:: medcat.components.addons.relation_extraction.modernbert.config Attributes ---------- .. autoapisummary:: medcat.components.addons.relation_extraction.modernbert.config.logger Classes ------- .. autoapisummary:: medcat.components.addons.relation_extraction.modernbert.config.ConfigRelCAT medcat.components.addons.relation_extraction.modernbert.config.RelExtrBaseConfig medcat.components.addons.relation_extraction.modernbert.config.RelExtrModernBertConfig Module Contents --------------- .. py:class:: ConfigRelCAT(/, **data) Bases: :py:obj:`medcat.config.config.ComponentConfig` The RelCAT part of the config .. py:attribute:: general :type: General .. py:attribute:: model :type: Model .. py:attribute:: train :type: Train .. py:class:: Config .. py:attribute:: extra :value: 'allow' .. py:attribute:: validate_assignment :value: True .. py:method:: load(load_path = './') :classmethod: Load the config from a file. :param load_path: Path to RelCAT config. Defaults to "./". :type load_path: str :Returns: **ConfigRelCAT** -- The loaded config. .. py:attribute:: comp_name :type: str :value: 'default' The name of the component. If a custom implementation is required, it needs to be registered using `medcat.components.types.register_core_component( , , ) By default, only the 'default' component is registered. .. py:attribute:: _is_dirty :type: bool :value: False .. py:method:: __setattr__(name, value) .. py:property:: is_dirty :type: bool .. py:method:: mark_clean() .. py:method:: get_strategy() .. py:method:: get_init_attrs() :classmethod: .. py:method:: ignore_attrs() :classmethod: .. py:method:: include_properties() :classmethod: .. py:method:: merge_config(other) Merge this config with another config's (partial) model dump. The expectation is that the `other` dict is a partial model dump. 
Values specified there are overwritten into the current config. Values not specified there are left intact. The `other` config can have keys/values that do not exist in the config or sub-config. And they will be added where possible. :param other: The model dump :type other: dict :raises IncorrectConfigValues: If unable to set the attribute, trying to set incorrect value, or trying to set sub-config values in an incorrect format (non-dict). .. py:attribute:: model_config :type: ClassVar[pydantic.config.ConfigDict] Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict]. .. py:attribute:: model_fields :type: ClassVar[Dict[str, pydantic.fields.FieldInfo]] Metadata about the fields defined on the model, mapping of field names to [`FieldInfo`][pydantic.fields.FieldInfo] objects. This replaces `Model.__fields__` from Pydantic V1. .. py:attribute:: model_computed_fields :type: ClassVar[Dict[str, pydantic.fields.ComputedFieldInfo]] A dictionary of computed field names and their corresponding `ComputedFieldInfo` objects. .. py:attribute:: __class_vars__ :type: ClassVar[set[str]] The names of the class variables defined on the model. .. py:attribute:: __private_attributes__ :type: ClassVar[Dict[str, pydantic.fields.ModelPrivateAttr]] Metadata about the private attributes of the model. .. py:attribute:: __signature__ :type: ClassVar[inspect.Signature] The synthesized `__init__` [`Signature`][inspect.Signature] of the model. .. py:attribute:: __pydantic_complete__ :type: ClassVar[bool] :value: False Whether model building is completed, or if there are still undefined fields. .. py:attribute:: __pydantic_core_schema__ :type: ClassVar[pydantic_core.CoreSchema] The core schema of the model. .. py:attribute:: __pydantic_custom_init__ :type: ClassVar[bool] Whether the model has a custom `__init__` method. .. 
py:attribute:: __pydantic_decorators__ :type: ClassVar[pydantic._internal._decorators.DecoratorInfos] Metadata containing the decorators defined on the model. This replaces `Model.__validators__` and `Model.__root_validators__` from Pydantic V1. .. py:attribute:: __pydantic_generic_metadata__ :type: ClassVar[pydantic._internal._generics.PydanticGenericMetadata] Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these. .. py:attribute:: __pydantic_parent_namespace__ :type: ClassVar[Dict[str, Any] | None] :value: None Parent namespace of the model, used for automatic rebuilding of models. .. py:attribute:: __pydantic_post_init__ :type: ClassVar[None | Literal['model_post_init']] The name of the post-init method for the model, if defined. .. py:attribute:: __pydantic_root_model__ :type: ClassVar[bool] :value: False Whether the model is a [`RootModel`][pydantic.root_model.RootModel]. .. py:attribute:: __pydantic_serializer__ :type: ClassVar[pydantic_core.SchemaSerializer] The `pydantic-core` `SchemaSerializer` used to dump instances of the model. .. py:attribute:: __pydantic_validator__ :type: ClassVar[pydantic_core.SchemaValidator | pydantic.plugin._schema_validator.PluggableSchemaValidator] The `pydantic-core` `SchemaValidator` used to validate instances of the model. .. py:attribute:: __pydantic_extra__ :type: dict[str, Any] | None A dictionary containing extra values, if [`extra`][pydantic.config.ConfigDict.extra] is set to `'allow'`. .. py:attribute:: __pydantic_fields_set__ :type: set[str] The names of fields explicitly set during instantiation. .. py:attribute:: __pydantic_private__ :type: dict[str, Any] | None Values of private attributes set on the model instance. .. py:attribute:: __slots__ :value: ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__') .. 
py:method:: __init__(/, **data) Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. `self` is explicitly positional-only to allow `self` as a field name. .. py:property:: model_extra :type: dict[str, Any] | None Get extra fields set during validation. :Returns: **A dictionary of extra fields, or `None` if `config.extra` is not set to `"allow"`.** .. py:property:: model_fields_set :type: set[str] Returns the set of fields that have been explicitly set on this model instance. :Returns: **A set of strings representing the fields that have been set,** -- i.e. that were not filled from defaults. .. py:method:: model_construct(_fields_set = None, **values) :classmethod: Creates a new instance of the `Model` class with validated data. Creates a new model setting `__dict__` and `__pydantic_fields_set__` from trusted or pre-validated data. Default values are respected, but no other validation is performed. !!! note `model_construct()` generally respects the `model_config.extra` setting on the provided model. That is, if `model_config.extra == 'allow'`, then all extra passed values are added to the model instance's `__dict__` and `__pydantic_extra__` fields. If `model_config.extra == 'ignore'` (the default), then all extra passed values are ignored. Because no validation is performed with a call to `model_construct()`, having `model_config.extra == 'forbid'` does not result in an error if extra values are passed, but they will be ignored. :param _fields_set: A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the [`model_fields_set`][pydantic.BaseModel.model_fields_set] attribute. Otherwise, the field names from the `values` argument will be used. :param values: Trusted or pre-validated data dictionary. 
:Returns: **A new instance of the `Model` class with validated data.** .. py:method:: model_copy(*, update = None, deep = False) Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#model_copy Returns a copy of the model. :param update: Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data. :param deep: Set to `True` to make a deep copy of the model. :Returns: **New model instance.** .. py:method:: model_dump(*, mode = 'python', include = None, exclude = None, context = None, by_alias = False, exclude_unset = False, exclude_defaults = False, exclude_none = False, round_trip = False, warnings = True, serialize_as_any = False) Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump Generate a dictionary representation of the model, optionally specifying which fields to include or exclude. :param mode: The mode in which `to_python` should run. If mode is 'json', the output will only contain JSON serializable types. If mode is 'python', the output may contain non-JSON-serializable Python objects. :param include: A set of fields to include in the output. :param exclude: A set of fields to exclude from the output. :param context: Additional context to pass to the serializer. :param by_alias: Whether to use the field's alias in the dictionary key if defined. :param exclude_unset: Whether to exclude fields that have not been explicitly set. :param exclude_defaults: Whether to exclude fields that are set to their default value. :param exclude_none: Whether to exclude fields that have a value of `None`. :param round_trip: If True, dumped values should be valid as input for non-idempotent types such as Json[T]. :param warnings: How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a [`PydanticSerializationError`][pydantic_core.PydanticSerializationError]. 
:param serialize_as_any: Whether to serialize fields with duck-typing serialization behavior. :Returns: **A dictionary representation of the model.** .. py:method:: model_dump_json(*, indent = None, include = None, exclude = None, context = None, by_alias = False, exclude_unset = False, exclude_defaults = False, exclude_none = False, round_trip = False, warnings = True, serialize_as_any = False) Usage docs: https://docs.pydantic.dev/2.9/concepts/serialization/#modelmodel_dump_json Generates a JSON representation of the model using Pydantic's `to_json` method. :param indent: Indentation to use in the JSON output. If None is passed, the output will be compact. :param include: Field(s) to include in the JSON output. :param exclude: Field(s) to exclude from the JSON output. :param context: Additional context to pass to the serializer. :param by_alias: Whether to serialize using field aliases. :param exclude_unset: Whether to exclude fields that have not been explicitly set. :param exclude_defaults: Whether to exclude fields that are set to their default value. :param exclude_none: Whether to exclude fields that have a value of `None`. :param round_trip: If True, dumped values should be valid as input for non-idempotent types such as Json[T]. :param warnings: How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a [`PydanticSerializationError`][pydantic_core.PydanticSerializationError]. :param serialize_as_any: Whether to serialize fields with duck-typing serialization behavior. :Returns: **A JSON string representation of the model.** .. py:method:: model_json_schema(by_alias = True, ref_template = DEFAULT_REF_TEMPLATE, schema_generator = GenerateJsonSchema, mode = 'validation') :classmethod: Generates a JSON schema for a model class. :param by_alias: Whether to use attribute aliases or not. :param ref_template: The reference template. 
:param schema_generator: To override the logic used to generate the JSON schema, as a subclass of `GenerateJsonSchema` with your desired modifications :param mode: The mode in which to generate the schema. :Returns: **The JSON schema for the given model class.** .. py:method:: model_parametrized_name(params) :classmethod: Compute the class name for parametrizations of generic classes. This method can be overridden to achieve a custom naming scheme for generic BaseModels. :param params: Tuple of types of the class. Given a generic class `Model` with 2 type variables and a concrete model `Model[str, int]`, the value `(str, int)` would be passed to `params`. :Returns: **String representing the new class where `params` are passed to `cls` as type variables.** :raises TypeError: Raised when trying to generate concrete names for non-generic models. .. py:method:: model_post_init(__context) Override this method to perform additional initialization after `__init__` and `model_construct`. This is useful if you want to do some validation that requires the entire model to be initialized. .. py:method:: model_rebuild(*, force = False, raise_errors = True, _parent_namespace_depth = 2, _types_namespace = None) :classmethod: Try to rebuild the pydantic-core schema for the model. This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails. :param force: Whether to force the rebuilding of the model schema, defaults to `False`. :param raise_errors: Whether to raise errors, defaults to `True`. :param _parent_namespace_depth: The depth level of the parent namespace, defaults to 2. :param _types_namespace: The types namespace, defaults to `None`. :Returns: * **Returns `None` if the schema is already "complete" and rebuilding was not required.** * **If rebuilding _was_ required, returns `True` if rebuilding was successful, otherwise `False`.** .. 
py:method:: model_validate(obj, *, strict = None, from_attributes = None, context = None) :classmethod: Validate a pydantic model instance. :param obj: The object to validate. :param strict: Whether to enforce types strictly. :param from_attributes: Whether to extract data from object attributes. :param context: Additional context to pass to the validator. :raises ValidationError: If the object could not be validated. :Returns: **The validated model instance.** .. py:method:: model_validate_json(json_data, *, strict = None, context = None) :classmethod: Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing Validate the given JSON data against the Pydantic model. :param json_data: The JSON data to validate. :param strict: Whether to enforce types strictly. :param context: Extra variables to pass to the validator. :Returns: **The validated Pydantic model.** :raises ValidationError: If `json_data` is not a JSON string or the object could not be validated. .. py:method:: model_validate_strings(obj, *, strict = None, context = None) :classmethod: Validate the given object with string data against the Pydantic model. :param obj: The object containing string data to validate. :param strict: Whether to enforce types strictly. :param context: Extra variables to pass to the validator. :Returns: **The validated Pydantic model.** .. py:method:: __get_pydantic_core_schema__(source, handler, /) :classmethod: Hook into generating the model's CoreSchema. :param source: The class we are generating a schema for. This will generally be the same as the `cls` argument if this is a classmethod. :param handler: A callable that calls into Pydantic's internal CoreSchema generation logic. :Returns: **A `pydantic-core` `CoreSchema`.** .. py:method:: __get_pydantic_json_schema__(core_schema, handler, /) :classmethod: Hook into generating the model's JSON schema. :param core_schema: A `pydantic-core` CoreSchema. 
You can ignore this argument and call the handler with a new CoreSchema, wrap this CoreSchema (`{'type': 'nullable', 'schema': current_schema}`), or just call the handler with the original schema. :param handler: Call into Pydantic's internal JSON schema generation. This will raise a `pydantic.errors.PydanticInvalidForJsonSchema` if JSON schema generation fails. Since this gets called by `BaseModel.model_json_schema` you can override the `schema_generator` argument to that function to change JSON schema generation globally for a type. :Returns: **A JSON schema, as a Python object.** .. py:method:: __pydantic_init_subclass__(**kwargs) :classmethod: This is intended to behave just like `__init_subclass__`, but is called by `ModelMetaclass` only after the class is actually fully initialized. In particular, attributes like `model_fields` will be present when this is called. This is necessary because `__init_subclass__` will always be called by `type.__new__`, and it would require a prohibitively large refactor to the `ModelMetaclass` to ensure that `type.__new__` was called in such a manner that the class would already be sufficiently initialized. This will receive the same `kwargs` that would be passed to the standard `__init_subclass__`, namely, any kwargs passed to the class definition that aren't used internally by pydantic. :param \*\*kwargs: Any keyword arguments passed to the class definition that aren't used internally by pydantic. .. py:method:: __class_getitem__(typevar_values) :classmethod: .. py:method:: __copy__() Returns a shallow copy of the model. .. py:method:: __deepcopy__(memo = None) Returns a deep copy of the model. .. py:method:: __getattr__(item) .. py:method:: _check_frozen(name, value) .. py:method:: __getstate__() .. py:method:: __setstate__(state) .. py:method:: __eq__(other) .. 
py:method:: __init_subclass__(**kwargs) :classmethod: This signature is included purely to help type-checkers check arguments to class declaration, which provides a way to conveniently set model_config key/value pairs. ```py from pydantic import BaseModel class MyModel(BaseModel, extra='allow'): ... ``` However, this may be deceiving, since the _actual_ calls to `__init_subclass__` will not receive any of the config arguments, and will only receive any keyword arguments passed during class initialization that are _not_ expected keys in ConfigDict. (This is due to the way `ModelMetaclass.__new__` works.) :param \*\*kwargs: Keyword arguments passed to the class definition, which set model_config .. note:: You may want to override `__pydantic_init_subclass__` instead, which behaves similarly but is called *after* the class is fully initialized. .. py:method:: __iter__() So `dict(model)` works. .. py:method:: __repr__() .. py:method:: __repr_args__() .. py:attribute:: __repr_name__ .. py:attribute:: __repr_str__ .. py:attribute:: __pretty__ .. py:attribute:: __rich_repr__ .. py:method:: __str__() .. py:property:: __fields__ :type: dict[str, pydantic.fields.FieldInfo] .. py:property:: __fields_set__ :type: set[str] .. py:method:: dict(*, include = None, exclude = None, by_alias = False, exclude_unset = False, exclude_defaults = False, exclude_none = False) .. py:method:: json(*, include = None, exclude = None, by_alias = False, exclude_unset = False, exclude_defaults = False, exclude_none = False, encoder = PydanticUndefined, models_as_dict = PydanticUndefined, **dumps_kwargs) .. py:method:: parse_obj(obj) :classmethod: .. py:method:: parse_raw(b, *, content_type = None, encoding = 'utf8', proto = None, allow_pickle = False) :classmethod: .. py:method:: parse_file(path, *, content_type = None, encoding = 'utf8', proto = None, allow_pickle = False) :classmethod: .. py:method:: from_orm(obj) :classmethod: .. 
py:method:: construct(_fields_set = None, **values) :classmethod: .. py:method:: copy(*, include = None, exclude = None, update = None, deep = False) Returns a copy of the model. !!! warning "Deprecated" This method is now deprecated; use `model_copy` instead. If you need `include` or `exclude`, use: ```py data = self.model_dump(include=include, exclude=exclude, round_trip=True) data = {**data, **(update or {})} copied = self.model_validate(data) ``` :param include: Optional set or mapping specifying which fields to include in the copied model. :param exclude: Optional set or mapping specifying which fields to exclude in the copied model. :param update: Optional dictionary of field-value pairs to override field values in the copied model. :param deep: If True, the values of fields that are Pydantic models will be deep-copied. :Returns: **A copy of the model with included, excluded and updated fields as specified.** .. py:method:: schema(by_alias = True, ref_template = DEFAULT_REF_TEMPLATE) :classmethod: .. py:method:: schema_json(*, by_alias = True, ref_template = DEFAULT_REF_TEMPLATE, **dumps_kwargs) :classmethod: .. py:method:: validate(value) :classmethod: .. py:method:: update_forward_refs(**localns) :classmethod: .. py:method:: _iter(*args, **kwargs) .. py:method:: _copy_and_set_values(*args, **kwargs) .. py:method:: _get_value(*args, **kwargs) :classmethod: .. py:method:: _calculate_keys(*args, **kwargs) .. py:class:: RelExtrBaseConfig(pretrained_model_name_or_path, **kwargs) Bases: :py:obj:`transformers.PretrainedConfig` Base class for the RelCAT models .. py:attribute:: name :value: 'base-config-relcat' .. py:method:: __init__(pretrained_model_name_or_path, **kwargs) .. py:attribute:: model_type :value: 'relcat' .. py:attribute:: pretrained_model_name_or_path .. py:attribute:: hf_model_config :type: transformers.PretrainedConfig .. py:method:: to_dict() Serializes this instance to a Python dictionary. 
:Returns: **`Dict[str, Any]`** -- Dictionary of all the attributes that make up this configuration instance. .. py:method:: save(save_path) .. py:method:: load(pretrained_model_name_or_path, relcat_config, **kwargs) :classmethod: .. py:attribute:: base_config_key :type: str :value: '' .. py:attribute:: sub_configs :type: Dict[str, PretrainedConfig] .. py:attribute:: is_composition :type: bool :value: False .. py:attribute:: attribute_map :type: Dict[str, str] .. py:attribute:: base_model_tp_plan :type: Optional[Dict[str, Any]] :value: None .. py:attribute:: _auto_class :type: Optional[str] :value: None .. py:method:: __setattr__(key, value) .. py:method:: __getattribute__(key) .. py:attribute:: return_dict .. py:attribute:: output_hidden_states .. py:attribute:: output_attentions .. py:attribute:: torchscript .. py:attribute:: torch_dtype .. py:attribute:: use_bfloat16 .. py:attribute:: tf_legacy_loss .. py:attribute:: pruned_heads .. py:attribute:: tie_word_embeddings .. py:attribute:: chunk_size_feed_forward .. py:attribute:: is_encoder_decoder .. py:attribute:: is_decoder .. py:attribute:: cross_attention_hidden_size .. py:attribute:: add_cross_attention .. py:attribute:: tie_encoder_decoder .. py:attribute:: architectures .. py:attribute:: finetuning_task .. py:attribute:: id2label .. py:attribute:: label2id .. py:attribute:: tokenizer_class .. py:attribute:: prefix .. py:attribute:: bos_token_id .. py:attribute:: pad_token_id .. py:attribute:: eos_token_id .. py:attribute:: sep_token_id .. py:attribute:: decoder_start_token_id .. py:attribute:: task_specific_params .. py:attribute:: problem_type .. py:attribute:: _name_or_path :value: '' .. py:attribute:: _commit_hash .. py:attribute:: _attn_implementation_internal .. py:attribute:: _attn_implementation_autoset :value: False .. py:attribute:: transformers_version .. py:property:: name_or_path :type: str .. py:property:: use_return_dict :type: bool Whether or not to return [`~utils.ModelOutput`] instead of tuples. 
:type: `bool` .. py:property:: num_labels :type: int The number of labels for classification models. :type: `int` .. py:property:: _attn_implementation .. py:method:: save_pretrained(save_directory, push_to_hub = False, **kwargs) Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the [`~PretrainedConfig.from_pretrained`] class method. :param save_directory: Directory where the configuration JSON file will be saved (will be created if it does not exist). :type save_directory: `str` or `os.PathLike` :param push_to_hub: Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with `repo_id` (will default to the name of `save_directory` in your namespace). :type push_to_hub: `bool`, *optional*, defaults to `False` :param kwargs: Additional keyword arguments passed along to the [`~utils.PushToHubMixin.push_to_hub`] method. :type kwargs: `Dict[str, Any]`, *optional* .. py:method:: _set_token_in_kwargs(kwargs, token=None) :staticmethod: Temporary method to deal with `token` and `use_auth_token`. This method avoids applying the same changes in every model config class that overrides `from_pretrained`. `use_auth_token` needs to be cleaned up in a follow-up PR. .. py:method:: from_pretrained(pretrained_model_name_or_path, cache_dir = None, force_download = False, local_files_only = False, token = None, revision = 'main', **kwargs) :classmethod: Instantiate a [`PretrainedConfig`] (or a derived class) from a pretrained model configuration. :param pretrained_model_name_or_path: This can be either: - a string, the *model id* of a pretrained model configuration hosted inside a model repo on huggingface.co. - a path to a *directory* containing a configuration file saved using the [`~PretrainedConfig.save_pretrained`] method, e.g., `./my_model_directory/`. - a path or url to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`. 
:type pretrained_model_name_or_path: `str` or `os.PathLike` :param cache_dir: Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. :type cache_dir: `str` or `os.PathLike`, *optional* :param force_download: Whether or not to force to (re-)download the configuration files and override the cached versions if they exist. :type force_download: `bool`, *optional*, defaults to `False` :param resume_download: Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers. :param proxies: A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}.` The proxies are used on each request. :type proxies: `Dict[str, str]`, *optional* :param token: The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `huggingface-cli login` (stored in `~/.huggingface`). :type token: `str` or `bool`, *optional* :param revision: The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git. To test a pull request you made on the Hub, you can pass `revision="refs/pr/"`. :type revision: `str`, *optional*, defaults to `"main"` :param return_unused_kwargs: If `False`, then this function returns just the final configuration object. If `True`, then this function returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of `kwargs` which has not been used to update `config` and is otherwise ignored. 
:type return_unused_kwargs: `bool`, *optional*, defaults to `False` :param subfolder: In case the relevant files are located inside a subfolder of the model repo on huggingface.co, you can specify the folder name here. :type subfolder: `str`, *optional*, defaults to `""` :param kwargs: The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled by the `return_unused_kwargs` keyword parameter. :type kwargs: `Dict[str, Any]`, *optional* :Returns: **[`PretrainedConfig`]** -- The configuration object instantiated from this pretrained model. Examples: ```python # We can't instantiate directly the base class *PretrainedConfig* so let's show the examples on a # derived class: BertConfig config = BertConfig.from_pretrained( "google-bert/bert-base-uncased" ) # Download configuration from huggingface.co and cache. config = BertConfig.from_pretrained( "./test/saved_model/" ) # E.g. config (or model) was saved using *save_pretrained('./test/saved_model/')* config = BertConfig.from_pretrained("./test/saved_model/my_configuration.json") config = BertConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False) assert config.output_attentions == True config, unused_kwargs = BertConfig.from_pretrained( "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True ) assert config.output_attentions == True assert unused_kwargs == {"foo": False} ``` .. py:method:: get_config_dict(pretrained_model_name_or_path, **kwargs) :classmethod: From a `pretrained_model_name_or_path`, resolve to a dictionary of parameters, to be used for instantiating a [`PretrainedConfig`] using `from_dict`. :param pretrained_model_name_or_path: The identifier of the pre-trained checkpoint from which we want the dictionary of parameters. 
:type pretrained_model_name_or_path: `str` or `os.PathLike` :Returns: **`Tuple[Dict, Dict]`** -- The dictionary(ies) that will be used to instantiate the configuration object. .. py:method:: _get_config_dict(pretrained_model_name_or_path, **kwargs) :classmethod: .. py:method:: from_dict(config_dict, **kwargs) :classmethod: Instantiates a [`PretrainedConfig`] from a Python dictionary of parameters. :param config_dict: Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pretrained checkpoint by leveraging the [`~PretrainedConfig.get_config_dict`] method. :type config_dict: `Dict[str, Any]` :param kwargs: Additional parameters from which to initialize the configuration object. :type kwargs: `Dict[str, Any]` :Returns: **[`PretrainedConfig`]** -- The configuration object instantiated from those parameters. .. py:method:: from_json_file(json_file) :classmethod: Instantiates a [`PretrainedConfig`] from the path to a JSON file of parameters. :param json_file: Path to the JSON file containing the parameters. :type json_file: `str` or `os.PathLike` :Returns: **[`PretrainedConfig`]** -- The configuration object instantiated from that JSON file. .. py:method:: _dict_from_json_file(json_file) :classmethod: .. py:method:: __eq__(other) .. py:method:: __repr__() .. py:method:: __iter__() .. py:method:: to_diff_dict() Removes all attributes from config which correspond to the default config attributes for better readability and serializes to a Python dictionary. :Returns: **`Dict[str, Any]`** -- Dictionary of all the attributes that make up this configuration instance. .. py:method:: to_json_string(use_diff = True) Serializes this instance to a JSON string. :param use_diff: If set to `True`, only the difference between the config instance and the default `PretrainedConfig()` is serialized to JSON string. 
:type use_diff: `bool`, *optional*, defaults to `True` :Returns: **`str`** -- String containing all the attributes that make up this configuration instance in JSON format. .. py:method:: to_json_file(json_file_path, use_diff = True) Save this instance to a JSON file. :param json_file_path: Path to the JSON file in which this configuration instance's parameters will be saved. :type json_file_path: `str` or `os.PathLike` :param use_diff: If set to `True`, only the difference between the config instance and the default `PretrainedConfig()` is serialized to JSON file. :type use_diff: `bool`, *optional*, defaults to `True` .. py:method:: update(config_dict) Updates attributes of this class with attributes from `config_dict`. :param config_dict: Dictionary of attributes that should be updated for this class. :type config_dict: `Dict[str, Any]` .. py:method:: update_from_string(update_str) Updates attributes of this class with attributes from `update_str`. The expected format is ints, floats and strings as is, and for booleans use `true` or `false`. For example: "n_embd=10,resid_pdrop=0.2,scale_attn_weights=false,summary_type=cls_index" The keys to change have to already exist in the config object. :param update_str: String with attributes that should be updated for this class. :type update_str: `str` .. py:method:: dict_torch_dtype_to_str(d) Checks whether the passed dictionary and its nested dicts have a *torch_dtype* key and if it's not None, converts torch.dtype to a string of just the type. For example, `torch.float32` gets converted into the string *"float32"*, which can then be stored in the JSON format. .. py:method:: register_for_auto_class(auto_class='AutoConfig') :classmethod: Register this class with a given auto class. This should only be used for custom configurations as the ones in the library are already mapped with `AutoConfig`. This API is experimental and may have some slight breaking changes in the next releases. 
:param auto_class: The auto class to register this new configuration with. :type auto_class: `str` or `type`, *optional*, defaults to `"AutoConfig"` .. py:method:: _get_global_generation_defaults() :staticmethod: .. py:method:: _get_non_default_generation_parameters() Gets the non-default generation parameters on the PretrainedConfig instance .. py:method:: get_text_config(decoder=False) Returns the config that is meant to be used with text IO. On most models, it is the original config instance itself. On specific composite models, it is under a set of valid names. If `decoder` is set to `True`, then only search for decoder config names. .. py:method:: _create_repo(repo_id, private = None, token = None, repo_url = None, organization = None) Create the repo if needed, cleans up repo_id with deprecated kwargs `repo_url` and `organization`, retrieves the token. .. py:method:: _get_files_timestamps(working_dir) Returns the list of files with their last modification timestamp. .. py:method:: _upload_modified_files(working_dir, repo_id, files_timestamps, commit_message = None, token = None, create_pr = False, revision = None, commit_description = None) Uploads all modified files in `working_dir` to `repo_id`, based on `files_timestamps`. .. py:method:: push_to_hub(repo_id, use_temp_dir = None, commit_message = None, private = None, token = None, max_shard_size = '5GB', create_pr = False, safe_serialization = True, revision = None, commit_description = None, tags = None, **deprecated_kwargs) Upload the {object_files} to the 🤗 Model Hub. :param repo_id: The name of the repository you want to push your {object} to. It should contain your organization name when pushing to a given organization. :type repo_id: `str` :param use_temp_dir: Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to `True` if there is no directory named like `repo_id`, `False` otherwise. 
:type use_temp_dir: `bool`, *optional* :param commit_message: Message to commit while pushing. Will default to `"Upload {object}"`. :type commit_message: `str`, *optional* :param private: Whether to make the repo private. If `None` (default), the repo will be public unless the organization's default is private. This value is ignored if the repo already exists. :type private: `bool`, *optional* :param token: The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `huggingface-cli login` (stored in `~/.huggingface`). Will default to `True` if `repo_url` is not specified. :type token: `bool` or `str`, *optional* :param max_shard_size: Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like `"5MB"`). We default it to `"5GB"` so that users can easily load models on free-tier Google Colab instances without any CPU OOM issues. :type max_shard_size: `int` or `str`, *optional*, defaults to `"5GB"` :param create_pr: Whether or not to create a PR with the uploaded files or directly commit. :type create_pr: `bool`, *optional*, defaults to `False` :param safe_serialization: Whether or not to convert the model weights in safetensors format for safer serialization. :type safe_serialization: `bool`, *optional*, defaults to `True` :param revision: Branch to push the uploaded files to. :type revision: `str`, *optional* :param commit_description: The description of the commit that will be created. :type commit_description: `str`, *optional* :param tags: List of tags to push on the Hub. :type tags: `List[str]`, *optional* Examples: ```python from transformers import {object_class} {object} = {object_class}.from_pretrained("google-bert/bert-base-cased") # Push the {object} to your namespace with the name "my-finetuned-bert".
{object}.push_to_hub("my-finetuned-bert") # Push the {object} to an organization with the name "my-finetuned-bert". {object}.push_to_hub("huggingface/my-finetuned-bert") ``` .. py:data:: logger .. py:class:: RelExtrModernBertConfig(pretrained_model_name_or_path, **kwargs) Bases: :py:obj:`medcat.components.addons.relation_extraction.config.RelExtrBaseConfig` Class for ModernBertConfig .. py:attribute:: name :value: 'modern-bert-config' .. py:attribute:: pretrained_model_name_or_path :value: 'answerdotai/ModernBERT-base' .. py:attribute:: hf_model_config :type: transformers.models.modernbert.ModernBertConfig .. py:method:: load(pretrained_model_name_or_path, relcat_config, **kwargs) :classmethod: .. py:method:: __init__(pretrained_model_name_or_path, **kwargs) .. py:attribute:: model_type :value: 'relcat' .. py:method:: to_dict() Serializes this instance to a Python dictionary. :Returns: **`Dict[str, Any]`** -- Dictionary of all the attributes that make up this configuration instance. .. py:method:: save(save_path) .. py:attribute:: base_config_key :type: str :value: '' .. py:attribute:: sub_configs :type: Dict[str, PretrainedConfig] .. py:attribute:: is_composition :type: bool :value: False .. py:attribute:: attribute_map :type: Dict[str, str] .. py:attribute:: base_model_tp_plan :type: Optional[Dict[str, Any]] :value: None .. py:attribute:: _auto_class :type: Optional[str] :value: None .. py:method:: __setattr__(key, value) .. py:method:: __getattribute__(key) .. py:attribute:: return_dict .. py:attribute:: output_hidden_states .. py:attribute:: output_attentions .. py:attribute:: torchscript .. py:attribute:: torch_dtype .. py:attribute:: use_bfloat16 .. py:attribute:: tf_legacy_loss .. py:attribute:: pruned_heads .. py:attribute:: tie_word_embeddings .. py:attribute:: chunk_size_feed_forward .. py:attribute:: is_encoder_decoder .. py:attribute:: is_decoder .. py:attribute:: cross_attention_hidden_size .. py:attribute:: add_cross_attention .. 
py:attribute:: tie_encoder_decoder .. py:attribute:: architectures .. py:attribute:: finetuning_task .. py:attribute:: id2label .. py:attribute:: label2id .. py:attribute:: tokenizer_class .. py:attribute:: prefix .. py:attribute:: bos_token_id .. py:attribute:: pad_token_id .. py:attribute:: eos_token_id .. py:attribute:: sep_token_id .. py:attribute:: decoder_start_token_id .. py:attribute:: task_specific_params .. py:attribute:: problem_type .. py:attribute:: _name_or_path :value: '' .. py:attribute:: _commit_hash .. py:attribute:: _attn_implementation_internal .. py:attribute:: _attn_implementation_autoset :value: False .. py:attribute:: transformers_version .. py:property:: name_or_path :type: str .. py:property:: use_return_dict :type: bool Whether or not to return [`~utils.ModelOutput`] instead of tuples. :type: `bool` .. py:property:: num_labels :type: int The number of labels for classification models. :type: `int` .. py:property:: _attn_implementation .. py:method:: save_pretrained(save_directory, push_to_hub = False, **kwargs) Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the [`~PretrainedConfig.from_pretrained`] class method. :param save_directory: Directory where the configuration JSON file will be saved (will be created if it does not exist). :type save_directory: `str` or `os.PathLike` :param push_to_hub: Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with `repo_id` (will default to the name of `save_directory` in your namespace). :type push_to_hub: `bool`, *optional*, defaults to `False` :param kwargs: Additional keyword arguments passed along to the [`~utils.PushToHubMixin.push_to_hub`] method. :type kwargs: `Dict[str, Any]`, *optional* .. py:method:: _set_token_in_kwargs(kwargs, token=None) :staticmethod: Temporary method to deal with `token` and `use_auth_token`.
This method avoids applying the same changes in all model config classes that overwrite `from_pretrained`. `use_auth_token` needs to be cleaned up in a follow-up PR. .. py:method:: from_pretrained(pretrained_model_name_or_path, cache_dir = None, force_download = False, local_files_only = False, token = None, revision = 'main', **kwargs) :classmethod: Instantiate a [`PretrainedConfig`] (or a derived class) from a pretrained model configuration. :param pretrained_model_name_or_path: This can be either: - a string, the *model id* of a pretrained model configuration hosted inside a model repo on huggingface.co. - a path to a *directory* containing a configuration file saved using the [`~PretrainedConfig.save_pretrained`] method, e.g., `./my_model_directory/`. - a path or url to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`. :type pretrained_model_name_or_path: `str` or `os.PathLike` :param cache_dir: Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. :type cache_dir: `str` or `os.PathLike`, *optional* :param force_download: Whether or not to force to (re-)download the configuration files and override the cached versions if they exist. :type force_download: `bool`, *optional*, defaults to `False` :param resume_download: Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers. :param proxies: A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. :type proxies: `Dict[str, str]`, *optional* :param token: The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `huggingface-cli login` (stored in `~/.huggingface`). :type token: `str` or `bool`, *optional* :param revision: The specific model version to use.
It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git. To test a pull request you made on the Hub, you can pass `revision="refs/pr/"`. :type revision: `str`, *optional*, defaults to `"main"` :param return_unused_kwargs: If `False`, then this function returns just the final configuration object. If `True`, then this function returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of `kwargs` which has not been used to update `config` and is otherwise ignored. :type return_unused_kwargs: `bool`, *optional*, defaults to `False` :param subfolder: In case the relevant files are located inside a subfolder of the model repo on huggingface.co, you can specify the folder name here. :type subfolder: `str`, *optional*, defaults to `""` :param kwargs: The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled by the `return_unused_kwargs` keyword parameter. :type kwargs: `Dict[str, Any]`, *optional* :Returns: **[`PretrainedConfig`]** -- The configuration object instantiated from this pretrained model. Examples: ```python # We can't instantiate directly the base class *PretrainedConfig* so let's show the examples on a # derived class: BertConfig config = BertConfig.from_pretrained( "google-bert/bert-base-uncased" ) # Download configuration from huggingface.co and cache. config = BertConfig.from_pretrained( "./test/saved_model/" ) # E.g.
config (or model) was saved using *save_pretrained('./test/saved_model/')* config = BertConfig.from_pretrained("./test/saved_model/my_configuration.json") config = BertConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False) assert config.output_attentions == True config, unused_kwargs = BertConfig.from_pretrained( "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True ) assert config.output_attentions == True assert unused_kwargs == {"foo": False} ``` .. py:method:: get_config_dict(pretrained_model_name_or_path, **kwargs) :classmethod: From a `pretrained_model_name_or_path`, resolve to a dictionary of parameters, to be used for instantiating a [`PretrainedConfig`] using `from_dict`. :param pretrained_model_name_or_path: The identifier of the pre-trained checkpoint from which we want the dictionary of parameters. :type pretrained_model_name_or_path: `str` or `os.PathLike` :Returns: **`Tuple[Dict, Dict]`** -- The dictionary(ies) that will be used to instantiate the configuration object. .. py:method:: _get_config_dict(pretrained_model_name_or_path, **kwargs) :classmethod: .. py:method:: from_dict(config_dict, **kwargs) :classmethod: Instantiates a [`PretrainedConfig`] from a Python dictionary of parameters. :param config_dict: Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pretrained checkpoint by leveraging the [`~PretrainedConfig.get_config_dict`] method. :type config_dict: `Dict[str, Any]` :param kwargs: Additional parameters from which to initialize the configuration object. :type kwargs: `Dict[str, Any]` :Returns: **[`PretrainedConfig`]** -- The configuration object instantiated from those parameters. .. py:method:: from_json_file(json_file) :classmethod: Instantiates a [`PretrainedConfig`] from the path to a JSON file of parameters. :param json_file: Path to the JSON file containing the parameters. 
:type json_file: `str` or `os.PathLike` :Returns: **[`PretrainedConfig`]** -- The configuration object instantiated from that JSON file. .. py:method:: _dict_from_json_file(json_file) :classmethod: .. py:method:: __eq__(other) .. py:method:: __repr__() .. py:method:: __iter__() .. py:method:: to_diff_dict() Removes all attributes from config which correspond to the default config attributes for better readability and serializes to a Python dictionary. :Returns: **`Dict[str, Any]`** -- Dictionary of all the attributes that make up this configuration instance. .. py:method:: to_json_string(use_diff = True) Serializes this instance to a JSON string. :param use_diff: If set to `True`, only the difference between the config instance and the default `PretrainedConfig()` is serialized to a JSON string. :type use_diff: `bool`, *optional*, defaults to `True` :Returns: **`str`** -- String containing all the attributes that make up this configuration instance in JSON format. .. py:method:: to_json_file(json_file_path, use_diff = True) Save this instance to a JSON file. :param json_file_path: Path to the JSON file in which this configuration instance's parameters will be saved. :type json_file_path: `str` or `os.PathLike` :param use_diff: If set to `True`, only the difference between the config instance and the default `PretrainedConfig()` is serialized to a JSON file. :type use_diff: `bool`, *optional*, defaults to `True` .. py:method:: update(config_dict) Updates attributes of this class with attributes from `config_dict`. :param config_dict: Dictionary of attributes that should be updated for this class. :type config_dict: `Dict[str, Any]` .. py:method:: update_from_string(update_str) Updates attributes of this class with attributes from `update_str`. The expected format is ints, floats and strings as is, and for booleans use `true` or `false`.
For example: "n_embd=10,resid_pdrop=0.2,scale_attn_weights=false,summary_type=cls_index" The keys to change have to already exist in the config object. :param update_str: String with attributes that should be updated for this class. :type update_str: `str` .. py:method:: dict_torch_dtype_to_str(d) Checks whether the passed dictionary and its nested dicts have a *torch_dtype* key and if it's not None, converts torch.dtype to a string of just the type. For example, `torch.float32` gets converted into the *"float32"* string, which can then be stored in the JSON format. .. py:method:: register_for_auto_class(auto_class='AutoConfig') :classmethod: Register this class with a given auto class. This should only be used for custom configurations as the ones in the library are already mapped with `AutoConfig`. This API is experimental and may have some slight breaking changes in the next releases. :param auto_class: The auto class to register this new configuration with. :type auto_class: `str` or `type`, *optional*, defaults to `"AutoConfig"` .. py:method:: _get_global_generation_defaults() :staticmethod: .. py:method:: _get_non_default_generation_parameters() Gets the non-default generation parameters on the PretrainedConfig instance .. py:method:: get_text_config(decoder=False) Returns the config that is meant to be used with text IO. On most models, it is the original config instance itself. On specific composite models, it is under a set of valid names. If `decoder` is set to `True`, then only search for decoder config names. .. py:method:: _create_repo(repo_id, private = None, token = None, repo_url = None, organization = None) Create the repo if needed, cleans up repo_id with deprecated kwargs `repo_url` and `organization`, retrieves the token. .. py:method:: _get_files_timestamps(working_dir) Returns the list of files with their last modification timestamp. ..
py:method:: _upload_modified_files(working_dir, repo_id, files_timestamps, commit_message = None, token = None, create_pr = False, revision = None, commit_description = None) Uploads all modified files in `working_dir` to `repo_id`, based on `files_timestamps`. .. py:method:: push_to_hub(repo_id, use_temp_dir = None, commit_message = None, private = None, token = None, max_shard_size = '5GB', create_pr = False, safe_serialization = True, revision = None, commit_description = None, tags = None, **deprecated_kwargs) Upload the {object_files} to the 🤗 Model Hub. :param repo_id: The name of the repository you want to push your {object} to. It should contain your organization name when pushing to a given organization. :type repo_id: `str` :param use_temp_dir: Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to `True` if there is no directory named like `repo_id`, `False` otherwise. :type use_temp_dir: `bool`, *optional* :param commit_message: Message to commit while pushing. Will default to `"Upload {object}"`. :type commit_message: `str`, *optional* :param private: Whether to make the repo private. If `None` (default), the repo will be public unless the organization's default is private. This value is ignored if the repo already exists. :type private: `bool`, *optional* :param token: The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `huggingface-cli login` (stored in `~/.huggingface`). Will default to `True` if `repo_url` is not specified. :type token: `bool` or `str`, *optional* :param max_shard_size: Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like `"5MB"`).
We default it to `"5GB"` so that users can easily load models on free-tier Google Colab instances without any CPU OOM issues. :type max_shard_size: `int` or `str`, *optional*, defaults to `"5GB"` :param create_pr: Whether or not to create a PR with the uploaded files or directly commit. :type create_pr: `bool`, *optional*, defaults to `False` :param safe_serialization: Whether or not to convert the model weights in safetensors format for safer serialization. :type safe_serialization: `bool`, *optional*, defaults to `True` :param revision: Branch to push the uploaded files to. :type revision: `str`, *optional* :param commit_description: The description of the commit that will be created :type commit_description: `str`, *optional* :param tags: List of tags to push on the Hub. :type tags: `List[str]`, *optional* Examples: ```python from transformers import {object_class} {object} = {object_class}.from_pretrained("google-bert/bert-base-cased") # Push the {object} to your namespace with the name "my-finetuned-bert". {object}.push_to_hub("my-finetuned-bert") # Push the {object} to an organization with the name "my-finetuned-bert". {object}.push_to_hub("huggingface/my-finetuned-bert") ```
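
The `update_from_string` entry above describes a `key=value,key=value` format in which ints, floats and strings are written as is, booleans are written as `true` or `false`, and every key must already exist. The following is a minimal, standard-library-only sketch of that parsing rule applied to a plain dictionary. It is an illustration, not the library's implementation: the helper name `update_from_string_sketch`, the choice to infer the target type from the existing value, and the default values in `defaults` are all assumptions made for this example.

```python
from typing import Any, Dict


def update_from_string_sketch(config: Dict[str, Any], update_str: str) -> Dict[str, Any]:
    """Hypothetical helper: apply "key=value,key=value" overrides to a dict.

    The type of each new value is inferred from the value already stored
    under that key (an assumption of this sketch).
    """
    updated = dict(config)  # work on a copy so the input stays untouched
    for pair in update_str.split(","):
        key, value = pair.split("=", 1)
        if key not in updated:
            # Mirrors the documented rule that keys must already exist.
            raise ValueError(f"key {key!r} does not exist in the config")
        old = updated[key]
        # bool must be checked before int because bool is a subclass of int.
        if isinstance(old, bool):
            if value.lower() not in ("true", "false"):
                raise ValueError(f"cannot parse {value!r} as a boolean for {key!r}")
            updated[key] = value.lower() == "true"
        elif isinstance(old, int):
            updated[key] = int(value)
        elif isinstance(old, float):
            updated[key] = float(value)
        else:
            updated[key] = value  # strings are taken as is
    return updated


# Illustrative defaults only; these are not values taken from any real config.
defaults = {
    "n_embd": 768,
    "resid_pdrop": 0.1,
    "scale_attn_weights": True,
    "summary_type": "first",
}
result = update_from_string_sketch(
    defaults,
    "n_embd=10,resid_pdrop=0.2,scale_attn_weights=false,summary_type=cls_index",
)
```

With a real configuration object the documented call is simply `config.update_from_string(update_str)`; the sketch only makes the string format and the existing-key requirement concrete.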