medcat.components.addons.relation_extraction.pad_seq ==================================================== .. py:module:: medcat.components.addons.relation_extraction.pad_seq Classes ------- .. autoapisummary:: medcat.components.addons.relation_extraction.pad_seq.Pad_Sequence Module Contents --------------- .. py:class:: Pad_Sequence(seq_pad_value, label_pad_value = -1) .. py:method:: __init__(seq_pad_value, label_pad_value = -1) Used in rel_cat.py in RelCAT to create DataLoaders for train/test datasets. collate_fn for dataloader to collate sequences of different input_ids, ent1/ent2, and label lengths into a fixed length batch. This is applied per batch and not on the whole DataLoader data, padded x sequence, y sequence, x lengths and y lengths of batch. :param seq_pad_value: pad value for input_ids. :type seq_pad_value: int :param label_pad_value: pad value for labels. Defaults to -1. :type label_pad_value: int .. py:attribute:: seq_pad_value :type: int .. py:attribute:: label_pad_value :type: int :value: -1 .. py:method:: __call__(batch) Pads a batch of input_ids. :param batch: gets the batch of Tensors from RelData.dataset (check __getitem__() method for data returned) and pads the token sequence + labels as needed See https://pytorch.org/docs/stable/_modules/torch/nn/utils/rnn.html#pad_sequence for extra info. :type batch: list[torch.Tensor] :Returns: **tuple[Tensor, Tensor, Tensor, LongTensor, LongTensor]** -- padded data padded input ids, ent1&ent2 start token pos, padded labels, padded input_id_lengths, padded labels length