medcat.utils.hasher

Classes

Hasher

A consistent hasher.

Functions

dumps(obj[, length])

Dump the content of an object to bytes.

Module Contents

medcat.utils.hasher.dumps(obj, length=False)

Dump the content of an object to bytes.

This function uses dill to dump the contents of an object into a BytesIO object and then either returns those bytes or (if length == True) reruns the process on the length of the byte array.

Parameters:
  • obj (Any) – The object to dump.

  • length (bool, optional) – Whether to dump only the length of the byte array instead of its contents. Defaults to False.

Returns:

bytes – The resulting byte array.

Return type:

bytes
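The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the library's code: it assumes the standard-library pickle as a stand-in for dill, and the name dumps_sketch is hypothetical.

```python
import pickle
from io import BytesIO
from typing import Any


def dumps_sketch(obj: Any, length: bool = False) -> bytes:
    """Hypothetical stand-in for dumps; pickle replaces dill here."""
    # Serialize the object into an in-memory buffer.
    with BytesIO() as buffer:
        pickle.dump(obj, buffer)
        data = buffer.getvalue()
    if length:
        # Rerun the same process on the length of the byte array.
        return dumps_sketch(len(data), length=False)
    return data


full = dumps_sketch([1, 2, 3])
only_length = dumps_sketch([1, 2, 3], length=True)
# The length variant decodes to the size of the full dump.
print(pickle.loads(only_length) == len(full))
```

With length=True the result no longer carries the object's contents, only the serialized size.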

class medcat.utils.hasher.Hasher(dumper=dumps)

A consistent hasher.

This class hashes the same object(s) to the same value every time. This is in contrast to Python's built-in hashing, which does not guarantee identical results across multiple runs.

Parameters:

dumper (Callable[[Any, bool], bytes], optional) – The dumper to be used. Defaults to the dumps function.

__init__(dumper=dumps)
Parameters:

dumper (Callable[[Any, bool], bytes])

m
_dumper
update(obj, length=False)

Update the hasher with the object in question.

If length=True is passed, only the length of the byte array corresponding to the data is considered. Otherwise, the entire byte array is used.

Parameters:
  • obj (Any) – The object to be added / hashed.

  • length (bool, optional) – Whether to hash only the length of the byte array instead of its contents. Defaults to False.

Return type:

None
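One consequence of the length option is that two different objects whose dumps have the same size become indistinguishable. The sketch below illustrates this under stated assumptions: pickle stands in for dill, hashlib.sha256 stands in for whatever hash the class uses internally, and the helper digest is hypothetical.

```python
import hashlib
import pickle


def digest(data: bytes) -> str:
    """Hypothetical helper: hex digest of a byte array (SHA-256 as a stand-in)."""
    return hashlib.sha256(data).hexdigest()


# Two different objects whose serialized forms have the same size.
a = pickle.dumps([1, 2, 3])
b = pickle.dumps([3, 2, 1])

# Hashing the full byte arrays (the length=False case).
full_a, full_b = digest(a), digest(b)

# Hashing only the serialized lengths (the length=True case).
len_a = digest(pickle.dumps(len(a)))
len_b = digest(pickle.dumps(len(b)))

print(full_a == full_b)  # False: the contents differ
print(len_a == len_b)    # True: only the sizes were hashed
```

Length-only updates are cheaper to compare but detect only changes that alter the serialized size.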

update_bytes(b)

Update the hasher with a byte array.

Parameters:

b (bytes) – The byte array to update with.

Return type:

None

hexdigest()

Get the hex for the current hash state.

Returns:

str – The hex representation of the hashed objects.

Return type:

str
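The interface documented above can be mirrored in a small self-contained sketch. It is not the medcat implementation: HasherSketch and pickle_dumper are hypothetical names, pickle replaces dill, and the attribute m is assumed to hold a hashlib-style object (SHA-256 here), which this page does not confirm.

```python
import hashlib
import pickle
from typing import Any, Callable


def pickle_dumper(obj: Any, length: bool = False) -> bytes:
    """Hypothetical dumper with the (obj, length) -> bytes shape."""
    data = pickle.dumps(obj)
    return pickle.dumps(len(data)) if length else data


class HasherSketch:
    """Illustrative stand-in exposing update, update_bytes and hexdigest."""

    def __init__(self, dumper: Callable[[Any, bool], bytes] = pickle_dumper):
        self.m = hashlib.sha256()  # assumption: any hashlib-style object
        self._dumper = dumper

    def update(self, obj: Any, length: bool = False) -> None:
        # Serialize the object (or its length) and feed it to the hash.
        self.m.update(self._dumper(obj, length))

    def update_bytes(self, b: bytes) -> None:
        # Feed raw bytes directly, bypassing the dumper.
        self.m.update(b)

    def hexdigest(self) -> str:
        return self.m.hexdigest()


def hash_config(config: dict, tag: bytes) -> str:
    hasher = HasherSketch()
    hasher.update(config)
    hasher.update_bytes(tag)
    return hasher.hexdigest()


first = hash_config({"name": "model", "version": 1}, b"v1")
second = hash_config({"name": "model", "version": 1}, b"v1")
print(first == second)  # True: same inputs, same digest
```

The stand-in is only as consistent as its dumper: simple lists and dicts built in the same order serialize identically, while containers with unstable iteration order (such as sets of strings) may not.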