Class HashingAlgorithm

  • All Implemented Interfaces:
    Serializable
    Direct Known Subclasses:
    AverageHash, DifferenceHash, HogHash, PerceptiveHash, RotAverageHash, RotPHash, WaveletHash

    public abstract class HashingAlgorithm
    extends Object
    implements Serializable
    Base class for hashing algorithms that return perceptual hashes for supplied images, reducing the number of bits needed to represent each image.

    In contrast to cryptographic hashes, hashes computed by these classes are entirely predictable. Similarity metrics applied to these hashes shall produce a higher score for closely related images.

    If implementing classes impose a lower bound on the dimensions of hashable images, the method getKeyResolution() has to be overridden.

    Unless otherwise noted, hashing algorithms are thread safe.

    Since:
    1.0.0
    Author:
    Kilian
    See Also:
    Serialized Form
    • Field Detail

      • preProcessing

        protected List<Filter> preProcessing
      • bitResolution

        protected final int bitResolution
        The target bit resolution supplied during algorithm creation. This number represents the number of bits the final hash SHOULD have, but does not necessarily reflect its actual length.

        Therefore, it is not advised to use this value during computation of the hash unless you made sure that the value actually reflects

      • keyResolution

        protected int keyResolution
        The actual bit resolution of produced hashes
      • opaqueReplacementColor

        protected Color opaqueReplacementColor
        Color used in replacement of opaque pixels
      • opaqueReplacementThreshold

        protected int opaqueReplacementThreshold
        Maximum alpha value a pixel must have in order to be replaced
      • immutableState

        protected boolean immutableState
        After a hash was created or the id was calculated the object may not be altered anymore.
    • Constructor Detail

      • HashingAlgorithm

        public HashingAlgorithm(int bitResolution)
        Promises a key with approximately the requested bit resolution. Due to geometric requirements the key might be marginally larger or smaller than specified. Hashing algorithms shall try to provide at least the number of bits specified.
        Parameters:
        bitResolution - The bit count of the final hash
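The "geometric requirements" mentioned above can be illustrated with a small sketch. It assumes a hypothetical algorithm that samples the image on a square grid and emits one bit per cell. The helper name squareKeyResolution and the rounding rule are illustrative assumptions, not the rule used by any concrete subclass.

```java
final class ResolutionSketch {

    /**
     * Hypothetical rounding rule: smallest square grid (side * side)
     * that holds at least the requested number of bits.
     */
    static int squareKeyResolution(int bitResolution) {
        if (bitResolution <= 0) {
            throw new IllegalArgumentException("bitResolution must be positive");
        }
        int side = (int) Math.ceil(Math.sqrt(bitResolution));
        return side * side;
    }

    public static void main(String[] args) {
        System.out.println(squareKeyResolution(32)); // 36: a 6 x 6 grid
        System.out.println(squareKeyResolution(64)); // 64: an 8 x 8 grid
    }
}
```

Under this assumption a request for 32 bits yields a 36 bit key, which is why the requested and the actual resolution have to be tracked separately.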
    • Method Detail

      • setOpaqueHandling

        public void setOpaqueHandling(Color replacementColor,
                                      int alphaThreshold)
        Define how the algorithm shall handle images containing an alpha channel. Hashing algorithms usually depend on the luminosity value, and transparent pixels are treated as black by default.

        Sometimes display software may choose to display missing pixels in a different color e.g. white. For the algorithm this would result in an entirely black image while for the user these images are perceptually different.

        Parameters:
        replacementColor - The color used to replace opaque values. A color should be chosen which is unlikely to be part of the target images. By default an orange color is selected. If a value of null is provided
        alphaThreshold - All pixels with an alpha value lower than or equal to this threshold [0-255] will be replaced.
        • 0 means only invisible (entirely transparent) pixels will be replaced
        Throws:
        IllegalStateException - if a hash was already created and the object is considered immutable.
        Since:
        3.0.1
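A minimal sketch of the replacement step described above, assuming it is done per pixel on the ARGB values before hashing. The class and method names are hypothetical, and Color.ORANGE merely stands in for the unspecified default orange.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

final class OpaqueHandlingSketch {

    /**
     * Returns a copy of src in which every pixel whose alpha value is
     * lower than or equal to alphaThreshold is replaced by a fully
     * visible pixel of the replacement color.
     */
    static BufferedImage replaceTransparent(BufferedImage src, Color replacement, int alphaThreshold) {
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_ARGB);
        // Force full alpha on the replacement so the result is always visible.
        int replacementArgb = 0xFF000000 | (replacement.getRGB() & 0x00FFFFFF);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int argb = src.getRGB(x, y);
                int alpha = argb >>> 24; // top 8 bits hold the alpha channel
                out.setRGB(x, y, alpha <= alphaThreshold ? replacementArgb : argb);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_ARGB);
        img.setRGB(0, 0, 0x00123456); // invisible pixel (alpha 0)
        img.setRGB(1, 0, 0xFF0000FF); // fully visible blue pixel
        BufferedImage out = replaceTransparent(img, Color.ORANGE, 0);
        System.out.println(Integer.toHexString(out.getRGB(0, 0)));
        System.out.println(Integer.toHexString(out.getRGB(1, 0)));
    }
}
```

With a threshold of 0 only the invisible pixel is replaced, and the blue pixel passes through unchanged.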
      • setOpaqueHandling

        public void setOpaqueHandling(int alphaThreshold)
        Define how the algorithm shall handle images containing an alpha channel. Hashing algorithms usually depend on the luminosity value, and transparent pixels are treated as black by default.

        Sometimes display software may choose to display missing pixels in a different color e.g. white. For the algorithm this would result in an entirely black image while for the user these images are perceptually different.

        Parameters:
        alphaThreshold - All pixels with an alpha value lower than or equal to this threshold [0-255] will be replaced.
        • 0 means only invisible (entirely transparent) pixels will be replaced
        Throws:
        IllegalStateException - if a hash was already created and the object is considered immutable.
        Since:
        3.0.1
      • getOpaqueReplacementColor

        public Color getOpaqueReplacementColor()
        Returns:
        color used in replacement of opaque pixels
        Since:
        3.0.1
      • getOpaqueReplacementThreshold

        public int getOpaqueReplacementThreshold()
        Returns:
        the maximum alpha value a pixel must have in order to be replaced by the opaque replacement color [0-255].
        Since:
        3.0.1
      • hash

        public Hash[] hash(BufferedImage... images)
        Calculate hashes for the given images. Invoking the hash function on the same image has to return the same hash value. A comparison of the hashes relates to the similarity of the images. The lower the value the more similar the images are. Equal images will produce a similarity of 0.
        Parameters:
        images - whose hash will be calculated
        Returns:
        The hashes representing the images
        Since:
        3.0.0
        See Also:
        Hash
      • hash

        public Hash[] hash(File... imageFiles)
                    throws IOException
        Calculate hashes for the given images. Invoking the hash function on the same image has to return the same hash value. A comparison of the hashes relates to the similarity of the images. The lower the value the more similar the images are. Equal images will produce a similarity of 0.
        Parameters:
        imageFiles - pointing to the images
        Returns:
        The hashes representing the images
        Throws:
        IOException - if an error occurs during loading the image
        Since:
        3.0.0
        See Also:
        Hash
      • hash

        public Hash hash(BufferedImage image)
        Calculate a hash for the given image. Invoking the hash function on the same image has to return the same hash value. A comparison of the hashes relates to the similarity of the images. The lower the value the more similar the images are. Equal images will produce a similarity of 0.
        Parameters:
        image - Image whose hash will be calculated
        Returns:
        The hash representing the image
        See Also:
        Hash
      • hash

        public Hash hash(File imageFile)
                  throws IOException
        Calculate a hash for the given image. Invoking the hash function on the same image has to return the same hash value. A comparison of the hashes relates to the similarity of the images. The lower the value the more similar the images are. Equal images will produce a similarity of 0.
        Parameters:
        imageFile - The file pointing to the image
        Returns:
        The hash representing the image
        Throws:
        IOException - if an error occurs during loading the image
        See Also:
        Hash
      • hash

        protected abstract BigInteger hash(BufferedImage image,
                                           HashBuilder hashBuilder)
        Calculate a hash for the given image. Invoking the hash function on the same image has to return the same hash value. A comparison of the hashes relates to the similarity of the images. The lower the value the more similar the images are. Equal images will produce a similarity of 0.

        This method is intended to be overridden by implementations. It takes a hashBuilder argument to allow concatenating multiple hashes as well as to compute the effective hash length reported by getKeyResolution(). Leading 0 bits are omitted in BigInteger objects. The plain Hamming distance can still be calculated by XORing the two values without issue, but the normalized distance requires the potential length of the key to be known.

        Parameters:
        image - Image whose hash will be calculated
        hashBuilder - a hash builder used to construct the hash
        Returns:
        the hash encoded as a big integer
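The distinction between the plain and the normalized Hamming distance can be sketched with the JDK alone. The class and method names below are illustrative and not part of this API; both hash values are assumed to be non-negative.

```java
import java.math.BigInteger;

final class HammingSketch {

    /** Number of differing bits. Leading zeros never differ, so no length is needed. */
    static int hammingDistance(BigInteger a, BigInteger b) {
        return a.xor(b).bitCount();
    }

    /** Distance scaled to [0, 1]. This needs the full key length, not the BigInteger's own length. */
    static double normalizedHammingDistance(BigInteger a, BigInteger b, int keyResolution) {
        return hammingDistance(a, b) / (double) keyResolution;
    }

    public static void main(String[] args) {
        BigInteger a = new BigInteger("00001111", 2);
        BigInteger b = new BigInteger("00000101", 2);
        System.out.println(hammingDistance(a, b));              // 2
        System.out.println(normalizedHammingDistance(a, b, 8)); // 0.25
    }
}
```

Dividing by the BigInteger's own bit length instead of the key resolution would give 2 / 4 rather than 2 / 8 here, which is why the key length has to be known separately.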
      • createPixelAccessor

        protected dev.brachtendorf.graphics.FastPixel createPixelAccessor(BufferedImage image,
                                                                          int width,
                                                                          int height)
      • algorithmId

        public final int algorithmId()
        A unique id identifying the settings and algorithms used to generate the output result. The id shall stay consistent throughout restarts of the jvm.

        Even if different bitResolutions are used in the constructor HashingAlgorithm(int), algorithmId() MUST return the same id for two instances if the returned hashes for the same input will always be equal. Therefore, instead of checking against the bitResolution, the actual resolution as returned by getKeyResolution() should be used.

        Returns:
        the algorithm id identifying this hashing algorithm
      • precomputeAlgoId

        protected abstract int precomputeAlgoId()
        A unique id identifying the settings and algorithms used to generate the output result. This method shall contain a hash code for the object which
        • Stays consistent throughout restart of the jvm
        • Value does not change after the constructor finished
        • Must return the same value if two instances compute the same hashes for identical input

        Even if different bitResolutions are used in the constructor HashingAlgorithm(int), algorithmId() MUST return the same id for two instances if the returned hashes for the same input will always be equal. Therefore, instead of checking against the bitResolution, the actual resolution as returned by getKeyResolution() should be used. This method computes the algorithm id from the information available to the child class; the result will be extended by the hash code of the kernels.

        Returns:
        the preliminary algorithm id identifying this hashing algorithm
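One way to satisfy the requirements listed above is to combine only values whose hash codes are specified by the platform, such as String.hashCode() and plain int values, and to avoid Object's identity hash code, which can differ between JVM runs. The following is a sketch under that assumption. The class name, the method name stableId and the choice of inputs are hypothetical.

```java
final class AlgoIdSketch {

    /**
     * Combines an algorithm name, the actual key resolution and further
     * int settings into an id that is reproducible across JVM restarts.
     */
    static int stableId(String algorithmName, int keyResolution, int... settings) {
        int result = algorithmName.hashCode(); // specified by the String contract, hence stable
        result = 31 * result + keyResolution;  // actual resolution, not the requested one
        for (int setting : settings) {
            result = 31 * result + setting;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two instances that end up with the same key resolution share an id.
        System.out.println(stableId("SomeHash", 36) == stableId("SomeHash", 36)); // true
    }
}
```

Because the id is derived from the key resolution rather than the requested bit resolution, two instances that produce identical hashes receive the same id even if they were constructed with different arguments.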
      • getKeyResolution

        public int getKeyResolution()
        Get the actual bit key resolution of all hashes computed by this algorithm.

        Be aware that this value may differ from:

        • the supplied bit resolution during algorithm creation due to geometric constraints of the hashing algorithm.
        • the returned hash's BigInteger.bitCount() value due to preceding 0 bits being truncated in the big integer
        Returns:
        the actual bit resolution of the hash.
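The second point can be demonstrated with a plain BigInteger: values derived from the number itself, such as bitLength() and bitCount(), carry no information about leading zero bits. The sketch below uses a hypothetical padding helper and assumes a non-negative hash value.

```java
import java.math.BigInteger;

final class KeyLengthSketch {

    /** Left-pads the binary form of a non-negative hash to the full key resolution. */
    static String toPaddedBinary(BigInteger hash, int keyResolution) {
        String bits = hash.toString(2);
        StringBuilder padded = new StringBuilder();
        for (int i = bits.length(); i < keyResolution; i++) {
            padded.append('0');
        }
        return padded.append(bits).toString();
    }

    public static void main(String[] args) {
        BigInteger hash = new BigInteger("00000101", 2); // an 8 bit key with five leading zeros
        System.out.println(hash.bitLength());            // 3
        System.out.println(hash.bitCount());             // 2
        System.out.println(toPaddedBinary(hash, 8));     // 00000101
    }
}
```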
      • addFilter

        public void addFilter(Filter filter)
        Add a Filter to this hashing algorithm which will be used to alter the image before the hashing operation is applied. Kernels are invoked in the order they are added and are performed individually on all 3 RGB channels.

        Be aware that filters can only be added or removed until the first hash is computed. This limitation is enforced due to modified Kernels changing the hashcode of the object which might be used in hash collections leading to the object not being found after said operation.

        Parameters:
        filter - The filter to add.
        Throws:
        NullPointerException - if filter is null
        IllegalStateException - if a hash was already created and the object is considered immutable.
        Since:
        2.0.0
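The "mutable until the first hash" rule can be sketched as a simple guard flag, comparable in spirit to the immutableState field. This is an illustration only: filters are represented by plain strings, and the class below is hypothetical rather than a copy of the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class FreezableFilterList {

    private final List<String> filters = new ArrayList<>();
    private boolean immutableState = false;

    void addFilter(String filter) {
        Objects.requireNonNull(filter, "filter");
        ensureMutable();
        filters.add(filter);
    }

    boolean removeFilter(String filter) {
        ensureMutable();
        return filters.remove(filter); // removes the first occurrence only
    }

    /** Stand-in for computing a hash: the first call freezes the configuration. */
    int hash() {
        immutableState = true;
        return filters.hashCode();
    }

    private void ensureMutable() {
        if (immutableState) {
            throw new IllegalStateException("A hash was already created, the object is immutable");
        }
    }

    public static void main(String[] args) {
        FreezableFilterList algo = new FreezableFilterList();
        algo.addFilter("blur");
        algo.hash();
        try {
            algo.addFilter("sharpen");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Freezing the filter list keeps the object's hash code constant once it may have been stored in a hash based collection.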
      • removeFilter

        public boolean removeFilter(Filter filter)
        Remove the first occurrence of a Filter from this hashing algorithm.

        Be aware that filters can only be added or removed until the first hash is computed. This limitation is enforced due to modified Kernels changing the hashcode of the object which might be used in hash collections leading to the object not being found after said operation.

        Parameters:
        filter - The filter to remove.
        Returns:
        true if the filter was removed, false otherwise.
        Throws:
        IllegalStateException - if a hash was already created and the object is considered immutable.
        Since:
        2.0.0
      • createAlgorithmSpecificHash

        public Hash createAlgorithmSpecificHash(Hash original)
        Wraps the values supplied in the argument hash into a hash object as it would be produced by this algorithm.

        Some algorithms may choose to return an extended hash class to override certain behavior. In particular, the result of Hash.toImage(int) is likely to differ.

        If the algorithm does not utilize a special hash sub class this method returns the supplied argument.

        Parameters:
        original - the hash to transform
        Returns:
        a hash as it would be created by this algorithm.
        Since:
        3.0.0
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object