Class PerceptiveHash

  • All Implemented Interfaces:
    Serializable

    public class PerceptiveHash
    extends HashingAlgorithm
    Calculates a hash based on the frequency content of an image using the type II discrete cosine transform (DCT-II). This algorithm is very accurate and robust to several image transformations, and it usually distinguishes images much better than color or gradient based approaches.
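
    The idea of a frequency based hash can be illustrated with a minimal sketch. It is not the implementation of this class: the naive DCT-II, the class and method names, the choice of the top-left k x k block, and thresholding against the block mean (including the DC term) are all assumptions made for illustration.

```java
import java.math.BigInteger;

final class DctHashSketch {

    /** Naive, unnormalized 2D DCT-II of a square matrix (O(n^4)); for illustration only. */
    static double[][] dct2(double[][] px) {
        int n = px.length;
        double[][] out = new double[n][n];
        for (int u = 0; u < n; u++) {
            for (int v = 0; v < n; v++) {
                double sum = 0;
                for (int x = 0; x < n; x++) {
                    for (int y = 0; y < n; y++) {
                        sum += px[x][y]
                                * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * n))
                                * Math.cos((2 * y + 1) * v * Math.PI / (2.0 * n));
                    }
                }
                out[u][v] = sum;
            }
        }
        return out;
    }

    /**
     * Hypothetical hash: one bit per coefficient of the low frequency
     * k x k block, set if the coefficient exceeds the mean of that block.
     */
    static BigInteger hash(double[][] luminance, int k) {
        double[][] freq = dct2(luminance);
        double mean = 0;
        for (int u = 0; u < k; u++) {
            for (int v = 0; v < k; v++) {
                mean += freq[u][v];
            }
        }
        mean /= (k * k);

        BigInteger h = BigInteger.ZERO;
        for (int u = 0; u < k; u++) {
            for (int v = 0; v < k; v++) {
                h = h.shiftLeft(1);          // make room for the next bit
                if (freq[u][v] > mean) {
                    h = h.setBit(0);
                }
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Synthetic 8x8 gradient standing in for a grayscale image
        double[][] img = new double[8][8];
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                img[x][y] = x * 8 + y;
            }
        }
        System.out.println(hash(img, 4).toString(2));
    }
}
```

    Because every coefficient and the mean scale together, uniformly brightening the input by a constant factor leaves the bits of this sketch unchanged, which hints at why frequency based hashes tolerate some image transformations.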

    Due to the nature of the DCT, the bits of the hashes generated by this algorithm do not each have a 50% chance of being set. As a result, the normalized Hamming distance usually does not cover the entire [0, 1] range.
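
    The effect of biased bits can be made concrete with a small calculation. Assuming, as a simplification, that the bits of two unrelated hashes are independent and each set with the same probability p, the expected normalized Hamming distance is 2p(1 - p), which peaks at 0.5 for p = 0.5 and shrinks as the bias grows. The class name and the value p = 0.3 below are illustrative only and are not measured from this algorithm.

```java
final class BitBiasSketch {

    /**
     * Expected normalized Hamming distance between two unrelated hashes
     * whose bits are independently set with probability p (assumption).
     * Two bits differ if exactly one of them is set: p(1-p) + (1-p)p.
     */
    static double expectedNormalizedDistance(double p) {
        return 2.0 * p * (1.0 - p);
    }

    public static void main(String[] args) {
        System.out.println(expectedNormalizedDistance(0.5)); // unbiased bits
        System.out.println(expectedNormalizedDistance(0.3)); // biased bits, smaller value
    }
}
```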

    Implnote: In future versions this issue could possibly be addressed by applying principal component analysis to a large set of hashes to determine which bits carry the most information. Another option may be to look at the difference to a fixed cosine.

    Many other implementations differ in whether they compute the hash relative to the mean or the median value. This choice should be evaluated as well.

    Since:
    1.0.0
    Author:
    Kilian
    See Also:
    Serialized Form
    • Constructor Detail

      • PerceptiveHash

        public PerceptiveHash(int bitResolution)
        Parameters:
        bitResolution - The bit resolution specifies the final length of the generated hash. A higher resolution increases computation time and space requirements, but allows the hash to track finer detail in the image. Be aware that a high resolution is not always desirable.
    • Method Detail

      • hash

        protected BigInteger hash(BufferedImage image,
                                  HashBuilder hash)
        Description copied from class: HashingAlgorithm
        Calculates a hash for the given image. Invoking the hash function on the same image must always return the same hash value. Comparing two hashes yields a distance that reflects the similarity of the images: the lower the value, the more similar the images are. Equal images produce a distance of 0.

        This method is intended to be overridden by implementations. It takes a hash builder argument to allow concatenating multiple hashes and to make it possible to compute the effective hash length in HashingAlgorithm.getKeyResolution(). Leading zeros are omitted in BigInteger objects. The plain Hamming distance can still be calculated without issue by XORing the two values, but the normalized distance requires the potential length of the key to be known.
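
        The point about omitted leading zeros can be sketched with plain java.math.BigInteger. The class and method names are hypothetical, and the key length of 8 is an example value. This is not the library's own distance code.

```java
import java.math.BigInteger;

final class NormalizedDistanceSketch {

    /** Number of differing bits. Leading zeros never differ, so their omission is harmless here. */
    static int hammingDistance(BigInteger a, BigInteger b) {
        return a.xor(b).bitCount();
    }

    /** Requires the intended key length, because BigInteger does not store leading zeros. */
    static double normalizedHammingDistance(BigInteger a, BigInteger b, int keyLength) {
        return hammingDistance(a, b) / (double) keyLength;
    }

    public static void main(String[] args) {
        // Two 8 bit hashes whose upper five bits are zero
        BigInteger a = new BigInteger("00000101", 2);
        BigInteger b = new BigInteger("00000110", 2);

        // bitLength() reports only the significant bits, not the 8 bits of the key
        System.out.println(a.bitLength());
        System.out.println(hammingDistance(a, b));
        System.out.println(normalizedHammingDistance(a, b, 8));
    }
}
```

        Dividing by bitLength() instead of the known key length would overstate the normalized distance whenever a hash happens to start with zero bits.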

        Specified by:
        hash in class HashingAlgorithm
        Parameters:
        image - Image whose hash will be calculated
        hash - a hash builder used to construct the hash
        Returns:
        the hash encoded as a big integer
      • precomputeAlgoId

        protected int precomputeAlgoId()
        Description copied from class: HashingAlgorithm
        A unique id identifying the settings and algorithms used to generate the output result. This method shall return a hash code for the object which
        • Stays consistent across restarts of the JVM
        • Does not change after the constructor has finished
        • Is the same for two instances if they compute the same hashes for identical input

        Even if different bitResolutions are passed to the constructor HashingAlgorithm(int), the algorithm id MUST be the same for two instances if the hashes they return for the same input will always be equal. Therefore, instead of checking against the bitResolution, the actual resolution as returned by HashingAlgorithm.getKeyResolution() should be used. This method computes the algorithm id from the information available to the child class; the result will be extended by the hash code of the kernels.
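
        A minimal sketch of an id that meets these requirements follows. The rounding rule in actualKeyResolution (next square number) is purely hypothetical and stands in for whatever HashingAlgorithm.getKeyResolution() reports. The class name, method signatures, and the algorithm name string are assumptions as well.

```java
import java.util.Objects;

final class AlgoIdSketch {

    /**
     * Hypothetical mapping from the requested to the actual key length:
     * the request is rounded up to the next square number, so different
     * requests can lead to identical hashes.
     */
    static int actualKeyResolution(int bitResolution) {
        int side = (int) Math.ceil(Math.sqrt(bitResolution));
        return side * side;
    }

    /**
     * Builds the id from values whose hash codes are specified by the JDK
     * (String and Integer), so it stays stable across JVM restarts.
     * Identity based hash codes such as Object.hashCode() would not.
     */
    static int precomputeAlgoId(String algorithmName, int bitResolution) {
        return Objects.hash(algorithmName, actualKeyResolution(bitResolution));
    }

    public static void main(String[] args) {
        // 60 and 64 both map to an actual resolution of 64 in this sketch
        System.out.println(precomputeAlgoId("PerceptiveHash", 60)
                == precomputeAlgoId("PerceptiveHash", 64));
    }
}
```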

        Specified by:
        precomputeAlgoId in class HashingAlgorithm
        Returns:
        the preliminary algorithm id identifying this hashing algorithm