Class Hash
- java.lang.Object
-
- dev.brachtendorf.jimagehash.hash.Hash
-
- All Implemented Interfaces:
Serializable
- Direct Known Subclasses:
DifferenceHash.DHash, FuzzyHash
public class Hash extends Object implements Serializable
Hashes are bit-encoded values (e.g. 0101011101) created from images using a hashing algorithm. Hashes enable a quick approximate similarity comparison between images while storing only a fraction of the original data. They are created by downscaling the image information, which enables quick comparison between instances produced by the same algorithm. Each bit in the hash usually represents a section of the image containing certain information (hue, brightness, color, frequencies or gradients).
- Since:
- 1.0.0, 3.0.0 Serializable
- Author:
- Kilian
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected int | algorithmId | Unique identifier of the algorithm and settings used to create the hash.
protected int | hashLength | How many bits does this hash represent.
protected BigInteger | hashValue | Hash value representation. Hashes are constructed by left shifting BigIntegers with either zero or one depending on the condition found in the image.
-
Constructor Summary
Constructors
Constructor | Description
Hash(BigInteger hashValue, int hashLength, int algorithmId) | Creates a Hash object with the specified hashValue and algorithmId.
-
Method Summary
Modifier and Type | Method | Description
boolean | equals(Object obj) |
static Hash | fromFile(File source) | Reads a hash from a serialization file and returns it.
int | getAlgorithmId() | Return the algorithm identifier specifying by which algorithm and setting this hash was created.
boolean | getBit(int position) | Check if the bit at the given position is set.
int | getBitResolution() |
boolean | getBitUnsafe(int position) | Check if the bit at the given position of the hash is set.
BigInteger | getHashValue() |
int | hammingDistance(Hash h) | Calculate the Hamming distance of 2 hash values.
int | hammingDistanceFast(Hash h) | Calculate the Hamming distance of 2 hash values.
int | hammingDistanceFast(BigInteger bInt) | Calculate the Hamming distance of 2 hash values.
int | hashCode() |
double | normalizedHammingDistance(Hash h) | Calculate the Hamming distance of 2 hash values.
double | normalizedHammingDistanceFast(Hash h) | Calculate the Hamming distance of 2 hash values.
byte[] | toByteArray() | Return the byte representation of the big integer with the leading zero byte stripped if present.
void | toFile(File saveLocation) | Saves this hash to a file for persistent storage.
BufferedImage | toImage(int blockSize) | Creates a visual representation of the hash mapping the hash values to the section of the rescaled image used to generate the hash assuming default bit encoding.
BufferedImage | toImage(int[] bitColorIndex, javafx.scene.paint.Color[] colors, int blockSize) | Creates a visual representation of the hash mapping the hash values to the section of the rescaled image used to generate the hash.
BufferedImage | toImage(int blockSize, HashingAlgorithm hasher) | Creates a visual representation of the hash mapping the hash values to the section of the rescaled image used to generate the hash.
String | toString() |
-
-
-
Field Detail
-
algorithmId
protected int algorithmId
Unique identifier of the algorithm and settings used to create the hash
-
hashValue
protected BigInteger hashValue
Hash value representation. Hashes are constructed by left shifting BigIntegers with either zero or one depending on the condition found in the image. Leading 0s are truncated, therefore it is the algorithm's responsibility to add a 1 padding bit at the beginning. Example values: new BigInteger("011011"), new BigInteger("000101"). Padded form: 1xxxxx.
-
hashLength
protected int hashLength
The number of bits this hash represents. Necessary because leading 0 bits are dropped by the BigInteger.
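The following is a minimal sketch of why the bit length has to be tracked next to the BigInteger. It uses only java.math.BigInteger; the class LeadingZeroSketch and its methods fromBits and toBits are hypothetical names for illustration and are not part of this library.

```java
import java.math.BigInteger;

/** Illustration only: a BigInteger alone cannot remember leading zero bits. */
final class LeadingZeroSketch {

    /** Builds a value by left shifting and appending one bit per character. */
    static BigInteger fromBits(String bits) {
        BigInteger value = BigInteger.ZERO;
        for (char c : bits.toCharArray()) {
            value = value.shiftLeft(1);
            if (c == '1') {
                value = value.setBit(0);
            }
        }
        return value;
    }

    /** Renders the value as a bit string, restoring leading zeros from the known length. */
    static String toBits(BigInteger value, int hashLength) {
        String raw = value.toString(2);
        StringBuilder sb = new StringBuilder();
        for (int i = raw.length(); i < hashLength; i++) {
            sb.append('0');
        }
        return sb.append(raw).toString();
    }

    public static void main(String[] args) {
        BigInteger value = fromBits("000101");
        // The three leading zeros are gone: only 3 significant bits remain.
        System.out.println(value.bitLength());
        // With the separately stored length the original 6 bit pattern is recoverable.
        System.out.println(toBits(value, 6));
    }
}
```

Storing the length separately, as the hashLength field does, keeps the value itself free of any artificial marker bit.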
-
-
Constructor Detail
-
Hash
public Hash(BigInteger hashValue, int hashLength, int algorithmId)
Creates a Hash object with the specified hashValue and algorithmId. To allow safe comparison of different hashes, they have to be generated by the same algorithm.
- Parameters:
hashValue - The hash value describing the image
hashLength - the actual bit resolution of the hash. The BigInteger truncates leading zero bits, resulting in a loss of length information.
algorithmId - Unique identifier of the algorithm used to create this hash
-
-
Method Detail
-
hammingDistance
public int hammingDistance(Hash h)
Calculates the Hamming distance between two hash values. The distance is the number of bit positions at which the two hashes differ and falls within [0, bitResolution]. Lower values indicate closer similarity. Identical images must return a score of 0, but a score of 0 does not mean the images are identical.
A longer hash (higher bitResolution) increases the average Hamming distance returned. This method allows the most accurate fine tuning of the distance, whereas normalizedHammingDistance(Hash) is independent of the hash length.
Only hashes produced by the same algorithm with the same settings return a meaningful result and should be compared. This method checks whether the hashes are compatible. If no such check is required, see hammingDistanceFast(Hash).
- Parameters:
h - The hash to calculate the distance to
- Returns:
- similarity value ranging between [0 - hash length]
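As a rough sketch of the idea, a Hamming distance over BigInteger values can be computed with an XOR followed by a bit count. The class HammingSketch and both method names are hypothetical, and throwing IllegalArgumentException on mismatched algorithm ids is an assumption; the exception this library actually uses is not stated here.

```java
import java.math.BigInteger;

/** Illustration only: Hamming distance between two non-negative BigIntegers. */
final class HammingSketch {

    /** Unchecked variant: XOR marks differing bits, bitCount counts them. */
    static int distanceFast(BigInteger a, BigInteger b) {
        return a.xor(b).bitCount();
    }

    /**
     * Checked variant: refuses to compare hashes created by different algorithms.
     * The exception type is an assumption made for this sketch.
     */
    static int distance(BigInteger a, int algorithmIdA, BigInteger b, int algorithmIdB) {
        if (algorithmIdA != algorithmIdB) {
            throw new IllegalArgumentException("Hashes were created by different algorithms");
        }
        return distanceFast(a, b);
    }

    public static void main(String[] args) {
        BigInteger a = new BigInteger("101101", 2);
        BigInteger b = new BigInteger("100111", 2);
        // The values differ at bit positions 1 and 3, so the distance is 2.
        System.out.println(distanceFast(a, b));
    }
}
```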
-
hammingDistanceFast
public int hammingDistanceFast(Hash h)
Calculates the Hamming distance between two hash values. The distance is the number of bit positions at which the two hashes differ and falls within [0, bitResolution]. Lower values indicate closer similarity. Identical images must return a score of 0, but a score of 0 does not mean the images are identical.
A longer hash (higher bitResolution) increases the average Hamming distance returned. This method allows the most accurate fine tuning of the distance, whereas normalizedHammingDistance(Hash) is independent of the hash length.
Only hashes produced by the same algorithm with the same settings return a meaningful result and should be compared. This method will NOT check if the hashes are compatible.
- Parameters:
h - The hash to calculate the distance to
- Returns:
- similarity value ranging between [0 - hash length]
- See Also:
hammingDistance(Hash)
-
hammingDistanceFast
public int hammingDistanceFast(BigInteger bInt)
Calculates the Hamming distance between two hash values. The distance is the number of bit positions at which the two hashes differ and falls within [0, bitResolution]. Lower values indicate closer similarity. Identical images must return a score of 0, but a score of 0 does not mean the images are identical.
A longer hash (higher bitResolution) increases the average Hamming distance returned. This method allows the most accurate fine tuning of the distance, whereas normalizedHammingDistance(Hash) is independent of the hash length.
Only hashes produced by the same algorithm with the same settings return a meaningful result and should be compared. This method will NOT check if the hashes are compatible.
- Parameters:
bInt - A big integer representing a hash
- Returns:
- similarity value ranging between [0 - hash length]
- See Also:
hammingDistance(Hash)
-
normalizedHammingDistance
public double normalizedHammingDistance(Hash h)
Calculates the normalized Hamming distance between two hash values. The distance is the number of bit positions at which the two hashes differ. The normalized Hamming distance falls within [0, 1]. Lower values indicate closer similarity. Identical images must return a score of 0, but a score of 0 does not mean the images are identical.
See hammingDistance(Hash) for a non-normalized version.
Only hashes produced by the same algorithm with the same settings return a meaningful result and should be compared. This method checks whether the hashes are compatible. If no such check is required, see normalizedHammingDistanceFast(Hash).
- Parameters:
h - The hash to calculate the distance to
- Returns:
- similarity value ranging between [0 - 1]
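One way to obtain a length-independent value in [0, 1] is to divide the raw Hamming distance by the hash length. The sketch below assumes exactly that normalization; the class NormalizedHammingSketch and its method are hypothetical names, and the library's own implementation may differ in detail.

```java
import java.math.BigInteger;

/** Illustration only: Hamming distance scaled by the hash length. */
final class NormalizedHammingSketch {

    /** Returns differingBits / hashLength, a value between 0.0 and 1.0. */
    static double normalizedDistance(BigInteger a, BigInteger b, int hashLength) {
        if (hashLength <= 0) {
            throw new IllegalArgumentException("hashLength must be positive");
        }
        return a.xor(b).bitCount() / (double) hashLength;
    }

    public static void main(String[] args) {
        BigInteger a = new BigInteger("101101", 2);
        BigInteger b = new BigInteger("100111", 2);
        // 2 of the 6 bits differ, so the result is one third.
        System.out.println(normalizedDistance(a, b, 6));
    }
}
```

Because the result no longer depends on the hash length, the same threshold can be reused when the bit resolution of the algorithm changes.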
-
normalizedHammingDistanceFast
public double normalizedHammingDistanceFast(Hash h)
Calculates the normalized Hamming distance between two hash values. The distance is the number of bit positions at which the two hashes differ. The normalized Hamming distance falls within [0, 1]. Lower values indicate closer similarity. Identical images must return a score of 0, but a score of 0 does not mean the images are identical.
See hammingDistance(Hash) for a non-normalized version.
Only hashes produced by the same algorithm with the same settings return a meaningful result and should be compared. This method will NOT check if the hashes are compatible.
- Parameters:
h - The hash to calculate the distance to
- Returns:
- similarity value ranging between [0 - 1]
- See Also:
hammingDistance(Hash)
-
getBit
public boolean getBit(int position)
Checks whether the bit at the given position is set.
- Parameters:
position - position of the bit. An index of 0 points to the lowest (rightmost) bit
- Returns:
- true if the bit is set (1) or false if it's not set (0)
- Throws:
IllegalArgumentException - if the supplied index is outside the hash bounds
- Since:
- 2.0.0
-
getBitUnsafe
public boolean getBitUnsafe(int position)
Checks whether the bit at the given position of the hash is set. This method does not check the bounds of the supplied argument.
- Parameters:
position - position of the bit. An index of 0 points to the lowest (rightmost) bit
- Returns:
- true if the bit is set (1). False if it's not set (0) or the index is bigger than the hash length.
- Throws:
ArithmeticException - if position is negative
- Since:
- 2.0.0
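The difference between the checked and the unchecked bit lookup can be sketched with BigInteger.testBit, which returns false for positions beyond the value's bit length and throws ArithmeticException for negative positions. The class BitAccessSketch is a hypothetical stand-in; it only mirrors the documented behavior and is not this library's code.

```java
import java.math.BigInteger;

/** Illustration only: bounds-checked versus unchecked bit access. */
final class BitAccessSketch {

    /** Checked lookup: rejects positions outside [0, hashLength). */
    static boolean getBit(BigInteger hashValue, int hashLength, int position) {
        if (position < 0 || position >= hashLength) {
            throw new IllegalArgumentException("Position " + position + " is outside the hash bounds");
        }
        return hashValue.testBit(position);
    }

    /** Unchecked lookup: delegates directly to BigInteger.testBit. */
    static boolean getBitUnsafe(BigInteger hashValue, int position) {
        return hashValue.testBit(position);
    }

    public static void main(String[] args) {
        BigInteger hashValue = new BigInteger("0101", 2);
        System.out.println(getBit(hashValue, 4, 0));       // lowest bit is set
        System.out.println(getBitUnsafe(hashValue, 100));  // beyond the hash: reported as not set
    }
}
```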
-
getAlgorithmId
public int getAlgorithmId()
Returns the algorithm identifier specifying which algorithm and settings were used to create this hash. The id shall remain constant.
- Returns:
- The algorithm id
-
getHashValue
public BigInteger getHashValue()
- Returns:
- the base BigInteger holding the hash value
-
toImage
public BufferedImage toImage(int blockSize)
Creates a visual representation of the hash, mapping the hash values to the section of the rescaled image used to generate the hash, assuming default bit encoding. Some hash algorithms may choose to construct their hashes in a non-default manner (e.g. DifferenceHash). In this case toImage(int, HashingAlgorithm) may help to resolve the issue.
- Parameters:
blockSize - scaling factor of each pixel in the hash. Each bit of the hash will be represented by blockSize*blockSize pixels
- Returns:
- A black and white image representing the individual bits of the hash
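To make the blockSize parameter concrete, here is a rough sketch that paints each bit as a blockSize*blockSize square. It assumes a square grid filled row by row starting with bit 0, and white for set bits; the bit order, layout, and colors used by the library itself are not specified here, and HashImageSketch is a hypothetical name.

```java
import java.awt.image.BufferedImage;
import java.math.BigInteger;

/** Illustration only: renders hash bits as black and white blocks. */
final class HashImageSketch {

    static BufferedImage toImage(BigInteger hashValue, int hashLength, int blockSize) {
        // Assumption: bits are laid out on the smallest square grid that fits them.
        int side = (int) Math.ceil(Math.sqrt(hashLength));
        BufferedImage image = new BufferedImage(
                side * blockSize, side * blockSize, BufferedImage.TYPE_INT_RGB);

        for (int bit = 0; bit < hashLength; bit++) {
            int rgb = hashValue.testBit(bit) ? 0xFFFFFF : 0x000000;
            int left = (bit % side) * blockSize;
            int top = (bit / side) * blockSize;
            // Fill the block belonging to this bit.
            for (int y = top; y < top + blockSize; y++) {
                for (int x = left; x < left + blockSize; x++) {
                    image.setRGB(x, y, rgb);
                }
            }
        }
        return image;
    }

    public static void main(String[] args) {
        BufferedImage image = toImage(new BigInteger("0110", 2), 4, 3);
        // A 4 bit hash on a 2x2 grid with 3 pixel blocks yields a 6x6 image.
        System.out.println(image.getWidth() + "x" + image.getHeight());
    }
}
```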
-
toImage
public BufferedImage toImage(int blockSize, HashingAlgorithm hasher)
Creates a visual representation of the hash, mapping the hash values to the section of the rescaled image used to generate the hash. Some hash algorithms may choose to construct their hashes in a non-default manner (e.g. DifferenceHash).
- Parameters:
blockSize - scaling factor of each pixel in the hash. Each bit of the hash will be represented by blockSize*blockSize pixels
hasher - HashingAlgorithm which created this hash.
- Returns:
- A black and white image representing the individual bits of the hash
- Since:
- 3.0.0
-
toImage
public BufferedImage toImage(int[] bitColorIndex, javafx.scene.paint.Color[] colors, int blockSize)
Creates a visual representation of the hash, mapping the hash values to the section of the rescaled image used to generate the hash.
- Parameters:
bitColorIndex - array mapping each bit of the hash to a color of the color array
colors - array to colorize the pixels
blockSize - scaling factor of each pixel in the hash. Each bit of the hash will be represented by blockSize*blockSize pixels
- Returns:
- A colorized image representing the individual bits of the hash
-
getBitResolution
public int getBitResolution()
- Returns:
- the hash resolution in bits
-
toFile
public void toFile(File saveLocation) throws IOException
Saves this hash to a file for persistent storage. The hash can later be recovered by calling fromFile(File).
- Parameters:
saveLocation - the file to save the hash to
- Throws:
IOException - If an error occurs during file access
- Since:
- 3.0.0
-
fromFile
public static Hash fromFile(File source) throws IOException, ClassNotFoundException
Reads a hash from a serialization file and returns it. Only hashes that were saved by the same class using toFile(File) can be read from a file.
- Parameters:
source - The file this hash can be read from.
- Returns:
- a hash object
- Throws:
IOException - If an error occurs during file read
ClassNotFoundException - if the class used to serialize this hash cannot be found
- Since:
- 3.0.0
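The toFile/fromFile pair can be pictured as a plain Java serialization round trip. The sketch below assumes standard ObjectOutputStream and ObjectInputStream usage, which matches the declared IOException and ClassNotFoundException but is not confirmed as the library's implementation. SimpleHash is a hypothetical stand-in for Hash.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.math.BigInteger;

/** Illustration only: saving and restoring a serializable hash-like object. */
final class HashFileSketch {

    /** Hypothetical stand-in carrying the same three fields as Hash. */
    static final class SimpleHash implements Serializable {
        private static final long serialVersionUID = 1L;
        final BigInteger hashValue;
        final int hashLength;
        final int algorithmId;

        SimpleHash(BigInteger hashValue, int hashLength, int algorithmId) {
            this.hashValue = hashValue;
            this.hashLength = hashLength;
            this.algorithmId = algorithmId;
        }
    }

    /** Writes the object to the given file using Java serialization. */
    static void toFile(SimpleHash hash, File saveLocation) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(saveLocation))) {
            out.writeObject(hash);
        }
    }

    /** Reads the object back; fails if the file was not written by toFile. */
    static SimpleHash fromFile(File source) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(source))) {
            return (SimpleHash) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("hash-sketch", ".ser");
        file.deleteOnExit();
        toFile(new SimpleHash(new BigInteger("101101", 2), 6, 42), file);
        SimpleHash restored = fromFile(file);
        System.out.println(restored.hashValue.toString(2) + " / " + restored.hashLength);
    }
}
```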
-
toByteArray
public byte[] toByteArray()
Returns the byte representation of the big integer with the leading zero byte stripped if present. The BigInteger class prepends a sign byte if necessary to indicate the signum of the number. Since our hashes are always positive, we can get rid of it and reduce the space requirement in our db by 1 byte.
To reconstruct the big integer value we can simply prepend a [0x00] byte even if it wasn't present in the first place. The constructor BigInteger(byte[]) will take care of it.
- Returns:
- the byte representation of the big integer without an artificial sign byte.
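The sign byte handling described above can be sketched as follows. The class SignByteSketch and its method names are hypothetical; keeping a lone zero byte for the value 0 is a choice made for this sketch, and the library may handle that edge case differently.

```java
import java.math.BigInteger;
import java.util.Arrays;

/** Illustration only: stripping and restoring BigInteger's sign byte. */
final class SignByteSketch {

    /** Drops the leading 0x00 sign byte if BigInteger added one. */
    static byte[] toByteArray(BigInteger hashValue) {
        byte[] bytes = hashValue.toByteArray();
        if (bytes.length > 1 && bytes[0] == 0) {
            return Arrays.copyOfRange(bytes, 1, bytes.length);
        }
        return bytes;
    }

    /** Prepends 0x00 unconditionally so the value is always read as positive. */
    static BigInteger fromByteArray(byte[] stripped) {
        byte[] padded = new byte[stripped.length + 1]; // padded[0] stays 0x00
        System.arraycopy(stripped, 0, padded, 1, stripped.length);
        return new BigInteger(padded);
    }

    public static void main(String[] args) {
        BigInteger value = BigInteger.valueOf(255);
        // 255 needs a sign byte in two's complement: [0x00, 0xFF].
        System.out.println(value.toByteArray().length);
        // After stripping only [0xFF] remains.
        System.out.println(toByteArray(value).length);
        // Prepending 0x00 restores the positive value.
        System.out.println(fromByteArray(toByteArray(value)));
    }
}
```

Without the prepended zero byte, new BigInteger(new byte[] {(byte) 0xFF}) would be interpreted as -1, which is why the reconstruction step matters.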
-
-