Class DatabaseImageMatcher
java.lang.Object
  dev.brachtendorf.jimagehash.matcher.TypedImageMatcher
    dev.brachtendorf.jimagehash.matcher.persistent.database.DatabaseImageMatcher
- All Implemented Interfaces:
Serializable, AutoCloseable
- Direct Known Subclasses:
H2DatabaseImageMatcher
public class DatabaseImageMatcher extends TypedImageMatcher implements Serializable, AutoCloseable
A naive database-based image matcher implementation. Images indexed by this matcher are added to the database and retrieved when an image match is queried.

The image matcher supports chaining multiple hashing steps, which are invoked in the order the algorithms were added. Once a hashing algorithm fails to match a specific image, the image is discarded, pruning the search tree quickly.
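The chaining behaviour described above can be sketched with plain Java collections. This is a minimal illustration under stated assumptions, not the library's implementation: `ChainedFilterSketch`, `Step` and `match` are hypothetical names, hashes are assumed to fit into a 64 bit `long`, and each step's table is modelled as an in-memory map instead of a database table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChainedFilterSketch {

    /** Hypothetical stand-in for one hashing step: stored hashes plus a distance limit. */
    public static final class Step {
        final Map<String, Long> storedHashes = new LinkedHashMap<>(); // unique id -> hash
        final int maxDistance;                                        // absolute Hamming distance

        public Step(int maxDistance) {
            this.maxDistance = maxDistance;
        }

        public Step store(String uniqueId, long hash) {
            storedHashes.put(uniqueId, hash);
            return this;
        }
    }

    /**
     * Runs the steps in insertion order. queryHashes[i] is the query image's hash
     * for step i. An id survives only if every step holds a hash for it that lies
     * within that step's distance limit.
     */
    public static List<String> match(List<Step> steps, long[] queryHashes) {
        if (steps.isEmpty()) {
            return new ArrayList<>();
        }
        // Before the first step every id known to that step is a candidate.
        List<String> candidates = new ArrayList<>(steps.get(0).storedHashes.keySet());
        for (int i = 0; i < steps.size() && !candidates.isEmpty(); i++) {
            Step step = steps.get(i);
            List<String> survivors = new ArrayList<>();
            for (String id : candidates) {
                Long stored = step.storedHashes.get(id);
                // A missing hash or a distance above the limit discards the candidate.
                if (stored != null && Long.bitCount(stored ^ queryHashes[i]) <= step.maxDistance) {
                    survivors.add(id);
                }
            }
            candidates = survivors;
        }
        return candidates;
    }
}
```

In this sketch an id that has no entry in a later step's table is dropped even if it passed every earlier step, so the candidate set can only shrink from step to step.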
In contrast to the ConsecutiveMatcher, this matcher does not store a reference to the image data itself but only keeps track of the hash and the url of the image file. Additionally, if hashing algorithms are added after images have been hashed, those images will not be found without reindexing them.

Multiple database image matchers may use the same database, in which case hashes created by the same hashing algorithm will be used in both matchers.
    DatabaseImageMatcher matcher0, matcher1;

    matcher0.addHashingAlgorithm(new AverageHash(32),...,...)
    matcher1.addHashingAlgorithm(new AverageHash(32),...,...)
    matcher0.addHashingAlgorithm(new AverageHash(24),...,...)

    matcher0.addImage(Image1)

Starting from this point matcher1 would also be able to match against Image1. Be aware that this relationship isn't symmetric. Images added by calling matcher1.addImage(..) will be matched at the first step in matcher0 but fail to find a hash for AverageHash(24), and are therefore discarded as a possible match.

If this behaviour is not desired, simply choose a different database for each image matcher.
2 + n tables are generated to save values:
- ImageHasher(id,serialize): Allows an image matcher to be serialized to the database
- HashingAlgos(id,keyLenght): Saves the bit resolution of each hashing algorithm
- ... n: one table for each hashing algorithm used in an image matcher
For each and every match the hashes have to be read from the database. This allows hashes to be stored persistently but might not be as efficient as the ConsecutiveMatcher. Optimizations may include storing level 0 or level 1 hashes (hashes created by the first invoked hashing algorithms) in memory and only retrieving the later hashes from the database.
- Since:
- 2.0.2 added; 3.0.0 extracted the H2 database image matcher into its own class
- Author:
- Kilian
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class dev.brachtendorf.jimagehash.matcher.TypedImageMatcher
TypedImageMatcher.AlgoSettings
-
-
Field Summary
Fields
protected Connection conn
    Database connection.
Fields inherited from class dev.brachtendorf.jimagehash.matcher.TypedImageMatcher
steps
-
-
Constructor Summary
Constructors
DatabaseImageMatcher(Connection connection)
    Attempts to establish a connection to the given database using the supplied connection object.
-
Method Summary
Methods
void addHashingAlgorithm(HashingAlgorithm algo, double threshold)
    Append a new hashing algorithm which will be executed after all hash algorithms passed the test.
void addHashingAlgorithm(HashingAlgorithm algo, double threshold, boolean normalized)
    Append a new hashing algorithm which will be executed after all hash algorithms passed the test.
protected void addImage(HashingAlgorithm hashAlgo, String url, BufferedImage image)
void addImage(File imageFile)
    Index the image.
void addImage(String uniqueId, BufferedImage image)
    Index the image.
void addImage(String uniqueId, File imageFile)
    Index the image.
void addImages(File... images)
    Index the images.
void addImages(String[] uniqueIds, BufferedImage[] images)
    Index the images.
void addImages(String[] uniqueIds, File[] images)
    Index the images.
void clearHashingAlgorithms(boolean forceTableDeletion)
    Removes all hashing algorithms from the image matcher.
void close()
protected void createHashTable(HashingAlgorithm hasher)
    Create a table to hold image hashes for a particular image hashing algorithm.
boolean doesEntryExist(String uniqueId, HashingAlgorithm hashAlgo)
    Check if an entry with the given uniqueId already exists.
protected boolean doesTableExist(String tableName)
    Query if the database contains a table with the given name.
Map<String,PriorityQueue<Result<String>>> getAllMatchingImages()
    Return all images stored in the database which are considered matches to other images in the database.
static DatabaseImageMatcher getFromDatabase(Connection conn, int id)
    Get a database image matcher which previously was serialized using serializeToDatabase(int).
PriorityQueue<Result<String>> getMatchingImages(BufferedImage image)
    Search for all similar images passing the algorithm filters supplied to this matcher.
PriorityQueue<Result<String>> getMatchingImages(File imageFile)
    Search for all similar images passing the algorithm filters supplied to this matcher.
PriorityQueue<Result<String>> getMatchingImagesWithinDistance(BufferedImage image, double[] normalizedDistance)
    Search for all similar images passing the algorithm filters supplied to this matcher.
protected List<Result<String>> getSimilarImages(Hash targetHash, int maxDistance, HashingAlgorithm hasher)
    Return all url descriptors which describe images within the provided Hamming distance of the supplied hash.
protected void initialize(Connection conn)
    Create the default tables used if they do not yet exist.
protected Hash reconstructHashFromDatabase(HashingAlgorithm hasher, byte[] bytes)
    Reconstruct a hash value from the database.
boolean removeHashingAlgo(HashingAlgorithm algo, boolean forceTableDeletion)
    Removes the hashing algorithm from the image matcher.
protected String resolveTableName(HashingAlgorithm hashAlgo)
    Map a hashing algorithm to a table name.
void serializeToDatabase(int id)
    Serialize this image matcher to the database.
String toString()
-
Methods inherited from class dev.brachtendorf.jimagehash.matcher.TypedImageMatcher
clearHashingAlgorithms, equals, getAlgorithms, hashCode, removeHashingAlgo
-
-
-
-
Field Detail
-
conn
protected transient Connection conn
Database connection. Maybe use connection pooling?
-
-
Constructor Detail
-
DatabaseImageMatcher
public DatabaseImageMatcher(Connection connection) throws SQLException
Attempts to establish a connection to the given database using the supplied connection object. If the database does not yet exist, an empty database will be initialized.
- Parameters:
connection - the database connection
- Throws:
SQLException - if a database access error occurs
SQLTimeoutException - when the driver has determined that the timeout value specified by the setLoginTimeout method has been exceeded and has at least tried to cancel the current database connection attempt
-
-
Method Detail
-
getFromDatabase
public static DatabaseImageMatcher getFromDatabase(Connection conn, int id) throws SQLException
Get a database image matcher which was previously serialized using serializeToDatabase(int). If the serialized matcher does not exist, the connection will be closed.
- Parameters:
conn - the database connection
id - the id supplied to the serializeToDatabase(int) call
- Returns:
- the image matcher found in the database or null if not present
- Throws:
SQLException - if an SQL exception occurs
-
initialize
protected void initialize(Connection conn) throws SQLException
Create the default tables used if they do not yet exist.
- Parameters:
conn - The database connection
- Throws:
SQLException - if an SQL error occurs
-
addHashingAlgorithm
public void addHashingAlgorithm(HashingAlgorithm algo, double threshold)
Append a new hashing algorithm which will be executed after all previously added hashing algorithms have passed the test.

The same algorithm may only be added once. Attempts to add an identical algorithm will instead update the settings of the old instance.

This method assumes the normalized Hamming distance. If the absolute (non-normalized) distance shall be used, take a look at TypedImageMatcher.addHashingAlgorithm(HashingAlgorithm, double, boolean).

Throws a wrapped SQL exception as RuntimeException if an SQL error occurs during table creation.
- Overrides:
addHashingAlgorithm in class TypedImageMatcher
- Parameters:
algo - The algorithm to be added
threshold - maximum normalized Hamming distance between hashes in order to pass as an identical image
-
addHashingAlgorithm
public void addHashingAlgorithm(HashingAlgorithm algo, double threshold, boolean normalized)
Append a new hashing algorithm which will be executed after all previously added hashing algorithms have passed the test.

The same algorithm may only be added once to an image hasher. Attempts to add an identical algorithm will instead update the settings of the old instance.

Throws a wrapped SQL exception as RuntimeException if an SQL error occurs during table creation.
- Overrides:
addHashingAlgorithm in class TypedImageMatcher
- Parameters:
algo - The algorithm to be added
threshold - the maximum Hamming distance allowed in order to pass as an identical image
normalized - Whether the normalized or the default Hamming distance shall be used. The normalized Hamming distance lies in the range [0, 1], while the default Hamming distance depends on the length of the hash
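The difference between the two distance modes can be sketched in a few lines. The following is a minimal illustration rather than the library's implementation: `HammingDistanceSketch` and its methods are hypothetical names, and hashes are assumed to be non-negative `BigInteger` values with a known bit length.

```java
import java.math.BigInteger;

public class HammingDistanceSketch {

    /** Absolute Hamming distance: the number of differing bits. */
    public static int hammingDistance(BigInteger a, BigInteger b) {
        // XOR leaves a 1 wherever the two hashes disagree.
        return a.xor(b).bitCount();
    }

    /** Normalized Hamming distance in [0, 1]: differing bits divided by the hash length. */
    public static double normalizedHammingDistance(BigInteger a, BigInteger b, int bitLength) {
        if (bitLength <= 0) {
            throw new IllegalArgumentException("bitLength must be positive");
        }
        return hammingDistance(a, b) / (double) bitLength;
    }
}
```

Under these assumptions a normalized threshold keeps its meaning when the hash length changes, whereas an absolute threshold of, say, 8 bits is far stricter for a 64 bit hash than for a 32 bit hash.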
-
addImage
public void addImage(File imageFile) throws IOException, SQLException
Index the image. This enables the image matcher to find the image in future searches. The database image matcher does not store the image data itself but indexes the hash bound to the absolute path of the image.

The path of the file has to be unique in order for this operation to return deterministic results. Otherwise the image will only be added to the database for those hashing algorithms for which no entry exists yet.
This is useful when you want to add an additional hashing algorithm to the database image matcher, but it will leave the database in an inconsistent state if the unique id is used multiple times.
- Parameters:
imageFile - The image whose hash will be added to the matcher
- Throws:
IOException - if an error occurs reading the file
SQLException - if an SQL error occurs
-
addImage
public void addImage(String uniqueId, File imageFile) throws IOException, SQLException
Index the image. This enables the image matcher to find the image in future searches. The database image matcher does not store the image data itself but indexes the hash bound to the absolute path of the image.

The uniqueId has to be globally unique in order for this operation to return deterministic results. Otherwise the image will only be added to the database for those hashing algorithms for which no entry exists yet.
This is useful when you want to add an additional hashing algorithm to the database image matcher, but it will leave the database in an inconsistent state if the unique id is used multiple times.
- Parameters:
uniqueId - a unique identifier returned if querying for the image
imageFile - The image whose hash will be added to the matcher
- Throws:
IOException - if an error occurs reading the file
SQLException - if an SQL error occurs
- Since:
- 2.0.2
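The "only added where no entry exists yet" rule can be illustrated with an in-memory model. This is a sketch under stated assumptions, not the library's code: `InsertIfAbsentSketch` and its methods are hypothetical, each algorithm's table is modelled as a map from unique id to hash, and the hashes are supplied by the caller instead of being computed from an image.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class InsertIfAbsentSketch {

    // Hypothetical stand-in for the per-algorithm tables: table name -> (unique id -> hash).
    private final Map<String, Map<String, Long>> tables = new LinkedHashMap<>();

    public void addAlgorithm(String tableName) {
        tables.putIfAbsent(tableName, new HashMap<String, Long>());
    }

    /**
     * Stores one hash per algorithm table, skipping tables that already hold an
     * entry for uniqueId. Returns the number of entries actually written.
     */
    public int addImage(String uniqueId, Map<String, Long> hashPerTable) {
        int written = 0;
        for (Map.Entry<String, Map<String, Long>> table : tables.entrySet()) {
            Long hash = hashPerTable.get(table.getKey());
            // putIfAbsent returns null only when no entry existed before.
            if (hash != null && table.getValue().putIfAbsent(uniqueId, hash) == null) {
                written++;
            }
        }
        return written;
    }

    /** Returns the stored hash, or null if the table or the entry is missing. */
    public Long storedHash(String tableName, String uniqueId) {
        Map<String, Long> table = tables.get(tableName);
        return table == null ? null : table.get(uniqueId);
    }
}
```

Re-adding the same image under the same id after registering a further algorithm fills only the new table. Adding a different image under an already used id has the same effect, which leaves the tables holding hashes of different images for one id.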
-
addImages
public void addImages(File... images) throws IOException, SQLException
Index the images. This enables the image matcher to find the images in future searches. The database image matcher does not store the image data itself but indexes the hash bound to the absolute path of the image.

The paths of the files have to be unique in order for this operation to return deterministic results. Otherwise an image will only be added to the database for those hashing algorithms for which no entry exists yet.
This is useful when you want to add an additional hashing algorithm to the database image matcher, but it will leave the database in an inconsistent state if the unique id is used multiple times.
- Parameters:
images - The images whose hashes will be added to the matcher
- Throws:
IOException - if an error occurs reading the file
SQLException - if an SQL error occurs
-
addImages
public void addImages(String[] uniqueIds, File[] images) throws IOException, SQLException
Index the images. This enables the image matcher to find the images in future searches. The database image matcher does not store the image data itself but indexes the hash bound to the absolute path of the image.

The uniqueIds have to be globally unique in order for this operation to return deterministic results. Otherwise an image will only be added to the database for those hashing algorithms for which no entry exists yet.
This is useful when you want to add an additional hashing algorithm to the database image matcher, but it will leave the database in an inconsistent state if the unique id is used multiple times.
- Parameters:
uniqueIds - unique identifiers returned if querying for the images
images - The images whose hashes will be added to the matcher
- Throws:
IOException - if an error occurs reading the file
SQLException - if an SQL error occurs
IllegalArgumentException - if uniqueIds and images don't have the same length
- Since:
- 2.0.2
-
addImage
public void addImage(String uniqueId, BufferedImage image) throws SQLException
Index the image. This enables the image matcher to find the image in future searches. The database image matcher does not store the image data itself but indexes the hash bound to a user supplied string.

The uniqueId has to be globally unique in order for this operation to return deterministic results. Otherwise the image will only be added to the database for those hashing algorithms for which no entry exists yet.
This is useful when you want to add an additional hashing algorithm to the database image matcher, but it will leave the database in an inconsistent state if the unique id is used multiple times.
- Parameters:
uniqueId - a unique identifier returned if querying for the image
image - The image to hash
- Throws:
SQLException - if an SQL error occurs
-
addImages
public void addImages(String[] uniqueIds, BufferedImage[] images) throws SQLException
Index the images. This enables the image matcher to find the images in future searches. The database image matcher does not store the image data itself but indexes the hash bound to a user supplied string.

The uniqueIds have to be globally unique in order for this operation to return deterministic results. Otherwise an image will only be added to the database for those hashing algorithms for which no entry exists yet.
This is useful when you want to add an additional hashing algorithm to the database image matcher, but it will leave the database in an inconsistent state if the unique id is used multiple times.
- Parameters:
uniqueIds - unique identifiers returned if querying for the images
images - The images to hash
- Throws:
SQLException - if an SQL error occurs
IllegalArgumentException - if uniqueIds and images don't have the same length
- Since:
- 2.0.2
- See Also:
addImage(String, BufferedImage)
-
serializeToDatabase
public void serializeToDatabase(int id) throws SQLException
Serialize this image matcher to the database. The image matcher object can later be retrieved by calling getFromDatabase(Connection, int).
- Parameters:
id - The id this image matcher object will be associated with
- Throws:
SQLException - if an SQL error occurs
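Because the conn field is declared transient, a serialized matcher cannot carry its connection along, which is consistent with getFromDatabase(Connection, int) taking a connection argument. The sketch below shows this effect with standard Java serialization only. It is an assumption that the library serializes in a comparable way; `TransientRoundTripSketch` and `MatcherState` are hypothetical names, and a plain `Object` stands in for the JDBC connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientRoundTripSketch {

    /** Hypothetical stand-in for a matcher: serializable settings plus a transient connection. */
    public static final class MatcherState implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String settings;
        public transient Object connection; // stands in for the JDBC Connection

        public MatcherState(String settings, Object connection) {
            this.settings = settings;
            this.connection = connection;
        }
    }

    /** Serializes the state into a byte array; the transient field is skipped. */
    public static byte[] toBytes(MatcherState state) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores the state and attaches the connection supplied by the caller. */
    public static MatcherState fromBytes(byte[] bytes, Object freshConnection) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            MatcherState state = (MatcherState) in.readObject();
            state.connection = freshConnection; // transient fields come back as null
            return state;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The settings survive the round trip, while the connection has to be supplied again by whoever restores the object.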
-
removeHashingAlgo
public boolean removeHashingAlgo(HashingAlgorithm algo, boolean forceTableDeletion) throws SQLException
Removes the hashing algorithm from the image matcher.
- Parameters:
algo - The algorithm to remove
forceTableDeletion - if true, also delete all hashes in the database created by this particular algorithm; if false, keep the table and the stored hashes. Use this option with caution if two or more image matchers share the same database.
- Returns:
- true if the algorithm was removed, false if it wasn't present
- Throws:
SQLException - if the connection to the database failed. An SQL exception can only be thrown if forceTableDeletion is set to true. Even if an exception is thrown, the algorithm will be removed from this particular image matcher object.
-
clearHashingAlgorithms
public void clearHashingAlgorithms(boolean forceTableDeletion) throws SQLException
Removes all hashing algorithms from the image matcher.
- Parameters:
forceTableDeletion - if true, also delete all hashes in the database created by the removed algorithms; if false, keep the tables and the stored hashes. Use this option with caution if two or more image matchers share the same database.
- Throws:
SQLException - if the connection to the database failed. An SQL exception can only be thrown if forceTableDeletion is set to true. Even if an exception is thrown, the algorithms will be removed from this particular image matcher object.
-
getAllMatchingImages
public Map<String,PriorityQueue<Result<String>>> getAllMatchingImages() throws SQLException
Return all images stored in the database which are considered matches to other images in the database.

Be aware that, depending on the number of images in the database, this operation can be very expensive.
- Returns:
- A Map containing a queue which points to matched images.
Key: unique id of image U1
Value: images considered matches to U1
The matched images are unique ids/file paths sorted by the Hamming distance of the last applied algorithm.
- Throws:
SQLException - if an SQL error occurs
- Since:
- 2.0.2
-
getMatchingImagesWithinDistance
public PriorityQueue<Result<String>> getMatchingImagesWithinDistance(BufferedImage image, double[] normalizedDistance) throws SQLException
Search for all similar images passing the algorithm filters supplied to this matcher. If the image itself was added to the matcher, it will be returned with a distance of 0.

This method effectively circumvents the algorithm settings and should be used sparingly, only when you know what you are doing. Usually you may want to use a method that honors the algorithm settings instead.
- Parameters:
image - The image to search matches for
normalizedDistance - the distance used for the algorithms
- Returns:
- all unique ids/file paths sorted by the Hamming distance of the last applied algorithm
- Throws:
SQLException - if an SQL error occurs
- Since:
- 2.0.2
-
getMatchingImages
public PriorityQueue<Result<String>> getMatchingImages(File imageFile) throws SQLException, IOException
Search for all similar images passing the algorithm filters supplied to this matcher. If the image itself was added to the matcher, it will be returned with a distance of 0.
- Parameters:
imageFile - The image other images will be matched against
- Returns:
- all unique ids/file paths sorted by the Hamming distance of the last applied algorithm
- Throws:
SQLException - if an SQL error occurs
IOException - if an error occurs when reading the file
-
getMatchingImages
public PriorityQueue<Result<String>> getMatchingImages(BufferedImage image) throws SQLException
Search for all similar images passing the algorithm filters supplied to this matcher. If the image itself was added to the matcher, it will be returned with a distance of 0.
- Parameters:
image - The image other images will be matched against
- Returns:
- all unique ids/file paths sorted by the Hamming distance of the last applied algorithm
- Throws:
SQLException - if an SQL error occurs
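When consuming a result queue it helps to remember that java.util.PriorityQueue only guarantees order when elements are removed with poll(); its iterator may visit elements in any order. The sketch below shows the pattern with a hypothetical `Match` class standing in for Result<String>, assuming the queue is ordered by ascending distance.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ResultQueueSketch {

    /** Hypothetical stand-in for Result<String>: a unique id plus its distance. */
    public static final class Match {
        public final String id;
        public final double distance;

        public Match(String id, double distance) {
            this.id = id;
            this.distance = distance;
        }
    }

    /** Creates a queue whose head is always the match with the smallest distance. */
    public static PriorityQueue<Match> newQueue() {
        return new PriorityQueue<>(Comparator.comparingDouble((Match m) -> m.distance));
    }

    /** Removes all matches from the queue and returns their ids, closest first. */
    public static List<String> drainClosestFirst(PriorityQueue<Match> queue) {
        List<String> ids = new ArrayList<>();
        while (!queue.isEmpty()) {
            ids.add(queue.poll().id);
        }
        return ids;
    }
}
```

Draining empties the queue, so copy it first if the results are needed more than once.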
-
getSimilarImages
protected List<Result<String>> getSimilarImages(Hash targetHash, int maxDistance, HashingAlgorithm hasher) throws SQLException
Return all url descriptors which describe images within the provided Hamming distance of the supplied hash.
- Parameters:
targetHash - The hash to check the database against
maxDistance - The maximum distance the hashes may have
hasher - the hashing algorithm used to identify the table
- Returns:
- all urls within distance maxDistance of the supplied hash
- Throws:
SQLException - if an SQL error occurs
-
addImage
protected void addImage(HashingAlgorithm hashAlgo, String url, BufferedImage image) throws SQLException
- Throws:
SQLException
-
createHashTable
protected void createHashTable(HashingAlgorithm hasher) throws SQLException
Create a table to hold image hashes for a particular image hashing algorithm.
- Parameters:
hasher - the hashing algorithm
- Throws:
SQLException - if an SQL error occurs
-
doesTableExist
protected boolean doesTableExist(String tableName) throws SQLException
Query if the database contains a table with the given name.
- Parameters:
tableName - The table name to check for
- Returns:
- true if a table with the name exists, false otherwise
- Throws:
SQLException - if an SQL error occurs
-
resolveTableName
protected String resolveTableName(HashingAlgorithm hashAlgo)
Map a hashing algorithm to a table name.
- Parameters:
hashAlgo - The hashing algorithm
- Returns:
- the name of the table used to save hashes produced by this algorithm
-
reconstructHashFromDatabase
protected Hash reconstructHashFromDatabase(HashingAlgorithm hasher, byte[] bytes)
Reconstruct a hash value from the database.
- Parameters:
hasher - The hashing algorithm used to create the hash
bytes - the byte array stored in the database
- Returns:
- a hash value which is equal (according to equals) to the hash object saved in the database
- Since:
- 2.0.2
-
doesEntryExist
public boolean doesEntryExist(String uniqueId, HashingAlgorithm hashAlgo) throws SQLException
Check if an entry with the given uniqueId already exists.
- Parameters:
uniqueId - the unique id to check against
hashAlgo - the hashing algorithm
- Returns:
- true if the entry exists, false otherwise
- Throws:
SQLException - if an SQL error occurs
- Since:
- 2.1.0
-
close
public void close() throws SQLException
- Specified by:
close in interface AutoCloseable
- Throws:
SQLException
-
-