org.icepdf.ri.common.search
Class DocumentSearchControllerImpl

java.lang.Object
  extended by org.icepdf.ri.common.search.DocumentSearchControllerImpl
All Implemented Interfaces:
org.icepdf.core.search.DocumentSearchController

public class DocumentSearchControllerImpl
extends java.lang.Object
implements org.icepdf.core.search.DocumentSearchController

Document search controller used to manage document searches. This class takes care of many of the performance issues of searching large documents and is also used by PageViewComponentImpl to highlight search results.

This implementation uses a simple search algorithm that will work well for most users. This class can be extended and the method searchHighlightPage(int) can be overridden for custom search implementations.

The DocumentSearchControllerImpl can be constructed for use with the Viewer RI source code via the constructor that takes a SwingController as a parameter. The second variation is intended for a headless environment where Swing is not needed; the constructor for that case takes a Document as a parameter.

Since:
4.0

Field Summary
protected  org.icepdf.core.pobjects.Document document
           
protected  DocumentSearchModelImpl searchModel
           
protected  SwingController viewerController
           
 
Constructor Summary
DocumentSearchControllerImpl(org.icepdf.core.pobjects.Document document)
          Create a new instance of the search controller intended to be used in a headless environment.
DocumentSearchControllerImpl(SwingController viewerController)
          Create a new instance of the search controller.
 
Method Summary
 org.icepdf.core.search.SearchTerm addSearchTerm(java.lang.String term, boolean caseSensitive, boolean wholeWord)
          Add the search term to the list of search terms.
 void clearAllSearchHighlight()
          Clears all highlighted text states for this document.
 void clearSearchHighlight(int pageIndex)
          Clear all searched items for specified page.
 void dispose()
          Disposes controller clearing resources.
protected  org.icepdf.core.pobjects.graphics.text.PageText getPageText(int pageIndex)
          Gets the page text for the given page index.
 boolean isSearchHighlightRefreshNeeded(int pageIndex, org.icepdf.core.pobjects.graphics.text.PageText pageText)
          Test to see if a search highlight is needed.
 void removeSearchTerm(org.icepdf.core.search.SearchTerm searchTerm)
          Removes the specified search term from the search.
 int searchHighlightPage(int pageIndex)
          Searches the page index given the search terms that have been added with addSearchTerm(String, boolean, boolean).
 java.util.ArrayList<org.icepdf.core.pobjects.graphics.text.LineText> searchHighlightPage(int pageIndex, int wordPadding)
          Searches the page index given the search terms that have been added with addSearchTerm(String, boolean, boolean).
 int searchHighlightPage(int pageIndex, java.lang.String term, boolean caseSensitive, boolean wholeWord)
          Searches the given page using the specified term and properties.
 java.util.ArrayList<org.icepdf.core.pobjects.graphics.text.WordText> searchPage(int pageIndex)
          Search page but only return words that are hits.
protected  java.util.ArrayList<java.lang.String> searchPhraseParser(java.lang.String phrase)
          Utility for breaking the pattern up into searchable words.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

searchModel

protected DocumentSearchModelImpl searchModel

viewerController

protected SwingController viewerController

document

protected org.icepdf.core.pobjects.Document document
Constructor Detail

DocumentSearchControllerImpl

public DocumentSearchControllerImpl(SwingController viewerController)
Create a new instance of the search controller. A search model is created for this instance.

Parameters:
viewerController - parent controller/mediator.

DocumentSearchControllerImpl

public DocumentSearchControllerImpl(org.icepdf.core.pobjects.Document document)
Create a new instance of the search controller intended to be used in a headless environment. A search model is created for this instance.

Parameters:
document - document to search.
Method Detail

searchHighlightPage

public int searchHighlightPage(int pageIndex,
                               java.lang.String term,
                               boolean caseSensitive,
                               boolean wholeWord)
Searches the given page using the specified term and properties. The search model is updated to store the page's PageText as a weak reference, which can be queried using isSearchHighlightRefreshNeeded to efficiently make sure that a page's text is highlighted even after a dispose/init cycle. If the text state is no longer present then the search should be executed again.

This method clears the search results for the page before it searches. If you wish to have cumulative search results then search terms should be added with addSearchTerm(String, boolean, boolean) and the method searchPage(int) should be called after each term is added or after all have been added.

Specified by:
searchHighlightPage in interface org.icepdf.core.search.DocumentSearchController
Parameters:
pageIndex - page to search
caseSensitive - if true use case sensitive searches
wholeWord - if true use whole word searches
term - term to search for
Returns:
number of hits for this page.
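
To illustrate how the caseSensitive and wholeWord flags might affect a hit count, here is a minimal sketch in plain Java. It is not the ICEpdf implementation: the class TermMatchSketch and its methods are hypothetical, and treating a non-whole-word search as a substring match is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the ICEpdf implementation.
class TermMatchSketch {

    // Decides whether one word from a page matches one search word.
    static boolean matches(String pageWord, String term,
                           boolean caseSensitive, boolean wholeWord) {
        // Fold case on both sides when the search is case-insensitive.
        String word = caseSensitive ? pageWord : pageWord.toLowerCase(Locale.ROOT);
        String needle = caseSensitive ? term : term.toLowerCase(Locale.ROOT);
        // Assumption: a non-whole-word search is a plain substring match.
        return wholeWord ? word.equals(needle) : word.contains(needle);
    }

    // Counts matching words, mirroring the "number of hits" return value.
    static int countHits(List<String> pageWords, String term,
                         boolean caseSensitive, boolean wholeWord) {
        int hits = 0;
        for (String pageWord : pageWords) {
            if (matches(pageWord, term, caseSensitive, wholeWord)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Search", "searching", "research");
        System.out.println(countHits(words, "search", false, true));
    }
}
```

With both flags off the sketch counts every word containing the term; turning wholeWord on narrows the count to exact word matches.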

searchHighlightPage

public int searchHighlightPage(int pageIndex)
Searches the page index given the search terms that have been added with addSearchTerm(String, boolean, boolean). If search hits were detected then the Page's PageText is added to the cache.

This method represents the core search algorithm for this DocumentSearchController implementation. This method can be overridden if a different search algorithm or functionality is needed.

Specified by:
searchHighlightPage in interface org.icepdf.core.search.DocumentSearchController
Parameters:
pageIndex - page index to search
Returns:
number of hits found for this page.

searchHighlightPage

public java.util.ArrayList<org.icepdf.core.pobjects.graphics.text.LineText> searchHighlightPage(int pageIndex,
                                                                                                int wordPadding)
Searches the page index given the search terms that have been added with addSearchTerm(String, boolean, boolean). If search hits were detected then the Page's PageText is added to the cache.

This method differs from searchHighlightPage(int) in that it returns a list of LineText fragments, one for each hit, where each LineText is padded by the words that precede and follow the hit in the page context.

This method represents the core search algorithm for this DocumentSearchController implementation. This method can be overridden if a different search algorithm or functionality is needed.

Specified by:
searchHighlightPage in interface org.icepdf.core.search.DocumentSearchController
Parameters:
pageIndex - page index to search
wordPadding - word padding on either side of hit to give context to found words in the returned LineText
Returns:
list of contextual hits for the given page. If there are no hits, an empty list is returned.
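
The effect of wordPadding can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's code: ContextPaddingSketch and hitsWithContext are hypothetical names, a page is modelled as a simple list of words, and plain strings stand in for LineText.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; strings stand in for LineText fragments.
class ContextPaddingSketch {

    // For each word equal to term, returns the hit joined with up to
    // wordPadding words before it and up to wordPadding words after it.
    static List<String> hitsWithContext(List<String> words, String term, int wordPadding) {
        List<String> fragments = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (!words.get(i).equals(term)) {
                continue;
            }
            // Clamp the window so it never runs past either end of the page.
            int start = Math.max(0, i - wordPadding);
            int end = Math.min(words.size(), i + wordPadding + 1);
            fragments.add(String.join(" ", words.subList(start, end)));
        }
        return fragments; // empty when there are no hits
    }

    public static void main(String[] args) {
        List<String> page = Arrays.asList("the", "quick", "brown", "fox", "jumps");
        System.out.println(hitsWithContext(page, "brown", 1));
    }
}
```

Clamping the window at the page boundaries means a hit near the start or end of a page simply gets less context on that side.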

searchPage

public java.util.ArrayList<org.icepdf.core.pobjects.graphics.text.WordText> searchPage(int pageIndex)
Search page but only return words that are hits. Highlighting is still applied but this method can be used if other data needs to be extracted from the found words.

Specified by:
searchPage in interface org.icepdf.core.search.DocumentSearchController
Parameters:
pageIndex - page to search
Returns:
list of words that match the term and search properties.

addSearchTerm

public org.icepdf.core.search.SearchTerm addSearchTerm(java.lang.String term,
                                                       boolean caseSensitive,
                                                       boolean wholeWord)
Add the search term to the list of search terms. The term is split into words based on white space and punctuation. No checks are done for duplication.

A new search needs to be executed for this change to take effect.

Specified by:
addSearchTerm in interface org.icepdf.core.search.DocumentSearchController
Parameters:
term - single word or phrase to search for.
caseSensitive - is search case sensitive.
wholeWord - is search whole word sensitive.
Returns:
the newly created search term.

removeSearchTerm

public void removeSearchTerm(org.icepdf.core.search.SearchTerm searchTerm)
Removes the specified search term from the search. A new search needs to be executed for this change to take effect.

Specified by:
removeSearchTerm in interface org.icepdf.core.search.DocumentSearchController
Parameters:
searchTerm - search term to remove.

clearSearchHighlight

public void clearSearchHighlight(int pageIndex)
Clear all searched items for specified page.

Specified by:
clearSearchHighlight in interface org.icepdf.core.search.DocumentSearchController
Parameters:
pageIndex - page index to clear

clearAllSearchHighlight

public void clearAllSearchHighlight()
Clears all highlighted text states for this document. This is optimized to use the SearchHighlightModel to only clear pages that still have selected states.

Specified by:
clearAllSearchHighlight in interface org.icepdf.core.search.DocumentSearchController
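
The optimization described above, clearing only pages that still hold highlight state, can be sketched in plain Java. This is a hypothetical illustration rather than the library's model class: HighlightTrackerSketch and its methods are invented names, and a per-page hit count stands in for the real highlight state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the ICEpdf search model.
class HighlightTrackerSketch {

    // Page index mapped to the number of highlighted words on that page.
    private final Map<Integer, Integer> highlightedPages = new HashMap<>();

    // Remembers a page only when it actually received highlights.
    void recordHighlights(int pageIndex, int hitCount) {
        if (hitCount > 0) {
            highlightedPages.put(pageIndex, hitCount);
        }
    }

    // Clears every tracked page and returns how many pages were visited,
    // which can be far fewer than the number of pages in the document.
    int clearAll() {
        int visited = highlightedPages.size();
        highlightedPages.clear();
        return visited;
    }

    public static void main(String[] args) {
        HighlightTrackerSketch tracker = new HighlightTrackerSketch();
        tracker.recordHighlights(3, 2);
        tracker.recordHighlights(9, 0);
        System.out.println(tracker.clearAll());
    }
}
```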

isSearchHighlightRefreshNeeded

public boolean isSearchHighlightRefreshNeeded(int pageIndex,
                                              org.icepdf.core.pobjects.graphics.text.PageText pageText)
Test to see if a search highlight refresh is needed. This is done by first checking if there is a hit for this page and if the PageText object is the same as the one specified as a parameter. If they are not the same PageText object then a refresh is needed, as the page was disposed and reinitialized with new content.

Specified by:
isSearchHighlightRefreshNeeded in interface org.icepdf.core.search.DocumentSearchController
Parameters:
pageIndex - page index to test for results.
pageText - current pageText object associated with the pageIndex.
Returns:
true if refresh is needed, false otherwise.
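
The identity check described above can be sketched with java.lang.ref.WeakReference. This is a minimal illustration under stated assumptions, not the library's implementation: RefreshCheckSketch and its methods are hypothetical, and a plain Object stands in for PageText.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; Object plays the role of PageText.
class RefreshCheckSketch {

    // Page index mapped to the text object that the hits were applied to.
    private final Map<Integer, WeakReference<Object>> searchedPages = new HashMap<>();

    // Records that a search on pageIndex produced hits on this text object.
    void recordHit(int pageIndex, Object pageText) {
        searchedPages.put(pageIndex, new WeakReference<>(pageText));
    }

    // True when the page had hits, but the text object they were applied to
    // is gone or is not the instance the page currently holds.
    boolean isRefreshNeeded(int pageIndex, Object currentPageText) {
        WeakReference<Object> ref = searchedPages.get(pageIndex);
        if (ref == null) {
            return false; // no hits recorded, so nothing to re-highlight
        }
        Object cached = ref.get(); // null once the old text was collected
        return cached == null || cached != currentPageText;
    }

    public static void main(String[] args) {
        RefreshCheckSketch model = new RefreshCheckSketch();
        Object original = new Object();
        model.recordHit(0, original);
        System.out.println(model.isRefreshNeeded(0, original));
        System.out.println(model.isRefreshNeeded(0, new Object()));
    }
}
```

A weak reference lets the model remember which text object was highlighted without preventing a disposed page's text from being garbage collected.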

dispose

public void dispose()
Disposes controller clearing resources.

Specified by:
dispose in interface org.icepdf.core.search.DocumentSearchController

getPageText

protected org.icepdf.core.pobjects.graphics.text.PageText getPageText(int pageIndex)
Gets the page text for the given page index.

Parameters:
pageIndex - page index of page to extract text.
Returns:
page's page text, can be null.

searchPhraseParser

protected java.util.ArrayList<java.lang.String> searchPhraseParser(java.lang.String phrase)
Utility for breaking the pattern up into searchable words. Breaks are done on white spaces and punctuation.

Parameters:
phrase - pattern to break up into searchable words.
Returns:
list of tokens that make up the phrase: words, spaces and punctuation.
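
A tokenizer of this kind can be sketched in plain Java as follows. This is not the ICEpdf implementation: PhraseSplitSketch and split are hypothetical names, and treating every character that is not a letter or digit as a break is an assumption about what counts as white space and punctuation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the ICEpdf implementation.
class PhraseSplitSketch {

    // Splits a phrase into tokens: each run of letters or digits becomes one
    // token, and each break character becomes a token of its own.
    static List<String> split(String phrase) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < phrase.length(); i++) {
            char c = phrase.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c);
            } else {
                // Assumption: any other character ends the current word.
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                tokens.add(String.valueOf(c));
            }
        }
        // Flush the trailing word, if the phrase did not end on a break.
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : split("Hello, world")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Keeping spaces and punctuation as their own tokens lets a phrase search compare the token sequence against the page text one element at a time.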