Class ModelUtils

java.lang.Object
org.biopax.paxtools.controller.ModelUtils

public final class ModelUtils extends Object
Several useful algorithms and examples, e.g., to extract root or child BioPAX L3 elements, remove dangling elements, replace elements or URIs, fix or infer property values, etc. NOTE: although this is a public class with public methods, it can be (and already has been) modified, sometimes considerably, in any minor revision; it was not designed to be part of Paxtools' public API. Users are therefore encouraged to copy the methods they need into their own applications rather than depend on this unstable utility class in the long term.
Author:
rodche, Arman, Emek
  • Method Details

    • replace

      public static void replace(Model model, Map<? extends BioPAXElement,? extends BioPAXElement> subs)
      Replaces BioPAX elements in the model with the ones from the map and updates the corresponding BioPAX object references. It neither removes the old elements from the model nor adds the new ones (if required, one can do this before or after calling this method, e.g., using the same 'subs' map). It visits all object properties of each "explicit" element in the model, but does not traverse deeper into sub-properties to replace something there as well (e.g., nested member entity references are not replaced unless the parent entity reference is present in the model). It does not automatically move or migrate the children of an old (replaced) object to the new one: the replacement objects are supposed to have their own properties already set, or to be set shortly; otherwise, consider using something like fixDanglingInverseProperties(BioPAXElement, Model) afterwards.
      Parameters:
      model - biopax model where the objects are to be replaced
      subs - the replacements map (many-to-one, old-to-new)
      Throws:
      IllegalBioPAXArgumentException - if there is an incompatible type replacement object
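      The "update references, but neither add nor remove elements" behaviour described above can be illustrated on a plain graph of string IDs. This is a minimal sketch and not the Paxtools implementation: the class ReplaceSketch, the method replaceRefs and the map-based model are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a model: element ID -> IDs it refers to.
final class ReplaceSketch {

    // Rewrites references according to the old-to-new map (many-to-one is fine).
    // The key set (the "elements in the model") is deliberately left untouched.
    static void replaceRefs(Map<String, Set<String>> refsOf, Map<String, String> subs) {
        for (Map.Entry<String, Set<String>> e : refsOf.entrySet()) {
            Set<String> updated = new LinkedHashSet<>();
            for (String ref : e.getValue()) {
                updated.add(subs.getOrDefault(ref, ref));
            }
            e.setValue(updated); // not a structural change, safe while iterating
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> model = new HashMap<>();
        model.put("reaction", new LinkedHashSet<>(Set.of("p1", "p2")));
        model.put("p1", new LinkedHashSet<>());
        model.put("p2", new LinkedHashSet<>());

        Map<String, String> subs = new HashMap<>();
        subs.put("p1", "p");
        subs.put("p2", "p");

        replaceRefs(model, subs);
        System.out.println(model.get("reaction")); // the only reference left is "p"
        System.out.println(model.containsKey("p")); // false: "p" was not added
    }
}
```

      In this sketch the caller would still have to add "p" to the model and remove "p1" and "p2" itself, which mirrors the note about doing so before or after the call.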
    • getRootElements

      public static <T extends BioPAXElement> Set<T> getRootElements(Model model, Class<T> filterClass)
      Finds "root" BioPAX objects that belong to a particular class (incl. sub-classes) in the model. Note, however, that such "root" elements may still be property values of other elements that are not included in the model.
      Type Parameters:
      T - biopax type
      Parameters:
      model - biopax model to work with
      filterClass - filter class (including subclasses)
      Returns:
      set of the root biopax objects of given type
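      One plausible reading of "root" is "not a child of any other element in the same model". The sketch below shows that idea on a plain parent-to-children map; RootElementsSketch and findRoots are hypothetical names, and the real method works on BioPAX objects and additionally filters by class.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a model: element ID -> IDs of its direct children.
final class RootElementsSketch {

    // An element is a root if no element of the same model lists it as a child.
    static Set<String> findRoots(Map<String, Set<String>> childrenOf) {
        Set<String> roots = new TreeSet<>(childrenOf.keySet());
        for (Set<String> children : childrenOf.values()) {
            roots.removeAll(children);
        }
        return roots;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> model = new HashMap<>();
        model.put("pathway", new HashSet<>(Set.of("reaction")));
        model.put("reaction", new HashSet<>(Set.of("protein")));
        model.put("protein", new HashSet<>());
        model.put("xref", new HashSet<>()); // nothing in this model refers to it
        System.out.println(findRoots(model));
    }
}
```

      Because only parents inside the map are considered, an element reported as a root here could still be referenced from outside, which is the caveat stated in the note above.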
    • removeObjectsIfDangling

      public static <T extends BioPAXElement> Set<BioPAXElement> removeObjectsIfDangling(Model model, Class<T> clazz)
      Iteratively removes "dangling" elements of the given type and its sub-types, e.g. Xref.class objects, from the BioPAX model. If the model does not contain any root Entity class objects, and the second parameter is the basic UtilityClass.class (i.e., not a sub-class of it), then the method simply logs a warning and returns (otherwise, it would remove everything from the model). Do not use the basic Entity.class either (a sub-class is OK), for the same reason: it would delete everything. This method, however, does not change relationships among objects; in particular, some inverse properties, such as entityReferenceOf or xrefOf, may still refer to a removed object.
      Type Parameters:
      T - biopax type
      Parameters:
      model - to modify
      clazz - filter-class (filter by this type and sub-classes)
      Returns:
      removed objects
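      Why the removal has to be iterative can be seen on a small graph: deleting one dangling element may leave its children dangling in turn. The following is a rough sketch under stated assumptions, not the library code; DanglingSketch, removeDangling and the 'removable' predicate (standing in for the class filter) are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical stand-in for a model: element ID -> IDs of its direct children.
final class DanglingSketch {

    // Repeats "find unreferenced removable elements, drop them" until nothing changes.
    static Set<String> removeDangling(Map<String, Set<String>> childrenOf,
                                      Predicate<String> removable) {
        Set<String> removed = new TreeSet<>();
        boolean changed = true;
        while (changed) {
            Set<String> referenced = new HashSet<>();
            for (Set<String> children : childrenOf.values()) {
                referenced.addAll(children);
            }
            Set<String> dangling = new TreeSet<>();
            for (String id : childrenOf.keySet()) {
                if (!referenced.contains(id) && removable.test(id)) {
                    dangling.add(id);
                }
            }
            // Only the elements are dropped; child sets of survivors are not edited.
            childrenOf.keySet().removeAll(dangling);
            removed.addAll(dangling);
            changed = !dangling.isEmpty();
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> model = new HashMap<>();
        model.put("protein", new HashSet<>(Set.of("ref")));
        model.put("ref", new HashSet<>(Set.of("xref1")));
        model.put("xref1", new HashSet<>());
        model.put("xref2", new HashSet<>());
        model.put("orphanRef", new HashSet<>(Set.of("xref3")));
        model.put("xref3", new HashSet<>());
        // Treat everything except the entity "protein" as removable.
        System.out.println(removeDangling(model, id -> !id.equals("protein")));
    }
}
```

      In the example, "xref3" only becomes dangling in the second pass, after "orphanRef" has been dropped.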
    • writeRead

      public static Model writeRead(Model model)
      Cuts the BioPAX model off from other models and BioPAX objects by essentially performing a write to and read from OWL. The resulting model contains new objects with the same IDs and with "fixed" object properties, i.e., dangling values become null/empty, and inverse properties (e.g. xrefOf) are re-calculated. The original model is unchanged. Note: this method will fail for very large models (if the resulting RDF/XML UTF-8 string is longer than approx. 1Gb).
      Parameters:
      model - biopax model to process
      Returns:
      copy of the model
    • getDirectChildren

      public static Model getDirectChildren(BioPAXElement bpe)
      Gets direct children of a given BioPAX element and adds them to a new model.
      Parameters:
      bpe - biopax element/object
      Returns:
      new model
    • getAllChildren

      public static Model getAllChildren(BioPAXElement bpe, Filter<PropertyEditor>... filters)
      Deprecated.
      use Fetcher.fetch(BioPAXElement, Model) instead (with Fetcher.nextStepFilter or without)
      Gets all the child BioPAX elements of a given BioPAX element (using the "tuned" Fetcher) and adds them to a new model.
      Parameters:
      bpe - biopax object
      filters - property filters (e.g., for Fetcher to skip some properties). Default is to skip 'nextStep'.
      Returns:
      new biopax Model that contains all the child objects
    • getDirectChildrenAsSet

      public static Set<BioPAXElement> getDirectChildrenAsSet(BioPAXElement bpe)
      Collects direct children of a given BioPAX element.
      Parameters:
      bpe - biopax object (parent)
      Returns:
      set of child biopax objects
    • generateClassMetrics

      public static Map<Class<? extends BioPAXElement>,Integer> generateClassMetrics(Model model)
      Generates simple counts of different elements in the model.
      Parameters:
      model - biopax model to analyze
      Returns:
      a map from biopax types to the counts of objects of each type
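      The shape of the result (type to count) can be sketched with plain Java objects. This is only an illustration: ClassMetricsSketch and countByClass are hypothetical names, and whether the real method keys the map by the BioPAX model interface or by the implementation class is not stated here, so the sketch simply uses getClass().

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

final class ClassMetricsSketch {

    // Counts how many objects of each runtime class the collection holds.
    static Map<Class<?>, Integer> countByClass(Collection<?> objects) {
        Map<Class<?>, Integer> counts = new HashMap<>();
        for (Object o : objects) {
            counts.merge(o.getClass(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Class<?>, Integer> counts =
                countByClass(Arrays.<Object>asList("a", "b", 1));
        System.out.println(counts.get(String.class));  // 2
        System.out.println(counts.get(Integer.class)); // 1
    }
}
```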
    • getObject

      public static <T extends BioPAXElement> T getObject(Model model, String uri, Class<T> clazz)
      A more strict, type-safe way to ask for a biopax object from the model, unlike Model.getByID(String).
      Type Parameters:
      T - biopax type
      Parameters:
      model - biopax model to query
      uri - absolute URI of a biopax element
      clazz - class-filter (to filter by the biopax type and its sub-types)
      Returns:
      the biopax object or null (if no such element, or element with this URI is of incompatible type)
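      The "null if missing or of an incompatible type" contract can be expressed with Class.isInstance and Class.cast. The sketch below uses a plain map as a hypothetical stand-in for the model; TypedLookupSketch is not part of Paxtools.

```java
import java.util.HashMap;
import java.util.Map;

final class TypedLookupSketch {

    // Returns the object stored under the URI only if it is an instance of clazz
    // (or of a sub-type); returns null if it is absent or of another type.
    static <T> T getObject(Map<String, Object> byUri, String uri, Class<T> clazz) {
        Object o = byUri.get(uri);
        return clazz.isInstance(o) ? clazz.cast(o) : null; // isInstance(null) is false
    }

    public static void main(String[] args) {
        Map<String, Object> byUri = new HashMap<>();
        byUri.put("urn:a", "text");
        byUri.put("urn:b", 7);
        System.out.println(getObject(byUri, "urn:a", String.class));  // text
        System.out.println(getObject(byUri, "urn:a", Integer.class)); // null
        System.out.println(getObject(byUri, "urn:b", Number.class));  // 7
    }
}
```

      Compared with an unchecked cast of an untyped lookup result, this avoids a ClassCastException at the call site when the URI resolves to an object of an unexpected type.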
    • md5hex

      public static String md5hex(String id)
      Calculates the MD5 hash code (as a 32-character hex string). This method is not BioPAX specific. It can be used for many purposes, such as generating new unique URIs, database primary keys, etc.
      Parameters:
      id - some identifier, e.g., URI
      Returns:
      the 32-character hex digest string
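      For users who follow the advice to copy methods rather than depend on this class, an equivalent helper can be written with the JDK alone. This is a sketch, not the Paxtools source: Md5HexSketch is a hypothetical name, and the UTF-8 encoding of the input is an assumption. An MD5 digest is 16 bytes long, hence 32 hexadecimal characters.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5HexSketch {

    // Hashes the identifier and renders the 16 digest bytes as 32 hex characters.
    static String md5hex(String id) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(id.getBytes(StandardCharsets.UTF_8)); // assumed charset
            // Left-pad with zeros so that leading zero bytes are not lost.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5hex("")); // d41d8cd98f00b204e9800998ecf8427e
    }
}
```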
    • fixDanglingObjectProperties

      public static void fixDanglingObjectProperties(BioPAXElement bpe, Model model)
      Unlinks object properties of the BioPAX object from values the model does not have.
      Parameters:
      bpe - a biopax object
      model - the model to look for objects in
    • fixDanglingInverseProperties

      public static void fixDanglingInverseProperties(BioPAXElement bpe, Model model)
      Unlinks inverse properties of the BioPAX object from values the model does not have.
      Parameters:
      bpe - BioPAX object
      model - where to look for other objects
    • getFeatureIntersection

      public static Set<EntityFeature> getFeatureIntersection(PhysicalEntity first, org.biopax.paxtools.controller.ModelUtils.FeatureType firstClass, PhysicalEntity second, org.biopax.paxtools.controller.ModelUtils.FeatureType secondClass)
    • getFeatureSetByType

      public static Set<EntityFeature> getFeatureSetByType(PhysicalEntity pe, org.biopax.paxtools.controller.ModelUtils.FeatureType type)
    • checkERFeatureSet

      public static boolean checkERFeatureSet(EntityReference er, boolean fix)
      Finds and adds all (missing) entity features to the given entity reference from all its owner simple physical entities ('feature' and 'notFeature' properties). However, it neither checks for nor resolves any violations of the 'entityFeature' property's inverse functional constraint (i.e., an EntityFeature instance can belong to one and only one EntityReference object).
      Parameters:
      er - entity reference object
      fix - flag
      Returns:
      true or false
    • findFeaturesAddedToSecond

      public static Set<EntityFeature> findFeaturesAddedToSecond(PhysicalEntity first, PhysicalEntity second, boolean fix)
    • fixControlled

      public static void fixControlled(Model model, Control control)
      In Paxtools v6, the 'controlled' property does not accept multiple values (due to the OWL functional property restriction, which had previously been overlooked); this method makes sure every Control has at most one controlled process.
      Parameters:
      model - biopax model
      control - to be cloned to set one controlled per control
    • normalizeGeneric

      public static void normalizeGeneric(Model model, PhysicalEntity generic)
      In all interactions and complexes, replace generic physical entities (having members) with their corresponding members; clone the parent object, if needed, for each member.
      Parameters:
      model - biopax model
      generic - physical entity (PE) that has member PEs
    • normalizeGenerics

      public static void normalizeGenerics(Model model)
      Converts each generic simple physical entity (i.e., not a Complex) that has the memberPhysicalEntity property set into an equivalent physical entity with a generic entity reference (one that has memberEntityReference values). Complexes cannot be normalized in the same way, for they do not have the entityReference property and might also contain generic components. In general, avoid using 'memberPhysicalEntity' (made exclusively for Reactome) in BioPAX models, for there is a better alternative: entityReference/memberEntityReference.
      Parameters:
      model - biopax model to fix
    • addMissingEntityReference

      public static void addMissingEntityReference(Model model, SimplePhysicalEntity pe)
      For a non-generic simple physical entity (memberPhysicalEntity property is empty) that does not have entityReference property defined, this method generates and adds a new entity reference of proper type to both this entity and the model, and also copies names and xrefs from the source physical entity to the generated entity reference (UnificationXrefs are converted to RelationshipXref and then also deleted from the original entity.)
      Parameters:
      model - the BioPAX model
      pe - a simple physical entity (that has neither entityReference nor memberPEs set)
    • replaceEquivalentFeatures

      public static void replaceEquivalentFeatures(Model model)
      This method iterates over the features in a model and tries to find equivalent objects and merges them.
      Parameters:
      model - to be fixed
    • getKeywords

      public static Set<String> getKeywords(BioPAXElement biopaxElement, int depth, Filter<DataPropertyEditor>... dataPropertyFilters)
      Collects data type (not object) property values (can be then used for full-text indexing).
      Parameters:
      biopaxElement - biopax object
      depth - greater than or equal to 0: 0 means use this object's data properties only; 1 means also add its children's data properties, etc. (the meaning is slightly different from that of the Fetcher.fetch(..) method)
      dataPropertyFilters - biopax data property filters to optionally either skip some properties, e.g. 'sequence', 'temperature', or only accept 'term', 'comment', 'name', etc.
      Returns:
      set of keywords
    • getOrganisms

      public static Set<BioSource> getOrganisms(BioPAXElement biopaxElement)
      Collects BioSource objects from this or related elements, where it makes sense (the biopax element itself might have no 'organism' property at all, or an empty one). The idea is to additionally associate such elements with existing BioSource objects, and thus make filtering by organism possible for at least Interaction, Protein, Complex, Dna, etc. biopax entities.
      Parameters:
      biopaxElement - biopax object
      Returns:
      set of BioSource (organism) objects
    • getDatasources

      public static Set<Provenance> getDatasources(BioPAXElement biopaxElement)
      Collects all Provenance objects associated with this one as follows:
      - if the element is an Entity (has the 'dataSource' property) or is a Provenance itself, get the values and quit;
      - if the biopax element is a PathwayStep or EntityReference, traverse into some of its object/inverse properties to collect dataSource values from associated entities;
      - return an empty set for all other BioPAX types (it is less important to associate common self-descriptive biopax utility classes with particular pathway data sources).
      Parameters:
      biopaxElement - a biopax object
      Returns:
      Provenance objects set
    • getParentPathways

      public static Set<Pathway> getParentPathways(BioPAXElement biopaxElement)
      Collects all parent Pathway objects recursively traversing the inverse object properties of the biopax element. It ignores all BioPAX types except (incl. sub-classes of): Pathway, Interaction, PathwayStep, PhysicalEntity, EntityReference, and Gene.
      Parameters:
      biopaxElement - biopax object
      Returns:
      inferred parent pathways
    • mergeEquivalentInteractions

      public static void mergeEquivalentInteractions(Model model)
      Merges equivalent interactions (currently, Conversions only). TODO: shall we rename this to mergeEquivalentConversions instead (this is what it does)? Warning: experimental; check whether the result is desirable, as it very much depends on the actual pathway data quality.
      Parameters:
      model - to edit/update
    • mergeEquivalentPhysicalEntities

      public static void mergeEquivalentPhysicalEntities(Model model)
      Merges equivalent physical entities. This can greatly decrease the model's size and improve some visualizations, but can also introduce (or uncover hidden) semantic problems, such as when a physical entity is both a component of a complex and independently participates in an interaction (this can happen when the location and modification features of a protein are not defined, and only names, xrefs and perhaps an entity reference are there). Warning: please check whether the result is desirable; the result of the merging very much depends on the actual pathway data quality (in fact, such merging is better decided and done by a data provider before releasing the data).
      Parameters:
      model - to edit/update
    • encodeBase62

      public static String encodeBase62(String str)
    • shortenUri

      public static String shortenUri(String xmlbase, String uri)
      Creates a short URI from the URI, given the xml:base. One has to check that the new URI is unique before using it in a model (if it is not, e.g., add some suffix to the xmlbase parameter and try again).
      Parameters:
      xmlbase - the xml:base to build the short URI with
      uri - the URI to shorten
      Returns:
      a short URI
    • updateUri

      public static void updateUri(Model model, BioPAXElement el, String newUri)
      Replaces the URI of a BioPAX object in the Model using Java reflection. If the element also belongs to other BioPAX models, those will become inconsistent unless this method is called for each such model. Warnings: one should not normally use this method at all; but if you do, then do not use the URI of another object from the same model.
      Parameters:
      model - model (can be null; if the object in fact belongs to a model, the model will be inconsistent)
      el - biopax object
      newUri - the new URI (not null or empty)
    • breakPathwayComponentCycle

      public static void breakPathwayComponentCycle(Model model)
      Removes cyclic pathway inclusions (non-trivial infinite loops) in the 'pathwayComponent' biopax property. Such loops usually do not make much sense and can only cause trouble in pathway data analysis. This tool recursively removes parent pathways from the sub-pathways' pathwayComponent sets.
      Parameters:
      model - a model that contains Pathways; will be modified as the result
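      The idea of removing parent pathways from a sub-pathway's component set can be sketched as a depth-first walk that drops every edge pointing back to a pathway on the current path. This is an illustration on string IDs under stated assumptions, not the Paxtools algorithm: CycleBreakSketch, breakCycles and the map-based graph are hypothetical, and the sketch also drops direct self-inclusions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in: pathway ID -> IDs of its sub-pathways (pathwayComponent).
final class CycleBreakSketch {

    static void breakCycles(Map<String, Set<String>> componentsOf) {
        Set<String> done = new HashSet<>();
        for (String pathway : new ArrayList<>(componentsOf.keySet())) {
            visit(pathway, componentsOf, new HashSet<>(), done);
        }
    }

    private static void visit(String pathway, Map<String, Set<String>> graph,
                              Set<String> path, Set<String> done) {
        if (done.contains(pathway)) {
            return; // already fully explored, no back edges left below it
        }
        path.add(pathway);
        Set<String> subs = graph.getOrDefault(pathway, Collections.emptySet());
        for (Iterator<String> it = subs.iterator(); it.hasNext();) {
            String sub = it.next();
            if (path.contains(sub)) {
                it.remove(); // sub is an ancestor (parent) pathway: cut the loop here
            } else {
                visit(sub, graph, path, done);
            }
        }
        path.remove(pathway);
        done.add(pathway);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> model = new TreeMap<>();
        model.put("A", new TreeSet<>(Set.of("B")));
        model.put("B", new TreeSet<>(Set.of("C")));
        model.put("C", new TreeSet<>(Set.of("A"))); // C includes its ancestor A
        breakCycles(model);
        System.out.println(model);
    }
}
```

      Which inclusion gets removed depends on where the walk starts; with the sorted map used here it starts at "A", so the edge from "C" back to "A" is the one that is cut.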
    • isGeneric

      public static boolean isGeneric(BioPAXElement e)
      Checks whether the BioPAX element is a generic physical entity or entity reference.
      Parameters:
      e - biopax object
      Returns:
      true when the object is a generic physical entity or entity reference