Package org.biopax.paxtools.controller
Class ModelUtils
java.lang.Object
org.biopax.paxtools.controller.ModelUtils
Several useful algorithms and examples, e.g., to extract root or child
BioPAX L3 elements, remove dangling elements, replace elements
or URIs, fix or infer property values, etc.
NOTE: although this is a public class with public methods,
it can be (and already has been) modified, sometimes considerably,
in every minor revision; it was not designed to be part of Paxtools' public API.
We therefore encourage users to copy the methods they need into their own apps
rather than depend on this unstable utility class in the long term.
- Author:
- rodche, Arman, Emek
-
Method Summary
Modifier and Type | Method | Description
static void | addMissingEntityReference(Model model, SimplePhysicalEntity pe) | For a non-generic simple physical entity (memberPhysicalEntity property is empty) that does not have entityReference property defined, this method generates and adds a new entity reference of proper type to both this entity and the model, and also copies names and xrefs from the source physical entity to the generated entity reference (UnificationXrefs are converted to RelationshipXref and then also deleted from the original entity.)
static void | breakPathwayComponentCycle(Model model) | Removes cyclic pathway inclusions, non-trivial infinite loops, in 'pathwayComponent' biopax property.
static boolean | checkERFeatureSet(EntityReference er, boolean fix) | Finds and adds all (missing) entity features to given entity reference from all its owner simple physical entities ('feature' and 'notFeature' properties).
static String | encodeBase62(String str) |
static Set<EntityFeature> | findFeaturesAddedToSecond(PhysicalEntity first, PhysicalEntity second, boolean fix) |
static void | fixControlled(Model model, Control control) | In Paxtools v6, controlled property won't accept multiple values (due to the OWL functional property restriction, which we so far forgot of); so, let's make sure every Control has at most one controlled process.
static void | fixDanglingInverseProperties(BioPAXElement bpe, Model model) | Unlinks inverse properties of the BioPAX object from values the model does not have.
static void | fixDanglingObjectProperties(BioPAXElement bpe, Model model) | Unlinks object properties of the BioPAX object from values the model does not have.
static Map<Class<? extends BioPAXElement>, Integer> | generateClassMetrics(Model model) | Generates simple counts of different elements in the model.
static Model | getAllChildren(BioPAXElement bpe, Filter<PropertyEditor>... filters) | Deprecated.
static Set<Provenance> | getDatasources(BioPAXElement biopaxElement) | Collects all Provenance objects associated with this one as follows: - if the element is Entity (has 'dataSource' property) or is Provenance itself, get the values and quit; - if the biopax element is PathwayStep or EntityReference, traverse into some of its object/inverse properties to collect dataSource values from associated entities.
static Model | getDirectChildren | Gets direct children of a given BioPAX element and adds them to a new model.
static Set<BioPAXElement> | getDirectChildrenAsSet | Collects direct children of a given BioPAX element.
static Set<EntityFeature> | getFeatureIntersection(PhysicalEntity first, org.biopax.paxtools.controller.ModelUtils.FeatureType firstClass, PhysicalEntity second, org.biopax.paxtools.controller.ModelUtils.FeatureType secondClass) |
static Set<EntityFeature> | getFeatureSetByType(PhysicalEntity pe, org.biopax.paxtools.controller.ModelUtils.FeatureType type) |
static Set<String> | getKeywords(BioPAXElement biopaxElement, int depth, Filter<DataPropertyEditor>... dataPropertyFilters) | Collects data type (not object) property values (can be then used for full-text indexing).
static <T extends BioPAXElement> T | getObject | A more strict, type-safe way to ask for a biopax object from the model, unlike Model.getByID(String).
 | getOrganisms(BioPAXElement biopaxElement) | Collects BioSource objects from this or related elements (where it makes sense; though the biopax element might have no or empty 'organism' property at all.
 | getParentPathways(BioPAXElement biopaxElement) | Collects all parent Pathway objects recursively traversing the inverse object properties of the biopax element.
static <T extends BioPAXElement> Set<T> | getRootElements(Model model, Class<T> filterClass) | Finds "root" BioPAX objects that belong to a particular class (incl. sub-classes) in the model.
static boolean | isGeneric | Checks whether the BioPAX element is generic physical entity or entity reference.
static String | md5hex | Calculates MD5 hash code (as 32-byte hex. string).
static void | mergeEquivalentInteractions(Model model) | Merges equivalent interactions (currently - Conversions only).
static void | mergeEquivalentPhysicalEntities | Merges equivalent physical entities.
static void | normalizeGeneric(Model model, PhysicalEntity generic) | In all interactions and complexes, replace generic physical entities (having members) with their corresponding members; clone the parent object, if needed, for each member.
static void | normalizeGenerics(Model model) | Converts each generic simple (except a Complex) physical entity having memberPhysicalEntity property set into equivalent physical entity with a generic entity reference (have memberEntityReference values).
static <T extends BioPAXElement> Set<BioPAXElement> | removeObjectsIfDangling(Model model, Class<T> clazz) | Iteratively removes "dangling" elements of given type and its sub-types, e.g. Xref.class objects, from the BioPAX model.
static void | replace(Model model, Map<? extends BioPAXElement, ? extends BioPAXElement> subs) | Replaces BioPAX elements in the model with ones from the map, updates corresponding BioPAX object references.
static void | replaceEquivalentFeatures(Model model) | This method iterates over the features in a model and tries to find equivalent objects and merges them.
static String | shortenUri(String xmlbase, String uri) | Creates a short URI from the URI, given the xml:base.
static void | updateUri(Model model, BioPAXElement el, String newUri) | Replaces the URI of a BioPAX object in the Model using java reflection.
static Model | writeRead | Cuts the BioPAX model off other models and BioPAX objects by essentially performing write/read to/from OWL.
-
Method Details
-
replace
Replaces BioPAX elements in the model with ones from the map and updates the corresponding BioPAX object references. It neither removes the old elements from the model nor adds the new ones (if required, one can do this before or after calling this method, e.g., using the same 'subs' map). It visits all object properties of each "explicit" element in the model, but does not traverse deeper into sub-properties to replace something there as well (e.g., nested member entity references are not replaced unless the parent entity reference is present in the model). It does not automatically move or migrate the old (replaced) object's children to the new objects: the replacements are supposed to have their own properties already set, or to be set shortly; otherwise, consider using something like fixDanglingInverseProperties(BioPAXElement, Model) afterwards.
- Parameters:
model - biopax model where the objects are to be replaced
subs - the replacements map (many-to-one, old-to-new)
- Throws:
IllegalBioPAXArgumentException - if a replacement object has an incompatible type
-
getRootElements
Finds "root" BioPAX objects that belong to a particular class (incl. sub-classes) in the model. Note, however, that such "root" elements may or may not be property values of other elements that are not included in the model.
- Type Parameters:
T - biopax type
- Parameters:
model - biopax model to work with
filterClass - filter class (including subclasses)
- Returns:
- set of the root biopax objects of given type
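The idea of a "root" can be illustrated without Paxtools. The following is a minimal sketch, not the library's implementation: `Node`, `PathwayNode` and `rootElements` are hypothetical stand-ins, and a root is assumed to be an element of the requested type that no element in the model lists as a child.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RootElementsSketch {
    /** Hypothetical stand-in for a BioPAX element: a URI plus child links. */
    static class Node {
        final String uri;
        final List<Node> children = new ArrayList<>();
        Node(String uri) { this.uri = uri; }
    }

    /** Hypothetical sub-type, used only to demonstrate the class filter. */
    static class PathwayNode extends Node {
        PathwayNode(String uri) { super(uri); }
    }

    /** Returns elements of the given type that are not a child of any model element. */
    static <T extends Node> Set<T> rootElements(Collection<? extends Node> model, Class<T> filterClass) {
        Set<T> roots = new LinkedHashSet<>();
        // Start with every element of the requested type (sub-classes included).
        for (Node n : model) {
            if (filterClass.isInstance(n)) roots.add(filterClass.cast(n));
        }
        // Drop every element that some model element refers to as a child.
        for (Node n : model) {
            for (Node child : n.children) roots.remove(child);
        }
        return roots;
    }
}
```

Parents that are outside the `model` collection are never visited, which mirrors the note above: a "root" may still be a property value of an element that the model does not contain.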
-
removeObjectsIfDangling
public static <T extends BioPAXElement> Set<BioPAXElement> removeObjectsIfDangling(Model model, Class<T> clazz)
Iteratively removes "dangling" elements of the given type and its sub-types, e.g. Xref.class objects, from the BioPAX model. If the model does not contain any root Entity class objects, and the second parameter is the basic UtilityClass.class (i.e., not a sub-class of it), the method simply logs a warning and returns (otherwise, it would remove everything from the model). For the same reason, do not use the basic Entity.class either (a sub-class is OK). This method does not, however, change relationships among objects; in particular, some inverse properties, such as entityReferenceOf or xrefOf, may still refer to a removed object.
- Type Parameters:
T - biopax type
- Parameters:
model - the model to modify
clazz - filter class (filter by this type and its sub-classes)
- Returns:
- removed objects
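Why the removal has to be iterative is easiest to see on a toy graph. The sketch below is an illustration under stated assumptions, not the Paxtools code: `Node`, `Entity`, `Util`, `Ref`, `Xref` and `removeIfDangling` are hypothetical names, "dangling" is assumed to mean "not a child of any element still in the model", and the safety check for the basic UtilityClass.class described above is not reproduced.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DanglingSketch {
    /** Hypothetical stand-in for a BioPAX element with child links. */
    static class Node {
        final String uri;
        final List<Node> children = new ArrayList<>();
        Node(String uri) { this.uri = uri; }
    }
    static class Entity extends Node { Entity(String uri) { super(uri); } }
    static class Util extends Node { Util(String uri) { super(uri); } }
    static class Ref extends Util { Ref(String uri) { super(uri); } }
    static class Xref extends Util { Xref(String uri) { super(uri); } }

    /** Removes unreferenced elements of the given type until nothing changes. */
    static <T extends Node> Set<Node> removeIfDangling(Set<Node> model, Class<T> clazz) {
        Set<Node> removed = new LinkedHashSet<>();
        boolean changed = true;
        while (changed) {
            // Everything that some remaining element refers to is "in use".
            Set<Node> referenced = new HashSet<>();
            for (Node n : model) referenced.addAll(n.children);
            List<Node> dangling = new ArrayList<>();
            for (Node n : model) {
                if (clazz.isInstance(n) && !referenced.contains(n)) dangling.add(n);
            }
            // Removing a parent may leave its children dangling, hence the loop.
            model.removeAll(dangling);
            removed.addAll(dangling);
            changed = !dangling.isEmpty();
        }
        return removed;
    }
}
```

As in the description above, the removed objects keep their own links; only membership in the model changes.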
-
writeRead
Cuts the BioPAX model off from other models and BioPAX objects by essentially performing a write to and read from OWL. The resulting model contains new objects with the same IDs whose object properties are "fixed", i.e., dangling values become null/empty and inverse properties (e.g. xrefOf) are re-calculated. The original model is unchanged. Note: this method will fail for very large models (if the resulting RDF/XML UTF-8 string is longer than approx. 1Gb).
- Parameters:
model - biopax model to process
- Returns:
- copy of the model
-
getDirectChildren
Gets direct children of a given BioPAX element and adds them to a new model.
- Parameters:
bpe - biopax element/object
- Returns:
- new model
-
getAllChildren
Deprecated. Use Fetcher.fetch(BioPAXElement, Model) instead (with Fetcher.nextStepFilter or without).
Gets all the child BioPAX elements of a given BioPAX element (using the "tuned" Fetcher) and adds them to a new model.
- Parameters:
bpe - biopax object
filters - property filters (e.g., for Fetcher to skip some properties). Default is to skip 'nextStep'.
- Returns:
- new biopax Model that contains all the child objects
-
getDirectChildrenAsSet
Collects direct children of a given BioPAX element.
- Parameters:
bpe - biopax object (parent)
- Returns:
- set of child biopax objects
-
generateClassMetrics
Generates simple counts of different elements in the model.
- Parameters:
model - biopax model to analyze
- Returns:
- a map from biopax types to the counts of objects of each type
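The shape of such a result can be shown in plain Java. This is only a sketch under assumptions: `classMetrics` is a hypothetical helper that counts arbitrary objects by their runtime class, whereas the method documented here works on a BioPAX Model and may key the map differently (for example by model interface).

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class ClassMetricsSketch {
    /** Counts the objects of each runtime class in the collection. */
    static Map<Class<?>, Integer> classMetrics(Collection<?> objects) {
        Map<Class<?>, Integer> counts = new HashMap<>();
        for (Object o : objects) {
            // merge() starts a new count at 1 or adds 1 to the existing one.
            counts.merge(o.getClass(), 1, Integer::sum);
        }
        return counts;
    }
}
```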
-
getObject
A stricter, type-safe way to ask for a biopax object from the model, unlike Model.getByID(String).
- Type Parameters:
T - biopax type
- Parameters:
model - biopax model to query
uri - absolute URI of a biopax element
clazz - class filter (to filter by the biopax type and its sub-types)
- Returns:
- the biopax object or null (if no such element, or element with this URI is of incompatible type)
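The "null instead of a wrong type" contract can be sketched as follows. The example assumes a plain `Map<String, Object>` in place of a BioPAX Model, and the helper name `getObject` is reused only for readability; it is not the Paxtools implementation.

```java
import java.util.Map;

class TypedLookupSketch {
    /** Returns the object stored under the URI if it has the requested type, else null. */
    static <T> T getObject(Map<String, Object> model, String uri, Class<T> clazz) {
        Object o = model.get(uri);
        // isInstance(null) is false, so a missing URI also yields null.
        return clazz.isInstance(o) ? clazz.cast(o) : null;
    }
}
```

The caller gets a correctly typed reference without an unchecked cast, and a type mismatch shows up as null rather than a ClassCastException later on.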
-
md5hex
Calculates the MD5 hash code (as a 32-character hex string). This method is not BioPAX specific. It can be used for many purposes, such as generating new unique URIs, database primary keys, etc.
- Parameters:
id - some identifier, e.g., URI
- Returns:
- the 32-character hex digest string
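Users who prefer to copy this helper rather than depend on the class could start from a sketch like the one below. It uses only the JDK; the UTF-8 encoding and lower-case hex output are assumptions and may differ from what the Paxtools method does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5HexSketch {
    /** Returns the MD5 digest of the identifier as 32 lower-case hex characters. */
    static String md5hex(String id) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Assumption: the identifier is encoded as UTF-8 before hashing.
            byte[] digest = md.digest(id.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                // Mask to an unsigned value so each byte yields exactly two hex digits.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5hex("abc")); // prints 900150983cd24fb0d6963f7d28e17f72
    }
}
```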
-
fixDanglingObjectProperties
Unlinks object properties of the BioPAX object from values the model does not have.
- Parameters:
bpe - a biopax object
model - the model to look for objects in
-
fixDanglingInverseProperties
Unlinks inverse properties of the BioPAX object from values the model does not have.
- Parameters:
bpe - BioPAX object
model - where to look for other objects
-
getFeatureIntersection
public static Set<EntityFeature> getFeatureIntersection(PhysicalEntity first, org.biopax.paxtools.controller.ModelUtils.FeatureType firstClass, PhysicalEntity second, org.biopax.paxtools.controller.ModelUtils.FeatureType secondClass)
-
getFeatureSetByType
public static Set<EntityFeature> getFeatureSetByType(PhysicalEntity pe, org.biopax.paxtools.controller.ModelUtils.FeatureType type)
-
checkERFeatureSet
Finds and adds all (missing) entity features to the given entity reference from all its owner simple physical entities ('feature' and 'notFeature' properties). However, it neither checks for nor resolves any violations of the 'entityFeature' property's inverse functional constraint (i.e., an EntityFeature instance can belong to one and only one EntityReference object).
- Parameters:
er - entity reference object
fix - flag
- Returns:
- true or false
-
findFeaturesAddedToSecond
public static Set<EntityFeature> findFeaturesAddedToSecond(PhysicalEntity first, PhysicalEntity second, boolean fix)
-
fixControlled
In Paxtools v6, the controlled property won't accept multiple values (due to the OWL functional property restriction, which had been overlooked so far); this method therefore makes sure that every Control has at most one controlled process.
- Parameters:
model - biopax model
control - to be cloned to set one controlled per control
-
normalizeGeneric
In all interactions and complexes, replaces generic physical entities (having members) with their corresponding members; clones the parent object, if needed, for each member.
- Parameters:
model - biopax model
generic - physical entity (PE) that has member PEs
-
normalizeGenerics
Converts each generic simple physical entity (i.e., not a Complex) that has the memberPhysicalEntity property set into an equivalent physical entity with a generic entity reference (one that has memberEntityReference values). Complexes cannot be normalized in the same way, for they do not have the entityReference property and might also contain generic components. In general, avoid using 'memberPhysicalEntity' (made exclusively for Reactome) in BioPAX models, for there is a better alternative: entityReference/memberEntityReference.
- Parameters:
model - biopax model to fix
-
addMissingEntityReference
For a non-generic simple physical entity (memberPhysicalEntity property is empty) that does not have the entityReference property defined, this method generates a new entity reference of the proper type and adds it to both this entity and the model. It also copies names and xrefs from the source physical entity to the generated entity reference (UnificationXrefs are converted to RelationshipXrefs and then also deleted from the original entity).
- Parameters:
model - the BioPAX model
pe - a simple physical entity (that has neither entityReference nor memberPEs set)
-
replaceEquivalentFeatures
Iterates over the features in a model, tries to find equivalent objects, and merges them.
- Parameters:
model - to be fixed
-
getKeywords
public static Set<String> getKeywords(BioPAXElement biopaxElement, int depth, Filter<DataPropertyEditor>... dataPropertyFilters)
Collects data type (not object) property values (these can then be used for full-text indexing).
- Parameters:
biopaxElement - biopax object
depth - greater than or equal to 0: 0 means use this object's data properties only; 1 means also add the children's data properties, etc. (the meaning is slightly different from that of the Fetcher.fetch(..) method)
dataPropertyFilters - biopax data property filters to optionally either skip properties such as 'sequence' or 'temperature', or only accept 'term', 'comment', 'name', etc.
- Returns:
- set of keywords
-
getOrganisms
Collects BioSource objects from this or related elements where it makes sense, even though the biopax element itself might have no 'organism' property or an empty one. The idea is to additionally associate elements with existing BioSource objects, and thus make filtering by organism possible, for at least Interaction, Protein, Complex, Dna, etc. biopax entities.
- Parameters:
biopaxElement - biopax object
- Returns:
- organism names
-
getDatasources
Collects all Provenance objects associated with this one as follows:
- if the element is an Entity (has the 'dataSource' property) or is a Provenance itself, get the values and quit;
- if the biopax element is a PathwayStep or EntityReference, traverse into some of its object/inverse properties to collect dataSource values from associated entities;
- return an empty set for all other BioPAX types (it is less important to associate common self-descriptive biopax utility classes with particular pathway data sources).
- Parameters:
biopaxElement - a biopax object
- Returns:
- Provenance objects set
-
getParentPathways
Collects all parent Pathway objects by recursively traversing the inverse object properties of the biopax element. It ignores all BioPAX types except the following (incl. their sub-classes): Pathway, Interaction, PathwayStep, PhysicalEntity, EntityReference, and Gene.
- Parameters:
biopaxElement - biopax object
- Returns:
- inferred parent pathways
-
mergeEquivalentInteractions
Merges equivalent interactions (currently Conversions only). TODO: shall we rename it to mergeEquivalentConversions instead (this is what it does)? Warning: experimental; check whether the result is desirable, as it very much depends on the actual pathway data quality.
- Parameters:
model - to edit/update
-
mergeEquivalentPhysicalEntities
Merges equivalent physical entities. This can greatly decrease the model's size and improve some visualizations, but can also introduce (or uncover hidden) semantic problems, such as when a physical entity is both a component of a complex and independently participates in an interaction (this can happen when the location and modification features of a protein are not defined, and only names, xrefs and perhaps an entity reference are there). Note (warning): please check whether the result is desirable; the result of the merging very much depends on the actual pathway data quality (in fact, such merging is better decided and done by a data provider before releasing the data).
- Parameters:
model - to edit/update
-
encodeBase62
-
shortenUri
Creates a short URI from the URI, given the xml:base. One has to check that the new URI is unique before using it in a model (if it is not, one can, e.g., add some suffix to the xmlBase parameter and try again).
- Parameters:
xmlbase -
uri -
- Returns:
- a short URI
-
updateUri
Replaces the URI of a BioPAX object in the Model using Java reflection. If the element also belongs to other BioPAX models, those will become inconsistent unless this method is called for each such model. Warnings:
- one should not normally use this method at all;
- if you do, then don't use a URI of another object from the same model.
- Parameters:
model - model (can be null; if the object in fact belongs to a model, that model will become inconsistent)
el - biopax object
newUri - the new URI (not null or empty)
-
breakPathwayComponentCycle
Removes cyclic pathway inclusions (non-trivial infinite loops) in the 'pathwayComponent' biopax property. Such loops usually do not make much sense and can only cause trouble in pathway data analysis. This tool recursively removes parent pathways from sub-pathways' pathwayComponent sets.
- Parameters:
model - a model that contains Pathways; will be modified as the result
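One way to picture this is a depth-first walk that cuts every link leading back to a pathway currently being visited. The sketch below is an assumption-laden illustration rather than the Paxtools code: `Pathway`, `components` and `breakCycles` are hypothetical stand-ins, and which link of a loop gets cut depends on the traversal order.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

class PathwayCycleSketch {
    /** Hypothetical stand-in for a Pathway and its 'pathwayComponent' values. */
    static class Pathway {
        final String name;
        final Set<Pathway> components = new LinkedHashSet<>();
        Pathway(String name) { this.name = name; }
    }

    /** Removes component links that point back to an enclosing (parent) pathway. */
    static void breakCycles(Collection<Pathway> pathways) {
        Set<Pathway> done = new HashSet<>();
        for (Pathway p : pathways) {
            if (!done.contains(p)) visit(p, new HashSet<>(), done);
        }
    }

    private static void visit(Pathway p, Set<Pathway> path, Set<Pathway> done) {
        path.add(p); // p is now one of the "parents" of everything visited below
        for (Iterator<Pathway> it = p.components.iterator(); it.hasNext(); ) {
            Pathway c = it.next();
            if (path.contains(c)) {
                it.remove(); // c is a parent of p (or p itself): cut the loop here
            } else if (!done.contains(c)) {
                visit(c, path, done);
            }
        }
        path.remove(p);
        done.add(p); // fully processed; no need to descend into it again
    }
}
```

For a loop a, b, c, a visited starting from a, this removes a from c's components and leaves the a to b to c inclusion intact.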
-
isGeneric
Checks whether the BioPAX element is a generic physical entity or entity reference.
- Parameters:
e - biopax object
- Returns:
- true when the object is generic physical entity or entity reference
-