Class MinerAdapter

java.lang.Object
org.biopax.paxtools.pattern.miner.MinerAdapter
All Implemented Interfaces:
Miner
Direct Known Subclasses:
AbstractSIFMiner, ControlsStateChangeDetailedMiner, DirectedRelationMiner, RelatedGenesOfInteractionsMiner, UbiquitousIDMiner

public abstract class MinerAdapter extends Object implements Miner
Adapter class for a miner.
Author:
Ozgun Babur
  • Field Details

    • name

      protected String name
      Name of the miner.
    • description

      protected String description
      Description of the miner.
    • blacklist

      protected Blacklist blacklist
      Blacklist for identifying ubiquitous small molecules.
    • idFetcher

      protected IDFetcher idFetcher
      ID fetcher is used for skipping objects that cannot generate a valid ID during the search.
    • idMap

      protected Map<BioPAXElement,Set<String>> idMap
      Cache of object IDs, kept for performance. Without it, half of the SIF conversion time is spent in fetchIDs().
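The caching idea behind idMap can be sketched as below. This is a minimal illustration, not the Paxtools implementation: the class IdCacheSketch, its lookups counter, and the use of a plain Function in place of IDFetcher are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for the idMap cache; not the Paxtools implementation.
final class IdCacheSketch<E> {
    private final Map<E, Set<String>> idMap = new HashMap<>();
    private int lookups = 0; // counts how often the expensive fetcher actually runs

    // Returns the cached IDs for the element, calling the fetcher only on a cache miss.
    Set<String> fetchIDs(E element, Function<E, Set<String>> fetcher) {
        return idMap.computeIfAbsent(element, e -> {
            lookups++;
            return fetcher.apply(e);
        });
    }

    int lookups() {
        return lookups;
    }
}
```

Repeated requests for the same element then reuse the stored set instead of recomputing it.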
  • Constructor Details

    • MinerAdapter

      protected MinerAdapter(String name, String description)
      Constructor with name and description.
      Parameters:
      name - name of the miner
      description - description of the miner
  • Method Details

    • setBlacklist

      public void setBlacklist(Blacklist blacklist)
      Sets the blacklist to use during SIF search.
      Parameters:
      blacklist - for identifying ubiquitous small molecules
    • setIDFetcher

      public void setIDFetcher(IDFetcher idFetcher)
      Sets the ID fetcher to use during SIF search.
      Parameters:
      idFetcher - ID generator from BioPAX object
    • constructPattern

      public abstract Pattern constructPattern()
      Constructs the pattern to use for mining.
      Returns:
      the pattern
    • getPattern

      public Pattern getPattern()
      Gets the pattern, constructs if null.
      Specified by:
      getPattern in interface Miner
      Returns:
      pattern
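The "constructs if null" behavior of getPattern is a lazy-initialization idiom. A minimal sketch follows, assuming a hypothetical class LazyPatternSketch whose type parameter P stands in for Pattern; the real class may differ in detail.

```java
// Hypothetical sketch of "construct if null"; P stands in for the Pattern type.
abstract class LazyPatternSketch<P> {
    private P pattern; // built on first access, then reused

    // Subclasses build the pattern here, mirroring constructPattern().
    protected abstract P constructPattern();

    // Returns the stored pattern, constructing it on the first call only.
    public P getPattern() {
        if (pattern == null) {
            pattern = constructPattern();
        }
        return pattern;
    }
}
```

With this shape a subclass only has to supply constructPattern(), and callers always go through getPattern().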
    • getName

      public String getName()
      Gets the name of the miner.
      Specified by:
      getName in interface Miner
      Returns:
      name
    • getDescription

      public String getDescription()
      Gets the description of the miner.
      Specified by:
      getDescription in interface Miner
      Returns:
      description
    • setName

      public void setName(String name)
    • setDescription

      public void setDescription(String description)
    • getIdMap

      public Map<BioPAXElement,Set<String>> getIdMap()
    • setIdMap

      public void setIdMap(Map<BioPAXElement,Set<String>> idMap)
    • toString

      public String toString()
      Uses the name as the string representation of the miner.
      Overrides:
      toString in class Object
      Returns:
      name
    • getGeneSymbol

      protected String getGeneSymbol(ProteinReference pr)
      Searches for the gene symbol of the given EntityReference.
      Parameters:
      pr - to search for a symbol
      Returns:
      symbol
    • getUniprotNameForHuman

      protected String getUniprotNameForHuman(ProteinReference pr)
      Searches for the uniprot name of the given human EntityReference.
      Parameters:
      pr - to search for the uniprot name
      Returns:
      uniprot name
    • getGeneSymbol

      protected String getGeneSymbol(Match m, String label)
      Searches for the gene symbol of the given EntityReference.
      Parameters:
      m - current match
      label - label of the related EntityReference in the pattern
      Returns:
      symbol
    • getUniprotNameForHuman

      protected String getUniprotNameForHuman(Match m, String label)
      Searches for the uniprot name of the given human EntityReference.
      Parameters:
      m - current match
      label - label of the related EntityReference in the pattern
      Returns:
      uniprot name
    • isInhibition

      public boolean isInhibition(Control ctrl)
      Checks if the type of a Control is inhibition.
      Parameters:
      ctrl - Control to check
      Returns:
      true if type is inhibition related
    • toStringSet

      public Set<String> toStringSet(Set<ModificationFeature> set)
      Sorts the modifications and returns them as a set of strings.
      Parameters:
      set - modifications
      Returns:
      a set of strings listing the modifications
    • getModificationTerm

      public String getModificationTerm(ModificationFeature mf)
      Gets the String term of the modification feature.
      Parameters:
      mf - modification feature
      Returns:
      modification term
    • getPositionStart

      public int getPositionStart(ModificationFeature mf)
      Gets the first position of the modification feature.
      Parameters:
      mf - modification feature
      Returns:
      first location
    • getPositionInString

      public String getPositionInString(ModificationFeature mf)
      Gets the position of the modification feature as a String.
      Parameters:
      mf - modification feature
      Returns:
      location
    • getModifications

      protected Set<String> getModifications(Match m, String label)
      Gets modifications of the given element in a string set. The element has to be a PhysicalEntity.
      Parameters:
      m - match
      label - label of the PhysicalEntity
      Returns:
      modifications
    • getModifications

      protected Set<String> getModifications(Match m, String memLabel, String comLabel)
      Gets modifications of the given elements in a string set. The elements have to be PhysicalEntity objects, and they must be the two ends of a chain with homology and/or complex membership relations.
      Parameters:
      m - match
      memLabel - the member-end of the PhysicalEntity chain
      comLabel - the complex-end of the PhysicalEntity chain
      Returns:
      modifications
    • getCellularLocations

      protected Set<String> getCellularLocations(Match m, String memLabel, String comLabel)
      Gets cellular locations of the given elements in a string set. The elements have to be PhysicalEntity objects, and they must be the two ends of a chain with homology and/or complex membership relations.
      Parameters:
      m - match
      memLabel - the member-end of the PhysicalEntity chain
      comLabel - the complex-end of the PhysicalEntity chain
      Returns:
      cellular locations
    • getDeltaModifications

      protected Set<String>[] getDeltaModifications(Match m, String memLabel1, String comLabel1, String memLabel2, String comLabel2)
      Gets delta modifications of the given elements in string sets. The elements have to be two PhysicalEntity chains. The result array is composed of two string sets: gained (0) and lost (1).
      Parameters:
      m - match
      memLabel1 - the member-end of the first PhysicalEntity chain
      comLabel1 - the complex-end of the first PhysicalEntity chain
      memLabel2 - the member-end of the second PhysicalEntity chain
      comLabel2 - the complex-end of the second PhysicalEntity chain
      Returns:
      delta modifications
    • getDeltaCompartments

      protected Set<String>[] getDeltaCompartments(Match m, String memLabel1, String comLabel1, String memLabel2, String comLabel2)
      Gets delta compartments of the given two PE chains. The result array is composed of two string sets: gained (0) and lost (1).
      Parameters:
      m - match
      memLabel1 - the member-end of the first PhysicalEntity chain
      comLabel1 - the complex-end of the first PhysicalEntity chain
      memLabel2 - the member-end of the second PhysicalEntity chain
      comLabel2 - the complex-end of the second PhysicalEntity chain
      Returns:
      delta compartments
    • getChain

      protected PhysicalEntityChain getChain(Match m, String memLabel, String comLabel)
    • removeCommon

      protected void removeCommon(Set<String> set1, Set<String> set2)
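The gained/lost results of getDeltaModifications and getDeltaCompartments can be pictured as two sets with their shared elements removed. The sketch below is an illustration under assumptions: the class DeltaSketch and its delta helper are hypothetical, the first chain is treated as the "before" state and the second as the "after" state, and a List is returned instead of an array to avoid generic array creation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the delta computation; not the Paxtools code.
final class DeltaSketch {
    // Removes every element present in both sets from both of them, in place.
    static void removeCommon(Set<String> set1, Set<String> set2) {
        Set<String> common = new HashSet<>(set1);
        common.retainAll(set2);
        set1.removeAll(common);
        set2.removeAll(common);
    }

    // Returns [gained, lost]; assumes "before" is the first chain and "after" the second.
    static List<Set<String>> delta(Set<String> before, Set<String> after) {
        Set<String> lost = new HashSet<>(before);   // copies, so the inputs stay untouched
        Set<String> gained = new HashSet<>(after);
        removeCommon(lost, gained);
        List<Set<String>> result = new ArrayList<>();
        result.add(gained); // index 0
        result.add(lost);   // index 1
        return result;
    }
}
```

Anything present on both sides cancels out, so only the differences between the two chains remain.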
    • concat

      protected String concat(Set<String> set, String sep)
      Converts the set of strings to a single string.
      Parameters:
      set - the set
      sep - separator string
      Returns:
      concatenated string
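A set can be joined into one string as sketched below. The class ConcatSketch is hypothetical, and sorting the elements first is an assumption made here so the output is deterministic; the documentation above does not state the element order.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of joining a set with a separator.
final class ConcatSketch {
    // Joins the elements with the separator; sorted first so the result is deterministic.
    static String concat(Set<String> set, String sep) {
        return String.join(sep, new TreeSet<>(set));
    }
}
```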
    • sign

      protected int sign(Control ctrl)
      Identifies negative and positive controls. Assumes positive by default.
      Parameters:
      ctrl - control to check
      Returns:
      sign
    • sign

      protected int sign(Match m, String... ctrlLabel)
      Checks the cumulative sign of the chained controls.
      Parameters:
      m - result match
      ctrlLabel - labels for controls
      Returns:
      sign
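One plausible reading of "cumulative sign" is the product of the individual signs, so that two chained inhibitions yield a positive effect. The sketch below assumes that reading and assumes each control is already reduced to +1 or -1; SignSketch and cumulativeSign are hypothetical names.

```java
// Hypothetical illustration; assumes the cumulative sign is the product of the signs.
final class SignSketch {
    // Multiplies the individual signs; each entry is expected to be +1 or -1.
    static int cumulativeSign(int... signs) {
        int sign = 1; // positive by default, also for an empty chain
        for (int s : signs) {
            sign *= s;
        }
        return sign;
    }
}
```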
    • labeledInactive

      protected boolean labeledInactive(Match m, String simpleLabel, String complexLabel)
      Checks if a PE chain is labeled as inactive.
      Parameters:
      m - the result match
      simpleLabel - simple end of the chain
      complexLabel - complex end of the chain
      Returns:
      true if labeled inactive
    • writeResultAsSIF

      protected void writeResultAsSIF(Map<BioPAXElement,List<Match>> matches, OutputStream out, boolean directed, String label1, String label2) throws IOException
      Writes the output as pairs of gene symbols of the given two ProteinReference objects. The label parameters have to map to ProteinReference objects.
      Parameters:
      matches - the search result
      out - output stream for text output
      directed - if true, reverse pairs are treated as different pairs
      label1 - label for the first ProteinReference in the result matches
      label2 - label for the second ProteinReference in the result matches
      Throws:
      IOException - if cannot write to output stream
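The effect of the directed flag on duplicate handling can be sketched as follows. This is an illustration only: the class PairSketch, the tab-separated line format, and the alphabetical canonical order for undirected pairs are assumptions, not the documented output format.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of directed versus undirected pair de-duplication.
final class PairSketch {
    // Builds tab-separated pair lines; undirected pairs are put in a canonical
    // order so that A-B and B-A collapse into a single line.
    static Set<String> pairs(List<String[]> rawPairs, boolean directed) {
        Set<String> lines = new LinkedHashSet<>();
        for (String[] p : rawPairs) {
            String a = p[0];
            String b = p[1];
            if (!directed && a.compareTo(b) > 0) {
                String tmp = a;
                a = b;
                b = tmp;
            }
            lines.add(a + "\t" + b);
        }
        return lines;
    }
}
```

With directed set to true, MDM2-TP53 and TP53-MDM2 are kept as two lines; with false, they are merged into one.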
    • writeSIFsUsingSIFFramework

      protected void writeSIFsUsingSIFFramework(Map<BioPAXElement,List<Match>> matches, OutputStream out) throws IOException
      Writes the matches as SIF output using the SIF framework.
      Parameters:
      matches - the search result
      out - output stream for text output
      Throws:
      IOException - if cannot write to output stream
    • getRelationType

      protected String getRelationType()
      Checks if the relation captured by the match has a type. This method just returns null, but any child class using the writeResultAsSIF method can override it to provide a relationship type between gene symbol pairs.
      Returns:
      type of the relation
    • getHeader

      public String getHeader()
      Gets the first line of the result file. This method should be overridden to customize the header of the result file.
      Returns:
      header
    • writeResultDetailed

      protected void writeResultDetailed(Map<BioPAXElement,List<Match>> matches, OutputStream out, int columns) throws IOException
      Writes the result as a tab delimited format, where the column values are customized.
      Parameters:
      matches - result matches
      out - output stream
      columns - number of columns in the result
      Throws:
      IOException - if cannot write to the stream
    • getValue

      public String getValue(Match m, int col)
      This method has to be overridden if writeResultDetailed method is used. It creates the column value of the given Match. If this method returns null for any column, then the current match is ignored.
      Parameters:
      m - current match
      col - current column
      Returns:
      column value
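The contract between writeResultDetailed and getValue, where a null column value causes the whole match to be skipped, can be sketched as below. The class DetailedSketch, the generic match type M, the BiFunction standing in for getValue, and zero-based column indices are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical illustration of the "null column skips the match" rule.
final class DetailedSketch {
    // Builds one tab-delimited row per match; a match is dropped if any column is null.
    static <M> List<String> rows(List<M> matches, int columns,
                                 BiFunction<M, Integer, String> getValue) {
        List<String> rows = new ArrayList<>();
        for (M m : matches) {
            String[] cells = new String[columns];
            boolean skip = false;
            for (int col = 0; col < columns && !skip; col++) {
                cells[col] = getValue.apply(m, col);
                skip = cells[col] == null; // stop at the first missing value
            }
            if (!skip) {
                rows.add(String.join("\t", cells));
            }
        }
        return rows;
    }
}
```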
    • createSIFInteraction

      public Set<SIFInteraction> createSIFInteraction(Match m, IDFetcher fetcher)
      Creates a SIF interaction for the given match.
      Parameters:
      m - match to use for SIF creation
      fetcher - ID generator from BioPAX object
      Returns:
      SIF interaction
    • fetchIDs

      protected Set<String> fetchIDs(BioPAXElement ele, IDFetcher fetcher)
    • getMediatorLabels

      public String[] getMediatorLabels()
      If a SIF miner wants to tell which essential BioPAX elements mediated this relation, then they need to override this method and pass the labels of elements.
      Returns:
      labels of elements to collect publication refs
    • getSourcePELabels

      public String[] getSourcePELabels()
      If a SIF miner wants to tell which PhysicalEntity objects acted as source of the relation, they need to override this method and pass the labels of elements.
      Returns:
      labels of elements
    • getTargetPELabels

      public String[] getTargetPELabels()
      If a SIF miner wants to tell which PhysicalEntity objects acted as target of the relation, they need to override this method and pass the labels of elements.
      Returns:
      labels of elements
    • getIdentifiers

      protected Set<String> getIdentifiers(Match m, String label)
      Uses uniprot name or gene symbol as identifier.
      Parameters:
      m - current match
      label - label of the related EntityReference in the pattern
      Returns:
      identifier
    • getCompoundName

      protected String getCompoundName(SmallMoleculeReference smr)
      Gets the name of the small molecule to use in SIF.
      Parameters:
      smr - small molecule ref
      Returns:
      a name