
class DaSimEstimator extends AnyRef

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new DaSimEstimator()

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. var _calcAvailability: Boolean
  5. var _pDistSimFeatureExtractionMethod: String
  6. var _pDistSimThreshold: Double
  7. var _pInitialFilterByObject: String
  8. var _pInitialFilterByPredicate: String
  9. var _pInitialFilterBySPARQL: String
  10. var _parameterVerboseProcess: Boolean
  11. var _seedLimit: Int
  12. var _sem_availability: Broadcast[Map[String, Double]]
  13. var _sem_distSimFeatureExtractionMethod: Broadcast[String]
  14. var _sem_entityCols: Broadcast[Array[String]]
  15. var _sem_featureExtractionMethod: Broadcast[String]
  16. var _sem_finalValCol: Broadcast[String]
  17. var _sem_importance: Broadcast[Map[String, Double]]
  18. var _sem_initialFilter: Broadcast[String]
  19. var _sem_reliability: Broadcast[Map[String, Double]]
  20. var _sem_similarityCols: Broadcast[Array[String]]
  21. def aggregateSimilarityScore(simDf: DataFrame, valueStreching: Boolean = true, availability: Map[String, Double] = null, importance: Map[String, Double] = null, reliability: Map[String, Double] = null): DataFrame

    aggregate similarity scores and weight them

    simDf

    similarity dataframe with the feature-specific similarity scores

    valueStreching

    optional parameter to stretch feature values, set by default

    availability

    weighting by availability

    importance

    user-specific weighting by importance

    reliability

    optional opportunity to influence the weighting by reliability

    returns

    similarity dataframe with the aggregated and weighted final similarity score

  22. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  23. def calculateAvailability(extractedFeaturesDF: DataFrame): Map[String, Double]
  24. def calculateDaSimSimilarities(candidatePairsDataFrame: DataFrame, extractedFeatureDataframe: DataFrame): DataFrame

    calculate, with the new approach, the weighted and feature-specific similarity scores

    candidatePairsDataFrame

    candidate pairs which span up the combinations to be calculated on

    extractedFeatureDataframe

    extracted feature dataframe

    returns

    dataframe with the pairwise similarity score calculated for each feature

  25. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  26. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  27. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  28. def gatherCandidatePairs(dataset: Dataset[Triple], seeds: DataFrame, _pDistSimFeatureExtractionMethod: String = "os", fastNotDistSim: Boolean = true, _pDistSimThreshold: Double = 0): DataFrame

    we use DistSim to gather promising candidates

    dataset

    prefiltered KG for gathering candidates

    seeds

    the seeds to be used for calculating promising candidates via DistSim

    _pDistSimFeatureExtractionMethod

    method for the DistSim feature extractor

    _pDistSimThreshold

    threshold for DistSim to postfilter pairs by a minimum similarity

    returns

    dataframe with candidate pairs resulting from DistSim

  29. def gatherFeatures(ds: Dataset[Triple], candidates: DataFrame, sparqlFeatureExtractionQuery: String = null, predicateFilter: String = "", objectFilter: String = ""): DataFrame

    feature extraction for extensive similarity scores; creates a dataframe with all features. There are two options for feature gathering: either SparqlFrame or the SmartFeatureExtractor, which operates pivot-based

    ds

    dataset of the KG

    candidates

    candidate pairs from DistSim

    sparqlFeatureExtractionQuery

    optional, but if set we use SparqlFrame and not the SmartFeatureExtractor

    returns

    dataframe with columns corresponding to the features and the URI identifier

  30. def gatherSeeds(ds: Dataset[Triple], sparqlFilter: String = null, objectFilter: String = null, predicateFilter: String = null): DataFrame

    internal method that collects seeds by either a SPARQL or an object filter

    ds

    dataset of triples representing the input KG

    sparqlFilter

    filter the initial KG by SPARQL

    objectFilter

    filter the initial KG by the SPO object

    returns

    dataframe with one column containing the string representation of seed URIs

  31. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  32. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  33. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  34. def listDistinctCandidates(candidatePairs: DataFrame): DataFrame

    list all elements which exist within the resulting URIs of DistSim

    candidatePairs

    candidate pairs in a dataframe coming from DistSim

    returns

    dataframe with one column holding the relevant URIs as strings

  35. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  36. def normSimColumns(df: DataFrame): DataFrame

    optional method to normalize similarity columns

    df

    similarity scored dataframe which needs to be normalized

    returns

    normalized dataframe

  37. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  38. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  39. var pAvailability: Map[String, Double]
  40. var pImportance: Map[String, Double]
  41. var pReliability: Map[String, Double]
  42. var pSimilarityCalculationExecutionOrder: Array[String]
  43. var pSparqlFeatureExtractionQuery: Null
  44. var pValueStreching: Boolean
  45. def semantification(resultDf: DataFrame): RDD[Triple]
  46. def setAvailability(availability: Map[String, Double]): DaSimEstimator.this.type

    specify the availability of each feature manually; this parameter weights the relevance of a feature's similarity based on its availability. It is possible that the availability is known; if the value is not given, it is considered to be equally distributed

    returns

    adjusted transformer

  47. def setDistSimFeatureExtractionMethod(distSimFeatureExtractionMethod: String): DaSimEstimator.this.type

    DistSim feature extraction method; the feature extraction method used for first guesses via DistSim

    distSimFeatureExtractionMethod

    DistSim feature Extraction Method

    returns

    adjusted transformer

  48. def setDistSimThreshold(distSimThreshold: Double): DaSimEstimator.this.type

    DistSim threshold for minimal similarity; this is the threshold on the minimal similarity score used within DistSim for promising candidates

    distSimThreshold

    DistSim threshold min similarity score for prefilter candidate pairs

    returns

    adjusted transformer

  49. def setImportance(importance: Map[String, Double]): DaSimEstimator.this.type

    specify the importance of each feature manually; this parameter weights the relevance of a feature's similarity based on its importance. This value lets the user influence the weighting according to personal preference

    returns

    adjusted transformer

  50. def setLimitSeeds(seedLimit: Int): DaSimEstimator.this.type
  51. def setObjectFilter(objectFilter: String): DaSimEstimator.this.type

    Filter the initial KG by the object of the SPO structure, an alternative that is faster than SPARQL filtering

    objectFilter

    string representing the object for spo filter

    returns

    adjusted transformer

  52. def setPredicateFilter(predicateFilter: String): DaSimEstimator.this.type

    Filter the initial KG by the predicate of the SPO structure, an alternative that is faster than SPARQL filtering

    predicateFilter

    string representing the predicate for the SPO filter

    returns

    adjusted transformer

  53. def setReliability(reliability: Map[String, Double]): DaSimEstimator.this.type

    specify the reliability of each feature manually; this parameter weights the relevance of a feature's similarity based on its reliability. It is possible that the reliability is known, for example that certain data might be influenced by fake news or is rarely updated; if the value is not given, it is considered to be equally distributed

    returns

    adjusted transformer

  54. def setSimilarityCalculationExecutionOrder(similarityCalculationExecutionOrder: Array[String]): DaSimEstimator.this.type

    Execution order of similarity scores; here you can specify the order in which the similarity values are computed

    returns

    adjusted transformer

  55. def setSimilarityValueStreching(valueStreching: Boolean): DaSimEstimator.this.type

    Normalize similarity scores per feature; this parameter stretches/norms the feature-dedicated similarity scores so that they all range from zero to one

    returns

    adjusted transformer

  56. def setSparqlFilter(sparqlFilter: String): DaSimEstimator.this.type

    candidate filtering via SPARQL; with this parameter you can reduce the list of candidates by use of a SPARQL query

    sparqlFilter

    SPARQL filter applied on top of the input KG

    returns

    adjusted transformer

  57. def setVerbose(verbose: Boolean): DaSimEstimator.this.type
  58. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  59. def toString(): String
    Definition Classes
    AnyRef → Any
  60. def transform(dataset: Dataset[Triple]): DataFrame

    transforms the KG to a similarity score dataframe based on the configured parameters; the overall method encapsulating the others, intended to be used from outside

    dataset

    knowledge graph

    returns

    dataframe with the resulting similarity scores as a metagraph

  61. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  62. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  63. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
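
The members above can be combined into a single pipeline via transform. The sketch below is a hypothetical usage example, not taken from this page: only the setter and transform signatures come from the documentation, while the SparkSession setup, the package import, the way the Dataset[Triple] is obtained, and all filter values and feature names are assumptions.

```scala
// Hypothetical end-to-end sketch of configuring and running a DaSimEstimator.
// The import path is an assumption; adjust it to where DaSimEstimator lives
// in your build.
import org.apache.jena.graph.Triple
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object DaSimExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("DaSimExample").getOrCreate()

    // Assumed: a Dataset[Triple] holding the input knowledge graph,
    // e.g. loaded through an RDF reader (loading code omitted here).
    val triples: Dataset[Triple] = ???

    val estimator = new DaSimEstimator()
      .setObjectFilter("http://example.org/Movie")       // hypothetical SPO object filter
      .setDistSimFeatureExtractionMethod("os")           // default documented above
      .setDistSimThreshold(0.5)                          // prefilter candidate pairs
      .setImportance(Map("genre" -> 0.7, "year" -> 0.3)) // hypothetical feature names
      .setSimilarityValueStreching(true)                 // stretch scores to [0, 1]
      .setVerbose(false)

    // Overall entry point: KG in, weighted similarity score dataframe out.
    val simScores: DataFrame = estimator.transform(triples)
    simScores.show()
  }
}
```

Each setter returns `DaSimEstimator.this.type`, which is what allows the chained-builder style shown above.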

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated @deprecated
    Deprecated

    (Since version ) see corresponding Javadoc for more information.
