class SmartVectorAssembler extends Transformer

This Transformer creates the DataFrame needed by common ML approaches in Spark MLlib. The resulting DataFrame has a features column, which holds a numeric vector for each entity. The other columns are an identifier column, such as the node ID, and an optional label column.
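As a rough usage sketch, the setters documented on this page can be chained before calling transform. The import path, the column names, and the sample rows are assumptions made for illustration and are not taken from this page; how the class behaves on such a tiny input has not been verified.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
// Assumption: package path of the class in the SANSA ML layer; adjust to your build.
import net.sansa_stack.ml.spark.featureExtraction.SmartVectorAssembler

object SmartVectorAssemblerSketch {

  /** Builds a small hypothetical input and runs it through the assembler. */
  def assemble(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Hypothetical input: one row per entity, two numeric features, one label.
    val df = Seq(
      ("http://example.org/m1", 120, 7.5, 1.0),
      ("http://example.org/m2", 95, 6.1, 0.0)
    ).toDF("movie", "runtime", "rating", "label")

    val assembler = new SmartVectorAssembler()
      .setEntityColumn("movie")                     // default would be the first column
      .setLabelColumn("label")                      // optional
      .setFeatureColumns(List("runtime", "rating")) // default: all but entity and label

    assembler.transform(df)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("smart-vector-assembler-sketch")
      .master("local[*]")
      .getOrCreate()
    val assembled = assemble(spark)
    assembled.printSchema()
    assembled.show(false)
    spark.stop()
  }
}
```

Because every setter returns SmartVectorAssembler.this.type, the calls can be chained in any order. The entity, label, and feature setters can each be left out, in which case the defaults described in their entries apply.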

Linear Supertypes
Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new SmartVectorAssembler()

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. var _digitStringStrategy: String
    Attributes
    protected
  6. var _entityColumn: String
    Attributes
    protected
  7. var _featureColumns: List[String]
    Attributes
    protected
  8. var _featureVectorDescription: ListBuffer[String]
  9. var _labelColumn: String
    Attributes
    protected
  10. var _nullDigitReplacement: Int
    Attributes
    protected
  11. var _nullStringReplacement: String
    Attributes
    protected
  12. var _nullTimestampReplacement: Timestamp
    Attributes
    protected
  13. var _numericCollapsingStrategy: String
    Attributes
    protected
  14. var _stringCollapsingStrategy: String
    Attributes
    protected
  15. var _stringIndexerTrainingDfSizeRatio: Double
    Attributes
    protected
  16. var _word2VecMinCount: Int
    Attributes
    protected
  17. var _word2VecSize: Int
    Attributes
    protected
  18. var _word2vecTrainingDfSizeRatio: Double
    Attributes
    protected
  19. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  20. final def clear(param: Param[_]): SmartVectorAssembler.this.type
    Definition Classes
    Params
  21. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  22. def copy(extra: ParamMap): Transformer
    Definition Classes
    SmartVectorAssembler → Transformer → PipelineStage → Params
  23. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  24. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  25. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  26. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  27. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  28. def explainParams(): String
    Definition Classes
    Params
  29. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  30. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  31. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  32. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  33. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  34. def getFeatureVectorDescription(): ListBuffer[String]

    Gets the description of the explainable feature vector.

    returns

    ListBuffer of Strings describing the content at each index of the feature vector

  35. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  36. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  37. def getSemanticTransformerDescription(): RDD[Triple]

    Returns all information about this transformer as a knowledge graph.

    returns

    RDD[Triple] describing the meta information

  38. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  39. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  40. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  41. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  42. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  43. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  44. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  45. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  46. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  47. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  48. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  50. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  51. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  52. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  53. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  54. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  55. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  56. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  57. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  58. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  59. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  60. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  61. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  62. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  63. final def set(paramPair: ParamPair[_]): SmartVectorAssembler.this.type
    Attributes
    protected
    Definition Classes
    Params
  64. final def set(param: String, value: Any): SmartVectorAssembler.this.type
    Attributes
    protected
    Definition Classes
    Params
  65. final def set[T](param: Param[T], value: T): SmartVectorAssembler.this.type
    Definition Classes
    Params
  66. final def setDefault(paramPairs: ParamPair[_]*): SmartVectorAssembler.this.type
    Attributes
    protected
    Definition Classes
    Params
  67. final def setDefault[T](param: Param[T], value: T): SmartVectorAssembler.this.type
    Attributes
    protected
    Definition Classes
    Params
  68. def setDigitStringStrategy(digitStringStrategy: String): SmartVectorAssembler.this.type

    Setter for the strategy used to transform categorical strings to digits. The two options are hash and index.

    digitStringStrategy

    strategy, either hash or index

    returns

    transformer

  69. def setEntityColumn(p: String): SmartVectorAssembler.this.type

    Sets which column represents the entity. If not set, the first column is used.

    p

    entity column name as a string

    returns

    set transformer

  70. def setFeatureColumns(p: List[String]): SmartVectorAssembler.this.type

    Sets which columns represent the features. If not set, all columns except the label and entity columns are used.

    p

    feature column names as a list of strings

    returns

    set transformer

  71. def setLabelColumn(p: String): SmartVectorAssembler.this.type

    Sets which column represents the label. If not set, no label column is used.

    p

    label column name as a string

    returns

    set transformer

  72. def setNullReplacement(datatype: String, value: Any): SmartVectorAssembler.this.type

    Sets the replacement value for null strings or digits.

  73. def setStringIndexerTrainingDfSizeRatio(stringIndexerTrainingDfSizeRatio: Double): SmartVectorAssembler.this.type

    Setter for the fraction of the training data used to train the string indexer.

    stringIndexerTrainingDfSizeRatio

    fraction used when sampling the training data DataFrame

    returns

    transformer

  74. def setWord2VecMinCount(word2VecMinCount: Int): SmartVectorAssembler.this.type

    Setter for the Word2Vec minimum count. Word2Vec is used to replace non-categorical string features.

    word2VecMinCount

    minimum number of word occurrences

    returns

    transformer

  75. def setWord2VecSize(word2vecSize: Int): SmartVectorAssembler.this.type

    Setter for the Word2Vec vector size. Word2Vec is used to replace non-categorical string features.

    word2vecSize

    size of vector

    returns

    transformer

  76. def setWord2vecTrainingDfSizeRatio(word2vecTrainingDfSizeRatio: Double): SmartVectorAssembler.this.type

    Setter for the fraction of the training data used to train the Word2Vec model.

    word2vecTrainingDfSizeRatio

    fraction used when sampling the training data DataFrame

    returns

    transformer

  77. val spark: SparkSession
    Attributes
    protected
  78. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  79. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  80. def transform(dataset: Dataset[_]): DataFrame

    Transforms a DataFrame of query results into numeric feature vectors with an ID column and an optional label column.

    dataset

    DataFrame with columns for ID, features, and an optional label

    returns

    DataFrame with columns for ID, features, and an optional label, where features are numeric vectors that work with MLlib

    Definition Classes
    SmartVectorAssembler → Transformer
  81. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  82. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  83. def transformSchema(schema: StructType): StructType
    Definition Classes
    SmartVectorAssembler → PipelineStage
  84. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  85. val uid: String
    Definition Classes
    SmartVectorAssembler → Identifiable
  86. def validateEntityColumn(cols: Seq[String]): Unit

    Validates the entity column. If it is not set, falls back to the first column. If it is set, checks that it is among the available columns.

    cols

    the available columns

  87. def validateFeatureColumns(cols: Seq[String]): Unit

    Validates the feature columns. If feature columns are set, checks that they are among the available columns and raises an exception if they are not. If they are not set, the feature columns are determined as all columns minus the label and entity columns.

  88. def validateLabelColumn(cols: Seq[String]): Unit

    Validates that the label column is among the available columns.

    cols

    the available columns

  89. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  90. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  91. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
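
The validateEntityColumn, validateFeatureColumns, and validateLabelColumn entries above describe how the entity, feature, and label columns are resolved. A minimal Spark-free sketch of those rules follows. The names ResolvedColumns and ColumnResolutionSketch are hypothetical and are not part of this class. The sketch uses require to signal errors, so the exception type is an assumption; the real class may throw a different one.

```scala
// Hypothetical restatement of the documented column rules; not the class's actual code.
final case class ResolvedColumns(
    entity: String,
    label: Option[String],
    features: List[String])

object ColumnResolutionSketch {

  def resolve(
      cols: Seq[String],
      entity: Option[String] = None,
      label: Option[String] = None,
      features: Option[List[String]] = None): ResolvedColumns = {
    require(cols.nonEmpty, "at least one column is needed")

    // Entity: fall back to the first column when unset, otherwise it must exist.
    val e = entity.getOrElse(cols.head)
    require(cols.contains(e), s"entity column '$e' is not an available column")

    // Label: optional, but it must exist when set.
    label.foreach { l =>
      require(cols.contains(l), s"label column '$l' is not an available column")
    }

    // Features: an explicit list must exist in cols; the default is
    // every column except the entity and label columns.
    val f = features match {
      case Some(fs) =>
        val missing = fs.filterNot(c => cols.contains(c))
        require(missing.isEmpty, s"feature columns not available: ${missing.mkString(", ")}")
        fs
      case None =>
        cols.filterNot(c => c == e || label.contains(c)).toList
    }

    ResolvedColumns(e, label, f)
  }
}
```

For example, calling resolve on the columns movie, genre, runtime, label with only the label set to "label" takes movie as the entity and genre, runtime as the features.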

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated @deprecated
    Deprecated

    (Since version ) see corresponding Javadoc for more information.
