
object DistADUtil

This object gathers all the utilities needed for distributed anomaly detection

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val LOG: Logger
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def calculateBiSectingKmeanClustering(data: DataFrame, numberOfClusters: Int): DataFrame

    Runs bisecting k-means clustering on a given DataFrame

    data

    the given DataFrame

    numberOfClusters

    the number of clusters

    returns

    a DataFrame containing the cluster id for each data point

  7. def calculateBiSectingKmeanClustering(data: RDD[Triple], numberOfClusters: Int): DataFrame

    Runs bisecting k-means clustering on a given RDD[Triple]

    data

    the given RDD[Triple]

    numberOfClusters

    the number of clusters

    returns

    a DataFrame containing the cluster id for each data point

  8. def calculateMinHashLSHClustering(partialDataRDD: RDD[Triple], originalData: RDD[Triple], config: DistADConfig): DataFrame
  9. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @IntrinsicCandidate()
  10. val convertStringToDouble: UserDefinedFunction

    A UDF for converting numeric strings to double

  11. def createDF(data: RDD[Triple]): DataFrame

    Gets an RDD[Triple] and converts it to a DataFrame

    data

    the given RDD[Triple]

    returns

    a DataFrame with columns s, p, o

  12. def createDFWithConversion(data: RDD[Triple]): DataFrame

    Gets an RDD[Triple] and converts it to a DataFrame, converting numeric strings to Double

    data

    the given RDD[Triple]

    returns

    a DataFrame with columns s, p, o

  13. def createSpark(): SparkSession

    Creates a Spark session and returns it

  14. def detectNumberOfClusters(data: DataFrame, percentage: Double): Int

    Samples the data and runs clustering with different values of K, then selects the K with the highest Silhouette score

    data

    the given dataframe for clustering

    percentage

    the percentage for sampling

    returns

    the optimal K for clustering

  15. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  17. def filterAllTriplesWhichAtLeastHaveOneNumericLiterals(originalDataRDD: RDD[Triple], onlyLiteralDataRDD: RDD[Triple]): RDD[Triple]
  18. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @IntrinsicCandidate()
  19. def getLocalName(x: Node): String

    Gets a Node and returns its local name

    x

    the given Node

    returns

    the local name
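
    As an illustration, the local-name logic can be sketched on plain URI strings (the actual method takes a Jena Node; the helper name and the '#'/'/' splitting rule here are assumptions, not the actual implementation):

    ```scala
    // Hypothetical sketch: the local name is the fragment after '#',
    // or else the last path segment of the URI.
    def localName(uri: String): String = {
      val hashIdx = uri.lastIndexOf('#')
      val cut = if (hashIdx >= 0) hashIdx else uri.lastIndexOf('/')
      if (cut >= 0) uri.substring(cut + 1) else uri
    }
    ```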

  20. def getNumber(a: String): Double

    Gets a literal string value and extracts the number from it

    a

    the literal string value

    returns

    the number from the literal string, or 0 otherwise
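
    A minimal sketch of this extraction, assuming the number is the first numeric token in the literal and 0 is the fallback (the helper name and regex are hypothetical):

    ```scala
    // Hypothetical sketch: pull the first integer or decimal token out of
    // a literal string, falling back to 0 when none is found.
    def getNumberSketch(a: String): Double = {
      val numberPattern = """-?\d+(\.\d+)?""".r
      numberPattern.findFirstIn(a).map(_.toDouble).getOrElse(0.0)
    }
    ```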

  21. def getOnlyLiteralObjects(nTriplesRDD: RDD[Triple]): RDD[Triple]

    Gets an RDD[Triple] and filters it to keep only literals

    nTriplesRDD

    the given RDD[Triple]

    returns

    a new RDD[Triple] containing only literals

  22. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @IntrinsicCandidate()
  23. def iqr(data: DataFrame, verbose: Boolean, anomalyListSize: Int): DataFrame

    Anomaly detection method based on the interquartile range (IQR)

    data

    a given DataFrame

    verbose

    whether to print additional internal output

    anomalyListSize

    the minimum size a value list must have to be considered in the anomaly detection process

    returns

    a DataFrame
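
    To illustrate the underlying idea, here is a non-distributed sketch of IQR-based outlier detection on a plain sequence (the actual method operates on a Spark DataFrame; the quantile interpolation and the standard 1.5 * IQR fences are assumptions):

    ```scala
    // Hypothetical sketch: flag values outside the 1.5 * IQR fences.
    def iqrOutliers(values: Seq[Double]): Seq[Double] = {
      val sorted = values.sorted
      // Nearest-rank quantile; real implementations may interpolate.
      def quantile(q: Double): Double = sorted(((sorted.size - 1) * q).round.toInt)
      val (q1, q3) = (quantile(0.25), quantile(0.75))
      val iqr = q3 - q1
      val (lower, upper) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
      values.filter(v => v < lower || v > upper)
    }
    ```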

  24. def isAllDigits(x: String): Boolean

    Checks whether the given string contains only digits

    x

    the string value
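
    A plausible one-liner for this check (hypothetical, not the actual implementation; treating the empty string as non-numeric is an assumption):

    ```scala
    // Hypothetical sketch: true only for non-empty, digits-only strings.
    def allDigits(x: String): Boolean = x.nonEmpty && x.forall(_.isDigit)
    ```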

  25. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  26. def isNumeric(x: String): Boolean

    Gets a literal string and decides whether it is a numeric literal

    x

    the literal string
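
    One way such a check could look, assuming RDF literals of the form `"3.14"^^xsd:double` (the suffix-stripping strategy and helper name are assumptions):

    ```scala
    import scala.util.Try

    // Hypothetical sketch: strip any RDF datatype suffix ("^^...") and
    // surrounding quotes, then test whether the remainder parses as a Double.
    def isNumericSketch(x: String): Boolean = {
      val lexical = x.split("\\^\\^").head.stripPrefix("\"").stripSuffix("\"")
      Try(lexical.toDouble).isSuccess
    }
    ```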

  27. def mad(data: DataFrame, verbose: Boolean, anomalyListSize: Int): DataFrame

    Anomaly detection method based on the mean absolute deviation (MAD)

    data

    a given DataFrame

    verbose

    whether to print additional internal output

    anomalyListSize

    the minimum size a value list must have to be considered in the anomaly detection process

    returns

    a DataFrame
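
    A non-distributed sketch of the MAD idea on a plain sequence (the actual method operates on a Spark DataFrame; following the description above this uses the mean absolute deviation, and the threshold parameter is an assumption):

    ```scala
    // Hypothetical sketch: flag values whose deviation from the mean,
    // scaled by the mean absolute deviation, exceeds a threshold.
    def madOutliers(values: Seq[Double], threshold: Double = 3.0): Seq[Double] = {
      val mean = values.sum / values.size
      val mad = values.map(v => math.abs(v - mean)).sum / values.size
      values.filter(v => math.abs(v - mean) / mad > threshold)
    }
    ```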

  28. def merge[A, B](input: List[Map[A, B]]): Map[A, List[B]]
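
    The signature above suggests combining a list of maps into one map from each key to all of its values; a plausible sketch (the grouping strategy is an assumption, not the documented implementation):

    ```scala
    // Hypothetical sketch: flatten all (key, value) pairs, group by key,
    // and collect the values for each key into a list.
    def mergeSketch[A, B](input: List[Map[A, B]]): Map[A, List[B]] =
      input
        .flatten                       // List[(A, B)]
        .groupBy(_._1)                 // Map[A, List[(A, B)]]
        .map { case (k, kvs) => k -> kvs.map(_._2) }
    ```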
  29. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  30. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @IntrinsicCandidate()
  31. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @IntrinsicCandidate()
  32. val objList: List[String]
  33. def propClustering(triplesWithNumericLiteral: RDD[Triple]): RDD[(String, Set[(String, String, Double)])]
  34. def propWithSubject(a: RDD[Triple]): RDD[(String, String)]
  35. def readData(spark: SparkSession, input: String): RDD[Triple]

    Based on the input file extension, reads the file into memory in a distributed manner

    spark

    the Spark session

    input

    the path of the input file

    returns

    RDD[Triple]

  36. def search(a: Double, b: Array[Double]): Boolean

    Gets a number and a list and checks if the list contains the number

    a

    the given number

    b

    the given list

    returns

    true if the number is in the list, false otherwise
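
    Trivially, the containment check can be sketched as follows (whether the original uses a linear or binary scan is not stated, so a linear scan is assumed):

    ```scala
    // Hypothetical sketch: linear containment check over the array.
    def searchSketch(a: Double, b: Array[Double]): Boolean = b.contains(a)
    ```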

  37. def searchEdge(x: String, y: List[String]): Boolean

    Gets a String and a list of Strings and decides whether the list contains the given String

    x

    the given string

    y

    list of strings

  38. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  39. def toString(): String
    Definition Classes
    AnyRef → Any
  40. def triplesWithNumericLit(objLit: RDD[Triple]): RDD[Triple]

    Gets an RDD[Triple] and filters it to keep only numeric literals

    objLit

    the given RDD[Triple]

    returns

    a new RDD[Triple] containing only numeric literals

  41. def triplesWithNumericLitWithTypeIgnoreEndingWithID(data: RDD[Triple]): RDD[Triple]

    Gets an RDD[Triple] and filters only numeric literals based on their data types. It also ignores all predicates that end with "ID".

    data

    the given RDD[Triple]

    returns

    a new RDD[Triple] containing only numeric literals

  42. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  43. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  44. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. def writeAnomaliesToFile(data: List[String], path: String): Unit

    Writes a List[String] to a file at the given path. Handles both HDFS and the local file system

    data

    the data that should be written to a file

    path

    the path of the output file

  46. def writeToFile(path: String, data: DataFrame): Unit

    Writes a DataFrame to a file at the given path. Handles both HDFS and the local file system

    path

    the path of the output file

    data

    the dataframe that should be written to a file

  47. def zscore(data: DataFrame, verbose: Boolean, anomalyListSize: Int): DataFrame

    Anomaly detection method based on the z-score

    data

    a given DataFrame

    verbose

    whether to print additional internal output

    anomalyListSize

    the minimum size a value list must have to be considered in the anomaly detection process

    returns

    a DataFrame
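
    A non-distributed sketch of z-score outlier detection on a plain sequence (the actual method operates on a Spark DataFrame; the population standard deviation and the threshold parameter are assumptions):

    ```scala
    // Hypothetical sketch: flag values whose absolute z-score
    // (deviation from the mean in standard deviations) exceeds a threshold.
    def zscoreOutliers(values: Seq[Double], threshold: Double = 3.0): Seq[Double] = {
      val mean = values.sum / values.size
      val std  = math.sqrt(values.map(v => math.pow(v - mean, 2)).sum / values.size)
      values.filter(v => math.abs(v - mean) / std > threshold)
    }
    ```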

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated

Inherited from AnyRef

Inherited from Any
