
package splits


Type Members

  1. class CategoricalSplit extends Split

    Split based on inclusion in a set

  2. class NoSplit extends Split

    If no split was found

  3. class RealSplit extends Split

    Split based on a real value in the index position

  4. trait Split extends Serializable

    Splits are used by decision trees to partition the input space
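    The relationship between these types can be illustrated with a minimal sketch. Every name and signature below (SketchSplit, turnLeft, index, pivot, includeSet) is a hypothetical stand-in chosen for illustration and is not taken from the package itself; the sketch only assumes that a split decides whether an input row goes to the left or right child.

```scala
// Hypothetical sketch: names and signatures are illustrative assumptions,
// not the package's actual API.
sealed trait SketchSplit extends Serializable {
  /** True if the input row should be sent to the left child. */
  def turnLeft(input: Vector[Any]): Boolean
}

/** Real-valued split: go left when the feature at `index` is at most `pivot`. */
final case class SketchRealSplit(index: Int, pivot: Double) extends SketchSplit {
  def turnLeft(input: Vector[Any]): Boolean = input(index) match {
    case x: Double => x <= pivot
    case _         => false // non-numeric values never go left in this sketch
  }
}

/** Categorical split: go left when the category at `index` is in `includeSet`. */
final case class SketchCategoricalSplit(index: Int, includeSet: Set[String]) extends SketchSplit {
  def turnLeft(input: Vector[Any]): Boolean = input(index) match {
    case c: String => includeSet.contains(c)
    case _         => false // non-categorical values never go left in this sketch
  }
}

/** No usable split was found, so all rows stay together. */
case object SketchNoSplit extends SketchSplit {
  def turnLeft(input: Vector[Any]): Boolean = true
}
```

    For example, SketchRealSplit(0, 1.5) sends a row whose first feature is 1.0 to the left and a row whose first feature is 2.0 to the right.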

Value Members

  1. object ClassificationSplitter

    Find the best split for classification problems.

    Created by maxhutch on 12/2/16.

  2. object MultiTaskSplitter

    Created by maxhutch on 11/29/16.

  3. object RegressionSplitter

    Find the best split for regression problems.

    The best split is the one that minimizes the total weighted variance:

        totalVariance = N_left * \sigma_left^2 + N_right * \sigma_right^2

    which, in scala-ish, would be:

        totalVariance = leftWeight  * (leftSquareSum  / leftWeight  - (leftSum  / leftWeight )^2)
                      + rightWeight * (rightSquareSum / rightWeight - (rightSum / rightWeight)^2)

    Because we are only comparing candidate splits, we can subtract off leftSquareSum + rightSquareSum, which is the same for every split. After some simplification this yields:

        totalVariance = -leftSum * leftSum / leftWeight - Math.pow(totalSum - leftSum, 2) / (totalWeight - leftWeight)

    which depends only on updates to leftSum and leftWeight (since totalSum and totalWeight are constant).

    Created by maxhutch on 11/29/16.
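    The incremental update described for RegressionSplitter can be sketched as a single pass over one feature. This is a minimal illustration under stated assumptions, not the library's implementation: the object VarianceScanSketch, the method bestPivot, the (featureValue, label, weight) row layout, and the choice of the midpoint as pivot are all hypothetical. Weights are assumed to be strictly positive, and the sums are taken to be weighted (leftSum accumulates label * weight).

```scala
// Hypothetical sketch of the variance scan; not the library's implementation.
object VarianceScanSketch {
  /**
    * Scan one feature for the pivot that minimizes the reduced score
    *   -leftSum^2 / leftWeight - (totalSum - leftSum)^2 / (totalWeight - leftWeight).
    * Each row is (featureValue, label, weight), with weight > 0 assumed.
    * Returns Some((pivot, score)), or None if no pivot separates the rows.
    */
  def bestPivot(rows: Seq[(Double, Double, Double)]): Option[(Double, Double)] = {
    val sorted = rows.sortBy(_._1).toVector
    val totalSum = sorted.map { case (_, y, w) => y * w }.sum
    val totalWeight = sorted.map(_._3).sum

    var leftSum = 0.0
    var leftWeight = 0.0
    var best: Option[(Double, Double)] = None

    // Move rows from the right side to the left one at a time,
    // updating only leftSum and leftWeight.
    for (i <- 0 until sorted.length - 1) {
      val (x, y, w) = sorted(i)
      leftSum += y * w
      leftWeight += w
      val nextX = sorted(i + 1)._1
      // A pivot can only sit between two distinct feature values.
      if (nextX > x) {
        val score = -leftSum * leftSum / leftWeight -
          Math.pow(totalSum - leftSum, 2) / (totalWeight - leftWeight)
        // Keep the candidate with the lowest score seen so far.
        if (best.forall(_._2 > score)) best = Some(((x + nextX) / 2.0, score))
      }
    }
    best
  }
}
```

    Because each step only adds one row's contribution to leftSum and leftWeight, scoring all candidate pivots for a feature costs one sort plus one linear pass.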
