Package org.neo4j.gds.ml.splitting
Class StratifiedKFoldSplitter
java.lang.Object
org.neo4j.gds.ml.splitting.StratifiedKFoldSplitter
Splits a HugeLongArray of nodes into
k NodeSplits, each of which contains a
train set and a test set. Logically, the nodes are first divided into k buckets of nearly equal
size; for each NodeSplit, one bucket is taken as the test set and the remaining buckets are
concatenated into the train set. The split is stratified: if each node is regarded as having
a class given by targets.get(nodeId), then for each distinct class,
each bucket contains roughly the same number of nodes of that class.
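
The bucketing described above can be illustrated with a minimal sketch in plain Java. This is not the GDS implementation: the class StratifiedKFoldSketch, its nested Split type, and the splits(k, ids, targets, seed) signature are hypothetical names chosen for illustration, plain long[] arrays stand in for HugeLongArray, and java.util.function.LongUnaryOperator stands in for the Eclipse Collections LongToLongFunction. The sketch assumes one simple way of stratifying: shuffle the ids, group them by class, and deal each class round-robin into the k buckets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;
import java.util.function.LongUnaryOperator;

// Hypothetical illustration class; not part of org.neo4j.gds.
final class StratifiedKFoldSketch {

    /** One fold: the held-out bucket is the test set, all other buckets form the train set. */
    static final class Split {
        final long[] trainSet;
        final long[] testSet;

        Split(long[] trainSet, long[] testSet) {
            this.trainSet = trainSet;
            this.testSet = testSet;
        }
    }

    static List<Split> splits(int k, long[] ids, LongUnaryOperator targets, long seed) {
        // Shuffle a copy of the ids (Fisher-Yates) so bucket membership depends on the seed.
        long[] shuffled = ids.clone();
        Random random = new Random(seed);
        for (int i = shuffled.length - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            long tmp = shuffled[i];
            shuffled[i] = shuffled[j];
            shuffled[j] = tmp;
        }

        // Group node ids by class; a TreeMap keeps the class order deterministic.
        Map<Long, List<Long>> idsByClass = new TreeMap<>();
        for (long id : shuffled) {
            idsByClass.computeIfAbsent(targets.applyAsLong(id), c -> new ArrayList<>()).add(id);
        }

        // Deal each class round-robin into the k buckets. The cursor carries over
        // between classes so that the total bucket sizes also stay nearly equal.
        List<List<Long>> buckets = new ArrayList<>();
        for (int b = 0; b < k; b++) {
            buckets.add(new ArrayList<>());
        }
        int cursor = 0;
        for (List<Long> classIds : idsByClass.values()) {
            for (long id : classIds) {
                buckets.get(cursor).add(id);
                cursor = (cursor + 1) % k;
            }
        }

        // Fold i uses bucket i as the test set and the concatenation of the others as the train set.
        List<Split> result = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<Long> train = new ArrayList<>();
            for (int b = 0; b < k; b++) {
                if (b != fold) {
                    train.addAll(buckets.get(b));
                }
            }
            result.add(new Split(toArray(train), toArray(buckets.get(fold))));
        }
        return result;
    }

    private static long[] toArray(List<Long> values) {
        return values.stream().mapToLong(Long::longValue).toArray();
    }
}
```

Under this dealing scheme, the per-class counts of any two buckets differ by at most one, and so do the total bucket sizes. For example, with 12 ids, classes given by id % 3, and k = 4, every test set holds exactly one node of each class.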
Constructor Summary
Constructors
StratifiedKFoldSplitter(int k, org.neo4j.gds.core.utils.paged.ReadOnlyHugeLongArray ids, org.eclipse.collections.api.block.function.primitive.LongToLongFunction targets, Optional<Long> randomSeed, SortedSet<Long> distinctInternalTargets)
Method Summary
static org.neo4j.gds.core.utils.mem.MemoryEstimation memoryEstimation(int k, ToLongFunction<org.neo4j.gds.core.GraphDimensions> idsSetSizeExtractor)
static org.neo4j.gds.core.utils.mem.MemoryEstimation memoryEstimationForNodeSet(int k, double trainFraction)
splits()
Constructor Details
StratifiedKFoldSplitter
Method Details
memoryEstimationForNodeSet
public static org.neo4j.gds.core.utils.mem.MemoryEstimation memoryEstimationForNodeSet(int k, double trainFraction)
memoryEstimation
public static org.neo4j.gds.core.utils.mem.MemoryEstimation memoryEstimation(int k, ToLongFunction<org.neo4j.gds.core.GraphDimensions> idsSetSizeExtractor)
splits