Class HugeGraph
- java.lang.Object
-
- org.neo4j.gds.core.huge.HugeGraph
-
- All Implemented Interfaces:
BatchNodeIterable, CSRGraph, Degrees, Graph, IdMap, NodeIterator, PartialIdMap, NodePropertyContainer, RelationshipIterator, RelationshipPredicate, RelationshipProperties
public class HugeGraph extends java.lang.Object implements CSRGraph
HugeGraph contains two array-like data structures.
The adjacency data is stored in a ByteArray, which is a byte[] addressable by long indices and capable of storing about 2^46 (~70k bn) bytes, or 64 TiB. The bytes are stored in byte[] pages of 32 KiB each.
The data is in the format:
degree ~ targetId1 ~ targetId2 ~ targetIdn
The degree is stored as a full-sized, 4-byte int (the neo kernel api returns an int for Nodes.countAll(org.neo4j.internal.kernel.api.NodeCursor)). The target IDs are first sorted, then delta encoded, and finally written as variable-length vlongs. Delta encoding does not write the actual value but only the difference from the previous value, which works well with the vlong encoding because small differences need fewer bytes.
The second data structure is a LongArray, which is a long[] addressable by longs and capable of storing about 2^43 (~9k bn) longs, or 64 TiB worth of 64-bit longs. Each entry is the offset into the aforementioned adjacency array; the index is the respective source node id.
To traverse all nodes, first read the offset from the LongArray, then read 4 bytes from the ByteArray, starting at that offset, as the degree, then read degree vlongs as target ids.
Reading the degree from the offset position not only does not require the offset array to be sorted but also allows the adjacency array to be sparse. This fact is used during the import: each thread pre-allocates a local chunk of some pages (512 KiB) and gives access to this data during import. Synchronization between threads only has to happen when a new chunk has to be pre-allocated. This is similar to what most garbage collectors do with TLAB allocations.
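The layout described above can be illustrated with a small, self-contained sketch. It is not the HugeGraph implementation: the class name AdjacencyEncodingSketch, the plain byte[] in place of the paged ByteArray, and the big-endian byte order of the degree are all assumptions made for illustration. It writes a 4-byte degree followed by sorted, delta-encoded vlong targets, and reads them back starting from an offset such as the one the LongArray would supply.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Toy model of the "degree ~ targetId1 ~ ... ~ targetIdn" layout (hypothetical class). */
class AdjacencyEncodingSketch {

    /** Encodes one adjacency list: 4-byte degree, then sorted, delta-encoded vlong targets. */
    static byte[] encode(long[] targets) {
        long[] sorted = targets.clone();
        Arrays.sort(sorted);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int degree = sorted.length;
        // Degree as a full 4-byte int; big-endian order is an assumption of this sketch.
        out.write(degree >>> 24);
        out.write(degree >>> 16);
        out.write(degree >>> 8);
        out.write(degree);
        long previous = 0L;
        for (long target : sorted) {
            long delta = target - previous; // small when sorted neighbours are close
            previous = target;
            // vlong: 7 payload bits per byte, high bit set while more bytes follow.
            while ((delta & ~0x7FL) != 0L) {
                out.write((int) ((delta & 0x7FL) | 0x80L));
                delta >>>= 7;
            }
            out.write((int) delta);
        }
        return out.toByteArray();
    }

    /** Decodes the adjacency list that starts at the given offset. */
    static long[] decode(byte[] adjacency, int offset) {
        int degree = ((adjacency[offset] & 0xFF) << 24)
            | ((adjacency[offset + 1] & 0xFF) << 16)
            | ((adjacency[offset + 2] & 0xFF) << 8)
            | (adjacency[offset + 3] & 0xFF);
        int pos = offset + 4;
        long[] targets = new long[degree];
        long previous = 0L;
        for (int i = 0; i < degree; i++) {
            long delta = 0L;
            int shift = 0;
            byte b;
            do {
                b = adjacency[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta; // undo the delta encoding
            targets[i] = previous;
        }
        return targets;
    }

    public static void main(String[] args) {
        byte[] bytes = encode(new long[] {1_000_042L, 1_000_000L, 1_000_007L});
        System.out.println(bytes.length + " bytes");
        System.out.println(Arrays.toString(decode(bytes, 0)));
    }
}
```

In this example the three targets take 9 bytes (4 for the degree, 3 for the first target, 1 for each delta) instead of the 24 bytes that three raw 64-bit longs would need.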
- See Also:
- more about vlong, more about TLAB allocation
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.neo4j.gds.api.BatchNodeIterable
BatchNodeIterable.BitSetIdIterator, BatchNodeIterable.IdIterable, BatchNodeIterable.IdIterator
-
Nested classes/interfaces inherited from interface org.neo4j.gds.api.IdMap
IdMap.NodeLabelConsumer
-
-
Field Summary
Fields
Modifier and Type, Field:
protected AdjacencyList adjacency
protected boolean hasRelationshipProperty
protected IdMap idMap
protected boolean isMultiGraph
protected java.util.Map<java.lang.String,NodePropertyValues> nodeProperties
protected org.neo4j.gds.Orientation orientation
protected @Nullable AdjacencyProperties properties
protected long relationshipCount
protected org.neo4j.gds.api.schema.GraphSchema schema
-
Fields inherited from interface org.neo4j.gds.api.IdMap
NOT_FOUND, START_NODE_ID
-
-
Constructor Summary
Constructors
Modifier, Constructor:
protected HugeGraph(IdMap idMap, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodePropertyValues> nodeProperties, long relationshipCount, @NotNull AdjacencyList adjacency, boolean hasRelationshipProperty, double defaultRelationshipPropertyValue, @Nullable AdjacencyProperties relationshipProperty, org.neo4j.gds.Orientation orientation, boolean isMultiGraph)
-
Method Summary
Modifier and Type, Method, Description:
java.util.Optional<NodeFilteredGraph> asNodeFilteredGraph()
  If this graph is created using a node label filter, this will return a NodeFilteredGraph that represents the node set used in this graph.
java.util.Set<org.neo4j.gds.NodeLabel> availableNodeLabels()
java.util.Set<java.lang.String> availableNodeProperties()
java.util.Collection<PrimitiveLongIterable> batchIterables(long batchSize)
void canRelease(boolean canRelease)
HugeGraph concurrentCopy()
boolean contains(long originalNodeId)
  Returns true iff the neo4jNodeId is mapped, otherwise false.
static HugeGraph create(IdMap nodes, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodePropertyValues> nodeProperties, Relationships.Topology topology, java.util.Optional<Relationships.Properties> maybeRelationshipProperty)
int degree(long node)
int degreeWithoutParallelRelationships(long nodeId)
  Much slower than just degree() because it may have to look up all relationships.
boolean exists(long sourceNodeId, long targetNodeId)
  O(n) !
void forEachNode(java.util.function.LongPredicate consumer)
  Iterate over each nodeId
void forEachNodeLabel(long mappedNodeId, IdMap.NodeLabelConsumer consumer)
void forEachRelationship(long nodeId, double fallbackValue, RelationshipWithPropertyConsumer consumer)
  Calls the given consumer function for every relationship of a given node.
void forEachRelationship(long nodeId, RelationshipConsumer consumer)
  Calls the given consumer function for every relationship of a given node.
boolean hasLabel(long mappedNodeId, org.neo4j.gds.NodeLabel label)
boolean hasRelationshipProperty()
long highestNeoId()
IdMap idMap()
boolean isMultiGraph()
  Whether the graph is guaranteed to have no parallel relationships.
long nodeCount()
  Number of mapped nodeIds.
java.util.PrimitiveIterator.OfLong nodeIterator()
java.util.PrimitiveIterator.OfLong nodeIterator(java.util.Set<org.neo4j.gds.NodeLabel> labels)
java.util.List<org.neo4j.gds.NodeLabel> nodeLabels(long mappedNodeId)
java.util.Map<java.lang.String,NodePropertyValues> nodeProperties()
NodePropertyValues nodeProperties(java.lang.String propertyKey)
  Return the property values for a property key. NOTE: Avoid using this on the hot path; favor caching the NodeProperties object when possible.
long nthTarget(long nodeId, int offset)
  Get the n-th target node id for a given sourceNodeId.
long relationshipCount()
double relationshipProperty(long sourceNodeId, long targetNodeId)
  Returns the property value for a relationship defined by its source and target nodes.
double relationshipProperty(long sourceId, long targetId, double fallbackValue)
  Returns the value of the property on the relationship between the source and target node ids.
Relationships relationships()
java.util.Map<org.neo4j.gds.RelationshipType,Relationships.Topology> relationshipTopologies()
Relationships.Topology relationshipTopology()
Graph relationshipTypeFilteredGraph(java.util.Set<org.neo4j.gds.RelationshipType> relationshipTypes)
void releaseProperties()
  Release only the properties associated with that graph.
void releaseTopology()
  Release only the topological data associated with that graph.
IdMap rootIdMap()
  Returns the original node mapping if the current node mapping is filtered, otherwise it returns itself.
java.util.OptionalLong rootNodeCount()
  Number of mapped node ids in the root mapping.
org.neo4j.gds.api.schema.GraphSchema schema()
java.util.stream.Stream<RelationshipCursor> streamRelationships(long nodeId, double fallbackValue)
long toMappedNodeId(long originalNodeId)
  Maps an original node id to a mapped node id.
long toOriginalNodeId(long mappedNodeId)
  Map mapped nodeId back to neo4j nodeId
long toRootNodeId(long mappedNodeId)
  Maps a filtered mapped node id to its root mapped node id.
java.util.Optional<? extends FilteredIdMap> withFilteredLabels(java.util.Collection<org.neo4j.gds.NodeLabel> nodeLabels, int concurrency)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.neo4j.gds.api.IdMap
safeToMappedNodeId
-
-
-
-
Field Detail
-
idMap
protected final IdMap idMap
-
schema
protected final org.neo4j.gds.api.schema.GraphSchema schema
-
nodeProperties
protected final java.util.Map<java.lang.String,NodePropertyValues> nodeProperties
-
orientation
protected final org.neo4j.gds.Orientation orientation
-
relationshipCount
protected final long relationshipCount
-
adjacency
protected AdjacencyList adjacency
-
properties
@Nullable protected AdjacencyProperties properties
-
hasRelationshipProperty
protected final boolean hasRelationshipProperty
-
isMultiGraph
protected final boolean isMultiGraph
-
-
Constructor Detail
-
HugeGraph
protected HugeGraph(IdMap idMap, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodePropertyValues> nodeProperties, long relationshipCount, @NotNull AdjacencyList adjacency, boolean hasRelationshipProperty, double defaultRelationshipPropertyValue, @Nullable AdjacencyProperties relationshipProperty, org.neo4j.gds.Orientation orientation, boolean isMultiGraph)
-
-
Method Detail
-
create
public static HugeGraph create(IdMap nodes, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodePropertyValues> nodeProperties, Relationships.Topology topology, java.util.Optional<Relationships.Properties> maybeRelationshipProperty)
-
nodeCount
public long nodeCount()
Description copied from interface: IdMap
Number of mapped nodeIds.
-
rootNodeCount
public java.util.OptionalLong rootNodeCount()
Description copied from interface: PartialIdMap
Number of mapped node ids in the root mapping. This is necessary for nested (filtered) id mappings.
- Specified by:
rootNodeCount in interface PartialIdMap
-
highestNeoId
public long highestNeoId()
- Specified by:
highestNeoId in interface IdMap
-
idMap
public IdMap idMap()
-
rootIdMap
public IdMap rootIdMap()
Description copied from interface: IdMap
Returns the original node mapping if the current node mapping is filtered, otherwise it returns itself.
-
nodeProperties
public java.util.Map<java.lang.String,NodePropertyValues> nodeProperties()
-
relationshipCount
public long relationshipCount()
- Specified by:
relationshipCount in interface Graph
- Returns:
- the total number of relationships in the graph.
-
batchIterables
public java.util.Collection<PrimitiveLongIterable> batchIterables(long batchSize)
- Specified by:
batchIterables in interface BatchNodeIterable
- Returns:
- a collection of iterables over every node, partitioned by the given batch size.
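As a rough illustration of what "partitioned by the given batch size" means, the sketch below splits the node id range [0, nodeCount) into consecutive batches. It is a simplified model under stated assumptions: the class BatchPartitionSketch is hypothetical, node ids are assumed to be dense and zero-based, and each batch is represented as a {start, endExclusive} pair rather than as a PrimitiveLongIterable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that models batching over a dense node id range. */
class BatchPartitionSketch {

    /** Returns {start, endExclusive} pairs covering [0, nodeCount) in steps of batchSize. */
    static List<long[]> batches(long nodeCount, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long start = 0; start < nodeCount; start += batchSize) {
            // The last batch may be shorter than batchSize.
            result.add(new long[] {start, Math.min(start + batchSize, nodeCount)});
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] batch : batches(10, 4)) {
            System.out.println("[" + batch[0] + ", " + batch[1] + ")");
        }
    }
}
```

Each batch can then be handed to a separate worker, which is the usual reason for partitioning node iteration this way.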
-
forEachNode
public void forEachNode(java.util.function.LongPredicate consumer)
Description copied from interface: NodeIterator
Iterate over each nodeId
- Specified by:
forEachNode in interface NodeIterator
-
nodeIterator
public java.util.PrimitiveIterator.OfLong nodeIterator()
- Specified by:
nodeIterator in interface NodeIterator
-
nodeIterator
public java.util.PrimitiveIterator.OfLong nodeIterator(java.util.Set<org.neo4j.gds.NodeLabel> labels)
- Specified by:
nodeIterator in interface NodeIterator
-
relationshipProperty
public double relationshipProperty(long sourceNodeId, long targetNodeId)
Description copied from interface: RelationshipProperties
Returns the property value for a relationship defined by its source and target nodes.
- Specified by:
relationshipProperty in interface RelationshipProperties
-
relationshipProperty
public double relationshipProperty(long sourceId, long targetId, double fallbackValue)
Description copied from interface: RelationshipProperties
Returns the value of the property on the relationship between the source and target node ids.
- Specified by:
relationshipProperty in interface RelationshipProperties
- Parameters:
sourceId - source node
targetId - target node
fallbackValue - value to use if relationship has no property value
- Returns:
- the property value
-
nodeProperties
public NodePropertyValues nodeProperties(java.lang.String propertyKey)
Description copied from interface: NodePropertyContainer
Return the property values for a property key. NOTE: Avoid using this on the hot path; favor caching the NodeProperties object when possible.
- Specified by:
nodeProperties in interface NodePropertyContainer
- Parameters:
propertyKey - the node property key
- Returns:
- the values associated with that key
-
availableNodeProperties
public java.util.Set<java.lang.String> availableNodeProperties()
- Specified by:
availableNodeProperties in interface NodePropertyContainer
-
forEachRelationship
public void forEachRelationship(long nodeId, RelationshipConsumer consumer)
Description copied from interface: RelationshipIterator
Calls the given consumer function for every relationship of a given node.
- Specified by:
forEachRelationship in interface RelationshipIterator
- Parameters:
nodeId - id of the node for which to iterate relationships
consumer - relationship consumer function
-
forEachRelationship
public void forEachRelationship(long nodeId, double fallbackValue, RelationshipWithPropertyConsumer consumer)
Description copied from interface: RelationshipIterator
Calls the given consumer function for every relationship of a given node. If the graph was loaded with a relationship property, the property value of the relationship will be passed into the consumer. Otherwise the given fallback value will be used.
- Specified by:
forEachRelationship in interface RelationshipIterator
- Parameters:
nodeId - id of the node for which to iterate relationships
fallbackValue - value used as relationship property if no properties were loaded
consumer - relationship consumer function
-
streamRelationships
public java.util.stream.Stream<RelationshipCursor> streamRelationships(long nodeId, double fallbackValue)
- Specified by:
streamRelationships in interface RelationshipIterator
-
relationshipTypeFilteredGraph
public Graph relationshipTypeFilteredGraph(java.util.Set<org.neo4j.gds.RelationshipType> relationshipTypes)
- Specified by:
relationshipTypeFilteredGraph in interface Graph
-
relationshipTopologies
public java.util.Map<org.neo4j.gds.RelationshipType,Relationships.Topology> relationshipTopologies()
- Specified by:
relationshipTopologies in interface CSRGraph
-
relationshipTopology
public Relationships.Topology relationshipTopology()
-
degreeWithoutParallelRelationships
public int degreeWithoutParallelRelationships(long nodeId)
Description copied from interface: Degrees
Much slower than just degree() because it may have to look up all relationships. This is not thread-safe, so if this is called concurrently please use RelationshipIterator.concurrentCopy().
- Specified by:
degreeWithoutParallelRelationships in interface Degrees
- See Also:
Graph.isMultiGraph()
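To illustrate why counting without parallel relationships can be much slower than degree(), the following sketch counts distinct targets in a sorted target list. It is a toy model, not the library's implementation: the class DistinctDegreeSketch is hypothetical, and it assumes the targets are already decoded into a sorted long[] with non-negative ids. A plain degree can be read from the stored count, whereas this count has to visit every target.

```java
/** Hypothetical helper that counts distinct targets in a sorted adjacency list. */
class DistinctDegreeSketch {

    /** Counts each target once, even if it occurs several times (parallel relationships). */
    static int degreeWithoutParallel(long[] sortedTargets) {
        int distinct = 0;
        long previous = -1L; // assumes node ids are non-negative
        for (long target : sortedTargets) {
            if (target != previous) {
                distinct++;
                previous = target;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        long[] targets = {3L, 3L, 5L, 8L, 8L, 8L};
        System.out.println("degree = " + targets.length);
        System.out.println("degree without parallel relationships = " + degreeWithoutParallel(targets));
    }
}
```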
-
toMappedNodeId
public long toMappedNodeId(long originalNodeId)
Description copied from interface: PartialIdMap
Maps an original node id to a mapped node id. In case of nested id maps, the mapped node id is always in the space of the innermost mapping.
- Specified by:
toMappedNodeId in interface PartialIdMap
- Parameters:
originalNodeId - must be smaller or equal to the id returned by IdMap.highestNeoId()
-
toOriginalNodeId
public long toOriginalNodeId(long mappedNodeId)
Description copied from interface: IdMap
Map mapped nodeId back to neo4j nodeId
- Specified by:
toOriginalNodeId in interface IdMap
-
toRootNodeId
public long toRootNodeId(long mappedNodeId)
Description copied from interface: IdMap
Maps a filtered mapped node id to its root mapped node id. This is necessary for nested (filtered) id mappings. If this mapping is a nested mapping, this method returns the root mapped node id of the parent mapping. For the root mapping this method returns the given node id.
- Specified by:
toRootNodeId in interface IdMap
-
contains
public boolean contains(long originalNodeId)
Description copied from interface: IdMap
Returns true iff the neo4jNodeId is mapped, otherwise false.
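The relationship between toMappedNodeId, toOriginalNodeId and contains can be pictured with a minimal two-way mapping. This is a sketch under assumptions, not the IdMap implementation: the class IdMapSketch is hypothetical, it uses a HashMap and a plain array rather than the library's data structures, and the value -1 for NOT_FOUND is assumed here for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-way mapping between original ids and dense mapped ids. */
class IdMapSketch {
    static final long NOT_FOUND = -1L; // assumed sentinel for unmapped ids

    private final long[] mappedToOriginal;
    private final Map<Long, Long> originalToMapped = new HashMap<>();

    /** Assigns mapped ids 0..n-1 in the order the original ids are given. */
    IdMapSketch(long[] originalIds) {
        mappedToOriginal = originalIds.clone();
        for (int i = 0; i < originalIds.length; i++) {
            originalToMapped.put(originalIds[i], (long) i);
        }
    }

    long toMappedNodeId(long originalNodeId) {
        return originalToMapped.getOrDefault(originalNodeId, NOT_FOUND);
    }

    long toOriginalNodeId(long mappedNodeId) {
        return mappedToOriginal[(int) mappedNodeId];
    }

    boolean contains(long originalNodeId) {
        return originalToMapped.containsKey(originalNodeId);
    }

    long nodeCount() {
        return mappedToOriginal.length;
    }

    public static void main(String[] args) {
        IdMapSketch ids = new IdMapSketch(new long[] {42L, 7L, 1337L});
        System.out.println(ids.toMappedNodeId(7L));
        System.out.println(ids.toOriginalNodeId(2L));
        System.out.println(ids.contains(8L));
    }
}
```

Algorithms work on the dense mapped ids, and results are translated back with toOriginalNodeId only when they are reported.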
-
concurrentCopy
public HugeGraph concurrentCopy()
- Specified by:
concurrentCopy in interface CSRGraph
- Specified by:
concurrentCopy in interface Graph
- Specified by:
concurrentCopy in interface RelationshipIterator
- Returns:
- a copy of this iterator that reuses new cursors internally, so that iterations happen independently of other iterations.
-
asNodeFilteredGraph
public java.util.Optional<NodeFilteredGraph> asNodeFilteredGraph()
Description copied from interface: Graph
If this graph is created using a node label filter, this will return a NodeFilteredGraph that represents the node set used in this graph. Be aware that it is not guaranteed to contain all relationships of the graph. Otherwise, it will return an empty Optional.
- Specified by:
asNodeFilteredGraph in interface Graph
-
exists
public boolean exists(long sourceNodeId, long targetNodeId)
O(n) !
- Specified by:
exists in interface RelationshipPredicate
-
nthTarget
public long nthTarget(long nodeId, int offset)
Description copied from interface: Graph
Get the n-th target node id for a given sourceNodeId. The order of the targets is not defined and depends on the implementation of the graph, but it is consistent across separate calls to this method on the same graph. The sourceNodeId must be a node id existing in the graph. The offset parameter is 0-indexed and must not be negative. If offset is greater than or equal to the number of targets for sourceNodeId, -1 is returned. It is undefined behavior if the sourceNodeId does not exist in the graph or the offset is negative.
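The offset contract can be modelled over a plain array of targets. This is an illustrative sketch, not the HugeGraph code: the class NthTargetSketch is hypothetical, it takes the target list directly instead of a node id, and it chooses to throw on a negative offset, a case the documentation leaves undefined.

```java
/** Hypothetical model of the nthTarget offset contract over a plain target array. */
class NthTargetSketch {

    /** Returns the target at the 0-indexed offset, or -1 if there is no such target. */
    static long nthTarget(long[] targets, int offset) {
        if (offset < 0) {
            // Undefined in the documented contract; this sketch rejects it explicitly.
            throw new IllegalArgumentException("offset must not be negative");
        }
        return offset < targets.length ? targets[offset] : -1L;
    }

    public static void main(String[] args) {
        long[] targets = {4L, 9L, 12L};
        System.out.println(nthTarget(targets, 0));
        System.out.println(nthTarget(targets, 2));
        System.out.println(nthTarget(targets, 3));
    }
}
```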
-
canRelease
public void canRelease(boolean canRelease)
- Specified by:
canRelease in interface Graph
-
releaseTopology
public void releaseTopology()
Description copied from interface: Graph
Release only the topological data associated with that graph.
- Specified by:
releaseTopology in interface Graph
-
releaseProperties
public void releaseProperties()
Description copied from interface: Graph
Release only the properties associated with that graph.
- Specified by:
releaseProperties in interface Graph
-
isMultiGraph
public boolean isMultiGraph()
Description copied from interface: Graph
Whether the graph is guaranteed to have no parallel relationships. If this returns false it still may be parallel-free, but we do not know.
- Specified by:
isMultiGraph in interface Graph
- Returns:
true iff the graph has at most one relationship between each pair of nodes.
-
relationships
public Relationships relationships()
-
hasRelationshipProperty
public boolean hasRelationshipProperty()
- Specified by:
hasRelationshipProperty in interface Graph
-
nodeLabels
public java.util.List<org.neo4j.gds.NodeLabel> nodeLabels(long mappedNodeId)
- Specified by:
nodeLabels in interface IdMap
-
forEachNodeLabel
public void forEachNodeLabel(long mappedNodeId, IdMap.NodeLabelConsumer consumer)
- Specified by:
forEachNodeLabel in interface IdMap
-
availableNodeLabels
public java.util.Set<org.neo4j.gds.NodeLabel> availableNodeLabels()
- Specified by:
availableNodeLabels in interface IdMap
-
hasLabel
public boolean hasLabel(long mappedNodeId, org.neo4j.gds.NodeLabel label)
-
withFilteredLabels
public java.util.Optional<? extends FilteredIdMap> withFilteredLabels(java.util.Collection<org.neo4j.gds.NodeLabel> nodeLabels, int concurrency)
- Specified by:
withFilteredLabels in interface IdMap
-
-