Design principles#
The key design principle of weather-model-graphs is to work with networkx.DiGraph objects as the primary data structure for the graph representation right until the graph is to be stored on disk in a specific format.
Using only networkx.DiGraph objects as the intermediate representations makes it possible to:
- easily modularise the whole generation process, with every step outputting a networkx.DiGraph object,
- easily visualise the graph resulting from any step, and
- easily connect graph nodes across graph components, combine graphs and split graphs based on node and edge attributes.
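As a minimal sketch of this idea, the example below uses only networkx (not weather-model-graphs itself). The helper names make_grid_nodes, make_mesh_nodes and connect_nearest, the toy coordinates and the nearest-neighbour rule are all assumptions made for illustration. Each step returns a plain networkx.DiGraph, so the results can be combined and connected using node attributes alone.

```python
import math

import networkx as nx


def make_grid_nodes(n_x=4, n_y=4):
    """Hypothetical step: return a DiGraph holding only grid nodes."""
    g = nx.DiGraph()
    for i in range(n_x):
        for j in range(n_y):
            g.add_node(("grid", i, j), pos=(float(i), float(j)), type="grid")
    return g


def make_mesh_nodes(n=2, spacing=2.0):
    """Hypothetical step: return a DiGraph holding only (coarser) mesh nodes."""
    g = nx.DiGraph()
    for i in range(n):
        for j in range(n):
            pos = (0.5 + i * spacing, 0.5 + j * spacing)
            g.add_node(("mesh", i, j), pos=pos, type="mesh")
    return g


def connect_nearest(graph, source_type, target_type, component):
    """Add an edge from every source-type node to its nearest target-type node."""
    sources = [n for n, d in graph.nodes(data=True) if d["type"] == source_type]
    targets = [n for n, d in graph.nodes(data=True) if d["type"] == target_type]
    for s in sources:
        nearest = min(
            targets,
            key=lambda t: math.dist(graph.nodes[s]["pos"], graph.nodes[t]["pos"]),
        )
        graph.add_edge(s, nearest, component=component)
    return graph


# Combine the outputs of two independent steps, then connect across them.
combined = nx.compose(make_grid_nodes(), make_mesh_nodes())
combined = connect_nearest(combined, "grid", "mesh", component="g2m")
```

Because every intermediate result is an ordinary networkx.DiGraph, any of these steps could be inspected or plotted with standard networkx tooling before moving on to the next one.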
The graph generation in weather-model-graphs is split into the following steps:
1. Create the three graph components of the message-passing graph that constitute the auto-regressive atmospheric flow model, all represented by networkx.DiGraph objects:
   - grid-to-mesh (g2m): the encoding component, where edges represent the encoding of physical variables into the latent space of the model
   - mesh-to-mesh (m2m): the processing component, where edges represent information flow between nodes through the time evolution of the atmospheric state
   - mesh-to-grid (m2g): the decoding component, where edges represent the decoding of the latent space back into physical variables
2. Combine all three graph components into a single networkx.DiGraph object and create a unique node identifier for each node in the combined graph.
3. Split the combined graph into the three output graph components again (or more if the specific graph architecture requires it).
4. Store each of the output graph components in the desired format, for example:
   - networkx .pickle file: save networkx.DiGraph objects to disk using pickle (weather_model_graphs.save.to_pickle(...))
   - pytorch-geometric for neural-lam: edge indices and features are stored in separate torch.Tensor objects serialised to disk that can then be loaded into torch_geometric.data.Data objects (weather_model_graphs.save.to_pyg(...))
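The steps above can be sketched with networkx and the standard library alone. This is a toy illustration, not the package's implementation: the helper make_component, the string node labels, the original_label attribute name and the temporary output directory are all assumptions, and plain pickle stands in for the weather_model_graphs.save functions.

```python
import pickle
import tempfile
from pathlib import Path

import networkx as nx


def make_component(name, edges):
    """Hypothetical helper: build one component graph with labelled edges."""
    g = nx.DiGraph()
    g.add_edges_from(edges, component=name)
    return g


# Step 1: create the three components (toy connectivity).
g2m = make_component("g2m", [("grid0", "mesh0"), ("grid1", "mesh1")])
m2m = make_component("m2m", [("mesh0", "mesh1"), ("mesh1", "mesh0")])
m2g = make_component("m2g", [("mesh0", "grid0"), ("mesh1", "grid1")])

# Step 2: combine into one graph and give every node a unique integer id,
# keeping the previous label as a node attribute.
combined = nx.compose_all([g2m, m2m, m2g])
combined = nx.convert_node_labels_to_integers(
    combined, label_attribute="original_label"
)

# Step 3: split the combined graph again, based on the edge attribute.
components = {}
for name in ("g2m", "m2m", "m2g"):
    edges = [(u, v) for u, v, c in combined.edges(data="component") if c == name]
    components[name] = combined.edge_subgraph(edges).copy()

# Step 4: store each output component on disk (here: one pickle file each).
out_dir = Path(tempfile.mkdtemp())
for name, graph in components.items():
    with open(out_dir / f"{name}.pickle", "wb") as f:
        pickle.dump(graph, f)
```

Relabelling after combining means that the node identifiers used in the split output components are consistent across all of them, which is what allows them to be stored separately and used together later.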
Diagram of the graph generation process:#
Below, the graph generation process in weather-model-graphs is visualised for the example given above:
Node and edge attributes#
There are a number of node and edge attributes with special meanings in weather-model-graphs which enable the splitting and visualisation of the graph components.
Node attributes#
- pos: the (x,y) coordinates of the node in the grid
- type: the type of node, either grid or mesh
Edge attributes#
- component: the component of the graph the edge belongs to, one of g2m, m2m or m2g
- level: for multi-range mesh graphs this denotes the refinement level of the mesh connection. For hierarchical graphs the different ranges of connections are split into different levels, so here level also denotes the level in the hierarchy that the edge belongs to.
- len: the length of the edge in the (x,y) coordinate space of the grid nodes, i.e. the distance between the two nodes in the grid
- vdiff: the vector spanning between the (x,y) coordinates of the two nodes
- direction: for hierarchical graphs this denotes the direction of the edge, one of up, down or same
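To make these attributes concrete, here is a small sketch in plain networkx of how they could be attached to nodes and edges. The helper add_edge_with_geometry is hypothetical, and the sign convention for vdiff (target position minus source position) is an assumption; the description above only says that the vector spans between the two nodes.

```python
import math

import networkx as nx

g = nx.DiGraph()
# Node attributes: position in the (x,y) coordinate space and node type.
g.add_node(0, pos=(0.0, 0.0), type="grid")
g.add_node(1, pos=(3.0, 4.0), type="mesh")


def add_edge_with_geometry(graph, u, v, component, **extra):
    """Hypothetical helper: add edge u -> v with len and vdiff derived from pos."""
    ux, uy = graph.nodes[u]["pos"]
    vx, vy = graph.nodes[v]["pos"]
    vdiff = (vx - ux, vy - uy)  # assumed convention: target minus source
    length = math.hypot(*vdiff)  # Euclidean distance between the two nodes
    graph.add_edge(u, v, component=component, len=length, vdiff=vdiff, **extra)


# A grid-to-mesh edge; level and direction would be passed via **extra
# for multi-range or hierarchical mesh edges.
add_edge_with_geometry(g, 0, 1, component="g2m")

edge = g.edges[0, 1]
```

Since len and vdiff are derived entirely from the pos attribute of the two end nodes, they can be recomputed at any point as long as the node positions are kept on the graph.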
Splitting graphs#
The graph is split using the edge attributes, so it is easy to split the complete graph either by the component an edge belongs to or by the level of the edge in the graph. This is done using the weather_model_graphs.split_graph_by_edge_attribute(...) function.
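The idea behind attribute-based splitting can be illustrated with a short networkx-only sketch. The function split_by_edge_attribute below is a hypothetical stand-in written for this example; it is not the implementation or the signature of weather_model_graphs.split_graph_by_edge_attribute(...), and the choice to skip edges that lack the attribute is an assumption.

```python
import networkx as nx


def split_by_edge_attribute(graph, attr):
    """Return {attribute value: subgraph containing only edges with that value}.

    Edges that do not carry the attribute are skipped (an assumption of
    this sketch).
    """
    groups = {}
    for u, v, value in graph.edges(data=attr):
        if value is None:
            continue
        groups.setdefault(value, []).append((u, v))
    return {
        value: graph.edge_subgraph(edges).copy() for value, edges in groups.items()
    }


g = nx.DiGraph()
g.add_edge(0, 1, component="m2m", level=0)
g.add_edge(1, 2, component="m2m", level=0)
g.add_edge(0, 2, component="m2m", level=1)
g.add_edge(3, 0, component="g2m")  # no level attribute on this edge

by_component = split_by_edge_attribute(g, "component")
by_level = split_by_edge_attribute(g, "level")
```

Each returned subgraph contains only the nodes touched by its edges, with node and edge attributes carried over, so the same approach works for any attribute stored on the edges.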
Code layout#
The code layout of weather-model-graphs is organised into submodules by the functionality they provide. The main submodules are:
weather_model_graphs
    .create
        .archetype
            for creating specific archetype graph
            architectures (e.g. Keisler 2021, Lam et al. 2023,
            Oscarsson et al. 2023)
        .base
            general interface for creating graph architectures
            (here you define the g2m, m2m and m2g connectivity directly)
        .mesh
            for creating the mesh nodes and edges
        .grid
            for creating the grid nodes
    .visualise
        for plotting graphs, allowing for easy visualisation using any
        edge or node attribute for colouring
    .save
        for saving the graph to specific formats (e.g. pytorch-geometric)