Background and Related Work
This chapter provides background and an overview of related and prior work on (sub)graph embedding techniques. After introducing our notation in Section 2.1, we dive into classical (non-deep) signal processing on graphs in Section 2.2, which is mostly formulated in the spectral domain. This knowledge is then useful for the overview of spectral deep learning-based convolution methods, introduced in Section 2.3. We mention their shortcomings and move on to review the progress in spatial convolution methods in Section 2.4, also in relation to the contributions of this thesis. To complete the picture, we briefly mention non-convolutional approaches for embedding, namely direct embedding and graph kernels, in Section 2.5.
While this thesis focuses exclusively on graphs, it is worth mentioning the progress made in (deep) learning on manifolds. Although the fields of differential geometry and graph theory are relatively distant, both graphs and manifolds are examples of non-Euclidean domains, and the approaches for (spectral) signal processing, and for convolutions in particular, are closely related, especially when considering discretized manifolds, i.e. meshes. We refer the interested reader to the survey of Bronstein et al. (2017) for concrete details.
2.1 Attributed Graphs
Let us repeat and extend the graph-related definitions from the introduction. A directed graph is an ordered pair G = (V, E) such that V is a non-empty set of vertices (also called nodes) and E ⊆ V × V is a set of edges (also called links). An undirected graph can be defined by constraining the relation E to be symmetric. It is often very useful to attach additional information to elements of a graph in the form of attributes. For example, in a graph representation of molecules, vertices may be assigned atomic numbers while edges may be associated with information about the chemical bond between two atoms. Concretely, a vertex-attributed graph assumes a function $F: V \to \mathbb{R}^{d_v}$ assigning attributes to each vertex. An edge-attributed graph assumes a function $E: E \to \mathbb{R}^{d_e}$ assigning attributes to each edge. In the special case of non-negative one-dimensional attributes $E: E \to \mathbb{R}_+$, we call the graph weighted. Discrete attributes $E: E \to \mathbb{N}$ may also be called types.
In addition to attributes, which are deemed a fixed part of the input, we introduce a mapping $H: V \to \mathbb{R}^d$ to denote the vertex representation produced by signal processing or graph embedding algorithms operating on the graph. This mapping is called a signal (especially if d = 1) or simply data. Often, the vertex attributes F may be used to initialize H. The methods for computing H will constitute a major part of the discussion in this thesis.
Finally, assuming a particular node ordering, the graph can be conveniently described in matrix (resp. tensor) form using its (weighted) adjacency matrix A, signal matrix H, node attribute matrix F, and edge attribute tensor E of rank 3.
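To make this matrix/tensor view concrete, the following minimal sketch builds A, F, E, and H for a small undirected graph in NumPy. It is an illustration under assumptions, not code from the thesis: the toy edge list, the attribute dimensions, and the zero-padding initialization of H from F are all assumed for the example.

```python
import numpy as np

N, d_v, d_e, d = 4, 2, 3, 5   # nodes; vertex/edge attribute dims; signal dim

# Weighted adjacency matrix A (symmetric for an undirected graph).
A = np.zeros((N, N))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Node attribute matrix F (N x d_v): one row of attributes per vertex.
F = np.random.rand(N, d_v)

# Edge attribute tensor E of rank 3 (N x N x d_e); only entries at
# existing edges are meaningful.
E = np.zeros((N, N, d_e))
for i, j in edges:
    E[i, j] = E[j, i] = np.random.rand(d_e)

# Signal matrix H (N x d): the vertex representation; here initialized
# from F by zero-padding to d dimensions (one possible choice).
H = np.pad(F, ((0, 0), (0, d - d_v)))
```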
2.2 Signal Processing on Graphs
Here we briefly introduce several concepts regarding multiscale signal filtering on graphs, which form the basis of the spectral graph convolution methods presented in Section 2.3 and are also used for pooling in Chapter 3. Mathematically, both a one-dimensional discrete-time signal with N samples and a signal on a graph with N nodes can be regarded as vectors $f \in \mathbb{R}^N$. Nevertheless, naively applying classical signal processing techniques in the graph domain would yield suboptimal results, as it disregards the (in)dependencies arising from the irregularity of the domain.
However, efforts to generalize basic signal processing primitives such as filtering or downsampling face two major challenges, related to those mentioned in Section 1.1.1:

Translation. While translating a time signal f(t) to the right by one unit can be expressed as f(t − 1), there is no analogue of shift-invariant translation on graphs, making the classical definition of the convolution operation $(f * w)(t) := \int f(\tau)\, w(t - \tau)\, \mathrm{d}\tau$ not directly applicable.
Coarsening. While decimating a time signal may amount to simply removing every other sample, it is not immediately clear which nodes of a graph to remove and how to aggregate the signal on the remaining ones.
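As a point of reference, the sketch below (an assumed toy 1-D signal, not an example from the thesis) shows the two classical primitives in question; both lean on structure that graphs lack, namely translation and a canonical sample ordering.

```python
import numpy as np

f = np.array([0., 1., 4., 9., 16., 25., 36., 49.])  # regular 1-D signal
w = np.array([0.25, 0.5, 0.25])                     # small smoothing filter

# Classical convolution: slide w across f, i.e. rely on translation f(t - tau).
smoothed = np.convolve(f, w, mode="same")

# Classical decimation: keep every other sample, i.e. rely on sample ordering.
downsampled = f[::2]

# Neither the sliding (translation) nor the "every other" rule (ordering)
# has a canonical counterpart on an irregular graph.
```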
2.2.1 Graph Fourier Transform
Assuming an undirected finite weighted graph G with N nodes, let A be its weighted adjacency matrix and D its diagonal degree matrix, i.e. $D_{i,i} = \sum_j A_{i,j}$. The central concept in spectral graph analysis is the family of Laplacians, in particular the unnormalized graph Laplacian $L := D - A$ and the normalized graph Laplacian $\mathcal{L} := D^{-1/2}(D - A)D^{-1/2} = I - D^{-1/2} A D^{-1/2}$. Intuitively, Laplacians capture the difference between the local average of a function around a node and the value of the function at the node itself. We denote by $\{u_l\}$ the set of real-valued orthonormal eigenvectors of a Laplacian and by $\{\lambda_l\}$ the set of corresponding eigenvalues. Interestingly, Laplacian eigenvalues provide a notion of frequency in the sense that eigenvectors associated with small eigenvalues are smooth and vary slowly across the graph (i.e., the values of an eigenvector at two nodes connected by an edge with a large weight are likely to be similar), whereas those associated with larger eigenvalues oscillate more rapidly.
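A minimal sketch of these definitions follows; the toy path graph is an assumption made for illustration, and the analysis/synthesis pair at the end is the standard graph Fourier transform built from the Laplacian eigenbasis.

```python
import numpy as np

N = 6
A = np.zeros((N, N))
for i in range(N - 1):        # weighted path graph as a toy example
    A[i, i + 1] = A[i + 1, i] = 1.0

deg = A.sum(axis=1)
D = np.diag(deg)              # degree matrix D_ii = sum_j A_ij
L = D - A                     # unnormalized graph Laplacian
L_norm = np.eye(N) - np.diag(deg ** -0.5) @ A @ np.diag(deg ** -0.5)

# Orthonormal eigenvectors u_l (columns of U) and eigenvalues lambda_l,
# sorted in ascending order; small lambda_l <-> slowly varying u_l.
lam, U = np.linalg.eigh(L)

# Graph Fourier transform of a signal f and its inverse.
f = np.random.rand(N)
f_hat = U.T @ f               # analysis:  f_hat[l] = <u_l, f>
f_rec = U @ f_hat             # synthesis: f = sum_l f_hat[l] * u_l
```

Inspecting the columns of U on this path graph confirms the frequency interpretation above: the eigenvector for the smallest eigenvalue is constant, while later ones change sign increasingly often along the path.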
Table of Contents
1 Introduction
1.1 Motivation
1.1.1 Embedding Graphs and Their Nodes
1.1.2 Generating Graphs
1.2 Summary of Contributions
1.3 Outline
2 Background and Related Work
2.1 Attributed Graphs
2.2 Signal Processing on Graphs
2.2.1 Graph Fourier Transform
2.2.2 Signal Filtering
2.2.3 Graph Coarsening
2.3 Spectral Graph Convolutions
2.4 Spatial Graph Convolutions
2.5 Non-convolutional Embedding Methods
2.5.1 Direct Node Embedding
2.5.2 Graph Kernels
3 Edge-Conditioned Convolutions
3.1 Introduction
3.2 Related Work
3.2.1 Graph Convolutions
3.2.2 Deep Learning on Sparse 3D Data
3.3 Method
3.3.1 Edge-Conditioned Convolution
3.3.2 Relationship to Existing Formulations
3.3.3 Deep Networks with ECC
3.3.4 Application in Point Clouds
3.3.5 Application in General Graphs
3.4 Experiments
3.4.1 Sydney Urban Objects
3.4.2 ModelNet
3.4.3 Graph Classification
3.4.4 MNIST
3.4.5 Detailed Analyses and Ablations
3.5 Discussion
3.6 Conclusion
4 Large-scale Point Cloud Segmentation
4.1 Introduction
4.2 Related Work
4.3 Method
4.3.1 Geometric Partition with a Global Energy
4.3.2 Superpoint Graph Construction
4.3.3 Superpoint Embedding
4.3.4 Contextual Segmentation
4.3.5 Implementation Details
4.4 Experiments
4.4.1 Semantic3D
4.4.2 Stanford Large-Scale 3D Indoor Spaces
4.4.3 Segmentation Baselines
4.4.4 Ablation Studies
4.5 Discussion
4.6 Conclusion
5 Generation of Small Graphs
5.1 Introduction
5.2 Related Work
5.3 Method
5.3.1 Variational Autoencoder
5.3.2 Probabilistic Graph Decoder
5.3.3 Reconstruction Loss
5.3.4 Graph Matching
5.3.5 Further Details
5.4 Evaluation
5.4.1 Application in Cheminformatics
5.4.2 QM9 Dataset
5.4.3 ZINC Dataset
5.5 Discussion
5.6 Conclusion
6 Conclusion
6.1 Future Directions
References