Dissertation: reorganisation
parent
26ff15b286
commit
141309f6f2
1 changed files with 125 additions and 118 deletions
|
@ -75,11 +75,128 @@ Thank you!
|
||||||
\label{cha:introduction}
|
\label{cha:introduction}
|
||||||
|
|
||||||
|
|
||||||
|
\chapter{Graphs and Temporal Networks}%
|
||||||
|
\label{cha:temporal-networks}
|
||||||
|
|
||||||
|
\section{Definition and basic properties}%
|
||||||
|
\label{sec:defin-basic-prop}
|
||||||
|
|
||||||
|
In this section, we will introduce the notion of temporal networks or
|
||||||
|
graphs. This is a complex notion, with many concurrent definitions and
|
||||||
|
interpretations. First, we restate the standard definition of a
|
||||||
|
non-temporal, static graph.
|
||||||
|
|
||||||
|
\begin{defn}[Graph]
|
||||||
|
A \emph{graph} is a couple $G = (V, E)$, where $V$ is a finite set
|
||||||
|
of \emph{nodes} (or \emph{vertices}), and $E \subseteq V\times V$ is
|
||||||
|
a set of \emph{edges}. A \emph{weighted graph} is defined by
|
||||||
|
$G = (V, E, w)$, where $w : E\mapsto \mathbb{R}_+$ is called the
|
||||||
|
\emph{weight function}.
|
||||||
|
\end{defn}
|
||||||
|
|
||||||
|
We also define some basic concepts that will be needed later on to
|
||||||
|
build simplicial complexes on graphs.
|
||||||
|
|
||||||
|
\begin{defn}[Clique]
|
||||||
|
A \emph{clique} is a set of nodes where each pair is connected. That
|
||||||
|
is, a clique $C$ of a graph $G = (V,E)$ is a subset of $V$ such that
|
||||||
|
$\forall i,j\in C, i \neq j \implies (i,j)\in E$. A clique is said
|
||||||
|
to be \emph{maximal} if it cannot be augmented by any node.
|
||||||
|
\end{defn}
|
||||||
|
|
||||||
|
Temporal networks are defined in the more general framework of
|
||||||
|
\emph{multilayer networks}~\cite{kivela_multilayer_2014}. However,
|
||||||
|
this definition is much too general for our simple applications, and
|
||||||
|
we restrict ourselves to edge-centric time-varying
|
||||||
|
graphs~\cite{casteigts_time-varying_2012}. In this model, the set of
|
||||||
|
nodes is fixed and does not change over time, whereas edges can appear
|
||||||
|
or disappear at different timestamps.
|
||||||
|
|
||||||
|
\begin{defn}[Temporal network]
|
||||||
|
A \emph{temporal network} (or graph) is a tuple
|
||||||
|
$G = (V, E, \mathcal{T}, \rho)$, where:
|
||||||
|
\begin{itemize}
|
||||||
|
\item $V$ is a finite set of nodes,
|
||||||
|
\item $E\subseteq V\times V$ is a set of edges,
|
||||||
|
\item $\mathbb{T}$ is the \emph{temporal domain} (often taken as
|
||||||
|
$\mathbb{N}$ or $\mathbb{R}_+$), and
|
||||||
|
$\mathcal{T}\subseteq\mathbb{T}$ is the \emph{lifetime} of the
|
||||||
|
network,
|
||||||
|
\item $\rho: E\times\mathcal{T}\mapsto\{0,1\}$ is the \emph{presence
|
||||||
|
function}, which determines whether an edge is present in the
|
||||||
|
network at each timestamp.
|
||||||
|
\end{itemize}
|
||||||
|
The \emph{available dates} of an edge are the set
|
||||||
|
$\mathcal{I}(e) = \{t\in\mathcal{T}: \rho(e,t)=1\}$.
|
||||||
|
\end{defn}
|
||||||
|
|
||||||
|
Temporal networks can also have weighted edges. In this case, it is
|
||||||
|
possible to have constant weights (edges can only appear or disappear
|
||||||
|
over time, and always have the same weight), or time-varying
|
||||||
|
weights. In the latter case, we can set the codomain of the presence
|
||||||
|
function to be $\mathbb{R}_+$ instead of $\{0,1\}$, where by
|
||||||
|
convention a zero weight corresponds to an absent edge.
|
||||||
|
|
||||||
|
\begin{defn}[Additive temporal network]
|
||||||
|
A temporal network is said to be \emph{additive} if for all $e\in E$
|
||||||
|
and $t\in\mathcal{T}$, if $\rho(e,t)=1$, then
|
||||||
|
$\forall t'>t, \rho(e, t') = 1$. Edges can only be added to the
|
||||||
|
network, never removed.
|
||||||
|
\end{defn}
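
% Illustrative example (not part of the original text): the node names
% and the presence function below are assumptions chosen for illustration.
As a small illustration of these definitions, consider a hypothetical
network with $V = \{a, b, c\}$, $E = \{(a,b), (b,c)\}$, and
$\mathcal{T} = \{0, 1, 2, 3\}$. Suppose that $\rho((a,b), t) = 1$ if
and only if $t \geq 1$, and that $\rho((b,c), t) = 1$ if and only if
$t \in \{0, 2\}$. The available dates are then
\[
  \mathcal{I}((a,b)) = \{1, 2, 3\}, \qquad
  \mathcal{I}((b,c)) = \{0, 2\}.
\]
The edge $(a,b)$ never disappears once it is present, but $(b,c)$ is
present at $t = 0$ and absent at $t' = 1 > t$, so this network is not
additive.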
|
||||||
|
|
||||||
|
\section{Examples of applications}%
|
||||||
|
\label{sec:exampl-appl}
|
||||||
|
|
||||||
|
\section{Network partitioning}%
|
||||||
|
\label{sec:network-partitioning}
|
||||||
|
|
||||||
|
Temporal networks are a very active research subject, leading to
|
||||||
|
multiple interesting problems. The additional time dimension adds a
|
||||||
|
significant layer of complexity that cannot be adequately treated by
|
||||||
|
the common methods on static graphs.
|
||||||
|
|
||||||
|
Moreover, data collection can lead to a large amount of noise in
|
||||||
|
datasets. Combined with large dataset sizes due to the huge number of
|
||||||
|
data points for each node in the network, temporal graphs cannot be
|
||||||
|
studied effectively in their raw form. Recent advances have been made
|
||||||
|
to fit network models to rich but noisy
|
||||||
|
data~\cite{newman_network_2018}, generally using some variation on the
|
||||||
|
expectation-maximization (EM) algorithm.
|
||||||
|
|
||||||
|
One solution that has been proposed to study such temporal data has
|
||||||
|
been to \emph{partition} the time scale of the network into a sequence
|
||||||
|
of smaller, static graphs, representing all the interactions during a
|
||||||
|
short interval of time. The approach consists in subdividing the
|
||||||
|
lifetime of the network into \emph{sliding windows} of a given length.
|
||||||
|
We can then ``flatten'' the temporal network on each time interval,
|
||||||
|
keeping all the edges that appear at least once (or adding their
|
||||||
|
weights in the case of weighted networks).
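
% Illustrative formalisation (not part of the original text): the symbols
% $t_0$, $t_k$, $\ell$, $s$, $W_k$, $E_k$ and $w_k$ are assumed notation.
One possible way to write this down, assuming a discrete lifetime
$\mathcal{T}\subseteq\mathbb{N}$ starting at $t_0$, a window length
$\ell$ and a step $s \leq \ell$, is to set $W_k = [t_k, t_k + \ell)$
with $t_k = t_0 + k s$, so that consecutive windows overlap by
$\ell - s$. The flattened graph on $W_k$ would then be
$G_k = (V, E_k)$, with
\[
  E_k = \{ e \in E : \mathcal{I}(e) \cap W_k \neq \emptyset \}.
\]
In the weighted case, where $\rho$ takes values in $\mathbb{R}_+$,
a natural choice under the same assumptions is the aggregated weight
\[
  w_k(e) = \sum_{t \in \mathcal{T} \cap W_k} \rho(e, t),
\]
an edge being kept in $G_k$ whenever $w_k(e) > 0$.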
|
||||||
|
|
||||||
|
This partitioning is sensitive to two parameters: the length of each
|
||||||
|
time interval, and their overlap. Of those, the former is the more
|
||||||
|
important: it will define the \emph{resolution} of the study. If it is
|
||||||
|
too small, too much noise will be taken into account; if it is too
|
||||||
|
large, we will lose important information. There is a need to find a
|
||||||
|
compromise, which will depend on the application and on the task
|
||||||
|
performed on the network. In the case of a classification task to
|
||||||
|
determine periodicity, it will be useful to adapt the resolution to
|
||||||
|
the expected period: if we expect week-long periodicity, a resolution
|
||||||
|
of one day seems reasonable.
|
||||||
|
|
||||||
|
Once the network is partitioned, we can apply any statistical learning
|
||||||
|
task on the sequence of static graphs. In this study, we will focus on
|
||||||
|
classification of time steps. This can be used to detect periodicity,
|
||||||
|
outliers, or even maximise temporal communities.
|
||||||
|
|
||||||
|
%% TODO Talk about partitioning methods?
|
||||||
|
|
||||||
\chapter{Topological Data Analysis and Persistent Homology}%
|
\chapter{Topological Data Analysis and Persistent Homology}%
|
||||||
\label{cha:tda-ph}
|
\label{cha:tda-ph}
|
||||||
|
|
||||||
\section{Homology}%
|
\section{Basic constructions}
|
||||||
|
\label{sec:basic-constructions}
|
||||||
|
|
||||||
|
\subsection{Homology}%
|
||||||
\label{sec:homology}
|
\label{sec:homology}
|
||||||
|
|
||||||
Our goal is to understand the topological structure of a metric
|
Our goal is to understand the topological structure of a metric
|
||||||
|
@ -98,7 +215,7 @@ space can be extremely difficult. It is necessary to approximate it in
|
||||||
a structure that would be both combinatorial and topological in
|
a structure that would be both combinatorial and topological in
|
||||||
nature.
|
nature.
|
||||||
|
|
||||||
\section{Simplicial Complexes}%
|
\subsection{Simplicial Complexes}%
|
||||||
\label{sec:simplicial-complexes}
|
\label{sec:simplicial-complexes}
|
||||||
|
|
||||||
In order to understand the topological structure of a metric space, we
|
In order to understand the topological structure of a metric space, we
|
||||||
|
@ -235,7 +352,7 @@ of a hyperedge is not necessarily a hyperedge itself.
|
||||||
Using these definitions, we can define homology on simplicial
|
Using these definitions, we can define homology on simplicial
|
||||||
complexes. %% TODO add reference for more details/do it myself?
|
complexes. %% TODO add reference for more details/do it myself?
|
||||||
|
|
||||||
\section{Filtrations}%
|
\subsection{Filtrations}%
|
||||||
\label{sec:filtrations}
|
\label{sec:filtrations}
|
||||||
|
|
||||||
If we consider that a simplicial complex is a kind of
|
If we consider that a simplicial complex is a kind of
|
||||||
|
@ -307,7 +424,7 @@ space.
|
||||||
|
|
||||||
\begin{defn}[Persistence diagrams]
|
\begin{defn}[Persistence diagrams]
|
||||||
A \emph{persistence diagram} is the union of a finite multiset of
|
A \emph{persistence diagram} is the union of a finite multiset of
|
||||||
points in $\bar{\mathbb{R}}^2$ with the diagonal
|
points in $\overline{\mathbb{R}}^2$ with the diagonal
|
||||||
$\Delta = \{(x,x) \;|\; x\in\mathbb{R}\}$, where every point of
|
$\Delta = \{(x,x) \;|\; x\in\mathbb{R}\}$, where every point of
|
||||||
$\Delta$ has infinite multiplicity.
|
$\Delta$ has infinite multiplicity.
|
||||||
\end{defn}
|
\end{defn}
|
||||||
|
@ -347,122 +464,12 @@ diagonal $\Delta$.
|
||||||
\section{Stability}%
|
\section{Stability}%
|
||||||
\label{sec:stability}
|
\label{sec:stability}
|
||||||
|
|
||||||
|
\section{Algorithms and implementations}%
|
||||||
|
\label{sec:algor-impl}
|
||||||
|
|
||||||
|
|
||||||
\chapter{Temporal Networks}%
|
\chapter{Topological Data Analysis on Networks}%
|
||||||
\label{cha:temporal-networks}
|
\label{cha:topol-data-analys}
|
||||||
|
|
||||||
\section{Definition and basic properties}%
|
|
||||||
\label{sec:defin-basic-prop}
|
|
||||||
|
|
||||||
In this section, we will introduce the notion of temporal networks or
|
|
||||||
graphs. This is a complex notion, with many concurrent definitions and
|
|
||||||
interpretations. First, we restate the standard definition of a
|
|
||||||
non-temporal, static graph.
|
|
||||||
|
|
||||||
\begin{defn}[Graph]
|
|
||||||
A \emph{graph} is a couple $G = (V, E)$, where $V$ is a finite set
|
|
||||||
of \emph{nodes} (or \emph{vertices}), and $E \subseteq V\times V$ is
|
|
||||||
a set of \emph{edges}. A \emph{weighted graph} is defined by
|
|
||||||
$G = (V, E, w)$, where $w : E\mapsto \mathbb{R}_+$ is called the
|
|
||||||
\emph{weight function}.
|
|
||||||
\end{defn}
|
|
||||||
|
|
||||||
We also define some basic concepts that will be needed later on to
|
|
||||||
build simplicial complexes on graphs.
|
|
||||||
|
|
||||||
\begin{defn}[Clique]
|
|
||||||
A \emph{clique} is a set of nodes where each pair is connected. That
|
|
||||||
is, a clique $C$ of a graph $G = (V,E)$ is a subset of $V$ such that
|
|
||||||
$\forall i,j\in C, i \neq j \implies (i,j)\in E$. A clique is said
|
|
||||||
to be \emph{maximal} if it cannot be augmented by any node.
|
|
||||||
\end{defn}
|
|
||||||
|
|
||||||
Temporal networks are defined in the more general framework of
|
|
||||||
\emph{multilayer networks}~\cite{kivela_multilayer_2014}. However,
|
|
||||||
this definition is much too general for our simple applications, and
|
|
||||||
we restrict ourselves to edge-centric time-varying
|
|
||||||
graphs~\cite{casteigts_time-varying_2012}. In this model, the set of
|
|
||||||
nodes is fixed and does not change over time, whereas edges can appear
|
|
||||||
or disappear at different timestamps.
|
|
||||||
|
|
||||||
\begin{defn}[Temporal network]
|
|
||||||
A \emph{temporal network} (or graph) is a tuple
|
|
||||||
$G = (V, E, \mathcal{T}, \rho)$, where:
|
|
||||||
\begin{itemize}
|
|
||||||
\item $V$ is a finite set of nodes,
|
|
||||||
\item $E\subseteq V\times V$ is a set of edges,
|
|
||||||
\item $\mathbb{T}$ is the \emph{temporal domain} (often taken as
|
|
||||||
$\mathbb{N}$ or $\mathbb{R}_+$), and
|
|
||||||
$\mathcal{T}\subseteq\mathbb{T}$ is the \emph{lifetime} of the
|
|
||||||
network,
|
|
||||||
\item $\rho: E\times\mathcal{T}\mapsto\{0,1\}$ is the \emph{presence
|
|
||||||
function}, which determines whether an edge is present in the
|
|
||||||
network at each timestamp.
|
|
||||||
\end{itemize}
|
|
||||||
The \emph{available dates} of an edge are the set
|
|
||||||
$\mathcal{I}(e) = \{t\in\mathcal{T}: \rho(e,t)=1\}$.
|
|
||||||
\end{defn}
|
|
||||||
|
|
||||||
Temporal networks can also have weighted edges. In this case, it is
|
|
||||||
possible to have constant weights (edges can only appear or disappear
|
|
||||||
over time, and always have the same weight), or time-varying
|
|
||||||
weights. In the latter case, we can set the codomain of the presence
|
|
||||||
function to be $\mathbb{R}_+$ instead of $\{0,1\}$, where by
|
|
||||||
convention a zero weight corresponds to an absent edge.
|
|
||||||
|
|
||||||
\begin{defn}[Additive temporal network]
|
|
||||||
A temporal network is said to be \emph{additive} if for all $e\in E$
|
|
||||||
and $t\in\mathcal{T}$, if $\rho(e,t)=1$, then
|
|
||||||
$\forall t'>t, \rho(e, t') = 1$. Edges can only be added to the
|
|
||||||
network, never removed.
|
|
||||||
\end{defn}
|
|
||||||
|
|
||||||
\section{Examples of applications}%
|
|
||||||
\label{sec:exampl-appl}
|
|
||||||
|
|
||||||
\section{Network partitioning}%
|
|
||||||
\label{sec:network-partitioning}
|
|
||||||
|
|
||||||
Temporal networks are a very active research subject, leading to
|
|
||||||
multiple interesting problems. The additional time dimension adds a
|
|
||||||
significant layer of complexity that cannot be adequately treated by
|
|
||||||
the common methods on static graphs.
|
|
||||||
|
|
||||||
Moreover, data collection can lead to a large amount of noise in
|
|
||||||
datasets. Combined with large dataset sizes due to the huge number of
|
|
||||||
data points for each node in the network, temporal graphs cannot be
|
|
||||||
studied effectively in their raw form. Recent advances have been made
|
|
||||||
to fit network models to rich but noisy
|
|
||||||
data~\cite{newman_network_2018}, generally using some variation on the
|
|
||||||
expectation-maximization (EM) algorithm.
|
|
||||||
|
|
||||||
One solution that has been proposed to study such temporal data has
|
|
||||||
been to \emph{partition} the time scale of the network into a sequence
|
|
||||||
of smaller, static graphs, representing all the interactions during a
|
|
||||||
short interval of time. The approach consists in subdividing the
|
|
||||||
lifetime of the network into \emph{sliding windows} of a given length.
|
|
||||||
We can then ``flatten'' the temporal network on each time interval,
|
|
||||||
keeping all the edges that appear at least once (or adding their
|
|
||||||
weights in the case of weighted networks).
|
|
||||||
|
|
||||||
This partitioning is sensitive to two parameters: the length of each
|
|
||||||
time interval, and their overlap. Of those, the former is the more
|
|
||||||
important: it will define the \emph{resolution} of the study. If it is
|
|
||||||
too small, too much noise will be taken into account; if it is too
|
|
||||||
large, we will lose important information. There is a need to find a
|
|
||||||
compromise, which will depend on the application and on the task
|
|
||||||
performed on the network. In the case of a classification task to
|
|
||||||
determine periodicity, it will be useful to adapt the resolution to
|
|
||||||
the expected period: if we expect week-long periodicity, a resolution
|
|
||||||
of one day seems reasonable.
|
|
||||||
|
|
||||||
Once the network is partitioned, we can apply any statistical learning
|
|
||||||
task on the sequence of static graphs. In this study, we will focus on
|
|
||||||
classification of time steps. This can be used to detect periodicity,
|
|
||||||
outliers, or even maximise temporal communities.
|
|
||||||
|
|
||||||
%% TODO Talk about partitioning methods?
|
|
||||||
|
|
||||||
\section{Persistent homology for networks}%
|
\section{Persistent homology for networks}%
|
||||||
\label{sec:pers-homol-netw}
|
\label{sec:pers-homol-netw}
|
||||||
|
|