Add post on Online Analysis of Medical Time Series

Dimitri Lozeve 2020-11-17 18:20:23 +01:00
parent 72b7241b73
commit cd8fff45ff
2 changed files with 208 additions and 0 deletions

@@ -454,3 +454,18 @@
doi = {10.1007/978-3-319-20735-3},
isbn = 9783319207346,
}
@article{fried2017_onlin_analy_medic_time_series,
author = {Roland Fried and Sermad Abbas and Matthias Borowski
and Michael Imhoff},
title = {Online Analysis of Medical Time Series},
journal = {Annual Review of Statistics and Its Application},
volume = {4},
number = {1},
pages = {169-188},
year = {2017},
doi = {10.1146/annurev-statistics-060116-054148},
url =
{https://doi.org/10.1146/annurev-statistics-060116-054148},
DATE_ADDED = {Tue Nov 17 08:59:07 2020},
}

@@ -0,0 +1,193 @@
---
title: "Online Analysis of Medical Time Series"
date: 2020-11-17
toc: false
---
This is a short overview of the following paper by
cite:fried2017_onlin_analy_medic_time_series:
#+begin_quote
Fried, Roland, Sermad Abbas, Matthias Borowski, and Michael Imhoff. 2017. “Online Analysis of Medical Time Series.” /Annual Review of Statistics and Its Application/ 4 (1): 169--88. [[https://doi.org/10.1146/annurev-statistics-060116-054148]].
#+end_quote
[fn:: {-} Unfortunately, most of the papers from /Annual Reviews/ are
not open access. I hope the situation will improve in the future, but
in the meantime there is [[https://en.wikipedia.org/wiki/Sci-Hub][Sci-Hub]].]
As the title suggests, it is a comprehensive review of statistical
models for studying medical time series in an online setting. It
appeared in [[https://www.annualreviews.org/][/Annual Reviews/]], which publishes very nice reviews of
various topics in a [[https://www.annualreviews.org/action/showPublications][wide variety of fields]].
Since I work on developing algorithms for a [[https://www.sysnav.fr/markets/heathcare/?lang=en][medical device]], this is
particularly relevant for my job!
* Context: clinical applications and devices, and the need for robust statistical analysis
The goal of online medical time series analysis is to detect relevant
patterns, such as trends, trend changes, and abrupt jumps. The
detected patterns are the input to online decision support systems.
The paper (section 5)[fn:section5] goes on to explain the motivation
for developing robust methods of time series analysis for healthcare
applications.
[fn:section5] {-} The section explaining the motivation behind the
review is at the end of the paper. I find it strange to go straight to
the detailed exposition of complex statistical methods without
explaining the context (medical time series and devices) in more
detail.
An important issue in clinical applications is the rate of false
positives:
#+begin_quote
Excessive rates of false positive alarms---in some studies more than
90% of all alarms---lead to alarm overload and eventually
desensitization of caregivers, which may ultimately jeopardize patient
safety.
#+end_quote
There are two kinds of medical devices: clinical decision support and
closed-loop controllers. /Decision support/ aims to provide the
physician with recommendations to provide the best care to the
patient. The goal of the medical device and system is to go from raw,
low-level measurements to "high-level qualitative principles", on
which medical reasoning is directly possible. This is the motivation
behind a need for abstraction, compression of information, and
interpretability.
The other kind of medical device is /physiologic closed-loop
controllers/ (PCLC). In this case, the patient is in the loop, and the
device can take action directly based on the feedback from its
measurements. Since there is no direct supervision by medical
practitioners, a lot more caution has to be applied. Moreover, these
devices generally work in hard real-time environments, making online
functioning an absolute requirement.
* Robust time series filtering
The objective here is to recover the time-varying level underlying the
data, which contains the true information about the patient's state.
We assume that the time series $y_1, \ldots, y_N$ is generated by an additive model
\[ y_t = \mu_t + \epsilon_t + \eta_t,\qquad t=1,\ldots,N, \]
where $\mu_t$ represents the signal value, $\epsilon_t$ is a noise
variable, and $\eta_t$ is an outlier variable, which is zero most of
the time but can take large absolute values at random times.
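To make the additive model concrete, here is a minimal simulation
sketch. It is not taken from the paper: the shape of the level
$\mu_t$ (a slow drift with one abrupt shift), the Gaussian noise, and
the outlier probability and size are all illustrative assumptions.

#+begin_src python
import numpy as np


def simulate_series(n=200, noise_sd=1.0, outlier_prob=0.05,
                    outlier_scale=15.0, seed=0):
    """Simulate y_t = mu_t + eps_t + eta_t (all parameters are assumptions)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    # mu_t: slow linear drift plus an abrupt level shift halfway through
    mu = 80.0 + 0.02 * t + np.where(t >= n // 2, 10.0, 0.0)
    # eps_t: ordinary measurement noise
    eps = rng.normal(0.0, noise_sd, size=n)
    # eta_t: zero most of the time, large in absolute value at random times
    is_outlier = rng.random(n) < outlier_prob
    eta = np.where(is_outlier,
                   rng.choice([-1.0, 1.0], size=n) * outlier_scale,
                   0.0)
    return mu + eps + eta, mu, eps, eta
#+end_src

Calling ~simulate_series()~ returns the observed series together with
its three components, which makes it easy to check how well a filter
recovers $\mu_t$.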
The paper reviews many methods for recovering the underlying signal
via [[https://en.wikipedia.org/wiki/State_observer][state estimation]]. Moving window techniques start from a simple
running median, which is then refined in successive variants to
improve the properties of the estimator. In each case, we obtain an
estimate of the signal level and of its variance.
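As an illustration of the simplest moving window technique, here is a
sketch of a running median. The trailing window (no look-ahead), the
window width, and the use of the median absolute deviation (MAD) as a
robust scale estimate are my assumptions, not choices prescribed by
the paper.

#+begin_src python
import numpy as np


def running_median(y, width=11):
    """Trailing-window median (level) and MAD-based scale at each time step."""
    y = np.asarray(y, dtype=float)
    level = np.full(y.shape, np.nan)
    scale = np.full(y.shape, np.nan)
    for t in range(width - 1, len(y)):
        window = y[t - width + 1 : t + 1]  # only past and current values
        level[t] = np.median(window)
        # 1.4826 makes the MAD consistent with the standard deviation
        # for Gaussian noise
        scale[t] = 1.4826 * np.median(np.abs(window - level[t]))
    return level, scale
#+end_src

The first ~width - 1~ outputs are left as NaN because the window is
not yet full. A trailing window reacts to level shifts with a lag of
about half the window width, which is the usual price of robustness
to isolated outliers.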
Going further, regression-based filtering provides an interesting
approach to locally estimate the slope and the level of the time
series. Of these, the [[https://en.wikipedia.org/wiki/Repeated_median_regression][repeated median]] (RM) regression offers a good
compromise between robustness and efficiency under normal noise.
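The repeated median fit within one window can be sketched as follows.
This follows the standard definition (the median over $i$ of the
median over $j \neq i$ of the pairwise slopes); measuring time
relative to the most recent observation, so that the intercept is the
current level, is an assumption I make for convenience.

#+begin_src python
import numpy as np


def repeated_median_fit(window):
    """Repeated median slope and level (at the last time point) of a window."""
    y = np.asarray(window, dtype=float)
    n = len(y)
    # Times relative to the most recent observation: x = -(n-1), ..., 0
    x = np.arange(n, dtype=float) - (n - 1)
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    # Pairwise slopes; the diagonal (i == j) is masked out with NaN
    slopes = dy / np.where(dx == 0, np.nan, dx)
    # Inner median over j != i, then outer median over i
    slope = np.median(np.nanmedian(slopes, axis=1))
    # Level at the current time: median of the slope-corrected values
    level = np.median(y - slope * x)
    return slope, level
#+end_src

Applying ~repeated_median_fit~ to each successive window gives an
online estimate of both the local trend and the current level. The
pairwise computation is quadratic in the window width, which is fine
for short windows but would need a dedicated update scheme for long
ones.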
Without using moving windows, [[https://en.wikipedia.org/wiki/Kalman_filter][Kalman filters]][fn:kalman] can also reconstruct the
signal, with a state that accounts for a steady state, level shifts,
slope changes, and outliers. However, it is often difficult to specify
the error structure.
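For reference, here is a sketch of a plain Kalman filter for a local
linear trend model, where the state holds only a level and a slope.
This is a deliberate simplification: it does not include the level
shift and outlier components mentioned above, and it is not robust to
outliers. The three variances are assumed values, and having to pick
them by hand is precisely the difficulty with the error structure.

#+begin_src python
import numpy as np


def kalman_local_linear_trend(y, obs_var=1.0, level_var=0.01, slope_var=0.001):
    """Filtered [level, slope] estimates for each observation in y."""
    y = np.asarray(y, dtype=float)
    F = np.array([[1.0, 1.0], [0.0, 1.0]])  # level += slope at each step
    H = np.array([[1.0, 0.0]])              # only the level is observed
    Q = np.diag([level_var, slope_var])     # process noise (assumed)
    x = np.array([y[0], 0.0])               # initial state guess
    P = np.eye(2) * 1e3                     # vague initial covariance
    out = np.empty((len(y), 2))
    for t, obs in enumerate(y):
        # Prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step with the new observation
        innovation = obs - (H @ x)[0]
        S = (H @ P @ H.T)[0, 0] + obs_var
        K = (P @ H.T)[:, 0] / S
        x = x + K * innovation
        P = (np.eye(2) - np.outer(K, H[0])) @ P
        out[t] = x
    return out
#+end_src

Each row of the result is the filtered ~[level, slope]~ at that time
step, computed from past and current observations only.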
[fn:kalman] {-} I already talked about Kalman filters when I briefly
mentioned applications [[./quaternions.html#applications][in my post on quaternions]].
* Online pattern detection
Instead of trying to recover the underlying signal, we can try to
directly detect specific events: level shifts, trend changes, and
volatility changes.
This is generally based on [[https://en.wikipedia.org/wiki/Autoregressive_model][autoregressive modelling]], which works
better if we can tolerate a small time delay for the detection.
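To illustrate why a small delay helps, here is a sketch of a level
shift detector based on an AR(1) model. It is my own simplified
construction, not a specific method from the paper: the model is
fitted by least squares on an initial reference segment, and a shift
is reported only after ~delay~ consecutive one-step prediction errors
exceed the threshold with the same sign. The names ~n_train~,
~threshold~, and ~delay~ are hypothetical parameters.

#+begin_src python
import numpy as np


def fit_ar1(y):
    """Least-squares fit of y_t = c + phi * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ coef
    return coef[0], coef[1], resid.std(ddof=2)


def detect_level_shift(y, n_train=50, threshold=3.0, delay=3):
    """Return the start index of the first detected level shift, or None."""
    y = np.asarray(y, dtype=float)
    c, phi, sd = fit_ar1(y[:n_train])  # assumes the reference segment is noisy
    run, run_sign = 0, 0
    for t in range(n_train, len(y)):
        # Standardized one-step-ahead prediction error
        z = (y[t] - (c + phi * y[t - 1])) / sd
        sign = int(np.sign(z)) if abs(z) > threshold else 0
        if sign != 0 and sign == run_sign:
            run += 1
        else:
            run_sign = sign
            run = 1 if sign != 0 else 0
        # Waiting for `delay` exceedances separates shifts from outliers
        if run >= delay:
            return t - delay + 1
    return None
#+end_src

An isolated outlier produces at most a short run of large errors, so
it is ignored, while a persistent shift keeps producing them. This
sketch only behaves well when the autocorrelation is moderate: with
~phi~ close to 1, the one-step errors after a shift quickly become
small again and a different statistic would be needed.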
* Multivariate techniques
All the techniques discussed above were designed with a single time
series in mind. However, in most real-world applications, you measure
several variables simultaneously. Applying the same analyses to
multivariate time series can be challenging. Moreover, if the
dimension is high enough, it becomes too difficult for a physician to
interpret the data and make decisions. It is therefore very important
to have methods to extract the most pertinent information from the
time series.
The idea is to apply [[https://en.wikipedia.org/wiki/Dimensionality_reduction][dimensionality reduction]] to the multivariate time
series in order to extract meaningful information. [[https://en.wikipedia.org/wiki/Principal_component_analysis][Principal component
analysis]] is too static, so dynamic versions are needed to exploit the
temporal structure. This leads to optimal linear double-infinite
filters, which
#+begin_quote
explore the dependencies between observations at different time lags
and compress the information in a multivariate time series more
efficiently than ordinary (static) principal component analysis.
#+end_quote
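The double-infinite filters of dynamic PCA are defined in the
frequency domain and are beyond a short example. As a rough
time-domain stand-in, the sketch below runs ordinary PCA on vectors
augmented with a few lagged copies of the series. This is a different
and much cruder technique, shown only to illustrate how dependencies
across time lags can enter the compression. The function name and the
~n_lags~ parameter are hypothetical.

#+begin_src python
import numpy as np


def lagged_pca(Y, n_lags=2, n_components=1):
    """PCA on lag-augmented rows [Y[t], Y[t-1], ..., Y[t-n_lags]]."""
    Y = np.asarray(Y, dtype=float)
    T, _ = Y.shape
    # Block k holds Y[t-k] for t = n_lags, ..., T-1
    Z = np.hstack([Y[n_lags - k : T - k] for k in range(n_lags + 1)])
    Z = Z - Z.mean(axis=0)
    # Principal directions are the right singular vectors of Z
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components]
    scores = Z @ loadings.T
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained
#+end_src

For an input of shape ~(T, d)~, the scores have ~T - n_lags~ rows,
since the first few time steps do not have a full set of lags. Unlike
the double-infinite filters, this only uses past values, at the cost
of optimality.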
[[https://en.wikipedia.org/wiki/Graphical_model][Graphical models]] can also be combined with dimensionality reduction to
ensure that the compressed variables contain information about the
patient's state that is understandable to physicians.
Finally, one can also use [[https://en.wikipedia.org/wiki/Cluster_analysis][clustering]] to group time series according to
their trend behaviour.
* Conclusions
To summarize, here are the key points studied in the paper.
Context: We have continuous measurements of physiological or
biochemical variables. These are acquired from medical devices
interacting with the patient, and processed by our medical system. The
system, in turn, should either help the physician in her
decision-making, or directly take action (in the case of a closed-loop
controller).
There are several issues with the basic approach:
- Measurements are noisy and contaminated by measurement artefacts
that impact the ability to make decisions based on the measurements.
- We often measure a multitude of variables, which means a lot of
complexity.
The article reviews methods to mitigate these issues: extracting the
true signal, detecting significant events, and reducing complexity to
extract clinically relevant information.
The final part of the conclusion is a very good summary of the
challenges we face when working with medical devices and algorithms:
#+begin_quote
Addressing the challenges of robust signal extraction and complexity
reduction requires:
- Deep understanding of the clinical problem to be solved,
- Deep understanding of the statistical algorithms,
- Clear identification of algorithmic problems and goals,
- Capabilities and expertise to develop new algorithms,
- Understanding of the respective medical device(s) and the
development environment,
- Acquisition of clinical data that is sufficient to support
development and validation of new algorithms.
The multitude of resulting requirements cannot be addressed by one
profession alone. Rather, close cooperation between statisticians,
engineers, and clinicians is essential for the successful development
of medical devices embedding advanced statistical algorithms.
Moreover, regulatory requirements have to be considered early on when
developing algorithms and implementing them in medical devices. The
overarching goal is to help make patient care more efficient and
safer.
#+end_quote
The complex interplay between mathematical, technical, clinical, and
regulatory requirements, and the need to interact with experts in all
these fields, are indeed what makes my job so interesting!
* References
I didn't include references to the methods I mention in this post,
since the paper itself contains a lot of citations to the relevant
literature.