Add post on Online Analysis of Medical Time Series

2020-11-17 18:20:23 +01:00 · cd8fff45ff
commit cd8fff45ff
parent 72b7241b73
2 changed files with 208 additions and 0 deletions
--- a/bib/bibliography.bib
+++ b/bib/bibliography.bib
@@ -454,3 +454,18 @@
  doi =		 {10.1007/978-3-319-20735-3},
  isbn =	 9783319207346,
 }
@article{fried2017_onlin_analy_medic_time_series,
  author =	 {Roland Fried and Sermad Abbas and Matthias Borowski
                  and Michael Imhoff},
  title =	 {Online Analysis of Medical Time Series},
  journal =	 {Annual Review of Statistics and Its Application},
  volume =	 {4},
  number =	 {1},
  pages =	 {169-188},
  year =	 {2017},
  doi =		 {10.1146/annurev-statistics-060116-054148},
  url =
                  {https://doi.org/10.1146/annurev-statistics-060116-054148},
  DATE_ADDED =	 {Tue Nov 17 08:59:07 2020},
 }
--- a/posts/online-analysis-of-medical-time-series.org
+++ b/posts/online-analysis-of-medical-time-series.org
@@ -0,0 +1,193 @@
 ---
 title: "Online Analysis of Medical Time Series"
 date: 2020-11-17
 toc: false
 ---
 This is a short overview of the following paper by
 cite:fried2017_onlin_analy_medic_time_series:
 #+begin_quote
 Fried, Roland, Sermad Abbas, Matthias Borowski, and Michael Imhoff. 2017. “Online Analysis of Medical Time Series.” /Annual Review of Statistics and Its Application/ 4 (1): 169--88. [[https://doi.org/10.1146/annurev-statistics-060116-054148]].
 #+end_quote
 [fn:: {-} Unfortunately, most of the papers from /Annual Reviews/ are
 not open access. I hope the situation will improve in the future, but
 in the meantime there is [[https://en.wikipedia.org/wiki/Sci-Hub][Sci-Hub]].]
 As the title suggests, it is a very complete review of statistical
 models for studying medical time series in an online setting. It
 appeared in [[https://www.annualreviews.org/][/Annual Reviews/]], which publishes very nice reviews on
 topics from a [[https://www.annualreviews.org/action/showPublications][wide variety of fields]].
 Since I work on developing algorithms for a [[https://www.sysnav.fr/markets/heathcare/?lang=en][medical device]], this is
 particularly relevant for my job!
 * Context: clinical applications and devices, and the need for robust statistical analysis
 The goal of online medical time series analysis is to detect relevant
 patterns, such as trends, trend changes, and abrupt jumps, in order to
 feed online decision support systems.
 The paper (section 5)[fn:section5] goes on to explain the motivation
 for developing robust methods of time series analysis for healthcare
 applications.
 [fn:section5] {-} The section explaining the motivation behind the
 review is at the end of the paper. I find it strange to go straight to
 the detailed exposition of complex statistical methods without
 explaining the context (medical time series and devices) in more
 detail.
 An important issue in clinical applications is the rate of false
 positives:
 #+begin_quote
 Excessive rates of false positive alarms---in some studies more than
 90% of all alarms---lead to alarm overload and eventually
 desensitization of caregivers, which may ultimately jeopardize patient
 safety.
 #+end_quote
 There are two kinds of medical devices: clinical decision support and
 closed-loop controllers. /Decision support/ aims to provide the
 physician with recommendations to provide the best care to the
 patient. The goal of the medical device and system is to go from raw,
 low-level measurements to "high-level qualitative principles", on
 which medical reasoning is directly possible. This motivates the
 need for abstraction, compression of information, and
 interpretability.
 The other kind of medical device is /physiologic closed-loop
 controllers/ (PCLC). In this case, the patient is in the loop, and the
 device can take action directly based on the feedback from its
 measurements. Since there is no direct supervision by medical
 practitioners, a lot more caution has to be applied. Moreover, these
 devices generally work in hard real-time environments, making online
 functioning an absolute requirement.
 * Robust time series filtering
 The objective here is to recover the time-varying level underlying the
 data, which contains the true information about the patient's state.
 We assume that the time series $y_1, \ldots, y_N$ is generated by an additive model
 \[ y_t = \mu_t + \epsilon_t + \eta_t,\qquad t=1,\ldots,N, \]
 where $\mu_t$ represents the signal value, $\epsilon_t$ is a noise
 variable, and $\eta_t$ is an outlier variable, which is zero most of the
 time but can take large absolute values at random times.
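 To make the additive model concrete, here is a minimal simulation
 sketch of my own (it is not taken from the paper). The function name
 =simulate_series=, the linear trend, and all parameter values are
 assumptions chosen purely for illustration.

```python
import random

def simulate_series(n, outlier_prob=0.05, noise_sd=1.0, outlier_size=10.0, seed=0):
    """Simulate y_t = mu_t + eps_t + eta_t with a slowly rising level (illustrative values)."""
    rng = random.Random(seed)
    mu, y = [], []
    for t in range(n):
        level = 80.0 + 0.05 * t            # mu_t: smooth underlying signal
        eps = rng.gauss(0.0, noise_sd)     # eps_t: observational noise
        eta = 0.0                          # eta_t: zero most of the time...
        if rng.random() < outlier_prob:    # ...but occasionally a large spike
            eta = rng.choice([-1.0, 1.0]) * outlier_size
        mu.append(level)
        y.append(level + eps + eta)
    return mu, y

mu, y = simulate_series(200)
```

 The filtering problem is then to recover =mu= from =y= alone, one
 observation at a time.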
 The paper reviews many methods for recovering the underlying signal
 via [[https://en.wikipedia.org/wiki/State_observer][state estimation]]. Moving window techniques start from a simple
 running median and go through successive iterations to improve the
 properties of the estimator. Within each window, we can estimate both
 the level of the signal and its variance.
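 The simplest of these moving window techniques, the running median,
 can be sketched as follows. This is my own illustration, assuming a
 trailing window (so that it works online) and using the median
 absolute deviation (MAD) as the robust scale estimate. The paper
 discusses more refined variants.

```python
from statistics import median

def running_median(y, width):
    """Online running median: at time t, use only the last `width` observations."""
    levels, scales = [], []
    for t in range(len(y)):
        window = y[max(0, t - width + 1): t + 1]
        level = median(window)
        # MAD, rescaled by 1.4826 so that it estimates the standard
        # deviation when the noise is Gaussian
        scale = 1.4826 * median(abs(v - level) for v in window)
        levels.append(level)
        scales.append(scale)
    return levels, scales

# A single spike does not move the estimated level
levels, scales = running_median([1.0, 1.0, 1.0, 100.0, 1.0, 1.0, 1.0], width=5)
```

 The isolated value of 100 is ignored, which is exactly the behaviour
 we want when $\eta_t$ is occasionally large.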
 Going further, regression-based filtering provides an interesting
 approach to locally estimate the slope and the level of the time
 series. Of these, the [[https://en.wikipedia.org/wiki/Repeated_median_regression][repeated median]] (RM) regression offers a good
 compromise between robustness to outliers and efficiency under normal noise.
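 As an illustration, here is a small sketch of repeated median
 regression on a single window, following Siegel's definition: the
 slope is the median over $i$ of the median over $j \neq i$ of the
 pairwise slopes. The function name and the choice of indexing time
 relative to the latest observation (so that the intercept is the
 current level) are my own assumptions.

```python
from statistics import median

def repeated_median_fit(window):
    """Fit a robust line to a window (at least 2 points); return (level at last point, slope)."""
    n = len(window)
    # Time indices relative to the most recent observation: -(n-1), ..., -1, 0
    pts = list(zip(range(-(n - 1), 1), window))
    # Repeated median slope: median over i of the median over j != i
    slope = median(
        median((yi - yj) / (ti - tj) for tj, yj in pts if tj != ti)
        for ti, yi in pts
    )
    # Level at time 0 (the latest observation): median of the residual intercepts
    level = median(yi - slope * ti for ti, yi in pts)
    return level, slope

# A perfect line y = 2t + 1 with one gross outlier in the middle
window = [2.0 * t + 1.0 for t in range(9)]
window[4] = 1000.0
level, slope = repeated_median_fit(window)
```

 Applying this to each trailing window gives an online estimate of
 both level and slope. Note that this naive version costs $O(n^2)$ per
 window.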
 Without using moving windows, [[https://en.wikipedia.org/wiki/Kalman_filter][Kalman filters]][fn:kalman] can also reconstruct the
 signal by including in their state a steady state, a level shift, a
 slope change, and outliers. However, it is often difficult to specify
 the error structure.
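 To show where the error structure enters, here is a deliberately
 simplified sketch: a scalar Kalman filter for a local level (random
 walk) model. It is not the richer state model described in the paper
 (no slope, level shift, or outlier components), and the names
 =process_var= and =obs_var= are my own.

```python
def local_level_kalman(y, process_var, obs_var, init_level=0.0, init_var=1e6):
    """Scalar Kalman filter for mu_t = mu_{t-1} + w_t, y_t = mu_t + v_t."""
    level, var = init_level, init_var
    filtered = []
    for obs in y:
        # Predict: the level follows a random walk, so only the uncertainty grows
        var = var + process_var
        # Update: blend prediction and observation using the Kalman gain
        gain = var / (var + obs_var)
        level = level + gain * (obs - level)
        var = (1.0 - gain) * var
        filtered.append(level)
    return filtered

filtered = local_level_kalman([5.0] * 50, process_var=0.01, obs_var=1.0)
```

 The two variances have to be chosen by the user, and the result
 depends heavily on them. This plain version is also not robust: a
 single large outlier will pull the estimate towards it.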
 [fn:kalman] {-} I already talked about Kalman filters when I briefly
 mentioned applications [[./quaternions.html#applications][in my post on quaternions]].
 * Online pattern detection
 Instead of trying to recover the underlying signal, we can try to
 directly detect certain events: level shifts, trend changes, volatility
 changes.
 This is generally based on [[https://en.wikipedia.org/wiki/Autoregressive_model][autoregressive modelling]], which works better
 if we can tolerate a small time delay in the detection.
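 To illustrate the role of the time delay, here is a sketch of a level
 shift detector. Note that this is *not* the autoregressive approach
 from the paper: I swapped in a simpler rule that compares the median
 of the last few points with the median of a preceding reference
 window. The function name, window sizes, and threshold are all
 hypothetical.

```python
from statistics import median

def detect_level_shifts(y, ref_width=20, delay=5, threshold=4.0):
    """Return the times t at which the last `delay` points deviate from the preceding window."""
    alarms = []
    for t in range(ref_width + delay - 1, len(y)):
        recent = y[t - delay + 1: t + 1]
        reference = y[t - delay - ref_width + 1: t - delay + 1]
        ref_level = median(reference)
        # Robust scale of the reference window (MAD), floored to avoid dividing by zero
        scale = max(1.4826 * median(abs(v - ref_level) for v in reference), 1e-9)
        if abs(median(recent) - ref_level) / scale > threshold:
            alarms.append(t)
    return alarms

# Alternating +/- 0.5 "noise" with a jump of 10 units at t = 60
y = [(10.0 if t >= 60 else 0.0) + (0.5 if t % 2 == 0 else -0.5) for t in range(120)]
alarms = detect_level_shifts(y)
```

 The first alarm is only raised once a majority of the recent window
 lies after the jump. A longer =delay= makes the detector harder to
 fool with isolated outliers, but slower to react.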
 * Multivariate techniques
 All the techniques discussed above were designed with a single time
 series in mind. However, in most real-world applications, you measure
 several variables simultaneously. Applying the same analyses to
 multivariate time series can be challenging. Moreover, if the
 dimension is high enough, the data become too difficult for a
 physician to interpret and act upon. It is therefore very important to
 have methods to extract the most pertinent and important information
 from the time series.
 The idea is to apply [[https://en.wikipedia.org/wiki/Dimensionality_reduction][dimensionality reduction]] to the multivariate time
 series in order to extract meaningful information. [[https://en.wikipedia.org/wiki/Principal_component_analysis][Principal component
 analysis]] is too static, so dynamic versions are needed to exploit the
 temporal structure. This leads to optimal linear double-infinite
 filters, which
 #+begin_quote
 explore the dependencies between observations at different time lags
 and compress the information in a multivariate time series more
 efficiently than ordinary (static) principal component analysis.
 #+end_quote
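 For reference, here is what the ordinary (static) version looks like
 in a minimal sketch: the first principal component computed by power
 iteration on the sample covariance matrix. This is only the static
 baseline, not the dynamic filters from the paper. The function name
 is my own, and the sketch assumes the data is not degenerate (nonzero
 variance, and a starting vector not orthogonal to the leading
 direction).

```python
import math

def first_principal_component(rows, n_iter=200):
    """Leading principal direction of multivariate observations (one row per time point)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores: one compressed value per time point
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Three variables, two of which move together and one that is constant
rows = [[float(t), 2.0 * t, 0.0] for t in range(10)]
direction, scores = first_principal_component(rows)
```

 Each time point is compressed independently here, ignoring what
 happened at neighbouring time points. That is the limitation the
 dynamic versions address.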
 [[https://en.wikipedia.org/wiki/Graphical_model][Graphical models]] can also be combined with dimensionality reduction to
 ensure that the compressed variables contain information about the
 patient's state that is understandable to physicians.
 Finally, one can also use [[https://en.wikipedia.org/wiki/Cluster_analysis][clustering]] to group time series according to
 their trend behaviour.
 * Conclusions
 To summarize, here are the key points studied in the paper.
 Context: We have continuous measurements of physiological or
 biochemical variables. These are acquired from medical devices
 interacting with the patient, and processed by our medical system. The
 system, in turn, should either help the physician in her
 decision-making, or directly take action (in the case of a closed-loop
 controller).
 There are several issues with the basic approach:
 - Measurements are noisy and contaminated by measurement artefacts
  that impact the ability to make decisions based on the measurements.
 - We often measure a multitude of variables, which means a lot of
  complexity.
 The article reviews methods to mitigate these issues: extracting the
 true signal, detecting significant events, and reducing complexity to
 extract clinically relevant information.
 The final part of the conclusion is a very good summary of the
 challenges we face when working with medical devices and algorithms:
 #+begin_quote
 Addressing the challenges of robust signal extraction and complexity
 reduction requires:
 - Deep understanding of the clinical problem to be solved,
 - Deep understanding of the statistical algorithms,
 - Clear identification of algorithmic problems and goals,
 - Capabilities and expertise to develop new algorithms,
 - Understanding of the respective medical device(s) and the
  development environment,
 - Acquisition of clinical data that is sufficient to support
  development and validation of new algorithms.
 The multitude of resulting requirements cannot be addressed by one
 profession alone. Rather, close cooperation between statisticians,
 engineers, and clinicians is essential for the successful development
 of medical devices embedding advanced statistical algorithms.
 Moreover, regulatory requirements have to be considered early on when
 developing algorithms and implementing them in medical devices. The
 overarching goal is to help make patient care more efficient and
 safer.
 #+end_quote
 The complex interplay between mathematical, technical, clinical, and
 regulatory requirements, and the need to interact with experts in all
 these fields, are indeed what makes my job so interesting!
 * References
 I didn't include references to the methods I mention in this post,
 since the paper itself contains a lot of citations to the relevant
 literature.