Statistics
- Knowledge of Linear Models and Generalised Linear Models (including logistic regression), both in theory and in applications
- Classical statistical inference (maximum likelihood estimation, method of moments, minimum-variance unbiased estimators) and testing (including goodness of fit)
- Nonparametric statistics
- Bootstrap methods, hidden Markov models
- Knowledge of Bayesian Analysis techniques for inference and testing: Markov Chain Monte Carlo, Approximate Bayesian Computation, Reversible Jump MCMC
- Good knowledge of R for statistical modelling and plotting
Data Analysis
- Experience with large datasets, for classification and regression
- Descriptive statistics, plotting (with dimensionality reduction)
- Data cleaning and formatting
- Experience with unstructured data coming directly from embedded sensors to a microcontroller
- Experience with large graph and network data
- Experience with live data from APIs
- Data analysis with pandas, xarray (Python) and the tidyverse (R)
- Basic knowledge of SQL
Graph and Network Analysis
- Research project on community detection and graph clustering (theory and implementation)
- Research project on Topological Data Analysis for time-dependent networks
- Random graph models
- Estimation in networks (Stein’s method for Normal and Poisson approximation)
- Network Analysis with NetworkX, graph-tool (Python) and igraph (R and Python)
Time Series Analysis
- Experience analysing inertial sensor data (accelerometer, gyroscope, magnetometer), both in real time and in post-processing
- Use of statistical methods for step detection, gait detection, and trajectory reconstruction
- Kalman filtering, Fourier and wavelet analysis
- Machine Learning methods applied to time series (decision trees, SVMs and Recurrent Neural Networks in particular)
- Experience with signal processing functions in NumPy and SciPy (Python)
Machine Learning
- Experience in Dimensionality Reduction (PCA, MDS, Kernel PCA, Isomap, spectral clustering)
- Theoretical knowledge and practical experience of the most common methods: random forests, SVMs, neural networks (including CNNs and RNNs)
- Bagging and boosting estimators
- Cross-validation
- Kernel methods, reproducing kernel Hilbert spaces, collaborative filtering, variational Bayes, Gaussian processes
- Machine Learning libraries: scikit-learn, PyTorch, TensorFlow, Keras
Simulation
- Inversion, Transformation, Rejection, and Importance sampling
- Gibbs sampling
- Metropolis-Hastings
- Reversible jump MCMC
- Hidden Markov Models and Sequential Monte Carlo Methods