The idea here is to index my posts properly. However, I haven’t had time to finish it yet.

This page is and will always be under construction!

**Data sets**

German Credit Data

**Data pre-processing**

Unsupervised data pre-processing: individual predictors

Near-zero variance predictors. Should we remove them?

**Unsupervised Learning**

*PCA*

Introduction to Principal Component Analysis (PCA)

Computing and visualizing PCA in R

**Supervised Learning**

*Classification*

*Discriminant Analysis*

Discriminant Analysis

Reduced-rank discriminant analysis

Computing and visualizing LDA in R

*Regression*

*Linear Regression*

Linear regression (according to Coursera’s ML course)

*Logistic Regression*

Logistic regression (according to Coursera’s ML course)

*Spatial Modeling*

Auto-logistic model

*Latent Gaussian Models*

Fast Bayesian Inference with INLA

Latent Gaussian Models and INLA

**Web Scraping**

The basics of XML for web-scraping

**Decision theory**

Declining marginal utility and the logarithmic utility function

**R Software**

*Software development*

devtools and testthat R packages, definitely worth using

Optimizing R with Multi-threaded OpenBLAS

Profiling R code

R scripts

*Data handling*

Reshape and aggregate data with the R package reshape2

*Visualization*

Plot matrix with the R package GGally

*Text Analysis*

Character strings in R

## Statistical Models

- Auto-logistic model (18/01/2013)
- First two weeks of Coursera’s Machine Learning (linear regression) (08/05/2013)
- Third week of Coursera’s Machine Learning (logistic regression) (15/05/2013)
- 4th and 5th week of Coursera’s Machine Learning (neural networks) (05/06/2013)
- Bayesian linear regression model – simple, yet useful results (07/08/2013)
- Latent Gaussian Models and INLA (16/10/2013)

## Approximate Methods for Statistical Inference

- INLA group (04/02/2013)
- Introduction to Variational Bayes (31/07/2013)
- Fast Bayesian Inference with INLA (09/10/2013)
- Latent Gaussian Models and INLA (16/10/2013)

## Model Selection and Model Assessment

- Bias-variance trade-off in model selection (10/04/2013)
- Overview of Supervised Learning according to (Hastie et al., 2009) (24/04/2013)
- AIC, Kullback-Leibler and a more general Information Criterion (01/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [1/3] (22/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [2/3] (29/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [3/3] (19/06/2013)
- How to properly assess point forecast (03/07/2013)
- 6th week of Coursera’s Machine Learning (advice on applying machine learning) (17/07/2013)
- Posterior predictive checks (28/08/2013)
- 6th week of Coursera’s Machine Learning (Error analysis) (18/09/2013)

## Numerics

- Fast, simple and useful numerical integration methods (02/10/2013)
- Numerical computation of quantiles (23/10/2013)

## Other Statistical Concepts

- Kullback-Leibler divergence (10/07/2013)

## Tools/Software

##### – R

- ODB R package (02/04/2013)
- devtools and testthat R packages, definitely worth using (26/06/2013)
- Optimizing R with Multi-threaded OpenBLAS (21/08/2013)
- Profiling R code (25/09/2013)
- Latent Gaussian Models and INLA (16/10/2013)
- Numerical computation of quantiles (23/10/2013)
- Reshape and aggregate data with the R package reshape2 (31/10/2013)
- Unsupervised data pre-processing for predictive modeling (07/11/2013)

##### – UNIX-like

- Run long computations remotely with screen (22/01/2013)
- Scheduling R scripts to run on a regular basis (11/09/2013)

##### – Others

- How to draw neural network diagrams using Graphviz (12/06/2013)
- Using Dropbox as a private git repository (24/07/2013)
- LaTeX and WordPress.com (14/08/2013)
- The basics of XML for web-scraping (04/09/2013)

## Book summaries and/or comments

##### – The elements of statistical learning: data mining, inference and prediction, by Trevor Hastie, Robert Tibshirani and Jerome Friedman (book link)

- Overview of Supervised Learning according to (Hastie et al., 2009) (24/04/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [1/3] (22/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [2/3] (29/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [3/3] (19/06/2013)

##### – Coursera’s Machine Learning course, by Andrew Ng (course link)

- First two weeks of Coursera’s Machine Learning (linear regression) (08/05/2013)
- Third week of Coursera’s Machine Learning (logistic regression) (15/05/2013)
- 4th and 5th week of Coursera’s Machine Learning (neural networks) (05/06/2013)
- 6th week of Coursera’s Machine Learning (advice on applying machine learning) (17/07/2013)
- 6th week of Coursera’s Machine Learning (Error analysis) (18/09/2013)

## Uncategorized (yet)

- How much gold is there in the world? (02/04/2013)
- College education. Is it for everyone? (05/04/2013)
- The $100 Startup – Guidelines to set your own microbusiness (09/04/2013)
- Productivity Paradox and Prediction Failures (21/04/2013)
