Singular Value Decomposition Part 2: Theorem, Proof, Algorithm

Math ∩ Programming

I’m just going to jump right into the definitions and rigor, so if you haven’t read the previous post motivating the singular value decomposition, go back and do that first. This post will be theorem, proof, algorithm, data. The data set we test on is a thousand-story CNN news data set. All of the data, code, and examples used in this post is in a github repository, as usual.

We start with the best-approximating $latex k$-dimensional linear subspace.

Definition: Let $latex X = { x_1, dots, x_m }$ be a set of $latex m$ points in $latex mathbb{R}^n$. The best approximating $latex k$-dimensional linear subspace of $latex X$ is the $latex k$-dimensional linear subspace $latex V subset mathbb{R}^n$ which minimizes the sum of the squared distances from the points in $latex X$ to $latex V$.

Let me clarify what I mean by minimizing the sum of squared distances. First we’ll start with the…

View original post 5,066 more words

Dynamic Time Warping averaging of time series allows faster and more accurate classification

the morning paper

Dynamic Time Warping averaging of time series allows faster and more accurate classification – Petitjean et al. ICDM 2014

For most time series classification problems, using the Nearest Neighbour algorithm (find the nearest neighbour within the training set to the query) is the technique of choice. Moreover, when determining the distance to neighbours, we want to use Dynamic Time Warping (DTW) as the distance measure.

Despite the optimisations we looked at earlier this week to improve the efficiency of DTW, the authors argue there remain situations where DTW (or even Euclidean distance) has severe tractability issues, particularly on resource constrained devices such as wearable computers and embedded medical devices. There is a great example of this in the evaluation section, where recent work has shown that it is possible to classify flying insects with high accuracy by converting the audio of their flight (buzzing bees, and so on).

The…

View original post 1,398 more words