Axis aligned artifacts
There are artifacts created by choosing axis-aligned cuts in robust random cut forests, similar to what was noted with IsoForest. [Figure: left, the original data distribution; right, the learned co-displacement, darker is lower.] Notice the echoes around (10, -10) and (-10, 10). If, instead of either of these, you use the depth in the robust random cut forest, you get what is shown above. The first two examples are recreated by the code below:
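The original post's code isn't reproduced here, but the echo effect is easy to demonstrate. This is a minimal sketch using scikit-learn's IsolationForest as a stand-in (the post's robust random cut forest implementation is assumed, not shown): two diagonal Gaussian clusters are fit, and a grid of scores is computed. The empty regions at (10, -10) and (-10, 10) share an x-band and a y-band with the clusters, so axis-aligned splits rarely isolate points there quickly, producing the echoes.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Two Gaussian clusters centered at (10, 10) and (-10, -10).
X = np.vstack([
    rng.normal(loc=(10, 10), scale=1.0, size=(500, 2)),
    rng.normal(loc=(-10, -10), scale=1.0, size=(500, 2)),
])

clf = IsolationForest(n_estimators=200, random_state=0).fit(X)

# Score a grid of points; higher score_samples means "more normal".
xx, yy = np.meshgrid(np.linspace(-15, 15, 61), np.linspace(-15, 15, 61))
scores = clf.score_samples(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Because every split is axis aligned, the empty regions near (10, -10)
# and (-10, 10) tend to look less anomalous than equally empty off-axis
# regions -- these are the "echoes" in the figure.
echo = clf.score_samples(np.array([[10.0, -10.0]]))[0]
print(echo)
```

Plotting `scores` as a heat map (e.g. with `matplotlib.pyplot.imshow`) reproduces the cross-shaped artifact pattern.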
Training an autoencoder with mostly noise
I am working on a project where we wish to use anomaly detection to find which image patches have structure and which don't. As an aside, I ran an experiment on MNIST: take 500 images of fives and 5000 images of pure noise, and train a deep convolutional autoencoder on the combined set. What you end up with is the reconstruction shown in the accompanying figure, where the top row is the inputs and the bottom row is the reconstructions.
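The experimental setup can be sketched in a few lines. This is a deliberately simplified stand-in, not the original experiment: a tied-weight linear autoencoder trained with plain NumPy gradient descent replaces the deep convolutional one, the "fives" are a fixed 784-dimensional template plus small noise, and the sample counts are scaled down (keeping the 1:10 structured-to-noise ratio) for speed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (hypothetical, not real MNIST): a 1:10 mix of "structured"
# images (a fixed 784-dim template plus small noise, playing the fives)
# and pure-noise images, scaled down to 100 + 1000 samples for speed.
template = rng.normal(size=784)
structured = template + 0.1 * rng.normal(size=(100, 784))
noise = rng.normal(size=(1000, 784))
X = np.vstack([structured, noise])

# Tied-weight linear autoencoder: encode Z = X W, decode Xhat = Z W^T.
k = 32                               # bottleneck width
W = 0.01 * rng.normal(size=(784, k))

initial_mse = np.mean((X @ W @ W.T - X) ** 2)

lr = 3e-5
for _ in range(100):
    err = (X @ W) @ W.T - X
    # Gradient of ||X W W^T - X||_F^2 with respect to W.
    grad = 2.0 * (X.T @ (err @ W) + err.T @ (X @ W)) / len(X)
    W -= lr * grad

final_mse = np.mean((X @ W @ W.T - X) ** 2)
print(initial_mse, final_mse)
```

Even in this linear toy, the bottleneck pushes the model toward the directions of highest shared variance, which is what makes the structured minority's reconstructions interesting when most of the training data is noise.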
Goal of Anomaly Detection in Non-stationary Data
I was explaining anomaly detection in non-stationary data to someone and threw together this crude example figure. The blue points are nominal and make up 90% of the data; the red points are anomalous and make up the remaining 10%. In this example, the red data is stationary while the blue passes through it, so it would be very difficult to differentiate the red and blue points while they overlap. However, even given only a few frames of this video, we would like to be able to recognize that there are two distinct dynamics at work.
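Data like the figure's is easy to generate. This is a hedged sketch of one way to do it (the names `frame`, the cluster locations, and the drift schedule are my choices, not taken from the original figure): the nominal cluster's mean moves with time while the anomalous cluster stays put.

```python
import numpy as np

rng = np.random.default_rng(0)

def frame(t, n=200, anomaly_frac=0.10):
    """One frame of the toy video: nominal (blue) points drift with t,
    anomalous (red) points stay fixed near the origin."""
    n_red = int(n * anomaly_frac)
    n_blue = n - n_red
    # Blue: non-stationary -- the cluster center moves along x with t.
    blue = rng.normal(loc=(t, 0.0), scale=1.0, size=(n_blue, 2))
    # Red: stationary throughout the sequence.
    red = rng.normal(loc=(0.0, 0.0), scale=1.0, size=(n_red, 2))
    labels = np.r_[np.zeros(n_blue), np.ones(n_red)]
    return np.vstack([blue, red]), labels

# Around t = 0 the two populations overlap and no single frame can
# separate them; across frames, only the blue dynamic moves.
X0, y0 = frame(-5.0)
X1, y1 = frame(0.0)
X2, y2 = frame(5.0)
```

A per-frame detector sees one mixed cloud at `t = 0`; only a model of the dynamics across frames can tell the two populations apart.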
Atypicality Presentation Recap
Yesterday, I gave a presentation introducing the ideas of atypicality to the Monteleoni research group. These are the slides and handwritten notes. I plan to explore this idea further and write up better LaTeX notes, which I will then share as well. For now, the idea of atypicality centers around using two coders: one trained to perform best on typical data and one that is universal and not data-specific. A sequence is atypical if its code length under the typical coder is longer than under the universal coder, i.e., if the data-agnostic coder compresses it better.
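The two-coder test can be illustrated with a toy example. This is my own minimal sketch, not the coders from the presentation: the "typical" coder is a smoothed empirical symbol model fit to typical data (code length = negative log probability in bits), and the "universal" coder is approximated by a uniform code over the alphabet.

```python
import math
from collections import Counter

def typical_codelength(seq, counts, alphabet):
    """Code length (bits) of seq under a model fit to typical data
    (empirical symbol frequencies with add-one smoothing)."""
    total = sum(counts.values()) + len(alphabet)
    return -sum(math.log2((counts[s] + 1) / total) for s in seq)

def universal_codelength(seq, alphabet):
    """Code length (bits) under a data-agnostic uniform coder."""
    return len(seq) * math.log2(len(alphabet))

alphabet = "ab"
typical_data = "a" * 90 + "b" * 10      # typical source: mostly 'a'
counts = Counter(typical_data)

for seq in ["aaaaaaaaab", "bbbbbbbbba"]:
    t = typical_codelength(seq, counts, alphabet)
    u = universal_codelength(seq, alphabet)
    print(seq, "atypical" if t > u else "typical")
```

A mostly-`a` sequence codes cheaply under the typical model, while a mostly-`b` sequence costs more bits under the typical coder than under the uniform one and is flagged as atypical.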