Time Series Anomaly Detection

Time Series Anomaly Detection with Reinforcement Learning

Yonsei University Artificial Intelligence, 9th cohort, 이상민

Summer 2022


Introduction

I thought time series anomaly detection and the sparse reward problem in reinforcement learning were analogous. Most cases (time stamps) are not anomalous, so if the agent received rewards only at anomaly points, time series anomaly detection would become a sparse reward problem. I tried the Intrinsic Curiosity Module (ICM), which uses intrinsic rewards to address sparse rewards. In the beginning, to make the time series anomaly detection task a sparse reward problem, I approximated the TN and FP rewards to zero, sending only a small positive signal (+ $\epsilon$, $\epsilon \ll 1$) and a small negative signal (- $\epsilon$ ). But then the agent predicted anomaly far too often (and so Recall $\approx$ 1).
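As a minimal sketch of that initial near-sparse reward scheme (the constants and function name below are illustrative assumptions, not code from this repository):

```python
# Illustrative sketch of the initial "near-sparse" extrinsic reward described above.
# The concrete values (EPS, TP/FN rewards) are assumptions, not taken from this repository.
EPS = 1e-3  # tiny signal so TN/FP rewards are "approximately zero"

def extrinsic_reward(action: int, label: int) -> float:
    """action/label: 1 = anomaly, 0 = normal."""
    if action == 1 and label == 1:   # TP: correctly flagged anomaly
        return 1.0
    if action == 0 and label == 1:   # FN: missed anomaly
        return -1.0
    if action == 0 and label == 0:   # TN: tiny positive signal
        return EPS
    return -EPS                      # FP: tiny negative signal
```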

So I had to re-approach the problem. I moved the encoder out of the ICM and made it shared by the DQN and the ICM. The encoder (an LSTM) is trained with the inverse model of the ICM by supervised learning, so it learns to extract useful features. With these features, the Q-function in the DQN approximates Q-values better.
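A rough sketch of this wiring, assuming PyTorch and hypothetical module names (not the actual classes in `models/model.py`): the shared LSTM encoder feeds both the Q-network and the ICM inverse model, and training the inverse model to predict the taken action also updates the encoder.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """LSTM encoder shared by the DQN and the ICM (hypothetical sketch)."""
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)

    def forward(self, x):            # x: (batch, window, input_dim)
        _, (h, _) = self.lstm(x)
        return h[-1]                 # (batch, hidden_dim) feature

class InverseModel(nn.Module):
    """ICM inverse model: predicts the action from phi(s_t) and phi(s_{t+1})."""
    def __init__(self, hidden_dim: int, n_actions: int = 2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),
        )

    def forward(self, phi_s, phi_next):
        return self.head(torch.cat([phi_s, phi_next], dim=-1))

# Training the inverse model with cross-entropy on the taken action also
# updates the shared encoder, which the DQN then reuses to estimate Q-values.
```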

I use two buffers: an anomalous buffer and a normal buffer. The agent's anomalous experiences are stored in the anomalous buffer and its normal experiences in the normal buffer. During training, the agent samples a batch in which $\alpha$% of the transitions are anomalous, where $\alpha$ is chosen by you.

  1. Easy to control: Reward

    • If you need to never miss anomalies, just give a larger negative reward to False Negatives.
  2. Sample as much anomalous data as you want: Replay Buffer

    • One reason the anomaly detection problem is so hard is the excessively unbalanced data available to train the model. By storing anomalous and normal experiences separately, we can control the ratio of anomalous data in each training batch.

Main

model

  • LSTM used as the encoder
  • Actions taken by an $\epsilon$-greedy policy over Q-values
  • Total Reward = Intrinsic Reward (agent's prediction error) + Extrinsic Reward (from the environment)
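A minimal sketch of the last two bullets (the function names and the `beta` scaling factor are assumptions, not the repository's API):

```python
import random
import torch

def select_action(q_values: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy over Q-values for the two actions {0: normal, 1: anomaly}."""
    if random.random() < epsilon:
        return random.randrange(q_values.shape[-1])
    return int(q_values.argmax(dim=-1).item())

def total_reward(intrinsic: float, extrinsic: float, beta: float = 1.0) -> float:
    """Total reward = intrinsic (agent's prediction error) + extrinsic (environment).
    beta is a hypothetical scaling factor for the intrinsic term."""
    return beta * intrinsic + extrinsic
```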

Encoder

  • Encoder trained with the inverse model of the ICM

buffer

  • Store experiences in two buffers: anomalous experiences in one buffer and normal experiences in the other
  • Sample anomalous and normal data from their respective buffers
  • batch = $\alpha \times$ batch size anomalous samples + $(1-\alpha) \times$ batch size normal samples, where $\alpha$ is the anomaly sample ratio you want
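A compact sketch of this dual-buffer sampling, with hypothetical class and method names rather than the actual `util/ExperienceReplay.py` interface:

```python
import random
from collections import deque

class DualReplayBuffer:
    """Two buffers: one for anomalous transitions, one for normal transitions."""
    def __init__(self, capacity: int = 100_000):
        self.anomalous = deque(maxlen=capacity)
        self.normal = deque(maxlen=capacity)

    def push(self, transition, is_anomaly: bool):
        (self.anomalous if is_anomaly else self.normal).append(transition)

    def sample(self, batch_size: int, alpha: float):
        """alpha = desired ratio of anomalous transitions in the batch."""
        n_anom = int(alpha * batch_size)
        n_norm = batch_size - n_anom
        batch = random.sample(list(self.anomalous), min(n_anom, len(self.anomalous)))
        batch += random.sample(list(self.normal), min(n_norm, len(self.normal)))
        random.shuffle(batch)
        return batch
```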

Metric

F1-Score : Harmonic mean of Precision and Recall

$$\LARGE{\text{F1 score } = \frac{2}{\frac{1}{Precision} + \frac{1}{Recall}} = 2 \times \frac{Precision \times Recall}{Precision + Recall}}$$

$$\large{\text{where } Precision = \frac{TP}{TP+FP} \text{ and } Recall = \frac{TP}{TP+FN}}$$
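For reference, the same computation as a small Python helper (a plain reference implementation, not necessarily how `util/metric.py` computes it):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```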


Super-state


Data

  • Yahoo A1 Benchmark
    Real data (traffic to Yahoo services): time series representing the metrics of various Yahoo services.
  • Yahoo A2 Benchmark
    Synthetic (simulated) time series.
  • AIOps KPI
    Labeled time series anomaly detection datasets from the AIOps Challenge.

Test

python test.py

Structure


dataset
   A1Benchmark
       real_#.csv
   A2Benchmark
       synthetic_#.csv
   AIOps
       KPI.csv

datasets
   KPI.py
   Yahoo.py
   build_data.py

util
   ExperienceReplay.py
   metric.py
   sliding_window.py

models
   agent.py
   env.py
   model.py

pretrained
   Super-state

main.py
test.py
config.py

