Learning Representation for Anomaly Detection of Vehicle Trajectories (IROS 2023)
- Ruochen Jiao1
- Juyang Bai1
- Xiangguo Liu1
- Takami Sato2
- Xiaowei Yuan1
- Qi Alfred Chen2
- Qi Zhu1

1Northwestern University, 2University of California, Irvine
Abstract
Predicting the future trajectories of surrounding vehicles based on their history trajectories is a critical task in autonomous driving. However, when small crafted perturbations are introduced to those history trajectories, the resulting anomalous (or adversarial) trajectories can significantly mislead the future trajectory prediction module of the ego vehicle, which may result in unsafe planning and even fatal accidents. Therefore, it is of great importance to detect such anomalous trajectories of the surrounding vehicles for system safety, but few works have addressed this issue. In this work, we propose two novel methods for learning effective and efficient representations for online anomaly detection of vehicle trajectories. Different from general time-series anomaly detection, anomalous vehicle trajectory detection deals with much richer contexts on the road and fewer observable patterns on the anomalous trajectories themselves. To address these challenges, our methods exploit contrastive learning techniques and trajectory semantics to capture the patterns underlying the driving scenarios for effective anomaly detection under supervised and unsupervised settings, respectively. We conduct extensive experiments to demonstrate that our supervised method based on contrastive learning and unsupervised method based on reconstruction with semantic latent space can significantly improve the performance of anomalous trajectory detection in their corresponding settings over various baseline methods. We also demonstrate our methods' generalization ability to detect unseen patterns of anomalies.
Introduction

Autonomous driving is attracting increasing attention and is expected to improve road safety. For safe operation in real-world environments, an autonomous vehicle should not only understand the current state of nearby traffic participants but also be able to predict their future states. However, current learning-based motion forecasting modules, while effective, are vulnerable to adversarial attacks. Anomalous trajectories of surrounding vehicles may mislead the autonomous vehicle's motion forecasting and decision-making modules, which can result in fatal consequences. Therefore, effectively detecting anomalous trajectories of surrounding vehicles is essential for enhancing the safety of autonomous vehicles. In this project, we apply both supervised and unsupervised approaches and compare their performance.
Dataset Preparation
To generate adversarial trajectories, we add perturbations to normal trajectories. A perturbation alters the spatial coordinates (location features) of a normal trajectory in order to mislead the prediction for the target vehicle. The dataset we use is ApolloScape, and the GRIP++ model is used to perform trajectory prediction.
We use six metrics to calculate the prediction error, which are then used to evaluate the effectiveness of the attack.
Average displacement error (ADE):
$$
ADE = \frac{1}{T} \sum\limits_{t=1}^{T} \sqrt{(x_t - x^{GT}_{t})^2 + (y_t - y^{GT}_{t})^2}
$$
Final displacement error (FDE):
$$
FDE = \sqrt{(x_T - x^{GT}_{T})^2 + (y_T - y^{GT}_{T})^2}
$$
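As a concrete sketch, the two displacement metrics above can be computed in numpy, assuming each trajectory is a \((T, 2)\) array of \((x, y)\) positions (the array layout here is an illustrative assumption, not the paper's exact data format):

```python
import numpy as np

def ade(pred, gt):
    """Average displacement error: mean Euclidean distance over all T steps."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=1)))

def fde(pred, gt):
    """Final displacement error: Euclidean distance at the last step only."""
    return float(np.linalg.norm(pred[-1] - gt[-1]))

pred = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
gt   = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
# per-step distances are 0, 1, 2, so ADE averages them and FDE takes the last
```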
Average deviation of lateral direction (left&right) and longitudinal direction (front&rear):
$$
D(t, n, R) = \frac{1}{L_O} \sum\limits_{\alpha=t+1}^{t+L_O}(P^{\alpha}_{n} - s^{\alpha}_{n})^T \cdot R(s^{\alpha+1}_{n}, s^{\alpha}_{n})
$$
where \(t\) denotes the time frame, \(n\) denotes the target vehicle, \(P\) and \(s\) are two-dimensional vectors representing the predicted and ground-truth vehicle locations, and \(R\) is a function that generates the unit vector in a specific direction (lateral or longitudinal).
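A minimal numpy sketch of this directional metric, assuming the heading at each step is approximated from consecutive ground-truth positions (the function name and array layout are illustrative assumptions):

```python
import numpy as np

def directional_deviation(pred, gt, direction="longitudinal"):
    """Average deviation of predictions from ground truth, projected onto the
    driving direction (longitudinal) or its perpendicular (lateral).

    pred, gt: (T, 2) arrays of (x, y) positions; the heading at step t is
    approximated by the ground-truth displacement gt[t+1] - gt[t].
    """
    devs = []
    for t in range(len(gt) - 1):
        heading = gt[t + 1] - gt[t]
        heading = heading / (np.linalg.norm(heading) + 1e-12)  # unit vector R
        if direction == "lateral":
            heading = np.array([-heading[1], heading[0]])      # rotate 90 deg
        devs.append(float(np.dot(pred[t] - gt[t], heading)))
    return float(np.mean(devs))
```

A trajectory shifted sideways relative to the ground truth yields a nonzero lateral deviation but zero longitudinal deviation, matching the left/right vs. front/rear intuition above.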
We use a white-box optimization method based on Projected Gradient Descent (PGD) to find the perturbation that maximizes the prediction error of the target vehicle's predicted trajectory. The optimization algorithm is as follows:
Input: iters, traj_normal
Output: perturbation_best, loss_best
Initialize perturbation randomly
while iter_cnt ≠ iters do
    Enforce hard constraints on perturbation
    traj_attacked = traj_normal + perturbation
    traj_predicted = M(traj_attacked)
    loss = L(n, f)
    Update perturbation by gradient descent on loss
end
loss_best = maximum(loss)
perturbation_best = traj_attacked[idx], where idx is the iteration achieving loss_best
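The loop above can be sketched as a toy PGD attack in numpy. Everything here is an illustrative assumption: `M` is a constant-velocity stand-in for the GRIP++ predictor, the loss is a negative ADE, the hard constraint is a simple L-infinity bound, and the gradient is taken by finite differences rather than autodiff:

```python
import numpy as np

rng = np.random.default_rng(0)

def M(traj):
    """Hypothetical stand-in predictor: constant-velocity extrapolation of
    the last observed displacement over 3 future steps."""
    v = traj[-1] - traj[-2]
    return traj[-1] + v * np.arange(1, 4)[:, None]

def attack_loss(traj_attacked, future_gt):
    """Negative ADE of the prediction; minimizing this maximizes the error."""
    pred = M(traj_attacked)
    return -float(np.mean(np.linalg.norm(pred - future_gt, axis=1)))

def pgd_attack(traj_normal, future_gt, eps=0.5, step=0.05, iters=50):
    """Toy PGD loop: perturb, project back into the constraint set (an
    L-infinity ball here), predict, and keep the best perturbation."""
    pert = rng.uniform(-eps, eps, size=traj_normal.shape)
    best_pert, best_loss = pert.copy(), np.inf
    for _ in range(iters):
        pert = np.clip(pert, -eps, eps)          # enforce hard constraints
        loss = attack_loss(traj_normal + pert, future_gt)
        if loss < best_loss:
            best_loss, best_pert = loss, pert.copy()
        # finite-difference gradient of the loss w.r.t. the perturbation
        grad = np.zeros_like(pert)
        for idx in np.ndindex(pert.shape):
            bumped = pert.copy()
            bumped[idx] += 1e-4
            grad[idx] = (attack_loss(traj_normal + bumped, future_gt) - loss) / 1e-4
        pert -= step * np.sign(grad)             # descend the attack loss
    return best_pert, best_loss

traj_normal = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
future_gt = np.array([[4.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
best_pert, best_loss = pgd_attack(traj_normal, future_gt)
```

On this straight-line example the unperturbed loss is zero, so any perturbation found by the attack strictly increases the prediction error.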
The hard constraints on the perturbation are used to ensure that the perturbed trajectories obey physical rules and resemble normal driving behavior. The hard-constraint optimization can be expressed as follows:
$$ \max \, \theta \quad s.t. \quad C(H^n + \theta \Delta) \land 0 \leq \theta \leq 1 $$ where \(\Delta\) is the perturbation, \(C\) is the constraint function, \(H^n\) is the history trajectory of the target vehicle, and \(\theta\) is a coefficient scaling the perturbation so that all constraints are satisfied.
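Since \(C\) is a yes/no feasibility check and scaling down a perturbation only makes it easier to satisfy, the largest feasible \(\theta\) can be found by binary search. The sketch below assumes a hypothetical per-step speed-cap constraint as an example of \(C\):

```python
import numpy as np

def max_feasible_theta(H, delta, C, tol=1e-6):
    """Binary search for the largest theta in [0, 1] such that the scaled
    perturbation still satisfies the constraint: C(H + theta * delta)."""
    if C(H + delta):
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if C(H + mid * delta):
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical constraint: per-step displacement must stay under a speed cap.
def speed_ok(traj, v_max=2.0):
    steps = np.linalg.norm(np.diff(traj, axis=0), axis=1)
    return bool(np.all(steps <= v_max))
```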
The loss function at each iteration is:
$$
L(n, f) = -\frac{1}{L_P} \sum\limits_{\alpha=L_I}^{L_I + L_P - 1} f(P^{\alpha}_{n}, F^{\alpha}_{n})
$$
where \(f\) is one of the six metrics mentioned above, \(n\) is the ID of the target vehicle, and \(P\) and \(F\) represent the predicted and ground-truth trajectories, respectively.
Approach
One-Class SVM
We first utilize the One-Class Support Vector Machine (OCSVM) algorithm, an unsupervised model designed for anomaly (outlier) detection.
The OCSVM model learns the boundary for the normal data points during the training process, and then identifies the data outside the border to be anomalies.
We train the model using only normal trajectories to fit a non-linear boundary. Then, during testing, trajectories that lie outside the boundary are considered anomalies.
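A minimal sketch of this setup with scikit-learn's `OneClassSVM`, assuming each trajectory has already been flattened to a fixed-length feature vector (the synthetic data and feature dimension are illustrative assumptions, not the actual trajectory features):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical features: each trajectory flattened to a 12-dim vector.
normal_train = rng.normal(0.0, 0.1, size=(200, 12))  # benign trajectories only
normal_test  = rng.normal(0.0, 0.1, size=(5, 12))
anomalous    = rng.normal(3.0, 0.1, size=(5, 12))    # far from the normal cluster

# nu upper-bounds the fraction of training points treated as outliers.
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal_train)

anomaly_preds = ocsvm.predict(anomalous)    # -1 labels anomalies
inlier_preds  = ocsvm.predict(normal_test)  # +1 labels inliers
```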
LSTM-AE
We also apply an LSTM-based autoencoder (LSTM-AE). The encoder network learns a compressed representation of the input, and the decoder network reconstructs the input from that compressed representation. The difference between the input and its reconstruction is the reconstruction error, which the autoencoder minimizes as its training objective.
During training, we fit the model to reconstruct benign trajectories. During testing, we first use an LSTM encoder to map the trajectory to a fixed-dimension vector, then use an LSTM-based decoder to reconstruct the trajectory. Since the trained model cannot reconstruct anomalous trajectories well, a trajectory with high reconstruction error is viewed as anomalous.
We use Cosine similarity to represent the reconstruction error.
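A numpy sketch of this score, assuming each trajectory and its reconstruction are flattened to 1-D vectors (the cosine-similarity reconstruction error and the windowed anomaly score are defined formally below):

```python
import numpy as np

def reconstruction_error(p, f):
    """Cosine similarity between a trajectory and its reconstruction
    (both flattened to 1-D vectors); 1.0 means a perfect reconstruction."""
    return float(np.dot(p, f) / (np.linalg.norm(p) * np.linalg.norm(f)))

def anomaly_score(originals, reconstructions):
    """Sum of (1 - cosine similarity) over a window of trajectories;
    larger values indicate worse reconstructions, i.e. likely anomalies."""
    return float(sum(1.0 - reconstruction_error(p, f)
                     for p, f in zip(originals, reconstructions)))
```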
$$ \small \begin{align} RE(P, F) = \frac{\sum\limits_{i=1}^{n} P_i F_i} {\sqrt{\sum\limits_{i=1}^{n} P^{2}_{i}} \sqrt{\sum\limits_{i=1}^{n} F^{2}_{i}}} \end{align} $$

The anomaly score, accumulated over a sliding window of length \(L\), is computed as follows:

$$ \small \begin{align} AS(P, F) = \sum\limits_{j=i}^{i+L}\left(1 - RE(P_j, F_j)\right) \end{align} $$

Contrastive Learning
We try to learn a compact representation for normal trajectories so that any trajectories deviating from normal trajectories beyond a threshold are viewed as anomalous trajectories.
Core idea: maximize the similarity among normal trajectories and minimize the similarity between normal and anomalous trajectories in the latent space, using a contrastive loss.
$$
\small
\begin{align}
L_{ij} = -\log \frac{\exp\left((v_{ni})^T (v_{nj})/\tau\right)}{\exp\left((v_{ni})^T (v_{nj})/\tau\right) + \sum\limits_{m=1}^{M} \exp\left((v_{ni})^T (v_{am})/\tau\right)}
\end{align}
$$
The contrastive loss takes the network's output for an anchor example, calculates its distance to an example of the same class, and contrasts that with its distances to negative examples.
Said another way, the loss is low if positive samples are encoded to similar (closer) representations and negative examples are encoded to different (farther) representations.
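The loss formula above can be sketched directly in numpy. Here \(v_{ni}, v_{nj}\) are latent vectors of two normal trajectories (the positive pair) and the negatives are anomalous-trajectory latents; the assumption that all vectors are L2-normalized is ours, for simplicity:

```python
import numpy as np

def contrastive_loss(v_ni, v_nj, v_neg, tau=0.1):
    """InfoNCE-style contrastive loss: v_ni, v_nj are (d,) latent vectors of
    the positive pair; v_neg is an (M, d) array of negative latents.
    All vectors are assumed L2-normalized; tau is the temperature."""
    pos = np.exp(np.dot(v_ni, v_nj) / tau)
    neg = np.sum(np.exp(v_neg @ v_ni / tau))
    return float(-np.log(pos / (pos + neg)))
```

With an aligned positive pair and orthogonal negatives the loss is near zero; negatives that resemble the anchor drive the loss up, which is exactly the separation pressure described above.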
Results

