This is an official PyTorch implementation of PanoRadar for the following paper:
Enabling Visual Recognition at Radio Frequency
Haowen Lai, Gaoxiang Luo, Yifei Liu, Mingmin Zhao
ACM International Conference on Mobile Computing and Networking (MobiCom), 2024
[Paper] [Website] [Demo Video] [Dataset] [BibTeX]
🌟 Best Demo Award!
🌟 1st Place in Student Research Competition!
- [Sept. 26, 2024] Initial release.
Both Conda (recommended) and Docker environments are supported.
- Linux with Python ≥ 3.10
- Detectron2: follow Detectron2 installation instructions.
- PyTorch ≥ 2.0.0 and TorchVision that matches the PyTorch installation.
```shell
conda create --name panoradar python=3.10
conda activate panoradar
pip install -r requirements.txt
pip install 'git+https://github.com/facebookresearch/detectron2.git'
```

For Docker users, we also provide a Dockerfile to build the image. You might need sudo access to run the commands.
```shell
# build the image
~/PanoRadar$ docker build -f docker/Dockerfile -t panoradar . # don't omit the ending dot
# run a new container
~$ docker run -it --gpus all -v ~/PanoRadar:/mnt/PanoRadar --shm-size=4096M panoradar /bin/bash
```

Our dataset includes two parts: the RF Raw Data (i.e., raw I/Q samples, inputs to the signal processing algorithms) and the RF Processed Data (i.e., 3D heatmaps, inputs to the machine learning models). The dataset is available here. Below is the description:
- RF Raw Data: We recorded RF, LiDAR, and IMU data in 12 different buildings. All of the data have timestamps and are synchronized. For each building, we collected data while the robot was moving; for the same trajectories, "static data" were also collected (the robot remained static at one location for about 5 s before moving to the next). We trimmed the redundant frames in the static data to reduce the dataset size.
- RF Processed Data: This is the data for the machine learning pipeline. It contains the RF beamforming results (i.e., 3D RF heatmaps), LiDAR range ground-truth labels, glass masks (i.e., binary segmentation labels), semantic segmentation ground-truth labels, and object detection ground-truth labels. The RF beamforming results come from our signal processing algorithms (e.g., motion estimation and 3D beamforming). Users who want to skip the signal processing and try the ML pipeline directly can download only the RF Processed Data.
After downloading the data, please put the RF Raw Data in ./data/raw and, similarly, the RF Processed Data in ./data/processed. If you have more free disk space elsewhere, symbolic links can be used.
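For example, assuming the datasets live on a larger disk (the /tmp/bigdisk path below is purely illustrative; adjust it to your setup), the data directories can be linked into the repo like this:

```shell
# Keep the datasets on a larger disk and symlink them into the repo.
# /tmp/bigdisk is a hypothetical location -- replace with your own mount.
mkdir -p /tmp/bigdisk/PanoRadar/raw /tmp/bigdisk/PanoRadar/processed
mkdir -p data
ln -s /tmp/bigdisk/PanoRadar/raw data/raw
ln -s /tmp/bigdisk/PanoRadar/processed data/processed
```

The scripts below resolve the symlinks transparently, so ./data/raw and ./data/processed work the same as real directories.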
The following commands run our signal processing pipeline.

This script performs motion estimation for all trajectories in a specific building. For each trajectory, it outputs a motion_output.npz file in that trajectory's folder. folder_name below is the name of the building folder holding the raw RF data.
```shell
python run.py motion_estimation \
--in_building_folder {folder_name}
```

Example:

```shell
python run.py motion_estimation \
--in_building_folder data/raw/DRL_moving
```

This script visualizes the signal processing result for a specific frame in a trajectory. The trajectory and frame are specified with the parameters trajectory_name and frame_num below.
```shell
python run.py imaging \
--in_traj_folder {trajectory_name} \
--frame_num {frame_num}
```

Example:

```shell
python run.py imaging \
--in_traj_folder data/raw/DRL_moving/exp20230528-000 \
--frame_num 5
```

An additional parameter out_plot_folder can also be specified to save the imaging result as a .png file if you are running in headless mode (e.g., on a server without a display).
```shell
python run.py imaging \
--in_traj_folder {trajectory_name} \
--frame_num {frame_num} \
--out_plot_folder {path_to_save_image}
```

Example:

```shell
python run.py imaging \
--in_traj_folder data/raw/DRL_moving/exp20230528-000 \
--frame_num 5 \
--out_plot_folder . # to save in the root PanoRadar directory
```

We also provide a script that processes the RF Raw Data to generate the machine learning inputs for all trajectories in a specific building.
```shell
python run.py process_raw \
--in_building_folder {folder_name} \
--out_proc_folder {out_folder_path}
```

Example:

```shell
python run.py process_raw \
--in_building_folder data/raw/DRL_moving \
--out_proc_folder data/processed
```

In the following, we provide step-by-step instructions to train the model for one building.
NOTE: we have not yet tested the code with multi-GPU support. If you encounter any problems, please feel free to open an issue.
Our model training follows a two-stage regime. In the first stage, we (pre)train the range estimation model with surface normal estimation as an auxiliary task to get LiDAR-comparable range estimation.
```shell
python train_net.py \
--config-file configs/depth_sn.yaml \
--num-gpus 1 \
DATASETS.BASE_PATH "./data/processed" \
DATASETS.TRAIN "('lobo_train_DRL',)" \
DATASETS.TEST "('lobo_test_DRL',)" \
OUTPUT_DIR "logs/mobicom24-lobo-DRL-unet-bs8" \
VIS_PERIOD -1
```

Set VIS_PERIOD to a positive integer (e.g., 5000 iterations) if you wish to visualize the training process (e.g., log images) in TensorBoard.
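As a rough illustration of the auxiliary task, surface normals can be approximated from a range image via its image-space gradients. This is only a generic sketch with a toy input, not necessarily the exact computation used in the paper:

```python
import numpy as np

# Toy range image (elevation x azimuth); real inputs come from beamforming.
depth = np.random.default_rng(0).random((64, 512)).astype(np.float32)

# Approximate per-pixel surface normals from image-space depth gradients.
dzdx = np.gradient(depth, axis=1)
dzdy = np.gradient(depth, axis=0)
normals = np.stack([-dzdx, -dzdy, np.ones_like(depth)], axis=-1)
normals /= np.linalg.norm(normals, axis=-1, keepdims=True)  # unit length
print(normals.shape)  # (64, 512, 3)
```

Training the range head jointly against such a normal target encourages locally consistent geometry, which is why the auxiliary task helps range estimation.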
Next, we train the two-stage model, whose range-and-surface-normal stage has the same architecture as the model above. To combine the range-and-surface-normal pretrained weights with the ImageNet pretrained weights (downloaded from the internet) for semantic segmentation and object detection, we use the following script to stitch the weights together.
```shell
python prepare_params.py \
--config-file logs/mobicom24-lobo-DRL-unet-bs8/config.yaml
```

This produces a two_stage.pth file in the same directory as the config file. We can then use this file to initialize the weights for the two-stage training.
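Conceptually, the stitching merges the parameter dictionaries of two checkpoints under distinct stage prefixes. The sketch below uses made-up key and prefix names, not the actual ones handled by prepare_params.py:

```python
# Illustrative weight stitching: merge two checkpoints into one state dict.
# Key names and "stage1"/"stage2" prefixes here are hypothetical.
depth_ckpt = {"backbone.conv1.weight": 1, "depth_head.out.weight": 2}
seg_det_ckpt = {"backbone.conv1.weight": 3, "seg_head.out.weight": 4}

two_stage = {}
two_stage.update({f"stage1.{k}": v for k, v in depth_ckpt.items()})
two_stage.update({f"stage2.{k}": v for k, v in seg_det_ckpt.items()})

print(len(two_stage))  # 4 -- prefixes keep the shared backbone keys distinct
```

Because each stage keeps its own prefix, identically named parameters (e.g., a shared backbone layout) never collide in the combined checkpoint.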
Instead of using ImageNet pretrained weights as the initialization, we can also train the object detection and semantic segmentation stage on the LiDAR range estimation data, which works better because it is task-specific. This can be done by running the following script.
```shell
python train_net.py \
--config-file configs/lidar.yaml \
--num-gpus 1 \
DATASETS.BASE_PATH "./data/processed" \
DATASETS.TRAIN "('lobo_train_DRL',)" \
DATASETS.TEST "('lobo_test_DRL',)" \
OUTPUT_DIR "logs/mobicom24-lobo-DRL-lidar-bs4"
```

The following script then stitches the weights and saves them as a two_stage.pth file in the same directory as the config-file.
```shell
python prepare_params.py \
--config-file logs/mobicom24-lobo-DRL-unet-bs8/config.yaml \
--lidar-config-file logs/mobicom24-lobo-DRL-lidar-bs4/config.yaml
```

The following script jointly trains the two-stage model, with the combined weights specified via MODEL.WEIGHTS.
```shell
python train_net.py \
--config-file configs/two_stage.yaml \
--num-gpus 1 \
DATASETS.BASE_PATH "./data/processed" \
DATASETS.TRAIN "('lobo_train_DRL',)" \
DATASETS.TEST "('lobo_test_DRL',)" \
OUTPUT_DIR "logs/mobicom24-lobo-DRL-two-stage-bs4" \
MODEL.WEIGHTS "logs/mobicom24-lobo-DRL-unet-bs8/two_stage.pth" \
VIS_PERIOD -1
```

Similarly, set VIS_PERIOD to a positive number if you wish to visualize the training process.
We can evaluate the model performance on the leave-one-building-out (lobo) test set by specifying the metrics (e.g., depth, sn, seg, obj) we want to compute. An eval_results.txt file will be saved in the same directory as the config file.
```shell
python eval.py \
--metrics depth sn \
--config-file logs/mobicom24-lobo-DRL-unet-bs8/config.yaml
```

```shell
python eval.py \
--metrics depth sn seg obj \
--config-file logs/mobicom24-lobo-DRL-two-stage-bs4/config.yaml
```

The following table explains the metrics in the eval_results.txt file:
| Name | Meaning |
|---|---|
| depth_l1_mean/median/80/90 | The mean/median/80th-percentile/90th-percentile of the range prediction errors |
| depth_psnr/ssim | The PSNR or SSIM of the predicted range images |
| sn_angle_mean/median/80/90 | The mean/median/80th-percentile/90th-percentile of the surface normal angle errors |
| IoU-xxx, mIoU | The IoU of different categories/mean IoU for semantic segmentation |
| Acc-xxx, mAcc | The accuracy of different categories/mean accuracy for semantic segmentation |
| AP/AP30/AP50/AP75-xxx | The object detection average precision (AP) at different IoU thresholds for categories |
| mAP/mAP30/mAP50/mAP75 | The object detection mean average precision (mAP) at different IoU thresholds |
Note: please refer to this for more details about AP and mAP.
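To make the segmentation rows concrete, here is a minimal NumPy sketch of per-class IoU and mIoU on toy labels. It is illustrative only; the repo's evaluator may differ in details such as ignored labels:

```python
import numpy as np

pred = np.array([0, 0, 1, 1, 2, 2])  # predicted class per pixel
gt   = np.array([0, 1, 1, 1, 2, 0])  # ground-truth class per pixel

# IoU-c = |pred==c AND gt==c| / |pred==c OR gt==c|; mIoU averages over classes.
ious = []
for c in range(3):
    inter = np.sum((pred == c) & (gt == c))
    union = np.sum((pred == c) | (gt == c))
    ious.append(inter / union)

miou = float(np.mean(ious))
print([round(i, 3) for i in ious], round(miou, 3))  # [0.333, 0.667, 0.5] 0.5
```

The per-class values correspond to the IoU-xxx rows and the average to mIoU; Acc-xxx/mAcc are computed analogously from per-class pixel accuracy.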
The following scripts visualize all the data points in the test set and save the results under the same directory as the config file. To visualize the depth and surface normal estimation model, run the script below; a folder named vis_XX will be created in that directory.
```shell
python inference.py \
--config-file logs/mobicom24-lobo-DRL-unet-bs8/config.yaml
```

To visualize the two-stage model, run the following script:

```shell
python inference.py \
--config-file logs/mobicom24-lobo-DRL-two-stage-bs4/config.yaml
```

Or run the following for point clouds color-coded with semantic categories:

```shell
python inference.py \
--config-file logs/mobicom24-lobo-DRL-two-stage-bs4/config.yaml --mode range
```

The following script runs the model on a subset of the test set to measure FLOPs and inference speed.
```shell
python runtime.py \
--config-file configs/depth_sn.yaml \
--num-gpus 1
```

```shell
python runtime.py \
--config-file configs/two_stage.yaml \
--num-gpus 1
```

PanoRadar is licensed under a BSD 3-Clause License.
If you use PanoRadar in your research, find the code useful, or would like to acknowledge our work, please consider citing our paper:
@inproceedings{PanoRadar,
author = {Lai, Haowen and Luo, Gaoxiang and Liu, Yifei and Zhao, Mingmin},
title = {Enabling Visual Recognition at Radio Frequency},
booktitle = {ACM International Conference on Mobile Computing and Networking (MobiCom)},
year = {2024},
  doi = {10.1145/3636534.3649369},
}

