# SideSeeing Tools

SideSeeing Tools is a suite of scripts designed to load, preprocess, and analyze data collected using the MultiSensor Data Collection App. These tools facilitate the extraction and visualization of sensor data, making them valuable for urban informatics research and applications.
This project is licensed under the MIT License. For more details, please refer to the LICENSE file.
- Key Features
- Installation
- General Usage
- Frame Extraction
- Recommended Folder Structure
- Sensor Data Specification
- SideSeeingInstance Attributes
- Testing
- Contributing
- Authors
- About Us
## Key Features

- Data Loading: Easily load data collected using the MultiSensor Data Collection App.
- Preprocessing: Preprocess the data to make it ready for analysis.
- Analysis: Perform various analyses on the data, including extracting and visualizing sensor data.
- Visualization: Generate visual representations of the data, such as plots and maps.
- Report Generation: Create comprehensive HTML reports from your dataset with summaries, maps, and interactive charts.
- Frame Extraction: Extract frames from video files at specified times or positions.
- Snippet Extraction: Extract snippets from video and sensor data for focused analysis.
## Installation

```bash
pip install sideseeing-tools
```

## General Usage

```python
from sideseeing_tools import sideseeing

# It is recommended to follow the suggested folder structure
ds = sideseeing.SideSeeingDS(root_dir='./my-project', subdir='data', name='MyDataset')

# Available iterators
#   ds.instances  -> Dictionary of instances (key=name, value=SideSeeingInstance)
#   ds.iterator   -> Iterator for the instances

# Available attributes and methods
#   ds.metadata() -> Generates and prints the dataset metadata
#   ds.size       -> Shows the number of instances
#   ds.sensors    -> A dictionary containing the names of the available sensors
```

```python
# Get a random instance from the dataset
my_sample = ds.instance
print(f"Random sample: {my_sample.name}")
```

The `.sensors` attribute shows which sensors are available across all instances.
```python
# The output shows which instances have data for each sensor type
print(ds.sensors)
```

```
{
  "sensors1": {
    "lps22h barometer sensor": {
      "FhdFastest#S10e-2024-08-01-10-42-43-354",
      "FhdGame#S10e-2024-08-01-10-25-08-383"
    }
  },
  "sensors3": {
    "ak09918c magnetic field sensor": { ... },
    "bmi160_accelerometer accelerometer non-wakeup": {
      "FhdFastest#Mia3-2024-08-01-10-42-44-639",
      "FhdNormal#Mia3-2024-08-01-10-02-22-118"
    }
  }
}
```

```python
# Get a specific instance
my_instance = ds.instances['FhdNormal#Mia3-2024-08-01-10-02-22-118']
# Get accelerometer data from the instance
accel_data = my_instance.sensors3['bmi160_accelerometer accelerometer non-wakeup']
print(accel_data.head())
```

| | Datetime UTC | x | y | z | Time (s) |
|---|---|---|---|---|---|
| 0 | 2024-03-21 19:33:01.550000 | 9.34247 | -0.270545 | 3.10767 | 0 |
| 1 | 2024-03-21 19:33:01.561000 | 9.51725 | -0.347159 | 3.00233 | 0.011 |
| 2 | 2024-03-21 19:33:01.571000 | 9.46458 | -0.407014 | 2.81079 | 0.021 |
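The sensor data behaves like a pandas DataFrame, so standard operations apply. Below is a minimal sketch that computes the acceleration magnitude, assuming the `x`, `y`, and `z` column names shown in the sample output above.

```python
import numpy as np

# Sketch: acceleration magnitude from the x/y/z columns
# (column names assumed from the sample output above)
accel_data['magnitude'] = np.sqrt(
    accel_data['x'] ** 2 + accel_data['y'] ** 2 + accel_data['z'] ** 2
)
print(accel_data['magnitude'].describe())
```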
You can also access processed Wi-Fi and Cellular network data from an instance.
```python
wifi_df = my_instance.wifi_networks
print(wifi_df.head())
```

| Datetime UTC | SSID | BSSID | level | frequency | standard | Time (s) |
|---|---|---|---|---|---|---|
| 2025-09-16 13:33:43.844 | MyWifiAP-5G | aa:bb:cc:dd:ee:01 | -87 | 5745 | 11ac | 0.000 |
| 2025-09-16 13:33:43.844 | Home-WiFi-2.4G | aa:bb:cc:dd:ee:02 | -79 | 2437 | 11n | 0.000 |
| 2025-09-16 13:33:43.844 | Public-WiFi | aa:bb:cc:dd:ee:03 | -74 | 2412 | 11n | 0.000 |
| 2025-09-16 13:33:43.844 | Home-WiFi-5G | aa:bb:cc:dd:ee:04 | -88 | 5180 | 11ac | 0.000 |
| 2025-09-16 13:33:43.844 | Car-Hotspot | aa:bb:cc:dd:ee:05 | -73 | 5745 | 11ac | 0.000 |
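Since the result behaves like a pandas DataFrame, you can, for example, rank access points by their strongest observed signal. A minimal sketch, assuming the `SSID` and `level` columns shown above:

```python
# Sketch: strongest observed signal (dBm) per SSID
strongest = (
    wifi_df.groupby('SSID')['level']
    .max()
    .sort_values(ascending=False)
)
print(strongest)
```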
```python
cell_df = my_instance.cell_networks
print(cell_df.head())
```

The cellular network data contains many columns. Here is a sample:
| Datetime UTC | timestamp | registered | connection_status | lac | cid | psc | uarfcn | mcc | mnc | ss | alpha_long | alpha_short | ber | rscp | ecno | level | Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2025-09-16 13:33:43.850 | 347823809967245 | True | 1 | 30121 | 12345678 | 361 | 4414 | 724 | 05 | -61 | Operator BR | Op BR | 99 | -24 | 0 | 4 | 0.000 |
| 2025-09-16 13:33:43.850 | 347823809967245 | False | 0 | 30122 | 87654321 | 362 | 4415 | 724 | 06 | -75 | Operator B | Op B | 99 | -30 | -2 | 3 | 0.000 |
| 2025-09-16 13:33:43.850 | 347823809967245 | False | 0 | 30121 | 12345679 | 363 | 4414 | 724 | 05 | -80 | Operator BR | Op BR | 99 | -35 | -4 | 2 | 0.000 |
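For instance, you can separate the serving cell from its neighbors. A minimal sketch, assuming the `registered` column is boolean, as in the sample above:

```python
# Sketch: keep only rows where the device is registered on the cell
# (assumes 'registered' is a boolean column, as shown above)
serving = cell_df[cell_df['registered']]
print(serving[['Datetime UTC', 'mcc', 'mnc', 'ss', 'level']].head())
```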
Extract a segment of video and sensor data.
```python
my_instance.extract_snippet(
    start_time=2,              # Start time in seconds
    end_time=17,               # End time in seconds
    output_dir='./my-snippet'  # Directory to save the snippet
)
```

This creates a directory `./my-snippet` with files for video, audio, and all sensor data for the specified time range.
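To verify the result, you can list the files that were written. A minimal sketch using only the standard library:

```python
from pathlib import Path

# List the files produced by extract_snippet
for path in sorted(Path('./my-snippet').iterdir()):
    print(path.name)
```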
```python
for instance in ds.iterator:
    print(f"Instance: {instance.name}, Video Path: {instance.video}")
```

The `SideSeeingPlotter` offers various methods to visualize your data.
```python
from sideseeing_tools import plot

plotter = plot.SideSeeingPlotter(ds, taxonomy='./my-project/taxonomy.csv')

# Example: Plot a map of all instances in the dataset
plotter.plot_dataset_map()

# Example: Plot accelerometer and audio data for a specific instance
my_instance = ds.instances['FhdNormal#Mia3-2024-08-01-10-02-22-118']
plotter.plot_instance_sensors3_and_audio(
    instance=my_instance,
    sensor_name='bmi160_accelerometer accelerometer non-wakeup'
)
```

## Frame Extraction

You can extract frames from videos either directly through the `media` module or via a `SideSeeingInstance`. Frames can be saved to disk or returned in memory.
- `extract_frames_at_times`: Extracts frames at a list of specific timestamps (in seconds).
- `extract_frames_at_positions`: Extracts frames at a list of specific frame numbers.
- `extract_frames_timespan`: Extracts frames within a given start and end time.
- `extract_frames_positionspan`: Extracts frames within a given start and end frame number.
- `extract_frames`: Extracts all frames at a given rate (step).
```python
# Get a random instance
inst = ds.instance

# Extract frames at 1.0, 2.0, and 3.0 seconds and save them to the 'output' directory
inst.extract_frames_at_times(
    frame_times=[1.0, 2.0, 3.0],
    target_dir='output',
    prefix='frame_'
)
```

```python
from sideseeing_tools import media
video_path = ds.instance.video
# Extract frames and return them as a list of images in memory
frames_in_memory = media.extract_frames_at_times(
source_path=video_path,
frame_times=[1.0, 2.0, 3.0]
)
# Extract frames from a time span and save to disk
media.extract_frames_timespan(
source_path=video_path,
start_time=10.0,
end_time=20.0,
target_dir='output',
step=30 # Extract one frame every 30 frames
)
```
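If you use the in-memory variant, the returned frames can be consumed directly. Below is a minimal sketch that writes them out with OpenCV; it assumes each frame is a NumPy array in OpenCV's BGR layout, which is an assumption about the library's internal frame format.

```python
import cv2  # assumes frames are OpenCV-style NumPy BGR arrays

# Save each in-memory frame (from the example above) to disk
for i, frame in enumerate(frames_in_memory):
    cv2.imwrite(f'in_memory_frame_{i}.png', frame)
```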
## Recommended Folder Structure

We suggest the following folder structure for your project. This allows `SideSeeingDS` to automatically generate a `metadata.csv` file in your project root.

```
my-project/
├─ data/
│  ├─ place01/
│  │  ├─ route01/
│  │  │  ├─ cell.csv
│  │  │  ├─ consumption.csv
│  │  │  ├─ gps.csv
│  │  │  ├─ metadata.json
│  │  │  ├─ sensors.one.csv
│  │  │  ├─ sensors.three.csv
│  │  │  ├─ sensors.three.uncalibrated.csv
│  │  │  ├─ video.mp4
│  │  │  ├─ ...
│  │  ├─ route02/
│  ├─ place02/
├─ metadata.csv
├─ taxonomy.csv
```
## Sensor Data Specification

This section details the data format as generated by the MultiSensor Data Collection tool, before conversion by SideSeeing Tools.
**Cellular network data (`cell.csv`)**

| datetime_utc | cellular_network |
|---|---|
| 2025-11-09T10:24:24.476Z | CellInfoWcdma:{...} |

**Wi-Fi network data**

| datetime_utc | wifi_network |
|---|---|
| 2025-11-09T10:24:24.467Z | SSID: "Android123_6948", ... |

**Battery consumption data (`consumption.csv`)**

| datetime_utc | battery_microamperes |
|---|---|
| 2024-03-21T19:38:04.961Z | -1431 |

**GPS data (`gps.csv`)**

| datetime_utc | gps_interval | accuracy | latitude | longitude |
|---|---|---|---|---|
| 2024-03-21T19:38:10.309Z | 15 | 16.0 | -23.5645676 | -46.7395994 |

**One-axis sensor data (`sensors.one.csv`)**

| timestamp_nano | datetime_utc | name | axis_x | accuracy |
|---|---|---|---|---|
| 712657771915658 | 2024-03-21T19:38:05.015Z | TCS3407 Uncalibrated lux Sensor | 1810.0 | 3 |

**Three-axis sensor data (`sensors.three.csv`)**

| timestamp_nano | datetime_utc | name | axis_x | axis_y | axis_z | accuracy |
|---|---|---|---|---|---|---|
| 712657652031560 | 2024-03-21T19:38:04.895Z | LSM6DSO Acceleration Sensor | 9.603442 | -0.10295067 | 3.9959226 | 3 |

**Uncalibrated three-axis sensor data (`sensors.three.uncalibrated.csv`)**

| timestamp_nano | datetime_utc | name | axis_x | axis_y | axis_z | delta_x | delta_y | delta_z | accuracy |
|---|---|---|---|---|---|---|---|---|---|
| 712657852615658 | 2024-03-21T19:38:05.096Z | Gyroscope sensor UnCalibrated | 0.044593163 | -0.13439035 | 0.07086037 | -0.003009122 | -0.016193425 | -0.0026664268 | 3 |
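The raw files can also be inspected directly with pandas. A minimal sketch for the GPS log, assuming comma-separated files with the headers shown above; the example path follows the recommended folder structure:

```python
import pandas as pd

# Read a raw GPS log (headers as documented above; CSV format assumed)
gps = pd.read_csv('my-project/data/place01/route01/gps.csv')
print(gps[['datetime_utc', 'latitude', 'longitude']].head())
```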
## SideSeeingInstance Attributes

| Attribute/Method | Description |
|---|---|
| `geolocation_points` | Geolocation data points. |
| `geolocation_center` | Geographic center of the instance. |
| `audio` | Path to the audio file. |
| `video` | Path to the video file. |
| `gif` | Path to the GIF file. |
| `sensors1`, `sensors3`, `sensors6` | Dictionaries of sensor data. |
| `label` | Taxonomy tags for the instance. |
| `video_start_time`, `video_stop_time` | Video start and stop timestamps. |
| `extract_snippet()` | Extracts a snippet of all data types. |
| `extract_frames_...()` | Methods for frame extraction. |
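A minimal sketch that touches a few of the attributes listed above:

```python
inst = ds.instance  # random instance, as shown earlier

# Media paths and the geographic center of the instance
print(inst.video)
print(inst.audio)
print(inst.geolocation_center)
```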
## Testing

This project uses `tox` for managing test environments. Tests are located in the `tests/` directory.

To run the tests, execute the following command from the project root:

```bash
tox
```

## Contributing

Contributions are welcome! If you have a suggestion or find a bug, please open an issue to discuss it.
If you want to contribute code, please fork the repository and submit a pull request.
## About Us

The SideSeeing Project aims to develop methods based on Computer Vision and Machine Learning for Urban Informatics applications. Our goal is to devise strategies for obtaining and analyzing data related to urban accessibility. Visit our website to learn more.