Overwatch Match Data Dataset

This dataset contains anonymized Overwatch match data including game events, player statistics, and match outcomes.

Download

All files are available in the Google Drive folder: Parsertime Anonymized 12-1

Dataset Contents

This Google Drive folder contains:

SQL dump file (ptime-pscale-prod-anonymized-2025-12-01.sql) - Complete database with all tables and data
CSV exports - Individual tables exported as CSV files for easy analysis in Excel, Python, R, etc.
23 event tables containing in-game events (kills, hero swaps, ultimate usage, etc.)
Anonymized player and team identifiers for privacy protection
Match metadata including map types, round information, and timestamps

Prerequisites

For SQL dump: PostgreSQL 17.5 (recommended) or 16+
For CSV files: Any spreadsheet software (Excel, Google Sheets) or programming language (Python, R, etc.)
Basic familiarity with SQL (for the PostgreSQL option)

Which Format Should I Use?

Use the SQL dump if you:

Want to run complex SQL queries with JOINs across tables
Need to preserve relationships between tables
Are comfortable with PostgreSQL
Want the complete relational database structure

Use the CSV files if you:

Want quick access without database setup
Need to analyze individual tables
Prefer working with spreadsheets or data analysis libraries (pandas, R)
Want to import into other tools (Tableau, Power BI, etc.)

Quick Start

Option 1: PostgreSQL Database (Full Dataset)

Use this option if you want to run SQL queries and have the complete relational database.

# 1. Create a new database
createdb -h localhost -p 5432 -U your_username overwatch_data

# 2. Restore the SQL dump
PGPASSWORD=your_password psql \
  --no-psqlrc \
  -h localhost \
  -p 5432 \
  -U your_username \
  -d overwatch_data \
  -f ptime-pscale-prod-anonymized-2025-12-01.sql

Note: The --no-psqlrc flag is required to avoid backslash command restrictions during restore.

Option 2: CSV Files (Simplified Analysis)

Use this option if you want to analyze the data without setting up PostgreSQL. Each table is available as a separate CSV file that you can:

Open in Excel or Google Sheets
Load into Python with pandas: pd.read_csv('Kill_anon.csv')
Import into R: read.csv('Kill_anon.csv')
Use with any other data analysis tool

CSV files are simpler to work with but don't include the relationships between tables.
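Because the CSV exports do not carry the relationships between tables, joins have to be rebuilt by hand. The sketch below is one way to do that in pandas, assuming that `scrimId` (the column the example queries filter on) identifies the same match in both `Kill_anon.csv` and `PlayerStat_anon.csv`, and that `attacker_name` and `player_name` use the same anonymized identifiers. The rows are invented toy data standing in for the real files, and `stats_with_kill_events` is a hypothetical helper name.

```python
import pandas as pd

# Invented toy rows standing in for pd.read_csv('Kill_anon.csv')
# and pd.read_csv('PlayerStat_anon.csv').
kills = pd.DataFrame({
    "scrimId": [1, 1, 1, 2],
    "attacker_name": ["P_aaaa0001", "P_aaaa0001", "P_aaaa0002", "P_aaaa0001"],
})
stats = pd.DataFrame({
    "scrimId": [1, 1, 1, 2],
    "player_name": ["P_aaaa0001", "P_aaaa0002", "P_aaaa0003", "P_aaaa0001"],
    "deaths": [3, 5, 2, 4],
})

def stats_with_kill_events(stats, kills):
    """Attach the number of Kill rows per (match, player) to each stats row."""
    kill_counts = (
        kills.groupby(["scrimId", "attacker_name"])
        .size()
        .reset_index(name="kill_events")
        .rename(columns={"attacker_name": "player_name"})
    )
    merged = stats.merge(kill_counts, on=["scrimId", "player_name"], how="left")
    # Players with no Kill rows get NaN from the left join; treat that as 0.
    merged["kill_events"] = merged["kill_events"].fillna(0).astype(int)
    return merged

merged = stats_with_kill_events(stats, kills)
print(merged)
```

If the real `PlayerStat_anon` table holds several rows per player and match (for example one per round), aggregate it to one row per player before merging, otherwise the kill counts repeat on every row.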

Database Schema

Event Types

The dataset includes the following event types:

Combat Events: Kill, DefensiveAssist, OffensiveAssist
Hero Events: HeroSpawn, HeroSwap
Ultimate Events: UltimateCharged, UltimateStart, UltimateEnd
Objective Events: ObjectiveCaptured, ObjectiveUpdated, PayloadProgress, PointProgress
Match Events: MatchStart, MatchEnd, RoundStart, RoundEnd, SetupComplete
Hero-Specific Events: DvaRemech, RemechCharged, MercyRez, EchoDuplicateStart, EchoDuplicateEnd
Statistics: PlayerStat (comprehensive player performance metrics)
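For scripts that loop over every table, the grouping above can be written down as a plain mapping. This is only a convenience sketch: the category keys and the helper names `all_event_types` and `category_of` are made up for illustration, and the mapping holds event names, not file names (some tables carry an `_anon` suffix, so check the actual files before building paths).

```python
# Event categories exactly as listed above, keyed by an illustrative label.
EVENT_TYPES = {
    "Combat": ["Kill", "DefensiveAssist", "OffensiveAssist"],
    "Hero": ["HeroSpawn", "HeroSwap"],
    "Ultimate": ["UltimateCharged", "UltimateStart", "UltimateEnd"],
    "Objective": ["ObjectiveCaptured", "ObjectiveUpdated",
                  "PayloadProgress", "PointProgress"],
    "Match": ["MatchStart", "MatchEnd", "RoundStart", "RoundEnd",
              "SetupComplete"],
    "Hero-Specific": ["DvaRemech", "RemechCharged", "MercyRez",
                      "EchoDuplicateStart", "EchoDuplicateEnd"],
    "Statistics": ["PlayerStat"],
}

def all_event_types(event_types=EVENT_TYPES):
    """Flatten the category mapping into one list of event names."""
    return [name for names in event_types.values() for name in names]

def category_of(event_name, event_types=EVENT_TYPES):
    """Return the category an event belongs to, or None if it is unknown."""
    for category, names in event_types.items():
        if event_name in names:
            return category
    return None

print(len(all_event_types()))   # matches the 23 event tables in the dataset
print(category_of("MercyRez"))
```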

Key Tables

MatchStart_anon: Match initialization data with map and team information
MatchEnd: Final match scores and outcomes
Kill_anon: Combat elimination events with attacker/victim details
PlayerStat_anon: Detailed player statistics per round
HeroSwap_anon: Hero selection changes during matches
Ultimate*_anon: Ultimate ability tracking
And more...

Example Queries

Get all kills in a specific match

PostgreSQL:

SELECT
  match_time,
  attacker_hero,
  attacker_name,
  victim_hero,
  victim_name,
  event_ability
FROM "Kill_anon"
WHERE "scrimId" = 1234
ORDER BY match_time;

Python (pandas):

import pandas as pd

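kills = pd.read_csv('Kill_anon.csv')
# Keep one match and order its kills chronologically
match_kills = kills[kills['scrimId'] == 1234].sort_values('match_time')
# Same columns as the SQL query above
print(match_kills[['match_time', 'attacker_hero', 'attacker_name',
                   'victim_hero', 'victim_name', 'event_ability']])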

Calculate hero pick rates

PostgreSQL:

SELECT
  player_hero,
  COUNT(*) as spawn_count,
  SUM(hero_time_played) as total_time_played
FROM "HeroSpawn_anon"
GROUP BY player_hero
ORDER BY total_time_played DESC;

Python (pandas):

import pandas as pd

hero_spawns = pd.read_csv('HeroSpawn_anon.csv')
# Named aggregation: one row per hero, with the same column names as the SQL query
pick_rates = hero_spawns.groupby('player_hero').agg(
    spawn_count=('hero_time_played', 'size'),        # COUNT(*)
    total_time_played=('hero_time_played', 'sum'),   # SUM(hero_time_played)
)
pick_rates = pick_rates.sort_values('total_time_played', ascending=False)
print(pick_rates)
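The queries above return raw counts and summed playtime, not a rate. One possible way to turn them into a rate is each hero's share of total recorded playtime. The sketch below assumes that `hero_time_played` values can be summed per hero, as the query above already does. `playtime_share` is a hypothetical helper and the rows are invented toy data.

```python
import pandas as pd

def playtime_share(hero_spawns: pd.DataFrame) -> pd.Series:
    """Fraction of total recorded playtime spent on each hero."""
    per_hero = hero_spawns.groupby("player_hero")["hero_time_played"].sum()
    total = per_hero.sum()
    if total == 0:
        # Nothing was played, so return zeros instead of dividing by zero.
        return per_hero.astype(float)
    return (per_hero / total).sort_values(ascending=False)

# Invented toy rows standing in for pd.read_csv('HeroSpawn_anon.csv').
toy_spawns = pd.DataFrame({
    "player_hero": ["Ana", "Ana", "Reinhardt", "Tracer"],
    "hero_time_played": [30.0, 30.0, 30.0, 10.0],
})
shares = playtime_share(toy_spawns)
print(shares)
```

A share of spawns (row counts divided by total rows) would be an equally defensible definition; which one fits depends on how `hero_time_played` is recorded in the real data.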

Get player statistics summary

PostgreSQL:

SELECT
  player_name,
  player_hero,
  eliminations,
  deaths,
  hero_damage_dealt,
  healing_dealt,
  ultimates_used
FROM "PlayerStat_anon"
WHERE "scrimId" = 1234
ORDER BY eliminations DESC;

Python (pandas):

import pandas as pd

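stats = pd.read_csv('PlayerStat_anon.csv')
# Keep one match and rank its rows by eliminations
match_stats = stats[stats['scrimId'] == 1234].sort_values('eliminations', ascending=False)
# Same columns as the SQL query above
print(match_stats[['player_name', 'player_hero', 'eliminations', 'deaths',
                   'hero_damage_dealt', 'healing_dealt', 'ultimates_used']])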
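A common follow-up is a derived metric such as eliminations per death. The sketch below uses only the `eliminations` and `deaths` columns shown above. `add_elims_per_death` is a hypothetical helper, the rows are invented, and treating zero deaths as one death is a convention chosen here to keep the ratio finite.

```python
import pandas as pd

def add_elims_per_death(stats: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of stats with an added 'elims_per_death' column.

    Rows with zero deaths are divided by 1 so the ratio stays finite.
    """
    out = stats.copy()
    out["elims_per_death"] = out["eliminations"] / out["deaths"].clip(lower=1)
    return out

# Invented toy rows standing in for pd.read_csv('PlayerStat_anon.csv').
toy_stats = pd.DataFrame({
    "player_name": ["P_aaaa0001", "P_aaaa0002"],
    "eliminations": [12, 5],
    "deaths": [4, 0],
})
with_ratio = add_elims_per_death(toy_stats)
print(with_ratio)
```

Whether `PlayerStat_anon` rows are cumulative or per round is not stated here, so check that before summing or averaging this ratio across rows.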

Data Privacy

All personally identifiable information has been anonymized:

Player names are replaced with hashed identifiers (e.g., P_6af4f2c6)
Team names are replaced with hashed identifiers (e.g., T_5568bd9d)
Original timestamps and sensitive metadata have been removed
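If you combine this dataset with other sources, it can be useful to check that name columns still contain only anonymized identifiers. The sketch below assumes every identifier follows the shape of the two examples above (a `P_` or `T_` prefix followed by eight lowercase hex digits). That pattern is inferred from the examples, not documented, and `non_anonymized` is a hypothetical helper.

```python
import re

# Assumed shape, inferred from the examples: prefix, underscore, 8 hex digits.
PLAYER_ID = re.compile(r"P_[0-9a-f]{8}")
TEAM_ID = re.compile(r"T_[0-9a-f]{8}")

def non_anonymized(values, pattern):
    """Return the values that do not fully match the expected ID pattern."""
    return [v for v in values if not pattern.fullmatch(str(v))]

print(non_anonymized(["P_6af4f2c6", "SomeRealName"], PLAYER_ID))
print(non_anonymized(["T_5568bd9d"], TEAM_ID))
```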

Troubleshooting

Error: "backslash commands are restricted"

Make sure to use the --no-psqlrc flag when restoring:

PGPASSWORD=your_password psql --no-psqlrc -h localhost -p 5432 -U your_username -d overwatch_data -f ptime-pscale-prod-anonymized-2025-12-01.sql

Error: "type EventType does not exist" or "relation does not exist"

The SQL dump includes all necessary schema definitions. If you encounter these errors, ensure you're restoring to a fresh database:

# Drop and recreate the database
dropdb overwatch_data
createdb overwatch_data

# Then restore again
PGPASSWORD=your_password psql --no-psqlrc -h localhost -p 5432 -U your_username -d overwatch_data -f ptime-pscale-prod-anonymized-2025-12-01.sql

Connection refused errors

Ensure PostgreSQL is running and accessible:

# Check if PostgreSQL is running
pg_isready -h localhost -p 5432

# Or check the service status
# On macOS: brew services list
# On Linux: systemctl status postgresql

Port conflicts

If port 5432 is already in use by another PostgreSQL instance, you can either:

Stop the other instance, or
Run your PostgreSQL on a different port (e.g., 5433)
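To see which ports already have a listener before picking one, a plain TCP connection attempt is usually enough. The sketch below uses only the Python standard library. `port_is_open` is a hypothetical helper, and a successful connection only shows that something is listening on the port, not that it is PostgreSQL.

```python
import socket

def port_is_open(host: str = "localhost", port: int = 5432,
                 timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

if __name__ == "__main__":
    for candidate in (5432, 5433):
        state = "in use" if port_is_open(port=candidate) else "free or unreachable"
        print(f"port {candidate}: {state}")
```

If 5432 is taken and you start PostgreSQL on another port, pass that port with `-p` in the `createdb` and `psql` commands shown earlier.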

Working with CSV files

If you're having trouble with the SQL dump, the CSV files are an easier alternative. When a CSV import misbehaves, check these properties first:

Encoding: Files are UTF-8 encoded
NULL values: Represented as empty fields or \N
Delimiters: Standard comma-separated format
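The three points above map directly onto `pd.read_csv` options. The sketch below is a minimal loader that treats both `\N` and empty fields as missing. `load_table` is a hypothetical helper, and the sample bytes are invented so the snippet runs without the real files.

```python
import io
import pandas as pd

def load_table(source) -> pd.DataFrame:
    """Read one exported table, treating backslash-N markers and empty fields as missing."""
    return pd.read_csv(
        source,
        encoding="utf-8",
        sep=",",
        na_values=["\\N"],      # the literal two-character marker \N
        keep_default_na=True,   # empty fields are already read as NaN
    )

# Invented sample; in practice pass a path such as 'Kill_anon.csv'.
sample = io.BytesIO(
    "scrimId,event_ability\n1,\\N\n2,\n3,Primary Fire\n".encode("utf-8")
)
table = load_table(sample)
print(table["event_ability"].isna().tolist())
```

Columns that contain missing values come back as float or object dtype in pandas, so integer ID columns may need an explicit cast after loading.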

File Formats

SQL Dump: PostgreSQL 17.5 plain text format, includes schema and data
CSV Files: UTF-8 encoded, comma-delimited, NULL values represented as empty fields or \N

Dataset Statistics

Event tables: 23
Total events: 92,000+
Matches: 1,900+
PostgreSQL version: 17.5

License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Questions or Issues?

If you encounter any problems with this dataset, please contact the maintainer at lucas@lux.dev.

Acknowledgments

This dataset contains anonymized competitive Overwatch match data collected for research and analysis purposes.
