This dataset contains anonymized Overwatch match data including game events, player statistics, and match outcomes.
All files are available in the Google Drive folder: Parsertime Anonymized 12-1
This Google Drive folder contains:
- SQL dump file (ptime-pscale-prod-anonymized-2025-12-01.sql) - Complete database with all tables and data
- CSV exports - Individual tables exported as CSV files for easy analysis in Excel, Python, R, etc.
- 23 event tables containing in-game events (kills, hero swaps, ultimate usage, etc.)
- Anonymized player and team identifiers for privacy protection
- Match metadata including map types, round information, and timestamps
- For SQL dump: PostgreSQL 17.5 (recommended) or 16+
- For CSV files: Any spreadsheet software (Excel, Google Sheets) or programming language (Python, R, etc.)
- Basic familiarity with SQL (for PostgreSQL option)
Use the SQL dump if you:
- Want to run complex SQL queries with JOINs across tables
- Need to preserve relationships between tables
- Are comfortable with PostgreSQL
- Want the complete relational database structure
Use the CSV files if you:
- Want quick access without database setup
- Need to analyze individual tables
- Prefer working with spreadsheets or data analysis libraries (pandas, R)
- Want to import into other tools (Tableau, Power BI, etc.)
Use this option if you want to run SQL queries and have the complete relational database.
# 1. Create a new database
createdb -h localhost -p 5432 -U your_username overwatch_data
# 2. Restore the SQL dump
PGPASSWORD=your_password psql \
--no-psqlrc \
-h localhost \
-p 5432 \
-U your_username \
-d overwatch_data \
-f ptime-pscale-prod-anonymized-2025-12-01.sql
Note: The --no-psqlrc flag is required to avoid backslash command restrictions during restore.
Use this option if you want to analyze the data without setting up PostgreSQL. Each table is available as a separate CSV file that you can:
- Open in Excel or Google Sheets
- Load into Python with pandas: pd.read_csv('Kill_anon.csv')
- Import into R: read.csv('Kill_anon.csv')
- Use with any other data analysis tool
CSV files are simpler to work with but don't include the relationships between tables.
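Relationships can still be rebuilt by hand after loading. The following is a minimal sketch, assuming that scrimId identifies the same match in both tables and that attacker_name in Kill_anon uses the same anonymized identifiers as player_name in PlayerStat_anon (both assumptions are inferred from the column names, not confirmed by the schema). The small inline frames stand in for pd.read_csv('Kill_anon.csv') and pd.read_csv('PlayerStat_anon.csv'); every value in them is invented, and attach_attacker_stats is a hypothetical helper name.

```python
import pandas as pd

# Stand-ins for pd.read_csv('Kill_anon.csv') and pd.read_csv('PlayerStat_anon.csv').
# All values below are invented for illustration.
kills = pd.DataFrame({
    "scrimId": [1, 1, 2],
    "attacker_name": ["P_aaaa0001", "P_aaaa0002", "P_aaaa0001"],
    "victim_name": ["P_bbbb0001", "P_bbbb0002", "P_bbbb0001"],
})
stats = pd.DataFrame({
    "scrimId": [1, 1, 2],
    "player_name": ["P_aaaa0001", "P_aaaa0002", "P_aaaa0001"],
    "eliminations": [10, 7, 12],
})

def attach_attacker_stats(kills, stats):
    """Left-join each kill to the attacker's stat rows for the same match."""
    return kills.merge(
        stats,
        how="left",  # keep kills even when no stat row matches
        left_on=["scrimId", "attacker_name"],
        right_on=["scrimId", "player_name"],
    )

joined = attach_attacker_stats(kills, stats)
print(joined[["scrimId", "attacker_name", "eliminations"]])
```

Because PlayerStat_anon is described as holding statistics per round, a kill may match several stat rows in the real data; aggregating the stats per player and match before merging avoids duplicated kills.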
The dataset includes the following event types:
- Combat Events: Kill, DefensiveAssist, OffensiveAssist
- Hero Events: HeroSpawn, HeroSwap
- Ultimate Events: UltimateCharged, UltimateStart, UltimateEnd
- Objective Events: ObjectiveCaptured, ObjectiveUpdated, PayloadProgress, PointProgress
- Match Events: MatchStart, MatchEnd, RoundStart, RoundEnd, SetupComplete
- Hero-Specific Events: DvaRemech, RemechCharged, MercyRez, EchoDuplicateStart, EchoDuplicateEnd
- Statistics: PlayerStat (comprehensive player performance metrics)
- MatchStart_anon: Match initialization data with map and team information
- MatchEnd: Final match scores and outcomes
- Kill_anon: Combat elimination events with attacker/victim details
- PlayerStat_anon: Detailed player statistics per round
- HeroSwap_anon: Hero selection changes during matches
- Ultimate*_anon: Ultimate ability tracking
- And more...
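When working from the CSV exports, it can help to check which tables are actually present before starting an analysis. The sketch below lists the 23 event names given above and looks for a matching file for each one. The file naming `<Table>_anon.csv` is an assumption extrapolated from Kill_anon.csv, HeroSpawn_anon.csv and PlayerStat_anon.csv; some tables (MatchEnd is listed without the suffix) may be named differently. EVENT_TABLES and find_tables are hypothetical names.

```python
from pathlib import Path

# Table names taken from the event type list; the "<Table>_anon.csv"
# file naming is an assumption, not something the dataset guarantees.
EVENT_TABLES = [
    "Kill", "DefensiveAssist", "OffensiveAssist",
    "HeroSpawn", "HeroSwap",
    "UltimateCharged", "UltimateStart", "UltimateEnd",
    "ObjectiveCaptured", "ObjectiveUpdated", "PayloadProgress", "PointProgress",
    "MatchStart", "MatchEnd", "RoundStart", "RoundEnd", "SetupComplete",
    "DvaRemech", "RemechCharged", "MercyRez",
    "EchoDuplicateStart", "EchoDuplicateEnd",
    "PlayerStat",
]

def find_tables(folder):
    """Split the expected tables into those whose CSV exists and those missing."""
    folder = Path(folder)
    found, missing = {}, []
    for name in EVENT_TABLES:
        path = folder / f"{name}_anon.csv"
        if path.is_file():
            found[name] = path
        else:
            missing.append(name)
    return found, missing

found, missing = find_tables(".")
print(f"{len(found)} of {len(EVENT_TABLES)} expected tables found")
if missing:
    print("Missing:", ", ".join(missing))
```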
PostgreSQL:
SELECT
match_time,
attacker_hero,
attacker_name,
victim_hero,
victim_name,
event_ability
FROM "Kill_anon"
WHERE "scrimId" = 1234
ORDER BY match_time;
Python (pandas):
import pandas as pd
kills = pd.read_csv('Kill_anon.csv')
match_kills = kills[kills['scrimId'] == 1234].sort_values('match_time')
print(match_kills[['match_time', 'attacker_hero', 'victim_hero', 'event_ability']])
PostgreSQL:
SELECT
player_hero,
COUNT(*) as spawn_count,
SUM(hero_time_played) as total_time_played
FROM "HeroSpawn_anon"
GROUP BY player_hero
ORDER BY total_time_played DESC;
Python (pandas):
import pandas as pd
hero_spawns = pd.read_csv('HeroSpawn_anon.csv')
# Named aggregation: the grouping column itself cannot be aggregated,
# so count rows with 'size' (the equivalent of COUNT(*)).
pick_rates = hero_spawns.groupby('player_hero').agg(
    spawn_count=('hero_time_played', 'size'),
    total_time_played=('hero_time_played', 'sum'),
)
pick_rates = pick_rates.sort_values('total_time_played', ascending=False)
print(pick_rates)
PostgreSQL:
SELECT
player_name,
player_hero,
eliminations,
deaths,
hero_damage_dealt,
healing_dealt,
ultimates_used
FROM "PlayerStat_anon"
WHERE "scrimId" = 1234
ORDER BY eliminations DESC;
Python (pandas):
import pandas as pd
stats = pd.read_csv('PlayerStat_anon.csv')
match_stats = stats[stats['scrimId'] == 1234].sort_values('eliminations', ascending=False)
print(match_stats[['player_name', 'player_hero', 'eliminations', 'deaths', 'hero_damage_dealt']])
All personally identifiable information has been anonymized:
- Player names are replaced with hashed identifiers (e.g., P_6af4f2c6)
- Team names are replaced with hashed identifiers (e.g., T_5568bd9d)
- Original timestamps and sensitive metadata have been removed
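A quick way to sanity-check that a name column contains only anonymized values is to match it against the identifier shape. The sketch below assumes every identifier looks like the two examples above: a P or T prefix, an underscore, and 8 lowercase hexadecimal digits. That pattern is inferred from those two examples only, so the real files may contain other lengths or prefixes. ANON_ID and classify_identifier are hypothetical names.

```python
import re

# Pattern inferred from the examples P_6af4f2c6 and T_5568bd9d.
# This is an assumption about the format, not a documented guarantee.
ANON_ID = re.compile(r"(?P<kind>[PT])_(?P<digest>[0-9a-f]{8})")

def classify_identifier(value):
    """Return 'player', 'team', or None if the value does not match the pattern."""
    match = ANON_ID.fullmatch(value)
    if match is None:
        return None
    return "player" if match.group("kind") == "P" else "team"

for sample in ["P_6af4f2c6", "T_5568bd9d", "SomeRealName"]:
    print(sample, "->", classify_identifier(sample))
```

Any value for which the function returns None is worth a second look before sharing derived results.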
Make sure to use the --no-psqlrc flag when restoring:
PGPASSWORD=password psql --no-psqlrc -h localhost -p 5432 -U user -d db -f ptime-pscale-prod-anonymized-2025-12-01.sql
The SQL dump includes all necessary schema definitions. If you encounter these errors, ensure you're restoring to a fresh database:
# Drop and recreate the database
dropdb overwatch_data
createdb overwatch_data
# Then restore again
PGPASSWORD=password psql --no-psqlrc -h localhost -p 5432 -U user -d overwatch_data -f ptime-pscale-prod-anonymized-2025-12-01.sql
Ensure PostgreSQL is running and accessible:
# Check if PostgreSQL is running
pg_isready -h localhost -p 5432
# Or check the service status
# On macOS: brew services list
# On Linux: systemctl status postgresql
If port 5432 is already in use by another PostgreSQL instance, you can either:
- Stop the other instance, or
- Run your PostgreSQL on a different port (e.g., 5433)
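If pg_isready is not installed, a plain TCP check can show whether anything is listening on a port. This is a minimal sketch using only the Python standard library; port_is_open is a hypothetical helper. It only tells you that some process accepts connections on that port, not that the process is PostgreSQL.

```python
import socket

def port_is_open(host="localhost", port=5432, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False

for candidate in (5432, 5433):
    state = "in use" if port_is_open(port=candidate) else "free or unreachable"
    print(f"port {candidate}: {state}")
```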
If you're having trouble with the SQL dump, the CSV files provide an easier alternative. Most issues with CSVs involve:
- Encoding: Files are UTF-8 encoded
- NULL values: Represented as empty fields or \N
- Delimiters: Standard comma-separated format
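These three points map directly onto pandas read_csv options. The sketch below reads a small in-memory sample that imitates the described format; the rows and the "Primary Fire" value are invented for illustration. With a real file, pass the path instead of the buffer and add encoding='utf-8'.

```python
import io
import pandas as pd

# Invented sample imitating the described format: comma-delimited,
# NULLs written either as \N or as an empty field.
sample = io.StringIO(
    "scrimId,attacker_hero,event_ability\n"
    "1,Ana,\\N\n"
    "1,Lúcio,\n"
    "2,Mercy,Primary Fire\n"
)

# na_values adds \N to pandas' default NULL markers; empty fields are
# already treated as missing by default.
df = pd.read_csv(sample, na_values=["\\N"])
print(df)
print("missing event_ability values:", df["event_ability"].isna().sum())
```

Without na_values, \N would be read as a literal string, and a numeric column containing it would silently become a text column.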
- SQL Dump: PostgreSQL 17.5 plain text format, includes schema and data
- CSV Files: UTF-8 encoded, comma-delimited, NULL values represented as \N
- Event tables: 23
- Total events: 92,000+
- Matches: 1,900+
- PostgreSQL version: 17.5
Copyright 2025 lux.dev.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
If you encounter any problems with this dataset, please contact the maintainer at lucas@lux.dev.
This dataset contains anonymized competitive Overwatch match data collected for research and analysis purposes.