Data-driven methods have become increasingly important for sports analysis, including the assessment of player and team performance. So far, related work has focused on solving tasks within a single data domain, i.e., player position or video data, whereas the strengths of combining multiple modalities remain mostly unexplored. Moreover, approaches using position data rely on task-specific architectures and handcrafted features. This thesis aims to learn latent representations for video and position data, which can be utilized to solve downstream tasks without training from scratch or modifying the network architecture. Since actions like shots and passes fundamentally characterize a match, we use action recognition as a pretraining task. We reproduce state-of-the-art results from the SoccerNet action spotting challenge for our video data and explore the ability of a Transformer and Graph Neural Networks to learn a representation of raw position data. Finally, we present a multimodal variant that significantly outperforms the unimodal approaches in recognizing actions as well as in the downstream task of action spotting. Experiments are performed on unedited broadcast video material and corresponding synchronized position data of 25 halves from the German Handball Bundesliga.
Setup with conda:
$ conda env create && conda activate thesis
Beware, installation might take a few (more than 15) minutes.
Consider passing -f exact_environment.yml in case newer versions of the dependencies do not work.
While you wait, you can download the pretrained models:
$ ./scripts/download_models.sh
Have a look at DATASET.md for more information regarding the data.
To run training, validation, or testing of a model with a certain configuration, run
$ python src/main.py -f config/[CONFIG].yaml -c logger.name=overwrittenValue
Use the -c argument to overwrite parameters on the command line.
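The dotted-key syntax (e.g. logger.name=overwrittenValue) suggests that overrides are merged into the nested YAML configuration. As an illustration only, a minimal sketch of how such a merge could work (the actual logic in src/main.py may differ, and apply_override is a hypothetical helper):

```python
# Hypothetical sketch: merge a single "dotted.key=value" override
# into a nested configuration dictionary, creating intermediate
# sections as needed. Values are kept as plain strings here.
def apply_override(config, assignment):
    key, value = assignment.split("=", 1)
    parts = key.split(".")
    node = config
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value
    return config

cfg = {"logger": {"name": "default"}, "trainer": {"epochs": 10}}
apply_override(cfg, "logger.name=overwrittenValue")
# cfg["logger"]["name"] is now "overwrittenValue"
```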
You can reproduce the results presented in the thesis by running the configurations in config/experiments.
Make sure to change the paths according to your data locations.
- `config` contains model, dataset, and training configurations.
- `notebooks` contains Jupyter notebooks to calculate metrics and visualize data and model predictions.
- `experiments` contains checkpoints and the LitModel cache from validation and test epochs.
- `models` contains pre-trained models.
- `scripts` contains scripts for preprocessing the dataset and downloading model checkpoints.
- `src` contains all source code.
- `img` contains images.