
Potential for a Kaggle competition #195

@sgbaird


I think a major obstacle to getting more participation on Matbench is that people perform their own splits and work in isolation from Matbench, and that performant models tend to be trained on the most recent snapshots and the most comprehensive data (i.e., when the model is going into real-world use). This can make it difficult to persuade people to spend time learning Matbench, even though it is very easy to use, and to set up potentially large amounts of compute time for expensive models.

There are two approaches to addressing this.

One is lowering the barrier to entry, for example by accepting disparate benchmarks, writing up the benchmark notebooks for people upon request, or running the benchmarks for them. The first option waters down the benchmark, and the latter two put a lot of burden on the Matbench developers.

A second approach is increasing the incentive. One way to do this is a Kaggle competition using Matbench 2.0, covering property prediction, adaptive design, and generative modeling, with prizes on offer. This requires upfront work in designing and hosting the competition, but it also distributes the work across the community and incentivizes people to use the best models, even if they weren't the original authors. Authorship could also be offered to participants with top-scoring models, assuming no disqualification.

We could base it on, or at least learn from, the NOMAD 2018 Kaggle competition: https://www.nature.com/articles/s41524-019-0239-3

Prize funding or the prizes themselves would also need to be sourced. Maybe materials informatics companies, the Acceleration Consortium, Apple, Meta, etc. would be willing to sponsor.
