Potential for a Kaggle competition #195

I think a major obstacle to greater participation in Matbench is that people create their own splits and work in isolation from Matbench, and that performant models tend to be trained on the most recent and most comprehensive data snapshots (i.e., when the model is going into real-world use). This can make it difficult to persuade people to spend time learning Matbench, even though it is very easy to use, and to set up potentially large amounts of compute time for expensive models.

There are two approaches to addressing this.

One is to lower the barrier, for example by accepting disparate benchmarks, writing up benchmark notebooks for people upon request, or running the benchmarks for them. The first waters down the benchmark, and the latter two put a lot of burden on the Matbench developers.

A second approach is to increase the incentive. One way to do this is a Kaggle competition built on Matbench 2.0, covering property prediction, adaptive design, and generative modeling, with prizes on offer. This involves upfront work in designing and hosting the competition, but it also distributes the work across the community and incentivizes people to apply the best models, even if they weren't the original authors. Authorship could also be offered to participants with top-scoring models, assuming no disqualification.

We could base it on, or learn from, the NOMAD 2018 Kaggle competition: https://www.nature.com/articles/s41524-019-0239-3.

Prize funding, or the prizes themselves, would also need to be sourced. Maybe materials informatics companies, the Acceleration Consortium, Apple, Meta, etc. would be willing to sponsor.

