Things we care about in the upstream:

- Kubeflow
  - Training
    - Elastic Training
      - https://github.com/kleveross/ftlib Fault-tolerant for DL frameworks
      - https://github.com/kubeflow/mpi-operator/issues/278 Support Elastic in MPIJob
    - https://github.com/kubeflow/fate-operator FATE on Kubernetes
  - Serving
    - GPU Sharing
    - Inference Graph
    - Unified Serving API
  - AutoML
    - Hyperparameter Tuning https://github.com/kubeflow/katib/
    - Neural Architecture Search for CV (DARTS, ProxylessNAS and so on) https://github.com/kubeflow/katib/
    - Auto Model Compression
    - Auto Feature Engineering
  - Scheduling
    - Gang Scheduling
    - Affinity Scheduling
- Ray
  - ray-operator & rayjob-operator Ray Cluster and Ray Job on Kubernetes