Instead of MNIST, this should be benchmarked on sparse data (binary input), where the efficiency will be more visible. Happy to discuss.