Add a table of dataset sizes per language so people know what they're getting into before they start downloading the dataset.