Assuming I have a new English-language dataset, what are the steps to expand this library's knowledge with it?