Hi! Thanks for releasing the code!
From the code, it looks like the text input at inference time is the whole set of labels for a dataset, and the segmentation mask is obtained from that set. If the text input contains only one label, how can I get the segmentation mask?
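To make the question concrete, here is a minimal sketch of how I understand the inference step. It assumes the mask is a per-pixel argmax over similarities between pixel features and label text embeddings; that is my guess rather than something I verified in the repo, and `segment`, `pixel_feats`, and `text_feats` are hypothetical names used only for illustration.

```python
# Hypothetical sketch: the argmax step and all names here are assumptions,
# not the repository's actual code.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def segment(pixel_feats, text_feats):
    """Assign each pixel the index of its most similar text embedding.

    pixel_feats: H x W grid of D-dim feature vectors (nested lists).
    text_feats:  K text embeddings, one per label.
    """
    mask = []
    for row in pixel_feats:
        mask_row = []
        for feat in row:
            scores = [dot(feat, t) for t in text_feats]
            mask_row.append(scores.index(max(scores)))
        mask.append(mask_row)
    return mask

# Toy 2x2 "image" with 2-dim features per pixel.
pixels = [[[1.0, 0.0], [0.0, 1.0]],
          [[0.9, 0.1], [0.2, 0.8]]]

full_label_set = [[1.0, 0.0], [0.0, 1.0]]  # two labels
single_label = [[1.0, 0.0]]                # only one label

mask_full = segment(pixels, full_label_set)
mask_single = segment(pixels, single_label)
```

If that assumption holds, a single label makes the argmax return index 0 for every pixel, so the mask covers the whole image regardless of content. That is the case I am unsure how to handle.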
Looking forward to your reply!