Hi @srivatsan88 ,
I am working on a text classification problem where the labels are quite similar, such as:
- Bad reputation
- customer issues
- delays
- good reputation
There is major overlap between the first 3 labels: many examples share common words and could fall into more than one category.
For example, a text about delays can share words with bad reputation, and the same goes for customer issues and bad reputation.
Is there a good approach that would help achieve good metrics despite this overlap?
Also, what would be an ideal number of data points? Currently there are only about 6,000.
Cheers.