encode pair: (塞车模拟器, REALGALLARDORAİNY赛车模拟器2018)
panic: runtime error: slice bounds out of range [:16] with capacity 14
goroutine 1757858 [running]:
github.com/sugarme/tokenizer/normalizer.(*NormalizedString).TransformRange(0xc0ee6a7e60, 0x7?, {0xc0ee6e8f00, 0xe, 0x164df60?}, 0x0)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/normalizer/normalized.go:473 +0x2db8
github.com/sugarme/tokenizer/normalizer.(*NormalizedString).Transform(0xc0ee6a7e60, {0xc0ee6e8f00, 0xe, 0x10}, 0x0)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/normalizer/normalized.go:859 +0x6f
github.com/sugarme/tokenizer/normalizer.(*NormalizedString).Filter(0xc0ee6a7e60, 0x1c6d8c8)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/normalizer/normalized.go:1063 +0x395
github.com/sugarme/tokenizer/normalizer.(*NormalizedString).RemoveAccents(...)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/normalizer/normalized.go:1144
github.com/sugarme/tokenizer/normalizer.stripAccents(...)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/normalizer/bert.go:188
github.com/sugarme/tokenizer/normalizer.(*BertNormalizer).Normalize(0xc08cb3f370, 0x17ccb80?)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/normalizer/bert.go:206 +0xa5
github.com/sugarme/tokenizer.(*AddedVocabulary).ExtractAndNormalize.func2(0x0?, 0xc168c53be0?)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/added-vocabulary.go:517 +0x36
github.com/sugarme/tokenizer.(*PreTokenizedString).Split(0xc0ee6ab170, 0xc168c53c68)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/pretokenizer.go:81 +0x16f
github.com/sugarme/tokenizer.(*AddedVocabulary).ExtractAndNormalize(0xc03e9eb250, {0xc0af6f1140?, 0x48414f?}, {0x1e43f00, 0xc08cb3f370})
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/added-vocabulary.go:513 +0x85
github.com/sugarme/tokenizer.(*Tokenizer).EncodeSingleSequence.func1(0x0, 0x0, {0xc0af6f1140?, 0x0?})
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/tokenizer.go:383 +0x57
github.com/sugarme/tokenizer.(*Tokenizer).EncodeSingleSequence(0xb?, {{0xc0f186a4d0?, 0xc0f9f046af?, 0xc0bc393708?}, 0xc0bc3936f8?}, 0xc0bc393718?, 0x467f6a?)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/tokenizer.go:425 +0xb5
github.com/sugarme/tokenizer.(*Tokenizer).Encode(0xc03e9eb200, {0x1e44bc0, 0xc0a4bbd180}, 0x1)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/tokenizer.go:462 +0x1a7
github.com/sugarme/tokenizer.(*Tokenizer).EncodeBatch.func1(0x3c)
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/tokenizer.go:651 +0x90
created by github.com/sugarme/tokenizer.(*Tokenizer).EncodeBatch in goroutine 1757797
/go/pkg/mod/github.com/sugarme/tokenizer@v0.3.0/tokenizer.go:648 +0xf5
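
The last two frames show that the panic is raised inside a goroutine started by EncodeBatch, so a deferred recover in the code that calls EncodeBatch cannot intercept it and the whole process exits. Until the normalizer itself is fixed, one possible workaround is to encode inputs one at a time and wrap each call in a recover. The sketch below is only an illustration using the standard library: safeCall and overSlice are hypothetical names, and overSlice merely reproduces the same class of runtime error (reslicing past capacity) as a stand-in for the real per-input Encode call.

```go
package main

import "fmt"

// safeCall runs fn and converts a panic into an ordinary error.
// fn is a hypothetical stand-in for a single per-input encode call;
// it must run in the same goroutine as the deferred recover.
func safeCall(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return fn()
}

// overSlice reslices a buffer of capacity 14 up to n elements.
// It panics with "slice bounds out of range" when n exceeds 14,
// which is the same kind of failure reported in the trace above.
func overSlice(n int) error {
	buf := make([]int, 0, 14)
	_ = buf[:n]
	return nil
}

func main() {
	// Reslicing to 16 exceeds the capacity; the panic becomes an error.
	fmt.Println(safeCall(func() error { return overSlice(16) }))
	// Reslicing to 10 is within capacity; no error is returned.
	fmt.Println(safeCall(func() error { return overSlice(10) }))
}
```

This only keeps the process alive and lets the failing input be logged or skipped; it does not address the out-of-range slice in the normalizer.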