Summary of Feature
Description:
I want to be able to invoke the predicates isAlpha(), isDigit(), and so on, within a loop over a string's codepoints.
Currently these predicates are available only on whole strings, and each includes an on-statement. So if I want to use them within my loop, I pay a prohibitive amount of overhead for wrapping my codepoint into a string and performing an unnecessary on-statement within isDigit().
Is this issue currently blocking your progress?
No.
Code Sample
Consider the implementation of Arkouda's Strings.isdecimal(). It checks whether each (Unicode) character of myString either satisfies isDigit() or is a numeric subscript or superscript.
Currently the implementation does expensive computations if myString.isDigit() fails. Instead, I would like it to simply do:
for cp in myString.codepoints() do
  if !isCodepointDigit(cp) &&
     // whatever other checks I need to do
     !isNumericSubSuperScript(cp) then
    return false;
return true;
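The loop above relies on two per-codepoint helpers that do not exist today. As a rough sketch only, they might look like the following. Both isCodepointDigit() and isNumericSubSuperScript() are hypothetical names taken from the loop, isDecimalSketch() is an illustrative wrapper, and the digit check here is deliberately limited to ASCII; a real library version would presumably cover every Unicode decimal digit.

```chapel
// Hypothetical helper: true if cp is an ASCII digit '0'..'9'.
// Assumption: a real implementation would also accept non-ASCII
// Unicode decimal digits; this sketch does not.
proc isCodepointDigit(cp: int(32)): bool {
  return cp >= 0x30 && cp <= 0x39;
}

// Hypothetical helper: true if cp is a numeric superscript
// (U+00B2, U+00B3, U+00B9, U+2070, U+2074..U+2079)
// or a numeric subscript (U+2080..U+2089).
proc isNumericSubSuperScript(cp: int(32)): bool {
  return cp == 0xB2 || cp == 0xB3 || cp == 0xB9 ||
         cp == 0x2070 ||
         (cp >= 0x2074 && cp <= 0x2079) ||
         (cp >= 0x2080 && cp <= 0x2089);
}

// Illustrative wrapper around the loop from this issue.
// Note: like that loop, it returns true for an empty string.
proc isDecimalSketch(myString: string): bool {
  for cp in myString.codepoints() do
    if !isCodepointDigit(cp) && !isNumericSubSuperScript(cp) then
      return false;
  return true;
}
```

The point of the sketch is that each check works directly on the int(32) value yielded by codepoints(), so no temporary string is built and no on-statement is executed per character.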
Ideally, if myString is long enough, I would like it to be a forall loop (see #19112) with a eureka exit (#12700).
Also, if the config param useCachedNumCodepoints is true, I would like to check whether myString is an ASCII string and, if so, run much simpler/more efficient code. Currently I am reluctant to use this route because string.isASCII() is O(string size) if this param is false, and I should not be checking this param from Arkouda because it is undocumented / unstable.
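One possible way to get an ASCII fast path without calling string.isASCII() or reading useCachedNumCodepoints is to detect non-ASCII bytes during the digit scan itself. The sketch below is only an illustration of that idea, not something Chapel or Arkouda provides; ScanResult and scanAsciiDigits() are hypothetical names. It relies on the fact that in UTF-8 every byte of a multi-byte character is 0x80 or above, so any byte below 0x80 is a complete ASCII character.

```chapel
// Hypothetical result type for a single pass over the bytes.
enum ScanResult { allAsciiDigits, asciiNonDigit, nonAscii }

// Hypothetical helper: scans bytes until it can decide.
//  - nonAscii:       a byte >= 0x80 was seen first; the caller would
//                    fall back to a codepoint-based check.
//  - asciiNonDigit:  an ASCII character outside '0'..'9' was seen first;
//                    the string cannot be decimal.
//  - allAsciiDigits: every byte is an ASCII digit (also returned for
//                    an empty string).
proc scanAsciiDigits(myString: string): ScanResult {
  for b in myString.bytes() {
    if b >= 0x80 then
      return ScanResult.nonAscii;
    if b < 0x30 || b > 0x39 then
      return ScanResult.asciiNonDigit;
  }
  return ScanResult.allAsciiDigits;
}
```

With this shape, a pure-ASCII string is handled in one pass over its bytes, and only strings that actually contain non-ASCII characters pay for the slower per-codepoint check.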