From 2a0b4fc6f49e8c1a6e5a4efc9dd25e88c171d675 Mon Sep 17 00:00:00 2001 From: Brandon Cook Date: Fri, 17 Oct 2025 11:37:24 -0700 Subject: [PATCH 01/34] initial edits for collective prefix reduction --- drafts/25-WIP-collective-edits.txt | 224 +++++++++++++++++++++++++++++ 1 file changed, 224 insertions(+) create mode 100644 drafts/25-WIP-collective-edits.txt diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt new file mode 100644 index 0000000..0fda50d --- /dev/null +++ b/drafts/25-WIP-collective-edits.txt @@ -0,0 +1,224 @@ +To: J3 J3/25-#### +From: Brandon Cook & Dan Bonachea +Subject: Edits for US20 Collective Subroutines for Prefix Reductions +Date: 2025-October-17 +References: 25-177r1, J3/25-144r1, J3/25-145r1, J3/25-007r1, WG5/N-2239 + +TODO +1. Add a new section for each new collective (4 in total). +2. Insert routines in Table 16.1 +3. Update Annex A.2 +4. Changelog? +5. Order of words in the name? + +1. Background +============= + +The Fortran 202Y work list (WG5/N-2239) includes work item US20: +"Add Intrinsic and collective subroutines for prefix operations" + +Paper J3/25-144r1 "Requirements for US20: Collective Subroutines for +Prefix Operations" presents illustrative use cases and requirements +for collective subroutines for prefix reduction. That paper was +passed at J3 meeting #236 in June 2025. Specifications and syntax for +the collective subroutine variants of prefix reduction operations, +25-177r1, was passed in the October 2025 meeting #237. + + +2. Edits to J3/25-007r1: + +[xv] Add to "Intrinsic procedures" the sentences: + +"The new intrinsic subroutines CO_SUM_PREFIX_INCLUSIVE, +CO_SUM_PREFIX_EXCLUSIVE, CO_REDUCE_PREFIX_INCLUSIVE, and +CO_REDUCE_PREFIX_EXCLUSIVE perform collective prefix reduction +operations across images." 
+ +[383] In 16.7 Standard generic intrinsic procedures, Table 16.1, +after the entry for CO_REDUCE add: +"CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY [, STAT, ERRMSG]) + C Generalized exclusive prefix reduction across images." +"CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, [, STAT, ERRMSG]) + C Generalized inclusive prefix reduction across images." + +[383] In 16.7 Standard generic intrinsic procedures, Table 16.1, +after the entry for CO_SUM add: +"CO_SUM_PREFIX_EXCLUSIVE (A, [, STAT, ERRMSG]) + C Compute exclusive prefix sum across images." +"CO_SUM_PREFIX_INCLUSIVE (A, [, STAT, ERRMSG]) + C Compute inclusive prefix sum across images." + + + + +CO_SUM_PREFIX_INCLUSIVE(A [, STAT, ERRMSG]) + +16.9.58 CO_SUM (A [, RESULT_IMAGE, STAT, ERRMSG]) +1 Description. Compute sum across images. +2 Class. Collective subroutine. +3 Arguments. +A shall be of numeric type. It shall have the same shape, type, and type parameter values, in cor- +responding references. It shall not be a coindexed object. It is an INTENT (INOUT) argument. +If it is scalar, the computed value is equal to a processor-dependent approximation to the sum +of the values of A in corresponding references. If it is an array, each element of the computed +value is equal to a processor-dependent approximation to the sum of all corresponding ele- +ments of A in corresponding references. +The computed value is assigned to A if no error condition occurs, and either RESULT_IMAGE is +absent, or the executing image is the one identified by RESULT_IMAGE. Otherwise, A becomes +undefined. +RESULT_IMAGE (optional) shall be an integer scalar. It is an INTENT (IN) argument. Its presence, and +value if present, shall be the same in all corresponding references. If it is present, its value +shall be that of an image index in the current team. +STAT (optional) shall be a noncoindexed integer scalar with a decimal exponent range of at least four. It is +an INTENT (OUT) argument. 
+ERRMSG (optional) shall be a noncoindexed default character scalar. It is an INTENT (INOUT) argument. +4 The semantics of STAT and ERRMSG are described in 16.6. +5 Example. If the number of images in the current team is two and A is the array [1, 5, 3] on one image and +[4, 1, 6] on the other image, the value of A after executing the statement CALL CO_SUM(A) is [5, 6, 9] on +both images. + + + + + +2. Consistency with Local Prefix Operation Intrinsics +===================================================== + +Paper J3/25-145r1 describes requirements and use cases for local +prefix reduction operation intrinsics, which are mathematically +similar to the operations performed by the collective subroutines +proposed in this paper. + +This paper endeavors to preserve symmetry in the naming of +corresponding intrinsics and dummy arguments between the two families +of intrinsics. + +3. Image ordering +================= + +All the intrinsics proposed in this paper are collective subroutines, +and will be subject to all of the common requirements specified in +section 16.6 of J3/25-007r1. So for example, they must be invoked +collectively by the same statement on all active images in the current +team, with arguments that meet specified constraints for corresponding +references. + +Mathematically, a prefix reduction operation accepts an ordered list +of input values and computes an ordered list of output result +values. We propose collective prefix reductions where both these input +and output lists are ordered according to the image indexes in the +selected team. Specifically, for an inclusive prefix reduction, the +result R_i provided to image i is computed using the inputs provided +by images (1:i). For an exclusive prefix reduction, the result R_i +provided to image i is computed using the inputs provided by images +(1:i-1). + +4. Collective CO_SUM_PREFIX subroutines +======================================== + +Prefix reduction with sum (addition) across images. 
+ +4.0 Syntax +---------- + +CO_SUM_PREFIX_INCLUSIVE(A [, STAT, ERRMSG]) +CO_SUM_PREFIX_EXCLUSIVE(A [, STAT, ERRMSG]) + +CO_SUM_PREFIX_EXCLUSIVE +CO_PREFIX_SUM_INCLUSIVE +CO_SUM_INCLUSIVE_PREFIX +CO_INCLUSIVE_PREFIX_SUM + +4.1 Specifications +------------------ + +S01. A shall be of numeric type. + +S03. A shall have the same shape, type, and type parameter values in + corresponding references. + +S05. A is an INTENT(INOUT) argument and shall not be a coindexed + object. + +S07. Each element of the computed value assigned into A is equal to a + processor-dependent approximation to the inclusive/exclusive + (respectively) prefix sum of corresponding elements of A provided + in corresponding references. + +S09. Definition of computed values assigned to A. + + The input value provided by image i is referred to as A_i. + The computed value provided to image i is referred to as R_i. + In the inclusive case, S_i is the ordered list [A_1, ..., A_i]. + In the exclusive case, S_i is the ordered list [A_1, ..., A_{i-1}]. + If A is scalar, the value of R_i is a processor-dependent + approximation of the sum of the elements of S_i. + If A is an array, each element in the computed value of R_i is a + processor-dependent approximation of the sum of corresponding + elements across the elements of S_i. + +S15. The computed value is assigned to A if no error condition + occurs. Otherwise, A becomes undefined (as in CO_SUM). + +S17. The specifications for STAT and ERRMSG directly mirror the same + arguments in the existing collective subroutines, and the semantics + of STAT and ERRMSG are described in section 16.6 of J3/25-007r1. + +5. Collective CO_REDUCE_PREFIX subroutines +=========================================== + +Generalized prefix reduction across images. + +5.0 Syntax +---------- + +CO_REDUCE_PREFIX_INCLUSIVE(A, OPERATION [, STAT, ERRMSG]) +CO_REDUCE_PREFIX_EXCLUSIVE(A, OPERATION, IDENTITY [, STAT, ERRMSG]) + +5.1 Specifications +------------------ + +R01. 
A shall not be polymorphic or have an ultimate component that is + allocatable or a pointer. + +R03. A shall have the same shape, type, and type parameter values in + corresponding references. + +R05. A is an INTENT(INOUT) argument and shall not be a coindexed + object. + +R07. OPERATION shall be a pure function. + +R09. OPERATION shall accept exactly two arguments; the result and + each argument must be a scalar, nonallocatable, noncoarray, + nonpointer, nonpolymorphic, nonoptional data object with the same + declared type and type parameters as the input ARRAY. + +R11. OPERATION shall implement a mathematically associative operation. + +R13. OPERATION shall be the same function on all images in + corresponding references. + +R15. IDENTITY shall be a scalar with the same declared type and type + parameters as A. + +R17. IDENTITY shall have the same value in corresponding references. + +R19. Definition of computed values assigned to A. + + The input value provided by image i is referred to as A_i. + The computed value provided to image i is referred to as R_i. + In the inclusive case S_i is the ordered list [A_1, ..., A_i]. + In the exclusive case S_i is the ordered list + [IDENTITY, A_1, ..., A_{i-1}]. + The value of R_i is the result of applying OPERATION to adjacent + items of S_i, without commutation, until a single item remains. + If A is an array, OPERATION is applied elementwise. + + +R21. The specifications for STAT and ERRMSG directly mirror the same + arguments in the existing collective subroutines, and the + semantics of STAT and ERRMSG are described in section 16.6 of + J3/25-007r1. 
+ +===END=== From 97985898b0eafa9625eb9dc8a3d5d5423dc3466e Mon Sep 17 00:00:00 2001 From: Brandon Cook Date: Fri, 17 Oct 2025 12:08:00 -0700 Subject: [PATCH 02/34] start of next edit --- drafts/25-WIP-collective-edits.txt | 47 +++++++++++++++++++++--------- 1 file changed, 34 insertions(+), 13 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 0fda50d..c8cc61b 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -48,24 +48,47 @@ after the entry for CO_SUM add: "CO_SUM_PREFIX_INCLUSIVE (A, [, STAT, ERRMSG]) C Compute inclusive prefix sum across images." +[412] In 16.9 Specifications of the standard intrinsic procedures, +after the specification of CO_SUM, add + +16.9.?? CO_SUM_PREFIX_EXCLUSIVE(A [, STAT, ERRMSG]) + +<> Compute exclusive prefix sum across images. + +<> Collective subroutine. + +<> + +A shall be of numeric type. It shall have the same shape, type, and + type parameter values, in corresponding references. It shall not be + a coindexed object. It is an INTENT (INOUT) argument. + + + + The computed value is assigned to A if no error condition + occurs. Otherwise, A becomes undefined. + +STAT (optional) shall be a noncoindexed integer scalar with a decimal + exponent range of at least four. It is an INTENT (OUT) argument. + +ERRMSG (optional) shall be a noncoindexed default character scalar. It + is an INTENT (INOUT) argument. + +The semantics of STAT and ERRMSG are described in 16.6. + +<> If the number of images in the current team is three and +the value of A is [1, 2] on image one, [3, 4] on image two, and [5, 6] +on image three, after executing the statement CALL +CO_SUM_PREFIX_EXCLUSIVE(A), the value of A is [0, 0] on image one, [1, +2] on image two, and [4, 6] on image three. -CO_SUM_PREFIX_INCLUSIVE(A [, STAT, ERRMSG]) 16.9.58 CO_SUM (A [, RESULT_IMAGE, STAT, ERRMSG]) 1 Description. Compute sum across images. 2 Class. Collective subroutine. 3 Arguments. 
-A shall be of numeric type. It shall have the same shape, type, and type parameter values, in cor- -responding references. It shall not be a coindexed object. It is an INTENT (INOUT) argument. -If it is scalar, the computed value is equal to a processor-dependent approximation to the sum -of the values of A in corresponding references. If it is an array, each element of the computed -value is equal to a processor-dependent approximation to the sum of all corresponding ele- -ments of A in corresponding references. -The computed value is assigned to A if no error condition occurs, and either RESULT_IMAGE is -absent, or the executing image is the one identified by RESULT_IMAGE. Otherwise, A becomes -undefined. RESULT_IMAGE (optional) shall be an integer scalar. It is an INTENT (IN) argument. Its presence, and value if present, shall be the same in all corresponding references. If it is present, its value shall be that of an image index in the current team. @@ -73,9 +96,7 @@ STAT (optional) shall be a noncoindexed integer scalar with a decimal exponent r an INTENT (OUT) argument. ERRMSG (optional) shall be a noncoindexed default character scalar. It is an INTENT (INOUT) argument. 4 The semantics of STAT and ERRMSG are described in 16.6. -5 Example. If the number of images in the current team is two and A is the array [1, 5, 3] on one image and -[4, 1, 6] on the other image, the value of A after executing the statement CALL CO_SUM(A) is [5, 6, 9] on -both images. +5 Example. 
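As a cross-check on the three-image example in the CO_SUM_PREFIX_EXCLUSIVE text added by the patch above, here is a minimal serial sketch in Python. It is not Fortran and not part of the proposed edits: it only models the arithmetic, with each image's A represented as one entry of a list ordered by image index. The helper name `co_prefix` and its parameters are hypothetical, the left-to-right fold is just one of the evaluation orders an associative OPERATION would permit, and supplying 0 as the identity for the exclusive sum is a modeling assumption based on the [0, 0] result for image one in the example.

```python
from functools import reduce
import operator


def co_prefix(values_by_image, operation, inclusive, identity=None):
    """Serial model of a collective prefix reduction (hypothetical helper).

    values_by_image[k] stands for the array A supplied by image k+1;
    every image supplies an array of the same shape. Returns the computed
    values, one list per image, in image-index order.
    """
    n_elems = len(values_by_image[0])
    results = []
    for i in range(1, len(values_by_image) + 1):
        # Image i sees images 1..i (inclusive) or 1..i-1 (exclusive).
        visible = values_by_image[:i] if inclusive else values_by_image[:i - 1]
        computed = []
        for j in range(n_elems):
            items = [a[j] for a in visible]
            if not inclusive:
                # Exclusive form: the ordered list starts with the identity.
                items = [identity] + items
            # Left fold keeps image order, so operands are never commuted.
            computed.append(reduce(operation, items))
        results.append(computed)
    return results


# Inputs from the example: A on images one, two, and three.
a = [[1, 2], [3, 4], [5, 6]]
exclusive = co_prefix(a, operator.add, inclusive=False, identity=0)
inclusive = co_prefix(a, operator.add, inclusive=True)
print(exclusive)  # [[0, 0], [1, 2], [4, 6]]
print(inclusive)  # [[1, 2], [4, 6], [9, 12]]
```

The exclusive output matches the values stated in the example text. Passing a non-commutative operation such as string concatenation to the same helper is a quick way to confirm that image order is preserved.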
From d82e9a8ec1c97c26a99b65e2af9380a727808191 Mon Sep 17 00:00:00 2001 From: bonachea Date: Fri, 17 Oct 2025 15:38:06 -0400 Subject: [PATCH 03/34] Update the TODO list --- drafts/25-WIP-collective-edits.txt | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index c8cc61b..27f3f78 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -5,11 +5,16 @@ Date: 2025-October-17 References: 25-177r1, J3/25-144r1, J3/25-145r1, J3/25-007r1, WG5/N-2239 TODO -1. Add a new section for each new collective (4 in total). -2. Insert routines in Table 16.1 -3. Update Annex A.2 -4. Changelog? +1. DONE Changelog in Introduction +2. DONE: Insert routines in Table 16.1 +3. Add a new section for each new collective (4 in total). +4. Update Annex A.2 5. Order of words in the name? + RESOLVED: leave as-is for this paper +6. Figure out whether and how this paper should handle the "merge conflict" + with addition of TEAM argument in 25-127r1 (edits passed last mtg). +6. Figure out whether and how this paper should handle the "merge conflict" + with addition of COMPLETION argument in 25-165r1 (edits in-progress). 1. Background ============= From fd2e389bb3b1c63eb209274a1ea5afef17800c82 Mon Sep 17 00:00:00 2001 From: bonachea Date: Fri, 17 Oct 2025 15:44:42 -0400 Subject: [PATCH 04/34] Add Annex A edit --- drafts/25-WIP-collective-edits.txt | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 27f3f78..893f259 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -8,12 +8,12 @@ TODO 1. DONE Changelog in Introduction 2. DONE: Insert routines in Table 16.1 3. Add a new section for each new collective (4 in total). -4. Update Annex A.2 +4. DONE: Update Annex A.2 5. Order of words in the name? 
RESOLVED: leave as-is for this paper 6. Figure out whether and how this paper should handle the "merge conflict" with addition of TEAM argument in 25-127r1 (edits passed last mtg). -6. Figure out whether and how this paper should handle the "merge conflict" +7. Figure out whether and how this paper should handle the "merge conflict" with addition of COMPLETION argument in 25-165r1 (edits in-progress). 1. Background @@ -104,20 +104,24 @@ ERRMSG (optional) shall be a noncoindexed default character scalar. It is an INT 5 Example. +[596:19-20] In Annex A.2 Processor dependencies, replace the following +line: +"* the computed value of the intrinsic subroutine CO_REDUCE (16.9.57) and + the intrinsic subroutine CO_SUM (16.9.58);" +with the following line: + +"* the computed value of the intrinsic subroutines CO_REDUCE (16.9.57), + CO_REDUCE_PREFIX_EXCLUSIVE (16.9.??), CO_REDUCE_PREFIX_INCLUSIVE + (16.9.??), CO_SUM (16.9.58), CO_SUM_PREFIX_EXCLUSIVE (16.9.??) and + CO_SUM_PREFIX_INCLUSIVE (16.9.??);" -2. Consistency with Local Prefix Operation Intrinsics -===================================================== -Paper J3/25-145r1 describes requirements and use cases for local -prefix reduction operation intrinsics, which are mathematically -similar to the operations performed by the collective subroutines -proposed in this paper. +===================================================== + OLD STUFF BELOW +===================================================== -This paper endeavors to preserve symmetry in the naming of -corresponding intrinsics and dummy arguments between the two families -of intrinsics. 3. 
Image ordering ================= From c4a3e437230e563baf2017388e1c5d2bb65f439a Mon Sep 17 00:00:00 2001 From: bonachea Date: Tue, 21 Oct 2025 12:11:45 -0400 Subject: [PATCH 05/34] Add column numbers to edits that have them --- drafts/25-WIP-collective-edits.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 893f259..6674ae0 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -53,7 +53,7 @@ after the entry for CO_SUM add: "CO_SUM_PREFIX_INCLUSIVE (A, [, STAT, ERRMSG]) C Compute inclusive prefix sum across images." -[412] In 16.9 Specifications of the standard intrinsic procedures, +[412:4+] In 16.9 Specifications of the standard intrinsic procedures, after the specification of CO_SUM, add 16.9.?? CO_SUM_PREFIX_EXCLUSIVE(A [, STAT, ERRMSG]) From e5ffd61b4de6347e611a4865237e15f4f6258120 Mon Sep 17 00:00:00 2001 From: bonachea Date: Tue, 21 Oct 2025 13:14:25 -0400 Subject: [PATCH 06/34] Add section for CO_SUM_PREFIX_INCLUSIVE Still includes TODO for semantics --- drafts/25-WIP-collective-edits.txt | 49 ++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 16 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 6674ae0..fa0576f 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -54,7 +54,7 @@ after the entry for CO_SUM add: C Compute inclusive prefix sum across images." [412:4+] In 16.9 Specifications of the standard intrinsic procedures, -after the specification of CO_SUM, add +after the specification of CO_SUM, add: 16.9.?? CO_SUM_PREFIX_EXCLUSIVE(A [, STAT, ERRMSG]) @@ -68,7 +68,7 @@ A shall be of numeric type. It shall have the same shape, type, and type parameter values, in corresponding references. It shall not be a coindexed object. It is an INTENT (INOUT) argument. 
- + **** TODO: description of result **** The computed value is assigned to A if no error condition occurs. Otherwise, A becomes undefined. @@ -84,24 +84,41 @@ The semantics of STAT and ERRMSG are described in 16.6. <> If the number of images in the current team is three and the value of A is [1, 2] on image one, [3, 4] on image two, and [5, 6] on image three, after executing the statement CALL -CO_SUM_PREFIX_EXCLUSIVE(A), the value of A is [0, 0] on image one, [1, -2] on image two, and [4, 6] on image three. +CO_SUM_PREFIX_EXCLUSIVE(A), the value of A is [0, 0] on image one, [1, 2] +on image two, and [4, 6] on image three. +16.9.?? CO_SUM_PREFIX_INCLUSIVE(A [, STAT, ERRMSG]) +<> Compute inclusive prefix sum across images. + +<> Collective subroutine. + +<> + +A shall be of numeric type. It shall have the same shape, type, and + type parameter values, in corresponding references. It shall not be + a coindexed object. It is an INTENT (INOUT) argument. + + **** TODO: description of result **** + + The computed value is assigned to A if no error condition + occurs. Otherwise, A becomes undefined. + +STAT (optional) shall be a noncoindexed integer scalar with a decimal + exponent range of at least four. It is an INTENT (OUT) argument. + +ERRMSG (optional) shall be a noncoindexed default character scalar. It + is an INTENT (INOUT) argument. + +The semantics of STAT and ERRMSG are described in 16.6. + +<> If the number of images in the current team is three and +the value of A is [1, 2] on image one, [3, 4] on image two, and [5, 6] +on image three, after executing the statement CALL +CO_SUM_PREFIX_INCLUSIVE(A), the value of A is [1, 2] on image one, [4, 6] +on image two, and [9, 12] on image three. -16.9.58 CO_SUM (A [, RESULT_IMAGE, STAT, ERRMSG]) -1 Description. Compute sum across images. -2 Class. Collective subroutine. -3 Arguments. -RESULT_IMAGE (optional) shall be an integer scalar. It is an INTENT (IN) argument. 
Its presence, and -value if present, shall be the same in all corresponding references. If it is present, its value -shall be that of an image index in the current team. -STAT (optional) shall be a noncoindexed integer scalar with a decimal exponent range of at least four. It is -an INTENT (OUT) argument. -ERRMSG (optional) shall be a noncoindexed default character scalar. It is an INTENT (INOUT) argument. -4 The semantics of STAT and ERRMSG are described in 16.6. -5 Example. [596:19-20] In Annex A.2 Processor dependencies, replace the following From 06809e085be4aba2decf18da435a6df7994d1044 Mon Sep 17 00:00:00 2001 From: bonachea Date: Tue, 21 Oct 2025 13:14:55 -0400 Subject: [PATCH 07/34] Add initial draft of CO_REDUCE_PREFIX_EXCLUSIVE --- drafts/25-WIP-collective-edits.txt | 63 ++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index fa0576f..c118136 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -53,6 +53,69 @@ after the entry for CO_SUM add: "CO_SUM_PREFIX_INCLUSIVE (A, [, STAT, ERRMSG]) C Compute inclusive prefix sum across images." +[411:20+] In 16.9 Specifications of the standard intrinsic procedures, +after the specification of CO_REDUCE, add: + +16.9.?? CO_REDUCE_PREFIX_EXCLUSIVE(A, OPERATION, IDENTITY [, STAT, ERRMSG]) + +<> Generalized exclusive prefix reduction across images. + +<> Collective subroutine. + +<> + +A shall not be polymorphic. It shall not be of a type with an ultimate + component that is allocatable or a pointer. It shall have the same shape, + type, and type parameter values, in corresponding references. It shall + not be a coindexed object. It is an INTENT (INOUT) argument. + + If A is scalar, the computed value provided to any given image is the + result of the exclusive prefix reduction operation described below. 
+ If A is an array, each element of the computed value provided to any + given image is equal to the result of the exclusive prefix reduction + operation described below, as applied to corresponding elements of A in + corresponding references. + + The computed value is assigned to A if no error condition + occurs. Otherwise, A becomes undefined. + +IDENTITY shall be a scalar with the same declared type and type parameters + as A. IDENTITY shall have the same value in corresponding + references + +OPERATION shall be a pure function with exactly two arguments; the result + and each argument shall be a scalar, nonallocatable, noncoarray, + nonpointer, nonpolymorphic data object with the same type and + type parameters as A. The arguments shall not be optional. If one + argument has the ASYNCHRONOUS, TARGET, or VALUE attribute, the + other shall have that attribute. OPERATION shall implement a + mathematically associative operation. OPERATION shall be the same + function on all images in corresponding references. + + ** TODO: Description below probably needs more work. It's currently + rather different from the verbal structure of CO_REDUCE. ** + + The computed value for an exclusive prefix reduction over a list of + values is the result of an iterative process. + Each scalar input value provided by image i is referred to as A_i. + The corresponding computed result value provided to image i is referred + to as R_i. + S_i is the ordered list [IDENTITY, A_1, ..., A_{i-1}]. + The value of R_i is the result of applying OPERATION to adjacent + items of S_i, without commutation, until a single item remains. + +STAT (optional) shall be a noncoindexed integer scalar with a decimal + exponent range of at least four. It is an INTENT (OUT) argument. + +ERRMSG (optional) shall be a noncoindexed default character scalar. It + is an INTENT (INOUT) argument. + +The semantics of STAT and ERRMSG are described in 16.6. 
+ +<> + +*** TODO *** + [412:4+] In 16.9 Specifications of the standard intrinsic procedures, after the specification of CO_SUM, add: From ecfae6887ce8510900ff74ea42d6c2cb72b5699a Mon Sep 17 00:00:00 2001 From: bonachea Date: Tue, 21 Oct 2025 13:27:08 -0400 Subject: [PATCH 08/34] First draft of result language for SUM_PREFIX --- drafts/25-WIP-collective-edits.txt | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index c118136..060dc13 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -82,7 +82,6 @@ A shall not be polymorphic. It shall not be of a type with an ultimate IDENTITY shall be a scalar with the same declared type and type parameters as A. IDENTITY shall have the same value in corresponding references - OPERATION shall be a pure function with exactly two arguments; the result and each argument shall be a scalar, nonallocatable, noncoarray, nonpointer, nonpolymorphic data object with the same type and @@ -131,7 +130,15 @@ A shall be of numeric type. It shall have the same shape, type, and type parameter values, in corresponding references. It shall not be a coindexed object. It is an INTENT (INOUT) argument. - **** TODO: description of result **** + The computed value provided to image one is equal to the value zero. + If A is scalar, the computed value provided to any given image I (with I + greater than one) is equal to a processor-dependent approximation to the + sum of the values of A in corresponding references provided by images + (1:I-1). + If A is an array, each element of the computed value provided to any + given image I (with I greater than one) is equal to a processor-dependent + approximation to the sum of the values in corresponding elements of A in + corresponding references provided by images (1:I-1). The computed value is assigned to A if no error condition occurs. 
Otherwise, A becomes undefined. @@ -163,7 +170,13 @@ A shall be of numeric type. It shall have the same shape, type, and type parameter values, in corresponding references. It shall not be a coindexed object. It is an INTENT (INOUT) argument. - **** TODO: description of result **** + If A is scalar, the computed value provided to any given image I is equal + to a processor-dependent approximation to the sum of the values of A in + corresponding references provided by images (1:I). + If A is an array, each element of the computed value provided to any + given image I is equal to a processor-dependent approximation to the sum + of the values in corresponding elements of A in corresponding references + provided by images (1:I). The computed value is assigned to A if no error condition occurs. Otherwise, A becomes undefined. From 2a41d7a66c2bfc9ed7c92a3003887051c267d945 Mon Sep 17 00:00:00 2001 From: bonachea Date: Wed, 22 Oct 2025 00:37:14 -0400 Subject: [PATCH 09/34] Wording consistency cleanup: "type parameter values" --- drafts/25-WIP-collective-edits.txt | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 060dc13..35e424a 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -79,15 +79,16 @@ A shall not be polymorphic. It shall not be of a type with an ultimate The computed value is assigned to A if no error condition occurs. Otherwise, A becomes undefined. -IDENTITY shall be a scalar with the same declared type and type parameters - as A. IDENTITY shall have the same value in corresponding - references +IDENTITY shall be a scalar with the same declared type and type parameter + values as A. IDENTITY shall have the same value in corresponding + references. 
+ OPERATION shall be a pure function with exactly two arguments; the result and each argument shall be a scalar, nonallocatable, noncoarray, nonpointer, nonpolymorphic data object with the same type and - type parameters as A. The arguments shall not be optional. If one - argument has the ASYNCHRONOUS, TARGET, or VALUE attribute, the - other shall have that attribute. OPERATION shall implement a + type parameter values as A. The arguments shall not be optional. + If one argument has the ASYNCHRONOUS, TARGET, or VALUE attribute, + the other shall have that attribute. OPERATION shall implement a mathematically associative operation. OPERATION shall be the same function on all images in corresponding references. @@ -315,7 +316,7 @@ R07. OPERATION shall be a pure function. R09. OPERATION shall accept exactly two arguments; the result and each argument must be a scalar, nonallocatable, noncoarray, nonpointer, nonpolymorphic, nonoptional data object with the same - declared type and type parameters as the input ARRAY. + declared type and type parameter values as the input ARRAY. R11. OPERATION shall implement a mathematically associative operation. From ad4ae1dd74a317015722f252fd923c93a1538df2 Mon Sep 17 00:00:00 2001 From: bonachea Date: Wed, 22 Oct 2025 13:10:29 -0400 Subject: [PATCH 10/34] replace (1:N) syntax --- drafts/25-WIP-collective-edits.txt | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 35e424a..289d849 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -88,7 +88,8 @@ OPERATION shall be a pure function with exactly two arguments; the result nonpointer, nonpolymorphic data object with the same type and type parameter values as A. The arguments shall not be optional. If one argument has the ASYNCHRONOUS, TARGET, or VALUE attribute, - the other shall have that attribute. 
OPERATION shall implement a + the + other shall have that attribute. OPERATION shall implement a mathematically associative operation. OPERATION shall be the same function on all images in corresponding references. @@ -135,11 +136,11 @@ A shall be of numeric type. It shall have the same shape, type, and If A is scalar, the computed value provided to any given image I (with I greater than one) is equal to a processor-dependent approximation to the sum of the values of A in corresponding references provided by images - (1:I-1). + 1 to (I-1). If A is an array, each element of the computed value provided to any given image I (with I greater than one) is equal to a processor-dependent approximation to the sum of the values in corresponding elements of A in - corresponding references provided by images (1:I-1). + corresponding references provided by images 1 to (I-1). The computed value is assigned to A if no error condition occurs. Otherwise, A becomes undefined. @@ -173,11 +174,11 @@ A shall be of numeric type. It shall have the same shape, type, and If A is scalar, the computed value provided to any given image I is equal to a processor-dependent approximation to the sum of the values of A in - corresponding references provided by images (1:I). + corresponding references provided by images 1 to I. If A is an array, each element of the computed value provided to any given image I is equal to a processor-dependent approximation to the sum of the values in corresponding elements of A in corresponding references - provided by images (1:I). + provided by images 1 to I. The computed value is assigned to A if no error condition occurs. Otherwise, A becomes undefined. From ac338eb749d5634f8bb1ed800ff006021a2529bb Mon Sep 17 00:00:00 2001 From: bonachea Date: Wed, 22 Oct 2025 14:59:58 -0400 Subject: [PATCH 11/34] Add CO_REDUCE_PREFIX_EXCLUSIVE operation wording crafted in our meeting Also add some notes below in the temporary text to be removed. 
--- drafts/25-WIP-collective-edits.txt | 58 +++++++++++++++++++++++++----- 1 file changed, 50 insertions(+), 8 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 289d849..a2d804b 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -88,22 +88,24 @@ OPERATION shall be a pure function with exactly two arguments; the result nonpointer, nonpolymorphic data object with the same type and type parameter values as A. The arguments shall not be optional. If one argument has the ASYNCHRONOUS, TARGET, or VALUE attribute, - the - other shall have that attribute. OPERATION shall implement a + the other shall have that attribute. OPERATION shall implement a mathematically associative operation. OPERATION shall be the same function on all images in corresponding references. - - ** TODO: Description below probably needs more work. It's currently - rather different from the verbal structure of CO_REDUCE. ** + + ** TODO: Maybe append: + , and shall not depend on the value of THIS_IMAGE(). The computed value for an exclusive prefix reduction over a list of values is the result of an iterative process. Each scalar input value provided by image i is referred to as A_i. The corresponding computed result value provided to image i is referred to as R_i. - S_i is the ordered list [IDENTITY, A_1, ..., A_{i-1}]. - The value of R_i is the result of applying OPERATION to adjacent - items of S_i, without commutation, until a single item remains. + S_i is initially the ordered list [IDENTITY, A_1, ..., A_{i-1}]. + Each iteration starts with a processor-dependent choice of item x from + the list S_i. Adjacent items x and y (where x precedes y) are removed + from the list and replaced with the value of OPERATION(x, y). The + process terminates when the list has only one item; this is the computed + value of R_i. 
STAT (optional) shall be a noncoindexed integer scalar with a decimal exponent range of at least four. It is an INTENT (OUT) argument. @@ -218,6 +220,46 @@ with the following line: ===================================================== +CO_REDUCE verbiage: + +If A is scalar, the computed value is the result of the reduction operation of +applying OPERATION to the values of A in all corresponding references. If A is +an array, each element of the computed value is equal to the result of the +reduction operation of applying OPERATION to corresponding elements of A in all +corresponding references. +... +The computed value of a reduction operation over a set of values is the +result of an iterative process. Each iteration involves the evaluation of +OPERATION (x, y) for x and y in the set, the removal of x and y from the +set, and the addition of the value of OPERATION (x, y) to the set. The +process terminates when the set has only one element; this is the computed +value. + +Pathological example: + +pure function OPERATION(x,y) result(r) + INTEGER :: x, y, r + r = MAX(a,b,THIS_IMAGE()) +end function + +This OPERATION is a pure function as defined in F23 15.7 and 16.1. +It is associative and commutative. +It also satisfies all the other requirements for the OPERATION argument to +CO_REDUCE or CO_REDUCE_PREFIX_* with A of integer type. + +Passing this OPERATION along with the following arguments to either +CO_REDUCE or CO_REDUCE_PREFIX_* will reveal some information about which +images evaluated the OPERATION for any given input element (and for any +given image's result). + +A = [0, 0] +IDENTITY = 0 + +Other similar OPERATION functions can be crafted over a derived type to +reveal arbitrary information about which images executed the operation and +even in what order. + + 3. 
Image ordering ================= From be6023c6d65e7272dd4390ca28f56c471a290c07 Mon Sep 17 00:00:00 2001 From: bonachea Date: Thu, 23 Oct 2025 13:17:11 -0400 Subject: [PATCH 12/34] Update TO-DO list --- drafts/25-WIP-collective-edits.txt | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index a2d804b..1b90563 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -11,10 +11,9 @@ TODO 4. DONE: Update Annex A.2 5. Order of words in the name? RESOLVED: leave as-is for this paper -6. Figure out whether and how this paper should handle the "merge conflict" - with addition of TEAM argument in 25-127r1 (edits passed last mtg). -7. Figure out whether and how this paper should handle the "merge conflict" - with addition of COMPLETION argument in 25-165r1 (edits in-progress). +6. Add multiple forms as resolved in plenary, incorporating TEAM and + COMPLETION in syntax +7. Clone REDUCE_INCLUSIVE 1. Background ============= From c298527384ef783f731ad1f758e04bf9ce02024c Mon Sep 17 00:00:00 2001 From: bonachea Date: Thu, 23 Oct 2025 13:36:24 -0400 Subject: [PATCH 13/34] More updates from meeting discussion --- drafts/25-WIP-collective-edits.txt | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 1b90563..dc42194 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -13,6 +13,7 @@ TODO RESOLVED: leave as-is for this paper 6. Add multiple forms as resolved in plenary, incorporating TEAM and COMPLETION in syntax + Leave placeholders for description of new arguments for r0 7. Clone REDUCE_INCLUSIVE 1. Background @@ -91,9 +92,6 @@ OPERATION shall be a pure function with exactly two arguments; the result mathematically associative operation. 
OPERATION shall be the same function on all images in corresponding references. - ** TODO: Maybe append: - , and shall not depend on the value of THIS_IMAGE(). - The computed value for an exclusive prefix reduction over a list of values is the result of an iterative process. Each scalar input value provided by image i is referred to as A_i. @@ -117,6 +115,15 @@ The semantics of STAT and ERRMSG are described in 16.6. <> *** TODO *** +Examples: +EXCLUSIVE + MAXLOC over a derived type of real value and integer image ID + computes max value in prefix and the image that provided it +INCLUSIVE + derived type: value and boolean + operation on boolean flag is XOR + illustrates segmented prefix reduction + MPI Example 6.24. [412:4+] In 16.9 Specifications of the standard intrinsic proceedures, after the specification of CO_SUM, add: @@ -218,6 +225,9 @@ with the following line: OLD STUFF BELOW ===================================================== + ** TODO: Maybe append to OPERATION requirements for REDUCE: + , and shall not depend on the value of THIS_IMAGE(). + CO_REDUCE verbiage: From 95c271b320daefa4292278642d9852a4e67b92c5 Mon Sep 17 00:00:00 2001 From: bonachea Date: Thu, 23 Oct 2025 19:15:07 -0400 Subject: [PATCH 14/34] Form-expand intrinsic interfaces --- drafts/25-WIP-collective-edits.txt | 73 ++++++++++++++++++++++++------ 1 file changed, 60 insertions(+), 13 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index dc42194..3d086f6 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -40,23 +40,64 @@ CO_REDUCE_PREFIX_EXCLUSIVE perform collective prefix reduction operations across images." [383] In 16.7 Standard generic intrinsic procedures, Table 16.1, -after the entry for CO_REDUCE add: -"CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY [, STAT, ERRMSG]) - C Generalized exclusive prefix reduction across images." 
-"CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, [, STAT, ERRMSG]) - C Generalized inclusive prefix reduction across images." +after the entry for CO_REDUCE add two new entries (with four forms each): + +" +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY [, STAT, ERRMSG]) or \ + C Generalized exclusive prefix reduction across images. +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, COMPLETION \ + [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM \ + [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM, COMPLETION \ + [, STAT, ERRMSG]) +" + +and: + +" +CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION [, STAT, ERRMSG]) or \ + C Generalized inclusive prefix reduction across images. +CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, COMPLETION \ + [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM \ + [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM, COMPLETION \ + [, STAT, ERRMSG]) +" [383] In 16.7 Standard generic intrinsic procedures, Table 16.1, -after the entry for CO_SUM add: -"CO_SUM_PREFIX_EXCLUSIVE (A, [, STAT, ERRMSG]) - C Compute exclusive prefix sum across images." -"CO_SUM_PREFIX_INCLUSIVE (A, [, STAT, ERRMSG]) - C Compute inclusive prefix sum across images." +after the entry for CO_SUM add two new entries (with four forms each): + +" +CO_SUM_PREFIX_EXCLUSIVE (A, [, STAT, ERRMSG]) or \ + C Compute exclusive prefix sum across images. +CO_SUM_PREFIX_EXCLUSIVE (A, COMPLETION [, STAT, ERRMSG]) or +CO_SUM_PREFIX_EXCLUSIVE (A, TEAM [, STAT, ERRMSG]) or +CO_SUM_PREFIX_EXCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG]) +" + +and: + +" +CO_SUM_PREFIX_INCLUSIVE (A, [, STAT, ERRMSG]) or \ + C Compute inclusive prefix sum across images. 
+CO_SUM_PREFIX_INCLUSIVE (A, COMPLETION [, STAT, ERRMSG]) or +CO_SUM_PREFIX_INCLUSIVE (A, TEAM [, STAT, ERRMSG]) or +CO_SUM_PREFIX_INCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG]) +" [411:20+] In 16.9 Specifications of the standard intrinsic proceedures, after the specification of CO_REDUCE, add: -16.9.?? CO_REDUCE_PREFIX_EXCLUSIVE(A, OPERATION, IDENTITY [, STAT, ERRMSG]) +16.9.?? \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, COMPLETION \ + [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM \ + [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM, COMPLETION \ + [, STAT, ERRMSG]) <> Generalized exclusive prefix reduction across images. @@ -128,7 +169,10 @@ INCLUSIVE [412:4+] In 16.9 Specifications of the standard intrinsic proceedures, after the specification of CO_SUM, add: -16.9.?? CO_SUM_PREFIX_EXCLUSIVE(A [, STAT, ERRMSG]) +16.9.?? CO_SUM_PREFIX_EXCLUSIVE (A [, STAT, ERRMSG]) or + CO_SUM_PREFIX_EXCLUSIVE (A, COMPLETION [, STAT, ERRMSG]) or + CO_SUM_PREFIX_EXCLUSIVE (A, TEAM [, STAT, ERRMSG]) or + CO_SUM_PREFIX_EXCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG]) <> Compute exclusive prefix sum across images. @@ -168,7 +212,10 @@ CO_SUM_PREFIX_EXCLUSIVE(A), the value of A is [0, 0] on image one, [1, 2] on image two, and [4, 6] on image three. -16.9.?? CO_SUM_PREFIX_INCLUSIVE(A [, STAT, ERRMSG]) +16.9.?? CO_SUM_PREFIX_INCLUSIVE (A [, STAT, ERRMSG]) or + CO_SUM_PREFIX_INCLUSIVE (A, COMPLETION [, STAT, ERRMSG]) or + CO_SUM_PREFIX_INCLUSIVE (A, TEAM [, STAT, ERRMSG]) or + CO_SUM_PREFIX_INCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG]) <> Compute inclusive prefix sum across images. 
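The prefix-sum wording in these hunks can be checked against the draft's own example values with a small serial sketch. It is only an illustration under stated assumptions: the proposed intrinsics are not called (no processor is assumed to implement them), images are emulated by the columns of an ordinary array, and the names prefix_sum_sketch, n_images, n_elems, incl and excl are hypothetical.

```fortran
program prefix_sum_sketch
  ! Serial emulation only: column i of "a" stands in for the value of A
  ! supplied by image i. No coarray features are used.
  implicit none
  integer, parameter :: n_images = 3, n_elems = 2
  integer :: a(n_elems, n_images)
  integer :: incl(n_elems, n_images), excl(n_elems, n_images)
  integer :: i

  ! A is [1, 2] on image one, [3, 4] on image two, [5, 6] on image three.
  a = reshape([1, 2, 3, 4, 5, 6], [n_elems, n_images])

  do i = 1, n_images
    ! Inclusive: elementwise sum of the values from images 1 to i.
    incl(:, i) = sum(a(:, 1:i), dim=2)
    ! Exclusive: elementwise sum of the values from images 1 to i-1.
    ! For i = 1 the section is empty, so SUM returns zero.
    excl(:, i) = sum(a(:, 1:i-1), dim=2)
  end do

  do i = 1, n_images
    print '(a,i0,a,2(1x,i0),a,2(1x,i0))', 'image ', i, &
      ': exclusive =', excl(:, i), '  inclusive =', incl(:, i)
  end do
end program
```

With these inputs the emulated exclusive results are [0, 0], [1, 2] and [4, 6], and the inclusive results are [1, 2], [4, 6] and [9, 12], which agrees with the example text in the draft.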
From f9f9ea08048ac77242dfaf2f961604fe9318c44e Mon Sep 17 00:00:00 2001 From: bonachea Date: Thu, 23 Oct 2025 19:58:10 -0400 Subject: [PATCH 15/34] Add section on syntax adjustments --- drafts/25-WIP-collective-edits.txt | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 3d086f6..202c576 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -2,7 +2,7 @@ To: J3 J3/25-#### From: Brandon Cook & Dan Bonachea Subject: Edits for US20 Collective Subroutines for Prefix Reductions Date: 2025-October-17 -References: 25-177r1, J3/25-144r1, J3/25-145r1, J3/25-007r1, WG5/N-2239 +References: 25-177r1, 25-144r1, 25-166r2, 25-195r1, 25-007r1, WG5/N-2239 TODO 1. DONE Changelog in Introduction @@ -22,15 +22,31 @@ TODO The Fortran 202Y work list (WG5/N-2239) includes work item US20: "Add Intrinsic and collective subroutines for prefix operations" -Paper J3/25-144r1 "Requirements for US20: Collective Subroutines for +Paper 25-144r1 "Requirements for US20: Collective Subroutines for Prefix Operations" presents illustrative use cases and requirements for collective subroutines for prefix reduction. That paper was passed at J3 meeting #236 in June 2025. Specifications and syntax for the collective subroutine variants of prefix reduction operations, 25-177r1, was passed in the October 2025 meeting #237. +2. Syntax Adjustments +===================== -2. Edits to J3/25-007r1: +Since the passage of 25-177r1, subsequent papers 25-166r2 and 25-195r1 +have suggested additional syntax adjustments in order to maintain +uniformity with closely related features. + +Syntax changes in this paper, relative to 25-177r1 are as follows: + +1. Additional forms have been introduced to accomodate the + presence of the TEAM argument (25-127r1) and the COMPLETION + argument (25-166r2). + +2. 
The IDENTITY argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to + INITIAL (as recommended by 25-195r1). + +3. Edits Relative to 25-007r1 +============================= [xv] Add to "Intrinsic procedures" the sentences: @@ -321,7 +337,7 @@ even in what order. All the intrinsics proposed in this paper are collective subroutines, and will be subject to all of the common requirements specified in -section 16.6 of J3/25-007r1. So for example, they must be invoked +section 16.6 of 25-007r1. So for example, they must be invoked collectively by the same statement on all active images in the current team, with arguments that meet specified constraints for corresponding references. @@ -385,7 +401,7 @@ S15. The computed value is assigned to A if no error condition S17. The specifications for STAT and ERRMSG directly mirror the same arguments in the existing collective subroutines, and the semantics - of STAT and ERRMSG are described in section 16.6 of J3/25-007r1. + of STAT and ERRMSG are described in section 16.6 of 25-007r1. 5. Collective CO_REDUCE_PREFIX subroutines =========================================== @@ -442,6 +458,6 @@ R19. Definition of computed values assigned to A. R21. The specifications for STAT and ERRMSG directly mirror the same arguments in the existing collective subroutines, and the semantics of STAT and ERRMSG are described in section 16.6 of - J3/25-007r1. + 25-007r1. 
===END=== From 0981e717204cd67aa1ccca73c6981cba5f0aecab Mon Sep 17 00:00:00 2001 From: bonachea Date: Thu, 23 Oct 2025 19:59:12 -0400 Subject: [PATCH 16/34] Rename IDENTITY to INITIAL --- drafts/25-WIP-collective-edits.txt | 34 +++++++++++++++--------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index 202c576..eae7c79 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -42,7 +42,7 @@ Syntax changes in this paper, relative to 25-177r1 are as follows: presence of the TEAM argument (25-127r1) and the COMPLETION argument (25-166r2). -2. The IDENTITY argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to +2. The INITIAL argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to INITIAL (as recommended by 25-195r1). 3. Edits Relative to 25-007r1 @@ -59,13 +59,13 @@ operations across images." after the entry for CO_REDUCE add two new entries (with four forms each): " -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY [, STAT, ERRMSG]) or \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL [, STAT, ERRMSG]) or \ C Generalized exclusive prefix reduction across images. -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, COMPLETION \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, COMPLETION \ [, STAT, ERRMSG]) or -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM \ [, STAT, ERRMSG]) or -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM, COMPLETION \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM, COMPLETION \ [, STAT, ERRMSG]) " @@ -107,12 +107,12 @@ CO_SUM_PREFIX_INCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG]) after the specification of CO_REDUCE, add: 16.9.?? 
\ -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY [, STAT, ERRMSG]) or -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, COMPLETION \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL [, STAT, ERRMSG]) or +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, COMPLETION \ [, STAT, ERRMSG]) or -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM \ [, STAT, ERRMSG]) or -CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, IDENTITY, TEAM, COMPLETION \ +CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM, COMPLETION \ [, STAT, ERRMSG]) <> Generalized exclusive prefix reduction across images. @@ -136,8 +136,8 @@ A shall not be polymorphic. It shall not be of a type with an ultimate The computed value is assigned to A if no error condition occurs. Otherwise, A becomes undefined. -IDENTITY shall be a scalar with the same declared type and type parameter - values as A. IDENTITY shall have the same value in corresponding +INITIAL shall be a scalar with the same declared type and type parameter + values as A. INITIAL shall have the same value in corresponding references. OPERATION shall be a pure function with exactly two arguments; the result @@ -154,7 +154,7 @@ OPERATION shall be a pure function with exactly two arguments; the result Each scalar input value provided by image i is referred to as A_i. The corresponding computed result value provided to image i is referred to as R_i. - S_i is initially the ordered list [IDENTITY, A_1, ..., A_{i-1}]. + S_i is initially the ordered list [INITIAL, A_1, ..., A_{i-1}]. Each iteration starts with a processor-dependent choice of item x from the list S_i. Adjacent items x and y (where x preceeds y) are removed from the list and replaced with the value of OPERATION(x, y). The @@ -325,7 +325,7 @@ images evaluated the OPERATION for any given input element (and for any given image's result). 
A = [0, 0] -IDENTITY = 0 +INITIAL = 0 Other similar OPERATION functions can be crafted over a derived type to reveal arbitrary information about which images executed the operation and @@ -412,7 +412,7 @@ Generalized prefix reduction across images. ---------- CO_REDUCE_PREFIX_INCLUSIVE(A, OPERATION [, STAT, ERRMSG]) -CO_REDUCE_PREFIX_EXCLUSIVE(A, OPERATION, IDENTITY [, STAT, ERRMSG]) +CO_REDUCE_PREFIX_EXCLUSIVE(A, OPERATION, INITIAL [, STAT, ERRMSG]) 5.1 Specifications ------------------ @@ -438,10 +438,10 @@ R11. OPERATION shall implement a mathematically associative operation. R13. OPERATION shall be the same function on all images in corresponding references. -R15. IDENTITY shall be a scalar with the same declared type and type +R15. INITIAL shall be a scalar with the same declared type and type parameters as A. -R17. IDENTITY shall have the same value in corresponding references. +R17. INITIAL shall have the same value in corresponding references. R19. Definition of computed values assigned to A. @@ -449,7 +449,7 @@ R19. Definition of computed values assigned to A. The computed value provided to image i is referred to as R_i. In the inclusive case S_i is the ordered list [A_1, ..., A_i]. In the exclusive case S_i is the ordered list - [IDENTITY, A_1, ..., A_{i-1}]. + [INITIAL, A_1, ..., A_{i-1}]. The value of R_i is the result of applying OPERATION to adjacent items of S_i, without commutation, until a single item remains. If A is an array, OPERATION is applied elementwise. 
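The list-based definition of R_i above (S_i reduced by combining adjacent items, without commutation) can be illustrated with a serial sketch. This is an assumption-laden illustration rather than the proposed implementation: images are emulated by array elements, the intrinsics themselves are not called, and the type affine, the function compose and all variable names are hypothetical. The operation is chosen to be associative but not commutative, so the ordering of S_i matters while the grouping does not.

```fortran
module affine_ops
  implicit none
  type :: affine            ! hypothetical type: the map t -> m*t + c
    integer :: m, c
  end type
contains
  ! Hypothetical OPERATION: apply x first, then y.
  ! Associative, but not commutative.
  pure function compose(x, y) result(r)
    type(affine), intent(in) :: x, y
    type(affine) :: r
    r%m = y%m * x%m
    r%c = y%m * x%c + y%c
  end function
end module

program reduce_prefix_sketch
  use affine_ops
  implicit none
  integer, parameter :: n_images = 4
  type(affine) :: a(n_images), initial, r_excl, r_incl, r_alt
  integer :: i, j

  initial = affine(1, 0)    ! identity map, playing the role of INITIAL
  a = [affine(2, 1), affine(3, 0), affine(1, 5), affine(2, 2)]

  do i = 1, n_images
    ! Exclusive: S_i = [INITIAL, A_1, ..., A_{i-1}], combined left to right.
    r_excl = initial
    do j = 1, i - 1
      r_excl = compose(r_excl, a(j))
    end do

    ! Inclusive: S_i = [A_1, ..., A_i], combined left to right.
    r_incl = a(1)
    do j = 2, i
      r_incl = compose(r_incl, a(j))
    end do

    ! Another admissible choice of adjacent pairs (right to left).
    ! Operand order is preserved, so associativity makes the results agree.
    r_alt = a(i)
    do j = i - 1, 1, -1
      r_alt = compose(a(j), r_alt)
    end do
    if (r_alt%m /= r_incl%m .or. r_alt%c /= r_incl%c) error stop 'orders disagree'

    print '(a,i0,a,i0,1x,i0,a,i0,1x,i0)', 'image ', i, &
      ': exclusive = ', r_excl%m, r_excl%c, '  inclusive = ', r_incl%m, r_incl%c
  end do
end program
```

The left-to-right loops are just one order in which adjacent items may be combined; the right-to-left check shows a second order giving the same value, which is why the text only requires OPERATION to be mathematically associative.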
From d9dfc8579711895e497fe6a29a9719664f25a051 Mon Sep 17 00:00:00 2001 From: bonachea Date: Thu, 23 Oct 2025 22:58:18 -0400 Subject: [PATCH 17/34] Move notes into a separate file Spell-check --- drafts/25-WIP-collective-edits.txt | 208 +---------------------------- drafts/coll-edit-notes.txt | 51 +++++++ 2 files changed, 56 insertions(+), 203 deletions(-) create mode 100644 drafts/coll-edit-notes.txt diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt index eae7c79..ecaa784 100644 --- a/drafts/25-WIP-collective-edits.txt +++ b/drafts/25-WIP-collective-edits.txt @@ -4,18 +4,6 @@ Subject: Edits for US20 Collective Subroutines for Prefix Reductions Date: 2025-October-17 References: 25-177r1, 25-144r1, 25-166r2, 25-195r1, 25-007r1, WG5/N-2239 -TODO -1. DONE Changelog in Introduction -2. DONE: Insert routines in Table 16.1 -3. Add a new section for each new collective (4 in total). -4. DONE: Update Annex A.2 -5. Order of words in the name? - RESOLVED: leave as-is for this paper -6. Add multiple forms as resolved in plenary, incorporating TEAM and - COMPLETION in syntax - Leave placeholders for description of new arguments for r0 -7. Clone REDUCE_INCLUSIVE - 1. Background ============= @@ -38,7 +26,7 @@ uniformity with closely related features. Syntax changes in this paper, relative to 25-177r1 are as follows: -1. Additional forms have been introduced to accomodate the +1. Additional forms have been introduced to accommodate the presence of the TEAM argument (25-127r1) and the COMPLETION argument (25-166r2). @@ -103,7 +91,7 @@ CO_SUM_PREFIX_INCLUSIVE (A, TEAM [, STAT, ERRMSG]) or CO_SUM_PREFIX_INCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG]) " -[411:20+] In 16.9 Specifications of the standard intrinsic proceedures, +[411:20+] In 16.9 Specifications of the standard intrinsic procedures, after the specification of CO_REDUCE, add: 16.9.?? 
\ @@ -156,7 +144,7 @@ OPERATION shall be a pure function with exactly two arguments; the result to as R_i. S_i is initially the ordered list [INITIAL, A_1, ..., A_{i-1}]. Each iteration starts with a processor-dependent choice of item x from - the list S_i. Adjacent items x and y (where x preceeds y) are removed + the list S_i. Adjacent items x and y (where x precedes y) are removed from the list and replaced with the value of OPERATION(x, y). The process terminates when the list has only one item; this is the computed value of R_i. @@ -172,17 +160,8 @@ The semantics of STAT and ERRMSG are described in 16.6. <> *** TODO *** -Examples: -EXCLUSIVE - MAXLOC over a derived type of real value and integer image ID - computes max value in prefix and the image that provided it -INCLUSIVE - derived type: value and boolean - operation on boolean flag is XOR - illustrates segmented prefix reduction - MPI Example 6.24. - -[412:4+] In 16.9 Specifications of the standard intrinsic proceedures, + +[412:4+] In 16.9 Specifications of the standard intrinsic procedures, after the specification of CO_SUM, add: 16.9.?? CO_SUM_PREFIX_EXCLUSIVE (A [, STAT, ERRMSG]) or @@ -283,181 +262,4 @@ with the following line: (16.9.??), CO_SUM (16.9.58), CO_SUM_PREFIX_EXCLUSIVE (16.9.??) and CO_SUM_PREFIX_INCLUSIVE (16.9.??);" - -===================================================== - OLD STUFF BELOW -===================================================== - - ** TODO: Maybe append to OPERATION requirements for REDUCE: - , and shall not depend on the value of THIS_IMAGE(). - - -CO_REDUCE verbiage: - -If A is scalar, the computed value is the result of the reduction operation of -applying OPERATION to the values of A in all corresponding references. If A is -an array, each element of the computed value is equal to the result of the -reduction operation of applying OPERATION to corresponding elements of A in all -corresponding references. -... 
-The computed value of a reduction operation over a set of values is the -result of an iterative process. Each iteration involves the evaluation of -OPERATION (x, y) for x and y in the set, the removal of x and y from the -set, and the addition of the value of OPERATION (x, y) to the set. The -process terminates when the set has only one element; this is the computed -value. - -Pathological example: - -pure function OPERATION(x,y) result(r) - INTEGER :: x, y, r - r = MAX(a,b,THIS_IMAGE()) -end function - -This OPERATION is a pure function as defined in F23 15.7 and 16.1. -It is associative and commutative. -It also satisfies all the other requirements for the OPERATION argument to -CO_REDUCE or CO_REDUCE_PREFIX_* with A of integer type. - -Passing this OPERATION along with the following arguments to either -CO_REDUCE or CO_REDUCE_PREFIX_* will reveal some information about which -images evaluated the OPERATION for any given input element (and for any -given image's result). - -A = [0, 0] -INITIAL = 0 - -Other similar OPERATION functions can be crafted over a derived type to -reveal arbitrary information about which images executed the operation and -even in what order. - - -3. Image ordering -================= - -All the intrinsics proposed in this paper are collective subroutines, -and will be subject to all of the common requirements specified in -section 16.6 of 25-007r1. So for example, they must be invoked -collectively by the same statement on all active images in the current -team, with arguments that meet specified constraints for corresponding -references. - -Mathematically, a prefix reduction operation accepts an ordered list -of input values and computes an ordered list of output result -values. We propose collective prefix reductions where both these input -and output lists are ordered according to the image indexes in the -selected team. 
Specifically, for an inclusive prefix reduction, the -result R_i provided to image i is computed using the inputs provided -by images (1:i). For an exclusive prefix reduction, the result R_i -provided to image i is computed using the inputs provided by images -(1:i-1). - -4. Collective CO_SUM_PREFIX subroutines -======================================== - -Prefix reduction with sum (addition) across images. - -4.0 Syntax ----------- - -CO_SUM_PREFIX_INCLUSIVE(A [, STAT, ERRMSG]) -CO_SUM_PREFIX_EXCLUSIVE(A [, STAT, ERRMSG]) - -CO_SUM_PREFIX_EXCLUSIVE -CO_PREFIX_SUM_INCLUSIVE -CO_SUM_INCLUSIVE_PREFIX -CO_INCLUSIVE_PREFIX_SUM - -4.1 Specifications ------------------- - -S01. A shall be of numeric type. - -S03. A shall have the same shape, type, and type parameter values in - corresponding references. - -S05. A is an INTENT(INOUT) argument and shall not be a coindexed - object. - -S07. Each element of the computed value assigned into A is equal to a - processor-dependent approximation to the inclusive/exclusive - (respectively) prefix sum of corresponding elements of A provided - in corresponding references. - -S09. Definition of computed values assigned to A. - - The input value provided by image i is referred to as A_i. - The computed value provided to image i is referred to as R_i. - In the inclusive case, S_i is the ordered list [A_1, ..., A_i]. - In the exclusive case, S_i is the ordered list [A_1, ..., A_{i-1}]. - If A is scalar, the value of R_i is a processor-dependent - approximation of the sum of the elements of S_i. - If A is an array, each element in the computed value of R_i is a - processor-dependent approximation of the sum of corresponding - elements across the elements of S_i. - -S15. The computed value is assigned to A if no error condition - occurs. Otherwise, A becomes undefined (as in CO_SUM). - -S17. 
The specifications for STAT and ERRMSG directly mirror the same - arguments in the existing collective subroutines, and the semantics - of STAT and ERRMSG are described in section 16.6 of 25-007r1. - -5. Collective CO_REDUCE_PREFIX subroutines -=========================================== - -Generalized prefix reduction across images. - -5.0 Syntax ----------- - -CO_REDUCE_PREFIX_INCLUSIVE(A, OPERATION [, STAT, ERRMSG]) -CO_REDUCE_PREFIX_EXCLUSIVE(A, OPERATION, INITIAL [, STAT, ERRMSG]) - -5.1 Specifications ------------------- - -R01. A shall not be polymorphic or have an ultimate component that is - allocatable or a pointer. - -R03. A shall have the same shape, type, and type parameter values in - corresponding references. - -R05. A is an INTENT(INOUT) argument and shall not be a coindexed - object. - -R07. OPERATION shall be a pure function. - -R09. OPERATION shall accept exactly two arguments; the result and - each argument must be a scalar, nonallocatable, noncoarray, - nonpointer, nonpolymorphic, nonoptional data object with the same - declared type and type parameter values as the input ARRAY. - -R11. OPERATION shall implement a mathematically associative operation. - -R13. OPERATION shall be the same function on all images in - corresponding references. - -R15. INITIAL shall be a scalar with the same declared type and type - parameters as A. - -R17. INITIAL shall have the same value in corresponding references. - -R19. Definition of computed values assigned to A. - - The input value provided by image i is referred to as A_i. - The computed value provided to image i is referred to as R_i. - In the inclusive case S_i is the ordered list [A_1, ..., A_i]. - In the exclusive case S_i is the ordered list - [INITIAL, A_1, ..., A_{i-1}]. - The value of R_i is the result of applying OPERATION to adjacent - items of S_i, without commutation, until a single item remains. - If A is an array, OPERATION is applied elementwise. - - -R21. 
The specifications for STAT and ERRMSG directly mirror the same - arguments in the existing collective subroutines, and the - semantics of STAT and ERRMSG are described in section 16.6 of - 25-007r1. - ===END=== diff --git a/drafts/coll-edit-notes.txt b/drafts/coll-edit-notes.txt new file mode 100644 index 0000000..a4c694a --- /dev/null +++ b/drafts/coll-edit-notes.txt @@ -0,0 +1,51 @@ +Collective subroutines edits TODO: + +* Add multiple forms as resolved in plenary, incorporating TEAM and + COMPLETION in syntax + Leave placeholders for description of new arguments for r0 +* Clone REDUCE_INCLUSIVE +* Write REDUCE examples + +Examples: +EXCLUSIVE + MAXLOC over a derived type of real value and integer image ID + computes max value in prefix and the image that provided it +INCLUSIVE + derived type: value and boolean + operation on boolean flag is XOR + illustrates segmented prefix reduction + MPI Example 6.24. + +============================================ + +Pathological example: + +pure function OPERATION(x,y) result(r) + INTEGER :: x, y, r + r = MAX(a,b,THIS_IMAGE()) +end function + +This OPERATION is a pure function as defined in F23 15.7 and 16.1. +It is associative and commutative. +It also satisfies all the other requirements for the OPERATION argument to +CO_REDUCE or CO_REDUCE_PREFIX_* with A of integer type. + +Passing this OPERATION along with the following arguments to either CO_REDUCE +or CO_REDUCE_PREFIX_* will reveal some information about which images evaluated +the OPERATION for any given input element (and for any given image's result). + +A = [0, 0] +INITIAL = 0 + +Other similar OPERATION functions can be crafted over a derived type to reveal +arbitrary information about which images executed the operation and even in +what order. + +Potential resolution for all CO_REDUCE intrinsics: +"OPERATION shall not depend on the value of THIS_IMAGE()." 
+
+
+============================================
+
+
+

From 1d1b9397bf7938fdaacc1b756eb9102fcd1235c3 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Thu, 23 Oct 2025 23:03:27 -0400
Subject: [PATCH 18/34] Add separator lines to improve readability

---
 drafts/25-WIP-collective-edits.txt | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index ecaa784..2163b15 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -36,6 +36,7 @@ Syntax changes in this paper, relative to 25-177r1 are as follows:
 3. Edits Relative to 25-007r1
 =============================
 
+-------------------------------------------------------------------------
 [xv] Add to "Intrinsic procedures" the sentences:
 
 "The new intrinsic subroutines CO_SUM_PREFIX_INCLUSIVE,
@@ -43,6 +44,7 @@ CO_SUM_PREFIX_EXCLUSIVE, CO_REDUCE_PREFIX_INCLUSIVE, and
 CO_REDUCE_PREFIX_EXCLUSIVE perform collective prefix reduction
 operations across images."
 
+-------------------------------------------------------------------------
 [383] In 16.7 Standard generic intrinsic procedures, Table 16.1,
 after the entry for CO_REDUCE add two new entries (with four forms each):
 
@@ -70,6 +72,7 @@ CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM, COMPLETION \
                            [, STAT, ERRMSG])
 "
 
+-------------------------------------------------------------------------
 [383] In 16.7 Standard generic intrinsic procedures, Table 16.1,
 after the entry for CO_SUM add two new entries (with four forms each):
 
@@ -91,6 +94,7 @@ CO_SUM_PREFIX_INCLUSIVE (A, TEAM [, STAT, ERRMSG]) or
 CO_SUM_PREFIX_INCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG])
 "
 
+-------------------------------------------------------------------------
 [411:20+] In 16.9 Specifications of the standard intrinsic procedures,
 after the specification of CO_REDUCE, add:
 
@@ -161,6 +165,7 @@ The semantics of STAT and ERRMSG are described in 16.6.
 
 *** TODO ***
 
+-------------------------------------------------------------------------
 [412:4+] In 16.9 Specifications of the standard intrinsic procedures,
 after the specification of CO_SUM, add:
 
@@ -247,8 +252,7 @@ on image three, after executing the statement
 CALL CO_SUM_PREFIX_INCLUSIVE(A), the value of A is [1, 2] on image one,
 [4, 6] on image two, and [9, 12] on image three.
 
-
-
+-------------------------------------------------------------------------
 [596:19-20] In Annex A.2 Processor dependencies, replace the following
 line:
 
@@ -262,4 +266,6 @@ with the following line:
 (16.9.??), CO_SUM (16.9.58), CO_SUM_PREFIX_EXCLUSIVE (16.9.??) and
 CO_SUM_PREFIX_INCLUSIVE (16.9.??);"
 
+-------------------------------------------------------------------------
+
 ===END===

From 0071ccc0c05119334862bffb02f07c8a797847eb Mon Sep 17 00:00:00 2001
From: bonachea
Date: Thu, 23 Oct 2025 23:06:27 -0400
Subject: [PATCH 19/34] Add TEAM and COMPLETION arguments

---
 drafts/25-WIP-collective-edits.txt | 24 +++++++++++++++++++++---
 drafts/coll-edit-notes.txt         |  5 +----
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 2163b15..20977c0 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -153,13 +153,19 @@ OPERATION shall be a pure function with exactly two arguments; the result
   process terminates when the list has only one item; this is the computed
   value of R_i.
 
+TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
+  ISO_FORTRAN_ENV. It is an INTENT(IN) argument.
+
+COMPLETION shall be a scalar of type COMPLETION_TYPE from the intrinsic
+  module ISO_FORTRAN_ENV. It is an INTENT(INOUT) argument.
+
 STAT (optional) shall be a noncoindexed integer scalar with a decimal
   exponent range of at least four. It is an INTENT (OUT) argument.
 
 ERRMSG (optional) shall be a noncoindexed default character scalar. It
   is an INTENT (INOUT) argument.
 
-The semantics of STAT and ERRMSG are described in 16.6.
+The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
 <>
 
@@ -196,6 +202,12 @@ A shall be of numeric type. It shall have the same shape, type, and
 
   The computed value is assigned to A if no error condition
   occurs. Otherwise, A becomes undefined.
+
+TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
+  ISO_FORTRAN_ENV. It is an INTENT(IN) argument.
+
+COMPLETION shall be a scalar of type COMPLETION_TYPE from the intrinsic
+  module ISO_FORTRAN_ENV. It is an INTENT(INOUT) argument.
 
 STAT (optional) shall be a noncoindexed integer scalar with a decimal
   exponent range of at least four. It is an INTENT (OUT) argument.
@@ -203,7 +215,7 @@ STAT (optional) shall be a noncoindexed integer scalar with a decimal
 ERRMSG (optional) shall be a noncoindexed default character scalar. It
   is an INTENT (INOUT) argument.
 
-The semantics of STAT and ERRMSG are described in 16.6.
+The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
 <> If the number of images in the current team is three and the
 value of A is [1, 2] on image one, [3, 4] on image two, and [5, 6]
@@ -238,13 +250,19 @@ A shall be of numeric type. It shall have the same shape, type, and
   The computed value is assigned to A if no error condition
   occurs. Otherwise, A becomes undefined.
 
+TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
+  ISO_FORTRAN_ENV. It is an INTENT(IN) argument.
+
+COMPLETION shall be a scalar of type COMPLETION_TYPE from the intrinsic
+  module ISO_FORTRAN_ENV. It is an INTENT(INOUT) argument.
+
 STAT (optional) shall be a noncoindexed integer scalar with a decimal
   exponent range of at least four. It is an INTENT (OUT) argument.
 
 ERRMSG (optional) shall be a noncoindexed default character scalar. It
   is an INTENT (INOUT) argument.
 
-The semantics of STAT and ERRMSG are described in 16.6.
+The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
 <> If the number of images in the current team is three and the
 value of A is [1, 2] on image one, [3, 4] on image two, and [5, 6]
diff --git a/drafts/coll-edit-notes.txt b/drafts/coll-edit-notes.txt
index a4c694a..8638d0c 100644
--- a/drafts/coll-edit-notes.txt
+++ b/drafts/coll-edit-notes.txt
@@ -1,9 +1,6 @@
 Collective subroutines edits TODO:
 
-* Add multiple forms as resolved in plenary, incorporating TEAM and
-  COMPLETION in syntax
-  Leave placeholders for description of new arguments for r0
-* Clone REDUCE_INCLUSIVE
+* Clone REDUCE_INCLUSIVE and adjust
 * Write REDUCE examples
 
 Examples:

From 6c0ddd4d725ea966d7f768c9eea24a2163daba50 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Thu, 23 Oct 2025 23:09:20 -0400
Subject: [PATCH 20/34] Add work items to section 2

---
 drafts/25-WIP-collective-edits.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 20977c0..8aea5d7 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -27,8 +27,8 @@ uniformity with closely related features.
 Syntax changes in this paper, relative to 25-177r1 are as follows:
 
 1. Additional forms have been introduced to accommodate the
-   presence of the TEAM argument (25-127r1) and the COMPLETION
-   argument (25-166r2).
+   presence of the TEAM argument (work item DIN1, 25-127r1) and the
+   COMPLETION argument (work item US04, 25-166r2).
 
 2. The INITIAL argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to
    INITIAL (as recommended by 25-195r1).
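
The CO_SUM_PREFIX examples quoted in the hunks above can be cross-checked with a small serial model. The Python sketch below is an assumption-laden illustration, not part of the paper: the function names `co_sum_prefix_inclusive` and `co_sum_prefix_exclusive` are hypothetical stand-ins, each list element plays the role of the array A on one image of the team, and exact integer addition replaces the processor-dependent approximation to the sum.

```python
def co_sum_prefix_inclusive(per_image):
    """Serial model: per_image[k] is the array A held by image k+1.

    Image i receives the elementwise sum of A over images 1..i.
    """
    results = []
    running = [0] * len(per_image[0])
    for a in per_image:
        running = [r + x for r, x in zip(running, a)]
        results.append(running)
    return results


def co_sum_prefix_exclusive(per_image):
    """Serial model: image i receives the sum over images 1..i-1.

    Image one receives zero in every element.
    """
    results = []
    running = [0] * len(per_image[0])
    for a in per_image:
        results.append(running)
        running = [r + x for r, x in zip(running, a)]
    return results


# Three images holding [1, 2], [3, 4] and [5, 6], as in the draft's examples.
images = [[1, 2], [3, 4], [5, 6]]
print(co_sum_prefix_inclusive(images))
print(co_sum_prefix_exclusive(images))
```

With the three-image input from the draft's examples, the model gives [1, 2], [4, 6], [9, 12] for the inclusive form and [0, 0], [1, 2], [4, 6] for the exclusive form, matching the values stated in the paper.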

From 33853442de54ec90fbe7820e5ffeb0a81e1283e3 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Fri, 24 Oct 2025 15:05:56 -0400
Subject: [PATCH 21/34] blankspace cleanups (no content changes)

---
 drafts/25-WIP-collective-edits.txt | 78 +++++++++++++++---------------
 1 file changed, 38 insertions(+), 40 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 8aea5d7..3517c31 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -26,8 +26,8 @@ uniformity with closely related features.
 
 Syntax changes in this paper, relative to 25-177r1 are as follows:
 
-1. Additional forms have been introduced to accommodate the
-   presence of the TEAM argument (work item DIN1, 25-127r1) and the
+1. Additional forms have been introduced to accommodate the
+   presence of the TEAM argument (work item DIN1, 25-127r1) and the
    COMPLETION argument (work item US04, 25-166r2).
 
 2. The INITIAL argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to
@@ -56,7 +56,7 @@ CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, COMPLETION \
 CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM \
                            [, STAT, ERRMSG]) or
 CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM, COMPLETION \
-                           [, STAT, ERRMSG])
+                           [, STAT, ERRMSG])
 "
 
 and:
@@ -69,7 +69,7 @@ CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, COMPLETION \
 CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM \
                            [, STAT, ERRMSG]) or
 CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM, COMPLETION \
-                           [, STAT, ERRMSG])
+                           [, STAT, ERRMSG])
 "
 
 -------------------------------------------------------------------------
@@ -99,13 +99,13 @@ CO_SUM_PREFIX_INCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG])
 after the specification of CO_REDUCE, add:
 
 16.9.?? \
-CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL [, STAT, ERRMSG]) or
+CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL [, STAT, ERRMSG]) or
 CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, COMPLETION \
                            [, STAT, ERRMSG]) or
 CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM \
                            [, STAT, ERRMSG]) or
 CO_REDUCE_PREFIX_EXCLUSIVE (A, OPERATION, INITIAL, TEAM, COMPLETION \
-                           [, STAT, ERRMSG])
+                           [, STAT, ERRMSG])
 
 <> Generalized exclusive prefix reduction across images.
 
@@ -125,8 +125,8 @@ A shall not be polymorphic. It shall not be of a type with an ultimate
   operation described below, as applied to corresponding elements of A in
   corresponding references.
 
-  The computed value is assigned to A if no error condition
-  occurs. Otherwise, A becomes undefined.
+  The computed value is assigned to A if no error condition occurs.
+  Otherwise, A becomes undefined.
 
 INITIAL shall be a scalar with the same declared type and type parameter
   values as A. INITIAL shall have the same value in corresponding
@@ -140,25 +140,23 @@ OPERATION shall be a pure function with exactly two arguments; the result
   the other shall have that attribute. OPERATION shall implement a
   mathematically associative operation. OPERATION shall be the same
   function on all images in corresponding references.
-
+
   The computed value for an exclusive prefix reduction over a list of
-  values is the result of an iterative process.
-  Each scalar input value provided by image i is referred to as A_i.
-  The corresponding computed result value provided to image i is referred
-  to as R_i.
-  S_i is initially the ordered list [INITIAL, A_1, ..., A_{i-1}].
-  Each iteration starts with a processor-dependent choice of item x from
-  the list S_i. Adjacent items x and y (where x precedes y) are removed
-  from the list and replaced with the value of OPERATION(x, y). The
-  process terminates when the list has only one item; this is the computed
-  value of R_i.
+  values is the result of an iterative process. Each scalar input value
+  provided by image i is referred to as A_i. The corresponding computed
+  result value provided to image i is referred to as R_i. S_i is initially
+  the ordered list [INITIAL, A_1, ..., A_{i-1}]. Each iteration starts
+  with a processor-dependent choice of item x from the list S_i. Adjacent
+  items x and y (where x precedes y) are removed from the list and replaced
+  with the value of OPERATION(x, y). The process terminates when the list
+  has only one item; this is the computed value of R_i.
 
 TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
-  ISO_FORTRAN_ENV. It is an INTENT(IN) argument.
-
+  ISO_FORTRAN_ENV. It is an INTENT (IN) argument.
+
 COMPLETION shall be a scalar of type COMPLETION_TYPE from the intrinsic
-  module ISO_FORTRAN_ENV. It is an INTENT(INOUT) argument.
-
+  module ISO_FORTRAN_ENV. It is an INTENT (INOUT) argument.
+
 STAT (optional) shall be a noncoindexed integer scalar with a decimal
   exponent range of at least four. It is an INTENT (OUT) argument.
 
@@ -167,15 +165,15 @@ ERRMSG (optional) shall be a noncoindexed default character scalar. It
 
 The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
-<>
-
+<>
+
 *** TODO ***
 
 -------------------------------------------------------------------------
 [412:4+] In 16.9 Specifications of the standard intrinsic procedures,
 after the specification of CO_SUM, add:
 
-16.9.?? CO_SUM_PREFIX_EXCLUSIVE (A [, STAT, ERRMSG]) or
+16.9.?? CO_SUM_PREFIX_EXCLUSIVE (A [, STAT, ERRMSG]) or
         CO_SUM_PREFIX_EXCLUSIVE (A, COMPLETION [, STAT, ERRMSG]) or
         CO_SUM_PREFIX_EXCLUSIVE (A, TEAM [, STAT, ERRMSG]) or
         CO_SUM_PREFIX_EXCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG])
@@ -200,15 +198,15 @@ A shall be of numeric type. It shall have the same shape, type, and
   approximation to the sum of the values in corresponding elements of A in
   corresponding references provided by images 1 to (I-1).
 
-  The computed value is assigned to A if no error condition
-  occurs. Otherwise, A becomes undefined.
+  The computed value is assigned to A if no error condition occurs.
+  Otherwise, A becomes undefined.
 
 TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
-  ISO_FORTRAN_ENV. It is an INTENT(IN) argument.
-
+  ISO_FORTRAN_ENV. It is an INTENT (IN) argument.
+
 COMPLETION shall be a scalar of type COMPLETION_TYPE from the intrinsic
-  module ISO_FORTRAN_ENV. It is an INTENT(INOUT) argument.
-
+  module ISO_FORTRAN_ENV. It is an INTENT (INOUT) argument.
+
 STAT (optional) shall be a noncoindexed integer scalar with a decimal
   exponent range of at least four. It is an INTENT (OUT) argument.
 
@@ -224,7 +222,7 @@ CO_SUM_PREFIX_EXCLUSIVE(A), the value of A is [0, 0] on image one,
 [1, 2] on image two, and [4, 6] on image three.
 
 
-16.9.?? CO_SUM_PREFIX_INCLUSIVE (A [, STAT, ERRMSG]) or
+16.9.?? CO_SUM_PREFIX_INCLUSIVE (A [, STAT, ERRMSG]) or
         CO_SUM_PREFIX_INCLUSIVE (A, COMPLETION [, STAT, ERRMSG]) or
         CO_SUM_PREFIX_INCLUSIVE (A, TEAM [, STAT, ERRMSG]) or
         CO_SUM_PREFIX_INCLUSIVE (A, TEAM, COMPLETION [, STAT, ERRMSG])
@@ -247,15 +245,15 @@ A shall be of numeric type. It shall have the same shape, type, and
   of the values in corresponding elements of A in corresponding references
   provided by images 1 to I.
 
-  The computed value is assigned to A if no error condition
-  occurs. Otherwise, A becomes undefined.
-
+  The computed value is assigned to A if no error condition occurs.
+  Otherwise, A becomes undefined.
+
 TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
-  ISO_FORTRAN_ENV. It is an INTENT(IN) argument.
-
+  ISO_FORTRAN_ENV. It is an INTENT (IN) argument.
+
 COMPLETION shall be a scalar of type COMPLETION_TYPE from the intrinsic
-  module ISO_FORTRAN_ENV. It is an INTENT(INOUT) argument.
-
+  module ISO_FORTRAN_ENV. It is an INTENT (INOUT) argument.
+
 STAT (optional) shall be a noncoindexed integer scalar with a decimal
   exponent range of at least four. It is an INTENT (OUT) argument.
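
The iterative process described for CO_REDUCE_PREFIX_EXCLUSIVE and CO_REDUCE_PREFIX_INCLUSIVE (reduce the ordered list S_i by repeatedly combining adjacent items) can be modelled serially. The Python sketch below rests on stated assumptions: all function names are hypothetical, the list index stands in for the image index, and the reduction is always evaluated strictly left to right, which is only one of the groupings the text permits. It does not check that OPERATION is associative and makes no claim about results under other groupings. The two operations mirror the MAXLOC-like and segmented-sum examples that appear elsewhere in this patch series.

```python
import math
from functools import reduce


def reduce_prefix_exclusive(values, operation, initial):
    """Model R_i as the left-to-right reduction of [INITIAL, A_1, ..., A_{i-1}]."""
    return [reduce(operation, values[:i], initial) for i in range(len(values))]


def reduce_prefix_inclusive(values, operation):
    """Model R_i as the left-to-right reduction of [A_1, ..., A_i]."""
    return [reduce(operation, values[: i + 1]) for i in range(len(values))]


def max_with_image(lhs, rhs):
    """Keep the (value, image) pair with the larger value; ties keep lhs."""
    return lhs if lhs[0] >= rhs[0] else rhs


def segmented_sum(lhs, rhs):
    """Add values while the flags agree; otherwise restart from rhs."""
    value = lhs[0] + rhs[0] if lhs[1] == rhs[1] else rhs[0]
    return (value, rhs[1])


# One (value, image index) pair per image, images numbered from 1.
pairs = [(3.0, 1), (7.0, 2), (5.0, 3), (7.0, 4)]
exclusive = reduce_prefix_exclusive(pairs, max_with_image, (-math.inf, 0))
inclusive = reduce_prefix_inclusive(pairs, max_with_image)

# Inputs of the segmented prefix sum table from the draft's example.
values = [1, 2, 4, 5, 6, 7, 8, 9]
flags = [False, False, True, True, True, False, False, True]
segmented = [
    v for v, _ in reduce_prefix_inclusive(list(zip(values, flags)), segmented_sum)
]

print(exclusive)
print(inclusive)
print(segmented)
```

In this model image one receives INITIAL unchanged from the exclusive form, because its list S_1 contains only INITIAL. Under the left-to-right evaluation assumed here, the segmented input reproduces the result row 1 3 4 9 15 7 15 9 given in the draft.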

From f5f8a8c66608ace11cc0e5f1f380dc75a8efea96 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Fri, 24 Oct 2025 16:35:44 -0400
Subject: [PATCH 22/34] Clone out CO_REDUCE_PREFIX_INCLUSIVE

---
 drafts/25-WIP-collective-edits.txt | 69 +++++++++++++++++++++++++++++-
 drafts/coll-edit-notes.txt         |  1 -
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 3517c31..be65f1d 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -30,7 +30,7 @@ Syntax changes in this paper, relative to 25-177r1 are as follows:
    presence of the TEAM argument (work item DIN1, 25-127r1) and the
    COMPLETION argument (work item US04, 25-166r2).
 
-2. The INITIAL argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to
+2. The IDENTITY argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to
    INITIAL (as recommended by 25-195r1).
 
 3. Edits Relative to 25-007r1
@@ -169,6 +169,73 @@ The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
 *** TODO ***
 
+16.9.?? \
+CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION [, STAT, ERRMSG]) or
+CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, COMPLETION \
+                           [, STAT, ERRMSG]) or
+CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM \
+                           [, STAT, ERRMSG]) or
+CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM, COMPLETION \
+                           [, STAT, ERRMSG])
+
+<> Generalized exclusive prefix reduction across images.
+
+<> Collective subroutine.
+
+<>
+
+A shall not be polymorphic. It shall not be of a type with an ultimate
+  component that is allocatable or a pointer. It shall have the same shape,
+  type, and type parameter values, in corresponding references. It shall
+  not be a coindexed object. It is an INTENT (INOUT) argument.
+
+  If A is scalar, the computed value provided to any given image is the
+  result of the exclusive prefix reduction operation described below.
+  If A is an array, each element of the computed value provided to any
+  given image is equal to the result of the exclusive prefix reduction
+  operation described below, as applied to corresponding elements of A in
+  corresponding references.
+
+  The computed value is assigned to A if no error condition occurs.
+  Otherwise, A becomes undefined.
+
+OPERATION shall be a pure function with exactly two arguments; the result
+  and each argument shall be a scalar, nonallocatable, noncoarray,
+  nonpointer, nonpolymorphic data object with the same type and
+  type parameter values as A. The arguments shall not be optional.
+  If one argument has the ASYNCHRONOUS, TARGET, or VALUE attribute,
+  the other shall have that attribute. OPERATION shall implement a
+  mathematically associative operation. OPERATION shall be the same
+  function on all images in corresponding references.
+
+  The computed value for an exclusive prefix reduction over a list of
+  values is the result of an iterative process. Each scalar input value
+  provided by image i is referred to as A_i. The corresponding computed
+  result value provided to image i is referred to as R_i. S_i is
+  initially the ordered list [A_1, ..., A_i]. Each iteration starts
+  with a processor-dependent choice of item x from the list S_i. Adjacent
+  items x and y (where x precedes y) are removed from the list and replaced
+  with the value of OPERATION(x, y). The process terminates when the list
+  has only one item; this is the computed value of R_i.
+
+TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
+  ISO_FORTRAN_ENV. It is an INTENT (IN) argument.
+
+COMPLETION shall be a scalar of type COMPLETION_TYPE from the intrinsic
+  module ISO_FORTRAN_ENV. It is an INTENT (INOUT) argument.
+
+STAT (optional) shall be a noncoindexed integer scalar with a decimal
+  exponent range of at least four. It is an INTENT (OUT) argument.
+
+ERRMSG (optional) shall be a noncoindexed default character scalar. It
+  is an INTENT (INOUT) argument.
+
+The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
+
+<>
+
+*** TODO ***
+
 -------------------------------------------------------------------------
 [412:4+] In 16.9 Specifications of the standard intrinsic procedures,
 after the specification of CO_SUM, add:
diff --git a/drafts/coll-edit-notes.txt b/drafts/coll-edit-notes.txt
index 8638d0c..e548ebf 100644
--- a/drafts/coll-edit-notes.txt
+++ b/drafts/coll-edit-notes.txt
@@ -1,6 +1,5 @@
 Collective subroutines edits TODO:
 
-* Clone REDUCE_INCLUSIVE and adjust
 * Write REDUCE examples
 
 Examples:

From 39143130efa2211e7b61331bf67928a96b4d11de Mon Sep 17 00:00:00 2001
From: bonachea
Date: Fri, 24 Oct 2025 16:43:18 -0400
Subject: [PATCH 23/34] First draft of MAXLOC-like example for EXCLUSIVE

---
 drafts/25-WIP-collective-edits.txt | 36 ++++++++++++++++++++++++++++--
 drafts/coll-edit-notes.txt         |  1 +
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index be65f1d..9f8a1a5 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -165,9 +165,41 @@ ERRMSG (optional) shall be a noncoindexed default character scalar. It
 
 The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
-<>
+<> The subroutine below demonstrates how to use
+CO_REDUCE_PREFIX_EXCLUSIVE to perform a collective exclusive prefix
+reduction analogous to the intrinsic function MAXLOC:
+
+SUBROUTINE co_prefix_maxloc(value, image)
+  USE, INTRINSIC :: IEEE_ARITHMETIC, ONLY: IEEE_VALUE, IEEE_NEGATIVE_INF
+  REAL, INTENT(INOUT) :: value
+  INTEGER, INTENT(OUT) :: image
+
+  TYPE :: tuple
+    REAL :: value
+    INTEGER :: image
+  END TYPE
+  TYPE(tuple) :: t
+
+  t = tuple(value, THIS_IMAGE())
+  CALL CO_REDUCE_PREFIX_EXCLUSIVE(t, find_maxloc, &
+       INITIAL=tuple{IEEE_VALUE(1.0,IEEE_NEGATIVE_INF), 0})
+  value = t%value ! The largest value provided by a prior image,
+  image = t%image ! .. and the index of that image,
+
+CONTAINS
+  PURE FUNCTION find_maxloc(lhs,rhs) RESULT(maxloc)
+    TYPE(tuple), INTENT(IN) :: lhs,rhs
+    TYPE(tuple) :: maxloc
+    IF (lhs%value > rhs%value) THEN
+      maxloc%value = lhs%value
+      maxloc%image = lhs%image
+    ELSE
+      maxloc%value = rhs%value
+      maxloc%image = rhs%image
+    END IF
+  END FUNCTION find_maxloc
+END SUBROUTINE co_prefix_maxloc
 
-*** TODO ***
 
 16.9.?? \
 CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION [, STAT, ERRMSG]) or
diff --git a/drafts/coll-edit-notes.txt b/drafts/coll-edit-notes.txt
index e548ebf..5184875 100644
--- a/drafts/coll-edit-notes.txt
+++ b/drafts/coll-edit-notes.txt
@@ -4,6 +4,7 @@ Collective subroutines edits TODO:
 
 Examples:
 EXCLUSIVE
+  UNTESTED with multi-image
   MAXLOC over a derived type of real value and integer image ID
   computes max value in prefix and the image that provided it
 INCLUSIVE

From ecd5b3c8f87e91d8c04dedf7510dc4821b9ef31f Mon Sep 17 00:00:00 2001
From: bonachea
Date: Fri, 24 Oct 2025 18:33:43 -0400
Subject: [PATCH 24/34] Add note about section 16.6 edits

---
 drafts/25-WIP-collective-edits.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 9f8a1a5..1fc228f 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -33,6 +33,10 @@ Syntax changes in this paper, relative to 25-177r1 are as follows:
 2. The IDENTITY argument to CO_REDUCE_PREFIX_EXCLUSIVE has been renamed to
    INITIAL (as recommended by 25-195r1).
 
+Note that a combined edits paper for orthogonal work-items DIN1 and US04 is
+still forthcoming, which will provide the edits in section 16.6 that are
+cross-referenced by the edits in this paper.
+
 3. Edits Relative to 25-007r1
 =============================
 

From cd701c22f2b890d870c1928574a6e38e6a9f93ee Mon Sep 17 00:00:00 2001
From: bonachea
Date: Fri, 24 Oct 2025 19:22:57 -0400
Subject: [PATCH 25/34] CO_REDUCE_PREFIX_INCLUSIVE: Add segmented sum example, based on MPI_SCAN

---
 drafts/25-WIP-collective-edits.txt | 39 ++++++++++++++++++++++++++++--
 drafts/coll-edit-notes.txt         |  4 +--
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 1fc228f..49159c4 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -268,9 +268,44 @@ ERRMSG (optional) shall be a noncoindexed default character scalar. It
 
 The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
-<>
+<> This example uses a user-defined operation to produce a
+segmented prefix sum. A segmented prefix sum takes, as input, an ordered
+list of values and corresponding list of logicals, and the logicals
+delineate the various segments of the prefix sum. For example:
 
-*** TODO ***
+    values:   1  2  4  5  6  7  8  9
+    logicals: F  F  T  T  T  F  F  T
+    result:   1  3  4  9 15  7 15  9
+
+Note the segmented_sum operation used below is noncommutative.
+
+SUBROUTINE co_prefix_segment_sum(value, flag)
+  REAL, INTENT(INOUT) :: value
+  LOGICAL, INTENT(IN) :: flag
+
+  TYPE :: tuple
+    REAL :: value
+    LOGICAL :: flag
+  END TYPE
+  TYPE(tuple) :: t
+
+  t = tuple(value, flag)
+  CALL CO_REDUCE_PREFIX_INCLUSIVE(t, OPERATION=segmented_sum)
+  value = t%value
+
+CONTAINS
+  PURE FUNCTION segmented_sum(lhs,rhs) RESULT(sum)
+    TYPE(tuple), INTENT(IN) :: lhs,rhs
+    TYPE(tuple) :: sum
+
+    IF (lhs%flag .eqv. rhs%flag) THEN
+      sum%value = lhs%value + rhs%value
+    ELSE
+      sum%value = rhs%value
+    END IF
+    sum%flag = rhs%flag
+  END FUNCTION segmented_sum
+END SUBROUTINE co_prefix_segment_sum
 
 -------------------------------------------------------------------------
 [412:4+] In 16.9 Specifications of the standard intrinsic procedures,
diff --git a/drafts/coll-edit-notes.txt b/drafts/coll-edit-notes.txt
index 5184875..4f4858b 100644
--- a/drafts/coll-edit-notes.txt
+++ b/drafts/coll-edit-notes.txt
@@ -1,15 +1,13 @@
 Collective subroutines edits TODO:
 
-* Write REDUCE examples
+* Find better ways to test the CO_REDUCE_PREFIX examples
 
 Examples:
 EXCLUSIVE
-  UNTESTED with multi-image
   MAXLOC over a derived type of real value and integer image ID
   computes max value in prefix and the image that provided it
 INCLUSIVE
   derived type: value and boolean
-  operation on boolean flag is XOR
   illustrates segmented prefix reduction
   MPI Example 6.24.

From 37817b981c98c7ab93f3f3af252ac82de0948046 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Fri, 24 Oct 2025 22:24:43 -0400
Subject: [PATCH 26/34] co_prefix_maxloc example: Fix typos and simplify

---
 drafts/25-WIP-collective-edits.txt | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 49159c4..fe916e5 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -185,8 +185,8 @@ SUBROUTINE co_prefix_maxloc(value, image)
   TYPE(tuple) :: t
 
   t = tuple(value, THIS_IMAGE())
-  CALL CO_REDUCE_PREFIX_EXCLUSIVE(t, find_maxloc, &
-       INITIAL=tuple{IEEE_VALUE(1.0,IEEE_NEGATIVE_INF), 0})
+  CALL CO_REDUCE_PREFIX_EXCLUSIVE(t, OPERATION=find_maxloc, &
+       INITIAL=tuple(IEEE_VALUE(1.0,IEEE_NEGATIVE_INF), 0) )
   value = t%value ! The largest value provided by a prior image,
   image = t%image ! .. and the index of that image,
 
@@ -194,13 +194,8 @@ CONTAINS
   PURE FUNCTION find_maxloc(lhs,rhs) RESULT(maxloc)
     TYPE(tuple), INTENT(IN) :: lhs,rhs
     TYPE(tuple) :: maxloc
-    IF (lhs%value > rhs%value) THEN
-      maxloc%value = lhs%value
-      maxloc%image = lhs%image
-    ELSE
-      maxloc%value = rhs%value
-      maxloc%image = rhs%image
-    END IF
+
+    maxloc = MERGE(lhs, rhs, lhs%value > rhs%value)
   END FUNCTION find_maxloc
 END SUBROUTINE co_prefix_maxloc
 

From ecc8d733983652fa76145b6a20e84232649d851c Mon Sep 17 00:00:00 2001
From: bonachea
Date: Fri, 24 Oct 2025 22:31:28 -0400
Subject: [PATCH 27/34] Minor wording improvement

---
 drafts/25-WIP-collective-edits.txt | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index fe916e5..d8ddbf6 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -263,10 +263,11 @@ ERRMSG (optional) shall be a noncoindexed default character scalar. It
 
 The semantics of TEAM, COMPLETION, STAT and ERRMSG are described in 16.6.
 
-<> This example uses a user-defined operation to produce a
-segmented prefix sum. A segmented prefix sum takes, as input, an ordered
-list of values and corresponding list of logicals, and the logicals
-delineate the various segments of the prefix sum. For example:
+<> The subroutine below demonstrates how to use
+CO_REDUCE_PREFIX_INCLUSIVE to compute a collective segmented prefix sum.
+A segmented prefix sum takes, as input, an ordered list of values and
+corresponding list of logicals, and the logicals delineate the various
+segments of the prefix sum. For example:
 
     values:   1  2  4  5  6  7  8  9
     logicals: F  F  T  T  T  F  F  T

From c45ccedc18e8401a4b92c12dbd7f157c359c8698 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Sat, 25 Oct 2025 16:31:24 -0400
Subject: [PATCH 28/34] Add a missing reference

---
 drafts/25-WIP-collective-edits.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index d8ddbf6..ef82b2d 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -2,7 +2,8 @@ To: J3 J3/25-####
 From: Brandon Cook & Dan Bonachea
 Subject: Edits for US20 Collective Subroutines for Prefix Reductions
 Date: 2025-October-17
-References: 25-177r1, 25-144r1, 25-166r2, 25-195r1, 25-007r1, WG5/N-2239
+References: 25-144r1, 25-177r1, 25-166r2, 25-195r1, 25-127r1,
+            25-007r1, WG5/N-2239
 
 1. Background
 =============

From 3846483da650da881dee260a085ec65be63ddea9 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Sat, 25 Oct 2025 16:32:50 -0400
Subject: [PATCH 29/34] CO_REDUCE_PREFIX_INCLUSIVE: Fix copy pasta in prose

---
 drafts/25-WIP-collective-edits.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index ef82b2d..6f7daf7 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -210,7 +210,7 @@ CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM \
 CO_REDUCE_PREFIX_INCLUSIVE (A, OPERATION, TEAM, COMPLETION \
                            [, STAT, ERRMSG])
 
-<> Generalized exclusive prefix reduction across images.
+<> Generalized inclusive prefix reduction across images.
 
 <> Collective subroutine.
 
@@ -222,9 +222,9 @@ A shall not be polymorphic. It shall not be of a type with an ultimate
   not be a coindexed object. It is an INTENT (INOUT) argument.
 
   If A is scalar, the computed value provided to any given image is the
-  result of the exclusive prefix reduction operation described below.
+  result of the inclusive prefix reduction operation described below.
   If A is an array, each element of the computed value provided to any
-  given image is equal to the result of the exclusive prefix reduction
+  given image is equal to the result of the inclusive prefix reduction
   operation described below, as applied to corresponding elements of A in
   corresponding references.
 
@@ -240,7 +240,7 @@ OPERATION shall be a pure function with exactly two arguments; the result
   mathematically associative operation. OPERATION shall be the same
   function on all images in corresponding references.
 
-  The computed value for an exclusive prefix reduction over a list of
+  The computed value for an inclusive prefix reduction over a list of
   values is the result of an iterative process. Each scalar input value
   provided by image i is referred to as A_i. The corresponding computed
   result value provided to image i is referred to as R_i. S_i is

From 0d3960f0f5e613f1f430ff04ef58cfaf4acabe09 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Sat, 25 Oct 2025 16:35:23 -0400
Subject: [PATCH 30/34] INITIAL is INTENT(IN)

---
 drafts/25-WIP-collective-edits.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 6f7daf7..6aa0a9a 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -134,8 +134,8 @@ A shall not be polymorphic. It shall not be of a type with an ultimate
   Otherwise, A becomes undefined.
 
 INITIAL shall be a scalar with the same declared type and type parameter
-  values as A. INITIAL shall have the same value in corresponding
-  references.
+  values as A. It is an INTENT (IN) argument. INITIAL shall have the
+  same value in corresponding references.
 
 OPERATION shall be a pure function with exactly two arguments; the result
   and each argument shall be a scalar, nonallocatable, noncoarray,

From e92eba92f53963bb4b4b6566422687b354814b8d Mon Sep 17 00:00:00 2001
From: bonachea
Date: Sat, 25 Oct 2025 16:43:52 -0400
Subject: [PATCH 31/34] find_maxloc: Favor the lowest-numbered image to break ties, analogously to default behavior of MAXLOC

---
 drafts/25-WIP-collective-edits.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 6aa0a9a..4b86830 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -196,7 +196,7 @@ CONTAINS
     TYPE(tuple), INTENT(IN) :: lhs,rhs
     TYPE(tuple) :: maxloc
 
-    maxloc = MERGE(lhs, rhs, lhs%value > rhs%value)
+    maxloc = MERGE(lhs, rhs, lhs%value >= rhs%value)
   END FUNCTION find_maxloc
 END SUBROUTINE co_prefix_maxloc
 

From 85004c2bde765ef9c0f3a230099399fde22bf1a5 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Sat, 25 Oct 2025 16:51:03 -0400
Subject: [PATCH 32/34] Append "in the specified team" everywhere we are referring to a specific image index (excluding examples which use the current team)

---
 drafts/25-WIP-collective-edits.txt | 64 ++++++++++++++++--------------
 1 file changed, 34 insertions(+), 30 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 4b86830..bd498a5 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -148,13 +148,14 @@ OPERATION shall be a pure function with exactly two arguments; the result
 
   The computed value for an exclusive prefix reduction over a list of
   values is the result of an iterative process. Each scalar input value
-  provided by image i is referred to as A_i. The corresponding computed
-  result value provided to image i is referred to as R_i. S_i is initially
-  the ordered list [INITIAL, A_1, ..., A_{i-1}]. Each iteration starts
-  with a processor-dependent choice of item x from the list S_i. Adjacent
-  items x and y (where x precedes y) are removed from the list and replaced
-  with the value of OPERATION(x, y). The process terminates when the list
-  has only one item; this is the computed value of R_i.
+  provided by image i in the specified team is referred to as A_i. The
+  corresponding computed result value provided to image i in the specified
+  team is referred to as R_i. S_i is initially the ordered list [INITIAL,
+  A_1, ..., A_{i-1}]. Each iteration starts with a processor-dependent
+  choice of item x from the list S_i. Adjacent items x and y (where x
+  precedes y) are removed from the list and replaced with the value of
+  OPERATION(x, y). The process terminates when the list has only one item;
+  this is the computed value of R_i.
 
 TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
   ISO_FORTRAN_ENV. It is an INTENT (IN) argument.
@@ -242,13 +243,14 @@ OPERATION shall be a pure function with exactly two arguments; the result
 
   The computed value for an inclusive prefix reduction over a list of
   values is the result of an iterative process. Each scalar input value
-  provided by image i is referred to as A_i. The corresponding computed
-  result value provided to image i is referred to as R_i. S_i is
-  initially the ordered list [A_1, ..., A_i]. Each iteration starts
-  with a processor-dependent choice of item x from the list S_i. Adjacent
-  items x and y (where x precedes y) are removed from the list and replaced
-  with the value of OPERATION(x, y). The process terminates when the list
-  has only one item; this is the computed value of R_i.
+  provided by image i in the specified team is referred to as A_i. The
+  corresponding computed result value provided to image i in the specified
+  team is referred to as R_i. S_i is initially the ordered list [A_1, ...,
+  A_i]. Each iteration starts with a processor-dependent choice of item x
+  from the list S_i. Adjacent items x and y (where x precedes y) are
+  removed from the list and replaced with the value of OPERATION(x, y).
+  The process terminates when the list has only one item; this is the
+  computed value of R_i.
 
 TEAM shall be a scalar of type TEAM_TYPE from the intrinsic module
   ISO_FORTRAN_ENV. It is an INTENT (IN) argument.
@@ -323,15 +325,16 @@ A shall be of numeric type. It shall have the same shape, type, and
   type parameter values, in corresponding references. It shall
   not be a coindexed object. It is an INTENT (INOUT) argument.
 
-  The computed value provided to image one is equal to the value zero.
-  If A is scalar, the computed value provided to any given image I (with I
-  greater than one) is equal to a processor-dependent approximation to the
-  sum of the values of A in corresponding references provided by images
-  1 to (I-1).
-  If A is an array, each element of the computed value provided to any
-  given image I (with I greater than one) is equal to a processor-dependent
-  approximation to the sum of the values in corresponding elements of A in
-  corresponding references provided by images 1 to (I-1).
+  The computed value provided to image one in the specified team is equal
+  to the value zero. If A is scalar, the computed value provided to any
+  given image I in the specified team (with I greater than one) is equal to
+  a processor-dependent approximation to the sum of the values of A in
+  corresponding references provided by images 1 to (I-1) in the specified
+  team. If A is an array, each element of the computed value provided to
+  any given image I in the specified team (with I greater than one) is
+  equal to a processor-dependent approximation to the sum of the values in
+  corresponding elements of A in corresponding references provided by
+  images 1 to (I-1) in the specified team.
 
   The computed value is assigned to A if no error condition occurs.
   Otherwise, A becomes undefined.
@@ -372,13 +375,14 @@ A shall be of numeric type. It shall have the same shape, type, and
   type parameter values, in corresponding references. It shall
   not be a coindexed object. It is an INTENT (INOUT) argument.
 
-  If A is scalar, the computed value provided to any given image I is equal
-  to a processor-dependent approximation to the sum of the values of A in
-  corresponding references provided by images 1 to I.
-  If A is an array, each element of the computed value provided to any
-  given image I is equal to a processor-dependent approximation to the sum
-  of the values in corresponding elements of A in corresponding references
-  provided by images 1 to I.
+  If A is scalar, the computed value provided to any given image I in the
+  specified team is equal to a processor-dependent approximation to the sum
+  of the values of A in corresponding references provided by images 1 to I
+  in the specified team. If A is an array, each element of the computed
+  value provided to any given image I in the specified team is equal to a
+  processor-dependent approximation to the sum of the values in
+  corresponding elements of A in corresponding references provided by
+  images 1 to I in the specified team.
 
   The computed value is assigned to A if no error condition occurs.
   Otherwise, A becomes undefined.

From 3264c9195c1a65aff48ffed1f70b7c85cb7ecc43 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Sat, 25 Oct 2025 23:42:36 -0400
Subject: [PATCH 33/34] Minor wording tweak

---
 drafts/25-WIP-collective-edits.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index bd498a5..5e4e29d 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -23,7 +23,7 @@ the collective subroutine variants of prefix reduction operations,
 
 Since the passage of 25-177r1, subsequent papers 25-166r2 and 25-195r1
 have suggested additional syntax adjustments in order to maintain
-uniformity with closely related features.
+uniformity with closely related features under concurrent development.
 
 Syntax changes in this paper, relative to 25-177r1 are as follows:
 

From e6a95f086e3d2e79512a8315f94dab014136f163 Mon Sep 17 00:00:00 2001
From: bonachea
Date: Sun, 26 Oct 2025 15:21:26 -0400
Subject: [PATCH 34/34] Use "image i" instead of "image I"

---
 drafts/25-WIP-collective-edits.txt | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drafts/25-WIP-collective-edits.txt b/drafts/25-WIP-collective-edits.txt
index 5e4e29d..ac1a67e 100644
--- a/drafts/25-WIP-collective-edits.txt
+++ b/drafts/25-WIP-collective-edits.txt
@@ -327,14 +327,14 @@ A shall be of numeric type. It shall have the same shape, type, and
 
   The computed value provided to image one in the specified team is equal
   to the value zero. If A is scalar, the computed value provided to any
-  given image I in the specified team (with I greater than one) is equal to
+  given image i in the specified team (with i greater than one) is equal to
   a processor-dependent approximation to the sum of the values of A in
-  corresponding references provided by images 1 to (I-1) in the specified
+  corresponding references provided by images 1 to (i-1) in the specified
   team. If A is an array, each element of the computed value provided to
-  any given image I in the specified team (with I greater than one) is
+  any given image i in the specified team (with i greater than one) is
   equal to a processor-dependent approximation to the sum of the values in
   corresponding elements of A in corresponding references provided by
-  images 1 to (I-1) in the specified team.
+  images 1 to (i-1) in the specified team.
 
   The computed value is assigned to A if no error condition occurs.
   Otherwise, A becomes undefined.
@@ -375,14 +375,14 @@ A shall be of numeric type. It shall have the same shape, type, and
   type parameter values, in corresponding references. It shall
   not be a coindexed object. It is an INTENT (INOUT) argument.
- If A is scalar, the computed value provided to any given image I in the + If A is scalar, the computed value provided to any given image i in the specified team is equal to a processor-dependent approximation to the sum - of the values of A in corresponding references provided by images 1 to I + of the values of A in corresponding references provided by images 1 to i in the specified team. If A is an array, each element of the computed - value provided to any given image I in the specified team is equal to a + value provided to any given image i in the specified team is equal to a processor-dependent approximation to the sum of the values in corresponding elements of A in corresponding references provided by - images 1 to I in the specified team. + images 1 to i in the specified team. The computed value is assigned to A if no error condition occurs. Otherwise, A becomes undefined.
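
Illustrative note (not part of the patch series): the iterative process
that the edited wording describes can be modelled in a single process. The
sketch below is a Python model under stated assumptions, not Fortran and
not an implementation of the proposed intrinsics. Images 1..n of the
specified team are represented by list positions, and the function names
(reduce_adjacent, prefix_inclusive, prefix_exclusive) are hypothetical. The
processor-dependent choice of adjacent items is modelled by a seeded random
pick.

```python
import random
from operator import add


def reduce_adjacent(items, operation, rng):
    """Collapse a non-empty ordered list to one value by repeatedly
    replacing an adjacent pair (x, y) with operation(x, y)."""
    work = list(items)
    while len(work) > 1:
        # Model the processor-dependent choice: any adjacent pair may be next.
        k = rng.randrange(len(work) - 1)
        work[k:k + 2] = [operation(work[k], work[k + 1])]
    return work[0]


def prefix_inclusive(values, operation, rng=None):
    """R_i is the reduction of S_i = [A_1, ..., A_i]."""
    if rng is None:
        rng = random.Random(0)
    n = len(values)
    return [reduce_adjacent(values[:i], operation, rng) for i in range(1, n + 1)]


def prefix_exclusive(values, operation, initial, rng=None):
    """R_i is the reduction of S_i = [INITIAL, A_1, ..., A_{i-1}]."""
    if rng is None:
        rng = random.Random(0)
    n = len(values)
    return [
        reduce_adjacent([initial] + values[:i - 1], operation, rng)
        for i in range(1, n + 1)
    ]


# values[0] plays the role of A_1 on image 1, values[1] of A_2, and so on.
a = [3, 1, 4, 1, 5]
inclusive = prefix_inclusive(a, add)
exclusive = prefix_exclusive(a, add, 0)
print(inclusive)  # [3, 4, 8, 9, 14]
print(exclusive)  # [0, 3, 4, 8, 9]
```

With integer addition the result does not depend on which adjacent pair is
chosen, because the operation is associative. With the initial value zero,
the exclusive case reproduces the prefix-sum wording above: image one
receives zero and image i receives the sum over images 1 to (i-1). For an
operation that is not exactly associative, such as floating-point addition,
different choices can give slightly different results, which is consistent
with the "processor-dependent approximation" wording.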