This issue has been automatically converted from the original issue lists and some formatting may not have been preserved.
Authors: WG14, Jim Thomas
Date: 2017-03-04
Reference document: N2125
Submitted against: Floating-point TS 18661 (C11 version, 2014-2016)
Status: Fixed
Fixed in: C23
Converted from: n2397.htm
This is about an issue raised by Joseph Myers in SC22WG14.14561:
TS 18661-1 and -2 define type-generic macros for the functions that round
result to a narrower type. In part 1 these are, for example, fadd and
dadd for addition; in part 2, for example, d32add and d64add.
Part 3 does not seem to make any changes or additions to those macros, and
consequences of that seem nonobvious. It defines new functions for the
new types: fMaddfN, fMaddfNx, fMxaddfN, fMxaddfNx (where M < N, or M <= N
in the fMaddfNx case), and likewise for decimal types. But the
type-generic macros remain as defined in 7.25#6a after the changes from
parts 1 and 2 are applied (part 3 does not contain the string "6a").
That is, it's valid to pass the _FloatN and _FloatNx types to the fadd and
dadd macros, and valid to pass the new _DecimalN and _DecimalNx types from
part 3 to the d32add and d64add macros.
(a) 7.25#6a says "If the macro prefix is d32 or d64, use of an argument of
standard floating type results in undefined behavior.". Other places get
amended in part 3 to say "floating type of radix 2" in addition to
"standard floating type". But it appears it fails to make it undefined to
pass _FloatN or _FloatNx arguments to d32add, d64add etc. type-generic
macros - although clearly it should be undefined.
(b) Passing _Decimal128 to d32add would result in the d32addd128 function
being called, as expected. But say you pass a _Decimal128x argument. A
function d32addd128x exists but the specification would seem to result in
d32addd64 being called, which seems unintuitive. Similar issues apply
with _FloatN and _FloatNx types - calling fadd on them would always call
the fadd function not faddl. (But in that case there *are* no functions
defined that take _FloatN / _FloatNx arguments and return float or double.
So the right thing to do is less obvious.)
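The selection behavior criticized in (b) can be modeled in a few lines of C. This is only an illustrative sketch: the enum arg_type and the function old_rule_suffix are hypothetical names, types are represented as enum constants because the _DecimalNx and _FloatNx types are not widely available, and the undefined-behavior cases are not modeled. It follows the four bullets of 7.25#6a as they read before this change, for a macro with two generic arguments.

```c
#include <string.h>

/* Hypothetical stand-ins for argument types; not part of any standard header. */
enum arg_type {
    T_FLOAT, T_DOUBLE, T_LONG_DOUBLE,
    T_FLOAT64X, T_FLOAT128,
    T_DECIMAL64, T_DECIMAL128, T_DECIMAL128X
};

/* Returns the suffix that the pre-change 7.25#6a rules append to the macro
 * name, given the macro prefix ("f", "d", "d32" or "d64") and the types of
 * two generic arguments. Undefined-behavior cases are not diagnosed. */
static const char *old_rule_suffix(const char *prefix,
                                   enum arg_type a, enum arg_type b)
{
    int has_d128 = (a == T_DECIMAL128) || (b == T_DECIMAL128);
    int has_ld = (a == T_LONG_DOUBLE) || (b == T_LONG_DOUBLE);

    /* Bullet 1: _Decimal128 argument, or d64 prefix. */
    if (has_d128 || strcmp(prefix, "d64") == 0) {
        return "d128";
    }
    /* Bullet 2: d32 prefix. */
    if (strcmp(prefix, "d32") == 0) {
        return "d64";
    }
    /* Bullet 3: long double argument, or d prefix. */
    if (has_ld || strcmp(prefix, "d") == 0) {
        return "l";
    }
    /* Bullet 4: no suffix. */
    return "";
}
```

In this model, old_rule_suffix("d32", T_DECIMAL128X, T_DECIMAL128X) yields "d64", so d32add would resolve to d32addd64 even though d32addd128x exists, and old_rule_suffix("f", T_FLOAT64X, T_FLOAT64X) yields the empty suffix, that is, plain fadd.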
The following addresses these issues by filling in the missing specification in part 3.
In clause 15, after the change to 7.25#6, add:
Change 7.25#6a from:
[6a] The functions that round result to a narrower type have type-generic macros whose names are obtained by omitting any suffix from the function names. Thus, the macros with f or d prefix are:
fadd fmul ffma
dadd dmul dfma
fsub fdiv fsqrt
dsub ddiv dsqrt
and the macros with d32 or d64 prefix are:
d32add d32mul d32fma
d64add d64mul d64fma
d32sub d32div d32sqrt
d64sub d64div d64sqrt
All arguments are generic. If any argument is not real, use of the macro results in undefined behavior. If the macro prefix is f or d, use of an argument of decimal floating type results in undefined behavior. If the macro prefix is d32 or d64, use of an argument of standard floating type results in undefined behavior. The function invoked is determined as follows:
— If any argument has type _Decimal128, or if the macro prefix is d64, the function invoked has the name of the macro, with a d128 suffix.
— Otherwise, if the macro prefix is d32, the function invoked has the name of the macro, with a d64 suffix.
— Otherwise, if any argument has type long double, or if the macro prefix is d, the function invoked has the name of the macro, with an l suffix.
— Otherwise, the function invoked has the name of the macro (with no suffix).
to:
[6a] The functions that round result to a narrower type have type-generic macros whose names are obtained by omitting any suffix from the function names. Thus, the macros with f or d prefix are:
fadd fmul ffma
dadd dmul dfma
fsub fdiv fsqrt
dsub ddiv dsqrt
and the macros with fM, fMx, dM, or dMx prefix are:
fMadd fMxmul dMfma
fMsub fMxdiv dMsqrt
fMmul fMxfma dMxadd
fMdiv fMxsqrt dMxsub
fMfma dMadd dMxmul
fMsqrt dMsub dMxdiv
fMxadd dMmul dMxfma
fMxsub dMdiv dMxsqrt
All arguments are generic. If any argument is not real, use of the macro results in undefined behavior. If the macro prefix is f, d, fM, or fMx, use of an argument of decimal floating type results in undefined behavior. If the macro prefix is dM or dMx, use of an argument of standard or binary floating type results in undefined behavior. The function invoked is determined as follows:
— Arguments that have integer type are regarded as having type _Decimal64 if any argument has decimal floating type, and as having type double otherwise.
— The unsuffixed name of the function is the name of the macro, and its suffix, if any, corresponds to the parameter type which may be any type with at least the range and precision of the argument types.
In clause 15, at the end of the text appended to the table in 7.25#7, further append:
f32xadd(d, f32x) | any f32xaddfN or f32xaddfNx such that N > 32 and the suffix type, _FloatN or _FloatNx, is at least as wide as double and _Float32x
Comment from WG14 on 2019-05-03:
Apr 2017 meeting
The committee agrees that this is a defect and accepts the Suggested Technical Corrigendum.
Apr 2018 meeting
After extensive discussion on the mailing list, several documents were proposed with new and revised change suggestions. The following revised proposed change is largely drawn from N2213, with further changes reviewed at the meeting.
In clause 15, after the change to 7.25#6, add:
Change 7.25#6a from:
[6a] The functions that round result to a narrower type have type-generic macros whose names are obtained by omitting any suffix from the function names. Thus, the macros with f or d prefix are:
fadd fmul ffma
dadd dmul dfma
fsub fdiv fsqrt
dsub ddiv dsqrt
and the macros with d32 or d64 prefix are:
d32add d32mul d32fma
d64add d64mul d64fma
d32sub d32div d32sqrt
d64sub d64div d64sqrt
All arguments are generic. If any argument is not real, use of the macro results in undefined behavior. If the macro prefix is f or d, use of an argument of decimal floating type results in undefined behavior. If the macro prefix is d32 or d64, use of an argument of standard floating type results in undefined behavior. The function invoked is determined as follows:
— If any argument has type _Decimal128, or if the macro prefix is d64, the function invoked has the name of the macro, with a d128 suffix.
— Otherwise, if the macro prefix is d32, the function invoked has the name of the macro, with a d64 suffix.
— Otherwise, if any argument has type long double, or if the macro prefix is d, the function invoked has the name of the macro, with an l suffix.
— Otherwise, the function invoked has the name of the macro (with no suffix).
to:
[6a] The functions that round result to a narrower type have type-generic macros whose names are obtained by omitting any suffix from the function names. Thus, the macros with f or d prefix are:
fadd fmul ffma
dadd dmul dfma
fsub fdiv fsqrt
dsub ddiv dsqrt
and the macros with fM, fMx, dM, or dMx prefix are:
fMadd fMxmul dMfma
fMsub fMxdiv dMsqrt
fMmul fMxfma dMxadd
fMdiv fMxsqrt dMxsub
fMfma dMadd dMxmul
fMsqrt dMsub dMxdiv
fMxadd dMmul dMxfma
fMxsub dMdiv dMxsqrt
All arguments are generic. If any argument is not real, use of the macro results in undefined behavior. If the macro prefix is f or d, use of an argument of interchange or extended floating type results in undefined behavior. If the macro prefix is fM or fMx, use of an argument of standard or decimal floating type results in undefined behavior. If the macro prefix is dM or dMx, use of an argument of standard or binary floating type results in undefined behavior. The function invoked is determined as follows:
— Arguments that have integer type are regarded as having type double if the macro prefix is f or d, as having type _Float64 if the macro prefix is fM or fMx, and as having type _Decimal64 if the macro prefix is dM or dMx.
— If the function has exactly one generic parameter, the type determined is the type of the argument.
— If the function has exactly two generic parameters, the type determined is the type determined by the usual arithmetic conversions (6.3.1.8) applied to the arguments.
— If the function has three generic parameters, the type determined is the type determined by applying the usual arithmetic conversions twice, first to the first two arguments, then to that result type and the third argument.
— If no function with the given prefix has the parameter type determined above, the parameter type is determined from the prefix as follows:
f | double
d | long double
fM | _FloatN for minimum N > M if supported, else _FloatMx
fMx | _FloatNx for minimum N > M if supported, else _FloatN for minimum N > M
dM | _DecimalN for minimum N > M if supported, else _DecimalMx
dMx | _DecimalNx for minimum N > M if supported, else _DecimalN for minimum N > M
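The fM and fMx rows of the fallback table can be sketched as a small lookup over the widths an implementation supports. This is a hedged illustration, not normative text: the function names min_wider, fallback_for_fM and fallback_for_fMx are hypothetical, the supported widths are passed in as plain integer arrays, and the sketch assumes that "minimum N > M if supported" means the smallest supported width greater than M. The dM and dMx rows would follow the same pattern with _Decimal in place of _Float.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: smallest width in `supported` greater than m,
 * or 0 if there is none. */
static int min_wider(const int *supported, size_t count, int m)
{
    int best = 0;
    for (size_t i = 0; i < count; i++) {
        if (supported[i] > m && (best == 0 || supported[i] < best)) {
            best = supported[i];
        }
    }
    return best;
}

/* fM row: _FloatN for minimum N > M if supported, else _FloatMx.
 * Assumes "if supported" refers to the smallest supported wider width. */
static void fallback_for_fM(char *out, size_t cap, int m,
                            const int *interchange, size_t n_interchange)
{
    int n = min_wider(interchange, n_interchange, m);
    if (n != 0) {
        snprintf(out, cap, "_Float%d", n);
    } else {
        snprintf(out, cap, "_Float%dx", m);
    }
}

/* fMx row: _FloatNx for minimum N > M if supported,
 * else _FloatN for minimum N > M. */
static void fallback_for_fMx(char *out, size_t cap, int m,
                             const int *extended, size_t n_extended,
                             const int *interchange, size_t n_interchange)
{
    int nx = min_wider(extended, n_extended, m);
    if (nx != 0) {
        snprintf(out, cap, "_Float%dx", nx);
        return;
    }
    int n = min_wider(interchange, n_interchange, m);
    if (n != 0) {
        snprintf(out, cap, "_Float%d", n);
    } else if (cap > 0) {
        out[0] = '\0'; /* the table does not cover this case */
    }
}
```

For an implementation assumed to support _Float32, _Float64, _Float128, _Float32x and _Float64x, this gives _Float64 for prefix f32, _Float128x for prefix f128, _Float64x for prefix f32x, and _Float128 for prefix f64x.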
In clause 15, at the end of the text appended to the table in 7.25#7, further append:
fsub(d, ld) | fsubl
f32add(f64x, f64) | f32addf64x
d32xsqrt(n) | d32xsqrtd64
f32mul(f128, f32x) | f32mulf128 if _Float128 is at least as wide as _Float32x, or f32mulf32x if _Float32x is wider than _Float128
f32fma(f32x, n, f32x) | f32fmaf64 if _Float64 is at least as wide as _Float32x, or f32fmaf32x if _Float32x is wider than _Float64
ddiv(ld, f128) | undefined
f32fma(f64, d, f64) | undefined
fmul(dc, d) | undefined
f32add(f32, f32) | f32addf64(f32, f32)
f32xsqrt(f32) | f32xsqrtf64x(f32) if _Float64x is supported, else f32xsqrtf64
f64div(f32x, f32x) | f64divf128(f32x, f32x)
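For the f prefix and standard argument types only, the revised selection can be approximated with a C11 generic selection. This is a sketch under stated assumptions, not the wording's required implementation: FSUB_TARGET is a hypothetical macro that yields the name of the function fsub would invoke rather than calling it, the usual arithmetic conversions are obtained by adding the two arguments in the controlling expression, and the undefined cases (complex, interchange, extended or decimal arguments) are not diagnosed.

```c
/* Hypothetical macro: names the function that fsub(x, y) would select.
 * The type of (x) + (y) is the type given by the usual arithmetic
 * conversions. long double selects fsubl; every other result type
 * (float, double, or an integer type regarded as double) ends up at
 * fsub, either directly or through the f-prefix fallback to double. */
#define FSUB_TARGET(x, y) \
    _Generic((x) + (y), long double: "fsubl", default: "fsub")
```

For example, FSUB_TARGET(1.0, 1.0L) evaluates to "fsubl", matching the fsub(d, ld) row, while FSUB_TARGET(1.0f, 2) evaluates to "fsub".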