Why does Clang optimize away x * 1.0 but NOT x + 0.0?

procodes 2020. 7. 7. 21:19

Why does Clang optimize away the loop in this code

#include <time.h>
#include <stdio.h>

static size_t const N = 1 << 27;
static double arr[N] = { /* initialize to zero */ };

int main()
{
    clock_t const start = clock();
    for (int i = 0; i < N; ++i) { arr[i] *= 1.0; }
    printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
}

but not the loop in this code?

#include <time.h>
#include <stdio.h>

static size_t const N = 1 << 27;
static double arr[N] = { /* initialize to zero */ };

int main()
{
    clock_t const start = clock();
    for (int i = 0; i < N; ++i) { arr[i] += 0.0; }
    printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
}

(Tagging with both C and C++ because I would like to know whether the answer differs between them.)

The IEEE 754-2008 Standard for Floating-Point Arithmetic and the ISO/IEC 10967 Language Independent Arithmetic (LIA) Standard, Part 1 explain why this is so.

IEEE 754 § 6.3 The sign bit

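When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note, however, that operations on bit strings (copy, negate, abs, copySign) specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicate totalOrder is also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation.

When neither the inputs nor result are NaN, the sign of a product or quotient is the exclusive OR of the operands' signs; the sign of a sum, or of a difference x − y regarded as a sum x + (−y), differs from at most one of the addends' signs; and the sign of the result of conversions, the quantize operation, the roundToIntegral operations, and the roundToIntegralExact (see 5.3.1) is the sign of the first or only operand. These rules shall apply even when operands or results are zero or infinite.

When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x + x = x − (−x) retains the same sign as x even when x is zero.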

The Case of Addition

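Under the default rounding mode (Round-to-Nearest, Ties-to-Even), we see that x+0.0 produces x, EXCEPT when x is -0.0: in that case we have a sum of two operands with opposite signs whose sum is exactly zero, and §6.3 paragraph 3 rules that this addition produces +0.0.

Since +0.0 is not bitwise identical to the original -0.0, and -0.0 is a legitimate value that may occur as input, the compiler is obliged to emit code that transforms potential negative zeros to +0.0.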

The summary: under the default rounding mode, in x+0.0, if x

is not -0.0, then x itself is an acceptable output value.
is -0.0, then the output value must be +0.0, which is not bitwise identical to -0.0.
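
The addition rule above can be checked with a small sketch. It assumes a 64-bit IEEE 754 double, the default rounding mode, and a build without -ffast-math; the helper names bits_of and add_zero_is_noop are made up for this illustration and do not come from the question or the standards.

```c
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Return the raw bit pattern of a double.
 * Assumption: double is the 64-bit IEEE 754 binary64 format. */
static uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Nonzero if x + 0.0 has exactly the same bits as x.
 * Assumption: default rounding mode (round-to-nearest, ties-to-even). */
static int add_zero_is_noop(double x)
{
    return bits_of(x + 0.0) == bits_of(x);
}
```

Under those assumptions, add_zero_is_noop(1.5) and add_zero_is_noop(0.0) are nonzero, while add_zero_is_noop(-0.0) is zero because the sum is +0.0. That single input is what keeps a conforming compiler from dropping the addition.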

The Case of Multiplication

Under the default rounding mode, no such problem occurs with x*1.0. If x:

is a (sub)normal number, x*1.0 == x always.
is +/- infinity, then the result is +/- infinity of the same sign.
is NaN, then according to

IEEE 754 § 6.2.3 NaN Propagation

An operation that propagates a NaN operand to its result and has a single NaN as an input should produce a NaN with the payload of the input NaN if representable in the destination format.

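which means that the exponent and mantissa (though not the sign) of NaN*1.0 are recommended to be unchanged from the input NaN. The sign is unspecified in accordance with §6.3p1 above, but an implementation can specify it to be identical to the source NaN.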
is +/- 0.0, then the result is a 0 with its sign bit XORed with the sign bit of 1.0, in agreement with §6.3p2. Since the sign bit of 1.0 is 0, the output value is unchanged from the input. Thus, x*1.0 == x even when x is a (negative) zero.
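
As a rough check of the non-NaN cases listed above, the following sketch compares bit patterns before and after multiplying by 1.0. It assumes a 64-bit IEEE 754 double and the default rounding mode; bits_of, mul_one_is_noop and all_samples_survive_mul_one are hypothetical names chosen for this example. NaN inputs are left out on purpose, since the standard only recommends, and does not require, that the payload and sign survive.

```c
#include <math.h>    /* INFINITY */
#include <stddef.h>
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Raw bit pattern of a double (assumes 64-bit IEEE 754 binary64). */
static uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Nonzero if x * 1.0 has exactly the same bits as x. */
static int mul_one_is_noop(double x)
{
    return bits_of(x * 1.0) == bits_of(x);
}

/* Check both zeros, a normal, a subnormal and both infinities. */
static int all_samples_survive_mul_one(void)
{
    const double samples[] = {
        0.0, -0.0, 1.5, -2.75,
        5e-324,              /* rounds to the smallest positive subnormal */
        INFINITY, -INFINITY
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        if (!mul_one_is_noop(samples[i])) {
            return 0;
        }
    }
    return 1;
}
```

Unlike the addition case, -0.0 passes here: the sign of the product is the XOR of the operand signs, and the sign bit of 1.0 is 0.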

The Case of Subtraction

Under the default rounding mode, the subtraction x-0.0 is also a no-op, because it is equivalent to x + (-0.0). If x is

is NaN, then §6.3p1 and §6.2.3 apply in much the same way as for addition and multiplication.
is +/- infinity, then the result is +/- infinity of the same sign.
is a (sub)normal number, x-0.0 == x always.
is -0.0, then by §6.3p2 we have "[...] the sign of a sum, or of a difference x − y regarded as a sum x + (−y), differs from at most one of the addends’ signs;". This forces us to assign -0.0 as the result of (-0.0) + (-0.0), because -0.0 differs in sign from none of the addends, while +0.0 differs in sign from two of the addends, in violation of this clause.
is +0.0, then this reduces to the addition case (+0.0) + (-0.0) considered above in The Case of Addition, which by §6.3p3 is ruled to give +0.0.

Since for all cases the input value is legal as the output, it is permissible to consider x-0.0 a no-op, and x == x-0.0 a tautology.
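
The two zero cases for subtraction can be sketched the same way, again assuming a 64-bit IEEE 754 double and the default rounding mode. The helper names below are illustrative only; sub_matches_add_neg_zero checks the equivalence x-0.0 == x+(-0.0) that the argument relies on.

```c
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Raw bit pattern of a double (assumes 64-bit IEEE 754 binary64). */
static uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Nonzero if x - 0.0 has exactly the same bits as x. */
static int sub_zero_is_noop(double x)
{
    return bits_of(x - 0.0) == bits_of(x);
}

/* Nonzero if x - 0.0 and x + (-0.0) produce identical bits. */
static int sub_matches_add_neg_zero(double x)
{
    return bits_of(x - 0.0) == bits_of(x + (-0.0));
}
```

Both sub_zero_is_noop(-0.0) and sub_zero_is_noop(0.0) are nonzero under these assumptions, which is the difference from x+0.0.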

Value-Changing Optimizations

The IEEE 754-2008 Standard has the following interesting quote:

IEEE 754 § 10.4 Literal meaning and value-changing optimizations

[...]

The following value-changing transformations, among others, preserve the literal meaning of the source code:

Applying the identity property 0 + x when x is not zero and is not a signaling NaN and the result has the same exponent as x.

Applying the identity property 1 × x when x is not a signaling NaN and the result has the same exponent as x.

Changing the payload or sign bit of a quiet NaN.

[...]

Since all NaNs and all infinities share the same exponent, and the correctly rounded result of x+0.0 and x*1.0 for finite x has exactly the same magnitude as x, their exponent is the same.

sNaNs

Signaling NaNs are floating-point trap values; they are special NaN values whose use as a floating-point operand results in an invalid operation exception (SIGFPE). If a loop that triggers an exception were optimized out, the software would no longer behave the same.

However, as user2357112 points out in the comments, the C11 Standard explicitly leaves undefined the behaviour of signaling NaNs (sNaN), so the compiler is allowed to assume they do not occur, and thus that the exceptions that they raise also do not occur. The C++11 standard omits describing a behaviour for signaling NaNs, and thus also leaves it undefined.

Rounding Modes

In alternate rounding modes, the permissible optimizations may change. For instance, under Round-to-Negative-Infinity mode, the optimization x+0.0 -> x becomes permissible, but x-0.0 -> x becomes forbidden.

To prevent GCC from assuming default rounding modes and behaviours, the experimental flag -frounding-math can be passed to GCC.

Conclusion

Clang and GCC, even at -O3, remain IEEE-754 compliant. This means they must keep to the above rules of the IEEE-754 standard. x+0.0 is not bit-identical to x for all x under those rules, but x*1.0 may be chosen to be so: namely, when we

Obey the recommendation to pass unchanged the payload of x when it is a NaN.
Leave the sign bit of a NaN result unchanged by * 1.0.
Obey the order to XOR the sign bit during a quotient/product, when x is not a NaN.

To enable the IEEE-754-unsafe optimization (x+0.0) -> x, the flag -ffast-math needs to be passed to Clang or GCC.

x += 0.0 isn't a NOOP if x is -0.0. The optimizer could strip out the whole loop anyway since the results aren't used, though. In general, it's hard to tell why an optimizer makes the decisions it does.

Reference URL: https://stackoverflow.com/questions/33272994/why-does-clang-optimize-away-x-1-0-but-not-x-0-0
