Why does integer overflow on x86 with GCC cause an infinite loop?

procodes 2020. 7. 7. 20:54

The following code goes into an infinite loop on GCC:

#include <iostream>
using namespace std;

int main(){
    int i = 0x10000000;

    int c = 0;
    do{
        c++;
        i += i;
        cout << i << endl;
    }while (i > 0);

    cout << c << endl;
    return 0;
}

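So here's the deal: signed integer overflow is technically undefined behavior. But GCC on x86 implements integer arithmetic using x86 integer instructions, which wrap on overflow.

Therefore, I expected it to wrap on overflow, despite the fact that it is undefined behavior. But that's clearly not the case. So what did I miss?

I compiled this using: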

~/Desktop$ g++ main.cpp -O2

GCC output:

~/Desktop$ ./a.out
536870912
1073741824
-2147483648
0
0
0

... (infinite loop)

With optimizations disabled, there is no infinite loop and the output is correct. Visual Studio also compiles this correctly and gives the following result.

Correct output:

~/Desktop$ g++ main.cpp
~/Desktop$ ./a.out
536870912
1073741824
-2147483648
3

Here are some other variations:

i *= 2;   //  Also fails and goes into infinite loop.
i <<= 1;  //  This seems okay. It does not enter infinite loop.

Here's all the relevant version information:

~/Desktop$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ..

...

Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) 
~/Desktop$ 

So the question is: Is this a bug in GCC? Or did I misunderstand something about how GCC handles integer arithmetic?

*I'm tagging this C as well, because I assume this bug will reproduce in C. (I haven't verified it yet.)

EDIT:

Here's the assembly of the loop: (if I recognized it properly)

.L5:
    addl    %ebp, %ebp
    movl    $_ZSt4cout, %edi
    movl    %ebp, %esi
    .cfi_offset 3, -40
    call    _ZNSolsEi
    movq    %rax, %rbx
    movq    (%rax), %rax
    movq    -24(%rax), %rax
    movq    240(%rbx,%rax), %r13
    testq   %r13, %r13
    je      .L10
    cmpb    $0, 56(%r13)
    je      .L3
    movzbl  67(%r13), %eax
.L4:
    movsbl  %al, %esi
    movq    %rbx, %rdi
    addl    $1, %r12d
    call    _ZNSo3putEc
    movq    %rax, %rdi
    call    _ZNSo5flushEv
    cmpl    $3, %r12d
    jne     .L5

When the standard says it's undefined behavior, it means it. Anything can happen. "Anything" includes "usually integers wrap around, but on occasion weird stuff happens".

Yes, on x86 CPUs, integers usually wrap the way you expect. This is one of those exceptions. The compiler assumes you won't cause undefined behavior, and optimizes away the loop test. If you really want wraparound, pass -fwrapv to g++ or gcc when compiling; this gives you well-defined (two's-complement) overflow semantics, but can hurt performance.


It's simple: Undefined behaviour - especially with optimization (-O2) turned on - means anything can happen.

Your code behaves as (you) expected without the -O2 switch.

It works quite fine with icl and tcc by the way, but you can't rely on stuff like that...

According to this, gcc optimization actually exploits signed integer overflow. This would mean that the "bug" is by design.


The important thing to note here is that C++ programs are written for the C++ abstract machine (which is usually emulated through hardware instructions). The fact that you are compiling for x86 is totally irrelevant to the fact that this has undefined behaviour.

The compiler is free to use the existence of undefined behaviour to improve its optimisations (by removing a conditional from a loop, as in this example). There is no guaranteed, or even useful, mapping between C++ level constructs and x86 level machine code constructs apart from the requirement that the machine code will, when executed, produce the result demanded by the C++ abstract machine.


i += i;

// the overflow is undefined.

With -fwrapv it is correct.


Please, people: undefined behaviour is exactly that, undefined. It means that anything could happen. In practice (as in this case), the compiler is free to assume it won't occur, and to do whatever it pleases if that could make the code faster or smaller. What happens with code that shouldn't run is anybody's guess. It will depend on the surrounding code (depending on that, the compiler could well generate different code), the variables and constants used, compiler flags, and so on. Oh, and the compiler could get updated and compile the same code differently, or you could get another compiler with a different view on code generation. Or just get a different machine: even another model in the same architecture line could very well have its own undefined behaviour (look up undefined opcodes; some enterprising programmers found out that on some early machines those sometimes did useful stuff...).

There is no "the compiler gives a definite behaviour on undefined behaviour". There are areas that are implementation-defined, and there you should be able to count on the compiler behaving consistently.


Even if a compiler were to specify that integer overflow must be considered a "non-critical" form of Undefined Behavior (as defined in Annex L of the C11 standard), the result of an integer overflow should, absent a specific platform promise of more specific behavior, be at minimum regarded as a "partially-indeterminate value". Under such rules, adding 1073741824+1073741824 could arbitrarily be regarded as yielding 2147483648 or -2147483648 or any other value which was congruent to 2147483648 mod 4294967296, and values obtained by additions could arbitrarily be regarded as any value which was congruent to 0 mod 4294967296.

Rules allowing overflow to yield "partially-indeterminate values" would be sufficiently well-defined to abide by the letter and spirit of Annex L, but would not prevent a compiler from making the same generally-useful inferences as would be justified if overflows were unconstrained Undefined Behavior. It would prevent a compiler from making some phony "optimizations" whose primary effect in many cases is to require that programmers add extra clutter to the code whose sole purpose is to prevent such "optimizations"; whether that would be a good thing or not depends on one's point of view.

Reference URL: https://stackoverflow.com/questions/7682477/why-does-integer-overflow-on-x86-with-gcc-cause-an-infinite-loop