루프에서 나머지 작업을 실행하는 Java 스레드는 다른 모든 스레드를 차단합니다.
다음 코드 스 니펫은 두 개의 스레드를 실행합니다. 하나는 매초마다 간단한 타이머 로깅이고 두 번째는 나머지 작업을 실행하는 무한 루프입니다.
public class TestBlockingThread {
private static final Logger LOGGER = LoggerFactory.getLogger(TestBlockingThread.class);
public static final void main(String[] args) throws InterruptedException {
Runnable task = () -> {
int i = 0;
while (true) {
i++;
if (i != 0) {
boolean b = 1 % i == 0;
}
}
};
new Thread(new LogTimer()).start();
Thread.sleep(2000);
new Thread(task).start();
}
public static class LogTimer implements Runnable {
@Override
public void run() {
while (true) {
long start = System.currentTimeMillis();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// do nothing
}
LOGGER.info("timeElapsed={}", System.currentTimeMillis() - start);
}
}
}
}
결과는 다음과 같습니다.
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1004
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1003
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=13331
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1006
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1003
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1004
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1004
무한 작업이 13.3 초 동안 다른 모든 스레드를 차단하는 이유를 이해하지 못합니다. 스레드 우선 순위 및 기타 설정을 변경하려고했지만 아무것도 작동하지 않았습니다.
이 문제를 해결하기위한 제안이있는 경우 (OS 컨텍스트 전환 설정 조정 포함) 알려주십시오.
여기에서의 모든 설명 ( Peter Lawrey 덕분에 ) 덕분 에이 일시 중지의 주요 원인은 루프 내부의 안전 지점이 거의 도달하지 않아 JIT 컴파일 코드 교체를 위해 모든 스레드를 중지하는 데 오랜 시간이 걸린다는 것을 알았습니다.
그러나 나는 더 깊이 들어가서 safepoint에 도달하는 이유를 거의 찾지 않기로 결정 했습니다. while
이 경우 루프 의 백 점프 가 "안전"하지 않은 이유가 약간 혼란 스러웠습니다 .
그래서 나는 -XX:+PrintAssembly
모든 영광을 소환 하여
-XX:+UnlockDiagnosticVMOptions \
-XX:+TraceClassLoading \
-XX:+DebugNonSafepoints \
-XX:+PrintCompilation \
-XX:+PrintGCDetails \
-XX:+PrintStubCode \
-XX:+PrintAssembly \
-XX:PrintAssemblyOptions=-Mintel
조사 결과 람다 C2
컴파일러 의 세 번째 재 컴파일 후에 루프 내에서 safepoint 폴링이 완전히 사라졌습니다.
최신 정보
프로파일 링 단계에서 변수 i
는 0과 같지 않았습니다. 그렇기 C2
때문에이 분기를 추측 적으로 최적화하여 루프가 다음과 같이 변형되었습니다.
for (int i = OSR_value; i != 0; i++) {
if (1 % i == 0) {
uncommon_trap();
}
}
uncommon_trap();
원래 무한 루프는 카운터를 사용하여 일반 유한 루프로 재구성되었습니다! 유한 카운트 루프에서 Safepoint 폴링을 제거하기위한 JIT 최적화로 인해이 루프에도 Safepoint 폴링이 없었습니다.
After some time, i
wrapped back to 0
, and the uncommon trap was taken. The method was deoptimized and continued execution in the interpreter. During recompilation with a new knowledge C2
recognized the infinite loop and gave up compilation. The rest of the method proceeded in the interpreter with proper safepoints.
There is a great must-read blog post "Safepoints: Meaning, Side Effects and Overheads" by Nitsan Wakart covering safepoints and this particular issue.
Safepoint elimination in very long counted loops is known to be a problem. The bug JDK-5014723
(thanks to Vladimir Ivanov) addresses this problem.
The workaround is available until the bug is finally fixed.
- You can try using
-XX:+UseCountedLoopSafepoints
(it will cause overall performance penalty and may lead to JVM crashJDK-8161147
). After using itC2
compiler continue keeping safepoints at the back jumps and original pause disappears completely. You can explicitly disable compilation of problematic method by using
-XX:CompileCommand='exclude,binary/class/Name,methodName'
Or you can rewrite your code by adding safepoint manually. For example
Thread.yield()
call at the end of cycle or even changingint i
tolong i
(thanks, Nitsan Wakart) will also fix pause.
In short, the loop you have has no safe point inside it except when i == 0
is reached. When this method is compiled and triggers the code to be replaced it needs to bring all the threads to a safe point, but this takes a very long time, locking up not just the thread running the code but all threads in the JVM.
I added the following command line options.
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintCompilation
I also modified the code to use floating point which appears to take longer.
boolean b = 1.0 / i == 0;
And what I see in the output is
timeElapsed=100
Application time: 0.9560686 seconds
41423 280 % 4 TestBlockingThread::lambda$main$0 @ -2 (27 bytes) made not entrant
Total time for which application threads were stopped: 40.3971116 seconds, Stopping threads took: 40.3967755 seconds
Application time: 0.0000219 seconds
Total time for which application threads were stopped: 0.0005840 seconds, Stopping threads took: 0.0000383 seconds
41424 281 % 3 TestBlockingThread::lambda$main$0 @ 2 (27 bytes)
timeElapsed=40473
41425 282 % 4 TestBlockingThread::lambda$main$0 @ 2 (27 bytes)
41426 281 % 3 TestBlockingThread::lambda$main$0 @ -2 (27 bytes) made not entrant
timeElapsed=100
Note: for code to be replaced, threads have to be stopped at a safe point. However it appears here that such a safe point is reached very rarely (possibly only when i == 0
Changing the task to
Runnable task = () -> {
for (int i = 1; i != 0 ; i++) {
boolean b = 1.0 / i == 0;
}
};
I see a similar delay.
timeElapsed=100
Application time: 0.9587419 seconds
39044 280 % 4 TestBlockingThread::lambda$main$0 @ -2 (28 bytes) made not entrant
Total time for which application threads were stopped: 38.0227039 seconds, Stopping threads took: 38.0225761 seconds
Application time: 0.0000087 seconds
Total time for which application threads were stopped: 0.0003102 seconds, Stopping threads took: 0.0000105 seconds
timeElapsed=38100
timeElapsed=100
Adding code to the loop carefully you get a longer delay.
for (int i = 1; i != 0 ; i++) {
boolean b = 1.0 / i / i == 0;
}
gets
Total time for which application threads were stopped: 59.6034546 seconds, Stopping threads took: 59.6030773 seconds
However, change the code to use a native method which always has a safe point (if it is not an intrinsic)
for (int i = 1; i != 0 ; i++) {
boolean b = Math.cos(1.0 / i) == 0;
}
prints
Total time for which application threads were stopped: 0.0001444 seconds, Stopping threads took: 0.0000615 seconds
Note: adding if (Thread.currentThread().isInterrupted()) { ... }
to a loop adds a safe point.
Note: This happened on a 16 core machine so there is no lack of CPU resources.
Found the answer of why. They are called safepoints, and are best known as the Stop-The-World that happens because of GC.
See this articles: Logging stop-the-world pauses in JVM
Different events can cause the JVM to pause all the application threads. Such pauses are called Stop-The-World (STW) pauses. The most common cause for an STW pause to be triggered is garbage collection (example in github) , but different JIT actions (example), biased lock revocation (example), certain JVMTI operations , and many more also require the application to be stopped.
The points at which the application threads may be safely stopped are called, surprise, safepoints. This term is also often used to refer to all the STW pauses.
It is more or less common that GC logs are enabled. However, this does not capture information on all the safepoints. To get it all, use these JVM options:
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime
If you are wondering about the naming explicitly referring to GC, don’t be alarmed – turning on these options logs all of the safepoints, not just garbage collection pauses. If you run a following example (source in github) with the flags specified above.
Reading the HotSpot Glossary of Terms, it defines this:
safepoint
A point during program execution at which all GC roots are known and all heap object contents are consistent. From a global point of view, all threads must block at a safepoint before the GC can run. (As a special case, threads running JNI code can continue to run, because they use only handles. During a safepoint they must block instead of loading the contents of the handle.) From a local point of view, a safepoint is a distinguished point in a block of code where the executing thread may block for the GC. Most call sites qualify as safepoints. There are strong invariants which hold true at every safepoint, which may be disregarded at non-safepoints. Both compiled Java code and C/C++ code be optimized between safepoints, but less so across safepoints. The JIT compiler emits a GC map at each safepoint. C/C++ code in the VM uses stylized macro-based conventions (e.g., TRAPS) to mark potential safepoints.
Running with the above mentioned flags, I get this output:
Application time: 0.9668750 seconds
Total time for which application threads were stopped: 0.0000747 seconds, Stopping threads took: 0.0000291 seconds
timeElapsed=1015
Application time: 1.0148568 seconds
Total time for which application threads were stopped: 0.0000556 seconds, Stopping threads took: 0.0000168 seconds
timeElapsed=1015
timeElapsed=1014
Application time: 2.0453971 seconds
Total time for which application threads were stopped: 10.7951187 seconds, Stopping threads took: 10.7950774 seconds
timeElapsed=11732
Application time: 1.0149263 seconds
Total time for which application threads were stopped: 0.0000644 seconds, Stopping threads took: 0.0000368 seconds
timeElapsed=1015
Notice the third STW event:
Total time stopped: 10.7951187 seconds
Stopping threads took: 10.7950774 seconds
JIT itself took virtually no time, but once the JVM had decided to perform a JIT compilation, it entered STW mode, however since the code to be compiled (the infinite loop) doesn't have a call site, no safepoint was ever reached.
The STW ends when JIT eventually gives up waiting and concludes the code is in an infinite loop.
After following the comment threads and some testing on my own, I believe that the pause is caused by the JIT compiler. Why the JIT compiler is taking such a long time is beyond my ability to debug.
However, since you only asked for how to prevent this, I have a solution:
Pull your infinite loop into a method where it can be excluded from the JIT compiler
public class TestBlockingThread {
private static final Logger LOGGER = Logger.getLogger(TestBlockingThread.class.getName());
public static final void main(String[] args) throws InterruptedException {
Runnable task = () -> {
infLoop();
};
new Thread(new LogTimer()).start();
Thread.sleep(2000);
new Thread(task).start();
}
private static void infLoop()
{
int i = 0;
while (true) {
i++;
if (i != 0) {
boolean b = 1 % i == 0;
}
}
}
Run your program with this VM argument:
-XX:CompileCommand=exclude,PACKAGE.TestBlockingThread::infLoop (replace PACKAGE with your package information)
You should get a message like this to indicate when the method would have been JIT-compiled:
### Excluding compile: static blocking.TestBlockingThread::infLoop
you may notice that I put the class into a package called blocking
'Programming' 카테고리의 다른 글
테이블 행 높이 설정 (0) | 2020.07.11 |
---|---|
Node.js에서 사용하기위한 간단한 Javascript DB (0) | 2020.07.11 |
CSS 특이점 (0) | 2020.07.11 |
var_dump, var_export 및 print_r의 차이점 (0) | 2020.07.11 |
내 리눅스 시스템에서 모든 사람을 위해 환경 변수를 설정하는 방법은 무엇입니까? (0) | 2020.07.11 |