Kishore Kumar Pusukuri
2011-01-29 07:28:53 UTC
Hi,
I am experimenting with the FX scheduling policy, using different time quanta on SPECOMP multithreaded programs. I am using "prstat -Lm" to analyze the effect of the time quantum on the performance of the programs.
Most of the programs show time spent in "system traps" (TRP) with an FX time quantum of 10ms. However, TRP stays at 0.0 with FX time quanta of 100ms, 200ms, and higher. I understand that changing the time quantum will change other prstat fields such as context switches and lock contention, but I don't understand why I see traps only with the 10ms time quantum. My machine is a multi-core AMD Opteron running Solaris 10.
Please see the prstat output below (for FX with 200ms, 100ms, and 10ms). I have also included stack traces from a run with FX 10ms.
Could someone help me understand this? Many thanks.
FX with 200ms
-------------
PID   USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
23062 user      77 0.1 0.0 0.0 0.0 3.1 0.0  20 119  88  1K   0 myprogram/1
23062 user      73 0.0 0.0 0.0 0.0 9.6 0.0  18 185 351 206   0 myprogram/11
23062 user      72 0.0 0.0 0.0 0.0 9.3 0.0  19 185  72 204   0 myprogram/8
23062 user      71 0.0 0.0 0.0 0.0  10 0.0  19 178  83 194   0 myprogram/17
.....
.....
23062 user      69 0.0 0.0 0.0 0.0 9.5 0.0  21 180  85 196   0 myprogram/18
...
...
23062 user      69 0.0 0.0 0.0 0.0 9.8 0.0  21 193  84 206   0 myprogram/31
FX with 100ms
-------------
PID   USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
23089 user      76 0.1 0.0 0.0 0.0 2.0 0.0  22 138 164  2K   0 myprogram/1
23089 user      70 0.0 0.0 0.0 0.0  10 0.0  20 211 157 227   0 myprogram/10
23089 user      70 0.0 0.0 0.0 0.0  10 0.0  20 220 435 238   0 myprogram/7
....
....
23089 user      69 0.0 0.0 0.0 0.0  10 0.0  20 214 153 228   0 myprogram/21
23089 user      69 0.0 0.0 0.0 0.0  10 0.0  21 221 138 241   0 myprogram/4
....
...
23089 user      68 0.0 0.0 0.0 0.0  10 0.0  22 206 155 223   0 myprogram/9
23089 user      68 0.0 0.0 0.0 0.0  10 0.0  22 215 136 232   0 myprogram/24
FX with 10ms
------------
PID   USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
23105 user      78 0.1 0.1 0.0 0.0 1.3 0.0  21 168  1K  2K   0 myprogram/1
23105 user      68 0.0 0.1 0.0 0.0  10 0.0  21 241  1K 267   0 myprogram/2
23105 user      68 0.0 0.2 0.0 0.0  10 0.0  22 245  1K 270   0 myprogram/16
23105 user      68 0.0 0.2 0.0 0.0  10 0.0  22 246  1K 271   0 myprogram/25
....
...
23105 user      68 0.0 0.1 0.0 0.0  10 0.0  22 255  1K 281   0 myprogram/8
23105 user      68 0.0 0.1 0.0 0.0  10 0.0  22 235  1K 260   0 myprogram/21
...
...
23105 user      67 0.0 0.1 0.0 0.0  10 0.0  22 250  1K 273   0 myprogram/20
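To compare the runs numerically rather than by eye, the per-LWP rows could be summarized with a small script. The following is only a sketch in Python: the helper names parse_count and summarize are made up for illustration, and it assumes the 15-field "prstat -Lm" row layout shown above (TRP as the fifth field, ICX as the twelfth). Note that prstat abbreviates large counts, so "1K" is only approximately 1000.

```python
def parse_count(field):
    """Convert a prstat count such as '88' or '1K' to an integer.

    prstat rounds large counts, so '1K' is only approximately 1000.
    """
    multipliers = {"K": 1_000, "M": 1_000_000}
    if field[-1] in multipliers:
        return int(float(field[:-1]) * multipliers[field[-1]])
    return int(field)


def summarize(lines):
    """Return (mean TRP percent, mean ICX) over the per-LWP data rows."""
    trp, icx = [], []
    for line in lines:
        cols = line.split()
        # Data rows have 15 fields and start with a numeric PID;
        # the header and the "...." elision lines are skipped.
        if len(cols) != 15 or not cols[0].isdigit():
            continue
        trp.append(float(cols[4]))          # TRP column
        icx.append(parse_count(cols[11]))   # ICX column
    if not trp:
        return 0.0, 0.0
    return sum(trp) / len(trp), sum(icx) / len(icx)


# The first two data rows of the 200ms and 10ms runs, copied from above.
fx_200ms = [
    "23062 user 77 0.1 0.0 0.0 0.0 3.1 0.0 20 119 88 1K 0 myprogram/1",
    "23062 user 73 0.0 0.0 0.0 0.0 9.6 0.0 18 185 351 206 0 myprogram/11",
]
fx_10ms = [
    "23105 user 78 0.1 0.1 0.0 0.0 1.3 0.0 21 168 1K 2K 0 myprogram/1",
    "23105 user 68 0.0 0.1 0.0 0.0 10 0.0 21 241 1K 267 0 myprogram/2",
]

for label, rows in (("200ms", fx_200ms), ("10ms", fx_10ms)):
    mean_trp, mean_icx = summarize(rows)
    print(f"FX {label}: mean TRP = {mean_trp:.2f}%, mean ICX = {mean_icx:.1f}")
```

Feeding it the complete prstat output for each run would show TRP next to the involuntary context switch count (ICX), which is the other column that changes visibly between the 10ms run and the longer quanta.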
Stack traces of the program with FX 10ms
----------------------------------------
$ pstack 23137/25
23137: ./myprogram
----------------- lwp# 25 / thread# 25 --------------------
fffffd7ffa86abcb omp_set_lock () + 8b
000000000041e813 _$d1A593.mm_fv_update_nonbon () + 10c3
000001bd00000001 ???????? ()
4054eef5073a9994 ???????? ()
$ pstack 23137/16
23137: ./myprogram
----------------- lwp# 16 / thread# 16 --------------------
000000000041eb6d _$d1A593.mm_fv_update_nonbon () + 141d
000001bd00000001 ???????? ()
4054ec19011d549c ???????? ()
$ pstack 23137/21
23137: ./myprogram
----------------- lwp# 21 / thread# 21 --------------------
000000000041ebfa _$d1A593.mm_fv_update_nonbon () + 14aa
000001bd00000001 ???????? ()
4054ea1dc63c899c ???????? ()
$ pstack 23137/2
23137: ./myprogram
----------------- lwp# 2 / thread# 2 --------------------
fffffd7ffa896f12 atomic_store () + 2
000000000041ec68 _$d1A593.mm_fv_update_nonbon () + 1518
000001bd00000001 ???????? ()
4054e9adebe3c3da ???????? ()
--
This message posted from opensolaris.org