DRAGONFLY NVME BENCHMARKS
2 x XEON 2620v4 (16 cores / 32 threads @ 2.1 GHz, turbo 3.0 GHz)
16 July 2016
Matthew Dillon

nvme0: Model SAMSUNG_MZVPV128HDGM-00000 BaseSerial S1XVNYAGA03031 nscount=1
nvme0: NVME Version 1.1 maxqe=16384 caps=00f000203c013fff
nvme0: mapped 9 MSIX IRQs
nvme0: Request 64/32 queues, Returns 8/8 queues, rw-sep map (8, 8)
nvme1: Model SAMSUNG_MZVPV128HDGM-00000 BaseSerial S1XVNYAGA02988 nscount=1
nvme1: NVME Version 1.1 maxqe=16384 caps=00f000203c013fff
nvme1: mapped 9 MSIX IRQs
nvme1: Request 64/32 queues, Returns 8/8 queues, rw-sep map (8, 8)
nvme2: Model INTEL_SSDPEDMW400G4 BaseSerial CVCQ535100LC400AGN nscount=1
nvme2: NVME Version 1.0 maxqe=4096 caps=0000002028010fff
nvme2: mapped 32 MSIX IRQs
nvme2: Request 64/32 queues, Returns 31/31 queues, rw-sep map (31, 31)

NOTES ON IOSTAT AND SYSTAT -pv 1 OUTPUT:

* The iostat output shows the block size, tps (IOPS), and throughput for
  each device under test.  'id' is the total-system idle percentage and is
  correct; I just noticed that it isn't reporting user 'us' time correctly.
  The idle time is verified by the systat -pv 1 output.  Each line is one
  second.

* The systat -pv 1 output breaks down system resources by cpu:

  timer    Timer interrupts/sec (lapic timer).

  ipi      IPIs/sec (general IPIs and invltlb/invlpg IPIs).  Most of the IPI
           traffic will be for pmap invalidations and wakeup()s.

  extint   External interrupts (from the NVMe and AHCI SSDs).  The NVMe
           devices use MSI-X, the AHCI devices use MSI.  Substantially all
           the extints you see are from the NVMe devices.  As you can see,
           we distribute interrupts between cpus fairly well.

  smpcol   SMP collisions (with -pv 1 this is per-second).  Any value less
           than 10000 or so is good.  The collision count can build very
           quickly since spin-locks increment the per-cpu value on each
           loop.

  label    Approximate reason for the SMP collisions, typically a lock name,
           spin-lock name, or procedure name.  (Heuristic: accurate if you
           stare at it for a few seconds, but not so much in a single
           snapshot.)

* The NVMe device spec supports up to 65535 queues and up to 65535 entries
  per queue, but NVMe devices are not likely to implement these limits.  The
  Samsungs only implement 8 queues, the Intel only 31 (not counting the
  admin queue).  Pretty stupid actually, since there's no need to use the
  interrupt mask registers with MSI-X.  But these are first-generation NVMe
  devices.

  The devices have generous queue entry limits (maxqe: 16384 for the
  Samsungs, 4096 for the Intel, per queue).  However, the driver I wrote for
  DFly currently sets an arbitrary 256-entry limit per queue.  This is more
  than enough, and the limit really has to be imposed in order to
  pre-allocate (for performance reasons) all necessary DMA descriptor
  buffers for the highest-possible transfer size per entry.  Otherwise the
  kernel would have to allocate an insane amount of ram that would mostly go
  unused.

  DragonFly itself implements fine-grained, per-cpu MSI-X support and thus
  has no significant MSI-X limitations.  My preference would be one read
  queue and one write queue per cpu per NVMe device, and it is really
  unfortunate that these first-generation devices do not have a generous
  number of queues or MSI-X vectors to make that possible.  For the Samsungs
  my DFly driver implements a combined R/W queue for each group of 4 cpu
  threads (8 queues, 32 cpu threads total).  For the Intel the driver is
  able to assign a unique queue to all but two cpus, with cpu #31 and cpu #0
  sharing a queue.
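
  For illustration only, here is a minimal sketch of that kind of
  cpu-to-queue assignment.  This is not the actual DFly nvme driver code;
  the function name and layout are made up, and only the queue counts come
  from the probe output above.

	/*
	 * Illustrative sketch, not the actual driver code.  With 32 cpus:
	 * 8 queues (Samsung) maps each group of 4 cpu threads onto one
	 * combined R/W queue; 31 queues (Intel) gives every cpu its own
	 * queue except cpu 0 and cpu 31, which share queue 0.
	 */
	static int
	cpu_to_ioqueue(int cpuid, int ncpus, int nqueues)
	{
		if (nqueues >= ncpus)		/* one queue per cpu */
			return (cpuid);
		if (ncpus % nqueues == 0)	/* 32 cpus / 8 queues -> groups of 4 */
			return (cpuid / (ncpus / nqueues));
		return (cpuid % nqueues);	/* 32 cpus, 31 queues -> cpu0 & cpu31 share */
	}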

* WARNING!  The performance of the Intel NVMe SSD is destroyed in
  larger-block tests when I use 65536-byte blocks, so for now I'm using
  32768 bytes for all large-block tests.  I consider this a serious bug in
  the Intel NVMe SSD chipset and hardware; it certainly isn't a bug in
  DragonFly.  The Samsungs don't have this problem.

LARGE BLOCK TEST BEFORE PHYSIO CHANGES (RESULTS ~SAME AFTER PHYSIO CHANGES TOO)

This is running two randreads per NVMe card x 3 cards, 32KB block size,
random seek, 32 threads each (64 threads per card, 192 threads total).  All
tests are on a 16GB partition via the raw device.  The partition is
completely filled with /dev/urandom data.

Example test line using randread from /usr/src/sys/test/sysperf (two of
these are run for each of nvme0, nvme1, and nvme2; a rough sketch of what
each randread process does follows the systat output below):

    randread /dev/nvme0s1b 4096 95 32

Aggregate throughput is around 5.0-5.1 GBytes/sec with the three NVMe
devices (NVMe driver), and 6.5 GBytes/sec if I throw in another four SATA
devices (AHCI driver).

 tty           nvme0              nvme1              nvme2            cpu
 tin tout  KB/t   tps    MB/s  KB/t   tps    MB/s  KB/t   tps    MB/s  us ni sy in id
   0   83 32.00 55135 1722.96 32.00 53221 1663.16 32.00 65507 2047.03   0  0 27  1 72
   0   79 32.00 55234 1726.07 32.00 53455 1670.51 32.00 64968 2030.23   0  0 26  1 73
   0   81 32.00 46728 1460.29 32.00 53323 1666.36 32.00 65009 2031.52   0  0 24  1 75
   0   79 32.00 47748 1492.08 32.00 52949 1654.67 32.00 64619 2019.33   0  0 23  1 76
   0   80 32.00 37377 1168.03 32.00 53044 1657.61 32.00 65646 2051.42   0  0 20  1 78
   0   79 32.00 48556 1517.37 32.00 53126 1660.18 32.00 65465 2045.71   0  0 25  1 74

        timer      ipi  extint  user%  sys%  intr%  idle%  smpcol  label
total          1439669  160903                               62132
cpu0      290    51350      15    0.7   20.8    1.7   76.9    1890  Xrelpbuf
cpu1      283    51882    5927    0.0   23.9    0.8   75.4    1934  Xrelpbuf
cpu2      283    59501    8848    0.0   26.9    0.8   72.3    2138  Xgetpbuf_kva
cpu3      282    44157    4956    0.0   23.9    1.5   74.6    1463  X_vm_page_queue
cpu4      283    62401    6184    0.8   31.6    2.3   65.4    2160  Xrelpbuf
cpu5      282    47571    4961    0.0   26.9    0.0   73.1    1758  Xgetpbuf_kva
cpu6      283    38836    5262    0.0   18.5    0.0   81.5    1551  X_vm_page_queue
cpu7      283    43718    5596    0.0   19.2    3.1   77.7    1324  X_vm_page_queue
cpu8      282    47511    6324    0.0   20.8    3.1   76.1    1945  Xrelpbuf
cpu9      282    44625    5106    0.0   22.3    0.0   77.7    2151  X_vm_page_queue
cpu10     282    50254    5860    0.0   23.1    1.5   75.4    2041  Xrelpbuf
cpu11     283    49734    7519    0.8   23.1    1.5   74.6    2200  Xrelpbuf
cpu12     283    50408    6840    0.0   16.9    1.5   81.5    2523  Xrelpbuf
cpu13     283    53601    9031    0.8   26.2    1.5   71.5    2154  X_vm_page_queue
cpu14     282    53460    7443    0.0   27.7    4.6   67.7    2520  X_vm_page_queue
cpu15     282    51619    6735    0.0   25.4    0.8   73.8    2035  Xrelpbuf
cpu16     282    49195   10131    0.0   22.3    0.8   76.9    1534  X_vm_page_queue
cpu17     283    48078    4358    0.8   14.6    0.8   83.8    1320  Xrelpbuf
cpu18     282    32368      27    0.0   18.5    0.0   81.5    1336  Xrelpbuf
cpu19     283    40742    4077    0.0   18.5    0.0   81.5    1477  Xrelpbuf
cpu20     283    44458    4131    0.0   26.2    0.0   73.8    1794  Xrelpbuf
cpu21     283    33963    3107    0.0   22.3    0.8   76.9    1208  Xgetpbuf_kva
cpu22     283    37800    3056    0.0   16.9    0.0   83.1    1676  Xrelpbuf
cpu23     284    37727    3050    0.0   24.6    0.0   75.4    1637  Xgetpbuf_kva
cpu24     284    35968    2057    0.0   13.9    0.0   86.1    1698  Xgetpbuf_kva
cpu25     284    46486    6153    0.0   20.8    0.0   79.2    2975  Xgetpbuf_kva
cpu26     282    39188    5121    2.3   21.6    0.8   75.4    2291  Xgetpbuf_kva
cpu27     284    34978    3112    0.0   18.5    0.8   80.8    2109  X_vm_page_queue
cpu28     282    34981    4503    0.8   13.9    2.3   83.1    2064  Xgetpbuf_kva
cpu29     283    36798    3089    1.5   20.8    0.8   76.9    2408  Xrelpbuf
cpu30     283    38774    4151    1.5   22.3    2.3   73.8    2612  Xgetpbuf_kva
cpu31     283    47537    4173    1.5   26.2    0.0   72.3    2206  Xgetpbuf_kva
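
As promised above, here is a rough sketch of the per-process access
pattern.  This is not the actual /usr/src/sys/test/sysperf randread source;
the worker function and its arguments are illustrative only, and I am
assuming the third argument (95) is a percentage-of-device limit and the
fourth (32) the process/thread count.  The idea is simply: open the raw
partition and loop forever issuing block-sized pread()s at random
block-aligned offsets within that limited portion of the device.

	/* Illustrative sketch only, not the real randread source. */
	#include <fcntl.h>
	#include <stdlib.h>
	#include <unistd.h>

	static void
	worker(const char *path, size_t blksize, off_t devsize, int limitpct)
	{
		char *buf;
		int fd = open(path, O_RDONLY);
		off_t nblocks = (devsize / 100 * limitpct) / blksize;

		/* page-aligned buffer for raw-device I/O */
		if (fd < 0 || posix_memalign((void **)&buf, 4096, blksize) != 0)
			return;
		for (;;) {
			off_t blkno = arc4random() % nblocks;
			pread(fd, buf, blksize, blkno * (off_t)blksize);
		}
	}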

SMALL BLOCK RANDREAD TEST BEFORE PHYSIO CHANGES

This is the 4K block size randread test.  Individually the cards can do
around 200K IOPS.  Currently all three together perform more poorly, only
160K IOPS in aggregate.  Roughly speaking the problem is associated with
the IPI rate, which tends to max out at around 150K IPIs/sec per cpu.
These IPIs are due to kernel_pmap manipulation related to the buffer cache;
pbuf lock contention also contributes.  As you can see, the cpus are fully
engaged (0% idle).

 tty           nvme0             nvme1             nvme2            cpu
 tin tout  KB/t   tps   MB/s  KB/t   tps   MB/s  KB/t   tps   MB/s  us ni sy in id
   0   74  4.00 46882 183.13  4.00 54310 212.14  4.00 56358 220.15   0  0 96  4  0
   0   73  4.00 47400 185.16  4.00 53260 208.04  4.00 56695 221.44   0  0 96  4  0
   0   72  4.00 48413 189.11  4.00 54059 211.18  4.00 54986 214.78   0  0 96  3  0
   0   69  4.00 47826 186.82  4.00 54065 211.19  4.00 55118 215.30   0  0 96  3  0

        timer      ipi  extint  user%  sys%  intr%  idle%  smpcol  label
total          4405203  139569                             1834874
cpu0      282   156766      10    1.5  94.7    3.8    0.0    50330  Xrelpbuf
cpu1      276   152035    4954    0.0  96.2    3.8    0.0    50976  Xgetpbuf_kva
cpu2      277   149827    7917    1.5  90.0    8.5    0.0    45292  Xnvqlk
cpu3      276   147195    5634    0.0  93.9    6.1    0.0    45478  Xrelpbuf
cpu4      276   146153    5970    2.3  95.4    2.3    0.0    46169  Xgetpbuf_kva
cpu5 ... cpu31

SMALL BLOCK RANDREAD TEST AFTER PHYSIO CHANGES

This is a 4K block size test after changes to the pbuf system used by
physio.  Most telling of these changes is that tests on each card no longer
interfere with each other, the IPI rate is way, way down, lock contention
is also way, way down, and the system is able to achieve these results
while maintaining 75% idle.  (A generic illustration of this kind of change
follows the systat output below.)

 tty            nvme0               nvme1               nvme2            cpu
 tin tout  KB/t    tps    MB/s  KB/t    tps    MB/s  KB/t    tps    MB/s  us ni sy in id
   0   69  4.00 274582 1072.58  4.00 274437 1072.01  4.00 381999 1492.17   1  0 19  4 76
   0   57  4.00 274659 1072.89  4.00 274340 1071.64  4.00 382228 1493.06   1  0 20  3 76
   0   65  4.00 274597 1072.65  4.00 274436 1072.01  4.00 380048 1484.54   1  0 18  3 78
   0   65  4.00 274373 1071.77  4.00 273993 1070.29  4.00 380542 1486.46   1  0 18  3 77
   0   59  4.00 274157 1070.93  4.00 274310 1071.53  4.00 380431 1486.04   1  0 18  4 77
   0   58  4.00 275083 1074.54  4.00 274157 1070.93  4.00 381172 1488.93   1  0 20  3 76

        timer      ipi  extint  user%  sys%  intr%  idle%  smpcol  label
total           151375  940226                                5034
cpu0      239    23672       8    0.8  10.7    1.6   86.9      573  Xnvqlk
cpu1      232     4933   37915    0.8  27.7    2.3   69.2       85  Xnvqlk
cpu2      232      393   51882    2.3  23.1    6.2   68.5      411  Xnvqlk
cpu3      232     4338   30340    2.3  13.8    3.8   80.0       18  Xgetpbuf_mem
cpu4      232     4930   40823    1.5  17.7    2.3   78.5       99  Xnvqlk
cpu5      232       59   40732    0.8  21.5    5.4   72.3       85  Xrelpbuf
cpu6      232     4989   33267    0.0  18.5    4.6   76.9       94  Xnvqlk
cpu7      232      103   45010    2.3  26.9    4.6   66.2      132  Xnvqlk
cpu8      232       25   27719    1.5  16.9    6.9   74.6       48  Xrelpbuf
cpu9      232        0   30153    0.0  11.5    1.5   86.9       27  Xgetpbuf_mem
cpu10     232     9887   37865    0.8  20.0    3.1   76.2      166  Xnvqlk
cpu11     232       86   41407    1.5  20.8    7.7   70.0      103  Xnvqlk
cpu12     232     4856   26126    0.8  13.1    2.3   83.8       15  Xrelpbuf
cpu13     232      303   38798    2.3  18.5    3.8   75.4      323  Xnvqlk
cpu14     233      170   34764    0.0  20.0    1.5   78.5      187  Xnvqlk
cpu15     232        0   31869    4.6  16.2    5.4   73.8       23  Xrelpbuf
cpu16     232      550   56215    2.3  30.0    2.3   65.4      575  Xnvqlk
cpu17     232     9143   21787    0.8  16.9    3.8   78.5       68  Xrelpbuf
cpu18     233    24311       0    0.8   9.2    0.0   90.0      550  Xnvqlk
cpu19     233      938   24132    0.8  16.9    2.3   80.0       29  Xnvqlk
cpu20     232     6374   12117    1.5   6.9    1.5   90.0      163  Xnvqlk
cpu21     232     2909   18207    0.0  10.8    1.5   87.7       86  Xnvqlk
cpu22     234     4073   18072    0.0  12.3    3.1   84.6       88  Xnvqlk
cpu23     232     5066   18194    0.8  14.6    0.0   84.6      148  Xnvqlk
cpu24     232        2   34357    1.5  16.9    3.8   77.7       27  Xrelpbuf
cpu25     232     4850   36118    0.0  20.0    3.8   76.2       20  Xrelpbuf
cpu26     232     4344   24183    0.8  15.4    0.8   83.1      189  Xnvqlk
cpu27     232     9331   24196    0.0  11.5    0.8   87.7      138  Xnvqlk
cpu28     232        0   30248    0.8  16.9    3.1   79.2       31  Xgetpbuf_mem
cpu29     232    11313   24211    0.8  18.5    5.4   75.4      310  Xnvqlk
cpu30     232     6779   19382    0.0  16.2    1.5   82.3      203  Xnvqlk
cpu31     232     2648   30129    0.8  17.7    3.1   78.5       20  Xrelpbuf
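
The illustration promised above.  This is NOT the actual DragonFly
pbuf/physio change; it is a generic userland sketch of the underlying
technique (a per-thread cache in front of a single locked pool), which is
the flavor of thing that turns heavy Xgetpbuf/Xrelpbuf collisions into the
mostly-quiet numbers shown above.  It does not model the kernel_pmap/IPI
side of the problem.

	/*
	 * Generic illustration only, not kernel code.  Every get/rel
	 * against a single locked pool serializes all cpus; a per-thread
	 * free list keeps the common case entirely local.
	 */
	#include <pthread.h>
	#include <stddef.h>

	struct pbuf { struct pbuf *next; };

	static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
	static struct pbuf *pool_free;			/* shared pool (lock required) */
	static __thread struct pbuf *local_free;	/* per-thread cache (lock-free) */

	static struct pbuf *
	getpbuf(void)
	{
		struct pbuf *bp = local_free;

		if (bp) {				/* fast path: no shared state touched */
			local_free = bp->next;
			return (bp);
		}
		pthread_mutex_lock(&pool_lock);		/* slow path: hit the shared pool */
		if ((bp = pool_free) != NULL)
			pool_free = bp->next;
		pthread_mutex_unlock(&pool_lock);
		return (bp);				/* NULL: caller must wait and retry */
	}

	static void
	relpbuf(struct pbuf *bp)
	{
		bp->next = local_free;			/* release to the local cache */
		local_free = bp;
	}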

SMALL BLOCK RANDREAD TEST AFTER PHYSIO CHANGES 3 x NVMe + 4 x SATA

In this experiment I had three NVMe drives and four SATA drives, achieving
1.05M IOPS with random 4K reads, which left the machine 63% idle.  I need
to buy some more NVMe drives; I think this machine should be capable of
2M IOPS+ before it runs out of cpu.

I did have to make one change, which was to increase the number of
preallocated kernel pbufs from 256 to 512, since this test runs 320 user
process threads.  Without that change the IOPS rate becomes limited by
available pbufs in the kernel.  (A quick sanity check on these numbers
follows the systat output below.)

 tty          da0            da1            da2            da3            nvme0              nvme1              nvme2            cpu
 tin tout  KB/t   tps   MB/s  KB/t   tps   MB/s  KB/t   tps   MB/s  KB/t   tps   MB/s  KB/t    tps    MB/s  KB/t    tps    MB/s  KB/t    tps    MB/s  us ni sy in id
   0   73  4.00 57746 225.57  4.00 77116 301.23  4.00 29216 114.12  4.00 29486 115.18  4.00 274191 1071.05  4.00 274273 1071.38  4.00 380702 1487.08   2  0 31  6 62
   0   66  4.00 57763 225.64  4.00 76130 297.38  4.00 28992 113.25  4.00 29605 115.65  4.00 274923 1073.91  4.00 273980 1070.23  4.00 380305 1485.53   2  0 29  6 64
   0   65  4.00 57464 224.47  4.00 75386 294.48  4.00 28987 113.23  4.00 29475 115.14  4.00 274611 1072.71  4.00 273802 1069.53  4.00 379282 1481.51   2  0 29  5 65
   0   65  4.00 57989 226.52  4.00 74467 290.89  4.00 28832 112.63  4.00 29317 114.52  4.00 274584 1072.56  4.00 274162 1070.95  4.00 379796 1483.53   2  0 29  6 63
   0   66  4.00 57558 224.84  4.00 74587 291.36  4.00 28781 112.43  4.00 29491 115.20  4.00 273922 1070.00  4.00 273702 1069.14  4.00 379750 1483.33   2  0 31  6 61
   0   65  4.00 57662 225.24  4.00 75338 294.29  4.00 28795 112.48  4.00 29495 115.22  4.00 274470 1072.14  4.00 273672 1069.02  4.00 379529 1482.51   2  0 29  5 64

        timer      ipi  extint  user%  sys%  intr%  idle%  smpcol  label
total           990162  982494                              353883
cpu0    40413    87057   30428    3.8  40.0    4.6   51.5    24988  Xahcicam
cpu1      282   155197   39109    2.3  63.1   11.5   23.1    15485  Xahcicam
cpu2      282     4543   48734    4.6  30.0    4.6   60.8     2783  Xrelpbuf
cpu3      282   165138    8269    0.0  40.0   32.3   27.7    56868  Xahcipo
cpu4      282     1161   50861    0.8  31.5    5.4   62.3      212  Xnvqlk
cpu5      282     5746   48141    1.5  33.8    5.4   59.2     1255  Xahcicam
cpu6      282    26873   32241    0.8  27.7    6.9   64.6    15695  Xahcicam
cpu7      282    33200   25926    0.8  27.7    3.8   67.7    16462  Xahcicam
cpu8      282    10319   56108    2.3  29.2    8.5   60.0     1136  Xnvqlk
cpu9      282    10113   30148    0.8  21.5    4.6   73.1     4188  Xahcicam
cpu10     282     5200   42295    0.0  25.4    3.1   71.5      820  Xrelpbuf
cpu11     282     5701   38645    3.1  25.4    7.7   63.8     2433  Xgetpbuf_mem
cpu12     282     3154   52138    0.8  29.2    6.2   63.8      358  Xnvqlk
cpu13     282    13507   40507    3.1  31.5    3.8   61.5     1611  Xahcicam
cpu14     283    15422   33537    0.8  29.2    1.5   68.5     7283  Xahcicam
cpu15     282     9317   36373    0.8  30.8   10.0   58.5     2681  Xahcicam
cpu16     282     3391   43111    0.0  39.2    6.9   53.8     1582  Xahcicam
cpu17     282    32044   31724    1.5  36.2    3.8   58.5    16379  Xahcicam
cpu18     282    52803      26    0.0  30.0    0.0   70.0    37945  Xahcicam
cpu19     282    44982       0    0.0  14.6    0.0   85.4    26327  Xahcicam
cpu20     283    13102   29949    2.3  23.8    3.8   70.0     4404  Xahcicam
cpu21     282    37836   23980    2.3  26.2    4.6   66.9    18587  Xahcicam
cpu22     282    38797   17959    2.3  26.9    1.5   69.2    23421  Xahcicam
cpu23     282    33839   17971    0.8  26.9    3.8   68.5    21431  Xahcicam
cpu24     282    35791   12059    0.0  24.6    2.3   73.1     2292  Xahcicam
cpu25     284    18204   30009    1.5  26.9    2.3   69.2     3199  Xahcicam
cpu26     282    26933   23878    2.3  27.7    3.1   66.9    13126  Xahcicam
cpu27     283    18406   30063    2.3  30.8    1.5   65.4     5294  Xahcicam
cpu28     282    28561   15075    1.5  28.5    1.5   68.5     8788  Xahcicam
cpu29     283    10091   30173    1.5  23.8    2.3   72.3     2494  Xahcicam
cpu30     283    21472   35899    0.8  20.0    6.2   73.1     3660  Xahcicam
cpu31     283    22262   27158    0.8  34.6    5.4   59.2    10696  Xahcicam
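
The sanity check promised above.  One plausible breakdown of the 320
threads (an assumption on my part, since the split is not given): 64 reader
threads per NVMe device and 32 per SATA device.

    3 x 64 (NVMe) + 4 x 32 (SATA) = 192 + 128 = 320 reader threads

Each in-flight physio request holds one pbuf, so a default pool of 256
preallocated pbufs caps concurrency below the number of readers; raising it
to 512 leaves comfortable headroom.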

LARGE BLOCK RANDREAD TEST AFTER PHYSIO CHANGES 3 x NVMe + 4 x SATA

This test uses a 32KB block size, in deference to the Intel NVMe card which
has *HORRIBLE* performance with 64KB blocks (literally only 300 MBytes/sec
with 64KB blocks, versus 2 GBytes/sec with 32KB blocks).

In this test I am clearing 6.5 GBytes/sec using 32 user process threads per
device, except for the Intel (nvme2) where I use 64 user process threads.
(Since the Intel has a nearly 1:1 queue mapping with 31 queues, I need to
queue at least two read requests per queue to maximize performance.)

 tty          da0            da1           da2           da3            nvme0              nvme1              nvme2            cpu
 tin tout  KB/t   tps   MB/s  KB/t   tps   MB/s  KB/t  tps   MB/s  KB/t  tps   MB/s  KB/t   tps    MB/s  KB/t   tps    MB/s  KB/t   tps    MB/s  us ni sy in id
   0   71 32.00 16115 503.58 32.00 13834 432.30 32.00 6394 199.83 32.00 8560 267.51 32.00 55535 1735.47 32.00 53310 1665.97 32.00 64949 2029.63   0  0  9  2 89
   0   64 32.00 16086 502.68 32.00 13791 430.97 32.00 6394 199.80 32.00 8570 267.82 32.00 47155 1473.58 32.00 52977 1655.53 32.00 65181 2036.88   1  0  8  2 90
   0   57 32.00 16089 502.78 32.00 13839 432.47 32.00 6313 197.27 32.00 8520 266.26 32.00 47577 1486.78 32.00 53338 1666.80 32.00 65401 2043.78   0  0  8  2 90
   0   57 32.00 16057 501.78 32.00 13793 431.04 32.00 6318 197.43 32.00 8509 265.89 32.00 47063 1470.73 32.00 53184 1662.00 32.00 64865 2027.04   0  0  9  1 89
   0   62 32.00 16096 502.99 32.00 13838 432.44 32.00 6376 199.24 32.00 8490 265.32 32.00 47469 1483.39 32.00 53098 1659.32 32.00 65148 2035.88   0  0  8  2 90
   0   60 32.00 16095 502.97 32.00 13804 431.38 32.00 6399 199.95 32.00 8553 267.29 32.00 55278 1727.44 32.00 53178 1661.82 32.00 65247 2038.98   1  0  9  2 88

        timer      ipi  extint  user%  sys%  intr%  idle%  smpcol  label
total           215970  256043                               13758
cpu0      288    19934   32356    0.0  17.1    8.5   74.4       33  Xahcicam
cpu1    40621    29759   59232    0.0  35.7   11.6   52.7        0
cpu2      281     1503    7375    0.8  10.1    2.3   86.8       13  Xahcicam
cpu3    40622    90981       0    0.0   9.3   14.0   76.7    13147  Xahcipo
cpu4      281      976    9015    0.8  14.0    2.3   82.9        2  Xahcicam
cpu5      281     2529    7268    0.8   7.8    0.0   91.5        2  Xahcicam
cpu6      281      293    9673    0.0  14.0    0.8   85.3        1  Xahcicam
cpu7      281     3729    7292    0.0   7.8    0.8   91.5       25  Xahcicam
cpu8      281       15   15663    0.8  22.5    0.8   76.0        8  Xahcipo
cpu9      281     3042    2039    0.0   6.2    0.8   93.0       12  Xahcicam
cpu10     281     1454    6127    0.8   3.9    2.3   93.0        2  Xahcicam
cpu11     282      276    9367    0.0   9.3    0.8   89.9        1  Xrelpbuf
cpu12     281     1249    8713    0.0   9.3    1.6   89.1        0
cpu13     282    10141    8406    0.8  12.4    0.8   86.0        1  Xgetpbuf_mem
cpu14     282     2331    7778    0.0   7.0    1.6   91.5        3  Xahcicam
cpu15     283     2060    6671    2.3   8.5    1.5   87.6        1  Xrelpbuf
cpu16     282     2490    3669    1.6   4.7    0.8   93.0       17  Xahcicam
cpu17     281     2317   12099    0.0  12.4    1.6   86.0        1  Xgetpbuf_mem
cpu18     282     3717       0    0.0   3.9    0.0   96.1       39  Xahcicam
cpu19     282     4217    1037    0.0   7.7    0.0   92.3       52  Xrelpbuf
cpu20     281     1510    4079    0.0  10.1    0.8   89.1       11  Xahcicam
cpu21     281     3983    2075    0.0   4.7    0.0   95.3       43  Xahcicam
cpu22     281     1926    4218    0.0  11.6    0.0   88.4       12  Xahcicam
cpu23     281     2261    3051    0.8   6.2    0.0   93.0       13  Xahcicam
cpu24     281     3873    4067    0.0   6.2    0.0   93.8        2  Xrelpbuf
cpu25     281     5080    2073    0.0   8.5    0.0   91.5        6  Xnvqlk
cpu26     281     2089    3093    0.0   5.4    0.8   93.8        6  Xrelpbuf
cpu27     281     3569    3057    0.8   6.2    0.8   92.2        6  Xahcicam
cpu28     281     1776    2105    0.8   3.1    0.0   96.1      277  Xahcicam
cpu29     281     2450    4125    0.0   3.9    0.8   95.3       10  Xahcicam
cpu30     281     1689    4118    0.0   6.2    1.6   92.2       11  Xahcicam
cpu31     281     2751    6202    0.8   9.3    0.8   89.2        1  Xnvqlk
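
Two quick checks on the numbers above.  With 64 reader threads against
nvme2's 31 I/O queues, each queue carries roughly 64 / 31 ~= 2 commands in
flight, which is the reasoning for doubling the thread count on the Intel.
And summing the MB/s columns of the first iostat sample:

    504 + 432 + 200 + 268 + 1735 + 1666 + 2030 ~= 6835 MBytes/sec

i.e. about 6.8 GBytes/sec for that sample, consistent with the 6.5+
GBytes/sec aggregate figure quoted above.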