DragonFly Basic NVME Driver Tests
07-June-2016

Max IOPS tests, appear to be CPU limited

CPU: Intel(R) Xeon(R) CPU E31270 @ 3.40GHz (3392.31-MHz K8-class CPU)
     (4-core/8-thread)

nvme0: Model INTEL_SSDPEDMW400G4 BaseSerial CVCQ535100LC400AGN nscount=1
nvme0: Request 16/8 queues, Returns 31/31 queues, optimal map
nvme1: Model SAMSUNG_MZVPV128HDGM-00000 BaseSerial S1XVNYAGA02988 nscount=1
nvme1: Request 16/8 queues, Returns 8/8 queues, nominal map 1:1 cpu
nvme2: Model SAMSUNG_MZVPV128HDGM-00000 BaseSerial S1XVNYAGA03031 nscount=1
nvme2: Request 16/8 queues, Returns 8/8 queues, nominal map 1:1 cpu

512 byte random read, 32 processes per device

[1]  + 1384 Running    randread /dev/nvme0s1b
[2]  - 1385 Running    randread /dev/nvme1s1b
[3]    1386 Running    randread /dev/nvme2s1b

       timer     ipi  extint  user%  sys% intr% idle%  smpcol label
total          16532  530314                            11233
cpu0     289      14   85115    8.5  79.1  12.4   0.0    3699 Xrelpbuf
cpu1     282    4356   63090    7.8  79.8  12.4   0.0    1017 Xgetpbuf_kva
cpu2     284       9   63294    8.5  77.5  14.0   0.0    1084 Xrelpbuf
cpu3     282    6503   63468    3.9  79.1  17.1   0.0    1006 Xrelpbuf
cpu4     283      19   65579   10.1  79.8  10.1   0.0    1215 Xrelpbuf
cpu5     283       7   63248   10.9  79.1  10.1   0.0    1044 Xrelpbuf
cpu6     283       8   63397    7.0  76.0  17.1   0.0    1095 Xrelpbuf
cpu7     282    5616   63123    7.8  78.3  14.0   0.0    1073 Xrelpbuf

      tty         nvme0                 nvme1                 nvme2              cpu
 tin tout   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s us ni sy in id
   0  107   0.50 187781   91.69   0.50 176821   86.34   0.50 170829   83.41  8  0 76 15  0
   0   94   0.50 192791   94.13   0.50 175528   85.71   0.50 167454   81.76  7  0 77 16  0
   0   95   0.50 194812   95.12   0.50 176297   86.08   0.50 163919   80.03  9  0 78 13  0
   0   97   0.50 190716   93.12   0.50 171165   83.58   0.50 172798   84.37  7  0 78 15  0
   0   92   0.50 193249   94.35   0.50 173765   84.85   0.50 168127   82.09  8  0 78 14  0
   0   92   0.50 182300   89.01   0.50 180501   88.13   0.50 172034   84.00  7  0 78 15  0
   0   95   0.50 195943   95.67   0.50 169722   82.87   0.50 169115   82.58  7  0 78 15  0
   0   94   0.50 194730   95.08   0.50 173503   84.72   0.50 166149   81.12  8  0 79 13  0
   0   92   0.50 193314   94.39   0.50 174218   85.07   0.50 167193   81.64  8  0 77 14  0

---

4096 byte random read, 32 processes per device

       timer     ipi  extint  user%  sys% intr% idle%  smpcol label
total          10110  537058                            10920
cpu0     289      20   84934    5.4  78.3  12.4   3.9    3632 Xgetpbuf_kva
cpu1     281    4193   64054    8.5  76.7  14.7   0.0    1000 Xrelpbuf
cpu2     282      15   64794    8.5  72.9  18.6   0.0    1040 X_vm_page_queue
cpu3     281      14   64087    4.7  72.1  20.9   2.3     979 Xrelpbuf
cpu4     282      27   66305    7.0  84.5   8.5   0.0    1188 Xrelpbuf
cpu5     282       9   64387    6.2  76.7  15.5   1.6    1028 Xrelpbuf
cpu6     283      14   64172    9.3  74.4  15.5   0.8    1046 Xrelpbuf
cpu7     281    5818   64325    3.9  84.5  11.6   0.0    1007 X_vm_page_queue

      tty         nvme0                 nvme1                 nvme2              cpu
 tin tout   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s us ni sy in id
   0  115   4.00 183661  717.41   4.00 174870  683.08   4.00 175364  684.98  8  0 78 14  0
   0   91   4.00 180674  705.76   4.00 173465  677.58   4.00 180462  704.91  8  0 78 13  1
   0   88   4.00 174713  682.47   4.00 179920  702.82   4.00 180719  705.93  8  0 75 17  1
   0   91   4.00 172252  672.86   4.00 181824  710.25   4.00 181308  708.23  7  0 77 15  1
   0   89   4.00 180939  706.79   4.00 174813  682.86   4.00 179362  700.64  7  0 76 16  1
   0   85   4.00 183797  717.95   4.00 173374  677.23   4.00 177456  693.18  7  0 76 17  0
   0   94   4.00 181913  710.59   4.00 176594  689.82   4.00 176635  689.99  6  0 78 16  0
   0   94   4.00 182140  711.49   4.00 176033  687.64   4.00 177174  692.08  8  0 77 14  1
   0   87   4.00 178371  696.76   4.00 177174  692.08   4.00 179460  701.00  8  0 77 14  0
   0   92   4.00 171155  668.56   4.00 182048  711.11   4.00 182470  712.76  8  0 76 15  1
   0   92   4.00 173880  679.21   4.00 180768  706.13   4.00 180461  704.92  9  0 73 17  1

---

32KB sequential read, 8 processes per device

       timer     ipi  extint  user%  sys% intr% idle%  smpcol label
total           4200  128901                              122
cpu0     288       2   16588    0.0  20.2   3.1  76.7       9 Xgetpbuf_kva
cpu1     281       7   19557    2.3  26.4   4.7  66.7      23 X_vm_page_queue
cpu2     282       8   19685    1.6  30.2   1.6  66.7      18 Xvm_page_spin_l
cpu3     281       7   15856    1.6  25.6   0.8  72.1       8 Xrelpbuf
cpu4     281       6   14346    0.0  24.8   0.8  74.4      14 X_vm_page_queue
cpu5     280       4   14248    2.3  24.8   0.0  72.9       9 Xrelpbuf
cpu6     282       8   14244    0.8  18.6   2.3  78.3      39 X_vm_page_queue
cpu7     281    4158   14377    0.8  24.0   0.0  75.2       2 Xrelpbuf

      tty         nvme0                 nvme1                 nvme2              cpu
 tin tout   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s us ni sy in id
   0   93  32.00  33497 1046.80  32.00  47357 1479.90  32.00  46867 1464.60  1  0 25  1 73
   0   74  32.00  33460 1045.63  32.00  47346 1479.55  32.00  47039 1469.96  1  0 24  1 74
   0   76  32.00  33531 1047.85  32.00  47308 1478.36  32.00  47026 1469.55  0  0 24  2 73
   0   80  32.00  33583 1049.45  32.00  47324 1478.87  32.00  47035 1469.85  1  0 25  3 72
   0   79  32.00  33569 1049.03  32.00  47320 1478.76  32.00  46868 1464.61  1  0 27  1 71
   0   70  32.00  33545 1048.28  32.00  47332 1479.14  32.00  46932 1466.61  1  0 23  3 73
   0   73  32.00  33276 1039.88  32.00  47275 1477.33  32.00  46936 1466.74  1  0 23  3 72
   0   76  32.00  33349 1042.15  32.00  47325 1478.91  32.00  46915 1466.11  0  0 26  1 72

One thing to note here: the Intel card on nvme0 has a particularly poor
bandwidth ramp. It really takes ~30 processes to max it out, and I have
done this for the results below. But honestly, it shouldn't take that much
work to max out the Intel card, so I don't really consider the parameters
to be reasonable.
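As a rough cross-check of the three runs above, the iostat MB/s column looks
like it is simply KB/t times tps divided by 1024, and the per-device tps
values can be summed into an aggregate transfers/sec figure. The sketch
below is illustrative only: the helper names are made up, the 1024 divisor
is an assumption inferred from the numbers, and the rows are the final
sample line copied from each table.

```python
# Last iostat sample row of each run above, per device (nvme0, nvme1, nvme2):
# (KB/t, tps, reported MB/s). Values are copied from the tables, not measured here.
RUNS = {
    "512B random read": [
        (0.50, 193314, 94.39), (0.50, 174218, 85.07), (0.50, 167193, 81.64)],
    "4KB random read": [
        (4.00, 173880, 679.21), (4.00, 180768, 706.13), (4.00, 180461, 704.92)],
    "32KB sequential read": [
        (32.00, 33349, 1042.15), (32.00, 47325, 1478.91), (32.00, 46915, 1466.11)],
}


def mb_per_sec(kb_per_transfer, tps):
    """Bandwidth implied by transfer size and rate (assumes 1 MB = 1024 KB)."""
    return kb_per_transfer * tps / 1024.0


def aggregate_iops(devices):
    """Sum of the per-device tps values in one sample row."""
    return sum(tps for _kb, tps, _mb in devices)


def worst_mb_mismatch(devices):
    """Largest gap between reported MB/s and the value recomputed from KB/t and tps."""
    return max(abs(mb_per_sec(kb, tps) - mb) for kb, tps, mb in devices)


if __name__ == "__main__":
    for name, devices in RUNS.items():
        print(f"{name}: {aggregate_iops(devices)} transfers/sec total, "
              f"MB/s mismatch <= {worst_mb_mismatch(devices):.3f}")
```

Both random-read samples come out at roughly 535K transfers/sec in
aggregate, so going from 512 to 4096 bytes barely moves the IOPS, which
fits the CPU-limited observation at the top.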
      tty         nvme0                 nvme1                 nvme2              cpu
 tin tout   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s us ni sy in id
   0   86  32.00  43955 1373.60  64.00  24564 1535.25  64.00  12248  765.50  1  0 19  1 79
   0   86  32.00  44689 1396.54  64.00  25101 1568.78  64.00  24340 1521.22  1  0 21  1 77
   0   78  32.00  44713 1397.27  64.00  25120 1570.02  64.00  24754 1547.14  1  0 25  1 73
   0   75  32.00  44693 1396.66  64.00  25111 1569.43  64.00  24744 1546.50  1  0 22  1 76
   0   77  32.00  44675 1396.09  64.00  25117 1569.80  64.00  24750 1546.87  1  0 24  1 74
   0   79  32.00  44693 1396.67  64.00  25117 1569.79  64.00  24751 1546.91  1  0 24  1 73
   0   76  32.00  44721 1397.54  64.00  25127 1570.41  64.00  24760 1547.47  0  0 23  1 76
   0   77  32.00  44702 1396.94  64.00  25115 1569.66  64.00  24748 1546.72  0  0 23  1 76
   1   76  32.00  44711 1397.22  64.00  25124 1570.22  64.00  24758 1547.35  1  0 23  1 74

So at least through the raw device (/dev/nvme*s1b - a swap partition filled
with /dev/urandom data), we can achieve 4.5 GBytes/sec in aggregate from
these three cards sitting in a medium-end Haswell Xeon box with three
PCIe-2.0 slots.

--

In fact, I can go further. This machine has four hot-swap 2.5" drive bays.
I've thrown in four 2.5" SSDs and added those to the dd tests. These
results are very wide; make your browser wide:

da0: Fixed Direct Access SCSI-4 device
da0: 600.000MB/s transfers
da1: Fixed Direct Access SCSI-4 device
da1: 300.000MB/s transfers
da2: Fixed Direct Access SCSI-4 device
da2: 300.000MB/s transfers
da3: Fixed Direct Access SCSI-4 device
da3: 300.000MB/s transfers

Basically an aggregate of ... still around 4.5 GBytes/sec. The four SATA
SSDs (on an AHCI controller) added 995 MBytes/sec of bandwidth, but nvme2
seems to have lost about as much, so the aggregate has not changed. Not
sure why yet, but there were over 112 'dd' processes running, so a lot of
things could go wrong.
       timer     ipi  extint  user%  sys% intr% idle%  smpcol label
total         129539  132675                              498
cpu0     288    3647   10649    1.6  16.3   0.0  82.2      22 Xbus_dma_tag_lo
cpu1     281   13035    6574    0.0  21.7   1.6  76.7      58 Xbus_dma_tag_lo
cpu2     281   33626   73731    0.0  41.9  14.0  44.2       5 Xahcicam
cpu3     282   46494    6044    0.0  32.6   3.9  63.6     158 Xbus_dma_tag_lo
cpu4     281    5481    8810    0.8  22.5   1.6  75.2      30 Xbus_dma_tag_lo
cpu5     281    7387   11236    0.0  24.0   1.6  74.4      60 Xrelpbuf
cpu6     282    1284   10515    0.8  17.1   0.8  81.4      25 Xahcipo
cpu7     281   18585    5116    0.0  26.4   2.3  71.3     140 Xbus_dma_tag_lo

      tty          da0                   da1                   da2                   da3                  nvme0                 nvme1                 nvme2              cpu
 tin tout   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s   KB/t    tps    MB/s us ni sy in id
   0   88  32.00   8330  260.33  32.00   6621  206.89  32.00   8234  257.33  32.00   8233  257.30  32.00  44701 1396.94  64.00  25435 1589.70  64.00   9007  562.96  1  0 27  3 70
   0   90  32.00   8634  269.80  32.00   6617  206.77  32.00   8223  256.96  32.00   8220  256.87  32.00  44765 1398.92  64.00  25394 1587.13  64.00   8824  551.48  1  0 25  3 71
   0   85  32.00   8708  272.12  32.00   6706  209.56  32.00   8231  257.21  32.00   8227  257.09  32.00  44714 1397.30  64.00  25418 1588.64  64.00   8758  547.36  1  0 22  3 73
   0   88  32.00   8947  279.59  32.00   7226  225.81  32.00   8232  257.24  32.00   8230  257.18  32.00  44593 1393.52  64.00  25423 1588.95  64.00   8374  523.36  1  0 27  4 68
   0   84  32.00   9068  283.36  32.00   7657  239.27  32.00   8230  257.17  32.00   8228  257.11  32.00  44659 1395.58  64.00  25419 1588.66  64.00   8088  505.47  1  0 25  4 70
   0   87  32.00   9013  281.65  32.00   7978  249.30  32.00   8232  257.24  32.00   8229  257.15  32.00  44861 1401.89  64.00  25422 1588.89  64.00   7953  497.05  1  0 26  2 70
   0   87  32.00   9086  283.92  32.00   8065  252.02  32.00   8235  257.33  32.00   8232  257.24  32.00  44716 1397.37  64.00  25430 1589.36  64.00   7877  492.29  1  0 25  3 71
   0   81  32.00   9237  288.67  32.00   8138  254.33  32.00   8234  257.33  32.00   8232  257.26  32.00  44656 1395.50  64.00  25432 1589.52  64.00   7764  485.28  1  0 30  4 65
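
The aggregate bandwidth figures quoted in the text can be checked by summing
the MB/s columns of a sample row. This is a minimal sketch, assuming the
last iostat row of the NVMe-only run and of the combined SATA + NVMe run is
representative; the dictionary and helper names are hypothetical, and other
rows give slightly different sums.

```python
# MB/s values copied from the last iostat sample row of each sequential-read run.
NVME_ONLY = {"nvme0": 1397.22, "nvme1": 1570.22, "nvme2": 1547.35}
MIXED = {
    "da0": 288.67, "da1": 254.33, "da2": 257.33, "da3": 257.26,
    "nvme0": 1395.50, "nvme1": 1589.52, "nvme2": 485.28,
}


def total_mb(sample, prefix=""):
    """Sum MB/s over devices whose name starts with prefix (all devices if empty)."""
    return sum(mb for name, mb in sample.items() if name.startswith(prefix))


if __name__ == "__main__":
    print(f"NVMe only:        {total_mb(NVME_ONLY):8.2f} MB/s")
    print(f"SATA + NVMe:      {total_mb(MIXED):8.2f} MB/s")
    print(f"  SATA share:     {total_mb(MIXED, 'da'):8.2f} MB/s")
    print(f"  NVMe share:     {total_mb(MIXED, 'nvme'):8.2f} MB/s")
    # How much nvme2 dropped between the two runs.
    print(f"  nvme2 lost:     {NVME_ONLY['nvme2'] - MIXED['nvme2']:8.2f} MB/s")
```

With these rows the two totals land within about 15 MB/s of each other
(roughly 4515 vs 4528 MB/s), and the bandwidth the SATA drives contribute is
close to what nvme2 gives up, which matches the observation that the
aggregate has not changed.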