Member
      
Group: Forum Members
Last Login: 9/5/2006 4:37 AM
Posts: 18,
Visits: 78
|
|
| Yep, the Marvell and Nvidia were 'onboard', or LOM (LAN on motherboard). The way I understand it, they're still attached to the onboard PCI bus. Tytus can probably explain more. The Intel was a PCI card.
|
|
|
|
|
Member
      
Group: Forum Members
Last Login: 12/30/2006 10:58 PM
Posts: 21,
Visits: 103
|
|
jeepdave (8/12/2006) Yep, the Marvell and Nvidia were 'onboard', or LOM (lan on motherboard). The way I understand it, it's still attached to the onboard PCI bus. Tytus can probably explain more.
The Intel was a PCI card.
Hmmmm, I'd like to see something of a standalone PCIe NIC. I still believe on-board devices carry a penalty from CPU overhead (maybe a small penalty, but I am very detailed in how I build and configure my machines, to ensure that I can get as many CPU cycles for the games I play instead of mired in the muck of Windows....).
|
|
|
|
|
Noob
      
Group: Forum Members
Last Login: 9/20/2006 9:23 AM
Posts: 7,
Visits: 47
|
|
jeepdave (8/12/2006)
I'll do ya one better. Here are the actual screenshots from the PCATTCP test I ran a couple of weeks back. I ran each test 3 times; the network card tested was typed in the command line, then a screenshot was taken.
Many thanks for the details on the test. I've done some tests of my own, matching your configurations and conditions to a large extent. Here are the results:
Transmission test system:
Motherboard: Asus K8N-VM CSM (nForce 430/6150)
Processor: AMD X2 3800+ (stock)
RAM: 2 GiB OCZ
NICs: Intel(R) PRO/1000 MT Server Adapter (in PCI 32)
SysKonnect SK-9E21D (Marvell Yukon) (PCI-Express)
NVIDIA nForce (onboard)
Receiver test system:
Motherboard: Tyan S5151
Processor: Intel 2.8 GHz (no HT)
RAM: 1 GiB
NIC: Onboard Broadcom NetXtreme BCM5721
(In case this MB is significantly different from yours, I also tested using a different, nForce3, MB as receiver, and the results were not materially different.)
Systems were connected via a D-Link DGS-1008D switch using standard CAT5e cables.
The test was changed slightly, adding only -n 60000 to the command-line parameters. This had no logical effect on the test; it merely increased the sampling time to give more reliable sustained-performance results.
Top Results over 3 tests were as follows:
Intel:
491520000 bytes in 9.25 real seconds = 51891.89 KB/sec
calls/sec: 6486.70
Marvell:
491520000 bytes in 6.13 real seconds = 78367.35 KB/sec
calls/sec: 9796.24
nVIDIA:
491520000 bytes in 6.38 real seconds = 75294.12 KB/sec
calls/sec: 9412.08
I illustrate these figures using graphs similar to yours, including your results.
(As PCATTCP actually reports performance in KiB/s, I converted them to proper MB/s using:
KiB/sec * 1024 (to get to Bytes/s) / 10^6 (to get to MB/s).
Please see Wikipedia's MiB definitions if needed.
This part only makes around a 2.4% difference, so is not material.)
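For anyone who wants to redo the conversion, here is a minimal Python sketch of the formula above, applied to the three top results. The function names are my own, and the Mbit/s figure is just an extra convenience (bytes times 8), not something PCATTCP reports.

```python
KIB = 1024  # bytes per KiB, the unit PCATTCP labels as "KB"


def kib_per_s_to_mb_per_s(kib_per_s):
    """KiB/s -> decimal MB/s: multiply by 1024 to get bytes/s, divide by 10^6."""
    return kib_per_s * KIB / 1e6


def kib_per_s_to_mbit_per_s(kib_per_s):
    """KiB/s -> decimal Mbit/s (bytes/s times 8 bits, divided by 10^6)."""
    return kib_per_s * KIB * 8 / 1e6


# Top results from the three runs, as reported by PCATTCP (KiB/s).
results_kib_s = {"Intel": 51891.89, "Marvell": 78367.35, "nVIDIA": 75294.12}

for nic, rate in results_kib_s.items():
    mb = kib_per_s_to_mb_per_s(rate)
    mbit = kib_per_s_to_mbit_per_s(rate)
    print(f"{nic}: {mb:.2f} MB/s ({mbit:.1f} Mbit/s)")
```

The ratio between the two units is always 1024/1000, which is where the roughly 2.4% figure comes from.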


So what were the differences that caused the vast differences in results?
(1) I think my nVIDIA NIC might have better tuning than yours by default (internally)
(2) I think my Intel Server NIC might be a bit better than your Intel Desktop NIC
(3) I did not stop at just the default NIC parameters, and tuned them a bit based on results
To illustrate, I did another test with my nVIDIA NIC's parameters set to default. This clearly invoked performance throttling, and gave the following results.
nVIDIA - optimization CPU:
491520000 bytes in 8.00 real seconds = 60000.00 KB/sec
calls/sec: 7500.25
Even so, this result still far exceeded yours. This test is as close to your configuration as I can reasonably get, and the result looks reasonable to me. I don't know why your performance was so much worse.
That said, it would be misleading for me to suggest that all default-parameter performance was as high as this. I noticed significantly slower performance in some cases, and these essentially had to do with the interrupt moderation parameters set within the NICs by default. Even then, performance at worst roughly matched your stated Killer NIC figures, and at best still well exceeded them.
Unfortunately, I don't have a KillerNIC to try tuning myself -- I guess we'll have to leave all that to independent third-party reviewers. I've spent a lot of personal time trying to understand why your stated performance figures looked so low for gigabit, and I think I've gained some understanding by doing these tests, and will probably just leave it at this point.
A sample test screenshot follows; I removed my MAC addresses because I didn't see the point of sharing them. I'll also spare you the massive scrolling by uploading only one of them.

|
|
|
|
|
ELN Board Member, Bigfoot Networks CEO
Group: Administrators
Last Login: 9/8/2008 7:13 PM
Posts: 266,
Visits: 636
|
|
Thanks for the post Madwand... very informative! Obviously our test setups are different. I think the main difference may be the -n 60000 that you used... (so I'll see if I can get jeepdave to reproduce that).
In the meantime, I think it's interesting that we both had nFORCE and Pro 1000's, and got different results: (clearly indicating a different test setup).
I would wager that Killer would have similar performance relative to the Intel Pro 1000 in your setup (but you don't have a Killer to try it). I hope you correct that!
Killer consistently outperforms everything we have in the lab in tons of metrics, and we recently went to another lab (cyberjocks.com) and did some real-world testing:
http://www.endlagnow.org/ELNForums/Topic444-12-1.aspx
I too can't wait for reviewers to have some fun with Killer!!!
Tytus
------------------------- [ELN]Tytus - EndLagNow.ORG
Member of the Board of Directors of ELN
CEO + Mad Scientist of Bigfoot Networks, Inc.
http://www.bigfootnetworks.com
|
|
|
|
|
Noob
      
Group: Forum Members
Last Login: 9/20/2006 9:23 AM
Posts: 7,
Visits: 47
|
|
Tytus (8/13/2006) I think the main difference may be the -n 60000 that you used... (so I'll see if I can get jeepdave to reproduce that).
No, that's not it. When I remove it I get similar results. All the n=60000 does is increase the duration of the test. With GbE working at anywhere near capacity, the default n=2048 buffers are sent in a fraction of a second. I didn't like the duration of a test being so short -- I think it increases the impact of luck -- so I extended it. With that, I could also watch the performance graph, confirm that the tool was reporting performance observed elsewhere, and have a more stable average performance figure due to the increased duration.
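To put rough numbers on that, here is a small Python sketch of how long each run lasts. The 8192-byte buffer size follows from the totals in the results above (491520000 bytes / 60000 buffers); the 80 MB/s throughput is an assumed round figure close to the Marvell result, and the function name is my own.

```python
BUFFER_BYTES = 8192  # 491520000 bytes / 60000 buffers, per the results above


def run_duration_s(num_buffers, throughput_mb_s, buffer_bytes=BUFFER_BYTES):
    """Seconds needed to send num_buffers buffers at a given decimal MB/s rate."""
    total_bytes = num_buffers * buffer_bytes
    return total_bytes / (throughput_mb_s * 1e6)


ASSUMED_MB_S = 80.0  # assumption: roughly what a well-tuned GbE NIC reached here

for n in (2048, 60000):
    print(f"n={n}: about {run_duration_s(n, ASSUMED_MB_S):.2f} s")
```

At that rate the default n=2048 run is over in about a fifth of a second, while n=60000 runs for around six seconds, which is long enough to watch on a performance graph.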
|
|
|
|
|
ELN Board Member, Bigfoot Networks CEO
Group: Administrators
Last Login: 9/8/2008 7:13 PM
Posts: 266,
Visits: 636
|
|
Thanks,
I'll see what we can do to try to reproduce your settings and get a Killer number.
The relative results are still valid for our test setup... and I really would like to see our Killer in your environment.
Tytus
|
|
|
|
|
Member
      
Group: Forum Members
Last Login: 12/30/2006 10:58 PM
Posts: 21,
Visits: 103
|
|
Impressive Madwand! I look forward to Tytus giving you a shot at testing a KNic in your setup! *hint* *hint*
That would be the only way to place the KNic in the same environment as the one you tested.
I suppose I'll stick with my SysKonnect card....for now...
Many thanks!
|
|
|
|
|
Noob
      
Group: Forum Members
Last Login: 8/14/2006 9:41 AM
Posts: 4,
Visits: 1
|
|
In case anyone cares, my Intel Pro1000 MT gets about 5000 calls/sec and 40 MB/s on a 1.3 GHz PM, Asus VM 478 w/479 slocket, w2k system.
The Syskonnect, and probably the Nvidia LOM too, are not on the PCI bus, so it's not really a fair test (in comparison, the Linux version of ttcp easily pushes > 960 Mbit/s on a PCI-X / PCI-E bus).
Anyway, here's an experiment: push the interrupts on the Intel card to Extreme, since that's what's going to get you lowest latency. That pushes down my calls/sec and bandwidth into jeepdave range.
I don't like this test, though. The receiver system has no role whatsoever in this test. I want to see round-trip times, both idle and with a CPU / graphics load.