Are we using the right benchmarks for our reviews?



mistert
01-12-2004, 04:18 AM
I don't know what Cam thinks of me posting this thread, but I'm going to on the thought that we could use more community input on our testing.

I'm reaching out to you guys (and gal). I'm sure everyone at TweakTown would love suggestions for adding benchmarks, removing others, or a combo of both. Below is a list I compiled of the benchmarks TT uses for different types of reviews. From the list, please give your thoughts on the reliability of these benchmarks and ideas on what new tests could be included. This goes for overclocking and more.

One more thing before I get the list out. Does anyone have any thoughts on the base test systems used on TT? In our reviews do we have enough comparisons (ie. 5950 vs. 9800XT, etc.) so that you can decide which board performs best? Do we provide enough information to help you decide if a product is worth your money? What questions do we leave unanswered?

THE LIST:

VIDEO CARDS:
3DMark03 (heading out the door)
Aquamark3
Halo
Unreal Tournament 2003 (When '04 comes, should we switch, use both, ideas?)
Quake 3
Code Creatures
Comanche4
Jedi Knight 2

- For video cards, we currently use the above. One of the main concerns is whether or not we should continue supporting FutureMark benchmarks. There has been an almost intolerable number of discrepancies, mostly with the 3DMark benchmark, and we would like to know how our readers feel about this.

CPU/CHIPSET:
SANDRA2004
PCMark2002(2004?)
3DMark2001SE/2003
Comanche4
Jedi Knight 2
Quake 3
UT2003

- We currently use an almost identical setup for testing processors/chipsets at TweakTown. Is this adequate? Would you like to see more variety in our benchmarking? Perhaps we could run a Pi benchmark, memory latency tools, or something? Let us know!

STORAGE:
SANDRA2004
Media Encoding
Transfer Rate/DVD Decryption

- Perhaps we should grab a copy of HDTach? Do our storage reviews lack performance detail? Should we test noise levels as well? Would you like audio clips so you can hear for yourself how loud or quiet a storage device is?

That's all I've got for now, back to the SATA article. Please think through what I've said above and give us some feedback. We want the community to know that we need your input to improve our site and to make everyone's voice heard by the hardware manufacturers... Thanks everyone for any comments.

Mr.Tweak
01-14-2004, 11:58 AM
Anyone? :)

Beefy
01-14-2004, 12:04 PM
For CPU tests, you might also want to consider something along the encoding / decoding lines. Use a program to encode an MP3 from a WAV file, or convert VOB to DIVX or something like that.
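A timed encode like this could be scripted roughly as follows. This is only a sketch: `time_command` is a made-up helper name, the run count of 3 is arbitrary, and the encoder command mentioned in the comment is a placeholder rather than anything TT actually uses. The demo workload is a trivial Python one-liner so the script runs without an encoder installed.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is never timed as valid.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


if __name__ == "__main__":
    # Stand-in workload. For a real test, replace it with an encoder call
    # such as ["lame", "test.wav", "test.mp3"] (hypothetical file names),
    # using the same source file on every system being compared.
    demo_cmd = [sys.executable, "-c", "sum(range(1_000_000))"]
    print(f"median: {time_command(demo_cmd):.3f} s")
```

Taking the median of a few runs keeps one slow run (disk cache, background task) from skewing the result.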

kane2g
01-14-2004, 08:14 PM
how about trying to run some DC programs?
but make sure to use the same workunit to compare all the systems
:2cents:

mistert
01-15-2004, 01:32 AM
Any preference as to the distributed computing program used? Perhaps someone would like to give us a yay or nay on FutureMark benchmarks (e.g. 3DMark, PCMark) or BAPCo's SYSmark. It'd be nice if there were some suggestions for 3D testing benchmarks we could use.

Beefy
01-15-2004, 05:51 AM
Any DC should be fine, but I'd either go with SETI or F@H.

Aside from that, the testing is probably good enough to give an idea of how the products perform. In 6 months you might want to review it all again though. :)

Actually, putting in a new DX9 game could be a bonus.

aznx
01-15-2004, 07:18 AM
should use mm, united devices and distributed.net for dc benchmarks. hd tach would be really helpful. also try benching with max payne 2, s.t.a.l.k.e.r, xiii, need for speed underground, splinter cell, ffxi, prince of persia, tron 2.0, microsoft flight simulator, and a few newer ones?

PersianImmortal
01-15-2004, 07:38 PM
VIDEO CARDS:
3DMark03(heading out the door)
Aquamark3
Halo
Unreal Tournament 2003(When '04 comes, should we switch, use both, ideas?)
Quake3
Code Creatures
Comanche4
Jedi Knight 2

The following are my suggestions (with reasons underneath each)

Aquamark3
- Recent benchmark, seems OK

Unreal Tournament 2003 (then 2004 when it arrives)
- The Unreal engine is the basis for many games and is also a good indication of gaming performance

Code Creatures
- Ageing benchmark, but still seems powerful enough to use.

Jedi Academy
- Use instead of Jedi Outcast because it's a newer version of the Outcast engine, especially with the Dynamic Glow option, which is very intensive on ATi cards.

3DMark03 (but not 3DMark01)
- Yes, I know the controversy very well, but the latest version of 3DMark03 has removed a lot of the "optimizations". As long as Futuremark continues to try to remove such cheating, 3DMark03 is OK to include as long as it is part of a benchmark suite and not the sole indicator.

RTHDribl 1.2 (http://www.daionet.gr.jp/~masa/rthdribl/)
- This is one of my favorite demos and is a genuine, graphics-card only stress test/benchmark. It only works on PS2.0 or higher compatible cards, so it's a true DX9 test.

Not sure about using Prince of Persia or NFS Underground as these are console ports and can run on even low end cards with little stress. A recent full PC game like Call of Duty, or even a BF1942 benchmark (due to its popularity) would be much more popular and indicative of a graphics card's practical performance.

Fatguy3
01-15-2004, 09:07 PM
It's all about the games!

Although I like seeing 3DMark and Aquamark scores on different cards, they are almost completely useless. No one is going to be using the exact same system at the exact same ambient temperature with the exact same amount of thermal paste, etc., so synthetic benchmarks are pretty much useless to me. We all know how GFX cards stack up; ATI, for example, goes 9600Pro->9600XT->9700Pro->9800->9800Pro->9800XT.
(I only used cards that are available at retail)

Most everyone can recite that from memory and knows that the benchmarks will be higher as you keep going up the list.

But when it comes to certain games, that's where it all counts. Certain game engines might run better/worse with different settings on certain cards, and at least if I'm using a Barton 2500 and you're using a Barton 2500 with the same memory/HD speed (very possible), the FPS will most likely be the same, as other factors won't affect it much. So if I'm only getting 80 FPS in RTCW on my Ti4200, and I see you getting 150 or so with a different card, I can see a real-world gain in performance.

So, basically without rambling, I'm saying if you just stopped using synthetic benchmarks (I'd keep Sandra for CPU/Memory just because Intel vs AMD is kinda confusing) I wouldn't miss them. I can always head over to Tom's or Anand and get all the synthetic benchmark scores I need. You could carve out a little niche here by spending all the synthetic bench money on new games.

Game suggestions:

RTCW (most widely played Q3 engine)
NFS: Underground (DX9, and VERY well coded engine)
Jedi Academy (What Persian Said)
Max Payne 2: (Prop Engine, Great game, Substitute for 3d2001, both use same engine)
UT2k4 (When it comes out, I'm sure a ton of games will use its engine)
HL2 and Doom3 of course in the future.

PlanetSide or some other MMORPG.
NFL2k4 (just trying to include all genres)
Halo (Most Graphic Intensive DX9)

That's all I can think of now. Other than that, make sure you always throw in some other benches from different products on the same system. I really don't think a review is helpful unless it is at least a little bit of a roundup, as most people who don't own a game won't know what the average FPS is with which products.

Hope that helps. Peace.

Wiggo
01-15-2004, 09:15 PM
How about Science Mark (http://www.sciencemark.org/) for CPU's and chipsets/mobo's? :?:

SearanoX
01-16-2004, 05:16 AM
But when it comes to certain games, that's where it all counts. Certain game engines might run better/worse with different settings on certain cards, and at least if I'm using a Barton 2500 and you're using a Barton 2500 with the same memory/HD speed (very possible), the FPS will most likely be the same, as other factors won't affect it much. So if I'm only getting 80 FPS in RTCW on my Ti4200, and I see you getting 150 or so with a different card, I can see a real-world gain in performance.

That's odd. On my 3.0 Pentium 4, 512 PC3200, and Radeon 9800 Pro, I'm getting perhaps worse speeds than what you're getting on the Ti. Oftentimes, UT2003 runs better than RTCW.

I'm basing this on Enemy Territory, not the original, but it's still somewhat strange to be getting those scores.

As for the games to use...all of the ones that have been stated are pretty much fine. Can't think of anything else coming out in the near future that could be used...although, come to think of it now, Splinter Cell: Pandora Tomorrow might work out as a benchmark...it's going to be very CPU and DX9 intensive.

sepherum2ya
02-07-2004, 12:04 PM
I would like to see Real Storm's Raytrace benchmark under CPU/CHIPSET

http://www.realstorm.com/benchmark.html

PrairieDawg
02-07-2004, 12:53 PM
GL Excess is one of my favorite benching/testing programs. It's getting a bit long in the tooth, but there is supposed to be a new version in the near future.

amd_man2005
02-07-2004, 12:56 PM
Personally I think we should use PCMark2002 until PCMark2004 catches on, as PCMark2002 gives a good indication and a good way to compare scores.

zeradul
02-07-2004, 02:58 PM
When Doom3 comes out, it will be adopted as the sole relevant graphical benchmark. Doom 3 will have more new engine features than any game has introduced since Q1. Nearly all FPS's in the next 4 years will be based on the Doom 3 engine, and therefore, if your card plays D3 well, it will pretty much own any other petty game currently on the list.

E.G. Does it matter whether or not Card A gets 350 FPS in Jedi Knight 2 (Based on Q3) while Card B gets 400 FPS?? NO THAT DIFFERENCE IS IRRELEVANT ! (just as 3,000 FPS in Doom1 is irrelevant)
1.) No user can appreciate that difference.
2.) The difference is not, and never will be, applicable to other games!
3.) The Q3 engine is 4 years 4 months old, and still being used as a benchmark! D3 will have equal staying power.

Doom 3 will also provide a direct relevant ratio to all future games based on the engine, which people will be buying cards to play anyways.
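The 350 vs 400 FPS point is easier to see in per-frame times. A quick sketch of the arithmetic follows; the 30 vs 60 FPS pair is an added comparison for scale, not a figure from any review, and the function names are made up for illustration.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def frame_time_saved_ms(slower_fps, faster_fps):
    """How many milliseconds per frame the faster card saves."""
    return frame_time_ms(slower_fps) - frame_time_ms(faster_fps)


if __name__ == "__main__":
    # (350, 400) is the example above; (30, 60) is an assumed pair for contrast.
    for slow, fast in [(350, 400), (30, 60)]:
        saved = frame_time_saved_ms(slow, fast)
        print(f"{slow} -> {fast} FPS saves {saved:.2f} ms per frame")
```

Going from 350 to 400 FPS saves about 0.36 ms per frame, while going from 30 to 60 FPS saves about 16.67 ms, so the same-looking jump matters far more at low frame rates.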

Soulburner
02-07-2004, 03:10 PM
One statement:

John Carmack = God

sepherum2ya
02-08-2004, 12:54 AM
Well you can't forget about Half-Life 2 now...:D..Even though the first one was barely based off the Quake 1 engine.. And you could only get a max of 100 fps. HL2 will be different. John Carmack is good indeed, Doom III has the best shading and shadows I have ever seen. Half-Life 2 has the better overall game engine.. Especially with their Havok physics system.


But don't forget about my post on RayTrace.. No matter how fast your CPU is.. This benchmark will really test it. And it seems to have better performance on Athlons since they have 9 instructions per clock cycle rather than the Pentium 4's 6.

zeradul
02-08-2004, 02:07 AM
Well you can't forget about Half-Life 2 now...:D..Even though the first one was barely based off the Quake 1 engine.. And you could only get a max of 100 fps. HL2 will be different. John Carmack is good indeed, Doom III has the best shading and shadows I have ever seen. Half-Life 2 has the better overall game engine.. Especially with their Havok physics system.

Half-Life 2 will be used as a benchmark for a year or so, but nonetheless, if a system plays Doom3 well, it will play HL2 well, BUT if a system plays HL2 well, it will not necessarily play Doom3 well, or at all. That makes Doom3 a more relevant, longer-lasting benchmark.

sepherum2ya
02-08-2004, 02:54 AM
Well then don't forget about my RealStorm Raytrace idea. It's the best.. It makes the fastest CPUs crumble..

zeradul
02-08-2004, 04:00 AM
Wow, the 26 meg video on this page looks awfully graphically intense... http://www.realstorm.com/Screenshots.html

sepherum2ya
02-08-2004, 06:49 AM
Well, download the benchmark and run it on your fastest machine.. And you can see it full screen. The graphics engine is great, but it doesn't use your video card at all. It is all math for the CPU to do. It's also a great burn-in test if you make it loop.