OpenGL for ZSNES under Windows.

Strictly for discussing ZSNES development and for submitting code. You can also join us on IRC at irc.libera.chat in #zsnes.
Please, no requests here.

Moderator: ZSNES Mods

Reznor007
Lurker
Posts: 118
Joined: Fri Jul 30, 2004 8:11 am

Post by Reznor007 »

MaxSt wrote:
Reznor007 wrote:What if I feel like compiling a new MAME build while I play ZSNES?
I'd say it's your problem.
Reznor007 wrote:The way I see it, if you can offload some work onto a separate device, and devote more CPU time to improving emulation then it's an easy answer.
Emulation won't be improved automatically just because you throw more CPU cycles at it. After you've achieved 60/60, what else is the CPU supposed to do?

MaxSt.
I'm talking about how ZSNES is currently being rewritten in C, which will slow it down, and is improving in accuracy, which will slow it down more.

I really don't see why you are so against the idea of this. All it does is allow enhancement filters for basically no CPU cost.
Deathlike2
ZSNES Developer
Posts: 6747
Joined: Tue Dec 28, 2004 6:47 am

Post by Deathlike2 »

I believe MaxSt is making his decision from a development standpoint rather than from a user's ideal standpoint.

SNES emulation isn't going to get so much more demanding that it requires new processors, and the spare CPU cycles are better spent on software (CPU-based) filters.

It would be nice to run some of those filters on the video card, but consider this: the baseline video card needed to do it would probably have to be more capable than the integrated solutions currently out there.

The majority of systems out there ship with POS video cards, which gives no incentive to go to such a method.

Besides... the lowest common denominator ends up having more powerful cpus.

Even if you wanted to implement something that utilizes the video card, it would probably waste development time on quirks or minor things that can affect the output (CPU rendering is essentially foolproof).

Look at it down the road... 2 years from now.. CPUs will be fast enough to handle HQ4x perfectly... why bother?
MaxSt
ZSNES Developer
Posts: 113
Joined: Wed Jul 28, 2004 7:07 am
Location: USA

Post by MaxSt »

Nightcrawler wrote:I agree. Just because we have powerful machines, does that mean we should all code inefficiently?
ZSNES code is very efficient and highly optimized.
Nightcrawler wrote:If an obvious opportunity arises to free some CPU time.. why not take it?
It was proven that it's possible to implement 2xSaI in shaders, but that way you'll free maybe 1% of CPU time. What's the point? Do it if you want, but it's a lot of work, and no one will say "thank you" at the end, because nobody will notice an additional 1% of CPU time.
Nightcrawler wrote:Your mentality is basically saying who cares about code optimization if the CPU can already run it at full speed. I'd say most programmers would not share your views there.
I care a lot about optimization. The only reason that ZSNES is playable with hq4x filter is because I highly optimized it, using lots of MMX. Other emulators like AdvanceMAME or VBA are using very unoptimized, very slow C code for HQ filters. Lecture them about optimization.

MaxSt.
Reznor007
Lurker
Posts: 118
Joined: Fri Jul 30, 2004 8:11 am

Post by Reznor007 »

MaxSt wrote:
Nightcrawler wrote:I agree. Just because we have powerful machines, does that mean we should all code inefficiently?
ZSNES code is very efficient and highly optimized.
Nightcrawler wrote:If an obvious opportunity arises to free some CPU time.. why not take it?
It was proven that it's possible to implement 2xSaI in shaders, but that way you'll free maybe 1% of CPU time. What's the point? Do it if you want, but it's a lot of work, and no one will say "thank you" at the end, because nobody will notice an additional 1% of CPU time.
Nightcrawler wrote:Your mentality is basically saying who cares about code optimization if the CPU can already run it at full speed. I'd say most programmers would not share your views there.
I care a lot about optimization. The only reason that ZSNES is playable with hq4x filter is because I highly optimized it, using lots of MMX. Other emulators like AdvanceMAME or VBA are using very unoptimized, very slow C code for HQ filters. Lecture them about optimization.

MaxSt.
Of course, AdvanceMAME can be compiled for non-x86 CPUs. And just because something is ASM doesn't mean it's faster than C. Ask RBelmont or Farfetch'd from Modeler/MAME.
byuu

Post by byuu »

Offtopic
And just because something is ASM doesn't mean it's faster than C. Ask RBelmont or Farfetch'd from Modeler/MAME.
Sure, if you suck at writing assembly code. This is just one of those arguments high-level language programmers made up to justify to themselves that writing code in C instead of assembly is perfectly ok. It is, but saying well written C can generate binary code that's just as fast as well written assembly is just stupid. Arguing that well written C is faster than poorly written assembly is blatantly obvious, and even more stupid.
In order to write effective x86 code, you need to understand how the processor works (and optimize for all processors, not just one type of x86 processor) and be able to recognize which routines need to be in assembly and which don't matter. Most people don't understand this and then wonder why their compiler generated code runs faster than their assembly code.

I almost always gain at least a 200% speed increase when I rewrite a routine in assembly, and I have nearly ten years' experience in both C and assembly.
Reznor007
Lurker
Posts: 118
Joined: Fri Jul 30, 2004 8:11 am

Post by Reznor007 »

byuusan wrote:Offtopic
And just because something is ASM doesn't mean it's faster than C. Ask RBelmont or Farfetch'd from Modeler/MAME.
Sure, if you suck at writing assembly code. This is just one of those arguments high-level language programmers made up to justify to themselves that writing code in C instead of assembly is perfectly ok. It is, but saying well written C can generate binary code that's just as fast as well written assembly is just stupid. Arguing that well written C is faster than poorly written assembly is blatantly obvious, and even more stupid.
In order to write effective x86 code, you need to understand how the processor works (and optimize for all processors, not just one type of x86 processor) and be able to recognize which routines need to be in assembly and which don't matter. Most people don't understand this and then wonder why their compiler generated code runs faster than their assembly code.

I almost always gain at least a 200% speed increase when I rewrite a routine in assembly, and I have nearly ten years' experience in both C and assembly.
I'm not the one saying it, it's the devs that have written several emulators using both C and ASM.
MaxSt
ZSNES Developer
Posts: 113
Joined: Wed Jul 28, 2004 7:07 am
Location: USA

Post by MaxSt »

Reznor007 wrote:I'm not the one saying it, it's the devs that have written several emulators using both C and ASM.
Show me the quotes.

MaxSt.
funkyass
"God"
Posts: 1128
Joined: Tue Jul 27, 2004 11:24 pm

Post by funkyass »

snes9x is potentially slower than zsnes. That may be more a matter of design than of the choice of tool, however. The core on 80x86 machines is ASM, not C.
byuu

Post by byuu »

I'm guessing the most likely reason for snes9x being slower is a mix of design and the PPU being written in C/C++.
I don't know anything about audio emulation, but I do know the PPU requires (around) twice the processing power to emulate than both the 65816 and spc700 combined.
Clements
Randomness
Posts: 1172
Joined: Wed Jul 28, 2004 4:01 pm
Location: UK

Post by Clements »

OpenGL would also allow hardware bilinear filtering over the top of a software filter at no performance penalty. Some software filters look much better this way in my opinion. Visualboy Advance does this.
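To illustrate what the hardware filter computes: in OpenGL this is usually switched on by setting GL_TEXTURE_MAG_FILTER to GL_LINEAR with glTexParameteri, after which the card blends the four nearest texels for every output pixel. Below is a minimal CPU-side sketch of that blend in C, assuming an 8-bit single-channel image. The name sample_bilinear is made up for this example; it is not ZSNES or Visualboy Advance code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bilinearly sample an 8-bit single-channel image
   of size w x h at fractional coordinates (fx, fy). Coordinates outside
   the image are clamped to the nearest edge. */
static unsigned char sample_bilinear(const unsigned char *src, int w, int h,
                                     float fx, float fy)
{
    int x0, y0, x1, y1;
    float tx, ty, top, bot;

    assert(src != NULL && w > 0 && h > 0);

    /* Clamp the sample position to the image. */
    if (fx < 0.0f) fx = 0.0f;
    if (fy < 0.0f) fy = 0.0f;
    if (fx > (float)(w - 1)) fx = (float)(w - 1);
    if (fy > (float)(h - 1)) fy = (float)(h - 1);

    /* The four neighbouring pixels and the fractional offsets. */
    x0 = (int)fx;
    y0 = (int)fy;
    x1 = (x0 + 1 < w) ? x0 + 1 : x0;
    y1 = (y0 + 1 < h) ? y0 + 1 : y0;
    tx = fx - (float)x0;
    ty = fy - (float)y0;

    /* Blend horizontally on both rows, then vertically between rows. */
    top = src[y0 * w + x0] * (1.0f - tx) + src[y0 * w + x1] * tx;
    bot = src[y1 * w + x0] * (1.0f - tx) + src[y1 * w + x1] * tx;
    return (unsigned char)(top * (1.0f - ty) + bot * ty + 0.5f);
}
```

On the GPU this blend comes for free with the texture fetch, which is why it can sit on top of a software filter without costing CPU time.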
bohdy
Rookie
Posts: 13
Joined: Sun Feb 13, 2005 9:28 pm

Post by bohdy »

"Moe" has finally released his latest hq2x patch for dosbox, including a gpu accelerated version as promised.

http://sourceforge.net/tracker/index.ph ... tid=467234

If anyone is feeling up to compiling it with SDL1.3, then you can give it a go, or you can just look at the code.

Is this proof enough for you, Max?
Last edited by bohdy on Sat Mar 05, 2005 7:30 pm, edited 1 time in total.
Reznor007
Lurker
Posts: 118
Joined: Fri Jul 30, 2004 8:11 am

Post by Reznor007 »

MaxSt wrote:
Reznor007 wrote:I'm not the one saying it, it's the devs that have written several emulators using both C and ASM.
Show me the quotes.

MaxSt.
http://tinyurl.com/6obcm [use tinyurl for great justice - grinvader]

That's just 1 topic though. It doesn't contain the quote I mentioned, but Rbelmont is the guy that said it. There were more but the MAME board only goes back 3 months.
MaxSt
ZSNES Developer
Posts: 113
Joined: Wed Jul 28, 2004 7:07 am
Location: USA

Post by MaxSt »

Reznor007 wrote:And just because something is ASM doesn't mean it's faster than C.
When I choose to implement something in ASM, it's because I know it will be faster. I know the advantages of MMX and SSE, and I can see when some function will benefit from MMX or SSE. It's a matter of experience.

MaxSt.
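As a rough illustration of the kind of routine that benefits from SIMD: blending two rows of pixel bytes is a typical inner loop in interpolating filters, and SSE2 can average 16 bytes per instruction with _mm_avg_epu8. The sketch below uses C intrinsics and falls back to a scalar loop when SSE2 is not available. The name average_bytes is hypothetical; this is not taken from ZSNES, whose filters, per the posts above, are hand-optimized with MMX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: dst[i] = rounded average of a[i] and b[i]. */
static void average_bytes(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                          size_t n)
{
    size_t i = 0;

    assert(dst != NULL && a != NULL && b != NULL);

#if defined(__SSE2__)
    /* Process 16 bytes per iteration; _mm_avg_epu8 computes
       (x + y + 1) >> 1 for each unsigned byte lane. */
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_avg_epu8(va, vb));
    }
#endif

    /* Scalar tail (or the whole array when SSE2 is unavailable),
       using the same rounding as the SIMD path. */
    for (; i < n; ++i)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}
```

Whether a compiler would vectorize the scalar loop on its own depends on the compiler and flags, which is part of what the C versus ASM argument in this thread is about.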
Aerdan
Winter Knight
Posts: 467
Joined: Mon Aug 16, 2004 10:16 pm

Post by Aerdan »

Reznor007 wrote:long-assed link
Shorten that URL, goddamnit.
Shogetsu
Rookie
Posts: 29
Joined: Thu Jul 29, 2004 1:36 am

Post by Shogetsu »

From what I remember, it's not exactly as Reznor describes. The discussions involving Modeler were more along the lines of "why don't you write this MAME driver in asm, it will be goddamn faster, I'm sure!", and Belmont's replies were that asm isn't the best choice and that the speed gain isn't worth the loss of compatibility and other issues. He used Modeler as an example: it is faster than the MAME driver, the emulation was more or less the same at the time, and to gain that speed they didn't use asm, hacks, or anything similar. It was a mix of C and C++.

Those discussions were a long, long time ago... maybe around two years ago.
I'll tell you the meaning of life: It's not to live, but to die
Noxious Ninja
Dark Wind
Posts: 1271
Joined: Thu Jul 29, 2004 8:58 pm
Location: Texas

Post by Noxious Ninja »

Aerdan wrote:
Reznor007 wrote:long-assed link
Shorten that URL, goddamnit.
Better?
[u][url=http://bash.org/?577451]#577451[/url][/u]
MaxSt
ZSNES Developer
Posts: 113
Joined: Wed Jul 28, 2004 7:07 am
Location: USA

Post by MaxSt »

Clements wrote:OpenGL would also allow hardware bilinear filtering over the top of a software filter at no performance penalty.
That functionality already exists in ZSNES (Win), even without OpenGL.
In so-called "S" modes.

MaxSt.
Aerdan
Winter Knight
Posts: 467
Joined: Mon Aug 16, 2004 10:16 pm

Post by Aerdan »

Noxious Ninja wrote:
Aerdan wrote:
Reznor007 wrote:long-assed link
Shorten that URL, goddamnit.
Better?
Har har. No.
MaxSt
ZSNES Developer
Posts: 113
Joined: Wed Jul 28, 2004 7:07 am
Location: USA

Post by MaxSt »

bohdy wrote: Is this proof enough for you, Max?
OK, I looked into it, and that's what I found in the comments:
This code implements a hardware-accelerated, cross-platform OpenGL
based scaler quite similar to the well known Hq2x..Hq4x suite of
software scalers. The general idea is exactly the same, but nothing
else remain.

...

- Okay, software Hq2x may be slightly faster.
So the guy created his own filter.

MaxSt.
Reznor007
Lurker
Posts: 118
Joined: Fri Jul 30, 2004 8:11 am

Post by Reznor007 »

Shogetsu wrote:From what I remember, it's not exactly as Reznor describes. The discussions involving Modeler were more along the lines of "why don't you write this MAME driver in asm, it will be goddamn faster, I'm sure!", and Belmont's replies were that asm isn't the best choice and that the speed gain isn't worth the loss of compatibility and other issues. He used Modeler as an example: it is faster than the MAME driver, the emulation was more or less the same at the time, and to gain that speed they didn't use asm, hacks, or anything similar. It was a mix of C and C++.

Those discussions were a long, long time ago... maybe around two years ago.
Yeah, the basic thing he was saying was that the algorithm matters more than the language used (I think that was really close to the exact phrase he used).
Reznor007
Lurker
Posts: 118
Joined: Fri Jul 30, 2004 8:11 am

Post by Reznor007 »

MaxSt wrote:
bohdy wrote: Is this proof enough for you, Max?
OK, I looked into it, and that's what I found in the comments:
This code implements a hardware-accelerated, cross-platform OpenGL
based scaler quite similar to the well known Hq2x..Hq4x suite of
software scalers. The general idea is exactly the same, but nothing
else remain.

...

- Okay, software Hq2x may be slightly faster.
So the guy created his own filter.

MaxSt.
It may not produce 100% exact copies of the software version, but it's supposedly similar (I haven't seen side-by-side shots). It's also something that could now be added to ZSNES without too much effort, since Pagefault said he was going to add the basic OpenGL code, and the source for the filter is already available.
MaxSt
ZSNES Developer
Posts: 113
Joined: Wed Jul 28, 2004 7:07 am
Location: USA

Post by MaxSt »

Reznor007 wrote: It may not produce 100% exact copies of the software version, but it's supposedly similar (I haven't seen side-by-side shots).
I guess it's not similar at all. But we'll see.
Reznor007 wrote:It's also something that could now be added to ZSNES
There is no point in doing this right now. It's slower than software hq2x.
And how do we know it looks good?

MaxSt.
Kagerato
Lurker
Posts: 153
Joined: Mon Aug 09, 2004 1:40 am

Post by Kagerato »

And how do we know it looks good?
By "we", you mean the general ZSNES userbase?

In terms of widely-used programs, there tends to be a high probability that, if a feature can be coded, someone out there will use it. In terms of visual quality, everyone has a distinct opinion. Therefore I find it difficult to dismiss any filter based on the opinion of a small number of individuals, especially when/if the code is actively maintained.
MaxSt
ZSNES Developer
Posts: 113
Joined: Wed Jul 28, 2004 7:07 am
Location: USA

Post by MaxSt »

Kagerato wrote:if a feature can be coded, someone out there will use it.
Coding features takes effort. If only a couple of people will use a feature, it's not worth the developers' effort. Especially if it requires a total rewrite of all the Windows graphics code.
Kagerato wrote:In terms of visual quality, everyone has a distinct opinion.
There are no screenshots yet available. So what is your opinion on visual quality?
Is it worth all that code-rewriting effort? I'd say it's a little premature to ask that question.

MaxSt.
bohdy
Rookie
Posts: 13
Joined: Sun Feb 13, 2005 9:28 pm

Post by bohdy »

MaxSt wrote:There are no screenshots yet available. So what is your opinion on visual quality?
Is it worth all that code-rewriting effort? I'd say it's a little premature to ask that question.

MaxSt.
*time passes*

Well, 'Moe' coded the scaler into Dosbox a little while ago, but I have only recently been able to try it myself on my new Radeon 9550. My impressions are quite positive!

Here is a comparison that I put together:

Image

Both images are at 3X scaling, with one of them being scaled with HQ3X, and the other scaled on my GPU with the openglhq shader.

Can you recognize your original scaler, Max? :wink: :twisted:

As you can see, they are not that far apart, but openglhq has the advantage of no fixed scaling factor, i.e. you can scale at 1.5X, 5.3X, etc.
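For anyone wondering why a GPU scaler isn't tied to 2X/3X/4X: the output is produced by mapping each destination pixel back to a source position, so the destination size can be anything. Here is a bare-bones C sketch of that mapping, using plain nearest-neighbour lookup rather than the openglhq interpolation (which I have not reproduced here). The name scale_nearest is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: resize an 8-bit single-channel image from
   sw x sh to dw x dh by mapping every destination pixel back to the
   source pixel it falls on (nearest neighbour, no interpolation). */
static void scale_nearest(const unsigned char *src, int sw, int sh,
                          unsigned char *dst, int dw, int dh)
{
    int x, y;

    assert(src != NULL && dst != NULL);
    assert(sw > 0 && sh > 0 && dw > 0 && dh > 0);

    for (y = 0; y < dh; ++y) {
        /* Integer mapping: the scale factor sh/dh need not be whole. */
        int sy = (int)(((long)y * sh) / dh);
        for (x = 0; x < dw; ++x) {
            int sx = (int)(((long)x * sw) / dw);
            dst[y * dw + x] = src[sy * sw + sx];
        }
    }
}
```

A fixed-factor filter like HQ3X instead writes a fixed block of output pixels per source pixel, which is why its factor is baked in.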

As for speed, it runs 40%+ faster than the software equivalent on my card, although that compares against Moe's software adaptation of Max's original HQ2X, which he claims is faster than Max's optimised version (which I can't directly compare speed-wise).

So, worth coding into Zsnes, or what?