English Amiga Board


Old 29 December 2020, 21:05   #41
VladR
Registered User

 
Join Date: Dec 2019
Location: North Dakota
Posts: 422
I only have a cell phone available right now, so I can't browse through the full MC68040.pdf, but from my limited understanding of the 68040, it has a 6-stage pipeline, and the full version (unlike the LC and EC variants) has an on-chip FPU.

What's the pipeline architecture like on the 040? Can you execute FP code in parallel with integer ops?
I would assume that, if that's even possible, the FP execution units would consume quite a few of the 6 stages.

But the idea is that I would interleave the FP code with integer code in the inner texturing loops.

While the 040 is busy fetching the texels and moving them around, the FP code would be computing the index into the current texture scanline (in parallel).

I'm doing something similar in one version of my flat shader on the 080: the FP unit computes the line equation during scanline traversal, while the integer unit does the scanline fill.
And because the scanline traversal fits into FP registers, there are more registers available for the scanline fill, without having to access RAM for variables. Win-win.

Can the 040 exploit the same parallelism between the FP and integer units?
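
To make the split concrete, here is a rough sketch in Python rather than 68k asm. The function name and arguments are made up for illustration, and Python obviously runs everything sequentially, so this only shows which work would go to which unit, not the actual overlap.

```python
def texture_span(texture_row, u_start, u_step, count):
    """Fill one span of `count` pixels from a single texture scanline.

    u_start and u_step are floats: stepping and truncating them is the
    work that would go to the FP unit. The list indexing and append stand
    in for the integer-unit work (texel fetch and store).
    """
    out = []
    u = u_start
    width = len(texture_row)
    for _ in range(count):
        index = int(u) % width            # "FP side": coordinate -> texel index
        out.append(texture_row[index])    # "integer side": fetch + store
        u += u_step                       # "FP side": step along the line
    return out


row = [10, 20, 30, 40]
span = texture_span(row, 0.0, 0.5, 8)
print(span)  # [10, 10, 20, 20, 30, 30, 40, 40]
```

On real hardware the interesting part is how long the index takes to travel from the FP side to the integer side, which this sketch cannot model.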
Old 29 December 2020, 21:53   #42
VladR
Registered User

 
Join Date: Dec 2019
Location: North Dakota
Posts: 422
Quote:
Originally Posted by VladR

What's the pipeline architecture like on the 040? Can you execute FP code in parallel with integer ops?
I would assume that, if that's even possible, the FP execution units would consume quite a few of the 6 stages.

...

Can the 040 exploit the same parallelism between the FP and integer units?
Curiosity got the best of me and I painfully browsed through the PDF on the tiny phone screen and found this in section 1.4:
"Floating point instructions in the FPU execute concurrently with the Integer instructions in the IU."

So, we can break down the texturing into two parallel processes and execute them on the FP and integer units.

The latency of FMOVE (the transfer of FP indices into the integer unit for addressing) will dictate the final efficiency, so, given the 6-stage pipeline, I suspect up to half a dozen versions might have to be written to minimize the pipeline bubbles.

But, it's doable :-)
Old 29 December 2020, 22:12   #43
VladR
Registered User

 
Join Date: Dec 2019
Location: North Dakota
Posts: 422
Quote:
If we used the 67% screen coverage estimate (it does get to 100% in tunnels, though), that's around 0.67*240*160 = ~25,700 px.
So, let's do some basic math.
A 33 MHz 040, at 60 fps, gives us 33,000,000/60 = 550,000 clocks per frame.

At 20 fps, each rendered frame spans 3 of those 60 Hz frames, which gives us 3*550,000 = 1,650,000 clocks.
1,650,000/25,700 = ~64 clocks per pixel.

A brief look at the 040 timing tables reveals that's not much at all and can easily be consumed by a simple MOVE.

Of course, there are 6 pipeline stages and a parallel FP execution unit.
But in terms of brute force, it's not much.
And we still need C2P, and other parts of gameplay, etc...

So, 20 fps might be hard to achieve on 040/33...
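
The same budget as a quick Python sketch, in case anyone wants to plug in other clock speeds or frame rates. The function name and parameters are made up for this post, and it ignores C2P, game logic and memory wait states entirely, so treat the result as an upper bound on what the texturing loop gets.

```python
def clocks_per_pixel(cpu_hz, target_fps, width, height, coverage):
    """Rough CPU clock budget per drawn pixel at a given frame rate."""
    pixels = coverage * width * height      # pixels actually textured per frame
    clocks_per_frame = cpu_hz / target_fps  # clocks available per rendered frame
    return clocks_per_frame / pixels


# 33 MHz 040, 20 fps, GBA-sized 240x160 screen, 67% coverage
budget = clocks_per_pixel(33_000_000, 20, 240, 160, 0.67)
print(round(budget))  # 64
```

Dropping the target to 15 fps with the same inputs raises the budget to roughly 85 clocks per pixel.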
Old 29 December 2020, 22:15   #44
d4rk3lf
Registered User

 
Join Date: Jul 2015
Location: Novi Sad, Serbia
Posts: 1,373
As a noob, I am not sure what you are asking, and hopefully someone more competent can answer.

However, I think there's no question whether the 040 can do GBA stuff.
I think we all agreed it can... easily...
The 040 can run Quake (not very fast, but still... with GBA res, I guess it would be very smooth).

The question (at least for me) is whether the 030 can do everything the GBA can.
Call me an Amiga enthusiast, but I also think it can.
Old 30 December 2020, 17:43   #45
VladR
Registered User

 
Join Date: Dec 2019
Location: North Dakota
Posts: 422
Quote:
Originally Posted by d4rk3lf
However, I think there's no question whether the 040 can do GBA stuff.
I think we all agreed it can... easily...
Easily ?!?

Easily means brute-force. With 64 clocks per pixel (and the majority of those lost on RAM reads/writes), it's most likely not happening at 20 fps. 15 fps, though - yes. But perhaps I'd get lucky and the very first brute-force version would run at 18-20 fps, who knows...


The 040 has a 6-stage pipeline. Theoretically, you can have 6 ops in flight, each spending only a single clock in each stage, and reach the maximum throughput of one instruction per clock (as long as you are not touching RAM). Meaning, hypothetically, you could execute 64 ops in those 64 clocks.

But in reality, there will be pipeline stalls. Also, I just read in the 040.pdf that the FP unit shares the instruction decoder and <ea> calculation with the Integer Unit.

This further complicates things, because it increases the latency between the two.
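
A back-of-the-envelope model of that, as a Python sketch. This is the textbook idealised in-order pipeline formula, not a 68040 timing model; the function name and the `stall_cycles` knob are made up here, and real stall counts would have to come from the timing tables or from measurement.

```python
def pipeline_cycles(num_ops, stages=6, stall_cycles=0):
    """Clocks to push `num_ops` through an idealised in-order pipeline.

    The first op needs `stages` clocks to come out the far end; every
    further op completes one clock later. Stalls are simply added on top.
    """
    if num_ops == 0:
        return 0
    return stages + (num_ops - 1) + stall_cycles


print(pipeline_cycles(64))                   # 69: fill cost, then 1 op per clock
print(pipeline_cycles(64, stall_cycles=32))  # 101: stalls eat the budget fast
```

So "64 ops in 64 clocks" only holds in steady state, once the pipeline is already full and nothing stalls.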



Perhaps your definition of 'easily' assumes a coder writing 5-6 versions of the same [already working] code to maximize pipeline throughput?

Granted, I still might be traumatized by the 15+ undocumented HW bugs in the Jaguar's RISC pipeline (you insert a NOP and sh*t hits the fan, plus many other 'undesired' instruction combos), and that clusterf*ckery probably doesn't apply to Motorola's pipeline design.


Still, 'easily' in my book means - you write the code, it works [at desired performance levels], and you move on to the next task...

So I would probably say that the 060 can easily do that.
Old 30 December 2020, 18:41   #46
khph_re
Registered User

 
Join Date: Feb 2008
Location: Northampton/UK
Posts: 391
Quote:
Originally Posted by VladR
Easily ?!?

Easily means brute-force. With 64 clocks per pixel (and the majority of those lost on RAM reads/writes), it's most likely not happening at 20 fps. 15 fps, though - yes. But perhaps I'd get lucky and the very first brute-force version would run at 18-20 fps, who knows...
Some 030 stuff already got close to that, so I imagine the 040 would be a lot better. Check out videos of some of the games I mentioned earlier in the thread. Some of the demo coders would be best placed to explain how these features are possible at speed.
Old 30 December 2020, 21:02   #47
VladR
Registered User

 
Join Date: Dec 2019
Location: North Dakota
Posts: 422
Quote:
Originally Posted by khph_re
Some 030 stuff already got close to that, so I imagine the 040 would be a lot better. Check out videos of some of the games I mentioned earlier in the thread. Some of the demo coders would be best placed to explain how these features are possible at speed.
I understand it's possible on the 040; I am merely saying it's not easy, where "easy" means the first version of the code you write runs at the desired frame rate.

I wrote plenty of texturing routines on Jaguar that ran at 12 fps as a first version, but gradually, across 3-5 more rewrites, I got to 60 fps.

I guess some people who just saw the last build might be tempted to say - "yeah, it's easy".
But that was not my experience, despite having written the code myself...

Of course, on the Amiga, it should not be a huge problem, as it doesn't have an extremely buggy HW decoder pipeline, so any combination of instructions will actually execute correctly 100% of the time (an unimaginable concept on the Jaguar's RISC GPU/DSP).



Besides, if it was actually "easy", wouldn't there already be several dozen 20 fps textured low-poly games for the 25 MHz 040 :-) ?
Old 30 December 2020, 21:32   #48
Reynolds
Alien Breeder
 
Join Date: Dec 2007
Location: Szigetszentmiklos / Hungary
Age: 44
Posts: 615
All I know is that if there were a well-working GBA emulator OR a native racing game for the Amiga, I'd pay to get it.
Old 30 December 2020, 22:11   #49
khph_re
Registered User

 
Join Date: Feb 2008
Location: Northampton/UK
Posts: 391
Quote:
Originally Posted by VladR
I understand it's possible on the 040; I am merely saying it's not easy, where "easy" means the first version of the code you write runs at the desired frame rate.

I wrote plenty of texturing routines on Jaguar that ran at 12 fps as a first version, but gradually, across 3-5 more rewrites, I got to 60 fps.

I guess some people who just saw the last build might be tempted to say - "yeah, it's easy".
But that was not my experience, despite having written the code myself...

Of course, on the Amiga, it should not be a huge problem, as it doesn't have an extremely buggy HW decoder pipeline, so any combination of instructions will actually execute correctly 100% of the time (an unimaginable concept on the Jaguar's RISC GPU/DSP).



Besides, if it was actually "easy", wouldn't there already be several dozen 20 fps textured low-poly games for the 25 MHz 040 :-) ?
I feel your pain, we had the same problem with buggy Wii dev kits back in the day :-)

I guess the thread could be taken to mean: 'given the ideal engine, from an ideal coder, what spec would it take?'

It helps that the GBA is lower resolution than an AGA Amiga, which gives an 030/040 a bit of breathing room.
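
To put a rough number on that breathing room, a tiny Python sketch. The 240x160 figure is the GBA screen; the 320x256 figure is my assumption for a plain PAL low-res Amiga screen with no overscan, and the names are just for illustration.

```python
GBA = (240, 160)
AGA_LORES_PAL = (320, 256)  # assumption: standard PAL low-res, no overscan


def pixel_count(resolution):
    """Total pixels on a screen of the given (width, height)."""
    width, height = resolution
    return width * height


ratio = pixel_count(GBA) / pixel_count(AGA_LORES_PAL)
print(pixel_count(GBA), pixel_count(AGA_LORES_PAL), ratio)  # 38400 81920 0.46875
```

So a GBA-sized window is a bit under half the pixels of a full low-res screen, before any coverage estimate is applied.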

It's also possible to overcome the 256-colour limit by using HAM mode in a higher resolution and treating 3 pixels as one to overcome fringing, but how many resources that would leave is beyond me.
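
As far as I understand the trick, each logical pixel is spent as three hold-and-modify steps, one per colour channel, so only the third screen pixel carries the final colour. A Python sketch of that idea follows; the command names are symbolic and made up here, the 6-bit channel values are an assumption based on HAM8, and none of this reflects the actual bitplane encoding.

```python
def encode_triplet(r, g, b):
    """One logical pixel -> three modify commands as (channel, value)."""
    return [("R", r), ("G", g), ("B", b)]


def decode_ham(commands, start=(0, 0, 0)):
    """Replay modify commands the way the display would.

    Each screen pixel holds the previous colour and changes one channel.
    """
    r, g, b = start
    shown = []
    for channel, value in commands:
        if channel == "R":
            r = value
        elif channel == "G":
            g = value
        else:
            b = value
        shown.append((r, g, b))
    return shown


pixels = decode_ham(encode_triplet(63, 32, 5), start=(1, 2, 3))
print(pixels)  # [(63, 2, 3), (63, 32, 3), (63, 32, 5)]
```

The first two screen pixels are the transitional colours that would normally show up as fringing; at a high enough horizontal resolution they are narrow enough to blend into the third.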
Old 30 December 2020, 22:38   #50
d4rk3lf
Registered User

 
Join Date: Jul 2015
Location: Novi Sad, Serbia
Posts: 1,373
@VladR

Bro, when I said "easily", I didn't mean that it's easy to program (personally, I respect everybody who knows how to program a Pang-style game, and above), but that for an experienced Amiga programmer, the 040 should provide enough raw power to do everything the GBA does (maybe not the weakest 040/25MHz, but 40 and 50MHz).
Especially if we keep the low resolution in mind.


As I said, I can't debate technical questions with you, because you are so much more experienced in this stuff. My assumption is based only on what I saw on both machines. Unfortunately, there aren't enough games (if any) designed specifically for the 040, but, for example, I don't think the GBA could run a game as smoothly as Genetic Species runs on an 040 (and especially not at that res).
I think the driving games the GBA has are doable even on an 030.

I just THINK! Don't hang me if I am wrong.

I could be very wrong of course.
Old 30 December 2020, 23:15   #51
VladR
Registered User

 
Join Date: Dec 2019
Location: North Dakota
Posts: 422
You are right, I didn't consider an experienced 040/060 coder who has written plenty of Asm code optimized for the 6-stage pipeline and/or FP/integer parallelism.

For somebody with that kind of experience, it would, indeed, be easy.

I guess I just get triggered because I burnt through hundreds of hours on the Jaguar discovering undocumented illegal instruction combos, so when I hear "easy" and "pipeline-optimized Asm code" in the same sentence, my brain explodes :-)
My apologies if I was sarcastic; I don't want to be like that anymore...
Old 30 December 2020, 23:24   #52
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,152
I still distinctly remember lusting after the 68040 (and later 68060) back in the day. Even though I fully understand that technology has moved on and they're completely obsolete in terms of performance, I still feel they're really high end stuff. Probably in part because I never had either one myself.

Back on topic, I guess that part of the reason there are so few 3D games with textures for the Amiga is probably that most of that stuff happened after the Amiga was basically 'dead' in the market, and 3D games are quite difficult to get right - making them less likely for "homebrew".

Of course, C2P issues and low memory bandwidth also play a big part here. Making an Amiga with native graphics do GBA games is going to be an uphill battle in my opinion, just for that reason alone.

In short, I don't think it's actually the 68040 that's the problem here.
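
For anyone unfamiliar with what C2P (chunky-to-planar) actually has to do, here is a deliberately naive sketch in Python. Real Amiga routines are heavily optimised 68k asm and look nothing like this; the function name and the choice of 8 pixels at a time are mine, purely to show the bit shuffling involved.

```python
def c2p_8pixels(chunky, depth=8):
    """Convert 8 chunky pixels (one byte each) into `depth` bitplane bytes.

    planes[p] collects bit p of every pixel; the leftmost pixel lands in
    the most significant bit of each plane byte.
    """
    if len(chunky) != 8:
        raise ValueError("expected exactly 8 chunky pixels")
    planes = [0] * depth
    for x, pixel in enumerate(chunky):
        for p in range(depth):
            bit = (pixel >> p) & 1
            planes[p] |= bit << (7 - x)
    return planes


planes = c2p_8pixels([1, 0, 0, 0, 0, 0, 0, 255])
print(planes)  # [129, 1, 1, 1, 1, 1, 1, 1]
```

Every pixel contributes one bit to every plane, which is why this step costs CPU time and memory bandwidth on top of the rendering itself.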
Old 18 January 2022, 00:25   #53
eXeler0
Registered User

 
Join Date: Feb 2015
Location: Sweden
Age: 48
Posts: 2,188
So I ran into this today...
A custom version of "OpenLara" on the GBA (a reverse-engineered version of the 1996 Tomb Raider)
[embedded YouTube video]
Old 18 January 2022, 10:57   #54
khph_re
Registered User

 
Join Date: Feb 2008
Location: Northampton/UK
Posts: 391
That's impressive! It's hand-written assembler, though; it would need someone with the skills to pay the bills to take the original source and rewrite it in assembler for the Amiga. Nice that he mentions the Amiga.

I think Rise by TRSI/Mellow Chips is the closest we have, and it ran OK on my 030/50.
Of course, it's not interactive - probably too many tricks behind the scenes for that.

[embedded YouTube video]
Old 18 January 2022, 11:53   #55
gimbal
cheeky scoundrel

 
Join Date: Nov 2004
Location: Spijkenisse/Netherlands
Age: 40
Posts: 5,004
Quote:
Originally Posted by eXeler0
So I ran into this today...
A custom version of "OpenLara" on the GBA (a reverse-engineered version of the 1996 Tomb Raider)
[embedded YouTube video]
Just goes to show that pretty much anything is possible in tech-land.

But that still doesn't imply that everything is feasible. Tomb Raider GBA came to be because the stars aligned and someone very skilled, who is not afraid to ask others for help, decided to bite down hard and not let go.
Old 18 January 2022, 13:21   #56
Galahad/FLT
Going nowhere

 
Join Date: Oct 2001
Location: United Kingdom
Age: 48
Posts: 8,404
Quote:
Originally Posted by khph_re
That's impressive! It's hand-written assembler, though; it would need someone with the skills to pay the bills to take the original source and rewrite it in assembler for the Amiga. Nice that he mentions the Amiga.

I think Rise by TRSI/Mellow Chips is the closest we have, and it ran OK on my 030/50.
Of course, it's not interactive - probably too many tricks behind the scenes for that.

[embedded YouTube video]
Also, don't forget that the screen resolution of the GBA is much smaller, so whilst the small screen might look odd on the Amiga, it makes conversions more doable because there is less data to process and put on screen.

That doesn't mean an Amiga A1200 would be able to do everything, but it could certainly make a valiant attempt.
Old 18 January 2022, 15:13   #57
khph_re
Registered User

 
Join Date: Feb 2008
Location: Northampton/UK
Posts: 391
Quote:
Originally Posted by Galahad/FLT
Also, don't forget that the screen resolution of the GBA is much smaller, so whilst the small screen might look odd on the Amiga, it makes conversions more doable because there is less data to process and put on screen.

That doesn't mean an Amiga A1200 would be able to do everything, but it could certainly make a valiant attempt.
Yeah, I was thinking of that super hi-res trick that gives a lower-than-low-res copper chunky screen.

Or treating 4 HAM pixels as one pixel, another very low-resolution but fast and high-colour way of getting a small screen.

Both are demo scene tricks, I believe.

Not sure a 14 MHz 020 would cut the mustard, even with fast RAM; then again, I never thought I'd see the GBA doing Tomb Raider!