English Amiga Board


Old 19 December 2019, 16:40   #141
DanScott
Lemon. / Core Design
 
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,209
With such a fast CPU, blitter becomes rather redundant

but on a vanilla 68000 / OCS it is magic

Last edited by DanScott; 19 December 2019 at 16:46.
Old 19 December 2019, 16:41   #142
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,624
Quote:
Originally Posted by VladR View Post
Wonder how Blitter works there ? Is it blitting at the full RAM bandwidth of 670 MB/s? That would be sweet !
I think they mentioned in the video that AGA runs at 'normal' speed, it's just cpu<->chipram access which is faster.
Old 19 December 2019, 19:44   #143
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by grond View Post
The 080 CPU allows 64 bit mem accesses while the blitter does only 16 bit.
Wait, does the CPU actually have 64-bit registers, as in - is it a true 64-bit CPU ?
Or is it like on the Jaguar, where the GPU (RISC) has a couple of 64-bit ops, e.g.
STORE R1, (R2) : 32-bit store
STOREP R1, (R2): 64-bit store
?


Quote:
Originally Posted by grond View Post
Since the blitter and CPU memory accesses are serialised in the V4 due to long mem bursts, the blitter always loses when compared with a CPU-only routine regardless of the memory bandwidth.
I've done a crazy amount of benchmarking on Jaguar where I was comparing pure SW vs HW scanline rasterizing of polygons.
Now, Jaguar's Blitter has an option of 64-bit transfers, which only works with horizontal blits (e.g. scanlines of a polygon). There's a scanline length threshold below which it doesn't make sense (even at 64-bit) to spin the blitter up, as the CPU will have finished the blit long before - just like on Atari ST or Amiga.
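
To make the threshold idea concrete, here is a rough cost model, not a measurement. It assumes the blitter pays a fixed setup cost per blit and the CPU does not; the function name and all cycle counts are made up for illustration, and only the shape of the comparison matters.

```python
import math

def break_even_pixels(setup_cycles, blit_cycles_per_px, cpu_cycles_per_px):
    """Smallest scanline length (in pixels) at which the blitter wins.

    Model (an assumption, not measured hardware behaviour):
        blitter cost = setup_cycles + n * blit_cycles_per_px
        CPU cost     = n * cpu_cycles_per_px
    Returns None if the blitter is not faster per pixel, i.e. it never wins.
    """
    gain_per_px = cpu_cycles_per_px - blit_cycles_per_px
    if gain_per_px <= 0:
        return None
    # Blitter wins when setup < n * gain_per_px, so n > setup / gain_per_px.
    return math.floor(setup_cycles / gain_per_px) + 1

# Hypothetical figures, chosen only for illustration.
print(break_even_pixels(setup_cycles=60, blit_cycles_per_px=1, cpu_cycles_per_px=4))
```

Below the returned length the CPU-only loop finishes first; the faster the CPU gets per pixel, the further that length moves out, until the blitter never wins.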



Unfortunately, the GPU cache is just 4 KB, so you can't keep all versions of the rasterizer inside (and swapping new GPU code back and forth a few times per frame is extremely prohibitive, to the point of net loss).


That being said, it would be a very interesting benchmark as to at what particular frequency the SW-only approach ALWAYS wins on 68080.
Since they estimated the CPU to be roughly 250 MHz 68060, I'd suspect that particular threshold is crossed around ~100 MHz.
Old 19 December 2019, 19:53   #144
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by DanScott View Post
With such a fast CPU, blitter becomes rather redundant

but on a vanilla 68000 / OCS it is magic
Not necessarily, although you most likely wouldn't use it to draw polygons.


On Jaguar, which has a 13.3 MHz 68000 and 26.6 MHz RISC GPU, the parallel work of Blitter is best seen in Clearing Framebuffer. You trigger the ClearScreen at start of frame, and in parallel, let GPU transform the polygon soup.


Given the ultra-high speed bandwidth of 670 MB/s - I wonder if 68080 can actually clear the framebuffer faster than Blitter ?


My point is, it still may be beneficial to let Blitter do clearing in parallel and not kill CPU by doing such a mundane task, when there's full frame worth of 3D data to crunch through...


Especially with 64-bit access, and 4-bit colors, the 320x200 is just 32 KB, which is 4,000 writes of 64-bit zero...
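
For anyone who wants to check the numbers, the arithmetic works out as sketched below. The helper names are mine, and the timing line treats 670 MB/s as a sustained rate with 1 MB = 10^6 bytes, which is an optimistic assumption rather than a measured figure.

```python
def clear_writes(width, height, bits_per_pixel, bus_bits=64):
    """Return (framebuffer size in bytes, number of bus-wide writes to clear it)."""
    total_bits = width * height * bits_per_pixel
    size_bytes = total_bits // 8
    writes = -(-total_bits // bus_bits)  # ceiling division
    return size_bytes, writes

def clear_time_us(size_bytes, mb_per_s):
    """Idealised clear time in microseconds at a sustained bandwidth (assumption)."""
    return size_bytes / mb_per_s  # bytes / (1e6 bytes/s) gives seconds * 1e6

size_bytes, writes = clear_writes(320, 200, 4)
print(size_bytes, writes)  # 32000 4000
print(round(clear_time_us(size_bytes, 670), 1), "us at the quoted 670 MB/s")
```

So a 64-bit CPU clear is 4,000 stores, and at the quoted peak bandwidth it would take well under a tenth of a millisecond, which is the number to weigh against the cost of setting the Blitter up.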
Old 19 December 2019, 20:04   #145
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by hooverphonique View Post
I think they mentioned in the video that AGA runs at 'normal' speed, it's just cpu<->chipram access which is faster.
Thanks, I guess I missed that (more like didn't understand what they mean by it).


From a purely practical standpoint, I just did one such benchmark on Jag last week, when I implemented a classic Voxel terrain (which I already discarded as it looks like crap) and the Blitter version was roughly 4x faster than pure GPU version.


So, that puts the threshold (in that particular application - not that it can be generalized), to around 100 MHz, which obviously 68080 beats by a pretty large margin
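
One plausible way to arrive at a figure of that size is naive linear scaling: take the 26.6 MHz GPU clock and multiply it by the measured ~4x Blitter advantage. The sketch below only shows that arithmetic; it assumes throughput scales linearly with clock speed, which real memory-bound code will not do exactly.

```python
def break_even_clock_mhz(cpu_mhz, blitter_speedup):
    """Clock at which a CPU-only routine would match the blitter,
    assuming speed scales linearly with clock (a simplification)."""
    return cpu_mhz * blitter_speedup

# 26.6 MHz Jaguar GPU, Blitter version roughly 4x faster in the voxel test.
print(break_even_clock_mhz(26.6, 4.0))
```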






Still, it'd be a very sweet toy to play with, as a 3D coder
Old 20 December 2019, 09:19   #146
grond
Registered User
 
Join Date: Jun 2015
Location: Germany
Posts: 1,918
Quote:
Originally Posted by VladR View Post
Wait, does the CPU actually have 64-bit registers, as in - is it a true 64-bit CPU ?
Or, is it like on a Jaguar, where on GPU (RISC) you have couple 64-bit ops:
STORE R1, (R2) : 32-bit store
STOREP R1, (R2): 64-bit store
Yes, the 68080 is a full 64-bit CPU and also supports 3-operand address modes. It also has a SIMD unit called AMMX which borrows instructions from Intel's MMX but has all the powerful 68k EA modes.

Here is a short overview:

https://wiki.apollo-accelerators.com...llo_core:start


Quote:
Since they estimated the CPU to be roughly 250 MHz 68060
Take that estimate with a grain of salt; it is a rough figure that requires good parallel ASM code. I would estimate that legacy code created by a compiler will usually run at about 1.6x the speed of an equally clocked 060. Some code, of course, will be dramatically faster on an 080, and in this case the 250 MHz estimate would even be too low. Examples would be memory-intensive code, some FPU code, code that includes 64-bit integer MUL/DIV which have to be emulated through an exception on the 060, and code that uses bitfield instructions (which have become useful at last). And then there are things you can do with the 080 that you might be able to do with an 060 with dedicated code for each architecture where the 080 will run circles around the 060, e.g. when using AMMX code or the instructions that the 080 added over the previous 68k processors. One example of this would be the MPEG player RiVA, which now has support for the 080 and can play videos in full colour and hires.
Old 20 December 2019, 10:35   #147
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,624
Quote:
Originally Posted by VladR View Post
From a purely practical standpoint, I just did one such benchmark on Jag last week, when I implemented a classic Voxel terrain (which I already discarded as it looks like crap) and the Blitter version was roughly 4x faster than pure GPU version.
For what voxel terrain operation(s) did you use the blitter?
Old 20 December 2019, 11:42   #148
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by hooverphonique View Post
For what voxel terrain operation(s) did you use the blitter?
Drawing the voxel's rectangles themselves.
Since I'm in the 65,536 color video mode, it takes 2 bytes per pixel. And even though Blitter is just in Pixel mode (not the ~8x faster Phrase mode that could be run if Voxel.XL = 4 (8 Bytes)), it's still significantly faster due to parallelism.


There's still some waiting involved, but I get to fully process the next voxel (load,transform,colorize,clip) on GPU and only wait then, so usually waiting is minimal.
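
The overlap can be sketched as a tiny timing simulation. This is only a model under assumptions: time units are arbitrary, the per-voxel figures are invented, and there is a single blit in flight at any time.

```python
def overlapped_time(prep, blit):
    """Total time when the GPU processes the next voxel while the blitter
    is still drawing the previous one, and waits only just before the
    next blit is started."""
    gpu_t = 0       # time at which the GPU finished its latest step
    blit_done = 0   # time at which the blitter finishes its current job
    for p, b in zip(prep, blit):
        gpu_t += p                     # load, transform, colorize, clip
        gpu_t = max(gpu_t, blit_done)  # only now wait for the previous blit
        blit_done = gpu_t + b          # kick off this voxel's blit
    return blit_done

def no_overlap_time(prep, blit):
    """Total time when every blit is waited for before the next voxel starts."""
    return sum(prep) + sum(blit)

# Hypothetical per-voxel costs in arbitrary time units.
prep = [3, 3, 3]
blit = [2, 2, 2]
print(overlapped_time(prep, blit), no_overlap_time(prep, blit))
```

As long as a blit is shorter than the processing of the next voxel, the wait drops to zero and only the last blit shows up in the total.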


But the GPU-only version has an abysmal framerate, which is directly dependent on the number of scanlines in each voxel (tested that).


On Jag, to draw a rectangle via Blitter there's no additional register to set - because even if you draw just single scanline, you still have to set the same registers (just with YL = 1), so that part isn't more taxing anyway.
Of course, chunky helps
In total, there's 5 Blitter registers to be computed and set for each voxel - Position, Delta:X/Y, Length:X/Y, Color and ExecuteCommand.
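
As a data-structure sketch, the five per-voxel values could be grouped like this. The field names, the high-word/low-word packing and the command value are illustrative assumptions, not the Jaguar's actual register names or bit layout.

```python
from dataclasses import dataclass

@dataclass
class VoxelBlit:
    """Per-voxel values for a filled rectangle (names are illustrative)."""
    position: int  # packed start pixel, y in the high word, x in the low word
    delta: int     # packed step applied at the end of each scanline
    length: int    # packed size, YL in the high word, XL in the low word
    color: int     # 16-bit fill colour
    command: int   # written last; starts the blit (placeholder value)

def pack16(hi, lo):
    """Pack two 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)

def make_rect_blit(x, y, xl, yl, color, command=1):
    """Build the five values; a single scanline is simply yl == 1."""
    return VoxelBlit(
        position=pack16(y, x),
        delta=pack16(1, -xl),  # back XL pixels in x, down one line in y
        length=pack16(yl, xl),
        color=color & 0xFFFF,
        command=command,
    )

print(make_rect_blit(x=10, y=20, xl=4, yl=1, color=0x1234))
```

The point of the sketch is that a one-scanline blit and a full rectangle fill in exactly the same five fields; only the YL part of the length changes.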
Old 20 December 2019, 12:08   #149
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by grond View Post
Yes, the 68080 is a full 64-bit CPU and also supports 3-operand address modes. It also has a SIMD unit called AMMX which borrows instructions from Intel's MMX but has all the powerful 68k EA modes.

Here is a short overview:

https://wiki.apollo-accelerators.com...llo_core:start




Take that estimate with a grain of salt; it is a rough figure that requires good parallel ASM code. I would estimate that legacy code created by a compiler will usually run at about 1.6x the speed of an equally clocked 060. Some code, of course, will be dramatically faster on an 080, and in this case the 250 MHz estimate would even be too low. Examples would be memory-intensive code, some FPU code, code that includes 64-bit integer MUL/DIV which have to be emulated through an exception on the 060, and code that uses bitfield instructions (which have become useful at last). And then there are things you can do with the 080 that you might be able to do with an 060 with dedicated code for each architecture where the 080 will run circles around the 060, e.g. when using AMMX code or the instructions that the 080 added over the previous 68k processors. One example of this would be the MPEG player RiVA, which now has support for the 080 and can play videos in full colour and hires.
Thanks, that link was extremely informative, especially the AMMX part that was linked at the bottom !


This monster can do vector operations in a single instruction ! Yeah, for that part, the 250 MHz estimate would surely be low.



It has 32 registers, like RISC chips, but addressing modes of 68000 AND SIMD. Just insane


Writing a 3D engine for this thing would be a wet dream
Old 20 December 2019, 12:34   #150
Tigerskunk
Inviyya Dude!
 
 
Join Date: Sep 2016
Location: Amiga Island
Posts: 2,770
Quote:
Originally Posted by VladR View Post
Writing 3D engine for this thing would be a wet dream
Go for it then...

Old 20 December 2019, 13:32   #151
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
Quote:
Originally Posted by Steril707 View Post
Go for it then...

Well, I just might

If I'm reading their site right, even though they were demonstrating it on standard A500/A600, they actually sell the stand-alone version for 549 EUR.


Meaning, no actual Amiga is needed, and I can connect it to my second monitor (where my Jag is connected) and just need to figure out how to deploy builds from my main PC's Notepad++ - ideally somehow sharing SD card between both hosts (can't keep manually messing with SD card for every single build, obviously).


So, am I reading this right ? Is it truly a completely stand-alone solution without having to mess with installing the board into the old Amiga ?
Old 20 December 2019, 13:58   #152
grond
Registered User
 
Join Date: Jun 2015
Location: Germany
Posts: 1,918
Quote:
Originally Posted by VladR View Post
If I'm reading their site right, even though they were demonstrating it on standard A500/A600, they actually sell the stand-alone version for 549 EUR.
The V4 stand-alone is available right now through apollo-core.com.
I have one of them on my desk. It's a pretty neat system and getting better every day.


Quote:
Meaning, no actual Amiga is needed, and I can connect it to my second monitor (where my Jag is connected) and just need to figure out how to deploy builds from my main PC's Notepad++ - ideally somehow sharing SD card between both hosts (can't keep manually messing with SD card for every single build, obviously).
I think the best option would be to set up an ethernet connection between the V4 and your PC and mount an SMB share provided by the PC on the Amiga side. You could even do that in the startup-sequence and have it run your test executable, generated by a cross-compiler on the PC, on every boot.
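
On the PC side, the deploy step could be as small as the sketch below: copy the freshly cross-compiled executable into the directory that is exported as the share. The function name, the paths and the target file name are hypothetical, and the Amiga-side mount and startup-sequence lines are not shown because they depend on the SMB client in use.

```python
import shutil
from pathlib import Path

def deploy(build_path, share_dir, target_name="test.exe"):
    """Copy the built executable into the directory exported over SMB.

    All names here are placeholders; adjust them to the actual build
    output and share location.
    """
    share = Path(share_dir)
    share.mkdir(parents=True, exist_ok=True)
    dest = share / target_name
    shutil.copy2(build_path, dest)  # keeps the timestamp of the build
    return dest

# Example call with made-up paths, e.g. as a post-build step:
# deploy("build/engine", "C:/amiga_share")
```

Hooked into the build as a post-build step, this avoids touching the SD card between builds.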


Quote:
So, am I reading this right ? Is it truly a completely stand-alone solution without having to mess with installing the board into the old Amiga ?
Yes, it is.
Old 21 December 2019, 15:02   #153
VladR
Registered User
 
Join Date: Dec 2019
Location: North Dakota
Posts: 741
That's pretty cool. I just ordered the Vampire.

Are you doing any coding on it ?

Where's the best place to ask coding / environment questions ? Apollo.com forums don't really look like a place to ask coding questions - just browsed through 50 pages and it's nonsense/flamewars mostly...
Old 22 December 2019, 12:30   #154
d4rk3lf
Registered User
 
 
Join Date: Jul 2015
Location: Novi Sad, Serbia
Posts: 1,645
Quote:
Originally Posted by VladR View Post
Where's the best place to ask coding / environment questions ? Apollo.com forums don't really look like a place to ask coding questions - just browsed through 50 pages and it's nonsense/flamewars mostly...
I'd say, why not?
You have spammers and trolls everywhere.
But I think developers of vampire would be very happy to help anybody that is willing to push their hardware as far as possible.
Maybe it's best that you have 2 threads (here and on Apollo), or even more on some other forums, if you want to.

I am not into coding at all, but I am personally also very interested in what you will cook up. I guess there are many people like me.

And yes, I really like your idea of using just flat shaded (untextured) polygons with hi-poly models. One of the things I never understood is why many people insist on low-res (often very ugly) textures for 3D games, when flat shaded (like Frontier) generally looks much nicer to me.

There are exceptions of course, where textures look really cool, like in this A500 Doom clone:
http://eab.abime.net/showthread.php?t=98654

That guy did an amazing job for the 68000. I am posting the link in case you missed it (I know it's not what you are planning to do, but I guess it might be interesting to you).

Word by word, I've gone off topic, sorry for that.