28 March 2024, 11:23 | #201 |
Thalion Webshrine
Join Date: Jan 2004
Location: Oxford
Posts: 14,480
|
|
28 March 2024, 11:25 | #202 |
Retro Freak
Join Date: Nov 2001
Location: Slovenia
Age: 51
Posts: 1,669
|
|
28 March 2024, 13:06 | #203 |
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
@all
Thank you! Yep, sorry, that was an idea I had just before making the video: I had focused mostly on the stock A1200 configuration, and until then I had only made separate tests (without video output) of just the bare scaling routine, in order to make an alternative version for FAST RAM-equipped machines. If I manage, later today I'll run tests with the data cache burst, which I expect to produce even better results (at least when the shrinking is not extreme - but in that case the burst could be disabled). Last edited by saimo; 28 March 2024 at 13:36. |
28 March 2024, 13:09 | #204 | |
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
Quote:
I envy you a little bit for having a CRT. Now I have only the memories of what this stuff looked like between 1997 and 2003 (when I still had a functional Commodore 1940 monitor). |
|
28 March 2024, 23:48 | #205 |
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
@tomcat666 @alexh
Given that you asked, I have just uploaded a new version that lets you enable/disable the fps limit with [F3]. Code:
* The number shown in the top-left corner of the effects screen is the fps
  indicator, which reports the number of frames rendered in the last second.
  It is limited to 999.
* When the fps limit is on, the maximum number of frames rendered per second is
  50 even on the most powerful machines, as the display refresh rate is 50 Hz.
  When the fps limit is off, rendering does not pause to wait for the
  previously rendered frame(s) to be (completely) displayed.
  On machines which cannot run the program at 50 fps or more, turning off the
  limit has no effect whatsoever; on the other machines, the only visible
  effect is that the fps indicator goes beyond 50, thus giving a measure of the
  maximum speed that the machine can reach.

@all
Now Zoomaniac runs 1-2 fps faster on 68030 thanks to the data cache burst: Code:
* on 68030 tests proved that: it is advantageous to turn the data cache burst
  on when scaling a 128 dots wide rectangle to a rectangle wider than 8 dots
  (i.e. with an X scaling factor greater than 1/16); with a scaling factor of
  1/16 or less the difference proved to be minimal when both the source and
  destination rectangles were 256 dots tall; considering that turning the data
  cache burst off would therefore be advantageous only with very narrow and
  tall rectangles (which are uncommon and intrinsically rather inexpensive), it
  is not worth it to implement a data cache burst management inside the scaling
  routine;

Code:
* given that a stock Amiga 1200 reaches about 25.5 fps, it manages to render
  128*256*25.5 = 835584 dots per second; considering that the 68020 is clocked
  at 14.187580 MHz, rendering 1 dot requires about 14187580/835584 = 17 cycles;

v1.1 (28.3.2024)
* Turned the 68030 data cache burst on for slightly faster performance.
* Made a couple of minor optimizations.
* Added frames rendering limit toggle ([F3]).
* Worked on fps indicator: added hundreds digit; made digits smaller; made
  digits auto-clearing, so that they read correctly also when they are not
  cleared before drawing.
* Made staggered lines toggle as soon as [F1] is pressed (instead of when it is
  released).
* Updated splash screen.
* Redesigned the 'M' in the logo.
* Updated and extended manual. |
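The cycles-per-dot estimate in the notes above can be reproduced with a few lines of C. This is only a sketch of the arithmetic quoted in the post; the function names and the tenths-of-fps convention are made up for illustration and are not part of Zoomaniac.

```c
#include <stdint.h>

/* Figures quoted in the post: 128x256 logical dots, 68020 at 14.187580 MHz. */
enum { SCREEN_W = 128, SCREEN_H = 256 };
#define CPU_HZ 14187580UL

/* Dots rendered per second; fps is given in tenths (255 means 25.5 fps)
   so that the whole calculation stays in integer arithmetic. */
static uint32_t dots_per_second(uint32_t fps_tenths)
{
    return (uint32_t)SCREEN_W * SCREEN_H * fps_tenths / 10u;
}

/* Average CPU cycles spent per rendered dot, rounded to the nearest integer. */
static uint32_t cycles_per_dot(uint32_t fps_tenths)
{
    uint32_t dps = dots_per_second(fps_tenths);
    return (uint32_t)((CPU_HZ + dps / 2u) / dps);
}
```

Calling `dots_per_second(255)` gives the 835584 dots per second mentioned in the note, and `cycles_per_dot(255)` rounds to the same 17 cycles.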
29 March 2024, 07:17 | #206 |
Retro Freak
Join Date: Nov 2001
Location: Slovenia
Age: 51
Posts: 1,669
|
"Unlimited" fps version on pistorm32: 214 fps when zooming (staggered lines mode brings it down to 208).
[YouTube video] |
29 March 2024, 13:34 | #207 | |
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
Quote:
And that isn't even the PiStorm's limit, but the Amiga's: 214 fps means that your PiStorm writes 214x128x256 = 7012352 bytes per second to CHIP RAM, which is about the CHIP bus limit of 3546895/2 longwords = 7093790 bytes per second. Considering that 214 is a rounded value, your card basically saturates the CHIP bus entirely. |
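The bus saturation estimate above can be checked with a small C sketch. It only restates the numbers from the post (one byte per dot, 128x256 dots per frame, 3546895/2 longwords per second); the function names are hypothetical.

```c
#include <stdint.h>

/* Bytes written to CHIP RAM per second: one byte per dot, 128x256 dots per frame. */
static uint32_t bytes_per_second(uint32_t fps)
{
    return fps * 128u * 256u;
}

/* CHIP bus limit as computed in the post: 3546895/2 longwords per second,
   4 bytes per longword, i.e. 3546895*2 bytes per second. */
static uint32_t chip_bus_limit(void)
{
    return 3546895u * 2u;
}

/* Share of the CHIP bus limit used at a given frame rate, in permille. */
static uint32_t bus_use_permille(uint32_t fps)
{
    return (uint32_t)((uint64_t)bytes_per_second(fps) * 1000u / chip_bus_limit());
}

/* Highest whole frame rate that still fits under the limit. */
static uint32_t max_fps(void)
{
    return chip_bus_limit() / (128u * 256u);
}
```

With these figures, 214 fps uses about 98.8% of the limit, and the limit itself corresponds to 216 whole frames per second, so the measured value is indeed right at the ceiling.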
|
29 March 2024, 18:08 | #208 | |
Retro Freak
Join Date: Nov 2001
Location: Slovenia
Age: 51
Posts: 1,669
|
Quote:
|
|
02 April 2024, 22:00 | #209 |
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
To have a complete set of scaling routines (which hopefully I'll use for something someday), I added support for color-keying, zero-keying (color-keying with color 0), and horizontal and vertical flipping.
Moreover, given that the focus was initially on the stock A1200, the performance on expanded machines was not optimal (as the rendering was done directly in CHIP RAM), so I also added an alternative buffering method: when 2 rasters can be allocated in FAST RAM, the rendering is done in FAST RAM and the rendered raster is then copied to the raster in CHIP RAM as quickly as possible, starting when the beam reaches the bottom of the screen. For the first effect in the test program (which is the only one whose performance had been measured until now), this produced a gain of 8-9 fps on my 68030-equipped Amiga 1200.
The updated test program (available at https://retream.itch.io/ped81c), to demonstrate the new features, stretches and shrinks a color/zero-keyed texture covering almost the entire screen over a full-screen zooming background, with all the possible flipping combinations. That is of course a bit taxing for a stock A1200, whose performance drops between 12 and 16 fps in the busiest cases.
[YouTube video]
(Side note: the video was recorded before finalizing the test program, so it shows an outdated splash screen, and the zooming jumps relative to the background when switching to/from the color/zero-keying effects.)
This snippet from the updated manual provides further details. Code:
--------------------------------------------------------------------------------
OVERVIEW

Zoomaniac has been written to evaluate the performance on stock and
modestly-accelerated Amiga 1200s of some general-purpose texture scaling
routines in conjunction with PED81C.


--------------------------------------------------------------------------------
GETTING STARTED

Zoomaniac requires:
 * Amiga computer
 * AGA chipset
 * 170 kB of CHIP RAM
 * 1.2 MB of any RAM
 * PAL SHRES support
 * keyboard
 * 1 MB of storage space

To install Zoomaniac, unpack the LhA archive to any directory of your choice.

To start Zoomaniac, open the program directory and double-click the program icon
from Workbench or execute the program from shell.

If your monitor / graphics card / scan doubler do(es) not support SHRES, the
colors will look off or even not show at all. In such case, to hopefully fix the
colors a bit, try the staggered lines option.


--------------------------------------------------------------------------------
CONTROLS

KEY       | SPLASH SCREEN               | EFFECTS SCREEN
----------+-----------------------------+----------------------------
[SPACE]   | go to effects screen        |
[F1]      | turn staggered lines on/off | turn staggered lines on/off
[F2]      | turn fps indicator on/off   | turn fps indicator on/off
[F3]      | turn fps limit on/off       | turn fps limit on/off
[ESCAPE]  | quit to AmigaOS             | go to splash screen


--------------------------------------------------------------------------------
MISCELLANEOUS

* The staggered lines shift the odd lines by 1 SHRES pixel to the right.
  On systems which handle SHRES correctly, that will reduce the jailbars effect
  (but give the screen a kind of wavy look).
  On systems which handle SHRES as HIRES (for example, MNT's VA2000 graphics
  card and Irix Labs' ScanPlus AGA - contrary to how it was originally
  marketed - display only the even or odd columns of pixels, so only reds and
  blues or greens and grays show), that helps improve the colors a bit (giving
  the screen a kind of scanline effect).
  On other systems, the results are unpredictable, but the option is still worth
  a try.

* The number shown in the top-left corner of the effects screen is the fps
  indicator, which reports the number of frames rendered in the last second.
  It is limited to 999.

* When the fps limit is on, the maximum number of frames rendered per second is
  50 even on the most powerful machines, as the display refresh rate is 50 Hz.
  When the fps limit is off, rendering does not pause to wait for the
  previously rendered frame(s) to be (completely) displayed.
  On machines which cannot run the program at 50 fps or more, turning off the
  limit has no effect whatsoever; on the other machines, the only visible
  effect is that the fps indicator goes beyond 50, thus giving a measure of the
  maximum speed that the machine can reach.


--------------------------------------------------------------------------------
PERFORMANCE

The following results are relative to the full screen effect that zooms the
cosmonaut in and out without flipping. The source textures are 256x512 dots and
the screen internally consists of 128x256 dots. Since a dot is represented by a
byte, 128x256 = 32768 bytes are fetched and written to render a frame.

On a stock Amiga 1200, the execution speed is between 25 and 26 fps.
If the staggered lines are turned on, the performance drops by about 1 fps
(albeit all that such option adds is a Copper WAIT and a Copper MOVE for each
rasterline).
Given that the DMA load caused by PED81C is "double" (see its documentation for
the details), a version that uses only half the number (2) of bitplanes has been
made to check the performance as if the Amiga had a native chunky video mode.
Surprisingly, the performance did not improve at all: relatively to the CHIP bus
access, the scaling code must interleave so nicely with the bitplane data
fetches that having more bus cycles available does not make any/much difference.

An Amiga 1200 equipped with a 68030 clocked at 50 MHz and 60 ns FAST RAM easily
performs at a steady 50 fps. To find out the maximum performance, tests were
made with the fps limit off.
The speed when running the program normally was between 84 and 86 fps.
The staggered lines option lowered the fps by about 1.
The 2 bitplanes version ran at the same speed - in this case, that is because
most of the CHIP RAM accesses happen when no bitplanes DMA is going on (see
TECHNICAL DETAILS section).

The following table sums up the results.

staggered lines     | off   | on
--------------------+-------+-------
stock Amiga 1200    | 25-26 | 24-25
expanded Amiga 1200 | 84-86 | 84-85

expanded Amiga 1200: Blizzard 1230 IV, 68030 @50 MHz, 60 ns FAST RAM

Notes:
* given that a stock Amiga 1200 reaches about 25.5 fps, it manages to render
  128*256*25.5 = 835584 dots per second; considering that the 68020 is clocked
  at 14.187580 MHz, rendering 1 dot requires about 14187580/835584 = 17 CPU
  cycles;
* on 68030 tests proved that: it is advantageous to turn the data cache burst
  on when scaling a 128 dots wide rectangle to a rectangle wider than 8 dots
  (i.e. with an X scaling factor greater than 1/16); with a scaling factor of
  1/16 or less the difference proved to be minimal when both the source and
  destination rectangles were 256 dots tall; considering that turning the data
  cache burst off would therefore be advantageous only with very narrow and
  tall rectangles (which are uncommon and intrinsically rather inexpensive), it
  is not worth it to manage the data cache burst inside the scaling routines.


--------------------------------------------------------------------------------
SCALING ROUTINES

The scaling routines fit any rectangle from a texture into a rectangle of any
size and ratio of another texture with nearest-neighbor matching.
Optionally, they can flip the rectangles horizontally and/or vertically, and
treat as transparent the dots of a specific color (color-keying) or of color 0
(zero-keying).

Color/zero-keying allows rendering graphics of arbitrary shapes without masks
(which saves RAM and CPU cycles). Thanks to the fact that PED81C graphics always
use at most 81 colors, there are 256-81 = 175 colors that can be used for
color-keying without causing any visual loss.

For performance reasons, there are 3 separate routines.

routine                | color-keying | zero-keying | speed rating
-----------------------+--------------+-------------+--------------
v_ScaleRectangle()     |              |             | ***
v_ScaleRectangle_CK()  | *            |             | *
v_ScaleRectangle_ZK()  |              | *           | **


--------------------------------------------------------------------------------
OTHER TECHNICAL NOTES

* Logic and rendering are totally asynchronous: the logic runs always at 50 Hz
  and the rendering never stops (unless it reaches 50 fps and the fps limit is
  on), thus exploiting the machine's full potential.

* The screen is triple-buffered.

* When 2 rasters can be allocated in FAST RAM:
  1. the graphics are rendered always to the available raster in FAST RAM;
  2. after the rendering has completed and as soon as the bottom rasterline has
     been displayed, the rendered raster is copied as quickly as possible to
     the raster in CHIP RAM (which is the one that gets displayed).
  The copy successfully races the beam (on the expanded Amiga 1200 mentioned in
  the PERFORMANCE section, it requires about 57 rasterlines during the vertical
  blanking and 35 rasterlines during the fetching of the top rasterlines), so
  no tearing occurs.
  Such method yields a faster performance than rendering directly to a raster
  in CHIP RAM (especially when there is overdraw and/or data gets also read
  from the raster).

* The screen resolution is 1020x256 SHRES pixels, which correspond to 255x256
  LORES-sized physical dots and to 128x256 logical dots.

* The code is 100% assembly.

* The program takes over the system entirely and returns to AmigaOS cleanly. |
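For readers who want to see what the SCALING ROUTINES section describes, here is a minimal C sketch of nearest-neighbor rectangle scaling with optional flipping and color-keying. It is not the `v_ScaleRectangle*()` code (which is 100% assembly and whose internals are not shown in this thread): the function `scale_rect()`, its parameters and the 16.16 fixed-point stepping are assumptions chosen for illustration, with one byte per dot as stated in the manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not the Zoomaniac routines.
   Scales a sw x sh source rectangle into a dw x dh destination rectangle with
   nearest-neighbor sampling. key < 0 disables keying; otherwise source dots
   equal to key are skipped (key == 0 corresponds to zero-keying).
   Assumes sw and sh are below 65536 so that the 16.16 steps do not overflow. */
static void scale_rect(const uint8_t *src, int src_stride, int sw, int sh,
                       uint8_t *dst, int dst_stride, int dw, int dh,
                       int flip_x, int flip_y, int key)
{
    if (sw <= 0 || sh <= 0 || dw <= 0 || dh <= 0)
        return;

    /* Source advance per destination dot, in 16.16 fixed point. */
    uint32_t step_x = ((uint32_t)sw << 16) / (uint32_t)dw;
    uint32_t step_y = ((uint32_t)sh << 16) / (uint32_t)dh;

    uint32_t fy = 0;
    for (int y = 0; y < dh; y++, fy += step_y) {
        int sy = (int)(fy >> 16);
        if (flip_y)
            sy = sh - 1 - sy;               /* read the source rows bottom-up */
        const uint8_t *srow = src + (size_t)sy * (size_t)src_stride;
        uint8_t *drow = dst + (size_t)y * (size_t)dst_stride;

        uint32_t fx = 0;
        for (int x = 0; x < dw; x++, fx += step_x) {
            int sx = (int)(fx >> 16);
            if (flip_x)
                sx = sw - 1 - sx;           /* read the source columns right-to-left */
            uint8_t c = srow[sx];
            if (key < 0 || c != (uint8_t)key)
                drow[x] = c;                /* keyed dots leave the destination untouched */
        }
    }
}
```

The per-dot key test in the inner loop is presumably one reason to keep separate routines, as the manual's table does: a variant without keying can drop the comparison entirely, and a zero-keying variant can test against a constant.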
03 April 2024, 13:42 | #210 |
Thalion Webshrine
Join Date: Jan 2004
Location: Oxford
Posts: 14,480
|
Very cool. As always. I hope you can find time to write more things using this pseudo screen mode.
|
10 April 2024, 15:16 | #211 |
Registered User
Join Date: May 2006
Location: Spain
Age: 42
Posts: 76
|
Hi!
I am trying to implement this display system. Everything seems fine except the colors. If I choose an RGBW model, the green color does not appear. I get variations of red, blue, magenta colors. This is my copper list: Code:
FMODE,   0xf,
BPLCON0, 0x4241,
BPLCON1, 0x10,
BPLCON2, 0x224,
BPL1MOD, 0,
BPL2MOD, 0,
DIWSTRT, 0x2C82,
DIWSTOP, 0x2CC1,
DDFSTRT, 0x0038,
DDFSTOP, 0x00d0,
DIWHIGH, 0xA100,
BPL1PTH, 0, BPL1PTL, 0,
BPL2PTH, 0, BPL2PTL, 0,
BPL3PTH, 0, BPL3PTL, 0,
BPL4PTH, 0, BPL4PTL, 0,
COLOR00, 0x0000, COLOR01, 0x0800, COLOR02, 0x0800, COLOR03, 0x0f00,
COLOR04, 0x0000, COLOR05, 0x0080, COLOR06, 0x0080, COLOR07, 0x00f0,
COLOR08, 0x0000, COLOR09, 0x0008, COLOR10, 0x0008, COLOR11, 0x000f,
COLOR12, 0x0000, COLOR13, 0x0888, COLOR14, 0x0888, COLOR15, 0x0fff,
BPLCON3, 0x220,
COLOR00, 0x0000, COLOR01, 0x0000, COLOR02, 0x0000, COLOR03, 0x0f00,
COLOR04, 0x0000, COLOR05, 0x0000, COLOR06, 0x0000, COLOR07, 0x00f0,
COLOR08, 0x0000, COLOR09, 0x0000, COLOR10, 0x0000, COLOR11, 0x000f,
COLOR12, 0x0000, COLOR13, 0x0000, COLOR14, 0x0000, COLOR15, 0x0fff,
BPLCON3, 0x20,
0xffff, 0xfffe,
0xffff, 0xfffe

Three bitplane buffers are allocated (ptr10, ptr11, ptr12): ptr10 is the raster, while ptr11 and ptr12 (used for planes 3 and 4) are filled with 0x55 and 0x33. The bitplane pointers are then set in the copper list: Code:
/* Fill the buffer used for plane 3 with the 0x55 pattern, one longword at a time. */
ptrl = (ULONG*)ptr11;
for (i = 0; i < ((_SCREENWIDTH * _SCREENHEIGHT) >> 5); i++) {
    *ptrl++ = 0x55555555;
}

/* Fill the buffer used for plane 4 with the 0x33 pattern. */
ptrl = (ULONG*)ptr12;
for (i = 0; i < ((_SCREENWIDTH * _SCREENHEIGHT) >> 5); i++) {
    *ptrl++ = 0x33333333;
}

/* Patch the copper list: high word of each address first, low word two entries later.
   Planes 1 and 2 both point to the raster. */
cop1[BPL0] = (ULONG)ptr10 >> 16;
cop1[BPL0 + 2] = (ULONG)ptr10 & 0xffff;
cop1[BPL1] = (ULONG)ptr10 >> 16;
cop1[BPL1 + 2] = (ULONG)ptr10 & 0xffff;
cop1[BPL2] = (ULONG)ptr11 >> 16;
cop1[BPL2 + 2] = (ULONG)ptr11 & 0xffff;
cop1[BPL3] = (ULONG)ptr12 >> 16;
cop1[BPL3 + 2] = (ULONG)ptr12 & 0xffff;

Any idea why the color system is failing? Thanks! Last edited by balrogsoft; 10 April 2024 at 15:22. |
10 April 2024, 16:10 | #212 | |||
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
Thanks for giving it a try!
Quote:
Quote:
(On another note: are you planning to manipulate the display dynamically? If not, you can remove everything but the reloading of the bitplanes and the final wait, and thus save precious CHIP bus cycles.) Quote:
|
|||
10 April 2024, 18:53 | #213 | ||
Registered User
Join Date: May 2006
Location: Spain
Age: 42
Posts: 76
|
It is very interesting, so much so that I started programming and testing this system.
Quote:
I just downloaded the example; until now I was basing all my code on the documentation you posted on the first page. The palette looks like the one I get in my code, until I press F1 in Zoomaniac, after which the colors seem OK. So are these colors caused by my using a HIRES screen instead of a SHRES one? This is the image you provided as shown by my code: Quote:
Yes, I have read this in your posts on the first page; for now I want to get the colors working before optimizing this. Yes, _SCREENWIDTH = _RASTERWIDTH*8, and _RASTERWIDTH is 160. Last edited by balrogsoft; 10 April 2024 at 19:01. |
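As a reference for the HIRES/SHRES question above, the BPLCON0 value from the copper list in post #211 (0x4241) can be decoded with a few lines of C. This is only a sketch: the helper names are made up, and the bit positions are the ones given for BPLCON0 in the ECS/AGA register documentation as far as I recall them (HIRES = bit 15, BPU2-0 = bits 14-12, SHRES = bit 6, BPU3 = bit 4), so they are worth double-checking against the hardware reference.

```c
#include <stdint.h>

/* BPLCON0 bits used here (ECS/AGA); positions as recalled from the register
   documentation, so treat them as an assumption to verify. */
#define BPLCON0_HIRES  (1u << 15)
#define BPLCON0_COLOR  (1u << 9)
#define BPLCON0_SHRES  (1u << 6)
#define BPLCON0_BPU3   (1u << 4)
#define BPLCON0_ECSENA (1u << 0)

/* Number of bitplanes: BPU2..BPU0 live in bits 14..12, BPU3 (AGA) in bit 4. */
static unsigned bplcon0_planes(uint16_t v)
{
    return ((v >> 12) & 7u) | ((v & BPLCON0_BPU3) ? 8u : 0u);
}

/* Non-zero when the value selects SHRES (35 ns pixels). */
static int bplcon0_is_shres(uint16_t v)
{
    return (v & BPLCON0_SHRES) != 0;
}

/* Non-zero when the value selects HIRES (70 ns pixels). */
static int bplcon0_is_hires(uint16_t v)
{
    return (v & BPLCON0_HIRES) != 0;
}
```

Under those assumptions, 0x4241 decodes to 4 bitplanes with SHRES set and HIRES clear, so the register value itself requests SHRES; whether the monitor, scan doubler or emulator actually displays SHRES is a separate matter.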
||
10 April 2024, 19:12 | #214 | |||||
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
Quote:
Quote:
Quote:
Quote:
Quote:
|
|||||
10 April 2024, 20:51 | #215 |
Registered User
Join Date: May 2006
Location: Spain
Age: 42
Posts: 76
|
Thank you very much. Then I only have to implement the staggered lines in the copper list, shifting the odd lines by 1 SHRES pixel, to get the look I saw in Zoomaniac.
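A rough sketch of how such a per-line section of the copper list could be generated in C follows. The Zoomaniac manual only says that the option adds a Copper WAIT and a Copper MOVE for each rasterline; everything else here is an assumption: the function name, the choice of BPLCON1 as the register being rewritten, the wait position at the start of each line, and above all the two scroll values, which the caller has to work out from the AGA BPLCON1 bit layout and from the base value the display already uses (0x10 in the copper list of post #211).

```c
#include <stddef.h>
#include <stdint.h>

#define REG_BPLCON1 0x0102u   /* custom chip register offset of BPLCON1 */

/* Hypothetical helper: writes one WAIT + MOVE BPLCON1 pair per display line
   into cop[], alternating even_scroll / odd_scroll, and terminates the list.
   Returns the number of words written, or 0 if cap is too small. */
static size_t build_stagger_list(uint16_t *cop, size_t cap,
                                 unsigned first_line, unsigned lines,
                                 uint16_t even_scroll, uint16_t odd_scroll)
{
    size_t n = 0;
    int crossed = 0;

    for (unsigned i = 0; i < lines; i++) {
        unsigned vp = first_line + i;
        int cross_now = (vp > 255u && !crossed);
        size_t need = 4u + (cross_now ? 2u : 0u);

        if (n + need + 2u > cap)            /* keep room for the end marker */
            return 0;

        if (cross_now) {
            /* The vertical position is compared on 8 bits only, so wait for
               the end of line 255 before addressing lines 256 and beyond. */
            cop[n++] = 0xFFDF;
            cop[n++] = 0xFFFE;
            crossed = 1;
        }

        cop[n++] = (uint16_t)(((vp & 0xFFu) << 8) | 0x07u);   /* WAIT for the start of line vp */
        cop[n++] = 0xFFFE;
        cop[n++] = REG_BPLCON1;                                /* MOVE value -> BPLCON1 */
        cop[n++] = (i & 1u) ? odd_scroll : even_scroll;
    }

    if (n + 2u > cap)
        return 0;
    cop[n++] = 0xFFFF;                                         /* end of copper list */
    cop[n++] = 0xFFFE;
    return n;
}
```

For a 256-line display starting at line 0x2C this produces 256 WAIT/MOVE pairs plus the line-255 crossing, a little over 1000 words, which is presumably the extra work behind the roughly 1 fps drop reported in the manual when staggered lines are on.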
|
10 April 2024, 21:59 | #216 |
Registered User
Join Date: Aug 2010
Location: Italy
Posts: 858
|
It's a good idea, so that everybody has a chance to enjoy the result (even if in further degraded quality).
|