English Amiga Board


Old 11 January 2024, 19:13   #21
reassembler
Registered User
 
 
Join Date: Oct 2023
Location: London, UK
Posts: 92
Quote:
Originally Posted by pink^abyss View Post
Well done!
I wonder how much the blitter will slow it all down when it comes to blitting the gazillion sprites in Outrun...

Although very simple in comparison to Outrun, I was pretty amazed back then by the Super Hang On demo OCS@50fps.
Indeed. That's what we'll be finding out next. If the road rendering is anything to go by, I will be rewriting the sprite code at least 8 times from scratch. Incidentally, I reduced the number of bitplanes used by the road from 3 to 2 yesterday, without losing anything. Well - it runs even faster now!

For the sprites (and I mean OutRun sprites here, not Amiga sprites), rather than jumping straight in and coding like a possessed madman, I'm spending time analysing the OutRun level structure (I coded a level editor 10 years ago which helps), palette usage, and planning out as much as possible on paper in advance. It provides a break from pure coding, as I've been burning the midnight oil a bit.

Beyond that it will be figuring out where best to use the Blitter, and where to live in fast memory and pure CPU. My initial assumption is that I may be better off bypassing the Blitter completely for sprites. But I will probably be wrong.

I have seen the SHO demo. It's cool, although you can take short-cuts with demos that won't cut it for an interactive game. There are a lot of edge-cases to cover. Well, most of them aren't even edge-cases - just part of the game engine.
Old 12 January 2024, 14:14   #22
pink^abyss
Registered User
 
Join Date: Aug 2018
Location: Untergrund/Germany
Posts: 408
I don't think more than 10-15 fps will be possible when nothing is left out (on a 1230), regardless of whether you use the Blitter or CPU/C2P only.

Nonetheless, I guess that a vanilla A1200 could handle an amigafied port with fewer colors and fewer, smaller sprites pretty well at 25fps while maintaining a lot of the original feel. Like with Gradius/Tinyus or BubbleBobble/Tiny Bobble.

Last edited by pink^abyss; 20 January 2024 at 08:46.
Old 18 January 2024, 23:29   #23
agermose
Registered User
 
Join Date: Nov 2019
Location: Odense / Denmark
Posts: 220
Great news. This is looking fabulous.
Old 18 January 2024, 23:37   #24
reassembler
Registered User
 
 
Join Date: Oct 2023
Location: London, UK
Posts: 92
Quote:
Originally Posted by pink^abyss View Post
I don't think more than 10-15 fps will be possible when nothing is left out (on a 1230), regardless of whether you use the Blitter or CPU/C2P only.

Nonetheless, I guess that a vanilla A1200 could handle an amigafied port with fewer colors and fewer, smaller sprites pretty well at 30fps while maintaining a lot of the original feel. Like with Gradius/Tinyus or BubbleBobble/Tiny Bobble.
I'd like to tell you how many times I've rewritten the same code over the last month, but you might have me sectioned. It's been an interesting exercise in learning the Amiga hardware. Once/if I get to a stage where you can 'debug style' navigate a genuine stage with the cursor keys, with scenery, I'll post a binary.

I'm getting closer.
Old 18 January 2024, 23:39   #25
Torti-the-Smurf
Registered User
 
 
Join Date: Dec 2018
Location: Earth
Posts: 1,064
When I think about the "US GOLD" published Outrun, I get cold shivers...

To have a version that does the Amiga justice would be only fair and very welcome!

Thumbs up for your project.
Old 19 January 2024, 07:49   #26
pink^abyss
Registered User
 
Join Date: Aug 2018
Location: Untergrund/Germany
Posts: 408
Quote:
Originally Posted by reassembler View Post
I'd like to tell you how many times I've rewritten the same code over the last month, but you might have me sectioned. It's been an interesting exercise in learning the Amiga hardware. Once/if I get to a stage where you can 'debug style' navigate a genuine stage with the cursor keys, with scenery, I'll post a binary.

I'm getting closer.

Looking forward to it! I guess no one knows Outrun better than you. And bringing it in high quality to low-spec Amiga is a puzzle that is almost impossible to solve, so thumbs up for even trying!

I once had the idea for "Tiny Outrun" on OCS. Running in a tiny screen (160x112) but in high quality. Because of the small size, a lot of hardware tricks would have been possible and the blitter could move a lot of bobs because plenty of DMA time would be available, especially when driving it from the copper list. All together with Pretracker music in a single 500kb exe... ideas are cheap.
Old 27 January 2024, 19:16   #27
aNdy/AL/COS
Registered User
 
 
Join Date: Jan 2022
Location: Wales
Posts: 91
Wow! The road routine looks great!
Old 28 January 2024, 10:31   #28
Amigajay
Registered User
 
Join Date: Jan 2010
Location: >
Posts: 2,889
Quote:
Originally Posted by pink^abyss View Post

I once had the idea for "Tiny Outrun" on OCS. Running in a tiny screen (160x112) but in high quality. Because of the small size, a lot of hardware tricks would have been possible and the blitter could move a lot of bobs because plenty of DMA time would be available, especially when driving it from the copper list. All together with Pretracker music in a single 500kb exe... ideas are cheap.
Still sounds like a great idea! Those pico-8 low res racers are fun, playability is where it’s at in the Amiga’s twilight years!
Old 01 February 2024, 22:10   #29
reassembler
Registered User
 
 
Join Date: Oct 2023
Location: London, UK
Posts: 92
It's been a month since my last update. And I'm going to an 'Amiga Jungle' music event in London tomorrow, so figured I won't be too productive this weekend.

TL;DR: I was quite pleased with my code, until I tried it on hardware again.

Here are some highlights:
1/ Implementation of OutRun's Sprite hardware in software: Realtime Sprite Zooming (!), Realtime Sprite Flipping, Rendering of original source sprite graphics, with no modifications, all original palettes, data, etc.
2/ Master CPU code for scenery rendering and stage data parsing up and running
3/ Custom Sprite Palette Caching Implementation. OutRun fills up to 128 x 16 colour sprite palettes sequentially as each stage progresses. OutRun loves its colours - jeepers. We don't have the luxury of that many simultaneous colours. We've also got to reserve palette space for tilemaps, Ferraris, HUDs, and traffic in the future.
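To make item 1 a little more concrete, here is a minimal Python sketch of software zooming and flipping using nearest-neighbour sampling with a 16.16 fixed-point step. It is an illustration only, not the arcade hardware's zoom logic nor the 68k routine described in the post; the names `scale_row` and `scale_sprite` and the list-of-rows sprite layout are assumptions made up for this example.

```python
def scale_row(src_row, dst_width, flip=False):
    """Nearest-neighbour scale one row of palette indices to dst_width pixels."""
    src_width = len(src_row)
    if dst_width <= 0 or src_width == 0:
        return []
    # 16.16 fixed-point source step per destination pixel (hypothetical format).
    step = (src_width << 16) // dst_width
    out = []
    pos = 0
    for _ in range(dst_width):
        x = pos >> 16
        if flip:
            # Horizontal flip: read the source row from the right-hand side.
            x = src_width - 1 - x
        out.append(src_row[x])
        pos += step
    return out


def scale_sprite(sprite, dst_width, dst_height, flip=False):
    """Scale a sprite (list of rows) to dst_width x dst_height pixels."""
    src_height = len(sprite)
    if dst_height <= 0 or src_height == 0:
        return []
    step = (src_height << 16) // dst_height
    rows = []
    pos = 0
    for _ in range(dst_height):
        rows.append(scale_row(sprite[pos >> 16], dst_width, flip))
        pos += step
    return rows


if __name__ == "__main__":
    for row in scale_sprite([[1, 2], [3, 4]], 4, 4, flip=True):
        print(row)
```

The same inner loop shape (add a step, shift, fetch) is what tends to get unrolled or special-cased per zoom level when it has to run fast.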

I've managed to reduce the number of palettes the scenery uses from 128 down to a mere 8 per stage! This isn't by culling colours and creating an ugly monstrosity in the process. Instead, I've implemented an AGA palette caching algorithm. This monitors which palettes have been used most recently and cycles them in and out of the colour registers in real-time as you 'drive' through the level. (This is relatively elegant and fast).
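The caching idea can be sketched as a least-recently-used (LRU) cache that maps source palette numbers to a small number of hardware palette slots. This is a guess at the general shape, not the actual implementation from the post: the class name `PaletteCache`, the slot count parameter, and the `uploads` counter are all hypothetical, and the real code would copy 16 colours into the AGA colour registers where the comment indicates.

```python
from collections import OrderedDict


class PaletteCache:
    """LRU mapping of source palette ids to a few hardware palette slots."""

    def __init__(self, slots=8):
        self._map = OrderedDict()        # palette id -> slot, oldest first
        self._free = list(range(slots))  # slots not yet assigned
        self.uploads = 0                 # how often a palette had to be loaded

    def slot_for(self, palette_id):
        """Return the slot holding palette_id, loading it if necessary."""
        if palette_id in self._map:
            # Cache hit: mark as most recently used.
            self._map.move_to_end(palette_id)
            return self._map[palette_id]
        if self._free:
            slot = self._free.pop(0)
        else:
            # Cache full: evict the least recently used palette.
            _, slot = self._map.popitem(last=False)
        self._map[palette_id] = slot
        # A real engine would write the 16 colours to the colour registers here.
        self.uploads += 1
        return slot


if __name__ == "__main__":
    cache = PaletteCache(slots=2)
    for pid in (10, 20, 10, 30, 20):
        print(pid, "->", cache.slot_for(pid))
    print("uploads:", cache.uploads)
```

One caveat with any scheme like this: a palette evicted while pixels drawn with it are still on screen would change their colours, so eviction has to be restricted to palettes that are no longer visible.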

I've only had to make minor modifications to level data - it's probably unnoticeable to 99% of players. This is relatively easy to do, because I coded Python scripts to export the entire level structure into easily readable commented Assembler. Stage 1 is the most technically demanding tour de force in OutRun. So if that renders, I'm confident the rest will follow. Actually I'm not, but let's pretend I know what I'm doing.

Other stuff:
- Guess what - I rewrote the road code again! I've lost count of the number of rewrites. No, seriously, it's even faster now (it's just slowed down by the new code - bloody hundreds of realtime scaled sprites). The blitter iteration I previously showed was a step forward - however ultimately a limitation too. But the work I did towards it provided a massive boost to the latest version. I rewrote everything the blitter was doing in software. Effectively, spam lots of pixels to memory and merge the roads. To think that the first cut took 8 VBlanks to render a single frame of road. That being said, I haven't figured out a way of writing 68k that doesn't involve multiple rounds of optimization if it has to run at blazing speed.

- Converted rendering to c2p. Chunky and funky. I didn't write my own routine, that doesn't look fun.
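For readers unfamiliar with c2p (chunky to planar): the renderer writes one colour index per pixel into a chunky buffer, and the conversion spreads the bits of each index across the Amiga's separate bitplanes. The Python below is a deliberately naive reference version to show what the conversion does, not how an optimized routine such as the one linked later in this post works; the function name `chunky_to_planar` is made up for this sketch.

```python
def chunky_to_planar(chunky, depth):
    """Convert chunky pixels (one colour index each) into `depth` bitplanes.

    Bit p of every colour index goes to plane p. Within each byte the
    leftmost pixel is the most significant bit, as on the Amiga.
    """
    if len(chunky) % 8:
        raise ValueError("pixel count must be a multiple of 8")
    planes = [bytearray(len(chunky) // 8) for _ in range(depth)]
    for i, pixel in enumerate(chunky):
        byte_index = i >> 3
        bit = 7 - (i & 7)
        for p in range(depth):
            if (pixel >> p) & 1:
                planes[p][byte_index] |= 1 << bit
    return planes


if __name__ == "__main__":
    planes = chunky_to_planar([1, 0, 0, 0, 0, 0, 0, 3], depth=2)
    print([bytes(p).hex() for p in planes])
```

Fast 68k routines do the same job with merge/swap passes over whole longwords rather than bit by bit, which is why using an existing, tuned routine is the usual choice.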

Bugs:
- I haven't done any work towards shadows so they resemble bright blobs of randomly coloured vomit. At first, I thought I might turn them off as a performance optimization. But I have a lofty goal of implementing translucent shadows. This is probably a stupid goal. This is possibly a stupid project. Gotta have some stretch goals though I guess.

- There is a frustratingly weird issue with the sprite distribution. It's wrong. I've looked at the obvious engine things and not uncovered the issue yet. You might not notice. But it is annoying the hell out of me. Could turn into a 6 hour debugging session cross referencing Mame. I love those. There are tens of thousands of lines of code at this point. What's worse, I genuinely don't know if it's a bug in porting the core OutRun engine, my rendering code, or some sort of timing oddity. I expect I'll eventually find a misplaced '+' symbol incrementing an address register, or some sort of garbage byte/word mismatch.

What's Next?
- Optimization of sprite code. Optimization in this case = lots of special case routines and losing some of the generalisation I currently have. More code to maintain makes me sad. More speed, however, is goooooood. I've completed some optimization already. Optimization naturally leads to breaking things that already work, and unreadability so it's a fine balance. But realistically, I need every cycle I can find for this bad boy.

Side note: I went back and looked at the System 16 emulator from the mid 90s - the first ever emulator to reproduce OutRun. It required a Pentium running at 200MHz for decent performance. Sure, that emulated the 68000 code and there's no emulation here - but it was mostly written in x86 assembler and contained a lot of hacks to remove interrupt synchronization between the CPUs. Gotta get this done with a fraction of that power.

- Fix the sprite distribution/layout bug. I hate this bug.
- Convert the remaining level data so every stage renders. Boring from an intellectual point of view. Exciting from a 'pretty stuff' point of view.
- Sort the shadows out. Worst case - turn them off. Best case - bask in translucent glory.
- Tilemaps. They probably won't actually be tilemaps anymore. Need those clouds.

Not bothering with any gameplay stuff at this point. More interested in getting the engine running at speed. Will post a preview in a few weeks so you can run it for yourself. Or find it doesn't run. Which could be useful.

I want to give a shout out to this VSCode plugin - very handy when optimizing. I really appreciate how many amazing dev tools the community has built over the years:
https://github.com/grahambates/68kcounter

And also this guy: https://github.com/Kalmalyzer/kalms-c2p

Coding Questions
1/ I need a way of profiling performance on real hardware or a way of more accurately simulating hardware performance via emulation. My UAE configuration matches my hardware setup theoretically. I've turned on the cycle accurate settings, no JIT etc. However, the performance differential is significant. UAE is so much faster. If I can't figure out where the bottlenecks lie, I'm stabbing around in the dark making guesses. All advice welcome.

2/ Tonight, VSCode with the Amiga Assembly plugin imploded after it auto-updated. I posted a report here:
https://github.com/prb28/vscode-amig...iscussions/290

Does anyone have a workaround for this?

3/ The binary is currently clocking in at a massive 2MB. Not surprising, as there's 1MB of sprite data direct from the arcade. Should I rely on OS functionality to load this binary blob? I noticed the file system threw a wobbly when I tried extracting anywhere other than the ram disk.

Finally, a video
[embedded YouTube video]

Last edited by reassembler; 01 February 2024 at 22:39. Reason: typos, more questions etc.
Old 01 February 2024, 22:30   #30
trixster
Guru Meditating
 
Join Date: Jun 2014
Location: England
Posts: 2,339
It looks tremendous, nice one!
Old 01 February 2024, 22:45   #31
Amigajay
Registered User
 
Join Date: Jan 2010
Location: >
Posts: 2,889
Looks amazing so far!
Old 01 February 2024, 23:00   #32
reassembler
Registered User
 
 
Join Date: Oct 2023
Location: London, UK
Posts: 92
Quote:
Originally Posted by trixster View Post
It looks tremendous, nice one!
It looks tremendous until it moves

Immense amount of hours getting to this point - which isn't really even impressive. If you were to bill a programmer commercially to do a project like Final Fight Enhanced from Brick Nash, it would probably end up costing 6 figures in time!
Old 01 February 2024, 23:20   #33
saimon69
J.M.D - Bedroom Musician
 
Join Date: Apr 2014
Location: los angeles,ca
Posts: 3,528
You might need to go the pre-scaled route and halve the background sprites (or better, make the number proportional to RAM and processor); that could help.
Old 01 February 2024, 23:35   #34
Superman
Super Member
 
 
Join Date: Sep 2014
Location: Wakefield
Age: 48
Posts: 1,334
Nice job so far on what I'm sure is a very challenging project. I look forward to watching your progress.
Old 02 February 2024, 07:25   #35
DanyPPC
Registered User
 
Join Date: Dec 2016
Location: Italy
Posts: 732
The new engine is fantastic!!!

What about the chance of having a 68030 as the minimum requirement?
Nowadays accelerators on Amiga are very common and quite cheap.
Old 02 February 2024, 08:26   #36
Tsak
Pixelglass/Reimagine
 
 
Join Date: Jun 2012
Location: Athens
Posts: 1,032
Quote:
Originally Posted by saimon69 View Post
You might need to go the pre-scaled route and halve the background sprites (or better, make the number proportional to RAM and processor); that could help.
That might be a solution, but it doesn't seem that practical in this case. With this number of objects, RAM usage can bloat fast and become massive. And then he'd also lose the smooth scaling he's got now.

Perhaps a Blitter-driven c2p might help? Not sure how applicable or necessary it is in this case, assuming c2p is one of the components slowing this down.

I would definitely drop real-time sprite flipping, as it is known to be costly, assuming again that pre-flipped copies are affordable RAM-wise.
Old 02 February 2024, 10:21   #37
jotd
This cat is no more
 
 
Join Date: Dec 2004
Location: FRANCE
Age: 52
Posts: 8,196
Quote:
I need a way of profiling performance on real hardware or a way of more accurately simulating hardware performance via emulation.
I suppose this could be done with some code I've written that logs the current PC at each VBLANK interrupt, plus Python post-processing to gather the addresses.
Each VBLANK could turn into "each scanline" with a custom copper interrupt.
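The post-processing half of that idea might look roughly like the sketch below: take the logged program counter samples, attribute each one to the routine whose start address precedes it, and report the share per routine. This is not jotd's code; the log format, the symbol names, and the addresses in the example are invented for illustration, and a real symbol table would come from the assembler or linker output.

```python
import bisect
from collections import Counter


def attribute_samples(samples, symbols):
    """Count PC samples per routine.

    samples: iterable of sampled PC addresses.
    symbols: dict mapping routine name -> start address (hypothetical table).
    """
    ordered = sorted((addr, name) for name, addr in symbols.items())
    starts = [addr for addr, _ in ordered]
    counts = Counter()
    for pc in samples:
        # Find the last routine that starts at or before this address.
        i = bisect.bisect_right(starts, pc) - 1
        counts[ordered[i][1] if i >= 0 else "<unknown>"] += 1
    return counts


def report(counts):
    """Return (name, hits, percent) tuples, busiest routine first."""
    total = sum(counts.values())
    return [(name, n, 100.0 * n / total) for name, n in counts.most_common()]


if __name__ == "__main__":
    symbols = {"road": 0x1000, "sprites": 0x2000, "c2p": 0x3000}
    samples = [0x1004, 0x2010, 0x2FFE, 0x3000, 0x0800]
    for name, hits, pct in report(attribute_samples(samples, symbols)):
        print(f"{name:10s} {hits:4d} {pct:5.1f}%")
```

Sampling once per VBLANK gives very few samples per second, so the per-scanline variant mentioned above would give a much more useful histogram for the same run time.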

BTW I suppose that you optimize blit heights so you don't blit behind the road just to have it overwritten by the road?

The difficult part of such a project is that you don't know if you can make it run properly despite all optimization efforts. Reducing framerate can be a solution ... to a point.
Old 02 February 2024, 10:35   #38
reassembler
Registered User
 
 
Join Date: Oct 2023
Location: London, UK
Posts: 92
Quote:
Originally Posted by jotd View Post
BTW I suppose that you optimize blit heights so you don't blit behind the road just to have it overwritten by the road?
OutRun's sprite rendering code actually does this! It crops the sprites vertically so that they aren't rendering behind the road. And the hardware can start rendering from any line, so you kind of get this out the box.

Where there will be serious performance loss is the overdraw between sprites, especially when there's multiple massive sprites filling the entire screen close to the 'camera'. There will be a performance gain by culling sprites over a certain size, but you have to be careful - the scenery has a gameplay purpose - even if it's just collision.
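One way to put a number on the overdraw is to compare the pixels written with the pixels that end up covered. The sketch below does that for axis-aligned sprite rectangles in Python; it is an offline measurement aid under simple assumptions (opaque rectangles, no transparency), and the function name and rectangle format `(x, y, w, h)` are hypothetical.

```python
def overdraw_ratio(rects, width, height):
    """Return pixels written divided by unique pixels covered.

    rects: iterable of (x, y, w, h) sprite rectangles in screen coordinates.
    A result of 1.0 means no overdraw; 3.0 means each covered pixel was
    written three times on average.
    """
    covered = bytearray(width * height)
    written = 0
    for x, y, w, h in rects:
        # Clip the rectangle to the screen before counting.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + w, width), min(y + h, height)
        for row in range(y0, y1):
            for col in range(x0, x1):
                covered[row * width + col] = 1
                written += 1
    unique = sum(covered)
    return written / unique if unique else 0.0


if __name__ == "__main__":
    # Three full-screen sprites stacked near the camera: the worst case.
    print(overdraw_ratio([(0, 0, 320, 224)] * 3, 320, 224))
```

Feeding it the sprite list for a frame, with and without a size cull, would show how much a given cull threshold is likely to save before any 68k code is changed.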

First step of optimization for me is really a full understanding of where the cycles are going on the hardware. Having slept on this, I feel like I'll get a debug display going on the Amiga showing the timing of various blocks of code. I'll also implement the ability to toggle layers on and off. For example:

Key 1 - Toggle Road Layer
Key 2 - Toggle Sprite Layer
Key 3 - Toggle Shadows
Key 4 - Toggle Game Logic

Display timings for all of the above, C2P routine etc.
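The toggle-and-time plan can be prototyped as a small frame loop that skips disabled layers and records how long each enabled one took. The Python below is only a structural sketch: `FrameProfiler` and its methods are invented names, and on the Amiga the clock would be something like a raster position or CIA timer read rather than `time.perf_counter`.

```python
import time


class FrameProfiler:
    """Run named layers each frame, with per-layer toggles and timings."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.layers = []    # (key, name, function) in draw order
        self.enabled = {}   # name -> bool
        self.timings = {}   # name -> duration of the last frame

    def add_layer(self, key, name, func):
        self.layers.append((key, name, func))
        self.enabled[name] = True

    def toggle(self, key):
        """Flip the layer bound to this key, e.g. '1' for the road."""
        for k, name, _ in self.layers:
            if k == key:
                self.enabled[name] = not self.enabled[name]

    def run_frame(self):
        for _, name, func in self.layers:
            if not self.enabled[name]:
                self.timings[name] = 0.0
                continue
            start = self.clock()
            func()
            self.timings[name] = self.clock() - start
        return dict(self.timings)


if __name__ == "__main__":
    prof = FrameProfiler()
    prof.add_layer("1", "road", lambda: sum(range(10000)))
    prof.add_layer("2", "sprites", lambda: sum(range(50000)))
    print(prof.run_frame())
    prof.toggle("2")
    print(prof.run_frame())
```

Comparing the total frame time with a layer on and off is a useful cross-check on the per-layer timings, since disabling a layer also removes its memory traffic.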

Before I get there, I'll need to implement some sort of text rendering. I also need to fix the issue where the keyboard is 'sticky' on hardware, or just completely unresponsive. I'm using the keyboard implementation from Bare Metal Amiga programming that can be found here.
https://www.edsa.uk/blog/downloads

Keyboard works fine in UAE... not so great embedded in my code on hardware.
Old 02 February 2024, 11:44   #39
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,975
If your target is the 68030, then meynaf's c2p routine is the fastest, if my memory serves.
Old 02 February 2024, 14:29   #40
jotd
This cat is no more
 
 
Join Date: Dec 2004
Location: FRANCE
Age: 52
Posts: 8,196
Quote:
Keyboard works fine in UAE... not so great embedded in my code on hardware.
It probably doesn't wait long enough between clearing and setting BFEE01 bit 6. But there are better routines that use a CIA timer to avoid a blocking wait for the keyboard (which wastes a few precious cycles). Well, most games use a blocking wait and there are no speed issues, as the keyboard isn't heavily used and the blitter is the bottleneck.

If you share that part of the code I may be able to help you out.

Quote:
Where there will be serious performance loss is the overdraw between sprites, especially when there's multiple massive sprites filling the entire screen close to the 'camera'.
That's very difficult to handle in a smart way, and a lot of pixels are drawn and then overwritten... unless you perform a ton of horrible computations to handle clipping between bobs (argh....)