English Amiga Board


Old 25 April 2018, 06:51   #121
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Quote:
Originally Posted by britelite View Post
I don't need to try it to know that rendering quads is slower than the way I do it. If you take a moment to think about how you interpolate u/v in quad, you will probably realize it too.

I do agree with you that there are ways to make the raycasting part more efficient, but I don't agree on your choice of rendering.
I don't know how you pick the texel values, but simple linear mapping is very fast!
Old 25 April 2018, 07:42   #122
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Quote:
Originally Posted by sandruzzo View Post
I don't know how you pick the texel values, but simple linear mapping is very fast!
But I am using linear mapping for every strip, in the form of unrolled loops of move.w (a1)/(a1)+,(a0)+ and every instruction handling a pair of pixels. Now, please tell me how using quads could improve on this?
Old 25 April 2018, 07:46   #123
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Quote:
Originally Posted by britelite View Post
But I am using linear mapping for every strip, in the form of unrolled loops of move.w (a1)/(a1)+,(a0)+ and every instruction handling a pair of pixels. Now, please tell me how using quads could improve on this?
You can use linear mapping too, since every line of a quad can be seen as a strip.
Old 25 April 2018, 07:50   #124
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Quote:
Originally Posted by sandruzzo View Post
You can use linear mapping too, since every line of a quad can be seen as a strip.
Like I said, that's what I'm already doing, and that's why having the buffer rotated 90 degrees is useful. So, please explain how you'd make this faster without rotating the buffer, because at the moment it feels like you don't have a grasp of how my implementation works.

EDIT: And in general, if you really don't have any concrete suggestions then please stay off this thread.
Old 25 April 2018, 08:15   #125
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
When it comes to the raycasting, my current code uses an array of 160 elements (as the screen is 160 pixels wide), where every element consists of the height of the wall and a texture coordinate. The rendering then goes through every element of the array and draws the corresponding strip.

Now, this array can be built up in various ways. You can raycast all 160 elements, or raycast only parts of it and use interpolation where possible. Or you could even build up this array without raycasting.

But whichever way the array is built up, it doesn't affect the way the screen is rendered. And this is why I won't accept ideas that just say "use quads" without an explanation of how this would in any way speed up the rendering process. Especially if you claim this will work without a buffer rotated 90 degrees.
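
To make that separation concrete, here is a minimal Python sketch (not britelite's actual 68000 code) of a renderer that consumes nothing but the 160-element array. The 100-line screen height, the 64-texel texture size, the background value 0, and the name `render_columns` are all assumptions made for illustration. The buffer is stored rotated 90 degrees, so each screen column is one contiguous row.

```python
SCREEN_WIDTH = 160    # from the post: one array element per screen column
SCREEN_HEIGHT = 100   # assumed for illustration
TEX_SIZE = 64         # assumed texture width and height in texels


def render_columns(columns, texture):
    """Draw one vertical strip per (height, u) element.

    `texture[u]` is one texture column of TEX_SIZE texels, and the
    returned buffer is indexed as buf[x][y] (rotated 90 degrees).
    """
    buf = [[0] * SCREEN_HEIGHT for _ in range(SCREEN_WIDTH)]
    for x, (height, u) in enumerate(columns):
        if height <= 0:
            continue
        strip = texture[u]
        top = (SCREEN_HEIGHT - height) // 2   # centre the strip vertically
        for i in range(height):
            y = top + i
            if 0 <= y < SCREEN_HEIGHT:
                # 1D linear scaling: map strip pixel i back to a texel
                buf[x][y] = strip[i * TEX_SIZE // height]
    return buf


# The array could come from raycasting, interpolation, or anything else;
# the renderer does not care.
texture = [[v + 1 for v in range(TEX_SIZE)] for _ in range(TEX_SIZE)]
frame = render_columns([(64, 3)] * SCREEN_WIDTH, texture)
```

However the `columns` list is produced, `render_columns` stays the same, which is the point being made above.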
Old 25 April 2018, 09:29   #126
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
Agreed with @britelite. There are lots of choices left in the details when saying "use quads".


Slightly different topic; if there is a desire to reduce the CPU time spent for raycasting, then here is an alternative approach:

Discover wall segments near the player by walking through the map, starting at the player's position, and gradually moving away. This can be accomplished via a flood fill. It can also be handled via iteration across a 2D subrect. These can then be combined with culling against the view field. This sort of walk can be made to discover walls in an exact front-to-back order.

Whenever a non-empty block is encountered, its vertices would get projected 2D->1D to screen space. It would then be clipped against previously-discovered spans through insertion into a 1D spanbuffer or similar.

The dual of the spanbuffer is the set of on-screen regions which are not yet covered by walls. Each such non-occluded region could be thought of as a 2D portal. The wall discovery should restrict itself against the current set of portals. (It could either do exact clipping against the portal's edges, or just use it for culling.)
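
As a rough illustration of the spanbuffer and its dual, here is a small Python sketch. The class name `SpanBuffer`, its methods, and the half-open `[start, end)` convention are assumptions for this example, not something taken from an existing engine. Spans are inserted front to back; `insert` returns only the parts that are still visible, and `gaps` returns the uncovered regions (the "portals").

```python
class SpanBuffer:
    """1D coverage buffer over screen columns 0..width-1."""

    def __init__(self, width):
        self.width = width
        self.covered = []   # sorted, disjoint half-open spans (start, end)

    def insert(self, start, end):
        """Clip [start, end) against what is already covered.

        Returns the visible pieces and marks them as covered.
        """
        start, end = max(start, 0), min(end, self.width)
        if start >= end:
            return []
        visible = []
        cursor = start
        for s, e in self.covered:
            if e <= cursor:
                continue            # covered span lies entirely to the left
            if s >= end:
                break               # covered span lies entirely to the right
            if s > cursor:
                visible.append((cursor, s))
            cursor = max(cursor, e)
            if cursor >= end:
                break
        if cursor < end:
            visible.append((cursor, end))
        self.covered = self._merge(self.covered + visible)
        return visible

    @staticmethod
    def _merge(spans):
        merged = []
        for s, e in sorted(spans):
            if merged and s <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((s, e))
        return merged

    def gaps(self):
        """The dual: regions not yet covered by any wall."""
        out, cursor = [], 0
        for s, e in self.covered:
            if s > cursor:
                out.append((cursor, s))
            cursor = e
        if cursor < self.width:
            out.append((cursor, self.width))
        return out
```

Once `gaps()` returns an empty list the screen is fully covered and the wall discovery can stop.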

Once you have a set of line segments, you could then perform 1D rasterization to create the list of spans to draw (heights and corresponding texture coordinates). The primary tricky thing here, for a 68000, is that it would involve perspective correct interpolation for the texture U coordinates. It _may_ be possible to simplify that bit by noting that A) the walls are always axis aligned and/or B) perform raycasting per-span but knowing ahead of time that all rays in the group will hit the same wall segment.
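
For reference, the perspective-correct part can be sketched as follows, in Python rather than 68000 code. It relies on the standard observation that 1/z and u/z vary linearly across the screen along a flat wall, so both are interpolated linearly and divided per column. The function name and parameters are made up for this example, and the floating-point divide per column is exactly the cost that is awkward on a 68000.

```python
def perspective_u(x0, x1, u0, u1, z0, z1):
    """Return [(x, u)] for screen columns x0..x1 of one wall segment.

    (u0, z0) and (u1, z1) are the texture coordinate and view-space
    depth at the segment's two projected end points.
    """
    n = x1 - x0
    result = []
    for x in range(x0, x1 + 1):
        t = (x - x0) / n if n else 0.0
        inv_z = (1.0 - t) / z0 + t / z1                 # linear in screen space
        u_over_z = (1.0 - t) * u0 / z0 + t * u1 / z1    # linear in screen space
        result.append((x, u_over_z / inv_z))            # one divide per column
    return result


# A wall receding from depth 1 to depth 3: the middle column shows
# texel 16 rather than 32, which plain linear interpolation would give.
columns = perspective_u(0, 2, 0.0, 64.0, 1.0, 3.0)
```

The projected wall height is proportional to 1/z as well, so it can be interpolated linearly without any divide.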

Last edited by Kalms; 25 April 2018 at 09:39.
Old 25 April 2018, 11:05   #127
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
@britelite

When you hit a block with raycasting, you know whether it's empty or has a wall, and I think you can even know how wide it will be. So instead of shooting rays for the whole wall, you can shoot fewer of them, let's say every 8-16 pixels. You'll then have a kind of mini wall with all its texture coordinates known, and you can fill it with simple linear interpolation.

This way, if it's possible, as I have said many times, you'll have fewer rays to shoot and fast linear interpolation without rotating the chunky buffer.

I hope this is clear.

EDIT: Stop being so rude. I'm old enough to know how to live my life. Don't be so childish, you're a great man.
Old 25 April 2018, 11:23   #128
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Quote:
Originally Posted by sandruzzo View Post
This way, if it's possible, as I have said many times, you'll have fewer rays to shoot and fast linear interpolation without rotating the chunky buffer.
You're still mixing up two things. Yes, I agree there are ways to shoot fewer rays, but this has nothing to do with rendering quads vs rendering strips.

You're still not telling me how the _RENDERING_ of the walls would be improved by your suggestion. PLEASE show me some REAL examples of how rendering quads without rotating the buffer 90 degrees will speed things up. Otherwise, stay out of this thread.

EDIT: I do appreciate different ideas, but I expect you to understand how my current implementation works and to be able to show how your idea would improve on my implementation. Otherwise it's just random nonsense.

Last edited by britelite; 25 April 2018 at 11:33.
Old 25 April 2018, 11:37   #129
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Quote:
Originally Posted by britelite View Post
You're still mixing up two things. Yes, I agree there are ways to shoot fewer rays, but this has nothing to do with rendering quads vs rendering strips.

You're still not telling me how the _RENDERING_ of the walls would be improved by your suggestion. PLEASE show me some REAL examples of how rendering quads without rotating the buffer 90 degrees will speed things up. Otherwise, stay out of this thread.

EDIT: I do appreciate different ideas, but I expect you to understand how my current implementation works and to be able to show how your idea would improve on my implementation. Otherwise it's just random nonsense.
Drawing from left to right, after having determined how wide this "little wall" will be, since I assume that we can get all the necessary information, and since the block sizes are fixed?
Old 25 April 2018, 11:42   #130
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Quote:
Originally Posted by sandruzzo View Post
Drawing from left to right, after having determined how wide this "little wall" will be, since I assume that we can get all the necessary information, and since the block sizes are fixed?
Sigh, not good enough. How would you interpolate u/v across the quad, and how would this be faster than my implementation? Give me some REAL examples, preferably a snippet of code to further explain how you'd do it.
Old 25 April 2018, 14:12   #131
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Quote:
Originally Posted by britelite View Post
Sigh, not good enough. How would you interpolate u/v across the quad, and how would this be faster than my implementation? Give me some REAL examples, preferably a snippet of code to further explain how you'd do it.
I'll try to write some code, but thinking about it, since everything is static and known, I think that linear interpolation would be fast.
Old 25 April 2018, 14:22   #132
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Quote:
Originally Posted by sandruzzo View Post
I'll try to write some code, but thinking about it, since everything is static and known, I think that linear interpolation would be fast.
Yes, linear interpolation is fast, and that's why I've always done it for the strips. You didn't answer my question though, and until you actually do, please stay out of this thread.
Old 25 April 2018, 18:11   #133
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
@britelite

Maybe it's only my fault for not being able to explain it very well. But don't worry, I'll stay away from this thread, not because you're asking me to, but because you're ordering me to. I thought I was talking with an adult...
Old 25 April 2018, 18:27   #134
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Ok, let's get back to the real discussion...

Quote:
Originally Posted by Kalms View Post
Slightly different topic; if there is a desire to reduce the CPU time spent for raycasting, then here is an alternative approach:

Discover wall segments near the player by walking through the map, starting at the player's position, and gradually moving away. This can be accomplished via a flood fill. It can also be handled via iteration across a 2D subrect. These can then be combined with culling against the view field. This sort of walk can be made to discover walls in an exact front-to-back order.

Whenever a non-empty block is encountered, its vertices would get projected 2D->1D to screen space. It would then be clipped against previously-discovered spans through insertion into a 1D spanbuffer or similar.

The dual of the spanbuffer is the set of on-screen regions which are not yet covered by walls. Each such non-occluded region could be thought of as a 2D portal. The wall discovery should restrict itself against the current set of portals. (It could either do exact clipping against the portal's edges, or just use it for culling.)

Once you have a set of line segments, you could then perform 1D rasterization to create the list of spans to draw (heights and corresponding texture coordinates). The primary tricky thing here, for a 68000, is that it would involve perspective correct interpolation for the texture U coordinates. It _may_ be possible to simplify that bit by noting that A) the walls are always axis aligned and/or B) perform raycasting per-span but knowing ahead of time that all rays in the group will hit the same wall segment.
I remember we had a talk about this at Revision, and I wasn't completely sure about what you meant. But seeing it written down makes it clearer, and I agree that this could be a nice way to implement the wall calculations, especially if the rooms/levels are designed to take advantage of the method.

One thing I was wondering regarding the flood fill is how well it would handle small "pillars" compared to walls. I was first thinking of stopping the fill for a certain line/column when hitting a pillar, but then realized that, depending on the angle, the wall behind might still be partially visible.

EDIT: Actually, flood fill would handle pillars just great, so disregard my previous little brainfart.
I noticed that you could split the room into four segments for the flood fill, with the player position being the dividing point. Then, taking the player's field of view into account, you can directly disregard at least two of the segments. And when hitting a solid block, you would only add the vertices/wall segment depending on which direction you're filling from, and not for the whole block.
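
A small Python sketch of the quadrant idea, under some loudly stated assumptions: quadrants are numbered 0..3 counter-clockwise starting with (+x, +y), angles are in degrees, and the field of view is narrower than 90 degrees (60 is used as a default purely for illustration). The names `visible_quadrants` and `FACES_TO_ADD` are made up for this example.

```python
import math


def visible_quadrants(angle_deg, fov_deg=60.0):
    """Quadrants (0..3) that the view cone overlaps.

    Quadrant k covers directions from k*90 up to (k+1)*90 degrees.
    With a field of view below 90 degrees, at most two are returned.
    """
    lo = angle_deg - fov_deg / 2.0
    hi = angle_deg + fov_deg / 2.0
    first = math.floor(lo / 90.0)
    last = math.floor(hi / 90.0)
    return sorted({q % 4 for q in range(first, last + 1)})


# When filling away from the player inside a quadrant, only the two
# block faces that point back towards the player can be visible.
FACES_TO_ADD = {
    0: ("-x", "-y"),   # filling towards +x, +y
    1: ("+x", "-y"),   # filling towards -x, +y
    2: ("+x", "+y"),   # filling towards -x, -y
    3: ("-x", "+y"),   # filling towards +x, -y
}
```

For example, looking straight along +x (`angle_deg=0`) overlaps quadrants 0 and 3, so the other two can be skipped entirely.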

Last edited by britelite; 26 April 2018 at 07:18.
Old 26 April 2018, 10:45   #135
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
Quote:
Originally Posted by britelite View Post
I noticed that you could split the room into four segments for the flood fill, with the player position being the dividing point. Then, taking the player's field of view into account, you can directly disregard at least two of the segments. And when hitting a solid block, you would only add the vertices/wall segment depending on which direction you're filling from, and not for the whole block.
Yep, exactly.

Also, a naive flood fill approach (combined with "only process the 2 relevant sides of a block") will not necessarily result in a strict front-to-back ordering.

Getting strict front-to-back ordering requires more detailed control over the order in which locations are visited/tested. One way to describe this would be to have the flood fill put to-visit items into a priority queue. The priority is the Z distance of the block's center. You would need slightly more complicated maths when computing the Z distance for the first block, but after that, you can get the Z distance for a flood fill's neighbour as "<Z distance for originating block>" + "<delta Z distance between two blocks along axis>". You would compute the delta Z distance for each of the 2 primary axes based on the user's current orientation before beginning the flood fill.
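
The bookkeeping described above can be sketched in Python with the standard `heapq` module. This is only an illustration under assumptions: the map is a list of strings with `#` marking solid blocks, the function name is invented, and skipping cells whose centre lies behind the player is a crude stand-in for real view-field culling. The sketch shows the incremental Z computation and the priority queue; it does not prove that the pop order is a strict front-to-back order for every orientation.

```python
import heapq
import math


def front_to_back_walls(grid, px, py, angle):
    """Flood fill from the player's cell, visiting cells by centre Z.

    Returns the solid cells reached as (x, y, z) in the order popped.
    """
    # Delta Z for one step along each primary axis, computed once.
    dz_x = math.cos(angle)
    dz_y = math.sin(angle)
    cx, cy = int(px), int(py)
    # "Slightly more complicated maths" for the first block only.
    z_start = (cx + 0.5 - px) * dz_x + (cy + 0.5 - py) * dz_y
    heap = [(z_start, cx, cy)]
    seen = {(cx, cy)}
    walls = []
    steps = ((1, 0, dz_x), (-1, 0, -dz_x), (0, 1, dz_y), (0, -1, -dz_y))
    while heap:
        z, x, y = heapq.heappop(heap)
        if grid[y][x] == "#":
            walls.append((x, y, z))
            continue                      # do not flood through solid blocks
        for dx, dy, dz in steps:
            nx, ny, nz = x + dx, y + dy, z + dz   # neighbour Z is one addition
            if not (0 <= ny < len(grid) and 0 <= nx < len(grid[ny])):
                continue
            if (nx, ny) in seen or nz < 0.0:      # assumed culling: behind player
                continue
            seen.add((nx, ny))
            heapq.heappush(heap, (nz, nx, ny))
    return walls


corridor = ["#####",
            "#...#",
            "#####"]
walls = front_to_back_walls(corridor, 1.5, 1.5, 0.0)
```

Each heap operation is logarithmic in the queue size, which is the overhead that would have to be weighed against plain raycasting on a 68k.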

I am unsure about the overall performance however. Is it worthwhile to implement a priority queue on 68k to process this number of items?

The other way I can think of is to go more brute-force. Start by sweeping all cells within the view field. (I presume you would use 4 different sweeps, one per quadrant.) I'm not sure if it matters whether you do an X-major or Y-major sweep. Then, try to combine that with a scheme to skip over certain locations because they are known to be occluded. (In a sense, this becomes sweep + incremental raycasting only from the corners of already-found blocks.) Mind, the latter approach is half-baked and may end up with complex maths (multiplies etc) in many places.
Old 19 July 2018, 20:11   #136
rothers
Registered User
 
Join Date: Apr 2018
Location: UK
Posts: 487
I'm just putting this explanation of BSP trees here for reference.

[Embedded YouTube video]

I'm playing around with a render engine and found it useful, thought it was worth watching if anyone is interested in how BSP trees can be faster than normal raycasting.
Old 24 July 2018, 19:02   #137
ReadOnlyCat
Code Kitten
 
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
I failed to find any mention of code generation for the rendering in this thread (I might have missed it though).

Wolfenstein3D does generate unrolled versions of the scaling routines for a given number (if not all) of span heights, so that each call to DrawSpan() essentially uses an "optimal" routine for the current span height.
(I think it might even create a variable sized cache of these depending on the available memory but my memory might be failing me there.)

Did you try that or did you code many routines for various heights by hand yourself?

Quote:
Originally Posted by chb View Post
Just an idea: If you render from the middle to the upper and lower end, you can exploit some symmetry. if you put texel i above the center, you'll always put texel h-i-1 below (h texture height). If you store your texture 90 deg. rotated and scramble it like 0,h-1,1,h-2,2,h-3..., you can read and write a word with two pixels instead of a byte with one pixel
This is a good idea but one should probably map pixels together like so:
(0, h/2), (1, h/2 + 1), (2, h/2 + 2), etc.
This does not give any speed advantage but I am worried that using symmetrical pairs (start, end), (start+1, end-1), etc. could create artifacts at wall mid-height especially when the scaling is not integer because both pixels h/2-1 and h/2 will be present or absent at the same time.
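
To compare the two pairings side by side, here is a tiny Python sketch that reorders one texture column both ways. The function names are invented for this example, and an even texture height is assumed.

```python
def scramble_mirror(column):
    """chb's layout: 0, h-1, 1, h-2, ... (texel i paired with h-i-1)."""
    h = len(column)
    out = []
    for i in range(h // 2):
        out += [column[i], column[h - 1 - i]]
    return out


def scramble_half_offset(column):
    """Alternative layout: (0, h/2), (1, h/2 + 1), ... as suggested here."""
    half = len(column) // 2
    out = []
    for i in range(half):
        out += [column[i], column[half + i]]
    return out


column = list(range(8))
mirrored = scramble_mirror(column)          # pairs meet at the middle: (3, 4)
half_offset = scramble_half_offset(column)  # middle texels 3 and 4 are in different pairs
```

In the mirrored layout texels h/2-1 and h/2 share one word, so they are kept or dropped together when scaling; in the half-offset layout they never share a word.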

Quote:
Originally Posted by LaBodilsen View Post
That is just incredible, and close to insane. Having something like this running mostly at 25fps on a plain A500 would have been regarded as "magic" 25 years ago.
Still is.
But if I am not mistaken, this is rendering only right? No raycasting yet?

Quote:
Originally Posted by LaBodilsen View Post
OT: I've coded my own non-textured wall renderer hacked into your "framework". It's not using unrolled loops per se, and is dog slow (worst case is over 4 frames), but it was just for learning purposes, and it's all starting to make a lot more sense now.
I am sooo tempted to do the same but I have an Oric version to write first.

Quote:
Originally Posted by sandruzzo View Post
I'll try to write some code, but thinking about it, since everything is static and known, I think that linear interpolation would be fast.
As britelite mentioned, by using columns (called "spans" in Carmack's parlance) one takes advantage of the fact that the texels and frame buffer pixels are aligned for each column. This allows the use of a super simple one-dimensional linear scaling routine for linear interpolation.

Any technique which does not draw a single column requires linear interpolation/scaling on two dimensions and loses the advantage obtained by doing the linear interpolation on X only once per column.

On raycasting:

(About Master484's pixel comparison suggestion)
Quote:
Originally Posted by britelite View Post
I have to agree with Kalms here, posting random suggestions without displaying any knowledge of the subject at hand is better suited for the other Wolf3d-thread.
To his credit though, Master484's suggestion got me thinking, but on the raycasting side, not the rendering one.

I will keep the suspense running for now but how do you clear the chip frame buffer before the next draw? And how many cycles do you use for that?

Quote:
Originally Posted by britelite View Post
I noticed that you could split the room into four segments for the flood fill, with the player position being the dividing point. Then, taking the player's field of view into account, you can directly disregard at least two of the segments. And when hitting a solid block, you would only add the vertices/wall segment depending on which direction you're filling from, and not for the whole block.
Quote:
Originally Posted by rothers View Post
I'm just putting this explanation of BSP trees here for reference.

I'm playing around with a render engine and found it useful, thought it was worth watching if anyone is interested in how BSP trees can be faster than normal raycasting.
Yup, I was reading the (slightly off-topic, isn't it?) discussion about raycasting and thinking that BSPs were probably useful to determine the list of surfaces to draw at a minimal cost.

If I remember correctly, if there are no intersecting walls, given the linear nature of a Wolf3D "labyrinth", a BSP should give the list of all visible walls for a given position/orientation of the player, eliminating the need to raycast at every column.
Old 24 July 2018, 20:13   #138
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Quote:
Originally Posted by ReadOnlyCat View Post
I failed to find any mention of code generation for the rendering in this thread (I might have missed it though).
It's explained in the first few pages of the thread, but in a nutshell I calculate the scaling for all different heights and then generate an unrolled loop of move.w (a1),(a0)+ and move.w (a1)+,(a0)+ instructions for each column.
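
For readers who have not gone back to the first pages, here is a rough Python sketch of what such a generator could look like. It is not britelite's actual code: the function names are hypothetical, a 64-texel texture column is assumed, and only the upscaling case (strip height at least the texture height) is covered, because the two instruction forms quoted above can only repeat a source word or step to the next one.

```python
TEX_HEIGHT = 64   # assumed number of source words per texture column


def generate_strip_code(height, tex_height=TEX_HEIGHT):
    """Emit one move.w per destination word for a strip of `height`.

    "move.w (a1),(a0)+"  copies the current source word and stays on it.
    "move.w (a1)+,(a0)+" copies it and advances to the next source word.
    """
    if height < tex_height:
        raise ValueError("this sketch only handles height >= tex_height")
    code = []
    for i in range(height):
        src = i * tex_height // height
        nxt = (i + 1) * tex_height // height
        code.append("move.w (a1)+,(a0)+" if nxt > src else "move.w (a1),(a0)+")
    return code


def run_strip(code, strip):
    """Tiny interpreter for the two instructions, to check the output."""
    out, a1 = [], 0
    for instruction in code:
        out.append(strip[a1])
        if instruction.startswith("move.w (a1)+"):
            a1 += 1
    return out


# A strip twice the texture height repeats every source word once.
doubled = run_strip(generate_strip_code(128), list(range(64)))
```

Precalculating one such routine per height trades memory for speed: the inner loop has no counters, compares, or branches left.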

Quote:
But if I am not mistaken, this is rendering only right? No raycasting yet?
Indeed, hence the title of the thread. I precalculated the raycasting-part so that the cpu-usage would be constant, which helps when trying to optimize the rendering part.

Quote:
I will keep the suspense running for now but how do you clear the chip frame buffer before the next draw? And how many cycles do you use for that?
I currently have two buffers, and I clear one with the blitter while rendering to the other.
Old 29 August 2018, 16:27   #139
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
So, I finally had some time to work on the realtime raycasting. Even though this is maybe a bit off-topic as this thread is about the rendering part of things, I thought I'd post a little preview here again.

The raycasting isn't yet very optimized and the collision detection is a bit wonky, but at least you can now run around the little maze I prepared for you all (the exit is marked with a few blue bricks, although you can't really exit yet).

I still have the sprite rendering and special cases, like doors, on my to-do list, and this will probably be the last preview I'll post here. The next version will hopefully be done in time for Easter next year, with at least a few fully playable levels.
Old 30 August 2018, 13:34   #140
LaBodilsen
Registered User
 
Join Date: Dec 2017
Location: Denmark
Posts: 179
Quote:
Originally Posted by britelite View Post
So, I finally had some time to work on the realtime raycasting. Even though this is maybe a bit off-topic as this thread is about the rendering part of things, I thought I'd post a little preview here again.
NIIICE!..

Quote:
The raycasting isn't yet very optimized and the collision detection is a bit wonky,...
Well, it's still running fairly smoothly on the old 68k setup, which in itself is pretty impressive stuff. It doesn't like fast setups though, as the movement becomes a little jittery. But that is not the target platform anywho, so all is good.

Quote:
I still have the sprite rendering and special cases, like doors, on my to-do list, and this will probably be the last preview I'll post here. The next version will hopefully be done in time for Easter next year, with at least a few fully playable levels.
Any kind of "build blog" posted here would be greatly appreciated, no need to post previews