English Amiga Board


Old 03 December 2007, 12:11   #101
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Thorham View Post
Thanks, I might just do that. First I want to know exactly how opening screens works because, if at all possible, I want to understand the code I use. That's why my c2p routine is unoptimized: I don't completely understand all the optimizations.
Happy RTFM then

Quote:
Originally Posted by Thorham View Post
That's interesting. It would add functionality to your viewer, and save me time doing one from scratch (more or less). I'm going to see if I can write a proper bmp.s for your source code.
Not hard to do, just 3 routines to write. You may have a look in ilbm.s and gif.s for examples. And, of course, don't hesitate to ask me !

Quote:
Originally Posted by Thorham View Post
Actually it's not. Code which modifies itself during runtime is evil. Copying code around at the init stage of a program is not, as long as the code itself stays the same. It's just like runtime code generation where you generate a jump table each time the running program gets new data. AmigaOS actually does its code relocation by modifying the code itself. Since it happens before the code is executed, this is fine.
This would be fine if it didn't rely on any particular processor implementation. It'll be a complete waste on 040/060. And you're lucky there is no 070+

Anyway, the gain on 020/030 would not be enormous, assuming you gain anything at all. If you think you can really get a good deal of performance like that, then I want to see it.

Then again, I've just disassembled too much code with holes in it.

I remember having seen code where each function was aligned to long boundaries... or wanted to be so. The linker apparently didn't respect this and all code ended up unaligned
... and, of course, completely unreadable.
Quote:
Originally Posted by Thorham View Post
Sorry about the archive I've uploaded being rather big. The three bmps I've included are rather large (in particular the 1280x1024 one: almost 4 MB). And I just couldn't resist adding some extras: IrfanView's ability to display iff files properly is just all too handy, and it contains a very flexible batch conversion feature, so I included it.
You can display iff files properly with acdsee too.
The big archive is no problem, I've got room on my CF card


EDIT: thanks a lot for your pics, Thorham. They let me definitively validate my little "vbrb" code, which you can enable or disable via an equate at the beginning of the source.
Just disable it, then display your 1024x768 image (use bmptoppm to convert it) : scroll it all the way to the right and look.
Now do it again with the equ reactivated...

Last edited by meynaf; 03 December 2007 at 12:17.
Old 03 December 2007, 13:58   #102
Thorham
Computer Nerd
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,840
Quote:
Originally Posted by meynaf
Happy RTFM then
I'm afraid you have me at a disadvantage there. What's RTFM? (Until a month or two ago, I didn't even know what LOL meant.)

Quote:
Originally Posted by meynaf
Not hard to do, just 3 routines to write. You may have a look in ilbm.s and gif.s for examples. And, of course, don't hesitate to ask me !
Good, that sounds easy enough.

Quote:
Originally Posted by meynaf
Anyway, the gain on 020/030 would not be enormous, assuming you gain anything at all. If you think you can really get a good deal of performance like that, then I want to see it.
It does make 256 byte loops fit in the cache exactly, and there might be some gain from that.

Quote:
Originally Posted by meynaf
You can display iff files properly with acdsee too.
The big archive is no problem, I've got room on my CF card
The last time I tried acdsee (a million years ago), it didn't handle ham iffs properly. And isn't it shareware? IrfanView is free.

Quote:
Originally Posted by meynaf
EDIT: thanks a lot for your pics, Thorham. They let me definitively validate my little "vbrb" code, which you can enable or disable via an equate at the beginning of the source.
Just disable it, then display your 1024x768 image (use bmptoppm to convert it) : scroll it all the way to the right and look.
Now do it again with the equ reactivated...
You're welcome.

Ah, now I've got a good reason to test the software under winuae. I'm using my miggy's composite output with a video/svideo to vga converter which refuses to display max overscan properly. Winuae does this properly, I believe, so I'll check it out.
Old 03 December 2007, 13:59   #103
BippyM
Global Moderator
 
Join Date: Nov 2001
Location: Derby, UK
Age: 48
Posts: 9,355
Quote:
Originally Posted by Thorham View Post
I'm afraid you have me at a disadvantage there. What's RTFM? (Until a month or two ago, I didn't even know what LOL meant.)
Read The Fucking Manual
Old 03 December 2007, 14:20   #104
meynaf
Quote:
Originally Posted by Thorham View Post
I'm afraid you have me at a disadvantage there. What's RTFM? (Until a month or two ago, I didn't even know what LOL meant.)
bippym explained it

Quote:
Originally Posted by Thorham View Post
It does make 256 byte loops fit in the cache exactly, and there might be some gain from that.
I still need to see a use for it in real life. Aligning code makes the thing unreadable while debugging, makes the code grow in size, and may become utterly useless on another cpu.
Quote:
Originally Posted by Thorham View Post
The last time I tried acdsee (a million years ago), it didn't handle ham iffs properly. And isn't it shareware? IrfanView is free.
It did ok when I tried (last week). Well, of course, it ain't free.
Alternatively you can use the viewer I wrote, it supports ham pictures

Quote:
Originally Posted by Thorham View Post
Ah, now I've got a good reason to test the software under winuae. I'm using my miggy's composite output with a video/svideo to vga converter which refuses to display max overscan properly. Winuae does this properly, I believe, so I'll check it out.
You don't need max overscan, just use the autoscroll feature (move the mouse to the right until it scrolls).
And you'll see that, without special treatment, the ham fringing nearly reaches half of the screen !


If you simply want a viewer which does scaling in order to make the image fit on the screen, then John Hendrikx already did the job with fastview (see attachment).
So long with Visage !

Last edited by meynaf; 12 May 2011 at 08:32.
Old 03 December 2007, 15:46   #105
Thorham
Quote:
Originally Posted by meynaf
If you simply want a viewer which does scaling in order to make the image fit on the screen, then John Hendrikx already did the job with fastview (see attachment).
So long with Visage !
What I want is scaling to fit the screen and fix the aspect ratio for the 3x1 ham mode. Furthermore, the scaling seems to just skip pixels, which is ugly... And to add to that, scaled 24 bit bmps are not displayed properly. I think you can do a better job here!

I've included a crude command line version of my 24 bit bmp viewer here so you can get an idea of what I want for bmp viewing. Just use it with the test pictures, in particular the 1280x1024 one, and copy them to the ram disk to get the best speed impression.

Check it out (if you haven't already). By the way, this time it's only the executable.

ShowBmp24.zip
Old 03 December 2007, 16:09   #106
meynaf
Quote:
Originally Posted by Thorham View Post
What I want is scaling to fit the screen and fix the aspect ratio for the 3x1 ham mode. Furthermore, the scaling seems to just skip pixels, which is ugly... And to add to that, scaled 24 bit bmps are not displayed properly. I think you can do a better job here!
I dunno how it does things, I simply know that it has the feature
Quote:
Originally Posted by Thorham View Post
I've included a crude command line version of my 24 bit bmp viewer here so you can get an idea of what I want for bmp viewing. Just use it with the test pictures, in particular the 1280x1024 one, and copy them to the ram disk to get the best speed impression.

Check it out (if you haven't already). By the way, this time it's only the executable.
Already checked if it's the same as last week. Did it this week-end.

Your bmpview.s assembled correctly with phxass (with the "case" argument because there are some lowercase/uppercase mismatches).
However, for buffer storage you should really use a bss section and the ds (not dcb) directive : your code is 80 kb !
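A minimal sketch of the difference, assuming phxass-style directives; the section names, the labels and the 65536-byte size are made up for illustration and are not taken from bmpview.s:
Code:
 ; dcb.b in a code section: every one of these zero bytes is stored in the executable file
         section  main,code
buffer_dcb:
         dcb.b    65536,0

 ; ds.b in a bss section: the file only records the size of the hunk,
 ; and the memory is allocated when the program is loaded
         section  buffers,bss
buffer_ds:
         ds.b     65536
With the second form the buffer adds next to nothing to the file on disk, which is where the smaller executable comes from.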

Anyway, it worked ok.

You may want to test my 50%/50% quick rendering. See attached file.
The 800x600 -> 400x300 image is done in 31 frames. Now you've got some work to beat it.
As I still don't display bmps, you will have to run bmptoppm first (if you haven't done so already).

Also, the goal was speed, not quality. The code for high-quality is present in the source given here but untested and probably doesn't work (as the low-qual one didn't when I first tried it).
Note that a modified version of ham8_test.s is necessary for that.

Last edited by meynaf; 12 May 2011 at 08:32.
Old 03 December 2007, 16:50   #107
Thorham
Quote:
Originally Posted by meynaf
Already checked if it's the same as last week. Did it this week-end.
It's more or less the same. The interlace bug seems to have been fixed (needs more checking) and it can be used from the command line. Nothing special.

Quote:
Originally Posted by meynaf
Your bmpview.s assembled correctly with phxass (with the "case" argument because there are some lowercase/uppercase mismatches).
However, for buffer storage you should really use a bss section and the ds (not dcb) directive : your code is 80 kb !
I know, I know! It's terrible. I have wanted to fix this. It's just a left-over from a simple dos frame I've made, and never had to fix it, yet. It's definitely not staying like this... Also has to do with asmone habits: Just dcb.b everything so you can use the built-in hex viewer (and then leave it like that).

Quote:
Originally Posted by meynaf
You may want to test my 50%/50% quick rendering. See attached file.
The 800x600 -> 400x300 image is done in 31 frames. Now you've got some work to beat it.
As I still don't display bmps, you will have to run bmptoppm first (if you haven't done so already).
Yep, it's fast. I've only run the program though, and still have to look into it. Very handy for very large images, indeed.

Haven't tried the hq code, yet. We'll see if it works or not, I'll keep you posted as I have access to my miggy all week.
Old 03 December 2007, 17:20   #108
meynaf
Quote:
Originally Posted by Thorham View Post
It's more or less the same. The interlace bug seems to have been fixed (needs more checking) and it can be used from the command line. Nothing special.
Ok then. I know what it does.

Quote:
Originally Posted by Thorham View Post
I know, I know! It's terrible. I have wanted to fix this. It's just a left-over from a simple dos frame I've made, and never had to fix it, yet. It's definitely not staying like this... Also has to do with asmone habits: Just dcb.b everything so you can use the built-in hex viewer (and then leave it like that).
Isn't asmone's hex viewer able to access bss sections ?
Curiously the case thingy annoys me much less than the arrays of null bytes in the executable
If you like optimisations, even the smallest ones, then you have the option here to reduce your exe loading time

Quote:
Originally Posted by Thorham View Post
Yep, it's fast. I've only run the program though, and still have to look into it. Very handy for very large images, indeed.

Haven't tried the hq code, yet. We'll see if it works or not, I'll keep you posted as I have access to my miggy all week.
Test it if you want, but it may crash your machine, destroy your house, steal your girlfriend... You've been warned

For the source you won't see much difference from what you knew - apart from the scale.

However, there *was* a possible optimization, and when I spotted it, I thought : oooooooops...

Indeed...
cmp.l a5,d6
is faster than
cmp.l d6,a5
because the latter is cmpa, not cmp ! (4 cycles vs 2)

I have modified my code to take this into account. 3 cmpa became regular cmp (and bhs became bls). Not much effect though (3 frames or so in 1024x768).
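A minimal sketch of that swap, assuming a5 is a running pointer and d6 holds the limit it is compared against (the register roles and the label name are assumptions, not taken from the actual source):
Code:
 ; before: the destination is an address register, so this is really cmpa
         cmp.l    d6,a5          ; computes a5-d6
         bhs      done           ; taken when a5 >= d6 (unsigned)
 ; after: the destination is a data register, so it stays a plain cmp
         cmp.l    a5,d6          ; computes d6-a5
         bls      done           ; taken when d6 <= a5, i.e. the same condition
done:    rts
Swapping the operands flips the sign of the difference, which is why the branch condition has to be mirrored as well.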


Finally, I did my speed testings this week-end. Here are the results.

Test #1
Code:
 move.l (a7),a5        ; 4 clock cycles if dcache active, 8 else
vs
 move.l #adr,a5        ; 6 clock cycles
The 020 has no data cache, and it needs optimizations more because it's slower than the 030 (not because of the timings, which are the same, but because of the lower clock frequency).
On 030 you gain 2, but you lose 2 on 020.
Clock cycles are more important @14mhz than @50mhz.
Sorry pal, but I won't put your optim in my version because of this.

Test #2
Code:
 move.l #adr,a5             ; 6 cycles
 move.l (a5,d4.l),a6        ; 11 cycles (7 if in dcache)
vs
 move.l (adr.l,d4.l),a6     ; 17 cycles (14 if in dcache)
-> uninteresting.

And take care, as you must write :
Code:
 move.l (adr.l,d4.l),a6
and certainly not :
Code:
 move.l (adr,d4.l),a6
because otherwise phxass will do stupid things. I didn't remember that and had crashes
Old 03 December 2007, 18:30   #109
Thorham
Quote:
Originally Posted by meynaf
Isn't asmone's hex viewer able to access bss sections ?
Curiously the case thingy annoys me much less than the arrays of null bytes in the executable
If you like optimisations, even the smallest ones, then you have the option here to reduce your exe loading time
I didn't know phxass was case sensitive. Personally, this is one of the things I dislike about C/C++, and I firmly believe it shouldn't be in any language, as it's confusing.

And, yes, I like optimizations. Frankly I didn't know about bss sections. Asmone should be able to handle them, though, and if it does, I'll start using them, saves you some allocs in the code.

Quote:
Originally Posted by meynaf
Test it if you want, but it may crash your machine, destroy your house, steal your girlfriend... You've been warned


Quote:
Originally Posted by meynaf
I have modified my code to take this into account. 3 cmpa became regular cmp (and bhs became bls). Not much effect though (3 frames or so in 1024x768).
It's still three frames on top of the other ones. This one should bring it down from the original 155 frames to 137 or so, and that's just from optimizing the code itself, not the algorithm. When you look at it from that point of view, it really does add up.

Quote:
Originally Posted by meynaf
The 020 has no data cache, and it needs optimizations more because it's slower than the 030 (not because of the timings, which are the same, but because of the lower clock frequency).
On 030 you gain 2, but you lose 2 on 020.
Clock cycles are more important @14mhz than @50mhz.
Sorry pal, but I won't put your optim in my version because of this.
If you want it to be useful on a 020 then you don't have a choice

Quote:
Originally Posted by meynaf
I didn't remember that and had crashes
Oh yeah, the good old 'assembler screwed up, but I stared at my code for three hours' problem. I remember having this sort of thing with some other software... Drives one mad
Old 04 December 2007, 10:35   #110
meynaf
Quote:
Originally Posted by Thorham View Post
I didn't know phxass was case sensitive. Personally, this is one of the things I dislike about C/C++, and I firmly believe it shouldn't be in any language, as it's confusing.
With phxass you can turn case sensitivity off (either via command line or by changing global opts).
No big deal for me either, because nearly all my labels are in lower case

Quote:
Originally Posted by Thorham View Post
And, yes, I like optimizations.
Wanna get thousands of lines of code to optimize ? I might have something for you then

Quote:
Originally Posted by Thorham View Post
Frankly I didn't know about bss sections. Asmone should be able to handle them, though, and if it does, I'll start using them, saves you some allocs in the code.
Separating data from code has the disadvantage of not being able to access the data with pc-relative code (you can compensate for this with the basereg directive) ; on the other hand, if memory is fragmented, your huge blocks of data will fit better. Not to mention the much smaller executable you get with bss sections (and I love small exes).

Quote:
Originally Posted by Thorham View Post
It's still three frames on top of the other ones. This one should bring it down from the original 155 frames to 137 or so, and that's just from optimizing the code itself, not the algorithm. When you look at it from that point of view, it really does add up.
I really shouldn't have written this in the first place anyway.
However, would you make an optimization which uses 240k of memory, just for gaining 0.02 seconds at best (unnoticeable) for an (unscaled) 1280x1024 image ?

Quote:
Originally Posted by Thorham View Post
If you want it to be useful on a 020 then you don't have a choice
Sure. I like the speed of my gif decoder on such a machine. Makes me feel powerful

Quote:
Originally Posted by Thorham View Post
Oh yeah, the good old 'assembler screwed up, but I stared at my code for three hours' problem. I remember having this sort of thing with some other software... Drives one mad
Problem is : it wasn't the assembler but the generated program that crashed, because the assembler generated a mess of offsets
Fortunately it didn't call the guru.
The bad thing about this is that the bug is clearly documented in phxass docs
Old 04 December 2007, 14:59   #111
Thorham
Quote:
Originally Posted by meynaf
With phxass you can turn case sensitivity off (either via command line or by changing global opts).
No big deal for me either, because nearly all my labels are in lower case
Good thing you can, because personally I don't like it.

Quote:
Originally Posted by meynaf
Wanna get thousands of lines of code to optimize ? I might have something for you then
What is it?

Quote:
Originally Posted by meynaf
Separating data from code has the disadvantage of not being able to access the data with pc-relative code (you can compensate for this with the basereg directive) ; on the other hand, if memory is fragmented, your huge blocks of data will fit better. Not to mention the much smaller executable you get with bss sections (and I love small exes).
Small execs definitely are nice, and this way, most of what you code will end up pretty small. Using dcb for everything is a nasty habit I got into.

Quote:
Originally Posted by meynaf
I really shouldn't have written this in the first place anyway.
However, would you make an optimization which uses 240k of memory, just for gaining 0.02 seconds at best (unnoticeable) for an (unscaled) 1280x1024 image ?
That depends. If it's only once or twice, then sure. If it starts using up whole megabytes then no. It's because I'm used to having the memory to do so, and I don't take plain a1200s into account. If I did, I would probably make a version specifically for such a machine. Much depends on the application, too. If the program doesn't use up a whole lot of memory to begin with, then a couple of such optimizations are fine with me.

Quote:
Originally Posted by meynaf
Sure. I like the speed of my gif decoder on such a machine. Makes me feel powerful
I wish I could see that, but I can't turn my accelerator off as this requires pressing a key and I'm using a custom keyboard interface (original keyboard broken). Sometimes it's a problem if I do want to check something on a plain a1200.

Quote:
Originally Posted by meynaf
Problem is : it wasn't the assembler but the generated program that crashed, because the assembler generated a mess of offsets
Fortunately it didn't call the guru.
The bad thing about this is that the bug is clearly documented in phxass docs
But do you need to read the entire manual, just to assemble some code? To me, it sounds like a quirk which should have just been fixed in the first place, especially if older versions of phxass behave like that, too.
Old 04 December 2007, 15:46   #112
meynaf
Quote:
Originally Posted by Thorham View Post
What is it?
I knew you'd bite
It has nothing to do with the current subject, so it will require opening a new thread.
The basic idea is that I re-sourced the (no longer maintained) mpega.library to get more speed (-> more quality) out of it. I gained 10% or so ; still unsatisfying, and a lot of code hasn't been touched. The job looks like jpeg decoding because of the dct and all that sort of thing.
Interested ?

Quote:
Originally Posted by Thorham View Post
That depends. If it's only once or twice, then sure. If it starts using up whole megabytes then no. It's because I'm used to having the memory to do so, and I don't take plain a1200s into account. If I did, I would probably make a version specifically for such a machine. Much depends on the application, too. If the program doesn't use up a whole lot of memory to begin with, then a couple of such optimizations are fine with me.
Hmmm... so you still want to do your *16 thingy to remove that lsr, don't you ?
But please compute the gain on a 400x300 image, then count the time taken to fill your 256k buffer... (I can give you the results if you feel lazy)

Quote:
Originally Posted by Thorham View Post
I wish I could see that, but I can't turn my accelerator off as this requires pressing a key and I'm using a custom keyboard interface (original keyboard broken). Sometimes it's a problem if I do want to check something on a plain a1200.
So you're also unable to use programs that bang on the hardware to read the keyboard, are you ?

Quote:
Originally Posted by Thorham View Post
But do you need to read the entire manual, just to assemble some code? To me, it sounds like a quirk which should have just been fixed in the first place, especially if older versions of phxass behave like that, too.
I agree. But (in an answer for a bug report) Frank Wille told me the code for phxass ended up quite messy, making some "architectural" bugs quite hard to fix (without breaking everything, that is). So I excuse him
Old 04 December 2007, 16:19   #113
Thorham
Quote:
Originally Posted by meynaf
I knew you'd bite
It has nothing to do with the current subject, so it will require opening a new thread.
The basic idea is that I re-sourced the (no longer maintained) mpega.library to get more speed (-> more quality) out of it. I gained 10% or so ; still unsatisfying, and a lot of code hasn't been touched. The job looks like jpeg decoding because of the dct and all that sort of thing.
Interested ?
Yes, I am! Taking a look won't hurt.

Quote:
Originally Posted by meynaf
Hmmm... so you still want to do your *16 thingy to remove that lsr, don't you ?
But please compute the gain on a 400x300 image, then count the time taken to fill your 256k buffer... (I can give you the results if you feel lazy)
I will do it today.

Quote:
Originally Posted by meynaf
So you're also unable to use programs that bang on the hardware to read the keyboard, are you ?
No, I can use these. It's one of those interfaces which plugs into the keyboard connector internally, and enables you to use a standard pc keyboard. The interface simply doesn't emulate an a1200 keyboard 100% accurately, not a problem really.

Quote:
Originally Posted by meynaf
I agree. But (in an answer for a bug report) Frank Wille told me the code for phxass ended up quite messy, making some "architectural" bugs quite hard to fix (without breaking everything, that is). So I excuse him
This is exactly why you should write neat and tidy code for larger projects. It makes them much easier to maintain. This is also the reason why I like languages such as visual basic .net: the object orientation and clear syntax make it easy to write neat code, although assembler is fine for smaller projects like image viewers, for example. And messy code is also fine for these, as there aren't that many internal dependencies. For something like an assembler, you don't really have an excuse to write messy code, however, as there's way too much relying on such a program. It's crucial for software like that to be able to fix every found bug.
Old 04 December 2007, 16:56   #114
meynaf
Quote:
Originally Posted by Thorham View Post
Yes, I am! Taking a look won't hurt.
I'll open a thread as soon as I can upload the code.

Quote:
Originally Posted by Thorham View Post
I will do it today.
Hint : 8 clock cycles per write ; *16 (because you write the same thing 16 times), *4096 (number of colors to write).
Just compare this to 4 clock cycles per pixel.
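Worked out with the numbers from the hint, written as plain assembler equates. Nothing here is measured: the symbol names are made up, 400x300 is the image size mentioned earlier in the thread, and the 4 cycles per pixel is assumed to be the saving from removing the lsr:
Code:
FILL_CYCLES  equ  8*16*4096      ; = 524288 cycles to fill the table
GAIN_CYCLES  equ  4*400*300      ; = 480000 cycles saved on a 400x300 image
On these figures, filling the table costs more than the lookup saves for an image of that size.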

Quote:
Originally Posted by Thorham View Post
No, I can use these. It's one of those interfaces which plugs into the keyboard connector internally, and enables you to use a standard pc keyboard. The interface simply doesn't emulate an a1200 keyboard 100% accurately, not a problem really.
Poor mite... a pc keyboard...

However I still suggest you try to press the "2" key (not numpad) while booting.
And I think Vesalia Computer still sells A1200 keyboards.

Quote:
Originally Posted by Thorham View Post
This is exactly why you should write neat and tidy code for larger projects. It makes them much easier to maintain. This is also the reason why I like languages such as visual basic .net: the object orientation and clear syntax make it easy to write neat code, although assembler is fine for smaller projects like image viewers, for example. And messy code is also fine for these, as there aren't that many internal dependencies. For something like an assembler, you don't really have an excuse to write messy code, however, as there's way too much relying on such a program. It's crucial for software like that to be able to fix every found bug.
Some of the bugs would have required rewriting everything, I'm afraid. Development of it was abandoned ; we're lucky just to get support.
Old 04 December 2007, 18:12   #115
Thorham
Quote:
Originally Posted by meynaf
I'll open a thread as soon as I can upload the code.
Cool, I'm looking forward to it.

Quote:
Originally Posted by meynaf
Hint : 8 clock cycles per write ; *16 (because you write the same thing 16 times), *4096 (number of colors to write).
Just compare this to 4 clock cycles per pixel.
It might be possible to get this code to be faster...

Quote:
Originally Posted by meynaf
Poor mite... a pc keyboard...

However I still suggest you try to press the "2" key (not numpad) while booting.
And I think Vesalia Computer still sells A1200 keyboards.
I'm not crying. The a1200 has a low quality keyboard, that will wear out too fast. Being able to replace a broken keyboard with another one from any computer shop for a few bucks is just great, and the quality is generally much higher. I'm quite happy with it

I checked Vesalia and they do still have them, but at almost 20 bucks I'm not buying any unless I'm going to fully restore my a1200 (it's a bit messy, at the moment, but electronically in great shape), and that's not happening any time soon, I'm afraid.

By the way: if you want a brand new a1200, then amikit has them (old stock of new amigas). Probably not for very long... They also have refurbished a1200s for less than 60 euros (last time I checked)!

Quote:
Originally Posted by meynaf
Some of the bugs would have required rewriting everything, I'm afraid. Development of it was abandoned ; we're lucky just to get support.
That's bad news. I suppose phxass' users are indeed lucky to still receive support. Frank Wille seems like a nice guy; not everyone keeps supporting their abandoned programs.
Old 05 December 2007, 10:37   #116
meynaf
Quote:
Originally Posted by Thorham View Post
It might be possible to get this code to be faster...
You don't give up, do you ?
But how are you going to make a row of 16 move.l a6,(a1)+ go faster ?

Quote:
Originally Posted by Thorham View Post
I'm not crying.
You, surely not. But your miga ? Seeing itself looking like a dumb peecee ?
Seeing that it has - horror ! - windows keys ???

Quote:
Originally Posted by Thorham View Post
The a1200 has a low quality keyboard, that will wear out too fast. Being able to replace a broken keyboard with another one from any computer shop for a few bucks is just great, and the quality is generally much higher. I'm quite happy with it
Mine took more than 10 years of intensive use to wear out. And PC components are definitely not built to last.
Alternatively you could look for an A4k keyboard...

Quote:
Originally Posted by Thorham View Post
I checked Vesalia and they do still have them, but at almost 20 bucks I'm not buying any unless I'm going to fully restore my a1200 (it's a bit messy, at the moment, but electronically in great shape), and that's not happening any time soon, I'm afraid.
20 bucks for several years of use isn't very much. I think it's worth it, especially because I hate the pc keyboard layout.

Quote:
Originally Posted by Thorham View Post
By the way: if you want a brand new a1200, then amikit has them (old stock of new amigas). Probably not for very long... They also have refurbished a1200s for less than 60 euros (last time I checked)!
I already have one full spare A1200, so...

Quote:
Originally Posted by Thorham View Post
That's bad news. I suppose phxass' users are indeed lucky to still receive support. Frank Wille seems like a nice guy; not everyone keeps supporting their abandoned programs.
The current version works well enough for me ; I'm using it because it has all the features I need and it's the only one able to optimize forward branches.
Otherwise we could make a new, open-sourced, asm


Hmmm... it looks like we're a bit off-topic right now.

I think I can go with my renderer like it is - thanks for your help.
... or did you spot some further opt. that we can do ?

After we're done with the rescaling, the next step for me is the jpeg decoder ; this may require opening another thread.
Old 05 December 2007, 11:08   #117
Thorham
First, I'm sorry for not doing what I said I would do. I was rather tired yesterday and went to bed early, so I'm going to do it now.

Quote:
Originally Posted by meynaf
You don't give up, do you ?
But how are you going to make a row of 16 move.l a6,(a1)+ go faster ?
They are actually move.b! After looking at the source code for a few hours, I didn't want to continue, so I didn't optimize this yet.

Quote:
Originally Posted by meynaf
You, surely not. But your miga ? Seeing itself looking like a dumb peecee ?
Seeing that it has - horror ! - windows keys ???
Yes, you're right, those windows keys are very uncool. But the advantages of a peecee keyboard do outweigh the disadvantages any day.

Quote:
Originally Posted by meynaf
Mine took more than 10 years of intensive use to wear out. And Pc components are definitely not built to last.
Alternatively you could look for an A4k keyboard...
That's quite some time. The main problem I have with an a1200 keyboard is that it's internal, and my current setup makes it extremely unhandy to use: There's a flat cable coming out of the side for the hd and cd rewriter, and the whole computer is positioned in such a way that I can swap the amiga keyboard with the pc keyboard, depending on what I'm doing. So I guess the peecee keyboard is a blessing in disguise!


Quote:
Originally Posted by meynaf
20 bucks for several years of use isn't very much. I think it's worth it, especially because I hate the PC keyboard layout.
You're absolutely right if you look at it like that. I still think they're a bit flimsy, though.

Quote:
Originally Posted by meynaf
I already have one full spare A1200, so...
Lucky you! I wish I did...

Quote:
Originally Posted by meynaf
The current version works well enough for me ; I'm using it because it has all the features I need and it's the only one able to optimize forward branches.
Otherwise we can make a new, open-source asm
I'd rather make a compiler. But then I would still need an assembler/linker. Hmmm, don't know about that one.

Quote:
Originally Posted by meynaf
I think I can go with my renderer like it is - thanks for your help.
... or did you spot some further opt. that we can do ?
You're welcome. I don't know. My miggy is turned on right now, ready to have another look. Still, to go from 155 frames to 137 is quite good already, but you never know, so I'll keep you posted.

Quote:
Originally Posted by meynaf
After we're done with the rescaling, the next step for me is the jpeg decoder ; this may require opening another thread.
Yes, this definitely requires a new thread, and I'm looking forward to it.

Filling the larger table actually doesn't make the program slower. Since your bench program takes everything into account, including file reading, and the number of frames still went down, I'd say that speed-wise it's no problem, but I can understand that you don't like the extra memory overhead.

Edit: I've tested pre-calculating the palette table addresses, but unless I'm doing something wrong, this actually made it slower again, so I changed it back to indexes (64kb for 4096x16).

Changing the cmpa to cmp does make a difference here, the number of frames for 800x600 is now 136! Since it was 155, that's about 12% fewer frames.

Last edited by Thorham; 05 December 2007 at 12:46.
Old 05 December 2007, 13:44   #118
meynaf
Quote:
Originally Posted by Thorham
First, I'm sorry for not doing what I said I would do. I was rather tired yesterday and went to bed early, so I'm going to do it now.
Drunk again, eh ? Went to bed early in the morning ?
(sorry... it was just too easy...)

Quote:
Originally Posted by Thorham
They are actually move.b! After looking at the source code for a few hours, I didn't want to continue, so I didn't optimize this yet.
They'll become move.l if you use my opt. of storing the address in the palette but still want to have the lsr removed.

Quote:
Originally Posted by Thorham
Yes, you're right, those windows keys are very uncool. But the advantages of a peecee keyboard do outweigh the disadvantages any day.
The uncool part of a pc keyboard, for me, is more than that. Misplaced ctrl-key I press instead of shift, 3 useless keys at the top right (ever used scroll lock ?), damn "insert" hit by mistake (when it's not num lock)...

Quote:
Originally Posted by Thorham
That's quite some time. The main problem I have with an a1200 keyboard is that it's internal, and my current setup makes it extremely unhandy to use: There's a flat cable coming out of the side for the hd and cd rewriter, and the whole computer is positioned in such a way that I can swap the amiga keyboard with the pc keyboard, depending on what I'm doing. So I guess the peecee keyboard is a blessing in disguise!
Take care, you'll end up with a whole pc one day

Quote:
Originally Posted by Thorham
You're absolutely right if you look at it like that. I still think they're a bit flimsy, though.
It's like a girlfriend, you must be delicate.

Quote:
Originally Posted by Thorham
Lucky you! I wish I did...
You still can...

Quote:
Originally Posted by Thorham
I'd rather make a compiler. But then I would still need an assembler/linker. Hmmm, don't know about that one.
A compiler for what, then ?

Quote:
Originally Posted by Thorham
I don't know. My miggy is turned on right now, ready to have another look. Still, to go from 155 frames to 137 is quite good already, but you never know, so I'll keep you posted.
From 155 to 137 frames ? Errh...

Here are my current values :
500x333 : old=54, new=50, quick=33
800x600 : old=145, new=135, quick=86
1024x768 : old=234, new=217, quick=137

Quote:
Originally Posted by Thorham
Yes, this definitely requires a new thread, and I'm looking forward to it.
Maybe next week.

Quote:
Originally Posted by Thorham
Filling the larger table actually doesn't make the program slower. Since your bench program takes everything into account, including file reading, and the number of frames still went down, I'd say that speed-wise it's no problem, but I can understand that you don't like the extra memory overhead.
My benchmark program doesn't take anything into account before the screen opens.

EDIT:
Quote:
Originally Posted by Thorham
Edit: I've tested pre-calculating the palette table addresses, but unless I'm doing something wrong, this actually made it slower again, so I changed it back to indexes (64kb for 4096x16).

Changing the cmpa to cmp does make a difference here, the number of frames for 800x600 is now 136! Since it was 155, that's about 12% fewer frames.
The cmpa->cmp removed 3 frames for 1024x768. A detail, but it had to be done.

I'm @135 frames right now. Look at my code for the palette table addresses.
But maybe this is incompatible with the 4096*16 writes (because you end up with 256 kb).
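
To put numbers on it, here is a minimal sketch in C (not the actual 68k source). It assumes each of the 4096*16 slots holds either a 1-byte index or a 4-byte address, as a 68k pointer would be; the names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: 4096 entries with 16 slots each, as in "4096*16". */
enum { TABLE_ENTRIES = 4096, SLOTS_PER_ENTRY = 16 };

/* Table size in bytes when every slot stores a 1-byte palette index. */
static size_t table_bytes_with_indexes(void)
{
    return (size_t)TABLE_ENTRIES * SLOTS_PER_ENTRY * sizeof(uint8_t);
}

/* Table size in bytes when every slot stores a 32-bit address instead. */
static size_t table_bytes_with_addresses(void)
{
    return (size_t)TABLE_ENTRIES * SLOTS_PER_ENTRY * sizeof(uint32_t);
}
```

With indexes this comes to 65536 bytes (64 kb), with addresses to 262144 bytes (256 kb).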

Last edited by meynaf; 05 December 2007 at 13:53. Reason: answer to an edit...
Old 05 December 2007, 14:10   #119
Thorham
Quote:
Originally Posted by meynaf
Drunk again, eh ? Went to bed early in the morning ?
(sorry... it was just too easy...)
No comment...

Quote:
Originally Posted by meynaf
They'll become move.l if you use my opt. of storing the address in the palette but still want to have the lsr removed.
Still optimizing the older version, I'm afraid. The one I downloaded last, bugs when you turn the scaling off. Also the quick version is fast, so I'm still trying to milk the hq version. Tried to implement it, but I screwed up...

Quote:
Originally Posted by meynaf
The uncool part of a pc keyboard, for me, is more than that. Misplaced ctrl-key I press instead of shift, 3 useless keys at the top right (ever used scroll lock ?), damn "insert" hit by mistake (when it's not num lock)...
I'm completely used to it, but then again, I've been using it for two years

Quote:
Originally Posted by meynaf
Take care, you'll end up with a whole pc one day


Quote:
Originally Posted by meynaf
A compiler for what, then?
For an object oriented language I've been thinking about. I've already done some work on it in C, but it's just an experiment, basically it's to see if I can do it.

Quote:
Originally Posted by meynaf
From 155 to 137 frames ? Errh...

Here are my current values :
500x333 : old=54, new=50, quick=33
800x600 : old=145, new=135, quick=86
1024x768 : old=234, new=217, quick=137
The 155 frames is from the first version you uploaded, without any of the optimizations. I still have it and use it for comparing the optimized versions.

Quote:
Originally Posted by meynaf
My benchmark program doesn't take anything into account before the screen opens.
It does actually count the file reading, too. I've noticed, because when the image is read from hd the number of frames is higher than from ramdisk (which is what all the values I've posted are based on).

Anyway, I've found another optimization:

move.b (a6)+,d4
move.l d1,d0
sub.l d4,d0
bpl.s .n0
neg.b d0

Can be replaced with:

move.l d1,d0
sub.b (a6)+,d0
bpl.s .n0
neg.b d0

This brings down the frame count to 133 for 800x600.

By the way, the ham table is 64kb!
Old 05 December 2007, 14:59   #120
meynaf
Quote:
Originally Posted by Thorham
Still optimizing the older version, I'm afraid. The one I downloaded last, bugs when you turn the scaling off. Also the quick version is fast, so I'm still trying to milk the hq version. Tried to implement it, but I screwed up...
I screwed up too but I haven't said my last word. This week-end...

Quote:
Originally Posted by Thorham
For an object oriented language I've been thinking about. I've already done some work on it in C, but it's just an experiment, basically it's to see if I can do it.
I'm sure you can do it

Quote:
Originally Posted by Thorham
The 155 frames is from the first version you uploaded, without any of the optimizations. I still have it and use it for comparing the optimized versions.
I've got so many versions that I probably mixed up things

Quote:
Originally Posted by Thorham
It does actually count the file reading, too. I've noticed, because when the image is read from hd the number of frames is higher than from ramdisk (which is what all the values I've posted are based on).
I thought I did it to just take the rendering into account... I'll look at that.

Quote:
Originally Posted by Thorham
Anyway, I've found another optimization:

move.b (a6)+,d4
move.l d1,d0
sub.l d4,d0
bpl.s .n0
neg.b d0

Can be replaced with:

move.l d1,d0
sub.b (a6)+,d0
bpl.s .n0
neg.b d0

This brings down the frame count to 133 for 800x600.
I thought of writing it like that. But it assumes that the difference will never overflow a signed byte, which is not the case: we're subtracting two unsigned bytes to get the difference as a signed value (before taking its absolute value).

EDIT: it *can* be done !
It may overflow a signed byte, but not an unsigned one.
So it must be written that way : (same but with bcc instead of bpl)
move.l d1,d0
sub.b (a6)+,d0
bcc.s .n0
neg.b d0
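
For anyone who wants to check the flag logic, here is a small C model of the two variants. It is only a sketch, not the 68k code itself: the function names are hypothetical, `a` stands for the low byte of d1 and `b` for the byte read through (a6)+.

```c
#include <stdint.h>

/* Models: sub.b (a6)+,d0 / bpl.s .n0 / neg.b d0
   The branch tests the N flag, i.e. bit 7 of the byte result. */
static uint8_t absdiff_bpl(uint8_t a, uint8_t b)
{
    uint8_t d = (uint8_t)(a - b);      /* byte subtraction, wraps mod 256 */
    if (d & 0x80)                      /* N set: treated as negative */
        d = (uint8_t)(0 - d);          /* neg.b */
    return d;
}

/* Models: sub.b (a6)+,d0 / bcc.s .n0 / neg.b d0
   The branch tests the C flag, set when the subtraction borrows (a < b). */
static uint8_t absdiff_bcc(uint8_t a, uint8_t b)
{
    uint8_t d = (uint8_t)(a - b);      /* same byte subtraction */
    if (a < b)                         /* C set: a borrow occurred */
        d = (uint8_t)(0 - d);          /* neg.b */
    return d;
}
```

For a=200 and b=10 the bpl model returns 66 instead of 190, while the bcc model returns 190. Both agree whenever the difference stays below 128.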

Thanks for it. I wouldn't have believed something could still be done !

Last edited by meynaf; 05 December 2007 at 15:18.