24 May 2018, 12:54 | #541 | ||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
Quote:
So rather than one-liner examples, which can be biased toward some particular architecture, a whole routine would be better (especially one that puts some pressure on the register file). Why not a code contest? Everyone interested designs his own ISA (or chooses an existing one to defend) and then writes some routine (doing something useful). We could then finally see who's powerful and who's not. |
||
24 May 2018, 13:05 | #542 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
It's always the same old reason why folks do complicated things in place of simple ones. Something like: - Hey, I'm managing complex projects handling several gigabytes of data! In comparison to: - Huh, I'm doing my work with only a few MB of memory. You see? The usual "we've got the biggest balls" stuff. Unfortunately they do that without realizing it, and telling them does not help. |
|
24 May 2018, 13:21 | #543 |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
|
I do not want to defend any ISA, but I will start with the one most familiar around here:
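Find the maximum and the minimum of two values. Both are already in registers d0 and d1, and the result needs to end up in the same registers.
Code:
    sub.l   d1,d0   ; d0 = d0 - d1, X set on borrow
    subx.l  d2,d2   ; d2 = -X: all ones on borrow, else 0
    and.l   d0,d2   ; d2 = difference on borrow, else 0
    eor.l   d2,d0   ; d0 = 0 on borrow, else difference
    add.l   d1,d0   ; d0 = maximum (unsigned)
    add.l   d2,d1   ; d1 = minimum (unsigned)
|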
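A rough way to check the sequence above without a 68k at hand is to model it step by step. The sketch below is only an illustration in Python, assuming the first instruction subtracts d1 from d0 and that the values are treated as unsigned 32-bit numbers (the mask comes from the borrow in X). The function name `minmax_branchless` and the constant `MASK` are made up for this example.

```python
MASK = 0xFFFFFFFF  # 32-bit register width (assumption: .l operands)

def minmax_branchless(d0, d1):
    """Model the six-instruction sequence on unsigned 32-bit values.

    Returns the register pair (d0, d1) after the sequence.
    """
    x = 1 if d0 < d1 else 0   # sub.l sets X (borrow) when d0 < d1 unsigned
    d0 = (d0 - d1) & MASK     # sub.l  d1,d0
    d2 = (-x) & MASK          # subx.l d2,d2 -> 0 or 0xFFFFFFFF
    d2 &= d0                  # and.l  d0,d2 -> difference on borrow, else 0
    d0 ^= d2                  # eor.l  d2,d0 -> 0 on borrow, else difference
    d0 = (d0 + d1) & MASK     # add.l  d1,d0 -> larger value
    d1 = (d1 + d2) & MASK     # add.l  d2,d1 -> smaller value
    return d0, d1
```

Under these assumptions `minmax_branchless(3, 7)` and `minmax_branchless(7, 3)` both leave the larger value in d0 and the smaller one in d1.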
24 May 2018, 13:34 | #544 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,757
|
Faster on 68020 and 68030:
Code:
    cmp.l   d0,d1
    bgt.s   .l1
    exg     d0,d1
.l1
|
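For comparison, the branch version can be modelled the same way. This is a sketch only, assuming standard 68k semantics where cmp.l d0,d1 evaluates d1 - d0 and bgt is a signed test. The helper names `to_signed` and `minmax_branch` are invented for the example. Read this way, the larger value ends up in d1 rather than d0, and the comparison is signed rather than unsigned.

```python
def to_signed(v):
    """Interpret a 32-bit register value as a signed integer."""
    v &= 0xFFFFFFFF
    return v - 0x100000000 if v & 0x80000000 else v

def minmax_branch(d0, d1):
    """Model cmp.l d0,d1 / bgt.s .l1 / exg d0,d1 on 32-bit registers."""
    # cmp.l d0,d1 sets the flags from d1 - d0;
    # bgt.s skips the exchange when d1 > d0 (signed).
    if not to_signed(d1) > to_signed(d0):
        d0, d1 = d1, d0       # exg d0,d1
    return d0, d1             # d0 = smaller, d1 = larger (signed)
```

So `minmax_branch(7, 3)` returns the pair with 3 in d0 and 7 in d1.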
24 May 2018, 14:05 | #545 |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
|
Shorter, but not faster ;-)
(my example always needs 12 cycles, yours 6/10 + jmp) |
24 May 2018, 14:05 | #546 | |
Registered User
Join Date: Aug 2006
Location: Scunthorpe/United Kingdom
Posts: 1,986
|
Quote:
Now for playback of such a track you could render the whole thing to a WAV file (we allow for this) but it takes minutes to do, and making adjustments involves going back to individual samples, so to make things flow a little easier, we keep the lot in memory where necessary. That means that there may be a slight delay when loading in samples that haven't been used yet, but that's fine for editing. Where it's absolutely not fine is in a live performance. In that situation we cannot tell ahead of time which samples will be needed and we certainly cannot allow any time at all to pull samples off a disk. We need the whole lot in memory. Then there's effect mixing, which if done on 44.1 kHz 16-bit sound samples quickly gathers aliasing errors, so to minimise that we use 192 kHz 32-bit float samples. It all adds up, I'm afraid, and HDDs (even SSDs) are not yet fast enough. |
|
24 May 2018, 14:18 | #547 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
They have more or less the same speed on 020/030. The linear example is always 12. The branch example is 10/14 depending on the case (2+8 if taken, 2+6+6 if not taken). Of course this can be just 2 cycles on a CPU doing instruction fusing. |
|
24 May 2018, 14:19 | #548 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,757
|
The cmp is 2 cycles, the exg is 4 cycles, the bgt.s is 4 cycles when it's not taken and 8 cycles when it is. This adds up to 10 cycles in both cases. Note that this is based on 68030 timings, so it may actually not be faster on 68020.
|
24 May 2018, 14:22 | #549 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
From memory, exg is 6 and the branch is 6 when not taken and 8 when taken.
|
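The two sets of timings quoted in posts #548 and #549 can be tallied side by side. The sketch below only adds up the numbers as posted (with cmp taken as 2 cycles in both cases); it does not verify them against the 68030 manual, and the dictionary keys are labels invented for this example.

```python
# Per-instruction cycle counts as quoted in the posts (unverified).
# Post #549 gives no cmp figure; 2 cycles is assumed, as in post #547.
CLAIMS = {
    "post_548": {"cmp": 2, "exg": 4, "bcc_not_taken": 4, "bcc_taken": 8},
    "post_549": {"cmp": 2, "exg": 6, "bcc_not_taken": 6, "bcc_taken": 8},
}

def branch_version_cycles(t):
    """Return (cycles if branch taken, cycles if branch not taken)."""
    taken = t["cmp"] + t["bcc_taken"]                     # exg is skipped
    not_taken = t["cmp"] + t["bcc_not_taken"] + t["exg"]  # exg executes
    return taken, not_taken

totals = {name: branch_version_cycles(t) for name, t in CLAIMS.items()}
```

With these inputs the first set gives 10 cycles on both paths and the second gives 10 or 14, which is exactly the disagreement in the thread. Both are to be compared with 12 cycles for the linear version.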
24 May 2018, 14:23 | #550 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
|
|
24 May 2018, 14:26 | #551 | |||
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
|
Quote:
The maximum a single person can handle is probably an arrangement similar to a big pipe organ in a church with all the panels and registers. So yes, you do know which limited set of samples is needed. You are not going to evaluate different microphone settings of your samples in a live performance - these are things you choose upfront. Quote:
But again: you only need to do that within your calculation, but there is no need to store the instruments in this "quality" since it is only intermediate redundant information. Quote:
First you blow up your data by a factor of >8 without adding information and then you complain about the transfer speed... Last edited by Gorf; 24 May 2018 at 15:14. |
|||
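The "factor of >8" follows from the two sample formats mentioned in the thread. As a quick check, assuming uncompressed PCM and the same channel count for both formats (the function name `bytes_per_second` is made up for this example):

```python
def bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Raw data rate of uncompressed PCM audio."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

cd_rate = bytes_per_second(44_100, 16)    # 44.1 kHz, 16-bit
hi_rate = bytes_per_second(192_000, 32)   # 192 kHz, 32-bit float
ratio = hi_rate / cd_rate                 # roughly 8.7
```

That is 88,200 versus 768,000 bytes per second per channel, so storing samples in the mixing format costs a little under nine times the memory and transfer bandwidth.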
24 May 2018, 14:27 | #552 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,757
|
|
24 May 2018, 14:41 | #553 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
Anyway, instruction timings depend heavily on the implementation, so we'd better favor small code - simply because it's small everywhere. |
|
24 May 2018, 14:54 | #554 | |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
|
Quote:
https://www.nxp.com/docs/en/referenc.../MC68030UM.pdf (page 11-48) |
|
24 May 2018, 15:04 | #555 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
One sure thing is that the code is 6 bytes
|
24 May 2018, 15:09 | #556 |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
|
ok - now we got the 68k case more than covered.
Next ISA please ;-) |
24 May 2018, 15:41 | #557 | |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,757
|
Quote:
|
|
24 May 2018, 16:05 | #558 |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
|
|
24 May 2018, 16:17 | #559 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
|
24 May 2018, 17:13 | #560 | |
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
|
Quote:
You mentioned instruction fusing... and maybe your instruction set would be a good intermediate representation: a sophisticated decoder/translator in FPGA would find that both code snippets do the same thing in the end and can be represented by a single (intermediate) instruction.
In the first step, the FPGA would take every instruction and identify its group. It can do that for many instructions in parallel (parallelism).
In the second step, it compares every instruction with the one that follows, if it belongs to the right group and such a comparison makes sense. Meanwhile the next group of instructions passes through step one (pipelining).
In the third step, matching pairs of instructions are fused; there can be more than one fusing step. (Meanwhile another group of instructions enters step one, and the former step-one instructions move on to comparison in step two...)
Now we would end up with an architecture-independent and very short intermediate representation of the code. By traversing a LUT or a tree, each intermediate instruction would either be translated into host-CPU code or sent to some special SIMD unit in the FPGA.
There could be more than one of these decoders/translators, allowing for some kind of "speculative translation" of branches. |
|
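The classify / compare / fuse idea can be mocked up in software to see how repeated pairwise fusing would collapse the branch version into one intermediate instruction. This is only a toy sketch: the group names, the rule table and the function names are all invented here, it ignores labels and branch targets (a real fuser would have to check that the branch really skips only the exchange), and it does not attempt to recognise the branchless sequence.

```python
# Hypothetical instruction groups and fusion rules for illustration only.
GROUPS = {"cmp.l": "compare", "bgt.s": "branch", "exg": "exchange"}

# (group of first item, group of second item) -> group of the fused item
FUSION_RULES = {
    ("compare", "branch"): "compare_branch",
    ("compare_branch", "exchange"): "minmax",
}

def classify(instructions):
    """Step 1: tag every instruction with its group (independent per item)."""
    return [(GROUPS.get(ins.split()[0], "other"), [ins]) for ins in instructions]

def fuse_pass(tagged):
    """Steps 2 and 3: compare each item with its successor, fuse matches."""
    out, i = [], 0
    while i < len(tagged):
        if i + 1 < len(tagged):
            fused = FUSION_RULES.get((tagged[i][0], tagged[i + 1][0]))
            if fused:
                out.append((fused, tagged[i][1] + tagged[i + 1][1]))
                i += 2
                continue
        out.append(tagged[i])
        i += 1
    return out

def translate(instructions):
    """Repeat fuse passes until nothing changes; return the group sequence."""
    tagged = classify(instructions)
    while True:
        fused = fuse_pass(tagged)
        if len(fused) == len(tagged):
            return [group for group, _ in fused]
        tagged = fused
```

Feeding it `["cmp.l d0,d1", "bgt.s .l1", "exg d0,d1"]` takes two fuse passes, first compare plus branch and then that pair plus the exchange, and ends with a single "minmax" item. In hardware each pass would be its own pipeline stage rather than a loop.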