23 February 2021, 13:15 | #1081
Registered User
Join Date: Mar 2016
Location: Ozherele
Posts: 229
This table shows that the 80386 is slightly faster than the 68030 at the same frequency. This proves my point on this matter.
It seems that you are fighting your own delusions. I have only written that the 6502 utilizes clock cycles more effectively. It is common knowledge. Do you still miss it? BTW I even know that the Z80's code density is better than the 6502's.
The 68020 was the beginning of the end of the 68k. The 68020 was good for a PC but too expensive until 1991, and it was too slow for workstations, which migrated to RISC architectures.
A return to real mode is not a necessity in a theoretically correct architecture. But DOS was a reality, and Intel actively supported it since the 80386. They, unlike Moto, were more realistic and didn't push people like Moto did. MOVE from SR and the BE byte order are classic examples of such pushing.
My quote has the number 10240. Indeed, we can easily make a 10K stack on the 8080 or Z80, but it consumes 1/6 of our total address space and doesn't guarantee safe recursion.
23 February 2021, 13:35 | #1082
Registered User
Join Date: Mar 2016
Location: Ozherele
Posts: 229
And you know, the 68k is not orthogonal. Even its MOVE is not completely orthogonal: I want to have MOVE offset1(PC),offset2(PC). The 68k is not a VAX or PDP-11. And even the VAX and PDP-11 are not 100% orthogonal. The best architectures (IBM mainframes, RISC, x86) just skipped all this orthogonality crap. It has no practical usefulness, it is just poetry around real IT.
What do you mean by better code? Faster? Can you give any link where code for the 68k is proven to be better than code for the x86?
Yes, the 80286 doesn't have a barrel shifter, but the x86 has more flexible byte ops: MOV, XCHG, XLAT, ... So this 68020 advantage is rather illusory; in many cases the 80286 is just faster. Let's check the numbers (clock cycles):
68020: LSR/ASR #1,Dn - 6, LSR/ASR #2,Dn - 6, LSR/ASR #7,Dn - 6, LSR #8,Dn.w - 6.
80286: SAR/SHR reg,1 - 2, SAR/SHR reg,2 - 7, SAR/SHR reg,7 - 12, XOR r8l,r8l and XCHG r8l,r8h - 5.
68020: ASL/ROL/ROR #1,Dn - 8, ASL/ROL/ROR #2,Dn - 8, ASL/ROL/ROR #7,Dn - 8, ROL/ROR #8,Dn.w - 8.
80286: SAL/SHL reg,1 - 2, SAL/SHL reg,2 - 7, SAL/SHL reg,7 - 12, XCHG r8l,r8h - 3.
68020: ROXL/ROXR #1,Dn - 12, ROXL/ROXR #2,Dn - 12, ROXL/ROXR #7,Dn - 12.
80286: RCL/RCR reg,1 - 2, RCL/RCR reg,2 - 7, RCL/RCR reg,7 - 12.
So it is quite clear that for the most common case (a shift by 1 bit) the 80286 is much faster, though in some less common cases the 68020 can be a bit faster. However, the x86 can do byte and word (double word since the 80386) shifts on memory, while the 68k can shift only words in memory. For word shifts the 80286 needs 5+n clocks, while the 68020 needs 6/8/12+EA, and EA is at least 4. So again the 68020 turns out slower for the more common cases and a bit faster for the less common ones. And the 68k is less flexible for memory shift ops.
BTW What strange timings the 68020 has! LSR is faster than ASL, ASL is faster than ROXL - just a few more little oddities for the 68k's collection of oddities.
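The register-shift figures listed in this post can be tabulated to see where the two columns cross. A minimal sketch in Python, assuming only the numbers as the poster quotes them (the replies dispute the 68020 figures, so these are the poster's claims, not reference timings). The group labels and function names are hypothetical.

```python
# Cycle counts exactly as quoted in the post; they are the poster's claims
# (disputed in later replies), not values taken from a manual.
M68020_QUOTED = {
    "right shifts": 6,
    "left shifts and plain rotates": 8,
    "rotates through carry/extend": 12,
}


def i80286_quoted(count: int) -> int:
    """Register shift cost as quoted: 2 for a 1-bit shift, else 5 + count."""
    return 2 if count == 1 else 5 + count


def faster_counts(m68020_cycles: int, max_count: int = 8) -> list[int]:
    """Shift counts (1..max_count) where the quoted 80286 figure is lower."""
    return [n for n in range(1, max_count + 1)
            if i80286_quoted(n) < m68020_cycles]


if __name__ == "__main__":
    for group, cycles in M68020_QUOTED.items():
        print(f"{group}: 80286 quoted as faster for counts {faster_counts(cycles)}")
```

With the quoted numbers, the 80286 column wins only for 1-bit shifts in the first group, and for progressively more counts in the other two groups.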
Last edited by litwr; 24 February 2021 at 08:29. Reason: corrections in shift ops data
23 February 2021, 13:41 | #1083
Registered User
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Recently I made several Basic programs for a benchmark race - https://gitlab.com/retroabandon/basc.../benchmarks.md - and a man from the Atari8 world intervened because he wanted better results for his platform. He used a different Basic, a different algorithm, ... and made the result about 5 times faster. I can only suspect that the Amstrad CPC or Commodore people could make the results for their machines much better too, but they just missed this race. So your point in this case is a real oddity.
I've just compiled the Xlife v7 sources with -O3 and -Os and got 537 KB and 341 KB respectively. It is not a large program, only about 16,000 LOC. The code from stdlib++, Xlib, etc. is quite a large part common to both binaries. So for the program's own code the difference is at least 2:1.
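The 2:1 estimate for the -O3 and -Os builds depends on how much of each binary is shared library code. A minimal sketch of that arithmetic in Python, assuming the shared part has the same size in both builds (the post gives no figure for it, so `shared_kb` and both function names are hypothetical):

```python
def own_code_ratio(o3_kb: float, os_kb: float, shared_kb: float) -> float:
    """Ratio of program-specific code size, -O3 build over -Os build."""
    return (o3_kb - shared_kb) / (os_kb - shared_kb)


def shared_needed_for_ratio(o3_kb: float, os_kb: float, ratio: float) -> float:
    """Shared size at which the program-specific ratio equals `ratio`.

    Solves (o3 - s) / (os - s) = ratio for s; requires ratio != 1.
    """
    return (ratio * os_kb - o3_kb) / (ratio - 1)


if __name__ == "__main__":
    print(round(537 / 341, 2))                    # whole-binary ratio
    print(shared_needed_for_ratio(537, 341, 2))   # shared KB needed for 2:1
```

The whole-binary ratio is about 1.57:1, and the 2:1 figure for the program's own code holds if roughly 145 KB or more of each binary is common code.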
23 February 2021, 15:23 | #1084
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
To quote yourself talking to Thomas: "You are a programmer and you know that every particular case has the same importance as the general case in programming."
Now if you really want an example where flag manipulation is essential, try emulating another CPU family for a start.
RISC is designed to ease CPU design and implementation, with no care for the programming model.
I prefer paying for something that will be useful 20 years from now, rather than paying for something that was useful 20-30 years ago and is now crap.
At least if the name differs, we can choose which variant we encode.
The sandbox cannot just set the value to 0. If saving to the stack normally reads $2300 from SR, in the sandbox it will read $0300. Even if it sets that back to 0, it is the wrong value - and it has no way to tell which one is right.
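To make the $2300 versus $0300 difference concrete, here is a minimal Python sketch that decodes the two values using the standard 68000 status register layout (trace bit 15, supervisor bit 13, interrupt mask in bits 10-8, condition codes in the low five bits). The function name `decode_sr` is hypothetical, and the sketch ignores the extra bits the 68020 adds.

```python
def decode_sr(sr: int) -> dict:
    """Split a 68000 status register value into its main fields."""
    return {
        "trace": bool(sr & 0x8000),        # T bit (bit 15)
        "supervisor": bool(sr & 0x2000),   # S bit (bit 13)
        "int_mask": (sr >> 8) & 0x7,       # I2-I0 (bits 10-8)
        "ccr": sr & 0x1F,                  # X, N, Z, V, C (bits 4-0)
    }


if __name__ == "__main__":
    for value in (0x2300, 0x0300):
        print(f"${value:04X}: {decode_sr(value)}")
```

The two values differ only in the supervisor bit, which is the difference that code running in the sandbox would observe.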
Something like add al,mem. ARM v6 and v7 aren't the same as AArch64.
(Note that, actually, the FPGA 68080 present in Vampire accelerators CAN do 64-bit accesses like that.)
I remember having made a Basic vs Basic test long ago. It was a simple prime number factorization. Atari ST (GFA Basic): 11.6 secs, vs PC 386 DX40 (QBasic): 9.2 secs. Same algorithm, very few changes in the code and yes, a 40 MHz 386 DX was only slightly faster than an 8 MHz 68000.
But maybe you consider that a MacBook of today is a PC? Interpret it as you wish. Why would I care. In French we say "j'ai déjà donné" (roughly, "I've already done my share").
And you still have to disassemble that code. Alternatively you could just compress the executables, to take unrolled loops out of the equation...
23 February 2021, 18:14 | #1085
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,215
Ah, did you check the right instruction this time? I mean the 32/16 division?
Please, think arguments through to their very end. In fact, all the PPC did to switch between endiannesses was to fiddle with the lower address bits.
You can never guarantee "safe recursion" on a finite stack space, but there is quite some difference between a 256-byte stack and a (potential) 64K stack. While 128 recursions on the 6502 sounds like a lot, it implies that you cannot really use this itsy-bitsy stack for parameter passing.
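The stack figures in this exchange are easy to check. A minimal sketch in Python, assuming a 2-byte return address per call and a hypothetical helper name `max_depth`; the 6-byte parameter size is purely illustrative and does not come from the thread:

```python
def max_depth(stack_bytes: int, return_addr_bytes: int = 2,
              param_bytes: int = 0) -> int:
    """Deepest call nesting that fits when every call pushes one frame."""
    return stack_bytes // (return_addr_bytes + param_bytes)


if __name__ == "__main__":
    print(max_depth(256))                    # 6502 page-one stack, no parameters
    print(max_depth(256, param_bytes=6))     # same stack with 6 bytes of arguments
    print(max_depth(10240, param_bytes=6))   # the 10240-byte stack from the thread
    print(10240 / 65536)                     # share of a 64K address space
```

A 256-byte stack gives the 128 nested calls mentioned above only when nothing but return addresses is pushed, and 10240 bytes is 15.625% of 64K, a little under the 1/6 quoted earlier in the thread.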
23 February 2021, 18:15 | #1086
Registered User
Join Date: Jun 2016
Location: europe
Posts: 1,039
---- All timings are for best case and do not take into account wait states, instruction alignment, the state of the prefetch queue, DMA refresh cycles, cache hits/misses or exception processing. ----
LSR #1,Dn - 6 cycles? Did you find that in a 68020 manual written by *Intel*?
Most common case? That's your assumption. Here's one of mine: in many cases I don't even have to shift because index scaling is *free*.
Mem shift 6/8/12+EA and EA is at least 4? Where did you see those numbers? They're incorrect. EA at least 4? Incorrect.
Sure, it's less flexible for mem ops. Because it has lots of registers, duh. I could count on my fingers how many times in 30+ years I've used a mem shift on M68K.
And what if... I'm using 32-bit data? Pretty much multiply all 286 timings by 2. And btw, I don't care about the 386. You started with 286/020, so that's that.
Byte swapping and access to the upper byte is neat (mainly from a 000/010 perspective), I won't dispute that. Although greatly diminished for a long while now.
23 February 2021, 18:47 | #1087
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,215
Code:
if (a=b) { }
It only means that you cannot dispatch a move ahead of a branch as a superscalar pair.
I remember optimizing some code for the x86 - the biggest improvement came from removing divisions and replacing them with suitable multiplications and shifts. If you want high-speed algorithms, avoid divisions. Also on x86. FYI, the algorithm was a quantizer.
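As an illustration of the division-to-multiplication trick mentioned in this post, here is a minimal Python sketch of dividing unsigned 16-bit values by a constant with one multiply and one shift. It is a generic reciprocal-multiplication example, not the quantizer code from the post, and the names `magic_for` and `div_by_mul` are hypothetical.

```python
def magic_for(divisor: int, bits: int = 16) -> tuple[int, int]:
    """Return (multiplier, shift) so that (x * multiplier) >> shift == x // divisor
    for every unsigned x below 2**bits."""
    if divisor < 1:
        raise ValueError("divisor must be positive")
    shift = bits + (divisor - 1).bit_length()      # bits + ceil(log2(divisor))
    multiplier = -(-(1 << shift) // divisor)       # ceil(2**shift / divisor)
    return multiplier, shift


def div_by_mul(x: int, multiplier: int, shift: int) -> int:
    """Divide by the precomputed constant using a multiply and a shift."""
    return (x * multiplier) >> shift


if __name__ == "__main__":
    m, s = magic_for(10)
    # Exhaustive check over the whole 16-bit range.
    assert all(div_by_mul(x, m, s) == x // 10 for x in range(1 << 16))
    print(m, s)
```

In Python the intermediate product cannot overflow. In C or assembly the multiplier can need one bit more than the operand width, so a real implementation may need an extra add-and-shift step for some divisors.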
23 February 2021, 19:19 | #1088
Global Moderator
Join Date: Nov 2001
Location: Derby, UK
Age: 48
Posts: 9,355
Considerations taken on board.. Look, there are like 3 or 4 members interested in this thread and the last. One has now pulled out... Anyone else getting involved is either very brave, patient or a bit stupid... Either way... It's tiresome reading your considerations. It's not about obeying, it's about keeping the shit storm in one thread... I could just lock the thread if you prefer?
23 February 2021, 20:06 | #1089
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
One sure thing is that this thread leads nowhere.
However, our friend litwr here is a strange guy. He writes a lot of drivel all day long, but hardly ever resorts to real name calling. This is something I've never seen before. I really wonder what motivates him.
23 February 2021, 22:49 | #1090
Registered User
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
It's pointless comparing specs or release dates without taking affordability into account. Apart from rich Americans, no home computer user in 1984 considered the PC a viable option.
I had the CPC664, which came out soon after - the first computer I owned with a built-in floppy drive. Very soon after that Amstrad released the CPC6128, which had 128k RAM for almost the same price. Many 664 users got upset about it, but I didn't. I unsoldered the 64k DRAM chips from the motherboard and replaced them with 256k chips, and made a bank switching board compatible with the 6128 that could also map any RAM bank to the screen. I upgraded about a dozen 664s to 256k for users here in New Zealand (wonder what happened to them?). Sadly the 664's keyboard went bad, so I stupidly threw it away a few years ago (not knowing that replacement membranes were available!). However, I kept a mint condition CPC6128 that somebody gave me. Some day I hope to upgrade it with a ridiculous amount of RAM so I can run SymbOS on it.
The CPC464 was released in NZ in 1985. Below is an advert from the 'Bits and Bytes' magazine issue that reviewed it (Manukau computers was a shop in Auckland run by a friend of mine, where I purchased my Amiga 1000 a few years later). On launch the CPC464 with color monitor and floppy drive cost NZ$2190. In the same issue there is an advert for a '100% IBM compatible' Sperry Model 20 with 128k RAM, 2 disk drives and mono screen for NZ$8100 ("other configurations up to $15,560").
And finally here is an advert from the Sept 1985 issue of 'Bits and Bytes' for the CPC664 - NZ$1895 with RGB color monitor or $1495 with green screen (which unlike the PC was still 'color', just displaying the image in shades of green). I bought the green screen model because I wanted a sharper screen for programming, and because it was cheaper!
08 March 2021, 08:57 | #1091
Registered User
Join Date: Mar 2016
Location: Ozherele
Posts: 229
I don't know what is going on here. It seems that some people just want to degrade this thread and don't want to discuss their reasons. I am just a mere participant. Sorry, in this situation I can only reply to a few messages.
I have checked our discussion around the pi-spigot and found that my implementation is just much faster. So I couldn't use any code from this thread in my project. Moreover, my code specially optimized for size was the smallest... So I still can't figure out what you mean. BTW I can again express my gratitude to you, because you helped me to optimize code for the 68000 multiplication. I used your advice in the PDP-11 (noEIS) code. It happened several years before this thread started.
It is strange that Alan Sugar stopped updating the CPC line after the 6128. IMHO he could have used a Z80 @6 or even @8 MHz in 1986 or 1987. For the Apple II, Z80 @8 MHz cards were available...
08 March 2021, 11:16 | #1092
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
08 March 2021, 16:03 | #1093
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,215
I'm not sure that "68K is better than x86" is "reality". It is an opinion one can share or not. There are certainly merits in the x86 architecture, as in "it does still exist", and "there is a 64-bit version of it", and "it is quite powerful". But giving arguments in favour of the overall architectural design of the x86 seems really bewildering to me. The CPU design looks like several layers of chewing gum and duct tape wrapped around an outdated 8-bit core. While I understand why Intel did that - namely to stay in control of the market - I still appreciate the more orthogonal design of the 68K. Or of any other processor I see on the market today. It's really hard to make a design as unorthogonal as x86.
08 March 2021, 16:24 | #1094
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
Here, on the other hand...
Yet if I had to make that choice, I'd take x86 over MIPS or Alpha for doing asm on, without hesitation.
08 March 2021, 17:06 | #1095
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,215
Ten years ago we still had an in-house specialist who gave some heavy-duty algorithms "the final touch" by implementing them in hand-tuned assembler. We don't do that anymore. It makes no sense. We use compiler intrinsics, and we reach the same if not better performance by letting the compiler generate the code. The compiler knows better which instruction takes how long, how to unroll loops and where to inline. Where it needs help is in getting the architecture of the code right, and in vectorization (there are no good auto-vectorizing compilers at the moment, except for trivial cases; this is still better done by hand). What you need to do is compile - look at the code - tune the source - measure the speed - repeat.
08 March 2021, 17:30 | #1096
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
So the programmer will always have an edge. Hand-tuned assembler isn't just "playing compiler". It's not about converting the code, it's about converting the algorithm. And that's something a compiler cannot do. Say what you want, but no compiler will ever beat me. Compilers being better than asm programmers is a myth. The reason why asm isn't written anymore today is something else - it's simply because all currently available CPUs are a pita in that respect but are fast enough that it's not worth the effort.
08 March 2021, 17:37 | #1097
Registered User
Join Date: Jun 2020
Location: Brno
Posts: 90
Programming 68k in assembler is still popular and fun in 2021 (Amiga, ST, Megadrive). On the other hand, x86 assembler is used by a few PC intro coders only.
And that is, my friends, a testament to a good design :-)
09 March 2021, 11:36 | #1098
Registered User
Join Date: Jun 2015
Location: Germany
Posts: 1,918
09 March 2021, 11:46 | #1099
Registered User
Join Date: Dec 2019
Location: Ur, Atlantis
Posts: 1,899
Absurdisms like this one are why I love these threads
09 March 2021, 12:12 | #1100
Registered User
Join Date: Jun 2015
Location: Germany
Posts: 1,918
It often gets mentioned that Intel put RISC cores inside their CISC processors and thus could keep up with RISC's clock frequency increases (which was the original aim of RISC). The reality today is that RISCs have CISC ALUs to increase the number of instructions executed. I wouldn't be surprised if the CPUs were better at bundling super-instructions from typical compiler-generated code (that's what the CPUs are designed for) than from hand-written code optimised for low instruction count.