14 July 2017, 17:10 | #1 |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
Occasional bus error during PCI transfers
Hi all,
I'm currently working on a project where a PPC CPU on the PCI bus (Mediator) reads and writes to config registers of a graphics card on the same PCI bus. If I issue a lot of commands through the CPU to the config area of the graphics board, the graphics card locks up and after a while a bus error pops up. I was thinking, could there be a conflict between the PPC CPU and the 68K CPU both trying to drive the PCI bus, resulting in a lock-up? I know the Mediator has a bus master jumper. Can this help? The code already checks whether the gfx card is idle before issuing the next commands, so it shouldn't be an overflow of commands. If I add delays between the commands issued the system becomes stable (but slower...). Any experts who can point me in the right direction? |
14 July 2017, 23:31 | #2 |
Registered User
Join Date: Jun 2009
Location: Dublin, then Glasgow
Posts: 6,334
|
The Mediator uses its own little DMA system with the graphics card memory, so some transfers are probably cached and queued to be transferred to the Amiga side. Perhaps you're running foul of issues with that cache between the two CPUs? Just speculation though, I've never coded that sort of hardware combo...
|
15 July 2017, 00:25 | #3 |
Registered User
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
|
Hi,
What values do the Base Address Registers (BARs) of the GPU report? The last nibble denotes whether the device supports 32/64 bit, I/O or memory space and, importantly, whether the region is prefetchable (cacheable). Try an eieio instruction or msync(0), depending on the CPU. What PPC chip and northbridge are you using, or is it a custom chip? The PCI arbiter should stop the PPC and 68K from both driving the bus. When the error occurs, what status is returned by the various PCI devices? Sorry for so many questions, but I don't know enough about your system to go into specifics. |
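As an illustration of the BAR low-nibble layout mentioned above, here is a minimal sketch that decodes a raw 32-bit BAR value following the standard PCI encoding (bit 0 selects I/O or memory space, bits 2:1 give the memory type, bit 3 is the prefetchable flag). The function name `decode_bar` and the sample values are made up for the example; they are not taken from any Amiga PCI library or from the system in this thread.

```python
def decode_bar(bar: int) -> dict:
    """Decode the low bits of a raw 32-bit PCI Base Address Register value."""
    if bar & 0x1:
        # Bit 0 set: I/O space BAR, bits 31..2 hold the base address.
        return {"space": "io", "base": bar & 0xFFFFFFFC}

    # Bit 0 clear: memory space BAR.
    type_names = {
        0: "32-bit",
        1: "below 1 MB (legacy)",
        2: "64-bit",
        3: "reserved",
    }
    return {
        "space": "memory",
        "width": type_names[(bar >> 1) & 0x3],  # bits 2:1
        "prefetchable": bool(bar & 0x8),        # bit 3
        "base": bar & 0xFFFFFFF0,               # bits 31..4
    }


if __name__ == "__main__":
    # Hypothetical raw values, for illustration only.
    for raw in (0x40000008, 0x48060000, 0x00001001):
        print(f"{raw:08X}: {decode_bar(raw)}")
```

A region whose BAR does not have the prefetchable bit set (typically a register window) is the kind of range that should be mapped cache inhibited on the CPU side.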
15 July 2017, 05:48 | #4 |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
PPC is MPC750. Northbridge is MPC107.
When the error occurs the whole Amiga halts in such a bad way that no status can be retrieved. I need to reset the machine, and even the reset takes a while to get through. Not sure if eieio is fully supported on the MPC750, btw. I think it is treated as a nop in most cases.
For now, most of the errors are gone as, you guessed it, I made an error in setting up the memory in which the config registers of the GPU live. I set it up to be cache inhibited using a page table, which is correct. But I also set up a BAT which was Write_Through and overlapped the page table setup, and the BAT is checked before the page table. You always find that kind of stuff out moments after you post a question... So the idle check was actually done on a value in cache. The idle check works correctly now, I have gotten rid of the delays, and the test programs now seem to work.
So I can now draw triangles and stuff and texture them. I actually use the Warp3D examples of Kas1e from the os4coding forum, but compiled for WarpOS. The more complex programs like gearsppc still don't work (bus error). I have to dig deeper for that; in this case it could also be an error in the driver itself. |
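The BAT/page-table mistake described above can be modelled in a few lines. This is only a sketch of the lookup priority (a matching BAT wins over the page table), not the actual WarpOS or driver setup code; the `Region` type, the helper names and the address ranges are assumptions chosen for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    start: int  # first address covered
    end: int    # one past the last address covered
    attr: str   # cache attribute, e.g. "write-through"


def effective_attr(addr: int, bats: List[Region], pages: List[Region]) -> Optional[str]:
    """Return the cache attribute that applies to addr; a BAT match takes priority."""
    for region in bats:
        if region.start <= addr < region.end:
            return region.attr
    for region in pages:
        if region.start <= addr < region.end:
            return region.attr
    return None  # address is not mapped in this model


def shadowed_pages(bats: List[Region], pages: List[Region]) -> List[Tuple[Region, Region]]:
    """List (page, bat) pairs where a BAT overlaps a page mapping with a different attribute."""
    return [
        (page, bat)
        for page in pages
        for bat in bats
        if page.start < bat.end and bat.start < page.end and page.attr != bat.attr
    ]


if __name__ == "__main__":
    # Hypothetical layout: a large write-through BAT that also covers the register window.
    bats = [Region(0x40000000, 0x50000000, "write-through")]
    pages = [Region(0x48060000, 0x48070000, "cache-inhibited")]
    print(effective_attr(0x48060000, bats, pages))
    for page, bat in shadowed_pages(bats, pages):
        print(f"page {page.start:08X}-{page.end:08X} is shadowed by BAT {bat.start:08X}-{bat.end:08X}")
```

Running a check like `shadowed_pages` over the intended memory map before programming the MMU would flag a register window that ends up cached despite its page-table attributes.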
16 July 2017, 00:08 | #5 |
Registered User
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
|
Hi,
@Hedeon Had not realised you were the guy working on the Sonnet 7200 software. Have you got the errata document for the TSI107? One thing that caught me out on one design, using the MPC8245 (PPC603 + TSI107 in one chip), was issues with DMAs and back-to-back transfers. Try clearing bit 9 (Fast back-to-back) of the command register of the target device, in your case the GPU. Also try setting bit 0 of the AMBOR register at offset 0xE0; this removes an issue with speculative reads of local memory. The MPC8245, which has identical registers to the TSI107, has an errata document freely available here: http://www.nxp.com/docs/en/errata/MPC8245CE.pdf Good luck. |
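Clearing bit 9 of the command register is a plain read-modify-write. The sketch below shows the bit manipulation only; `read_config16` and `write_config16` are hypothetical caller-supplied accessors (assumed to handle any byte swapping and to work in host byte order), not functions from a real Amiga PCI library. The command register offset (0x04) and the bit position are the standard PCI ones.

```python
PCI_COMMAND_OFFSET = 0x04      # standard offset of the 16-bit Command register
PCI_COMMAND_FAST_B2B = 1 << 9  # Fast Back-to-Back Enable


def disable_fast_back_to_back(read_config16, write_config16):
    """Clear bit 9 of the Command register via the supplied accessors.

    Returns (old_value, new_value) so the caller can log what changed.
    """
    old = read_config16(PCI_COMMAND_OFFSET)
    new = old & ~PCI_COMMAND_FAST_B2B & 0xFFFF
    if new != old:
        write_config16(PCI_COMMAND_OFFSET, new)
    return old, new


if __name__ == "__main__":
    # Fake config space standing in for a real device, for illustration only.
    fake_config = {PCI_COMMAND_OFFSET: 0x0207}
    old, new = disable_fast_back_to_back(fake_config.__getitem__, fake_config.__setitem__)
    print(f"command: {old:04X} -> {new:04X}")
```

Passing the accessors in keeps the bit logic testable without hardware; on the real system they would wrap whatever config-space access the PCI driver provides.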
01 July 2020, 11:47 | #6 |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
In the end I worked around it. It was bad code haha.
But over the years... I have seen it pop up again occasionally. Is there also such a thing as a bus time-out? I read somewhere that PCI solutions for the Amiga give a bus time-out (in the shape of a bus error) when addressing slow stuff on the bus (e.g. a ROM on, let's say, a Voodoo or Radeon). I think the readme of the new FireStorm firmware states something along those lines. |
27 January 2021, 17:43 | #7 |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
@Stedy
I want to revisit this. Are you available? :-) In this case it is the Prometheus/FireStorm being the culprit, in combination with all the different north bridges. |
28 January 2021, 02:49 | #8 | |
Registered User
Join Date: Dec 2015
Location: USA
Posts: 2,902
|
https://forum.amiga.org/index.php?topic=33092.15
http://www.e3b.de/prometheus/prometheus_V05.txt Quote:
|
|
03 February 2021, 13:13 | #9 | |
Registered User
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
|
Quote:
I'll try my best. Do you have something equivalent to "lspci" on the Amiga? This is useful as it prettifies and prints the PCI config registers of every device and is a good starting point. |
|
03 February 2021, 15:54 | #10 |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
Not really. But are there fields you are especially interested in? Prmscan does not show all, but maybe I can add.
|
03 February 2021, 22:18 | #11 |
Registered User
Join Date: Dec 2015
Location: USA
Posts: 2,902
|
OpenPCIInfo dumps a lot of the config space too.
|
03 February 2021, 22:54 | #12 |
Registered User
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
|
Hi,
Interested in the PCI Command and Status registers, latency timer, cache line, interrupt line/pin, MIN grant, MAX LAT and the base address registers. It's also useful to know what memory regions are cacheable, from either CPU. When the PowerPC system hangs, can you still perform PCI transactions from the 68K processor/Zorro bus? |
04 February 2021, 01:57 | #13 | |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
Quote:
The whole range is cache inhibited regarding the 68K. PrmScan: Code:
Prmscan 1.6 by Grzegorz Kraszewski.
PCI cards listing:
-------------------------------------------------
Board in slot 0, function 0
Vendor: Realtek Audio/Lan?Maker
Device: RTL8028 PCI Full-Duplex Ethernet Controller with PnP Function
Revision: 0.
Device class 02, subclass 00.
Address range: 5FE01100 - 5FE0111F (32 B).
Board driver: prm-rtl8029.device.
-------------------------------------------------
Board in slot 1, function 0
Vendor: ATI Technologies Inc. / Advanced Micro Devices, Inc.
Device: unknown ($5960)
Revision: 1.
Device class 03, subclass 00.
Address range: 40000000 - 47FFFFFF (128 MB).
Address range: 5FE01000 - 5FE010FF (256 B).
Address range: 48060000 - 4806FFFF (64 kB).
128 kB of ROM at 48040000 - 4805FFFF.
Board driver: NONE.
-------------------------------------------------
Board in slot 2, function 0
Vendor: Motorola
Device: unknown ($480B)
Revision: 2.
Device class 06, subclass 00.
Address range: 48070000 - 48070FFF (4 kB).
Address range: 48071000 - 48071FFF (4 kB).
Address range: 50000000 - 57FFFFFF (128 MB).
Address range: 48000000 - 4803FFFF (256 kB).
Board driver: NONE.
-------------------------------------------------
The frame buffer (in this case 0x40000000-0x48000000) is Write-Through.
The VGA config (in this case 0x48060000-0x48070000) is cache inhibited/guarded.
The PPC memory (in this case 0x50000000-0x58000000) is mostly copy-back.
The PPC configs (in this case 0x48070000-0x48072000) are cache inhibited/guarded.
The rest is less important, I'd think.
I can still access the PPC card memory and config ranges from the 68K debugger when the PPC hang happens. Access to the frame buffer or VGA config registers by the 68K debugger results in a bus error. I'll look up the other values of the cards soon. The only thing OpenPCIscan shows in addition is some status/command stuff, but not the rest. I do expect a time-out of some kind (see grelblarlk reference).
Last edited by Hedeon; 04 February 2021 at 02:02. |
|
05 February 2021, 12:57 | #14 |
Registered User
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
|
Hi,
From what you describe, the PowerPC processor has had a critical fault; on some PPC devices this is a machine check exception. I have seen this in my day job: the CPU core would hang up, but another processor could still access RAM on the 'dead' card. All was restored on a reboot. What processor and north bridge are you using? |
08 March 2021, 20:52 | #15 |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
Processor is MPC7410. Northbridge is the 1057, 480b (PCI id).
It's in LE. Originally, the Latency Timer was $00 by default. $80 gives fewer errors. This is with the Prometheus.
Fields, one row per config dword:
VendorID, DeviceID, Command, Status
RevID, ProgIF, Subclass, Classcode
CacheLineSize, Latency Timer, Header Type, BIST
Interrupt Line, Interrupt Pin, Min Grant, Max Latency
gfx card:
$0210, $6059, $0702, $9002
$01, $00, $00, $03
$00, $80, $80, $00
$FF, $01, $08, $00
G4 card:
$5710, $0B48, $0700, $A0A2
$02, $00, $00, $06
$00, $80, $00, $00
$00, $01, $00, $00
What I have found so far is that the gfx card has crashed, effectively removing it from PCI space. If the 68K then tries to access it, it gives a bus error. If the PPC tries to access it, it just halts. The gfx card crashes because its command FIFO has overflowed. That happened because the command processor stopped processing commands, and that often happens after receiving an invalid command packet. So why is it getting wrong info, when with the Mediator this does not happen and it is the same code? Maybe something gets corrupted when the PPC is pushed off being bus master while doing a transfer. |
16 March 2021, 01:05 | #16 |
Registered User
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
|
Hi,
Assuming I've byte reversed correctly, the status registers indicate that the graphics card had a parity error on a transfer and the processor detected it and set a master abort. Looking at the command register, you have fast back-to-back transfers enabled on the graphics card but the CPU cannot support this. It would be worth disabling this on the graphics card. Do you have any exception handlers for the PCI bus, or do you use the machine check as a catch-all handler? I have seen PCI errors cause a machine check on E300/PPC603 cores in the past. I can go into more detail; I guess I should look at the SonnetPCI libraries first? |
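The byte reversal and bit decoding described above can be reproduced with a short script. This is a sketch under two assumptions: that the dump in post #15 lists each 16-bit register byte-swapped, in the column order given there, and that the bit names follow the standard PCI Command and Status register layout (only a subset of bits is listed). The helper names are made up for the example.

```python
COMMAND_BITS = {
    0: "I/O space",
    1: "memory space",
    2: "bus master",
    3: "special cycles",
    4: "memory write and invalidate",
    5: "VGA palette snoop",
    6: "parity error response",
    8: "SERR# enable",
    9: "fast back-to-back enable",
}

STATUS_BITS = {
    4: "capabilities list",
    5: "66 MHz capable",
    7: "fast back-to-back capable",
    8: "master data parity error",
    11: "signaled target abort",
    12: "received target abort",
    13: "received master abort",
    14: "signaled system error",
    15: "detected parity error",
}

DEVSEL_TIMING = {0: "fast", 1: "medium", 2: "slow", 3: "reserved"}


def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def decode_bits(value: int, names: dict) -> list:
    """Return the names of all set bits, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


def devsel_timing(status: int) -> str:
    """Decode Status bits 10:9 (DEVSEL# timing)."""
    return DEVSEL_TIMING[(status >> 9) & 0x3]


if __name__ == "__main__":
    # (command, status) as printed in the dump, before byte swapping.
    dump = {"gfx card": (0x0702, 0x9002), "G4 card": (0x0700, 0xA0A2)}
    for card, (raw_cmd, raw_status) in dump.items():
        cmd, status = swap16(raw_cmd), swap16(raw_status)
        print(card)
        print(f"  command {cmd:04X}: {decode_bits(cmd, COMMAND_BITS)}")
        print(f"  status  {status:04X}: {decode_bits(status, STATUS_BITS)}, DEVSEL {devsel_timing(status)}")
```

Under those assumptions the swapped gfx command value is 0x0207, which has the fast back-to-back enable bit set, and the swapped G4 status value is 0xA2A0, which has the detected-parity-error and received-master-abort bits set.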
16 March 2021, 03:56 | #17 | |
Registered User
Join Date: Dec 2015
Location: USA
Posts: 2,902
|
Relevant?
Quote:
|
|
06 May 2021, 03:16 | #18 | |
Semi-Retired
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
|
Most of the errors were due to bugs in the driver (who knew!). At least on MPC107 and Harrier most things are now working. For the rest of the errors I am looking hard at whether they are software or hardware related. The M1/K1 bridge is much more troublesome, however, and cannot run for more than a few seconds before a bus error.
I am not sure if the PPC goes into an exception as the 68K crashing takes the whole system with it. The 68K tries to read from PCI during vblank and then bus error (the 2D VGA driver is 68K). Looking into the fast back2back stuff. Quote:
|
|