English Amiga Board


Old 14 July 2017, 17:10   #1
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
Occasional bus error during PCI transfers

Hi all,

I'm currently working on a project where a PPC CPU on the PCI bus (Mediator) reads and writes the config registers of a graphics card on the same PCI bus. If I issue a lot of commands from the CPU to the config area of the graphics card, the card locks up and after a while a bus error pops up.

I was thinking, could there be a conflict between the PPC CPU and 68K CPU both trying to drive the PCI bus resulting in a lock up?

I know the Mediator has a bus master jumper. Can this help?

The code already checks whether the gfx card is idle before issuing the next commands, so it shouldn't be an overflow of commands. If I add delays between the commands, the system becomes stable (but slower).

Any experts who can point me in the right direction?
Old 14 July 2017, 23:31   #2
Daedalus
Registered User
 
 
Join Date: Jun 2009
Location: Dublin, then Glasgow
Posts: 6,334
The Mediator uses its own little DMA system with the graphics card memory, so some transfers are probably cached and queued to be transferred to the Amiga side. Perhaps you're falling foul of issues with that cache between the two CPUs? Just speculation though, I've never coded that sort of hardware combo...
Old 15 July 2017, 00:25   #3
Stedy
Registered User
 
 
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
Hi,

What values do the Base Address Registers (BAR) of the GPU report?
The low nibble shows whether the BAR is I/O or memory space, 32- or 64-bit and, importantly, whether it is prefetchable (the nearest thing a BAR has to cacheable support). Try an eieio command or msync(0) depending on the CPU.
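
For reference, a minimal sketch of how the low bits of a BAR decode under the standard PCI layout: bit 0 selects I/O or memory space, bits 2:1 give the memory type, and bit 3 is the prefetchable flag. The function name and the sample values are made up for illustration; they are not readbacks from the cards in this thread.

```python
def decode_bar(value):
    """Decode the flag bits in the low nibble of a 32-bit PCI BAR value."""
    if value & 0x1:
        # Bit 0 set: I/O space BAR; bits 31..2 hold the base address.
        return {"space": "io", "base": value & 0xFFFFFFFC}
    width = {0b00: "32-bit", 0b01: "below 1 MB (legacy)", 0b10: "64-bit"}
    return {
        "space": "memory",
        "type": width.get((value >> 1) & 0x3, "reserved"),
        "prefetchable": bool(value & 0x8),   # bit 3
        "base": value & 0xFFFFFFF0,          # bits 31..4
    }


# Hypothetical values: a prefetchable 32-bit memory BAR and an I/O BAR.
print(decode_bar(0x40000008))
print(decode_bar(0x5FE01001))
```

A register window that the driver polls should normally not be treated as prefetchable or cacheable, so this is worth checking against how the CPU maps the region.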

Which PPC chip and northbridge are you using, or is it a custom chip?

The PCI arbiter should stop the PPC and 68K both driving the bus.

When the error occurs, what status is returned by the various PCI devices?

Sorry for so many questions but I don't know enough about your system to go into specifics.
Old 15 July 2017, 05:48   #4
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
PPC is MPC750. Northbridge is MPC107.

When the error occurs the whole Amiga halts in such a bad way no status can be retrieved. I need to reset the machine. Even the reset takes a while to get through..

Not sure if eieio is fully supported on the MPC750, btw. I think it is treated as a nop in most cases.

For now, most of the errors are gone because, you guessed it, I made a mistake in setting up the memory region containing the GPU's config registers. I set it up as cache inhibited using a page table, which is correct. But I also set up a BAT that was Write_Through and overlapped the page table setup, and the BAT is checked before the page table.
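
To make the trap concrete, here is a small model of the lookup order, assuming the classic 32-bit PowerPC BAT layout (BEPI in the top 15 bits and BL in bits 12..2 of the upper register, WIMG in bits 6..3 of the lower one). It ignores the supervisor/user distinction between Vs and Vp, and the BAT values are hypothetical, not the ones from my setup.

```python
# WIMG bits as they sit in the lower BAT register.
W, I, M, G = 0x40, 0x20, 0x10, 0x08


def bat_lookup(ea, batu, batl):
    """Return the BAT's WIMG bits if `ea` falls inside the block, else None."""
    if not batu & 0x3:                      # neither Vs nor Vp set: BAT disabled
        return None
    bl = (batu >> 2) & 0x7FF                # block length, in 128 KB units minus 1
    mask = ~((bl << 17) | 0x1FFFF) & 0xFFFFFFFF
    if (ea & mask) != (batu & mask):
        return None
    return batl & (W | I | M | G)


def effective_wimg(ea, bats, page_wimg):
    """A BAT hit takes priority; the page table only counts on a BAT miss."""
    for batu, batl in bats:
        wimg = bat_lookup(ea, batu, batl)
        if wimg is not None:
            return wimg
    return page_wimg


# Hypothetical 256 MB write-through BAT at 0x40000000 covering a register
# page that the page table marks as cache inhibited and guarded.
bats = [(0x40001FFE, 0x40000042)]
print(hex(effective_wimg(0x48060000, bats, I | G)))   # BAT wins: W only
```

With write-through the loads are still served from the cache, which is why a polled status value can go stale.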

You always find that kind of stuff out moments after you post a question...

So the idle check was actually done on a value in cache. The idle check works correctly now, I have gotten rid of the delays, and the test programs now seem to work. So I can now draw triangles and stuff and texture them. I actually use the Warp3D examples of Kas1e from the os4coding forum, but compiled for WarpOS.

The more complex programs like gearsppc still don't work (bus error). I have to dig deeper for that; in this case it could also be an error in the driver itself.
Old 16 July 2017, 00:08   #5
Stedy
Registered User
 
 
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
Hi,

@Hedeon

Had not realised you were the guy working on the Sonnet 7200 software.

Have you got the Errata document for the TSI107?

One thing that caught me out on one design using the MPC8245 (PPC603 + TSI107 in one chip) was issues with DMAs and back-to-back transfers. Try clearing bit 9 (Fast Back-to-Back Enable) of the command register of the target device, in your case the GPU.

Also try setting bit 0 of the AMBOR register at offset 0xE0; this removes an issue with speculative reads of local memory.
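
As a sketch of the two read-modify-write steps: the command register sits at offset 0x04 of the standard PCI config header and bit 9 is Fast Back-to-Back Enable. The helper names are made up, the AMBOR width of 8 bits is an assumption to check against the manual, and the actual config-space accessors (with any byte swapping your setup needs) are left out.

```python
PCI_COMMAND = 0x04               # Command register offset in the config header
PCI_COMMAND_FAST_B2B = 1 << 9    # Fast Back-to-Back Enable
AMBOR = 0xE0                     # offset quoted above; assumed to be 8 bits wide


def clear_fast_back_to_back(command):
    """Return the 16-bit Command value with Fast Back-to-Back disabled."""
    return command & ~PCI_COMMAND_FAST_B2B & 0xFFFF


def set_bit0(reg):
    """Return an 8-bit register value with bit 0 set, other bits untouched."""
    return (reg | 0x01) & 0xFF


# Example with a Command value that has I/O, memory, bus master and bit 9 set.
print(hex(clear_fast_back_to_back(0x0207)))
```

Read the register first, apply the helper, then write the result back, so the other enable bits are preserved.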

The MPC8245, which has identical registers to the TSI107, has an errata document freely available here:
http://www.nxp.com/docs/en/errata/MPC8245CE.pdf

Good luck.
Old 01 July 2020, 11:47   #6
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
In the end I worked around it. It was bad code haha.

But over the years I have seen it pop up again occasionally. Is there also such a thing as a bus time-out? I read somewhere that PCI solutions for the Amiga give a bus time-out (in the shape of a bus error) when addressing slow stuff on the bus (e.g. a ROM on, let's say, a Voodoo or Radeon).

I think the readme of the new FireStorm firmware states something along those lines.
Old 27 January 2021, 17:43   #7
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
@Stedy

I want to revisit this. Are you available? In this case it is the Prometheus/FireStorm being the culprit, in combination with all the different northbridges.

Are you available? :-)
Old 28 January 2021, 02:49   #8
grelbfarlk
Registered User
 
Join Date: Dec 2015
Location: USA
Posts: 2,902
https://forum.amiga.org/index.php?topic=33092.15

http://www.e3b.de/prometheus/prometheus_V05.txt
Quote:
RETRY mechanism
===============
The new Fire Storm upgrade supports a simple RETRY mechanism for accesses from
Zorro III to PCI. Due to timing constraints on the Zorro III bus it is advisable
to access known slow devices only with PCI-PCI DMA disabled.
For the software side no changes are needed; in case a PCI device does issue a
RETRY situation, the Prometheus CPLDs will repeat the bus access immediately.
If a RETRY fails within the timeout of Zorro III, a Bus Error will occur on
Zorro III.

Cards known to produce RETRYs are:
- slow gfx cards when accessing onboard BIOS ROM
- PCI-PCI bridges, especially on CFG cycles on the PCI bus behind the bridge
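
To picture the behaviour the readme describes, here is a toy model of an access that is repeated on every RETRY until either the target accepts it or the Zorro III timeout runs out. All names and timing numbers are hypothetical; the readme gives no figures.

```python
def zorro_to_pci_access(retries_needed, retry_ns, timeout_ns):
    """Toy model: repeat a PCI access until it completes or Zorro III times out."""
    elapsed = 0
    attempts = 0
    while True:
        attempts += 1
        elapsed += retry_ns                 # every attempt costs bus time
        if elapsed > timeout_ns:
            return ("bus error", attempts)
        if attempts > retries_needed:       # target finally accepts the cycle
            return ("ok", attempts)


# A device that answers after a few RETRYs versus one that is too slow.
print(zorro_to_pci_access(3, 100, 1000))
print(zorro_to_pci_access(50, 100, 1000))
```

The point of the model is only that the software sees nothing until the retries exceed the timeout, at which point it gets a bus error.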
Old 03 February 2021, 13:13   #9
Stedy
Registered User
 
 
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
Quote:
Originally Posted by Hedeon
@Stedy

I want to revisit this. Are you available? In this case it is the Prometheus/FireStorm being the culprit, in combination with all the different northbridges.

Are you available? :-)
Hi,

I'll try my best. Do you have something equivalent to "lspci" on the Amiga?
This is useful as it prettifies and prints the PCI config registers of every device and is a good starting point.
Old 03 February 2021, 15:54   #10
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
Not really. But are there fields you are especially interested in? PrmScan does not show everything, but maybe I can add them.
Old 03 February 2021, 22:18   #11
grelbfarlk
Registered User
 
Join Date: Dec 2015
Location: USA
Posts: 2,902
OpenPCIInfo dumps a lot of the config space too.
Old 03 February 2021, 22:54   #12
Stedy
Registered User
 
 
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
Hi,

Interested in the PCI Command and Status registers, latency timer, cache line, interrupt line/pin, MIN grant, MAX LAT and the base address registers.
It's also useful to know what memory regions are cacheable, from either CPU.
When the PowerPC system hangs, can you still perform PCI transactions from the 68K processor/Zorro bus?
Old 04 February 2021, 01:57   #13
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
Quote:
Originally Posted by Stedy
Hi,

Interested in the PCI Command and Status registers, latency timer, cache line, interrupt line/pin, MIN grant, MAX LAT and the base address registers.
It's also useful to know what memory regions are cacheable, from either CPU.
When the PowerPC system hangs, can you still perform PCI transactions from the 68K processor/Zorro bus?
Prometheus range = 0x40000000-0x60000000
The whole range is cache inhibited regarding the 68K.

PrmScan:

Code:
Prmscan 1.6 by Grzegorz Kraszewski.
PCI cards listing:
-------------------------------------------------
Board in slot 0, function 0
Vendor: Realtek Audio/Lan?Maker
Device: RTL8028 PCI Full-Duplex Ethernet Controller with PnP Function
Revision: 0.
Device class 02, subclass 00.
Address range: 5FE01100 - 5FE0111F (32 B).
Board driver: prm-rtl8029.device.
-------------------------------------------------
Board in slot 1, function 0
Vendor: ATI Technologies Inc. / Advanced Micro Devices, Inc.
Device: unknown ($5960)
Revision: 1.
Device class 03, subclass 00.
Address range: 40000000 - 47FFFFFF (128 MB).
Address range: 5FE01000 - 5FE010FF (256 B).
Address range: 48060000 - 4806FFFF (64 kB).
128 kB of ROM at 48040000 - 4805FFFF.
Board driver: NONE.
-------------------------------------------------
Board in slot 2, function 0
Vendor: Motorola
Device: unknown ($480B)
Revision: 2.
Device class 06, subclass 00.
Address range: 48070000 - 48070FFF (4 kB).
Address range: 48071000 - 48071FFF (4 kB).
Address range: 50000000 - 57FFFFFF (128 MB).
Address range: 48000000 - 4803FFFF (256 kB).
Board driver: NONE.
-------------------------------------------------
Regarding PPC cache:

The frame buffer (in this case 0x40000000-0x48000000) is Write-Through.
The VGA config (in this case 0x48060000-0x48070000) is cache inhibited/guarded.
The PPC memory (in this case 0x50000000-0x58000000) is mostly copy-back.
The PPC configs (in this case 0x48070000-0x48072000) are cache inhibited/guarded.

Rest is less important, I'd think

I can still access the PPC card memory and config ranges from the 68K debugger when the PPC hang happens. Access to the frame buffer or VGA config registers by the 68K debugger results in a bus error.

I'll look up the other values of the cards soon. The only thing OpenPCIscan has more is some status/command stuff, but not the rest.

I do expect a time-out of some kind (see grelbfarlk's reference).

Last edited by Hedeon; 04 February 2021 at 02:02.
Old 05 February 2021, 12:57   #14
Stedy
Registered User
 
 
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
Hi,

From what you describe, the PowerPC processor has had a critical fault; on some PPC devices this is a machine check exception. I have seen this in my day job: the CPU core would hang, but another processor could still access RAM on the 'dead' card. All was restored on a reboot.

What processor and North bridge are you using?
Old 08 March 2021, 20:52   #15
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
Processor is MPC7410. Northbridge is the 1057, 480b (PCI id).

It's in LE. Originally, the Latency Timer was $00 by default. $80 gives fewer errors. This is with the Prometheus.

VendorID, DeviceID,
Command, Status
RevID, ProgIF, Subclass, Classcode
CacheLineSize, Latency Timer, Header Type, BIST
Interrupt Line, Interrupt Pin, Min Grant, Max Latency

gfx card:
$0210, $6059
$0702, $9002
$01, $00, $00, $03
$00, $80, $80, $00
$FF, $01, $08, $00

G4 card:
$5710, $0B48
$0700, $A0A2
$02, $00, $00, $06
$00, $80, $00, $00
$00, $01, $00, $00
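
A minimal sketch for decoding the Command and Status values above, assuming the 16-bit words are shown byte-swapped (vendor $0210 matches ATI's 0x1002 and $5710 matches Motorola's 0x1057 once swapped). The bit names follow the standard PCI config header; the function names are made up for this example.

```python
COMMAND_BITS = {0: "I/O space", 1: "memory space", 2: "bus master",
                6: "parity error response", 8: "SERR# enable",
                9: "fast back-to-back enable"}
STATUS_BITS = {4: "capabilities list", 5: "66 MHz capable",
               7: "fast back-to-back capable", 8: "master data parity error",
               11: "signaled target abort", 12: "received target abort",
               13: "received master abort", 14: "signaled system error",
               15: "detected parity error"}


def swap16(value):
    """Swap the two bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | (value >> 8)


def flags(value, names):
    """List the named bits that are set in `value`, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


def decode(dumped_command, dumped_status):
    """Decode Command/Status words given exactly as printed in the dump."""
    command, status = swap16(dumped_command), swap16(dumped_status)
    return {"command": flags(command, COMMAND_BITS),
            "status": flags(status, STATUS_BITS),
            "devsel": ("fast", "medium", "slow", "reserved")[(status >> 9) & 3]}


print("gfx:", decode(0x0702, 0x9002))
print("G4: ", decode(0x0700, 0xA0A2))
```

Fed with the values above, the G4 card's status decodes to detected parity error and received master abort, and the gfx card's command register has fast back-to-back enabled.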

What I have found so far is that the gfx card has crashed, removing it effectively from PCI space. If the 68K then tries to access it, it gives a bus error. If the PPC tries to access it, it just halts.

The gfx card crashes because its command FIFO has overflowed. That happens when the command processor stops processing commands, which is often the result of receiving an invalid command packet.
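
One way to keep a stalled command processor from turning into an overflow and a hard hang is to bound the wait for free FIFO entries and fail loudly instead. This is only a sketch of the idea: `read_free_entries` stands in for whatever uncached register read the driver uses, and the names and the timeout value are hypothetical.

```python
import time


class FifoTimeout(Exception):
    """Raised when the command FIFO never frees enough entries."""


def wait_for_fifo(read_free_entries, needed, timeout_s=0.5, clock=time.monotonic):
    """Poll until at least `needed` FIFO entries are free, or give up."""
    deadline = clock() + timeout_s
    while True:
        free = read_free_entries()          # must be an uncached register read
        if free >= needed:
            return free
        if clock() >= deadline:
            raise FifoTimeout(f"only {free} of {needed} entries free")


# Simulated card that frees up entries over three polls.
samples = iter([0, 2, 16])
print(wait_for_fifo(lambda: next(samples), needed=8))
```

On a timeout the driver can stop submitting and report the state of the card, rather than writing further packets into a full FIFO.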

So why is it getting wrong data? With the Mediator this does not happen, and it is the same code. Maybe something gets corrupted when the PPC is pushed off the bus as bus master in the middle of a transfer.
Old 16 March 2021, 01:05   #16
Stedy
Registered User
 
 
Join Date: Jan 2008
Location: United Kingdom
Age: 46
Posts: 733
Hi,

Assuming I've byte reversed correctly, the status registers indicated that the graphics card had a parity error on a transfer and the processor detected it and set a master abort.

Looking at the command register, you have fast back to back transfers enabled on the graphics card but the CPU cannot support this. Would be worth disabling this on the graphics card.

Do you have any exception handlers for the PCI bus, or do you use the machine check as a catch-all handler?

I have seen PCI errors cause a machine check on E300/PPC603 cores in the past. I can go into more detail, I guess I should look at the SonnetPCI libraries first?
Old 16 March 2021, 03:56   #17
grelbfarlk
Registered User
 
Join Date: Dec 2015
Location: USA
Posts: 2,902
Relevant?

Quote:
Originally Posted by Timtheloon
Hi all

You lot probably know about scanPCI 0.9

Why don’t we use this more often? It gives lots of info, most of which goes way over my head

Like my next question

I notice with ScanPCI it states: Detected Parity Error. What does this mean? It states it on all the PCI cards with the exception of the sound card


Old 06 May 2021, 03:16   #18
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 1,993
Most of the errors were due to bugs in the driver (who knew!). At least on MPC107 and Harrier most things are now working. For the rest of the errors I am looking hard at whether they are software or hardware related. The M1/K1 bridge is much more troublesome, however, and cannot run for more than a few seconds before a bus error.

I am not sure if the PPC goes into an exception, as the 68K crashing takes the whole system with it. The 68K tries to read from PCI during vblank and then gets a bus error (the 2D VGA driver is 68K).

Looking into the fast back2back stuff.
