English Amiga Board


Old 14 December 2014, 07:05   #1
Damion
Registered User

 
Join Date: Mar 2008
Location: US
Posts: 261
A4091, PFS3, and DMA mask

While tinkering with my A4000/A4091/CSMK2, I've discovered that, strangely, the A4091 performs better if DMAing to motherboard RAM, vs accelerator RAM.

Setting motherboard RAM to the highest priority increases read performance (RSCP) from 4.3 MB/s to 4.8 MB/s, and to 5.0 MB/s if motherboard RAM is also set to 60ns rather than the default 80ns (SpeedRamsey). SysSpeed also shows up to 5.5 MB/s (from 4.3), and slightly better filesystem benchmarks.

Obviously, permanently setting motherboard RAM priority above what's on the Cyberstorm is not the answer, so I've been trying to restrict the DMAable memory space using the Mask field in HDInstTools.

The memory range starts at 07000000 and ends at 08000000 (beyond that is Cyberstorm RAM), hence I've tried setting Mask to 07ffffff, 07fffffc, etc., but during boot PFS3 complains "Allocated memory doesn't match memory mask" and sets the partition(s) read-only. As a test, I've also tried restricting the DMA area to chip RAM, with the same results. Entering a setting within the range of Cyberstorm RAM works fine, so it seems that during boot the buffers are allocated first, and then the filesystem checks that they match the mask setting - but I could be wrong.
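To make the mask arithmetic concrete, here is a minimal C sketch of how a check like this could work, assuming the mask is a plain bitmask of permitted address bits. The helper name `buffer_matches_mask` and the treatment of the low two bits as alignment-only are assumptions for illustration, not PFS3's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the low two mask bits only constrain the alignment of the
 * buffer start, so they are ignored when checking the last byte. */
#define ALIGN_BITS 0x3u

/* Hypothetical helper: true if a buffer at [addr, addr + len) is usable
 * under the given mask. */
static bool buffer_matches_mask(uint32_t addr, uint32_t len, uint32_t mask)
{
    assert(len > 0);

    uint32_t last = addr + (len - 1u);
    if (last < addr) {
        return false;               /* range wraps past the top of memory */
    }
    if ((addr & ~mask) != 0) {
        return false;               /* start uses a forbidden address bit */
    }
    /* The end must not spill into forbidden high address bits. */
    return (last & ~(mask | ALIGN_BITS)) == 0;
}
```

With mask 07ffffff, a buffer at 07000000 passes and one at 08000000 (Cyberstorm RAM) fails. The same mask also accepts any lower address, chip RAM included, since a bitmask cannot express a lower bound.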

I figure I'm either setting the mask incorrectly (highly probable), or PFS3 has some built in "idiot proofing" feature that's preventing me from doing something it doesn't deem optimum.

Any thoughts?

*edit* BAH, never mind, this is all over my head, but using the filesystem mask to prevent the driver from DMAing to accelerator RAM is a silly idea and not going to work. I'm not sure why generic benchmarks are faster with motherboard RAM set to a higher priority - perhaps something to do with the 4091 being designed along with the 3640 (DMA to motherboard RAM), rather than 3rd party accelerator boards with local memory - but the benchmarks don't tell the whole story. (For example, copying a large file from a partition on the 4091 is still faster when it goes to a RAM disk in accelerator memory.) For now I'll return the filesystem mask to ffffffff, drink a beer and be happy it all works together OK...

Last edited by Damion; 14 December 2014 at 11:07.
Old 14 December 2014, 11:08   #2
strim
NetBSD developer
 
Join Date: May 2012
Location: Warsaw, Poland
Posts: 410
Quote:
or PFS3 has some built in "idiot proofing" feature that's preventing me from doing something it doesn't deem optimum
I am afraid that's the case here. Your 07ffffff is a correct mask for this situation in my opinion.

The message is a bit weird, since (as far as I know) the filesystem should allocate buffers according to the mask. Here it looks like it allocated them first, then checked whether they were within the mask range and complained.

I wonder if the same happens with FFS.
Old 14 December 2014, 11:13   #3
thomas
Registered User
 
Join Date: Jan 2002
Location: Germany
Posts: 5,918
The main problem is that you cannot choose which FastRAM to allocate memory from. There is a MEMF_24BITDMA flag for Zorro2, but no 27BIT flag.

FFS probably silently degrades performance.
Old 14 December 2014, 12:07   #4
Damion
Registered User

 
Join Date: Mar 2008
Location: US
Posts: 261
Thanks for the replies!

Quote:
I wonder if the same happens with FFS.
Gave it a try on a dummy partition,

Quote:
FFS probably silently degrades performance.
Exactly - a "readfile" benchmark slows from 5 to .5 MB/s.

I believe what happens is that the device allocates a buffer in the highest-priority DMAable RAM, which is why changing priority "works": differences in the way the controller performs using buffers in various memory types are measurable. Restricting the mask, however, forces the filesystem to get involved (it recognizes that the device has made a request to a region you've told it not to use) and "bungles" (slows down) the transfer, in the case of FFS anyway (PFS throws an error so you know something's wrong, maybe a better idea).

At least, that's my half-assed understanding of it... :P
Old 14 December 2014, 15:08   #5
thomas
Registered User
 
Join Date: Jan 2002
Location: Germany
Posts: 5,918
I am sure that neither PFS nor FFS willingly slow down the transfer.

I don't know what PFS does. Actually I am surprised that it runs so much faster than FFS with the wrong memory.

But I guess that FFS just sends the request to the exec device anyway and the exec device disables DMA because the buffer memory cannot be used for DMA.

The mask value just tells the file system which memory areas can be used for DMA. It's up to the file system to react to a wrong memory location. IMHO a proper implementation would be to copy buffers back and forth between actual transfers. Perhaps this is what PFS does, and this is the reason why it suffers only a little bit. But then I wonder why it sets the partition to read-only.
Old 14 December 2014, 15:48   #6
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,571
PFS source check says:

Read/Write has a mask check: if the memory to be written/read does not match the mask, it allocates a temporary buffer (if the temp buffer does not match the mask either -> error -> write protected) and uses the temp buffer: CPU copy to/from temp, disk operation, free temp buffer.
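The read side of that flow can be sketched roughly as follows. This is a portable simulation, not the PFS3 source: `masked_read`, `fake_disk_read`, and the `temp_addr` parameter (which stands in for wherever the system allocator happens to place the temporary buffer) are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef enum { IO_DIRECT, IO_BOUNCED, IO_WRITE_PROTECTED } io_result_t;

/* True if the address only uses bits permitted by the mask. */
static bool in_mask(uint32_t addr, uint32_t mask)
{
    return (addr & ~mask) == 0;
}

/* Stand-in for the device transfer: fills the buffer with a known pattern. */
static void fake_disk_read(uint8_t *dst, size_t len)
{
    memset(dst, 0xA5, len);
}

/* dest_addr: simulated Amiga address of the caller's buffer.
 * temp_addr: simulated address the allocator would return for the
 *            temporary buffer (assumption: we cannot choose it). */
static io_result_t masked_read(uint32_t dest_addr, uint8_t *dest, size_t len,
                               uint32_t mask, uint32_t temp_addr)
{
    assert(dest != NULL && len > 0);

    if (in_mask(dest_addr, mask)) {
        fake_disk_read(dest, len);      /* device transfers straight to dest */
        return IO_DIRECT;
    }

    uint8_t *temp = malloc(len);        /* "blind" temp allocation */
    if (temp == NULL || !in_mask(temp_addr, mask)) {
        free(temp);                     /* temp is unusable as well */
        return IO_WRITE_PROTECTED;
    }

    fake_disk_read(temp, len);          /* disk operation into temp */
    memcpy(dest, temp, len);            /* CPU copy to the real buffer */
    free(temp);
    return IO_BOUNCED;
}
```

In this model the read-only outcome depends entirely on where the temporary buffer lands, which would fit the behaviour reported earlier in the thread if the allocator keeps handing out higher-priority accelerator RAM.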

I think a nice solution would be to enumerate all memheaders in the exec memory list and allocate directly from the first memheader that matches the mask (using Allocate(), which takes a memheader pointer), instead of "blindly" allocating a temp buffer and hoping for the best.
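A rough sketch of that idea, as a portable simulation: the `region_t` struct is a hypothetical stand-in for an exec MemHeader (lower/upper bounds, priority, free bytes), and the array is assumed to be in the same priority order as the system memory list. On a real system the walk would, as far as I know, have to happen under Forbid()/Permit(), and the final allocation would go through Allocate() on the chosen header; neither is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory header in the system list. */
typedef struct {
    const char *name;
    uint32_t lower;       /* first address of the region */
    uint32_t upper;       /* one past the last address */
    int8_t   pri;         /* list priority (list assumed sorted by this) */
    uint32_t free_bytes;  /* simplification: ignores fragmentation */
} region_t;

/* Assumption: the low two mask bits express alignment only. */
#define ALIGN_BITS 0x3u

/* Return the first region that lies entirely inside the mask and has
 * enough free space, or NULL if none qualifies. */
static const region_t *first_region_in_mask(const region_t *list, size_t n,
                                            uint32_t mask, uint32_t len)
{
    assert(list != NULL || n == 0);

    uint32_t forbidden = ~(mask | ALIGN_BITS);
    for (size_t i = 0; i < n; i++) {
        const region_t *r = &list[i];
        if (r->free_bytes < len) {
            continue;                       /* not enough room */
        }
        if ((r->lower & forbidden) != 0) {
            continue;                       /* region starts outside mask */
        }
        if (((r->upper - 1u) & forbidden) != 0) {
            continue;                       /* region ends outside mask */
        }
        return r;
    }
    return NULL;
}
```

The whole-region test is deliberately conservative: a region that only partly falls inside the mask is skipped rather than searched for a usable chunk.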
Old 15 December 2014, 10:12   #7
Damion
Registered User

 
Join Date: Mar 2008
Location: US
Posts: 261
Quote:
I am sure that neither PFS nor FFS willingly slow down the transfer.
You're right, bad word choice on my part. I suppose the slow down is all the extra copying that has to transpire.

Quote:
I don't know what PFS does. Actually I am surprised that it runs so much faster than FFS with the wrong memory.
Sorry, I should have clarified - with the wrong mask, I believe both slow down, but I can't readily measure with PFS3 because of the constant nag requesters, and the fact the partition becomes read only. (SysSpeed benchmarks don't work then, and ofc stuff like RSCP that appear to bypass the filesystem are no help, either).

Thanks again fellas for the replies. I was under some mistaken assumptions - namely that the buffer could be restricted to motherboard fastram and provide a speedup consistent with what I observed by changing the RAM priorities (DOH!), when obviously/ideally the accelerator ram also needs to be DMA reachable to avoid extra CPU copying.

Funnily enough, due to the hardware quirks I bet the 4091 performs better with the original 3640 than it does with the Cyberstorm (this will be coming next... :P).

@Toni

Slightly off topic, but I just read in another old thread that your 4091 died. I've known of a few now with the dreaded "leaking capacitor" issue, mine included, and another fellow whose card was dead until they were replaced. Hopefully it is/was that simple...