09 August 2024, 07:51 | #1 |
Registered User
Join Date: Nov 2009
Location: Top of the world
Posts: 188
|
Questions about AllocMem()-, AllocVec()-alignment and Wipeout
I'm trying to modify Wipeout to do memory-tracking/leak-detection only.
While hunting down a bug I created by deleting too much code when removing the memory-wall code, I found some things that puzzled me. The first is the alignment of the allocations. One of the first things done by the function that preform the allocation is to pad the requested size to be dividable by MEM_BLOCKSIZE. It then goes on to add the sizes of its own headers to this, before preforming the actual allocation. Resulting in an actual allocation that is not always dividable by MEM_BLOCKSIZE, which seems kind of wrong to me. Am I right in thinking that to be completely correct, the actual allocation should also be dividable by MEM_BLOCKSIZE? The second thing that tripped me up was the storing of size in AllocVec() allocations. It adds the size of the "size longword" to the variable "memSize" before the allocation is done. And then it stores "memSize + sizeof(ULONG)" in the "size longword" at the end. This can't be right? And finally. If I am right about that the actual allocation should be dividable by MEM_BLOCKSIZE, I assume this should be the total allocation including the "size longword" for AllocVec(). Is this right? Code in question follows. -Hagbard Celine Code:
BOOL PerformAllocation( ULONG * pc, struct PoolHeader * poolHeader, ULONG memSize, ULONG attributes, UBYTE type, APTR * resultPtr) { struct TrackHeader * th; ULONG allocationRemainder; ULONG postSize; LONG nameTagLen; APTR result = NULL; BOOL success = FALSE; nameTagLen = 0; /* Get the name of the current task, if this is necessary. */ if(NameTag) nameTagLen = GetNameTagLen(*pc); /* If the allocation is not a multiple of the memory granularity * block size, increase the post memory wall by the remaining * few bytes of padding. */ allocationRemainder = (memSize % MEM_BLOCKSIZE); if(allocationRemainder > 0) allocationRemainder = MEM_BLOCKSIZE - allocationRemainder; postSize = allocationRemainder; switch(type) { case ALLOCATIONTYPE_AllocMem: th = (*OldAllocMem)(nameTagLen + TrackHeaderSize + memSize + postSize,attributes & (~MEMF_CLEAR),SysBase); if(th != NULL) { /* Store the name tag data in front of the header, then * adjust the header address. */ if(nameTagLen > 0) { FillNameTag(th,nameTagLen); th = (struct TrackHeader *)(((ULONG)th) + nameTagLen); } AddAllocation(th); } break; case ALLOCATIONTYPE_AllocVec: /* This will later contain the length long word. */ memSize += sizeof(ULONG); th = (*OldAllocMem)(nameTagLen + TrackHeaderSize + memSize + postSize,attributes & (~MEMF_CLEAR),SysBase); if(th != NULL) { /* Store the name tag data in front of the header, then * adjust the header address. */ if(nameTagLen > 0) { FillNameTag(th,nameTagLen); th = (struct TrackHeader *)(((ULONG)th) + nameTagLen); } AddAllocation(th); } break; case ALLOCATIONTYPE_AllocPooled: th = (*OldAllocPooled)(poolHeader->ph_PoolHeader,nameTagLen + TrackHeaderSize + memSize + postSize,SysBase); if(th != NULL) { /* Store the name tag data in front of the header, then * adjust the header address. */ if(nameTagLen > 0) { FillNameTag(th,nameTagLen); th = (struct TrackHeader *)(((ULONG)th) + nameTagLen); } AddPuddle(poolHeader,th); } break; default: th = NULL; break; } if(th != NULL) { STATIC ULONG Sequence; UBYTE * mem; /* Fill in the regular header data. */ th->th_Magic = BASEBALL; th->th_PointBack = th; th->th_PC = *pc; th->th_Owner = FindTask(NULL); th->th_OwnerType = GetTaskType(NULL); th->th_NameTagLen = nameTagLen; GetSysTime(&th->th_Time); th->th_Sequence = Sequence++; th->th_Size = memSize; th->th_PoolHeader = poolHeader; th->th_Type = type; th->th_PostSize = postSize; th->th_Marked = FALSE; if (StackCacheSize) { ULONG * stackmem; int i,j; stackmem = &pc[1]; j = 0; th->th_StackCache[j].sc_sourceName[0] = 0; for(i = 0 ; (void *)&stackmem[i] <= th->th_Owner->tc_SPUpper && j < StackCacheSize ; i++) { if ((stackmem[i] > 0x4000) && (TypeOfMem((APTR)stackmem[i]))) { if (FindAddressEx(stackmem[i], th->th_StackCache[j].sc_sourceName, &th->th_StackCache[j].sc_lineNum, &th->th_StackCache[j].sc_lineOffset)) { th->th_StackCache[j].sc_stack = stackmem[i]; ++j; if ( j < StackCacheSize ) th->th_StackCache[j].sc_sourceName[0] = 0; } } } while (j < (StackCacheSize - 1)) { ++j; th->th_StackCache[j].sc_sourceName[0] = 0; } } /* Calculate the checksum. */ FixTrackHeaderChecksum(th); /* Fill in the preceding memory wall. */ mem = (UBYTE *)(th + 1); mem += StackHeaderSize; /* Zero the memory allocation body if * requested. */ if(FLAG_IS_SET(attributes,MEMF_CLEAR)) memset(mem,0,memSize); /* AllocVec()'ed allocations are special in that * the size of the allocation precedes the header. */ if(type == ALLOCATIONTYPE_AllocVec) { /* Size of the allocation must include the * size long word. */ (*(ULONG *)mem) = memSize + sizeof(ULONG); result = (APTR)(mem + sizeof(ULONG)); } else { result = (APTR)mem; } success = TRUE; } (*resultPtr) = result; return(success); } |
09 August 2024, 14:20 | #2 |
Registered User
Join Date: Nov 2009
Location: Top of the world
Posts: 188
|
EDIT: WARNING: This post is 101% Fake News and Bulls**t, as it was based on incomplete data. I never checked the actual amount allocated and assumed it was equal to result[-1] for AllocVec().
After mulling it over, I realized I could easily check this by doing a few "result=AllocVec()" and checking "result[-1]". And the correct answers seem to be: 1. Yes, the final allocation should be aligned for AllocVec() to match original behavior. AllocMem() does not do alignment of the end of allocation by design (obviously ). 2. One of the duplicate "sizeof(ULONG)" should be replaced with "the size of alignment padding". 3. Yes. -Hagbard Celine Last edited by hceline; 11 August 2024 at 18:20. Reason: False conclusions |
11 August 2024, 10:34 | #3 | |||
Registered User
Join Date: Aug 2010
Location: Germany
Posts: 546
|
Quote:
"Wipeout" makes use of the designated operating system memory allocation/deallocation functions and does not tinker with the underlying data structures. It actually calls AllocMem() and FreeMem(), which in turn make use of the Allocate() and Deallocate() functions, respectively. Allocate() will round up the requested allocation size to a multiple of MEM_BLOCKSIZE. Quote:
Quote:
"Wipeout" does not round up the allocation size all by itself. This is because you should be able to know if memory allocated with AllocMem(4, MEMF_ANY) is getting corrupted after the 4th byte. If "Wipeout" were to round up the allocation size to a multiple of MEM_BLOCKSIZE, you would only be able to detect corruption after the 8th byte. I need to check my code to see if the allocation performed by "Wipeout" is making sure that the address returned by a successful AllocMem() call is a multiple of MEM_BLOCKSIZE. This is important for code which expects this alignment, e.g. for storing MMU tables. |
|||
11 August 2024, 11:22 | #4 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,490
|
Aligned allocation goes through the trick that is documented in the PoolMem.readme. It is an AllocMem with a round-up size, followed by FreeMem() and AllocAbs(), all in a Forbid() state. Replacement allocators need to support this scheme.
|
11 August 2024, 14:48 | #5 | ||
Registered User
Join Date: Nov 2009
Location: Top of the world
Posts: 188
|
Quote:
result=AllocVec(20,0)and then KPrintF("%ld",result[-1]). I get "24" if Wipeout is not running or active, and "28" if Wipeout is running and active. Test done with Exe directly from Aminet archive. Quote:
No, but it aligns all the data surrounding the requested allocation to a multiple of MEM_BLOCKSIZE including adding padding to the "PostWallSize". So the total final allocation requested when original AllocMem() is called is always a multiple of MEM_BLOCKSIZE. Except for AllocVec() calls, where it adds the "sizeof(LONG)" to "memSize" after calculating the "postSize". Going off to do some more tests in light of the info provided. ...And to find and read the PoolMem.readme. Last edited by hceline; 11 August 2024 at 14:50. Reason: Punctuation |
||
11 August 2024, 16:20 | #6 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,490
|
[QUOTE=hceline;1700235]I get "24" if Wipeout is not running or active, and "28" if Wipeout is running and active.
[/QUOTE} This looks a bit dubous to me. It should be the same value, even more so as some programs may use array[-1] to find out region sizes. It's in particular a not too uncommon trick to find the size of segments loaded through LoadSeg(). Pretty much at its obvious location: http://aminet.net/util/sys/PoolMem.lha See there Developer.readme. It contains a couple of hints on memory allocation and (implied) requirements imposed by some programs. |
12 August 2024, 19:50 | #7 | |
Registered User
Join Date: Nov 2009
Location: Top of the world
Posts: 188
|
Did some more testing.
These are my new answers to my original questions. (Mostly repeating my previous post, but now I am sure I checked all variables.) 1. As the system does the aligning in the end it is not strictly needed in the case of Wipeout. But since it already has everything else aligned, having the call to original AllocMem() for the AloocVec() codepath not aligned wastes memory. If I allocate 3 bytes the "allocationRemainder" will come out at 5 bytes. When it then adds " memSize += sizeof(ULONG)" it makes the total allocation unaligned by 4 bytes, which gets added when the system does its own rounding. Total padding by Wipeout and system becomes 9 bytes. Edit: I also just realized that in this case Wipeout would not be able to detect any memory corruption in those 4 bytes that the system rounding adds. If the size longword where included when calculating "allocationRemainder", it would come out at 1 byte. The total allocation when calling original AllocMem() would be already aligned, and the system-rounding would not change the allocation. Total padding by Wipeout and system becomes 1 byte. 2. There is clearly one "+ sizeof(ULONG)" to many in the AllocVec() codepath. As "result=AllocVec(); KPrintF(result[-1]);" always gives a result that is 4 higher when Wipeout is running, than when it is not. 3. Yes, see 1. Quote:
Last edited by hceline; 12 August 2024 at 23:47. |
|
Currently Active Users Viewing This Thread: 1 (0 members and 1 guests) | |
Thread Tools | |
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Design idea behind AllocMem()? | Steam Ranger | Coders. System | 16 | 13 July 2024 17:23 |
Is it possible to reduce an AllocMem() chunk? | 8bitbubsy | Coders. Asm / Hardware | 34 | 12 October 2023 19:51 |
AllocVec returning zero - why? | buzzybee | Coders. System | 20 | 26 September 2020 10:57 |
BPTR alignment | sparhawk | Coders. Asm / Hardware | 32 | 28 February 2020 12:59 |
Wipeout / Wipeout 2097 Fly-Through website | killergorilla | Retrogaming General Discussion | 10 | 02 July 2015 15:56 |
|
|