Old 05 August 2022, 17:26   #19
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,107
Cool, I did consider trying to pull the result out of the extended-precision value, but hadn't had time. It doesn't seem to be a massive speed improvement, but if the conversion/rounding can be optimized it might have potential. Results:
smul64: 69 cycles
sdiv64: 138 cycles
divllu (returning just quotient): 175 cycles
I measured overhead to be ~37 cycles (just calling a dummy asm routine with 2 stack arguments and storing a result in "r").
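
For anyone wanting to repeat this kind of measurement, here is a rough C sketch of the dummy-call calibration idea. It is not the harness used for the numbers above: every name in it (tick_fn, op_fn, dummy_op, time_calls, net_per_call, clock_ticks) is made up for illustration, and clock() is only a portable stand-in for whatever cycle or timer counter is actually available.

```c
#include <stdint.h>
#include <time.h>

/* All names here are hypothetical; none of them come from the thread. */
typedef uint32_t (*tick_fn)(void);            /* any free-running tick/cycle counter */
typedef int64_t  (*op_fn)(int32_t, int32_t);  /* routine taking 2 arguments */

volatile int64_t r;                           /* sink so the calls are not optimized away */

/* Portable stand-in tick source (assumption: swap in a real cycle counter). */
uint32_t clock_ticks(void) { return (uint32_t)clock(); }

/* Baseline: same call shape as the routine under test, but no work. */
int64_t dummy_op(int32_t a, int32_t b) { (void)a; (void)b; return 0; }

/* Total ticks spent on `iters` calls of f, each storing its result in r. */
uint32_t time_calls(op_fn f, tick_fn now, uint32_t iters)
{
    uint32_t start = now();
    for (uint32_t i = 0; i < iters; i++)
        r = f(123456789, -987654);
    return now() - start;   /* modular subtraction, so a single wrap is tolerated */
}

/* Per-call cost with the baseline subtracted; returns 0 if the baseline
   is not smaller than the measurement (noise) or iters is 0. */
uint32_t net_per_call(uint32_t op_ticks, uint32_t dummy_ticks, uint32_t iters)
{
    if (iters == 0 || op_ticks <= dummy_ticks)
        return 0;
    return (op_ticks - dummy_ticks) / iters;
}
```

Usage would be along the lines of net_per_call(time_calls(my_op, clock_ticks, n), time_calls(dummy_op, clock_ticks, n), n), where my_op is whichever routine is being timed. Timing the baseline with the same loop and the same iteration count means loop and call overhead cancel out in the subtraction.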

P.S. Isn't it a bit dangerous to use a stack "red zone" like that? It's probably OK if not in supervisor mode, but it seems sketchy.