16 January 2016, 16:44 | #101 |
Registered User
Join Date: Dec 2007
Location: The World
Age: 50
Posts: 476
|
Slightly OT but related: I remember getting excited about ST Pipemania having sampled music. Then I ripped the samples and found that every note was stored as a separate sample, as described above.
|
16 January 2016, 20:33 | #102 |
Registered User
Join Date: Jun 2010
Location: PL?
Posts: 2,878
|
|
17 January 2016, 16:15 | #103 | |
Code Kitten
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
|
Quote:
Just incredible. |
|
17 January 2016, 17:21 | #104 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
|
18 January 2016, 11:06 | #105 | ||||||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
|
Quote:
Quote:
And again, if you have a data cache, you don't really have the problem (your cpu is fast enough to do better). Quote:
Not necessarily! Prefetch can make that fail as well. Quote:
Quote:
Quote:
But the LUT has another advantage. A multiply would give you a 14-bit result, whereas a LUT gives you another 8-bit value: it's multiply and rescale in one instruction. It might also give you signed-to-unsigned conversion for free, which is better for adding. Quote:
By the way, what will all these tricks buy? Fast mixing? But why the heck was mixing needed in the first place? To get more channels? But what did one need more channels for? For better quality music? But many tricks listed in this thread (such as fewer notes, fewer volume levels) will in fact lower the quality! I'm afraid that for a game the usual 3ch music + 1ch sfx is still the best bet, and if you need more channels then anything that badly lowers the quality is out of the question... Quote:
If you have heaps of memory then perhaps you also have heaps of disk space and pre-mixing the whole music would give better results... |
||||||||
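A minimal sketch of the volume LUT idea from the post above, written in Python rather than 68k asm so it can be run anywhere. The 65 volume levels (0..64, matching Paula's volume range), the `shift` of 6 and the optional `bias` are assumptions chosen for illustration; the thread does not specify a table layout, and `build_volume_table` / `mix2` are hypothetical names.

```python
def build_volume_table(levels=65, shift=6, bias=0):
    """table[vol][byte] = ((signed byte * vol) >> shift) + bias.

    One lookup replaces a multiply plus a rescale back to 8-bit range.
    With bias=128 the entries come out unsigned (0..255) for free.
    """
    table = []
    for vol in range(levels):
        row = []
        for byte in range(256):
            s = byte - 256 if byte >= 128 else byte   # reinterpret as signed 8-bit
            row.append(((s * vol) >> shift) + bias)   # 14-bit product back to 8 bits
        table.append(row)
    return table


def mix2(table, sample_a, vol_a, sample_b, vol_b):
    """Mix two signed 8-bit samples with one table lookup each and one add."""
    return table[vol_a][sample_a & 0xFF] + table[vol_b][sample_b & 0xFF]
```

With these assumptions the table has 65 x 256 = 16640 entries, i.e. roughly 16 KB at one byte per entry. Two channels at volume 32 or less can be added without leaving the signed 8-bit range, which is the "prescaled" situation the later posts rely on.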
18 January 2016, 11:40 | #106 | |
Registered User
Join Date: Jun 2010
Location: PL?
Posts: 2,878
|
Quote:
Perhaps the blitter can be used in a clever way to perform such operations (on samples without filtering)? |
|
18 January 2016, 11:55 | #107 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
|
|
18 January 2016, 12:53 | #108 | |
Registered User
Join Date: Jun 2010
Location: PL?
Posts: 2,878
|
Quote:
Only if the replay hardware is equipped with a decent SRC (sample rate converter), and sadly that is not the case on the Amiga. Luckily this can be handled at the sample creation stage, at the cost of some quality: a low-pass filter applied to the sample beforehand lets you mix with sample rate conversion without further low-pass filtering. I assume that, to simplify further, only even resampling factors would be applicable... |
|
18 January 2016, 15:12 | #109 | |
Registered User
Join Date: Oct 2009
Location: Germany
Posts: 3,307
|
Offtopic but...
I would guess the best solution for music + sfx in games is http://aminet.net/mus/play/ptplayer.lha by phx, or something similar. It exists (incl. source code), can be used, and you don't need to sacrifice quality or the artist's freedom. Excerpt from ptplayer.readme: Quote:
|
|
19 January 2016, 14:50 | #110 | |||||||||
Registered User
Join Date: May 2014
Location: inside the emulator
Posts: 377
|
Quote:
Personally I think even a relatively modern processor with SIMD instructions should use optimized code. That includes sound mixing. Quote:
I gave cache misses as an example of the advantage; it can help even on a non-cached processor. Think of it as allowing a routine to process several samples while the volume and pitch are constant. Quote:
Quote:
[Not that it is a problem on x86, since the 80286 the processor detects writes into the prefetched queue and flushes it.] Quote:
Quote:
Quote:
Quote:
Quote:
But (again from experience) if one wants to mix 32+ 16-bit channels with sample interpolation (at least LERP), 256 volume levels and perhaps even a dithered 8-bit playback option, then one has to optimize even on a relatively fast machine with caches. Saying that one doesn't _need_ to optimize for a relatively modern machine may be correct if one likes the sloppy shit that counts as software nowadays. I don't. |
|||||||||
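For readers unfamiliar with the "LERP" mentioned in the post above, here is a rough Python sketch of linear interpolation while resampling. The 16.16 fixed-point position format and the names `lerp_sample` / `resample_lerp` are assumptions for illustration only; the thread does not show any particular implementation.

```python
FRAC_BITS = 16
FRAC_ONE = 1 << FRAC_BITS


def lerp_sample(samples, pos):
    """Read `samples` at 16.16 fixed-point position `pos` with linear interpolation."""
    i = pos >> FRAC_BITS                 # integer part: index of the left neighbour
    frac = pos & (FRAC_ONE - 1)          # fractional part: weight of the right neighbour
    a = samples[i]
    b = samples[i + 1] if i + 1 < len(samples) else a   # hold the last sample at the end
    return a + (((b - a) * frac) >> FRAC_BITS)


def resample_lerp(samples, step, count):
    """Produce `count` output samples, advancing the position by `step` (16.16) each time."""
    out = []
    pos = 0
    for _ in range(count):
        out.append(lerp_sample(samples, pos))
        pos += step
    return out
```

Compared with simply picking the nearest sample, this costs an extra load, a subtract and a multiply per output sample, which is where the need to optimize comes from.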
20 January 2016, 10:43 | #111 | ||||||||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
|
Quote:
Quote:
Quote:
Quote:
Anyway we're supposed to be on Amiga here, so whether x86 supports SMC or not is irrelevant. Quote:
There IS overhead for writing that code. There IS overhead in branching to it. So unless the gain is big - which I don't think it's gonna be - that method isn't worth it. But feel free to prove me wrong by posting a mixing routine here. Then why not post a few here? Quote:
The mere fact you have to fit the end result to 8bit counts a lot more. Quote:
The 6502 is far too slow to even consider mixing audio in real time. Barely playing one channel at low frequency already takes most of its time, ruling out running a game in parallel. The 68k is a lot more powerful. A simple 68000 can mix 4ch at 25 kHz. A 68020 can do 16. A 68030 can do 32ch with interpolation. The 8086 is intermediate between 6502 and 68000 - but again, only 68k is relevant here. Quote:
If you have many possible simultaneous SFX then you don't need that many channels; a simple priority system is enough. Quote:
Quote:
And high quality audio mixing is no match for current CPUs. |
||||||||||
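The "simple priority system" mentioned in the post above could look something like the following Python sketch for a single SFX channel. The class name, the frame-based duration and the rule that equal priority may interrupt are all assumptions made for illustration; the thread does not describe a concrete scheme.

```python
class SfxChannel:
    """One hardware channel shared by all sound effects, arbitrated by priority."""

    def __init__(self):
        self.current = None    # name of the effect being played, if any
        self.priority = -1     # priority of that effect
        self.remaining = 0     # frames left before the channel is free again

    def request(self, name, priority, frames):
        """Start an effect unless a higher-priority one is still playing."""
        if self.remaining > 0 and priority < self.priority:
            return False       # rejected: something more important is playing
        self.current, self.priority, self.remaining = name, priority, frames
        return True

    def tick(self):
        """Call once per frame to age the current effect."""
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0:
                self.current = None
                self.priority = -1
```

A footstep requested while an explosion is playing is simply dropped, so no mixing is needed to cope with many possible simultaneous effects.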
26 January 2016, 13:26 | #112 |
Registered User
Join Date: Jan 2016
Location: Knivsta / Sweden
Posts: 20
|
Newcomer to the forum here!
When using the Bresenham algorithm for drawing lines, one could have almost any ratio between the width and height of the line, so the code must be general enough to handle an arbitrary ratio. But when mixing samples, there ought to be a more manageable number of possible ratios between the two sample rates (assuming two virtual channels per hardware channel), meaning that it should be possible to write a hand-tuned version for each ratio, which could then omit the compares and branches required in a general Bresenham. (Maybe that is what Megol was referring to with "Reduce the number of sample frequencies + use generated code for changing pitch."?) |
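To make the Bresenham comparison in the post above concrete, here is a small Python sketch of Bresenham-style stepping through a sample. It assumes the rate ratio is given as two integers `num`/`den`; the function name is hypothetical and this is only one way to write it.

```python
def bresenham_indices(num, den, count):
    """Return the source index used for each of `count` output samples
    when the source advances num/den samples per output sample."""
    idx, err, out = 0, 0, []
    for _ in range(count):
        out.append(idx)
        err += num
        while err >= den:      # the compare and branch a per-ratio routine would unroll away
            err -= den
            idx += 1
    return out
```

For a ratio of 4/5 the pattern repeats every 5 outputs, which is what makes a fixed, unrolled routine per ratio conceivable.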
26 January 2016, 14:22 | #113 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
|
Then welcome
Quote:
So frequency shift is done with fixed-point. Something like this:
Code:
add.w d0,d1 ; fractional part: d1 += d0, overflow sets X
addx.w d2,d3 ; integer part: d3 += d2 + X
No branch. Nothing that can be optimised for a specific ratio. |
|
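The add/addx pair in the post above can be modelled in a few lines of Python. This is only a sketch under assumptions: a 16-bit fraction and a 16-bit integer part held separately, with the carry out of the fraction add standing in for the 68k X flag. `step_fixed` and `fixed_indices` are hypothetical names.

```python
def step_fixed(frac, index, step_frac, step_int):
    """Model one add.w / addx.w pair on a split 16.16 position."""
    total = frac + step_frac
    carry = total >> 16                          # 0 or 1, plays the role of the X flag
    frac = total & 0xFFFF                        # add.w keeps only the low 16 bits
    index = (index + step_int + carry) & 0xFFFF  # addx.w adds the integer step plus X
    return frac, index


def fixed_indices(ratio, count):
    """Source index per output sample for a given pitch ratio."""
    step = int(ratio * 65536)                    # 16.16 step, truncated toward zero
    step_int, step_frac = step >> 16, step & 0xFFFF
    frac = index = 0
    out = []
    for _ in range(count):
        out.append(index)
        frac, index = step_fixed(frac, index, step_frac, step_int)
    return out
```

The same two instructions handle a ratio below 1 (integer step 0) and a ratio above 1 (integer step 1 or more) without any branch.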
26 January 2016, 19:39 | #114 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
this is probably better than the
Code:
swap D1 ; exchange the two 16-bit halves of D1
add.l D0,D1 ; 32-bit add of the step in D0
swap D1 ; exchange the halves back |
27 January 2016, 09:01 | #115 |
Registered User
Join Date: Jan 2016
Location: Knivsta / Sweden
Posts: 20
|
Ok, with addx there is no need for branches, nice.
I think there is room to optimise for a specific ratio. Let's first look at the inner loop in the general case that can handle any ratio:
a0 points to prescaled input samples to be played back at the rate of the hardware output
a1 points to prescaled input samples to be played back at 0.8 times the hardware rate
a2 output of mixed samples
d0 fractional rate, preloaded with 0.8 * 65536
d1 integer index of a1
d3 current fractional position of a1
d4 preloaded with zero (integer of 0.8)
Code:
add.w d0, d3 ; advance fractional position
addx.w d4, d1 ; move forward 0 or 1 step depending on carry
move.b (a1,d1.w), d2 ; load sample
add.b (a0)+, d2 ; mix samples
move.b d2, (a2)+ ; write output
The hand-tuned version I was referring to would look like this (for handling 5 output bytes at ratio 0.8):
Code:
REPEAT 4
move.b (a0)+, d2 ; load sample
add.b (a1)+, d2 ; mix samples
move.b d2, (a2)+ ; write output
ENDREPEAT
move.b (a0)+, d2 ; load sample
add.b (a1), d2 ; mix samples without advancing position
move.b d2, (a2)+ ; write output
Last edited by drhex; 27 January 2016 at 10:26. |
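To compare the two loops in the post above, here is a Python model of both. It is a sketch, not a cycle-accurate emulation: `to_s8` mimics the wraparound of `add.b` (which is why the inputs need to be prescaled), and the function names are hypothetical.

```python
def to_s8(v):
    """Wrap to signed 8 bits, as add.b does (no saturation)."""
    v &= 0xFF
    return v - 256 if v >= 128 else v


def mix_general(a, b, count, step_frac=int(0.8 * 65536), step_int=0):
    """General loop: fixed-point stepping through b, one a sample per output."""
    out, frac, idx = [], 0, 0
    for n in range(count):
        total = frac + step_frac           # add.w d0,d3
        frac = total & 0xFFFF
        idx += step_int + (total >> 16)    # addx.w d4,d1
        out.append(to_s8(a[n] + b[idx]))   # load, mix, write
    return out, idx


def mix_unrolled_4_of_5(a, b, blocks):
    """Hand-tuned loop for ratio 0.8: b advances on 4 of every 5 outputs."""
    out, ia, ib = [], 0, 0
    for _ in range(blocks):
        for _ in range(4):
            out.append(to_s8(a[ia] + b[ib]))
            ia += 1
            ib += 1
        out.append(to_s8(a[ia] + b[ib]))   # fifth output: b is not advanced
        ia += 1
    return out, ib
```

Both versions step through `b` at about 0.8 of the output rate and differ only in which output repeats a sample. One detail the model exposes: 0.8 * 65536 is not an integer, so the truncated step 52428 makes the general loop run very slightly slow, while the unrolled loop holds exactly 4 advances per 5 outputs.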
27 January 2016, 09:32 | #116 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
|
D4 being the integer part of the fixpoint, it can be 0 or it can be 1 and sometimes more.
It can happen that the sample is of a higher (or equal) frequency than the replay freq (a likely story if we want to grab every CPU cycle we can!). This doesn't work. The two channels you're mixing here MUST have the exact same pitch - making frequency change useless as they could just be played at the original freq. And of course you won't play music this way. |
27 January 2016, 10:21 | #117 | |
Registered User
Join Date: Jan 2016
Location: Knivsta / Sweden
Posts: 20
|
Quote:
The code in my previous post would advance 4 bytes into one sample for every 5 bytes into the other, thus maintaining a ratio of 0.8. Would you please explain in more detail why you think my solution would not work (and is it the optimized one that doesn't work or my attempt at a general solution as well?) |
|
27 January 2016, 11:26 | #118 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
|
Quote:
The code may work as expected, but it doesn't do anything useful. |
|
27 January 2016, 12:06 | #119 |
Registered User
Join Date: Jan 2016
Location: Knivsta / Sweden
Posts: 20
|
I am assuming here that it would require too much memory to have a sampling of every instrument for every key on the keyboard, and that one therefore has only one sampling per octave with the same sampling frequencies used for all instruments.
In order to play back the other notes, the playback frequency will have to be adjusted accordingly. Now, if one wants a single hardware voice to play back two simultaneous notes, those two notes may not have the same frequency. Their frequencies may perhaps relate to each other as 1 to 0.8. Then one could use the code above to have them played back at this ratio with respect to each other. One hand-tuned version for each ratio that could occur for two simultaneous notes would have to be written, of course. That would make it possible to play two virtual voices per hardware voice, with one of them having its playback frequency controlled by hardware and the other having its frequency determined by which hand-tuned code is activated. Thus: a doubling of the number of available voices where each virtual voice can have its own frequency, and no need for "add + addx" in the inner loop. Sounds useful to me.
Last edited by drhex; 27 January 2016 at 12:41. |
27 January 2016, 12:41 | #120 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
|
Quote:
But it doesn't change a thing. In fact it's even worse... You won't play music this way: there are too many different notes to be played. And the ratios are rarely, if ever, something as simple as 0.8. Even if you have only 10 possible freqs, think about the combinations - that's not just 10 routines to write. 10 possible freqs for two channels mean 100 combinations. You can swap the min and max to reduce that to about half. Some are multiples of each other and this removes a few combinations as well. But that's still many, many routines to write, and that for a very limited set of possible frequencies. You're also taking the risk of having sound distortion. If you change the replay freq, you have to do it exactly at the same time you start the new buffer. So you're dependent on CPU speed and DMA speed. There are many other possible problems, such as when your output buffer size isn't an integer multiple of what your inner loop produces... |
|
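The routine count argued about in the post above can be checked with a few lines of Python. This is only a counting sketch: it assumes integer frequencies so that ratios can be reduced exactly, and `count_routines` / `tail_samples` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product


def count_routines(freqs):
    """Return (ordered pairs, pairs after swapping min and max, distinct reduced ratios)."""
    n = len(freqs)
    ordered = n * n                        # every (channel A, channel B) combination
    unordered = n * (n + 1) // 2           # min/max swapped, equal pairs included
    ratios = {Fraction(min(a, b), max(a, b)) for a, b in product(freqs, repeat=2)}
    return ordered, unordered, len(ratios)


def tail_samples(buffer_len, block_len):
    """Whole unrolled blocks that fit in a buffer, and the leftover samples."""
    return divmod(buffer_len, block_len)
```

For 10 frequencies the first two numbers are 100 and 55; how far the third drops below 55 depends entirely on how many of the chosen frequencies share a reduced ratio. `tail_samples(256, 5)` gives 51 blocks plus 1 leftover sample, which is the buffer-size problem raised at the end of the post.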