#Better Audio
Oh yeah that's not too bad -- audio nodes are constructed outside of the audio thread, then sent over.
So you can do arbitrary allocation during construction. Since the channels are typically provided in the Configuration structs, it wouldn't really be possible to accidentally allocate later.
Oh, alright! So we also constrain channel count and max order to the construction right? Changing them after construction shouldn't be necessary?
Yeah no need to change afterwards
I've talked to someone in an audio DSP server, and they taught me that there are two kinds of bandstop filters typically used when oversampling. Polyphase IIR filters (like what the valib library is using) have better performance, but they have the drawback of phasing issues if you want to implement a wet/dry mix parameter. The second are linear phase filters, which are more expensive but don't have the same phasing issues.
I see. I guess implementing both would be best in that case
I have done the linear phase filters previously
lol my code in the wild getting referenced
Polyphase oversampling is kinda neat, because it downsamples in stages, it can really easily do 32x by upsampling 5 times (each stage oversamples by 2). The filter itself I found to have a very sharp transition, but you are limited by powers of two in terms of the amount
Yeah lol. I was like, let me look at my list of Rust DSP libraries https://github.com/BillyDM/awesome-audio-dsp/blob/main/sections/CODE_LIBRARIES.md. Oh, this one specifically has an oversampling implementation. Oh hey look, it's by SolarLiner.
I should also mention that this is a Rust port of the polyphase filter found on MusicDSP
That being said, I don't expect many non-linear effects to be used in typical game audio processing
You could oversample a compressor or limiter, but they don't need crazy oversampling, so something as simple as a 17- or 31-tap sinc filter might do the trick
Yeah, that is a good point. It would probably make more sense to just have 3rd party node authors do their own oversampling.
or even cubic interpolation
Are there any practical use cases for oversampling other than heavy distortion (and maybe intersample peak limiting)? I find aliasing to be so minimal, especially in compressors
What exactly is a "sinc" filter again? Is it a type of FIR filter?
I mean, I know many effects implement them but I found them to not be necessary most of the time
"sinc filter" because the sinc function is what you get from doing the Fourier Transform on a "gate" function (1 within an interval centered on 0, and 0 everywhere else)
Yeah, sinc are ideal filters for lowpassing but they are infinitely long (and extend to the negative domain) so they need to be windowed + shifted
so you sample a sinc function with the right parameter to make it correspond to the brickwall low-pass filter (represented by the gate function in the frequency domain) and that gives you the FIR filter you need to apply
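For reference, a minimal sketch of that recipe: sample the sinc, shift it so the kernel is causal, then taper it with a window. The Hann window and the function names are my own choices for illustration, not taken from any library mentioned here.

```rust
use std::f32::consts::PI;

/// Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1.
fn sinc(x: f32) -> f32 {
    if x.abs() < 1e-6 {
        1.0
    } else {
        (PI * x).sin() / (PI * x)
    }
}

/// Builds an odd-length low-pass FIR kernel from a shifted, Hann-windowed sinc.
/// `cutoff` is the cutoff frequency as a fraction of the sample rate (0.0..0.5).
fn windowed_sinc_lowpass(num_taps: usize, cutoff: f32) -> Vec<f32> {
    assert!(num_taps >= 3 && num_taps % 2 == 1, "use an odd tap count of at least 3");
    let center = (num_taps - 1) as f32 / 2.0;
    let mut kernel: Vec<f32> = (0..num_taps)
        .map(|n| {
            // Shift so the sinc peak sits on the center tap (makes the filter causal).
            let t = n as f32 - center;
            // Ideal brickwall response sampled at the tap positions.
            let ideal = 2.0 * cutoff * sinc(2.0 * cutoff * t);
            // Hann window: tapers the truncated sinc towards zero at both ends.
            let window = 0.5 - 0.5 * (2.0 * PI * n as f32 / (num_taps - 1) as f32).cos();
            ideal * window
        })
        .collect();
    // Normalize so that the gain at DC is exactly 1.
    let sum: f32 = kernel.iter().sum();
    for tap in &mut kernel {
        *tap /= sum;
    }
    kernel
}
```

Calling `windowed_sinc_lowpass(17, 0.25)` gives a symmetric (linear phase) 17-tap kernel; a different window trades transition width against stopband rejection.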
Also, the person on that other server gave some advice on how this is usually implemented without the need for zero-stuffing:
Let's say you're upsampling by x4. With polyphase IIR, you have four different IIRs with four different states
And for each input sample, you calculate one output from each of them
And that's your four output samples.
With FIRs, it's the same thing: you'll have four different FIRs with four different kernels. Since FIRs aren't stateful, they can share the same input buffer and so on, but effectively it's four different filters.
When you work it out, your four filters end up being "delay by 1/4 sample", "delay by 2/4 sample", etc.
So depending how you design your filters, you might be able to skip one of them, since it'll be "delay by 0 samples"
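A rough sketch of the FIR case, to make the "four different kernels" idea concrete. Every input sample produces one output per phase, and all phases read the same input history. The function names are hypothetical, and the zero-stuffing version is only there as a reference to compare against.

```rust
/// Splits an interpolation kernel into `factor` sub-filters (phases).
/// Phase p holds taps p, p + factor, p + 2 * factor, ...
fn polyphase_split(kernel: &[f32], factor: usize) -> Vec<Vec<f32>> {
    (0..factor)
        .map(|p| kernel.iter().skip(p).step_by(factor).copied().collect())
        .collect()
}

/// Upsamples `input` by `factor` without zero-stuffing.
fn upsample_polyphase(input: &[f32], kernel: &[f32], factor: usize) -> Vec<f32> {
    let phases = polyphase_split(kernel, factor);
    let mut out = Vec::with_capacity(input.len() * factor);
    for n in 0..input.len() {
        for phase in &phases {
            // Same input history for every phase, only the taps differ.
            let mut acc = 0.0;
            for (k, &tap) in phase.iter().enumerate() {
                if n >= k {
                    acc += tap * input[n - k];
                }
            }
            out.push(acc);
        }
    }
    out
}

/// Reference implementation: insert zeros between samples, then run the full kernel.
fn upsample_zero_stuffed(input: &[f32], kernel: &[f32], factor: usize) -> Vec<f32> {
    let mut stuffed = vec![0.0; input.len() * factor];
    for (n, &x) in input.iter().enumerate() {
        stuffed[n * factor] = x;
    }
    (0..stuffed.len())
        .map(|m| {
            kernel
                .iter()
                .enumerate()
                .filter(|(j, _)| m >= *j)
                .map(|(j, &tap)| tap * stuffed[m - j])
                .sum::<f32>()
        })
        .collect()
}
```

Both functions produce the same samples, but the polyphase version never multiplies by the inserted zeros. In practice you would usually also scale the output by `factor` to make up for the level lost to the zeros.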
The problem being (as @viral grove ninja'd me about) that sinc oscillates indefinitely, so you can't actually have a perfect sinc filter, you need a trick to get a realizable FIR filter, ie. truncating, windowing (truncating is technically a windowing method), all with their own tradeoffs, etc.
One scenario I thought of is if you wanted to simulate sound coming out of a walkie talkie. (Although that's supposed to sound bad, so you might not even care about anti aliasing).
I never did that, is that more than a bandpass filter? Some distortion?
It's mainly a bandpass filter, but you can also add some subtle distortion.
Ah interesting
If you want to get fancy, you can also add a comb filter to get that "talking into a cup" sound.
If the distortion is just some wave shaping then aliasing can be mitigated somewhat without explicit oversampling. There's a paper from some NI guy iirc and it works really well. Only introduces 0.5 to 1 samples of latency as well
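I'm assuming the technique meant here is antiderivative anti-aliasing (ADAA); that is my guess, the paper isn't named. A minimal first-order sketch for a tanh waveshaper, with hypothetical names: instead of evaluating tanh at each sample, it averages tanh over the segment between the previous and current input by differencing the antiderivative.

```rust
/// Antiderivative of tanh(x), i.e. ln(cosh(x)), written to avoid overflow for large |x|.
fn ln_cosh(x: f64) -> f64 {
    let a = x.abs();
    a + (-2.0 * a).exp().ln_1p() - std::f64::consts::LN_2
}

/// First-order ADAA tanh waveshaper (adds roughly half a sample of latency).
struct AdaaTanh {
    prev_x: f64,
    prev_f: f64,
}

impl AdaaTanh {
    fn new() -> Self {
        Self { prev_x: 0.0, prev_f: ln_cosh(0.0) }
    }

    fn process(&mut self, x: f64) -> f64 {
        let f = ln_cosh(x);
        let dx = x - self.prev_x;
        let y = if dx.abs() < 1e-6 {
            // The difference quotient is ill-conditioned when the input barely
            // moves, so fall back to the plain nonlinearity at the midpoint.
            (0.5 * (x + self.prev_x)).tanh()
        } else {
            // Mean value of tanh between the previous and the current input.
            (f - self.prev_f) / dx
        };
        self.prev_x = x;
        self.prev_f = f;
        y
    }
}
```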
That makes sense. Speaking of, comb filters are missing and probably desirable
Comb filters are pretty easy. You literally just mix a signal with a slightly delayed version of itself.
And you use an interpolator if you want to delay by sub-sample amounts.
I'm guessing the quality of the interpolator dictates the quality of the comb filter in case of fractional samples
Correct. And there are of course more advanced ways to do it (like using an allpass to delay different frequencies by different amounts).
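A small sketch of the simple version: a feedforward comb that mixes the input with a delayed copy, using linear interpolation for the fractional part of the delay. The struct and its fields are made up for illustration; a better interpolator would slot in where the two neighbouring samples are blended.

```rust
/// Feedforward comb filter: y[n] = x[n] + mix * x[n - delay], with a fractional delay.
struct CombFilter {
    buffer: Vec<f32>,
    write_pos: usize,
    delay_samples: f32,
    mix: f32,
}

impl CombFilter {
    fn new(max_delay_samples: usize, delay_samples: f32, mix: f32) -> Self {
        assert!(delay_samples >= 0.0 && delay_samples <= max_delay_samples as f32);
        Self {
            // Two extra slots: one for the current sample, one for the
            // interpolation neighbour at the maximum delay.
            buffer: vec![0.0; max_delay_samples + 2],
            write_pos: 0,
            delay_samples,
            mix,
        }
    }

    fn process(&mut self, input: f32) -> f32 {
        let len = self.buffer.len();
        self.buffer[self.write_pos] = input;

        let whole = self.delay_samples.floor() as usize;
        let frac = self.delay_samples - whole as f32;
        // The two stored samples that surround the fractional read position.
        let newer = self.buffer[(self.write_pos + len - whole) % len];
        let older = self.buffer[(self.write_pos + len - whole - 1) % len];
        // Linear interpolation between them.
        let delayed = newer + (older - newer) * frac;

        self.write_pos = (self.write_pos + 1) % len;
        input + self.mix * delayed
    }
}
```

Feeding an impulse through `CombFilter::new(4, 2.5, 1.0)` spreads the delayed copy evenly over samples 2 and 3, which is exactly the smearing that a higher-quality interpolator reduces.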
Interesting
The world of sound design is wild.
Reverb algorithms also usually make use of allpass filters. You use a bunch of allpass filters to "smear" the phase of different frequencies to get a "washy" sound.
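As an illustration, this is the classic Schroeder allpass section that such reverbs tend to chain together. It is a generic textbook structure written with hypothetical names, not code from any of the reverbs or libraries mentioned here.

```rust
/// Schroeder allpass: H(z) = (z^-D - g) / (1 - g * z^-D).
/// The magnitude response is flat; only the phase (timing) of each frequency changes.
struct SchroederAllpass {
    buffer: Vec<f32>,
    pos: usize,
    gain: f32,
}

impl SchroederAllpass {
    fn new(delay_samples: usize, gain: f32) -> Self {
        assert!(delay_samples > 0 && gain.abs() < 1.0);
        Self { buffer: vec![0.0; delay_samples], pos: 0, gain }
    }

    fn process(&mut self, input: f32) -> f32 {
        // Value written `delay_samples` ago.
        let delayed = self.buffer[self.pos];
        // Feedback path: v[n] = x[n] + g * v[n - D].
        let v = input + self.gain * delayed;
        self.buffer[self.pos] = v;
        self.pos = (self.pos + 1) % self.buffer.len();
        // Feedforward path: y[n] = v[n - D] - g * v[n].
        delayed - self.gain * v
    }
}
```

An impulse goes in as one sample and comes out as a decaying train of echoes with the same total energy, which is the "smearing" effect.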
Yeah, I've recently looked at some open source reverb implementations and it's very interesting
Btw are you okay if I add some "dynamic" nodes if people need more flexibility with channel setup and use the static nodes with const generics for the most common setups guaranteeing performance?
It's also why cheap reverb algorithms tend to sound "metallic". It's because a bunch of allpasses literally creates a comb filter.
That sounds good!
Ah, I never knew that. I have a cheap digital mixer from Behringer and I hate the reverb on it
Sometimes you intentionally want to make things sound metallic. This is a great video of how to synthesize a cymbal sound in Serum. https://www.youtube.com/watch?v=2pPfKlXMA7M
Oh wait, he's using FM to get the metallic sound. I must be thinking of a different video.
Oh OK, he does use comb filters later in the video.
Au5 is a beast
The CPU cycle is more expensive than the expensive algorithm.
This is the progress I made on the new timing system. Let me know what you think! https://github.com/BillyDM/Firewheel/pull/48
amazing! I've only given it a quick glance, but excited to check it out tomorrow
edit 3: nice actually
Ah yeah I suppose Instants are perfectly capable on all major platforms for this kind of usage. They should be able to provide timings well within any just-noticeable-difference on the simulation/game side.
Could we also have samplers provide a similarly adjusted clock?
Yeah, that makes sense. I'll work on that tomorrow.
Hmm, I'm trying to decide if the internal sample clock should account for output underflows or not.
(An output underflow occurs when the CPU takes too long to fill an audio buffer, leading to audio frames being skipped. Ideally output underflows should be rare or even nonexistent, but it still probably needs to be dealt with.)
If we don't have the sample clock account for it, then underflows will cause any scheduled events to occur later than expected and also cause any playing samples to finish later than expected.
If we do have the sample clock account for it, then we would also need to add extra logic to samplers and such to skip ahead to the correct time when an underflow occurs. That could be done, but I'm not sure we want to add that extra complexity (and to have other users deal with that complexity when making their own sampler-based nodes).
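To make the two options concrete, here is a toy model of the trade-off. All of the types are hypothetical and do not reflect Firewheel's actual clock; `dropped_frames` stands in for whatever underflow information a backend can report.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum UnderflowPolicy {
    /// The clock only counts frames that were actually processed.
    Ignore,
    /// The clock also jumps over frames lost to an underflow.
    SkipAhead,
}

struct SampleClock {
    frames: u64,
    policy: UnderflowPolicy,
}

impl SampleClock {
    /// Called once per processed block. `dropped_frames` is the number of frames
    /// lost to an underflow just before this block (0 in the normal case).
    /// Returns how many frames sampler-like nodes would have to skip ahead by.
    fn advance(&mut self, block_frames: u64, dropped_frames: u64) -> u64 {
        let skip = match self.policy {
            UnderflowPolicy::Ignore => 0,
            UnderflowPolicy::SkipAhead => dropped_frames,
        };
        self.frames += skip + block_frames;
        skip
    }
}
```

With `Ignore`, the clock falls behind wall-clock time after an underflow, so scheduled events land late. With `SkipAhead`, the clock stays aligned, but every node that tracks a playhead has to honor the returned skip amount.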
Hm yeah that's tricky. I would be tempted to consider underflows such a degenerate state that accounting for them in timings isn't necessary.
Then again, if audio scheduling is really important in your app, you'd potentially get a whole cascade of errors from a single underflow event.
Though ideally you should be syncing your game to the audio clock anyway if timing is important.
Ah, right. I see.
I am kind of stuck with the redesign of the filter nodes. I have it like this at the moment:
    pub struct ConstChannelFilterNode<const NUM_CHANNELS: usize, const MAX_ORDER: usize = DB_OCT_24> {
        pub spec: FilterSpec,
    }

    pub enum FilterSpec {
        Lowpass {
            order: FilterOrder,
            cutoff_hz: f32,
            q: f32,
        },
        // snip
        Bell {
            center_hz: f32,
            q: f32,
            gain_db: f32,
        },
        // snip
    }
I did it like this because
- ...it feels natural to tie them together.
- ...not all parameters make sense for all filter types. E.g. gain_db is nonsensical for Lowpass.
But now I cannot even make a UI node because I need to borrow spec twice, once for changing the type of the filter and once for modifying its parameters which makes total sense. How would one usually solve this? Sorry for the basic question, I am stumped on how to solve this.
This is for the node graph UI example right? Is there anything wrong with borrowing it twice? egui is immediate-mode, so is there any issue with multiple successive mutable borrows? (I'm probably just not understanding the issue, though.)
Actually, it's pretty common to have parameters that do nothing in certain states. I think it makes a lot more sense to unify the parameters, especially since that would make it easier to switch between them.
I'd argue that's not very idiomatic Rust, personally -- I think that's a holdover from many languages that can't easily represent the state fully accurately, at least not without workarounds.
I explained it poorly, I need to borrow the enum variant as well as the fields:
    GuiAudioNode::Filter { id, params } => match &mut params.spec {
        FilterSpec::Lowpass {
            order,
            cutoff_hz,
            q,
        } => {
            ui.vertical(|ui| {
                // radio button to switch filter type
                ui.radio_value(
                    &mut params.spec,
                    FilterSpec::Lowpass {
                        order: 1,
                        cutoff_hz: 440.,
                        q: 1.,
                    },
                    "Lowpass",
                );
                // snip (more radio buttons)
                // cannot mutably borrow order, since params.spec is already mutably borrowed
                ui.add(egui::Slider::new(order, 1..=16).text("Order"));
                // snip (more parameters)
            })
        }
    }
If you think of it like an audio plugin in a DAW, each parameter is assigned one-to-one to a specific knob/button/dropdown in the GUI. Rust's enums aren't really the best structure for that.
Lots of plugins have contextual UIs though -- the number and arrangement of parameter may change depending on other settings. Like this LFO in serum (different set of parameters on the right).
I agree, but for an EQ plugin for example, the labelling or number of knobs may change depending on the filter type, hiding this from the user. In our case the users will be directly interacting with the node itself. (basically what Corvus said above)
It is a little annoying to do by hand though in this case.
Could you make a dropdown widget that just takes a string, then update the variant if it changes?
And then match on it mutably to handle the inner parameters.
I can try. I honestly have no experience with egui, I just tried to make sense of it from the other examples. Couldn't find a dropdown widget when I first looked
I think the type is ComboBox https://docs.rs/egui/latest/egui/containers/struct.ComboBox.html
Thanks! Will try that!
Right, but internally audio plugins always represent each parameter as its own thing. Though I guess that could just be a holdover from non-Rust languages.
It definitely depends on if you need to persist the old values -- for example, switching between modes while still maintaining what the old values were.
In that case, the underlying representation in the UI definitely couldn't be an enum.
I think that probably doesn't apply here though -- if people want to build a front-end for a combined filter node, they can store the old values separate from what the actual variant holds.
With egui 7 million downloads, egui ... why not?
This crate provides an Egui integration for the Bevy game engine. 🇺🇦 Please support the Ukrainian army: https://savelife.in.ua/en/ - vladbat00/bevy_egui
For audio...?
I've ended up being busy with other things this weekend, but I've got a bit more progress done today. The audio clock no longer tries to automatically correct for output underflows. This makes things simpler.
I was busy but tried this approach now. If I understand correctly, I just add some additional field on my node like this:
    pub struct ConstChannelFilterNode<...> {
        pub type_selector: FilterType, // holds just the bare filter type, e.g. FilterType::Lowpass
        pub spec: FilterSpec, // holds both the filter type and the parameters, e.g. FilterSpec::Lowpass { ... }
    }
Then I can have a dropdown for type_selector and customized widgets for each variant of spec.
    GuiAudioNode::Filter { id, params } => {
        // Dropdown menu
        ComboBox::from_label("Filter Type").show_ui(ui, |ui| {
            ui.selectable_value(&mut params.type_selector, FilterType::Lowpass, "Lowpass");
            // snip (other dropdown values)
        });
        // Custom sliders for each filter type
        match &mut params.spec {
            FilterSpec::Lowpass {
                order,
                cutoff_hz,
                q,
            } => {
                // widgets for lowpass type
            }
            // snip (other filter specs, e.g. Highpass)
        }
    }
So when I get an event for type_selector in my audio node, I also need to set spec to the new type such that the UI can update.
    ConstChannelFilterNodePatch::TypeSelector(type_selector) => {
        // save new type selector
        self.params.type_selector = type_selector;
        // change spec so UI understands it should render different widgets
        match type_selector {
            FilterType::Lowpass => {
                self.params.spec = FilterSpec::Lowpass {
                    order: 1,
                    cutoff_hz: 440.,
                    q: 1.,
                };
            }
            // snip (other filter specs, e.g. Highpass)
        }
    }
But the UI does not update, not sure why. Did you intend this or did I misunderstand?
Sorry for the wall of text
Oh, sorry -- I might not have been super clear!
Personally, I'd just have the FilterSpec be the only field in the filter node. That way, using it in a code-driven environment maintains correctness and ease of use.
When you're working with it in an egui context, though, I'd just make a totally ad-hoc dropdown element. When it changes, you can match on the new name and construct the new FilterSpec variant however you like.
Then, beneath that, you can match on the variant itself and manage the inner parameters with egui.
In other words, I wouldn't adjust the structure of the filter enum or node so that you can pass it directly to egui.
Does that make any sense? Let me know if I'm misunderstanding.
As for this
So when I get an event for type_selector in my audio node, I also need to set spec to the new type such that the UI can update.
If I understand what you're suggesting correctly -- anything you do in the audio node with regards to the events or the parameter values is not bi-directional. The UI will receive no information about changes to the FilterSpec that happen in the audio node. Only changes in the UI will be communicated to the audio node.
I think my lack of egui experience is showing. How do I make this dropdown? Up until now I've only ever passed mutable references to the widgets that then get automatically set.
Ah that makes sense, thanks!
That's why I thought I had to add another field to my audio node
Ah, I think I see the problem. There's nothing "sticking around" to keep track of what the current variant is (except the FilterSpec itself).
Here's roughly what I would do -- a touch annoying maybe, but it should work:
    let initial_variant = match params.spec {
        // `{ .. }` ignores the fields; we only need the variant's name here
        FilterSpec::Lowpass { .. } => "Low-pass",
        // ...
    };
    let mut new_variant = initial_variant.to_owned();
    ComboBox::from_label("Filter Type").show_ui(ui, |ui| {
        ui.selectable_value(&mut new_variant, "Low-pass".into(), "Low-pass");
        // snip (other dropdown values)
    });
    if new_variant != initial_variant {
        // here's where you change the variants
        match new_variant.as_str() {
            "Low-pass" => { /* */ },
            // ...
        }
    }
// THEN you can just match on the whole enum and modify the inner values
That is, turn the current variant into a string, create a dropdown that can select between the variants (including the current value), then update the enum if the string has changed.
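The same idea with the egui parts taken out, as a self-contained sketch. The enum here is a cut-down stand-in for the real FilterSpec (`order` is a plain `u32` instead of `FilterOrder`), and the helper names and default values are hypothetical.

```rust
#[derive(Clone, Debug, PartialEq)]
enum FilterSpec {
    Lowpass { order: u32, cutoff_hz: f32, q: f32 },
    Bell { center_hz: f32, q: f32, gain_db: f32 },
}

/// Name shown in the dropdown for the current variant.
fn variant_name(spec: &FilterSpec) -> &'static str {
    match spec {
        FilterSpec::Lowpass { .. } => "Low-pass",
        FilterSpec::Bell { .. } => "Bell",
    }
}

/// Default parameters to use when the user switches to a variant.
fn default_for(name: &str) -> Option<FilterSpec> {
    match name {
        "Low-pass" => Some(FilterSpec::Lowpass { order: 1, cutoff_hz: 440.0, q: 1.0 }),
        "Bell" => Some(FilterSpec::Bell { center_hz: 440.0, q: 1.0, gain_db: 0.0 }),
        _ => None,
    }
}

/// Replaces the variant only when the selected name differs from the current one,
/// so the current parameters survive while the selection stays the same.
/// Returns true if the spec was replaced.
fn apply_selection(spec: &mut FilterSpec, selected: &str) -> bool {
    if selected == variant_name(spec) {
        return false;
    }
    match default_for(selected) {
        Some(new_spec) => {
            *spec = new_spec;
            true
        }
        None => false,
    }
}
```

In the UI code, `selected` would be the string the dropdown wrote into, and the match on the inner parameters happens after `apply_selection` has released its borrow.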
are Diff and Patch bi-directional? aka are updates to a param from a node reflected in the ecs?
No, changes are only communicated from the ECS to the audio node.
To get information back, there are other mechanisms available
Diff, while pretty efficient, is still quite a bit slower than Patch. Diff can also allocate, so as-is it's not a great way to send events from audio nodes to the ECS.
is there a sanctioned pattern atm?
Thanks for writing this up! I'll try it out later, really appreciate all the help
Yes -- the primary mechanism in place is to store a type in the audio context itself that can handle that communication (via [custom_state](https://docs.rs/firewheel/latest/firewheel/node/struct.AudioNodeInfo.html#method.custom_state) during construction).
Types like the SamplerNode will store a shared struct in there that uses atomics to communicate bits of state.
The reasons for this approach are mainly:
- In certain circumstances, we'll want to be able to store state somewhere that is !Send, like plugin types, FFI-related state, etc.
- It allows us to maintain separation between a node's parameters (the type that implements Diff and Patch) and any other bits of state. Parameters that are perfectly stateless, i.e. contain no shared atomics or other synchronization primitives, are very convenient. We can throw them around wherever in the ECS without worrying about their provenance, essentially. They can be constructed on the fly or cloned without concern.
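For anyone unfamiliar with the pattern, a shared struct of atomics looks roughly like this. It is only a sketch of the general shape; the type and method names are hypothetical and are not the ones SamplerNode actually uses.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;

#[derive(Default)]
struct Inner {
    playhead_frames: AtomicU64,
    finished: AtomicBool,
}

/// Cheap-to-clone handle: one clone lives with the audio processor,
/// another can be stored wherever the game code wants to read it.
#[derive(Clone, Default)]
struct SharedPlaybackState {
    inner: Arc<Inner>,
}

impl SharedPlaybackState {
    // Written from the audio thread: no locks and no allocation.
    fn set_playhead(&self, frames: u64) {
        self.inner.playhead_frames.store(frames, Ordering::Relaxed);
    }

    fn mark_finished(&self) {
        self.inner.finished.store(true, Ordering::Release);
    }

    // Read from the game side.
    fn playhead(&self) -> u64 {
        self.inner.playhead_frames.load(Ordering::Relaxed)
    }

    fn is_finished(&self) -> bool {
        self.inner.finished.load(Ordering::Acquire)
    }
}
```

The parameters themselves stay plain data; only this separate handle carries shared state.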
yup okay that makes sense
In a lot of cases you can just clone the state out of the audio context and insert it into the ECS, so we might set up a more convenient way to do that in bevy_seedling. In most cases it could be automatic.
how do i access the firewheel context? it seems like it's in a thread local somewhere?
trying to get up to speed here quickly this is my first time actually digging into these apis (:
i guess for context i should say -- for creative tech stuff we really care much more about input than your average game. i'm happy to help contribute here. it seems like custom state is intended primarily for node local !Send stuff?
kinda want the opposite
or rather, not to have the constraint of accessing state from a non send ecs system
    fn access(mut context: ResMut<AudioContext>) -> Result {
        let state = context.with(|context| {
            context.node_state::<SamplerState>(/* node id from somewhere */)
        }).ok_or("missing state!")?;
        // ...
    }
I honestly expected the removal of !Send resources and other types to happen a bit quicker, so I borrowed the closure-sending idea from an older PR. On Wasm, this just runs in place, but in a multi-threaded environment, the closure is sent to the context's thread.
Not necessarily -- it's just designed to be able to support !Send. You can put whatever you want in there!
And yeah, once you've liberated the state from the audio context like this, it can live freely as a component in a multi-threaded context just fine. That's what we do with the sampler's state.
The dance of constructing a node, then getting its state out of the audio context is a little annoying. But it definitely helps overall correctness, at least for bevy_seedling.
okay i'm tracking, i think this is enough to get started
Awesome! Definitely let me know if you run into any problems or things are unclear.
this all looks so incredible and such a step up from existing bevy-audio and raw cpal
one thing that comes to mind is that it would be nice to have all the cpal metadata as components in the ecs. it looks like right now SeedlingPlugin constructs an AudioContext resource in its plugin configuration but it might be nice to shift that to ecs startup so that users could dynamically query available cpal devices, etc
Yes that's probably the right move. Right now the I/O situation is not very robust.
we have the same issue in rendering with e.g. monitor selection, etc. i'm always thinking about what a settings menu would look like. nbd for now though (:
Yeah we'll also need to think about dynamically updating I/O and potentially re-initializing the stream.
It works now. Thanks so much for your help, Corvus
I have also added a comment on my PR, summarizing my changes
Maybe you have some input on better names for the two nodes (ConstChannelFilterNode and FlexibleChannelFilterNode)
Otherwise I am happy with the API now
Oh I do have an opinion! I always try to maximize ergonomics for the most common use cases. In this case, I think we expect people writing code by hand to use the const generics nodes more often. In that case, I would just name it FilterNode. Then, for the dynamic case, I'd personally shorten it to just DynamicFilterNode.
Maybe merely Dynamic doesn't fully communicate how it's dynamic, but it's probably enough to effectively distinguish it from const filter node.
Awesome, thanks for the input!
I am only a bit unsure about "dynamic" because there are such things as dynamic filters that change their parameters based on the input. I don't think it's typical to call them "dynamic filters" but "dynamic eq" is definitely a thing
That could work
Dynamic filters are more often called "auto-filters" like auto-wah, but yes, Dynamic EQs are definitely a thing (very useful in mixing and mastering)
Interesting! Do you have an opinion on what a good alternative name could be?
I've been thinking and I couldn't find anything though haha
It's even harder because we're talking about the specific fact of being able to provide the channel count at compile-time vs. runtime, but it doesn't mean that for the latter you can necessarily change the channel count once instantiated
Good point
I think having the word "Channel" in it would make it more explicit but also longer. It feels like there doesn't exist a perfect name. I think something like FlexChannelFilterNode would at least emphasize that it is more flexible in its channels in some way. Personally, I think that using Runtime as a prefix may make it sound like FilterNode is just some compile time / const fn thing. Though it is definitely more concise.
FlexChannelFilterNode is probably fine -- people can always create an alias if they don't like it
do we have a criteria for the mvp of upstreaming this work into bevy?
Give demo proving it's better
The bar for "better than current" is pretty low
Oh, 100% lol
I read through all of bevy_seedling's docs the other day and I am interested in seeing it replace bevy_audio
might want to either update the pinned message or write a new pin btw
same
the quality of the code is also very high in terms of documentation
agreed, and there are a reasonable number of examples and tests
I'm envious of your main crate doc
This is awesome!
I guess the obvious questions for @slate scarab are (a) would you want to upstream this into bevy as bevy_audio and (b) what do you think needs to happen before we do that? Can you give me a rough percentage of the bevy_audio features this supports?
yes sorry i should have not assumed at all @slate scarab wanted to do that
I'm currently in the process of reworking the timing system which will introduce some breaking changes, so I'd like to get that done first before merging into bevy.
I'll try to get that all done tomorrow.
I think we can wait that long
(a) My idea behind the bevy_seedling name was that, by the time the library sprouted into a little firewheel flower, it would be upstreamed as bevy_audio!
I suppose my only concern is the degree to which an upstreamed Bevy crate can be opinionated. I have lots of big ideas! I'd love to provide first-party support for sophisticated audio implementations similar to Fmod or Wwise. I'd also like to get a node graph editor going once Bevy's editor is off the ground. There's also a lot of little things we could add here and there.
(b) Essentially 100%. There's nothing major I'm aware of that bevy_audio has which bevy_seedling does not. bevy_audio's spatial playback does support ear positioning, which Firewheel's basic spatial node doesn't, but that's the only thing that stands out to me.
The main design challenge remaining is a robust parameter animation system. I'd like to use bevy_animation if possible, but I haven't done a deep enough dive yet to assess whether it has what we'd need for audio. I discussed a bit of that here #art-audio-animation message.
I don't feel that the crate is totally complete without parameter animation of some kind, but it doesn't necessarily block upstreaming.
I suppose my only concern is the degree to which an upstreamed Bevy crate can be opinionated. I have lots of big ideas! I'd love to provide first-party support for sophisticated audio implementations similar to Fmod or Wwise.
I'm very open to this sort of thing
As long as the easy things are easy, I'm happy to support more complex uses
The main design challenge remaining is a robust parameter animation system.
If the current audio crate can't be animated, and I don't believe it can be, then this isn't something I'd want to block upstreaming on.
It seems like we should probably ping cart for an evaluation, once his workload clears up a bit.
Because it sounds like this is more or less ready now.
I wanted the demo before this, so we can get a direct side-by-side of the actual audio quality
makes sense
The demo scope is small, so I should be able to build something sufficient very soon.
@dusky mirage just fyi, the filter PR would be ready from my side
Ok, I'm finishing up the new timing system, and then I'll get to your PR.
The new timing system is now almost done, I just need to add one more thing. Let me know any feedback you have! https://github.com/BillyDM/Firewheel/pull/48
This seems great! I don't think I'll be able to provide much useful feedback until I start using it, but it seems like it'll be very helpful.
I do have one request -- as you update the audio clocking stuff for sampler nodes, would it be possible to express the playback start time as an absolute time (in ClockSeconds) instead of as a delay? I'm sure that would make it more complicated in cases where the start time is actually "in the past," but it would be super helpful in the context of Bevy.
In general, when you queue a sample for playback, you don't actually know how long it'll take to load it. So the delay given to a node might become inaccurate if it takes too long to load the sample into memory. However, if it were absolute time, that would be no problem!
Well, the EventDelays are actually absolute times, not delta times. In fact, I should probably rename that to something else to make that clearer.
Oh even in this one it looks like https://docs.rs/firewheel-nodes/0.4.2/firewheel_nodes/sampler/enum.PlaybackState.html#variant.Play
What happens currently if the delay is back in time?
The event just happens on the first frame in the processing block. It would actually be fairly simple to allow a sample to "play in the past". I'll work on that in another PR.
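A sketch of that behaviour, assuming events carry an absolute time in seconds on the audio clock. The function is hypothetical and only illustrates the clamping; it is not how Firewheel implements it.

```rust
/// Returns the frame inside the current block at which an event scheduled for
/// the absolute time `event_time_secs` should fire, or `None` if it belongs to
/// a later block. Events that are already in the past fire on the first frame.
fn frame_offset_in_block(
    event_time_secs: f64,
    block_start_secs: f64,
    block_frames: usize,
    sample_rate: f64,
) -> Option<usize> {
    let offset = ((event_time_secs - block_start_secs) * sample_rate).round();
    if offset < 0.0 {
        // Scheduled "back in time": handle it as early as possible.
        Some(0)
    } else if (offset as usize) < block_frames {
        Some(offset as usize)
    } else {
        None
    }
}
```

Letting a sample "play in the past" would mean using the negative offset to start the playhead partway into the sample instead of discarding it.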
Oh nice, yeah that would be super helpful for tight scheduling!
Actually, it turns out using a triple buffer to do the same technique for the sampler node is a bit tricky due to the way node constructors work.
So instead, I've just added a much simpler method to FirewheelCtx that just returns the instant the audio clock was last updated. The user can use this to correct for the delay for the playhead in the sampler node using SamplerState::playhead_frames_corrected() or SamplerState::playhead_seconds_corrected().
It's not as accurate as it would be with the triple buffer technique, but it should be good enough to get a rough idea.
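Roughly, the correction amounts to extrapolating the last known playhead by the time that has passed since the clock was updated. The helper below is a hypothetical stand-in to show the arithmetic, not the actual SamplerState method.

```rust
use std::time::{Duration, Instant};

/// Estimates the current playhead from its value at the last audio clock update.
/// `last_update` is the instant the audio clock was last updated and `now` is
/// the time on the game side.
fn estimate_playhead_frames(
    playhead_frames: u64,
    last_update: Instant,
    now: Instant,
    sample_rate: u32,
) -> u64 {
    // Saturates to zero if `now` is somehow earlier than `last_update`.
    let elapsed: Duration = now.saturating_duration_since(last_update);
    playhead_frames + (elapsed.as_secs_f64() * sample_rate as f64).round() as u64
}
```

The estimate assumes playback kept running at the nominal rate since the update, so it drifts if the stream stalled in between.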
alas
well we can dress that up in a nice API in bevy_seedling at the very least
btw @dusky mirage is there a way to change input / output devices once a stream has started? if not, do you think adding it would be tricky?
Doing that "seamlessly" is incredibly tricky, but you can stop the current stream and start a new one.
Practically the only way to do it seamlessly is to spin up your own audio thread that passes samples back and forth with the OS's audio threads. This will introduce extra latency.
Hm, okay I'll update bevy_seedling to support that. The processor isn't lost, right? It'll just be transferred over and the new_stream method will be called on all the nodes, right?
Correct, the processor is reused across streams.
What do you think we should do for the sample assets if the sample rates change? Right now they're resampled on creation, so I think they'd just.... still be the old sample rate in all the sampler node processors.
Oh looks like you recommend reloading them in the implementation.
Yeah, that's a tricky one to do seamlessly. Ideally you should just avoid changing the audio output device in the middle of a game.
But since bevy_seedling has full control over the process, we could just save the state of all the samplers, then change the stream, then restore that state with the newly loaded assets. Obviously that'll be noticeable, but maybe not too bad since you don't frequently change devices.
And most modern audio devices support both 44100Hz and 48000Hz anyway, so it's very rare that changing the audio output device will change the sample rate.
yeah, maintaining the sample rate would be great
Yeah, things definitely would have been simpler if the industry settled on 48000Hz as a universal standard.
Or 44100Hz as the standard.
I can't remember what the history is behind those two standards. I guess it's something to look up.
Both have a Nyquist frequency well above the human limit of hearing, so there's very little if any quality difference between them.
maybe cd folks wanted to maximize the amount of audio they could stuff on there
I support "ejecting" a processor from a stream in interflow, the approach is to have a oneshot channel that sends a distinct oneshot::Sender to send the processor through. This acts both as the stop signal in interflow and the way to retrieve the processor back without adding a Sync requirement on it
(I only need to do it in CoreAudio because of the way the Rust bindings are set up, other backends I can directly get it back because I own the audio thread so I can decide to stop it and return the processor back through the thread::JoinHandle)
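A sketch of that "eject" handshake, using std's mpsc channels as a stand-in for oneshot channels. This is not interflow's code; the `Processor` type and function names are hypothetical, and a real audio callback would not spin in a loop like this.

```rust
use std::sync::mpsc::{channel, Sender, TryRecvError};
use std::thread;

struct Processor {
    blocks_processed: u32,
}

/// Starts a stand-in "audio thread" that owns the processor. The returned sender
/// is the eject handle: send it a return channel to stop the stream and get the
/// processor back.
fn start_stream(mut processor: Processor) -> Sender<Sender<Processor>> {
    let (eject_tx, eject_rx) = channel::<Sender<Processor>>();
    thread::spawn(move || loop {
        match eject_rx.try_recv() {
            // The eject request doubles as the stop signal.
            Ok(return_tx) => {
                let _ = return_tx.send(processor);
                break;
            }
            // The handle was dropped: nobody can ask for the processor anymore.
            Err(TryRecvError::Disconnected) => break,
            // No request yet: keep "processing audio".
            Err(TryRecvError::Empty) => {
                processor.blocks_processed = processor.blocks_processed.wrapping_add(1);
                thread::yield_now();
            }
        }
    });
    eject_tx
}

/// Stops the stream and returns the processor, or `None` if it was already ejected.
fn eject(stream: &Sender<Sender<Processor>>) -> Option<Processor> {
    let (return_tx, return_rx) = channel();
    stream.send(return_tx).ok()?;
    return_rx.recv().ok()
}
```

The processor only ever moves between threads, so it needs to be `Send` but never `Sync`.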
IIRC, 44.1 kHz comes from the CD standard because Sony already had some digitization experience trying to encode audio in video cassettes, and the combination of data rates meant that 44100 was an easy to meet target, probably the result of multiplying the vertical resolution in pixels with the data rate or the frame rate or something related to those
Also didn't take 40 kHz straight because they took into account the transition band of the lowpass filter placed at 20 kHz
48 kHz for DVD was because all that and also the fact it's divisible by 24, which makes synchronizing with video frames easy
therefore video now has a standard of 48 kHz, but audio is on 44.1 kHz
Hm, @limpid river you added web_time to bevy_platform right? Or were at least involved I think?
I'm running into a bit of a problem with it:
The implementation of Instant::now() relies on the availability of the Performance object, a lack thereof will cause a panic. This can happen if called from a worklet.
With the upcoming changes to Firewheel's clocking, this is actually a huge problem! The Web Audio API backend I made for it, which runs directly inside an audio worklet using multi-threading, is now panicking whenever Instant::now is called.
I'm not sure what the best way forward is. I might be able to trick web_time by creating a JS object that looks like the global context -- not totally sure. To be honest, I'm not even sure it's possible to get any kind of system time with decent accuracy that aligns with the main browser thread. I'm fairly sure the currentTime value in the audio worklet context starts from zero when the context is created, so that can't be used to compare any kind of Duration against the main thread's Performance.now.
Let me know if you have any thoughts!
If this is not reconcilable, we might have to adjust Firewheel to use a special AudioInstant or something. In general, it could use bevy_platform::time::Instant, and in multi-threaded web contexts, we'd have to do something a little more bespoke.
Alternatively, @dusky mirage do you think the time fetching mechanism could be a property of the backend? If so, that would give my web audio API backend full control, which seems like a much cleaner separation. The cpal backend could simply continue using bevy_platform::time::Instant.
So I'm not super familiar with the web_time side of things, as that just carried over from bevy_utils. I do think there's room for maybe making that an explicit feature, since it does appear to have issues. And regardless, we have a generic polyfill for Instant that'd work on web even without web_time; it's just up to the user or a library to provide some implementation with whatever is available
Actually yeah, that makes a lot of sense. That should be easy to do.
To expand on that -- I think, fundamentally, code running in a web audio worklet cannot acquire any meaningful Instant with respect to the main thread. The only timing information it has is relative to the audio context.
That means we can't compare anything supplied by the firewheel backend to a std::time::Instant or bevy_platform::time::Instant -- they have a different basis. However, we can always acquire the audio timing even on the main browser thread by referring to the audio context.
In practice, most backends will be able to provide a meaningful Instant in all contexts, so the status quo won't change for existing backends. We'd just have to make sure we always get the audio time, even on the main thread, from the context.
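One way the "time source is a property of the backend" idea could look, as a rough sketch. BackendClock, SystemClock, ContextClock and ContextSeconds are all hypothetical names, not Firewheel's API; the point is only that each backend's timestamps have their own type, so they can't be mixed with another clock by accident.

```rust
use std::time::{Duration, Instant};

/// Hypothetical trait: each backend chooses its own notion of "now".
trait BackendClock {
    /// Timestamp type, only comparable with timestamps from the same backend.
    type Instant: Copy;
    fn now(&self) -> Self::Instant;
    fn elapsed_between(&self, earlier: Self::Instant, later: Self::Instant) -> Duration;
}

/// Native-style backend: the system monotonic clock works on every thread.
struct SystemClock;

impl BackendClock for SystemClock {
    type Instant = Instant;
    fn now(&self) -> Instant {
        Instant::now()
    }
    fn elapsed_between(&self, earlier: Instant, later: Instant) -> Duration {
        later.saturating_duration_since(earlier)
    }
}

/// Worklet-style backend: time is a position on the audio context's own
/// timeline, which starts at zero when the context is created.
struct ContextClock {
    frames_rendered: u64,
    sample_rate: u32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct ContextSeconds(f64);

impl BackendClock for ContextClock {
    type Instant = ContextSeconds;
    fn now(&self) -> ContextSeconds {
        ContextSeconds(self.frames_rendered as f64 / self.sample_rate as f64)
    }
    fn elapsed_between(&self, earlier: ContextSeconds, later: ContextSeconds) -> Duration {
        Duration::from_secs_f64((later.0 - earlier.0).max(0.0))
    }
}

fn main() {
    let clock = ContextClock { frames_rendered: 48_000, sample_rate: 48_000 };
    println!("{:?}", clock.elapsed_between(ContextSeconds(0.25), clock.now()));
    let system = SystemClock;
    let start = system.now();
    println!("{:?}", system.elapsed_between(start, system.now()));
}
```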
Ok, I have an idea how to make this work. I'll work on it and you can review it.
@slate scarab Ok, here's what I came up with. https://github.com/BillyDM/Firewheel/pull/50
Awesome! I should be able to give some proper feedback in a bit.
Hey @celest whale , I was hoping I could get a quick bit of guidance on the audio demo! As a little refresher, here's what the Rust audio group has listed as requirements:
Currently two crates are participating in this demo: `Rodio` and `Firewheel`
though everyone is very welcome. The demo will be limited to functionality that
is properly implemented by all crates.
# Demo
- audio environment playing back many sounds at the same time.
- sounds and effects are influenced by user input to make it easy to judge latency.
- the demo uses audio libraries in the same way as a game.
- pre-defined events instead of user input to keep the demo repeatable.
- no graphics to keep the demo code simple.
- sounds are all at the same sample-rate.
- implementations pre-resample before loading.
This seems to imply that they're looking for a more general/low-level demonstration. If that's the case, then such a demo might conflict with what Bevy maintainers want to see for upstreaming, since bevy_seedling's API is critical in that decision making.
Also, does it really make sense to limit bevy_seedling/Firewheel to what rodio/bevy_audio can do? A huge reason the former exist is precisely because of what the latter can't do.
In short: what is the core thing we should be demonstrating for Bevy specifically?
- That bevy_seedling is overall better than bevy_audio (pretty easy (sorry bevy_audio))?
- That bevy_seedling/Firewheel are better than bevy_kira_audio/kira?
  - This one's more debatable. I argue the former satisfy cart's concerns well in this comment, but in principle the latter could be brought more in line as well.
- That Firewheel is a better foundation for all interactive Rust audio than the competition, and therefore the right choice for Bevy?
  - This is the most contentious, and potentially not really answerable. The kira folks have put a lot of good work into their engine, and there are technical trade-offs that each make. I don't think this will resolve in the short term.
No pressure to answer at the moment! Feel free to come back to it later. I realize now after typing all that that it's kind of a lot.
If the questions at the end really are important to tackle, then it may be worth spinning this out into an issue or a discussion.
Some things that come to mind:
- We could have better audio now if we upstream bevy_seedling, but is that the right long-term plan?
  - I'm biased of course, but Firewheel is an up-and-comer, so going with it is in some ways a bet
- But then.... do we try hard to coordinate with the Rust Audio group? While I think the effort is admirable, it's a very small group of people with no central authority. Are we likely to achieve consensus within reasonable time-frames?
I think a lot of the Rust Audio discussion has stalled, probably in part due to waiting for someone to make that demo. I could totally make that low-level, Firewheel-vs-rodio demo, but... I hope it makes sense that I'm really just more interested in bevy_seedling and Bevy!
I want to see the low level quality stuff first, since that gives us confidence on "it's worth changing". Once we have a decision on that, we'll do a review pass on bevy_seedling and decide where on the "immediately copy paste" to "write our own integration from scratch" spectrum we lie
Hmm, it turns out the way FilterSpec is laid out and the use of a Filter trait makes it really difficult to add parameter smoothing to the filter nodes.
I'm not sure there is an easy way to fix this. I'm not sure if having a "generic" filter system even makes sense at the firewheel-core level.
I either have to nest a loop over the channels inside of a loop over the samples which is very inefficient, or I have to do a very complex system of storing the coefficients in a temporary smoothing buffer which would be hell with a generic Filter trait.
Or even worse, re-computing the filter coefficients for every channel.
And plus I can't think of any way to process multiple channels in parallel to take advantage of auto-vectorization.
Would it be a lot of code duplication to maybe just... split up each filter into a completely separate node?
i.e. instead of one big Spec, you'd have parameters for each individual filter
That would also make the Diff and Patch traits more efficient for each individual node.
If you wanted to easily swap between filter types, maybe that's better left to higher-level tools.
I don't think that would help that much, considering the problem is the Filter trait.
I'm not sure what to do, we could get rid of the filter trait, but then the user will have to keep track of filters manually.
Actually, I think the problem might be more the MultiChannelFilter struct.
Hm, to be honest it just seems like the whole stack would be a bit simpler (if slightly more redundant) if each filter were a different struct rather than a combined enum. I probably should have raised that thought before
But I'm not in the details enough to perceive where the difficulty is coming in with smoothing, so this may be unrelated like you mentioned.
Essentially, the pseudocode for an optimized smoothed filter would work something like this:
if self.smoothed_filter_params.is_smoothing() {
    for i in 0..frames {
        // Every 32 frames (or 16 or whatever), update the filter coefficients.
        // We avoid doing this every frame because updating filter coefficients is very
        // expensive.
        if i & 31 == 0 {
            let next_params = self.smoothed_filter_params.next_smoothed();
            self.filter.update_params(next_params);
        }
        // If `NUM_CHANNELS` is a constant, then this loop will unroll and auto-vectorize
        // leading to a huge performance improvement. If it is not constant (there are a
        // variable number of channels), then this will not unroll and auto-vectorize.
        for ch_i in 0..NUM_CHANNELS {
            buffer[ch_i][i] = self.filter.process_channel(ch_i, buffer[ch_i][i]);
        }
    }
} else {
    for i in 0..frames {
        // If `NUM_CHANNELS` is a constant, then this loop will unroll and auto-vectorize
        // leading to a huge performance improvement. If it is not constant (there are a
        // variable number of channels), then this will not unroll and auto-vectorize.
        for ch_i in 0..NUM_CHANNELS {
            buffer[ch_i][i] = self.filter.process_channel(ch_i, buffer[ch_i][i]);
        }
    }
}
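For reference, here is a compilable sketch of the same idea, assuming a plain one-pole lowpass and a linear ramp on the cutoff. SmoothedParam, OnePoleLowpass and process are hypothetical names, not Firewheel's types. It walks the buffer in 32-frame chunks, so the coefficient update runs once per chunk rather than being tested on every frame, and the channel count is a const generic:

```rust
use std::f32::consts::TAU;

/// Linear ramp toward a target value, advanced once per coefficient update.
struct SmoothedParam {
    current: f32,
    target: f32,
    step: f32,
}

impl SmoothedParam {
    fn is_smoothing(&self) -> bool {
        self.current != self.target
    }
    fn next_smoothed(&mut self) -> f32 {
        let remaining = self.target - self.current;
        if remaining.abs() <= self.step {
            self.current = self.target;
        } else {
            self.current += self.step * remaining.signum();
        }
        self.current
    }
}

/// One-pole lowpass: one shared coefficient, one state value per channel.
struct OnePoleLowpass<const CH: usize> {
    coeff: f32,
    state: [f32; CH],
    sample_rate: f32,
}

impl<const CH: usize> OnePoleLowpass<CH> {
    fn update_params(&mut self, cutoff_hz: f32) {
        self.coeff = 1.0 - (-TAU * cutoff_hz / self.sample_rate).exp();
    }
    #[inline(always)]
    fn process_channel(&mut self, ch: usize, x: f32) -> f32 {
        self.state[ch] += self.coeff * (x - self.state[ch]);
        self.state[ch]
    }
}

const UPDATE_INTERVAL: usize = 32;

fn process<const CH: usize>(
    filter: &mut OnePoleLowpass<CH>,
    cutoff: &mut SmoothedParam,
    buffer: &mut [Vec<f32>; CH],
    frames: usize,
) {
    let mut start = 0;
    while start < frames {
        let end = (start + UPDATE_INTERVAL).min(frames);
        // Coefficients are recomputed at most once per chunk.
        if cutoff.is_smoothing() {
            let hz = cutoff.next_smoothed();
            filter.update_params(hz);
        }
        for i in start..end {
            // `CH` is a compile-time constant, so this loop has a fixed trip count.
            for ch in 0..CH {
                buffer[ch][i] = filter.process_channel(ch, buffer[ch][i]);
            }
        }
        start = end;
    }
}

fn main() {
    let mut filter = OnePoleLowpass::<2> { coeff: 0.0, state: [0.0; 2], sample_rate: 48_000.0 };
    filter.update_params(200.0);
    let mut cutoff = SmoothedParam { current: 200.0, target: 2_000.0, step: 100.0 };
    let mut buffer = [vec![1.0f32; 256], vec![1.0f32; 256]];
    process(&mut filter, &mut cutoff, &mut buffer, 256);
    println!("last output sample: {}", buffer[0][255]);
}
```

Whether the inner loop actually gets vectorized still depends on the compiler and the filter; the const generic only guarantees the fixed trip count.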
However, another complication is that FilterCascadeUpTo has its process method defined like this:
#[inline(always)]
fn process(&mut self, x: f32, coeffs: &Self::Coeffs) -> f32 {
    self.svfs
        .iter_mut()
        .zip(coeffs.svfs.iter())
        .take(self.num_svfs)
        .fold(
            self.one_pole.process(x, &coeffs.one_pole),
            |acc, (svf, coeffs)| svf.process(acc, coeffs),
        )
}
The take(self.num_svfs) call has a runtime bound, so it will prevent this from being unrolled as well, which blocks auto-vectorization optimizations.
Hmm, I do have some ideas on how this might be fixable. I'll experiment with that.
Dang, this is really hard to do generically.
I'm also not sure if it's worth the extra complexity to support a mixture of both one pole filters and SVF filters in the filter cascade. I might change it to only SVF filters to make things simpler.
In fact, should we even support cascades at all? Considering this is mainly used for games and not for creating modular synthesizers, it might just be overkill.
Not having one-poles mixed in and not having cascades of SVFs would make things a lot simpler.
We may have veered too far into "generic DSP library" territory.
Steep slopes can be very artistically useful, though, unless I'm misunderstanding what you're proposing.
Yeah, I suppose.
Personally I think having only 12 dB/oct slopes is a bit limiting, even for game audio. If we omit the one poles, would making the const order generic (instead of having a "maximum order") fix some issues?
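To make the "const generic order" question concrete, here is a small sketch in which the number of stages is part of the type, so there is no runtime take. Stage is a one-pole stand-in for an SVF, and all names are hypothetical rather than Firewheel's:

```rust
/// Simple one-pole lowpass stage, used here as a stand-in for an SVF.
#[derive(Clone, Copy)]
struct Stage {
    state: f32,
}

#[derive(Clone, Copy)]
struct StageCoeffs {
    a: f32,
}

impl Stage {
    #[inline(always)]
    fn process(&mut self, x: f32, coeffs: &StageCoeffs) -> f32 {
        self.state += coeffs.a * (x - self.state);
        self.state
    }
}

/// Cascade whose order `N` is fixed at compile time.
struct Cascade<const N: usize> {
    stages: [Stage; N],
}

impl<const N: usize> Cascade<N> {
    fn new() -> Self {
        Self { stages: [Stage { state: 0.0 }; N] }
    }

    #[inline(always)]
    fn process(&mut self, x: f32, coeffs: &[StageCoeffs; N]) -> f32 {
        // Both arrays have exactly `N` elements, so the trip count is constant.
        self.stages
            .iter_mut()
            .zip(coeffs.iter())
            .fold(x, |acc, (stage, c)| stage.process(acc, c))
    }
}

fn main() {
    let mut cascade = Cascade::<3>::new();
    let coeffs = [StageCoeffs { a: 0.5 }; 3];
    println!("first output sample: {}", cascade.process(1.0, &coeffs));
}
```

The trade-off is that each order becomes a distinct type, which fits the "separate node per filter" direction better than one node that switches order at runtime.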
FMOD's EQs do support a bunch of different steepness options and algorithms, so it definitely has some precedent
messed around with them a lot in my own projects
though splitting things into separate nodes makes a lot of sense
I don't think we should aim at being as fully featured as FMOD for example, either. It makes sense to provide basic filters (with pole mixing to extend to shelves and bells) but I agree with the sentiment that we might be veering too much into generic DSP territory here. Firewheel and Seedling are modular enough hopefully that more specific use cases can be provided by third-party crates.
That being said, a 3 or 5-band EQ is simple enough and useful enough that it would make sense to include first party (with 12 dB/oct bands)
In principle, I think you can make that argument. But does it actually work in practice? Who's going to step up and provide the level of quality that we expect in the main Firewheel crates and maintain it over time?
Correct me if I'm wrong, but rodio does allow you to create arbitrary effects. They're just iterators. And yet (to my knowledge), we do not see a lively third-party ecosystem, and rodio is much older and much more widely used than Firewheel. Even kira doesn't have much of an ecosystem!
Could a balance be struck by having a 1st party "3rd party" extension?
Keep the base firewheel reasonably scoped, and put the more ambitious fmod-like capabilities in a second, optional crate?
haha yeah that's what I was thinking
From a purely "PR" perspective, I think people will be more willing to contribute to the Firewheel ecosystem (especially early on) if they're contributing to something that feels "official."
rodio works by attaching effects as decorators, so yes it is extensible but it's kinda hard to make it generic enough, and especially with Bevy where any new combination of effects leads to different types which need to be registered separately. kira is much better in that regard.
That being said, this says more about the lack of need rather than the lack of participation. People mostly don't really need advanced effects in their games, and if they do, they can provide their own implementations, as if they do know what effect they want, they're much more likely to already have some amount of knowledge in how to get an implementation going.
I guess this could be done with a "contrib" module which would host community-contributed nodes, yeah
But there should be a balance between barebones and exhaustive
> That being said, this says more about the lack of need rather than the lack of participation.
Sure, but another way to look at that is no one needs advanced audio effects for games in Rust because no one is making games that need that in Rust.
Even if they were making games like that in Rust, they'd probably just use FMOD or Wwise because they have no other options.
That is already true for most game engines' audio engines
> and if they do, they can provide their own implementations, as if they do know what effect they want, they're much more likely to already have some amount of knowledge in how to get an implementation going
I also think this isn't super artist-friendly. Most game audio people do not know how to implement almost any of the effects they use. They're implementors, or composers, or sound designers!
Sure, we're a community of people very familiar with Rust here, but my perspective (which could be totally wrong) is that most people we want to be using better audio in Bevy aren't going to know how to do any of that, especially as Bevy matures.
(I hope I'm not coming off as combative or dismissive btw! I might be talking past you.)
Even using bevy_seedling, which is very high-level and user friendly, will probably be a bit of a learning experience for lots of people.
(not at all, I'm all for this kind of discussion!)
(i guess personally, i see "games don't use fancy effects" more as a fault & an opportunity for improvement with better tools)
The thing is that most "serious" games will already be reaching for one of the two, even when using Unity or Unreal Engine (the latter of which also has a serious audio engine platform from what I've heard). So the question is do we want to place the new Bevy audio engine as a flexible platform but lightweight (both in terms of maintenance and runtime performance), or are we aiming for a serious contender to "the big ones" by providing all sorts of tools and nodes and all that
And as you say, audio engines are as complicated as graphics engines, but only the latter is understood by people, and considerable effort is made to make graphics content authoring accessible. Audio engines are either "easy and simple but not expressive" or "expressive and complete but not approachable". Striking both will need a lot of work and probably innovation (the same way graphics has come to be made more approachable until now)
My thinking is that by focusing on the extensibility and lightweight performance part, we can provide a platform for other contributors to build on top of, and contribute back as third or even first party. Bevy is cool in that sense; the community is this project's biggest asset, and we can then focus on different things as needed, rather than spending time to provide a robust list of features that 95% of people won't use (because all they want is to play audio files with their game).
Hm, I can't speak to the maintenance angle, but I think we'd kinda eat our cake and have it too with respect to features vs performance. Or at least, I don't think there's a real dichotomy there.
Something that's really nice about Firewheel is that you only pay for what you use, often even at runtime! And it's flexible enough that I don't think you'd be pigeonholed into a bloated audio runtime. A great base set of nodes doesn't make Firewheel any less extensible, which is great!
I think the current set of Firewheel features is perfectly capable of providing a base for big-boy functionality (especially with the recent clock changes coming in). Higher-level crates can do all the curves and management and so on to provide sophisticated audio implementation. The only thing missing in my opinion are the nodes.
Call me crazy, but I'd love to have a Bevy-first contender to third-party audio implementation software. I think the road is long, but right now it seems doable to me.
Maybe that's a bit of a "miss the moon" situation, but if we land the changes we're working on into Bevy proper, then Bevy's audio capabilities will already exceed any game engine I've worked with personally by a long shot.
We're already doing great!
Side note -- the performance of Firewheel does look great in practice! I'm doing some early performance profiling for the Rust Audio demo. The timings come from core audio's client load, which should be fairly accurate.
Here's playing 20 samples simultaneously, pre-resampled, with a block size of 1024.
I'd add +1 to the point that artists don't know how to implement their effects.
Heck, I personally know a lot of the mathematics and difficult concepts behind audio processing, yet I am still unable to write a realtime DSP thing for anything nontrivial.
Firewheel is 5x as performant*, am I reading that correctly?
* please consult your local sage when the word performance is uttered
It might be a bit unfair to rodio? It's all the same sample, and Firewheel can share the underlying buffer, whereas rodio can't easily as far as I can tell. So maybe rodio's just getting clapped on cache misses.
Once I'm actually finished, I'll be able to get some more representative traces. But I thought it was neat!
Oop, actually that was rodio at 512 and Firewheel at 1024 (Firewheel tries to select 1024 by default). You can see in the previous plot that Firewheel has half the samples. Here it is at the default 512.
The delta's actually so high (~7x) that I might be doing something wrong
You and Billy have done a lot right. Don't be humble.
The performance is all @dusky mirage for sure! My main contribution is Diff/Patch, but that's not being used here.
Oh wow is that literally an order of magnitude
How cross-platform are FMOD and Wwise?
They're both pretty good, actually. They hit all the major platforms (including consoles). You do have to distribute their dlls/dylibs which is kinda annoying, but integrations generally take care of that for you.
I need to catch up on this conversation, but I think we should not aim to be as fully featured as fmod currently.
What we need is a good, reliable, and maintainable Rust-only default. The question is probably when, not if, a project will switch to fmod (even if we do fantastic stuff, people like what they know). The answer to that question should probably be "commercial teams".
The safest thing to target, for now, is beginners, hobbyists, and genius single-dev indies (our current core user-base).
If it succeeds with that user-base, then we can set our horizons a bit further out.
The transferability alone of these two necessarily puts them in a different category than anything we could do specifically for Bevy.
for sure
There's some parts of the engine that need to be built proactively for users who are not yet here, like the editor. That's mostly not for the users we already have. There's other features that are for the people already using bevy, and this is for them. All it has to do is satisfy their needs, let's not focus on a prospective larger ecosystem just yet.
That seems to be what you are saying as well.
Yeah -- while I dream about making something feature-competitive with big audio implementations, it's probably not practical for quite a while, if it even makes sense at all. In the meantime though, it would be nice to get to a point where people with a little know-how don't feel like they have to reach for them. The flexibility/raw capability of Firewheel definitely makes that possible.
The big next steps imo are
- Robust parameter animation (ideally with some way to blend curves)
- Helpful tools for interactive audio (loop regions, layering management, transitions, etc)
- More effect nodes!
(1) and (2) are largely external to Firewheel. As we've discussed, (3) is up for debate depending on the node.
But with these three, anyone familiar with Bevy and just a working knowledge of audio will be equipped to handle pretty much anything (without writing everything by hand).
crates like fundsp can help fill the gaps anyway -- it's pretty easy to wrap fundsp processing chains in a Firewheel node
@dusky mirage is there any way I can help with the redesign of the filter nodes?
Yeah, that's the power of block-based processing vs per-sample (iterator) based processing. Block-based processing allows for many types of optimizations that are hard or not possible with per-sample processing.
One of the main reasons is branching. With per-sample processing your logic causes you to branch every single sample, preventing the compiler from optimizing and making it much harder for CPUs to do speculative execution. With block-based processing, you only branch once per block of samples.
Also branching can prevent the compiler from auto-vectorizing code, which is pretty paramount to high performance audio (you can usually get a 2x-4x improvement with vectorized instructions over scalar instructions).
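A toy illustration of that difference, assuming a gain stage with a mute flag. PerSampleGain and process_block are hypothetical and don't correspond to rodio's or Firewheel's actual code. The iterator version has to evaluate the flag on every sample, while the block version hoists the check and leaves a tight loop the compiler is free to vectorize:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Per-sample style: an iterator adaptor that re-checks `muted` on every sample.
struct PerSampleGain<'a, I> {
    input: I,
    gain: f32,
    muted: &'a AtomicBool,
}

impl<'a, I: Iterator<Item = f32>> Iterator for PerSampleGain<'a, I> {
    type Item = f32;

    fn next(&mut self) -> Option<f32> {
        let x = self.input.next()?;
        // This branch runs once per sample.
        if self.muted.load(Ordering::Relaxed) {
            Some(0.0)
        } else {
            Some(x * self.gain)
        }
    }
}

/// Block style: check `muted` once, then run a branch-free inner loop.
fn process_block(block: &mut [f32], gain: f32, muted: &AtomicBool) {
    if muted.load(Ordering::Relaxed) {
        block.fill(0.0);
    } else {
        for x in block.iter_mut() {
            *x *= gain;
        }
    }
}

fn main() {
    let muted = AtomicBool::new(false);
    let input = vec![0.5f32; 8];
    let per_sample: Vec<f32> = PerSampleGain {
        input: input.iter().copied(),
        gain: 0.5,
        muted: &muted,
    }
    .collect();
    let mut block = input.clone();
    process_block(&mut block, 0.5, &muted);
    assert_eq!(per_sample, block);
    println!("{:?}", block);
}
```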
Yeah, IMO fundsp is far more approachable to the "average" user. (Even if it doesn't have the same optimizations that Firewheel's DSP has, but then again people aren't rendering an entire DAW project in realtime in a game engine anyway.)
Maybe for now we should just focus on providing just a couple of very basic filters. IMO even just a simple one-pole lowpass filter node, a simple one-pole highpass filter node, and a simple one-pole bandpass filter node would already provide a lot of useful creative effects for game developers. I have already pretty much created the one-pole lowpass filter node in this example, it's just a matter of making it more official and adding the other two variants. https://github.com/BillyDM/Firewheel/blob/main/examples/custom_nodes/src/nodes/filter.rs
I actually have a basic, not-especially-optimized one-pole low-pass in bevy_seedling directly, along with a freeverb port. They're both a little.... uninteresting artistically, so I'm personally very keen to get some real effects going.
Is one-pole bandpass even possible? Or did you mean 1 SVF?
Is it too early to make a nursery crate or something similar for these kinds of things?
firewheel-extras or something
I think it's possible, it's just a lowpass plus a highpass. Though I haven't tried it yet so I may be wrong on that.
Ah, so using 2 one poles?
Correct. But if it is separated out into a separate node instead of trying to make a generic "filter" node, it would be much easier to add parameter smoothing and optimize.
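A quick sketch of the "two one-poles" bandpass, assuming the highpass is formed as the input minus a lowpass of the input. OnePole and OnePoleBandpass are hypothetical names and this isn't the code from the linked example. Each side only rolls off at 6 dB/oct, so the result is a fairly broad band:

```rust
use std::f32::consts::TAU;

/// One-pole lowpass. The matching highpass is `input - lowpass(input)`.
struct OnePole {
    coeff: f32,
    state: f32,
}

impl OnePole {
    fn new(cutoff_hz: f32, sample_rate: f32) -> Self {
        Self {
            coeff: 1.0 - (-TAU * cutoff_hz / sample_rate).exp(),
            state: 0.0,
        }
    }

    fn lowpass(&mut self, x: f32) -> f32 {
        self.state += self.coeff * (x - self.state);
        self.state
    }

    fn highpass(&mut self, x: f32) -> f32 {
        x - self.lowpass(x)
    }
}

/// Highpass at the low edge followed by lowpass at the high edge.
struct OnePoleBandpass {
    low_edge: OnePole,
    high_edge: OnePole,
}

impl OnePoleBandpass {
    fn new(low_cut_hz: f32, high_cut_hz: f32, sample_rate: f32) -> Self {
        Self {
            low_edge: OnePole::new(low_cut_hz, sample_rate),
            high_edge: OnePole::new(high_cut_hz, sample_rate),
        }
    }

    fn process(&mut self, x: f32) -> f32 {
        let without_lows = self.low_edge.highpass(x);
        self.high_edge.lowpass(without_lows)
    }
}

fn main() {
    let mut bandpass = OnePoleBandpass::new(200.0, 2_000.0, 48_000.0);
    let mut last = 0.0;
    // A constant (DC) input sits below the passband and should die away.
    for _ in 0..48_000 {
        last = bandpass.process(1.0);
    }
    println!("DC after one second: {last}");
}
```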
Though I agree SVF filter nodes would be useful too. We just need to think through how to best design them.
Ok! I probably won't have time until the weekend but if you haven't done anything by then I can spin up a PR
One effect that I think would be really useful for games is a convolutional reverb (a reverb where you supply a room impulse response). It gives a lot more variety and realism than freeverb. The downside is that it is pretty CPU intensive, so it's a tradeoff that the game developer would make.
But again, that's probably best if it was in a firewheel-extras crate or even a 3rd party crate.
Another simple thing I think I'll add is a simple noise generator node. In fact I already created it in the example, I just need to add it to the official list of nodes. Though it could be nice to also have the ability to generate pink noise. I'll have to look up how to do that.
Sweet, thanks!
(found it through one of your link lists, iirc)
Yeah, musicdsp.org is a great resource.
Alright, I've added a white noise generator node and pink noise generator node!
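For anyone curious what such generators can look like, here is a rough sketch. It is not the code that was added to Firewheel. The white noise uses a basic xorshift32 generator, and the pink noise uses the three-pole "economy" filter commonly attributed to Paul Kellet on musicdsp.org, with coefficients as I recall them (worth double-checking against the original post). The output is not normalized.

```rust
/// White noise in [-1, 1) from a xorshift32 generator (no allocation, no locking).
struct WhiteNoise {
    state: u32,
}

impl WhiteNoise {
    fn new(seed: u32) -> Self {
        // xorshift must not start from zero.
        Self { state: seed.max(1) }
    }

    fn next_sample(&mut self) -> f32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        // Use the top 24 bits so the conversion to f32 is exact.
        let unit = (x >> 8) as f32 / (1u32 << 24) as f32;
        unit * 2.0 - 1.0
    }
}

/// Pink-ish noise: white noise shaped by three leaky integrators.
struct PinkNoise {
    white: WhiteNoise,
    b: [f32; 3],
}

impl PinkNoise {
    fn new(seed: u32) -> Self {
        Self { white: WhiteNoise::new(seed), b: [0.0; 3] }
    }

    fn next_sample(&mut self) -> f32 {
        let w = self.white.next_sample();
        // Assumed coefficients (Kellet's "economy" variant).
        self.b[0] = 0.99765 * self.b[0] + w * 0.0990460;
        self.b[1] = 0.96300 * self.b[1] + w * 0.2965164;
        self.b[2] = 0.57000 * self.b[2] + w * 1.0526913;
        self.b[0] + self.b[1] + self.b[2] + w * 0.1848
    }
}

fn main() {
    let mut pink = PinkNoise::new(42);
    let block: Vec<f32> = (0..8).map(|_| pink.next_sample()).collect();
    println!("{:?}", block);
}
```

A real node would also scale the pink output down to a sensible level before writing it to the buffer.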
Would we use just const generic channel counts? As you've mentioned above, using a Vec would prevent auto-vectorization
where's the best place to ask questions about firewheel in general (not in relation to bevy specifically)?
i know there's a firewheel server, but there's been no activity there for a while
so i'm wondering if i should just ask here
either is probably fine!
Yeah, I created that server when I planned it to be more of a direct FMOD competitor, but I've since changed gears on that.
Yeah, const generic channel types makes sense.
so i want to have a few nodes that share some state between them, that should be calculated on the audio thread as needed, once per block
my idea was to have them share a Rc<RefCell<Thing>> and use the ProcInfo::clock_samples to key updates, but i'm not sure how i can both construct that on the audio thread and share them between nodes
also, AudioNode::construct_processor taking &self makes me confused on how to send the Thing in the first place (given that i need to construct it on the main thread to begin with)
i feel like i might be going on the wrong path early on
Hm, what kind of data is in Thing?
it's a timeline position, local to a couple of nodes on the same timeline (so separate to the global transport, but more shared than a single sampler's position); wrapped in double-buffering so it can be visible from other threads (so more like Writer<Data>), but otherwise just plain data
i'm still sketchy on the double-buffering implementation, but i'm leaving that to future luna; the main point there is that i want to construct the value on the main thread (along with a corresponding reader) and then send it over
Hm, and atomic synchronization isn't easy/practical?
that's not the issue, even if it was a plain AtomicBool i would still need to share a mutable(-able) reference between nodes
hence the Rc<RefCell<Thing>>
well, i guess it could just be Rc come to think of it
but i still need some way to share the Rc between them all, and i don't see how i can do that
Even if you could find a way to smuggle the Rc's around, they're not the best type for sharing arbitrarily between nodes. If they're dropped anywhere on the audio thread, they may trigger realtime-unsafe (de)allocation behavior.
I'd recommend Firewheel's ArcGc type for starters.
(As the name implies, it's a simple, atomically reference-counted type with a barebones GC. We run a GC pass on the main thread every once-in-a-while, dropping any values that have no more references.)
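The idea can be sketched with plain std types. This Collector is a hypothetical stand-in to show the mechanism, not the real ArcGc API. The collector keeps one extra strong reference to every value, so a drop on the audio thread only decrements a counter, and the actual deallocation happens when the main thread runs a collection pass:

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical collector that owns one extra reference to each tracked value.
struct Collector<T> {
    tracked: Vec<Arc<T>>,
}

impl<T> Collector<T> {
    fn new() -> Self {
        Self { tracked: Vec::new() }
    }

    /// Allocates a value and registers it with the collector.
    fn alloc(&mut self, value: T) -> Arc<T> {
        let arc = Arc::new(value);
        self.tracked.push(Arc::clone(&arc));
        arc
    }

    /// Call periodically on the main thread. Frees every value whose only
    /// remaining owner is the collector and returns how many were freed.
    fn collect(&mut self) -> usize {
        let before = self.tracked.len();
        self.tracked.retain(|arc| Arc::strong_count(arc) > 1);
        before - self.tracked.len()
    }
}

fn main() {
    let mut gc = Collector::new();
    let samples = gc.alloc(vec![0.0f32; 1024]);
    println!("freed while in use: {}", gc.collect());

    // The "audio thread" drops its handle, which never frees the buffer there.
    thread::spawn(move || drop(samples)).join().unwrap();
    println!("freed after release: {}", gc.collect());
}
```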
The idea with the &self parameter is that the AudioNode type represents some handle for the actual audio type. In other words, it's generally an entirely separate type from whatever implements AudioNodeProcessor.
So when you create that handle, you can put whatever data you want in there, including a shared atomic value.
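Roughly, the handle/processor split looks like this. The Node and Processor traits below are simplified, hypothetical versions (Firewheel's real traits take more parameters), and a plain Arc stands in for ArcGc:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical, simplified processor role: lives on the audio thread.
trait Processor: Send {
    fn process(&mut self, block: &mut [f32]);
}

/// Hypothetical, simplified handle role: lives on the main thread.
trait Node {
    /// Takes `&self`, so it can be called again without consuming the handle.
    fn construct_processor(&self) -> Box<dyn Processor>;
}

struct PlayheadNode {
    position: Arc<AtomicU64>,
}

struct PlayheadProcessor {
    position: Arc<AtomicU64>,
}

impl Node for PlayheadNode {
    fn construct_processor(&self) -> Box<dyn Processor> {
        // Cloning the shared value keeps construction idempotent.
        Box::new(PlayheadProcessor { position: Arc::clone(&self.position) })
    }
}

impl Processor for PlayheadProcessor {
    fn process(&mut self, block: &mut [f32]) {
        self.position.fetch_add(block.len() as u64, Ordering::Relaxed);
    }
}

fn main() {
    let node = PlayheadNode { position: Arc::new(AtomicU64::new(0)) };
    let mut processor = node.construct_processor();
    processor.process(&mut [0.0; 128]);
    // The handle can observe what the processor wrote.
    println!("position: {}", node.position.load(Ordering::Relaxed));
}
```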
in this case, i can have the double-buffering impl handle deallocating on the non-audio thread; it's also !Sync, because it makes the assumption that there's specifically one writing thread (and another !Sync value for the reading thread), so ArcGc doesn't quite work
(though, if anyone has a better double-buffering impl lying around, feel free to throw it at me)
i got that, and i assume i'm supposed to transfer the shared value from the handle to the processor when it's created, right
> in this case, i can have the double-buffering impl handle deallocating on the non-audio thread; it's also !Sync,
These two statements seem to conflict -- how can you interact with the type across threads, like reading it in one thread and dropping it in another, if it's !Sync?
you use a method to get a (SharedReader<T>, SharedWriter<T>) that share the same buffers and use atomics to avoid conflict; the reader stays on the main thread, and the writer gets moved to the audio thread & is shared between relevant nodes
so i would imagine putting timeline: Option<SharedWriter<TimelineState>> in the handle, and then take()-ing it when constructing the processor
i could wrap that in some interior mutability, but it taking &self makes me feel like that should be a bad idea
even though it's only called once outside of error cases (and in those cases i will want some custom handling i think)
Yes, the expectation given the signature is that construction is idempotent. If you're using Firewheel outside of bevy_seedling, you can probably get away with it fine.
yeah, none of this is bevy or seedling related
i would like to make this all work with bevy down the line but that is not my priority
and moreso, i'm just trying to check some other design choices i've made, have something that produces sound that i can start whacking into shape
But all of this extra consideration is why I prompted for atomics in the first place -- an ArcGc<AtomicUsize> (or whatever) is simple, safe and decently fast.
Now of course, if you need to synchronize more fields at once (that can't all be expressed as atomics)... you don't really have a choice but to get into the weeds.
yeah, there'll be more state
and in general i'd want a solution for "something i update on the audio thread and read from a few places"
my initial idea was to actually have a node in the graph that does the updating, but i think firewheel would like that even less
here it's ok for other places to see stale state, and the state isn't particularly complex, hence double buffering
store two copies of the data, allow other threads to view one copy, mutate another copy, atomically swap which one is active when you're done with it
then some extra work to ensure that, if a reader thread falls asleep while in the middle of reading or accessing the data, the writer/audio thread can work around it
which i'm not sure i have correct in my impl, but it is a well-explored problem in general
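One conservative way to sketch that scheme without any unsafe code is to store the fields as atomics and give each slot a version counter, so a reader that was interrupted mid-read notices and retries while the writer never waits. This is only an illustration: TimelineState's fields, Reader, Writer and double_buffer are hypothetical, and everything uses SeqCst for simplicity where a tuned version would use weaker orderings.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering::SeqCst};
use std::sync::Arc;

#[derive(Clone, Copy, Debug, PartialEq)]
struct TimelineState {
    position_frames: u64,
    loop_count: u64,
}

#[derive(Default)]
struct Slot {
    version: AtomicU64, // odd while a write is in progress
    position_frames: AtomicU64,
    loop_count: AtomicU64,
}

#[derive(Default)]
struct Shared {
    slots: [Slot; 2],
    active: AtomicUsize,
}

/// Single writer: not `Clone`, and publishing needs `&mut self`.
struct Writer {
    shared: Arc<Shared>,
}

#[derive(Clone)]
struct Reader {
    shared: Arc<Shared>,
}

fn double_buffer() -> (Reader, Writer) {
    let shared = Arc::new(Shared::default());
    (Reader { shared: Arc::clone(&shared) }, Writer { shared })
}

impl Writer {
    /// Fills the inactive slot, then makes it the active one. Never blocks.
    fn publish(&mut self, state: TimelineState) {
        let next = 1 - self.shared.active.load(SeqCst);
        let slot = &self.shared.slots[next];
        slot.version.fetch_add(1, SeqCst); // now odd
        slot.position_frames.store(state.position_frames, SeqCst);
        slot.loop_count.store(state.loop_count, SeqCst);
        slot.version.fetch_add(1, SeqCst); // even again: slot is consistent
        self.shared.active.store(next, SeqCst);
    }
}

impl Reader {
    /// Returns a consistent snapshot, retrying if a write overlapped the read.
    fn read(&self) -> TimelineState {
        loop {
            let slot = &self.shared.slots[self.shared.active.load(SeqCst)];
            let before = slot.version.load(SeqCst);
            let state = TimelineState {
                position_frames: slot.position_frames.load(SeqCst),
                loop_count: slot.loop_count.load(SeqCst),
            };
            if before % 2 == 0 && slot.version.load(SeqCst) == before {
                return state;
            }
        }
    }
}

fn main() {
    let (reader, mut writer) = double_buffer();
    writer.publish(TimelineState { position_frames: 512, loop_count: 1 });
    println!("{:?}", reader.read());
}
```

Readers can see stale state, which matches the requirement above; what they should never see is a half-written one.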
i could get around it by making it sync, or clonable, or something along those lines & subsequently unsafe... i don't think i can enforce that being used correctly
> making it sync
We actually have a special type for that
It's the same idea as https://doc.rust-lang.org/std/sync/struct.Exclusive.html
You could defer the sharing of the buffer until after the nodes are constructed and inserted into the graph. If that works for your use case, I think that would be the most natural way in Firewheel.
After you construct the nodes, package up your buffer into a SyncWrapper, send it to the target nodes with NodeEventType::Custom, take the value out in the node, and then you're up and running.
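The wrapper idea, sketched with made-up types. This SyncWrapper and NodeEvent are hypothetical stand-ins, not Firewheel's definitions. As with std's Exclusive, the wrapper can claim Sync because a shared reference to it gives no access to the inner value at all:

```rust
use std::any::Any;
use std::cell::Cell;
use std::sync::mpsc::channel;
use std::thread;

/// Hypothetical wrapper: holds a value that can be taken out exactly once.
struct SyncWrapper<T>(Option<T>);

// SAFETY: `&SyncWrapper<T>` exposes nothing about the inner value; every
// accessor requires `&mut self`, so shared references cannot race.
unsafe impl<T> Sync for SyncWrapper<T> {}

impl<T> SyncWrapper<T> {
    fn new(value: T) -> Self {
        Self(Some(value))
    }

    fn take(&mut self) -> Option<T> {
        self.0.take()
    }
}

/// Hypothetical event type whose payload must be `Send + Sync`.
enum NodeEvent {
    Custom(Box<dyn Any + Send + Sync>),
}

/// What a node might do on the audio thread when the event arrives.
fn deliver_to_node(event: NodeEvent) -> Option<Cell<u64>> {
    let NodeEvent::Custom(mut payload) = event;
    payload.downcast_mut::<SyncWrapper<Cell<u64>>>()?.take()
}

fn main() {
    // `Cell<u64>` is `Send` but not `Sync`, so it needs the wrapper.
    let event = NodeEvent::Custom(Box::new(SyncWrapper::new(Cell::new(7u64))));
    let (tx, rx) = channel();
    tx.send(event).unwrap();
    let received = thread::spawn(move || deliver_to_node(rx.recv().unwrap()).map(|c| c.get()))
        .join()
        .unwrap();
    println!("{:?}", received);
}
```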
Actually, it would be pretty simple to let nodes create a Box<dyn Any> that is shared across all processor instances of that node.
so, the benefit i'd get from making it Sync is that i'd be able to put it in an ArcGc that i share between all the nodes (replacing the outer Rc otherwise used); SyncWrapper doesn't help, because i do want to share the value within the thread, just not between the threads, and i wouldn't be able to recover exclusive access to the value on the audio thread by design
an arbitrary key/value store would be preferred, since i would like to reuse the same data between different node types & use different instances of the data within the same node type
Would a static string work as the key?
Although I do need to think about how to make it realtime safe (no allocations on the audio thread).
yeah, hm
can't be static, because you can have an arbitrary number of node groups available
it does feel like just sending things from the main thread is a better idea, but i'm not sure what a nice API for that looks like
ugh, i think in this case i should just make it Sync and leave it unsafe to use a writer across threads, then call it internal and be done with it
at least at this point
because then i can just set up all the node's handles with the right buffers before anything gets sent & i don't have to think about teaching rust "it's ok once every part is sent over, but not until then"
Hm, so I've got at least a start at a demo/shootout between rodio and Firewheel going: https://github.com/CorvusPrudens/rust-audio-demo.
Depending on exactly what we're intending to evaluate with it (of which I'm still not totally sure!), we may want to change the format. It's a touch sparse at times. Luckily, changes at this point should be fairly easy -- the core sample playing and fading behavior is all in place.
Frankly, my goal was to make them sound the same. They're pretty close, but different approaches to basic spatialization between the two made it difficult to get exact results.
@celest whale, if you have a moment, feel free to take a look! Let me know if there are any adjustments I can make to help you and other maintainers properly evaluate the two.
@lean cloak let me know if there's anything we can do to better represent rodio here, or if there's anything you'd like to see from the perspective of the RustAudio group.
Oh there is a big difference between them now -- for some reason, the text sounds are weirdly distorted in rodio. rodio's receiving exactly the same bytes as Firewheel, so I'm not sure what's up with that.
Anyway, sorry for the delay on evaluating the backend timing changes @dusky mirage -- I should be able to get back to that today.
Okay, very interesting. Numbers around latency or regularity would be a really helpful addition
The other critique I have is that we're not evaluating soundtrack-esque use cases
Latency might be tricky, but I'll see if the Core Audio jitter metric provides anything interesting.
Hm, as in music layering, loop regions, queuing, etc?
The data scientist in me really wishes the x axes would line up on those two graphs lol.
(sorry lol)
It's alright -- we're all grown-ups, we can read the labels.
(but aesthetics are aesthetics)
The graphs look fairly interesting
Firewheel seems to look very gaussian, so ig it's just uncorrelated noise that you'll always have.
Meanwhile rodio seems to have a bunch of separated peaks (two look very visible), so there's some conditionals on the timing.
But maybe the statistics aren't good enough
This was just from one run to be clear -- it's not particularly careful work. But the performance delta holds pretty consistently.
There could be a number of reasons for the spiky rodio timings. In this demo, Firewheel's operation is pretty steady-state. It sets up all the nodes once, with the only thing changing being how many are active. The data that's passed around is largely just main-thread-garbage-collected Arcs.
With rodio, though, we're frequently adding and removing processors (the sinks). Unless I'm mistaken, that involves occasional mutex locking in the audio thread. I'm fairly certain rodio also drops the audio buffers (which are passed by value, not reference-counted) in the audio thread as well.
In other words, the audio processors are "pre-allocated" with Firewheel. You don't technically have to do this (bevy_seedling makes it easy to dynamically grow them if you want to), but that's the most natural way with the bare API.
We might be able to do something similar with rodio sinks -- basically just create a bunch of empty ones at startup, and then play sounds by grabbing an idle sink and appending to it. That's not how bevy_audio does it, so I don't know if it's even recommended.
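A rough sketch of the "grab an idle one" pattern described above, in plain Rust. `Voice` and `VoicePool` are hypothetical stand-ins for pre-created sinks or sampler nodes; this is not rodio or Firewheel API, only the bookkeeping.

```rust
/// Hypothetical voice: stands in for a pre-created sink or sampler node.
#[derive(Debug, Default)]
pub struct Voice {
    pub busy: bool,
    pub sample_id: Option<u32>,
}

/// Fixed-size pool created once at startup, so playing a sound never
/// has to add or remove a processor.
pub struct VoicePool {
    voices: Vec<Voice>,
}

impl VoicePool {
    pub fn new(capacity: usize) -> Self {
        Self {
            voices: (0..capacity).map(|_| Voice::default()).collect(),
        }
    }

    /// Starts `sample_id` on the first idle voice and returns its index.
    /// Returns `None` when every voice is busy, i.e. the sound is dropped.
    pub fn play(&mut self, sample_id: u32) -> Option<usize> {
        let index = self.voices.iter().position(|voice| !voice.busy)?;
        let voice = &mut self.voices[index];
        voice.busy = true;
        voice.sample_id = Some(sample_id);
        Some(index)
    }

    /// Marks a voice as idle again once its sound has finished.
    pub fn finish(&mut self, index: usize) {
        if let Some(voice) = self.voices.get_mut(index) {
            voice.busy = false;
            voice.sample_id = None;
        }
    }
}
```

The trade-off is the one mentioned in the chat: the pool size caps how many sounds can play at once unless it is grown dynamically.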
Very cute demo! ❤️ It worked great too. The stdout gave some errors, but I don't think it influenced anything: ERROR symphonia_core::probe: probe reach EOF at 4793 bytes.
Oh yeah so that's my sophisticated asset loading code. It just uh... tries to read every file in assets as an audio file. I should probably clean that up to not cause any confusion.
every file is an audio file if you try hard enough
Ah, as long as it doesn't try to play PNGs on my speakers
Okay @dusky mirage those changes for the backend look great! I was worried the generics or associated instant type might get into the user-facing API for the FirewheelContext type, but it looks like that's all hidden. Very convenient for me (no special changes required for bevy_seedling itself)!
For the web audio backend, I'm just using the web context's currentTime value, which ticks up in seconds since the moment of creation. It does stop when the context is paused -- do you know if that would cause any issues?
I think it's unlikely it'll be paused in practice, but I'm just curious.
Another point is that the currentTime will only tick at the rate of processing as far as I understand, so any time in-between will not see a changing clock. Should the backend try to account for that, or does Firewheel manage that itself after getting the Instant from the backend?
I am very tired from the past two days of family stuff, so I'm a little slow with thinking today.
When you say "when the context is paused", what context do you mean? Do you mean the audio thread, the Firewheel context, the web application itself, or something else?
Specifically the Web Audio context.
The specific terminology is suspend. You can suspend the web audio context, which will halt any audio processing until it's resumed.
In the case of the web audio backend I made, it'll often be suspended for a bit at startup -- until the user actually interacts in a specific way, like pressing a key on their keyboard or clicking a button. So the Instant will just be zero for a little bit.
This is the function that calculates the delay between when the audio clock was last updated and now. If there is a timestamp in the audio clock, then the main thread calls backend_handle.now() to get the latest timestamp from the backend, and then it calculates the difference between those two timestamps to get the estimated amount of delay.
So if the audio backend is paused, I imagine this could result in a calculated delay of zero. https://github.com/BillyDM/Firewheel/blob/3377c1461717b84332f6b0194e3f94aab3d95477/crates/firewheel-graph/src/context.rs#L813
Hm, would this be a problem even if it weren't paused? For example, if the main thread's update loop runs more than once between audio processing blocks? On the second loop, the currentTime still wouldn't have advanced, so that delay would also be zero.
The currentTime is, I believe, tied directly to how many frames of audio have been processed.
Which, now that I think of it, kinda makes it not that useful as a timestamp, since that information could just be retrieved from how many frames the firewheel processor has processed.
(currentTime being, again, a field on the Web Audio API context object.)
Hmm, yeah. I wonder if it is even possible to accurately calculate that delay in the WebAudio backend then.
We do have a lead, actually
The getOutputTimestamp method provides an estimate for when the last frame of audio was processed in terms of the main thread's performance time.
So basically, it correlates the currentTime value to the main thread time.
The annoying thing is that this has to be called on the main browser thread.
Or, well, maybe that's fine -- that's where the backend is anyway.
But the current abstraction might not be able to capture these semantics exactly.
Found this article. https://web.dev/articles/audio-scheduling
Ah nevermind. That article is about accurately timing audio events, not accurately getting the current time of the audio clock from the main thread.
Hm, I might be mistaken though -- let me make sure I understand the currentTime value correctly.
Yeah, you could probably write a test to see if the currentTime value is different on the main thread.
Ah, yeah according to the spec it looks like my understanding is correct.
This is the time in seconds of the sample frame immediately following the last sample-frame in the block of audio most recently processed by the context's rendering graph.
Bit of a mouthful, but yeah -- only updated when the audio is processed.
Thus, for a running context, currentTime increases steadily as the system processes audio blocks, and always represents the time of the start of the next audio block to be processed.
Typically implemented as a shared atomic value.
So it's the same in all contexts, but not super useful for correlating main-thread-time with audio time on its own.
Yeah. It would probably make sense to change the trait method to something like get_delay_from_timestamp(&self, timestamp: Self::Instant).
Yeah if we can basically "massage" the timestamp on the main thread, we should have enough information to calculate where it's supposed to be at the moment of calculation.
Your backend could then just ignore the timestamp parameter and just call getOutputTimestamp to get the estimated delay.
(Unless I'm misunderstanding how that method works.)
It basically tells you when the last processing block happened in terms of the main thread's clock. However, that's not necessarily "now" -- you'd need to then calculate how long it's been since then on the main thread.
e.g. (totally fake numbers here)
context.getOutputTimestamp():
    contextTime: 1.0 // in seconds
    performanceTime: 6000 // in milliseconds
performance.now(): 6005 // in milliseconds
So in this scenario, we called getOutputTimestamp() and performance.now() to get the above results. That means the true current value for the audio time would be roughly 1.005.
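The arithmetic above as a small sketch. The function name and parameters are made up for illustration; the three inputs correspond to `contextTime`, `performanceTime`, and `performance.now()` from the example.

```rust
/// Estimates the audio clock's current value in seconds.
///
/// `context_time_s`      -- contextTime from getOutputTimestamp(), in seconds
/// `performance_time_ms` -- performanceTime from getOutputTimestamp(), in ms
/// `now_ms`              -- performance.now() on the main thread, in ms
pub fn estimate_audio_time(context_time_s: f64, performance_time_ms: f64, now_ms: f64) -> f64 {
    // Main-thread time elapsed since the last reported block, clamped so a
    // slightly stale `now_ms` can't move the audio clock backwards.
    let elapsed_ms = (now_ms - performance_time_ms).max(0.0);
    context_time_s + elapsed_ms / 1000.0
}
```

With the fake numbers above, `estimate_audio_time(1.0, 6000.0, 6005.0)` comes out at roughly 1.005.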
Actually, that's kind of similar to what I'm doing for the other backends. The timestamp on the CPAL backend is just the instant the last processing block happened.
Oh okay then yeah that seems very similar.
but cpal can get that time within the audio processor, right?
Correct.
I guess that's the main difference -- the Web Audio worklet where the firewheel processor runs is blissfully unaware of how it may be correlated to the main thread.
Only the backend on the main thread can sort it out.
sorry i mean context (only the main-thread context could figure out that correlation)
So do you think a get_delay_from_last_process_block(&self, timestamp: Self::Instant) -> Duration method in the audio backend trait will work?
(Ah, sorry, I was mixing up the FirewheelContext, my backend, and my backend's processor in my head.)
But yes, I think that would work. What does the input timestamp represent in this case?
It's expected to be some timestamp acquired in the past, and we calculate the main-thread-time that has passed since then?
I think in the WebAudio backend case, it would just be nothing. It's just for backends that provide a timestamp type.
Ultimately, the whole goal is just to get that Duration value.
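A hedged sketch of how such a method could look for both kinds of backend. `DelayBackend`, `NativeLikeBackend`, and `WebLikeBackend` are hypothetical and heavily trimmed; only the method name and signature come from the discussion. The web-like struct takes plain numbers in place of real `getOutputTimestamp()` and `performance.now()` calls so that it runs anywhere.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, trimmed-down backend trait (not the real Firewheel trait).
pub trait DelayBackend {
    type Instant;

    /// Time elapsed since the last audio block was processed.
    fn get_delay_from_last_process_block(&self, timestamp: Self::Instant) -> Duration;
}

/// Native-style backend: the timestamp is the instant of the last block.
pub struct NativeLikeBackend;

impl DelayBackend for NativeLikeBackend {
    type Instant = Instant;

    fn get_delay_from_last_process_block(&self, timestamp: Instant) -> Duration {
        // Saturates to zero if `timestamp` is somehow in the future.
        Instant::now().saturating_duration_since(timestamp)
    }
}

/// Web-style backend: there is no useful timestamp type, so it is `()`.
/// The fields stand in for getOutputTimestamp().performanceTime and
/// performance.now(), both in milliseconds.
pub struct WebLikeBackend {
    pub last_block_performance_ms: f64,
    pub now_ms: f64,
}

impl DelayBackend for WebLikeBackend {
    type Instant = ();

    fn get_delay_from_last_process_block(&self, _timestamp: ()) -> Duration {
        let elapsed_ms = (self.now_ms - self.last_block_performance_ms).max(0.0);
        Duration::from_secs_f64(elapsed_ms / 1000.0)
    }
}
```

Either way the caller only ever sees a `Duration`, which matches the stated goal.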
Hm, well either way, I don't see any problem with it at the moment.
Cool, when I'm back at my computer I'll add that change.
Okay, I did a little investigation into latency. More details in the README, but:
In short: there was no significant difference in latency between the two engines.
Through the speakers of a Macbook M3, they both exhibit latencies around 60-85ms
at a buffer size of 512. `rodio` can't adjust the buffer size, but Firewheel
saw no significant improvement after reducing it to 128.
Through my audio interface, the latency dropped noticeably, down to 40-60ms.
Again, reducing the buffer size had little noticeable impact.
There are a number of places this latency could be seeping in, but I seriously doubt it's anything Firewheel or rodio are doing on their leg of the journey. They both push information to the audio thread either as soon as possible or significantly faster than the exhibited latencies.
It seems a little strange that significantly lower buffer sizes (I even tested 64 with Firewheel (with no glitches :3)) have almost zero correlation with the measured latencies. My laptop in combination with my audio interface should be able to do way better.
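For scale, a quick sketch of how much time a single buffer represents. The 48 kHz sample rate is an assumption (the rate used in the test isn't stated), and the function name is made up.

```rust
/// Duration of one audio buffer in milliseconds.
pub fn buffer_duration_ms(frames: u32, sample_rate_hz: u32) -> f64 {
    f64::from(frames) * 1000.0 / f64::from(sample_rate_hz)
}
```

At an assumed 48 kHz, 512 frames is about 10.7 ms, 128 frames about 2.7 ms, and 64 frames about 1.3 ms, so a single buffer accounts for only a small part of the 40-85 ms measured.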
This has been an open issue for a while now of course, with some participation from a few folks in here.
How did you measure the latency? Is it end to end when triggered from user interaction, or is it screen-audio latency, where an audiovisual event is supposed to happen at the same time?
Also is it native or web?
End-to-end, purely in the audio domain, examining the transient of a key press triggering the action to the transient of the sound it triggers. The recording was managed by a separate device.
I did quickly test the new web backend actually, and through my speakers it had the same latency.
I might test it with the interface real quick though
MacBook speakers have their own processing, definitely convolution to counteract the form factor and resonance from the chassis (and probably some multiband compression, I can hear the kind of phase distortion that produces when I get clicks), so that probably adds more latency
There's also probably the latency from propagating the keypress to the OS to Bevy and then from the ECS to the audio engine
So I guess your experiments show the audio engine latency is negligible in the total end-to-end latency
Okay with the web audio backend, I'm getting a pretty consistent 50ms.
I think it's using a buffer of 256.
I definitely still have some interest in narrowing it down a bit further, if possible.
In that issue, it seems a little alarming that Godot or even PyGame could massively outstrip what we could do, even with small buffers. That's what the author's testing seems to indicate. But @rapid hedge couldn't reproduce their results.
IIRC AudioWorklets work at 128 sample blocks, not configurable
That's true yes, but there's no requirement that the web audio API runs through the graph only once. I believe it'll run through the whole graph several times, the number depending on the hardware capabilities.
Core Audio was reporting block sizes of 256 when I was running the web testing earlier.
40-60 milliseconds isn't bad, but if we can consistently get much better latencies with other engines or apps, it's definitely worth investigating further imo.
I wonder if our input latency is the key driver. I know Linebender has been looking into that
This is a really helpful paragraph
512 to 128 is great
Ooh!! 64
Maybe the OS is relevant? I tested on Windows, but the comment you linked is on Linux
Windows does have a reputation for being... chunky with audio.
The OS is very relevant. It also highly depends on the audio device. My USB audio interface can't go much below 512 before I start getting constant underruns.
Can I ask if we will have PitchShift effect once this comes out or if it will come later? It is quite crucial in making sounds not sound annoying when played repeatedly.
So there are a number of ways this could be handled.
If you're okay with changing the speed of the samples (which is generally okay for sound effect variation), you can do this right now in bevy_audio! Sinks can change the speed they play at, which consequently changes the pitch.
Changing pitch without changing speed is very difficult.
Or at least difficult to get it to sound decent.
I did make a component specifically for this in bevy_seedling since it's so common. It randomizes the speed within a range that you give it.
commands.spawn((
    SamplePlayer::new(server.load("my_sample.wav")),
    // randomly select between 0.9 and 1.1 when spawned
    RandomPitch::new(0.1),
));
Variation is all that matters, so if shortening/speeding up is part of it then that's no problem
If it was such a big problem to do a proper pitch shift realtime, i can just bake it in audacity or something
I notice there's no way to pass an Rng in there. Is that intentional?
ya it's just a little feature-gated utility component
It doesn't have any kind of privileged access, though -- it just sets PlaybackSettings::speed in its hooks. It's just a few lines of code.
I figure if people need anything more, it shouldn't be too hard to do manually!
Alternatively, if you think that's very useful, we could expand the API a bit to support that.
Randomizing pitch is, again, very common, so it may be worth putting a bit more work into it.
I think the networking people would appreciate that
Just to have all clients hear the same thing
And for replays
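A minimal sketch of what seeded speed randomization could look like, so that clients or replays sharing a seed pick the same speeds. This is not how RandomPitch works (per the above, it doesn't accept an Rng); `SplitMix64` and `random_speed` are illustrative names, and the generator is written out by hand to avoid assuming any crate.

```rust
/// Tiny deterministic generator (SplitMix64), for illustration only.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in [0, 1), built from the top 53 bits.
    pub fn next_unit(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Picks a playback speed around 1.0, within +/- `deviation`.
/// A deviation of 0.1 gives speeds between roughly 0.9 and 1.1.
pub fn random_speed(rng: &mut SplitMix64, deviation: f64) -> f64 {
    1.0 - deviation + 2.0 * deviation * rng.next_unit()
}
```

Two generators created with the same seed produce the same sequence of speeds, which is the property networking and replays would need.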
I think it's Apex Legends that doesn't synchronize character quips between clients, and I hate it so much.
(How am I supposed to riff on my character's funni????)
@slate scarab Alright, I've updated the delay detection method for audio backends!
Okay, I've hooked it up to the web audio backend and it seems to be behaving correctly.
I might want to do some tests with some precise audio scheduling to verify correctness, but at least right now I don't see anything wrong with it.
Btw sorry for not working on the barebones filter PR on the weekend, I don't have much time at the moment
FYI, I'm currently working on adding better multi threading support to the Bevy CLI prototype, which will set the headers automatically and also set the required nightly flags.
But it will probably still take a bit, I have limited time for open source ATM.
I'll post here once it's done :)
That would certainly make it much easier for people to try out the web audio backend. I'm guessing few people have tried it just because of the upfront friction.
Yea, I'm hoping that making it more approachable will also lead to more development of web multi threading in Bevy.
It could unlock a ton of performance!
But I was looking for something to test the multi threading with, so it's perfect that you added support for it!
@slate scarab do you happen to have a Bevy app that uses the Wasm multi-threading?
I now have a prototype for running multi-threaded web apps with the Bevy CLI and need a test case :)
https://github.com/TheBevyFlock/bevy_cli/pull/499
I do have a couple game jam submissions (of... dubious quality XD) that currently use multithreading. Would you like me to test the new CLI capabilities, or were you also looking to give it a go yourself? They're both public I believe.
Feel free to also try it out and give feedback if you want! I have put detailed instructions in the "Testing" section of the PR :)
But I'd also like to try it myself if you have a link to one of the repos.
Do you know if there's a good way to verify that it actually runs multi-threaded? Will it just fail to run at all if it's not set up correctly?
Well, there won't be any audio!
I'll verify it's working and then post a link here. I believe we activate the web audio backend with a feature, so I'll give you that info too. (I'm away atm so I can't check yet.)
Perfect, thanks a lot!
Okay here's a repo that should work out of the gate: https://github.com/void-scape/chain-reaction. I think you'll want to disable its default features. The game is slightly busted (it will crash on the game over screen), but it works for the purposes of testing.
The --experimental-multi-threading flag actually doesn't work for me because the cross origin isolation header values include single quotes. I removed those with a header override and it seemed to work, meaning the compiler flags seem to be working as expected.
If everything works, then when you interact with the page (like clicking on the canvas), the audio should start up. If no sound starts playing, it probably didn't work, and you should see an error like this in the console.
Ahh my mistake, thanks for testing!
I'll try to fix it tomorrow
I fixed the quotation and it works now!
I also verified by checking window.crossOriginIsolated in the dev console and that returns true now as well.
Also checked the repo you linked and that also runs and plays the (really nice) soundtrack :)
I was doing some more stress testing between rodio and Firewheel. For each, I played hundreds of samples at once. It's the same sound, but with some random pitch variation handled at runtime.
If we allow Firewheel to pre-allocate enough nodes to play all the sounds (otherwise it'll just drop incoming sounds when there are too many), it does quite well!
rodio can handle around 256 simultaneous sounds on my machine before it runs into underflows. This is when the sounds are played directly on the mixer, a new 0.21 feature, so it should be pretty favorable. Fairly respectable, especially since all the sounds are being live resampled due to the random pitch.
Firewheel, on the other hand, got up to 8192 before it had underflows (and in the test there was just one)! In fact, I was generating so many events to play the sounds that Bevy itself slowed way down. (I was even able to push it up to almost 16k without the pitch variation.)
Like the previous stress test, this may not be super representative, and Firewheel may benefit heavily from playing a single sample (which it shares as an Arc, likely kept in cache), but still kinda fun.
Awesome! Looks like I recommended a few more flags than were necessary in the firewheel-web-audio crate 
Hello everyone, I am new to audio development, Bevy, open source, and Rust, but I would like to contribute to this working group anyhow.
The open issues with the label A-Audio are comprehensible, but some of the C-Feature issues are not graded by difficulty.
Since both issues marked with a D-Straightforward seem to wait for continuation (#18952 & #16277),
I was wondering if someone could give me a brief overview about the current state of "Better audio" and guide me through some potential projects I could help with.
From my perspective, we're mostly waiting on a demo contrasting rodio and firewheel. @slate scarab was working on that last I heard, but I suspect that more help there would be welcome
User testing, refinement and feedback for #1378170094206718065 would also be very welcome
Discussion on the demo is ongoing here, if you're interested: https://github.com/RustAudio/audio-ecosystem/issues/8
Do note that many of the audio issues in Bevy's repo would be fixed or made obsolete by integrating our work here.
Since the work is all third-party at the moment, any progress we've made here is not reflected in Bevy's issues.
Oh and to clarify -- the work we're doing now involves integrating Firewheel as the audio engine for Bevy. We believe it has a number of big advantages over any existing solution.
bevy_seedling is my integration, and if there are no big issues with it and we like the API, it'll probably end up replacing bevy_audio completely.
Thank you Alice!
Thank you Corvus for the overview of the current audio situation. Relying on third-party libraries for audio and ensuring their seamless integration into Bevy likely explains why there are few Bevy-specific issues on this subject. Do you think integrating Firewheel as the new audio engine would centralize the audio workflow and issues back into Bevy's repo?
Yeah that seems correct.
I was interested in checking out how kira compares as well. kira was definitely the most pleasant crate to integrate from scratch given its API. In any case, it looks like Firewheel has a 2.5-2.7x performance lead over it (compared to a little over 3x with rodio).
Performance isn't necessarily the most critical aspect here, but it's cool that we stand to gain much more flexibility without any performance downsides.
For what reason specifically does Firewheel run better than kira?
Curious where kira has overhead that Firewheel avoids
Given the delta isn't huge, it may be something simple like better auto-vectorization in a few key places.
Oh, this might also be unfair to kira -- I need to double check to make sure it's not decoding on the fly.
No okay it does decode eagerly. Should be fair.
This sort of thing -- interpolating every value over each audio block -- may also contribute. While linear interpolation is fairly cheap, it can add up.
Especially if it confuses the compiler (which is easy to do) in a way that pessimizes SIMD optimizations.
The default nodes in Firewheel are generally pretty careful about that sort of thing, and more eagerly branch to encourage SIMD.
For example, look at how much effort is put into the VolumeNode with regards to whether the volume is actively changing or not: https://github.com/BillyDM/Firewheel/blob/4c3abbb93d58d98e1fe56857d58bf6f94b9d5018/crates/firewheel-nodes/src/volume.rs#L122-L199. (There are also carve-outs for 1.0 (pass-through) and 0.0 (silence).)
All that compared to, effectively, the single line I linked in this message I'm replying to.
So it definitely doesn't come for free, but pursuing the idea of auto-vectorization everywhere and aggressively avoiding unnecessary processing seems to be doing quite well for Firewheel.
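To illustrate the branching idea in general terms, here's a simplified sketch. It is not the VolumeNode code linked above; `apply_gain` is a made-up function showing only the shape: cheap carve-outs for pass-through and silence, a tight constant-gain loop, and per-sample interpolation only when the gain is actually changing.

```rust
/// Applies a gain that moves linearly from `start` to `end` across `buffer`.
pub fn apply_gain(buffer: &mut [f32], start: f32, end: f32) {
    if start == end {
        if start == 1.0 {
            // Pass-through: nothing to do.
            return;
        }
        if start == 0.0 {
            // Silence: a plain fill, no multiplies.
            buffer.fill(0.0);
            return;
        }
        // Constant gain: a simple loop that is easy to auto-vectorize.
        for sample in buffer.iter_mut() {
            *sample *= start;
        }
        return;
    }

    // Gain is changing: interpolate a new value for every sample.
    let step = (end - start) / buffer.len() as f32;
    for (i, sample) in buffer.iter_mut().enumerate() {
        *sample *= start + step * i as f32;
    }
}
```

Only the last branch pays for interpolation; the steady-state cases stay as close to free as they can be.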
If no output device has been selected (i.e. you just want the default device) and the default device changes (like when you connect headphones), should we automatically move the audio stream to the new default device?
I think the answer is probably yes, with the only downside being we'll have to poll the devices every once in a while to see if the default changes. If we only do it once or twice a frame, it can't be that expensive, right?
Oh actually it looks like cpal only surfaces an error here (and stops the stream) if you specifically select a device. It already handles the automatic switching in the default scenario. We know when an error occurs otherwise, so we don't actually have to do any polling or anything.
Yeah. Because keep in mind even something as simple as applying a single static gain to a signal requires 44,100 * NUM_CHANNELS multiply operations per second.
Unrelated -- I'm working on restoring the playback state after the audio stream changes sample rates. It's actually... almost trivial, which is great (thank you ECS). However, it seems like the samplers actually continue playing their sequence, as if they never got their new_stream method called (or they think the sample rate hasn't changed).
Just off the top of your head, do you have an idea of where that might be coming from @dusky mirage? I'll have to do a bit more investigation here to nail it down.
Oh to be clear, they continue some of the time
not every time
Hmm, not sure.
Also, having the Firewheel processor automatically schedule events efficiently is turning out to be quite complex. Nothing I can't handle, it's just I need to refactor things some more which is going to take a little while.
definitely worth the wait for me if we're able to make it work!
(Essentially state mutation logic inside FirewheelProcessorInner has become a spaghetti mess. I need to compartmentalize it.)
(Actually this might just be related to asset loading in Bevy -- i.e. even if I kick off a reload, the asset might report that it's "loaded" before the reload completes.)
While the above is happening, it does look like the old sample rate also isn't updated. i.e. the first switch from 44.1k to 48k correctly distinguishes the two (old rate: 44.1k, new rate: 48k), but switching back doesn't (old rate: 44.1k, new rate: 44.1k).
I'll see if I can figure out where that's coming from.
Oh wait, actually I have it to where it only stops if the sample rates differ. https://github.com/BillyDM/Firewheel/blob/4c3abbb93d58d98e1fe56857d58bf6f94b9d5018/crates/firewheel-nodes/src/sampler.rs#L1128
Yeah that makes sense! But in this case, the issue is that the context is actually "lying" to the samplers. When it switches from 48k to 44.1k, it still tells the samplers that the old rate is 44.1k (i.e. the original sample rate in this case).
The whole sequence being 44.1k -> 48k -> 44.1k
Oh ok.
Oh yeah I think it's because prev_sample_rate isn't set anywhere except in the Default implementation (for StreamInfo).
Ah yeah. It appears I forgot to set that value.
Do you need that fixed now, or can I just fix it along with the new event stuff?
I can just patch it for myself locally atm, I'll probably wait for the event stuff before publishing another bevy_seedling version
Ok, this is what the patch will look like:
```rust
let maybe_processor = self.processor_channel.take();

stream_info.prev_sample_rate = if maybe_processor.is_some() {
    self.sample_rate
} else {
    stream_info.sample_rate
};

self.sample_rate = stream_info.sample_rate;
self.sample_rate_recip = stream_info.sample_rate_recip;

let schedule = self.graph.compile(&stream_info)?;

let (drop_tx, drop_rx) = ringbuf::HeapRb::<FirewheelProcessorInner<B>>::new(1).split();

let processor = if let Some((from_context_rx, to_context_tx, shared_clock_input)) =
    maybe_processor
{
```
Great, looks like that did the trick!
wait I lied XD
gimme a sec
Should it be is_none maybe
stream_info.prev_sample_rate = if maybe_processor.is_none() {
since that's when we call new_stream on all the nodes?
(That makes it work in my case. idk if that's how it's meant to work in general.)
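Stripped of the context details, the bookkeeping being discussed looks roughly like the sketch below. `RateTracker` and `StreamRates` are hypothetical and much simpler than the real context; the point is only that the previous rate has to be captured before the current one is overwritten, on every stream restart.

```rust
/// Previous and new sample rate reported when a stream starts.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StreamRates {
    pub prev_sample_rate: u32,
    pub sample_rate: u32,
}

/// Hypothetical tracker standing in for the context's stored sample rate.
#[derive(Default)]
pub struct RateTracker {
    current: Option<u32>,
}

impl RateTracker {
    /// Records a newly started stream. On the very first stream there is no
    /// previous rate, so the new rate is reported for both fields.
    pub fn start_stream(&mut self, new_rate: u32) -> StreamRates {
        // Capture the old rate *before* overwriting it.
        let prev = self.current.unwrap_or(new_rate);
        self.current = Some(new_rate);
        StreamRates {
            prev_sample_rate: prev,
            sample_rate: new_rate,
        }
    }
}

/// Whether samplers would need to react to the new stream.
pub fn rate_changed(rates: StreamRates) -> bool {
    rates.prev_sample_rate != rates.sample_rate
}
```

For the 44.1k -> 48k -> 44.1k sequence, this reports (44.1k, 44.1k), then (44.1k, 48k), then (48k, 44.1k), so both switches are seen as changes.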
Okay, with a bit of a hack for the assets, it all works now. The sample states are restored after the sample rate changes, meaning there should be no issues when switching between devices, even in externally-controlled situations like disconnecting headphones. I think there are some edge cases that could cause problems, but they should be fairly unlikely.
@faint wigeon how much information would you want associated with each input and output device? Right now, when you query for them, the entity only provides the name, channels, and whether it's the default.
But do you think you'd want all the information associated with them, like sample rate and buffer size ranges?
It's a touch tricky to balance where exactly the abstraction boundaries are drawn. Technically, we could just stuff the cpal information in the entity, but Firewheel's backend abstraction doesn't expose that itself.
yeah i think for games probably not but for example i am just starting to get my bevy setup going for using avb with lasers and so sample rate is very important ! so would def appreciate at least that.
i understand not just wanting to make the fire wheel api the cpal api that's basically how we ended up with bevy window with winit
right now i'm just using raw cpal but would love to give seedling a shot!
Yeah so what I want to avoid is a situation where you can't easily provide arbitrary 3rd-party Firewheel backends to the seedling plugin. Right now, since Firewheel's backend trait only grabs this more limited information, that's what the entities report.
However, it's probably totally fine to update those types in Firewheel so they can report sample rates and buffer sizes. That seems pretty useful in the general case, and those concepts aren't tied to anything cpal-specific.
gotcha that makes sense, ty for working on this. being able to select the streams in the ecs is very exciting!
It's not perfect yet, but it's getting better! Right now, the flow is to query the devices, grab the name from the one you want, and then update the audio backend's configuration resource with that name. Any time the resource changes, it'll attempt to restart the stream with the new config.
It would be cool if we could maybe use relationships instead (like an InputOf or something), but that's another abstraction that would require a bespoke integration of each backend, so... 
Oh and there's a little period in Startup before the stream is initialized where you can write to the config, allowing you to also manage the initial stream config in the ECS.
Maybe I'll make a trait for the backends for this though. I think in practice, not too many people will bring in their own backend, and those who do would probably have no problem authoring just a few trait methods to tie everything together.
that doesn't sound too bad! maybe in the future we could have a backend agnostic resource for "high level" configuration and the backend would just observe that?
am happy to test drive whenever this lands
e.g. set an entity on some resource
Hm yeah that could work. That should allow us to handle 90% of the things you need via the high-level interface without hiding the underlying backend from those who need it.
Although I guess it would be easy to lose synchronization with the backend config and the ECS representation if you modify the config directly 
I'll let you know when it lands!
Oh derp, you're right!
@slate scarab Alright, here is what I have for the new event timing system! https://github.com/BillyDM/Firewheel/pull/55
Awesome! It's a little late here, but I should be able to take a deep look tomorrow. Very exciting!
Oh and apparently a test is failing. I'll look into that tomorrow.
Okay I just started looking into this -- definitely looking good so far. I have some things to note, but I'll put that together later.
I am running into an issue though for the audio demo. It seems like if there are too many events going on at once (not enough to overflow the buffer mind you, just a handful really), I run into a panic here for scheduled events or here for unscheduled.
That is, in both cases, it appears the sorted event buffer indices are falling out of sync with the event arena. I couldn't figure out exactly where that desync is happening, though.
If you want a quick repro, feel free to check out this branch of the repo!
Ok. I suppose it was a bit optimistic hoping it would work first time. I'll debug it tomorrow.
@slate scarab Those were pretty difficult logic bugs to squash but I think I fixed it. Let me know if you run into any more trouble!
For those who are curious, the first bug was caused by forgetting that this block can run for multiple sub-chunks, so I incorrectly assumed that the next event would always belong to the node and thought that checking for None was unnecessary. https://github.com/BillyDM/Firewheel/blob/93d272615f3ef66c848750dde4d1d2d3a486532d/crates/firewheel-graph/src/processor/event_scheduler.rs#L458
The second bug was caused by too eagerly adding a filter to the iterator here to filter out Nones https://github.com/BillyDM/Firewheel/blob/93d272615f3ef66c848750dde4d1d2d3a486532d/crates/firewheel-graph/src/processor/event_scheduler.rs#L522. But I actually needed to do extra logic for the None case.
Also, nifty little demo!
Awesome, thanks! Seems to be working great for me.
It's actually performing a little bit better? Might be noise, but either way it looks like the changes resulted in no appreciable regressions.
Actually yeah, the fix to the iterator filter logic was needed for the clump optimizations to actually work. Though I wouldn't imagine it causing that much of a difference.
Oh yeah, and I did do a few things to Notify, but I also wouldn't imagine that causing much of a difference.
Or actually, the most likely culprit might be the owned events and parameter patches along with the use of ArcGc. Before, parameter patching would have caused allocations on the audio thread if the type wasn't RealtimeClone.
Oh, really?
Oh and by "the changes" I mean in comparison to the current main if that's not clear!
Yeah, before it would actually clone a Box<dyn Any>.
(Or wait, maybe not.)
When I'm back at my computer I'll figure out where that clone was, if there was one.
I think the only allocation that could have happened previously (specifically during patching, at least with the derived implementations) is if you had an enum with a field that allocates on clone.
The new trait would prevent that happening with the previous implementation, which is nice. Although since the patch is given by value, you wouldn't need to clone in the first place.
Of course, in this scenario, you'll end up calling the drop handler for the replaced value, which could result in a syscall. But we could probably carefully apply RealtimeClone to types that also happen to have real-time safe drop handlers?
Ah yeah, nevermind, you're right.
I did also find this mistake which I fixed in my PR. This should be a "not equals". https://github.com/BillyDM/Firewheel/blob/4c3abbb93d58d98e1fe56857d58bf6f94b9d5018/crates/firewheel-core/src/diff/leaf.rs#L79
indeed it was questionable
Hmm, another theory for the better performance is the better cache efficiency of using a single immediate/scheduled event buffer for all nodes instead of allocating a buffer for each node.
(And it could just be noise)
The only thing I'm a bit iffy about is the performance of sort_unstable. Of course it is fine for most use cases, but if you had hundreds or even thousands of scheduled events, I'm not sure how well it would go. Ideally we would sort events off of the audio thread, but that's kind of hard to do since we are mixing musical and non-musical scheduled events together.
According to the docs, the unstable sort algorithm is O(n * log(n)).
Or wait, actually it's O(n * log(k)) where k is the number of distinct elements.
So I suppose if you had n distinct elements, it would be that.
Well I ran a few more tests and they're all at or below the original benchmark. The first one might have been a little exceptional (at around 10% faster). But hey, none of them were slower so that's pretty good.
Oh and to expand on this: using the pointer of an Arc as a means of change detection is a very attractive optimization. It feels clever. However, it's not perfect. You could have false positives for example, like if you have two Arcs that happen to hold the same value.
Yeah, that is true.
It's definitely helpful for large types that would be very inefficient to diff (like dyn SampleResource).
Yeah it should be fine, actually. False positives aren't too bad anyway. Just one-off extra events, really.
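For readers following along, here is a minimal sketch of the pointer-based change detection being discussed. It uses only `std::sync::Arc`; the helper name `pointer_changed` is made up for illustration and is not a Firewheel or bevy_seedling function. It also shows the false positive mentioned above: two separately allocated Arcs with equal contents are reported as changed.

```rust
use std::sync::Arc;

/// Returns true when the two handles point at different allocations.
/// This is a cheap stand-in for a deep comparison of the contents.
/// (Hypothetical helper, not part of any crate mentioned in the chat.)
fn pointer_changed<T: ?Sized>(old: &Arc<T>, new: &Arc<T>) -> bool {
    !Arc::ptr_eq(old, new)
}

/// Deep comparison, shown only for contrast: cost grows with the data size.
fn contents_changed(old: &Arc<[f32]>, new: &Arc<[f32]>) -> bool {
    old[..] != new[..]
}
```

A clone of the same Arc is never flagged, so the check is exact in the common "nothing changed" case; the only cost of a false positive is one redundant event.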
Yeah, and the vast majority of plugins are probably just going to have primitive parameter types, and maybe some dyn SampleResources.
My (very rough) guess on an O(n * log(n)) sort for a thousand events is something like 50 microseconds? That's very roughly 10k operations, probably only a few nanoseconds each. Which seems fine 
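A rough way to check that estimate on your own machine is sketched below. `ScheduledEvent` is a hypothetical stand-in with just a timestamp and a node id, not Firewheel's actual event type, and the scrambler is a small xorshift-style generator used only to get a deterministic unsorted input. No timing is quoted here because it depends heavily on hardware and build flags.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a scheduled event; only the timestamp matters here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ScheduledEvent {
    time_in_samples: u64,
    node_id: u32,
}

/// Builds `n` events in a scrambled but deterministic order.
fn scrambled_events(n: usize) -> Vec<ScheduledEvent> {
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    (0..n)
        .map(|i| {
            // Xorshift-style scrambling so timestamps arrive out of order.
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            ScheduledEvent {
                time_in_samples: state % 1_000_000,
                node_id: i as u32,
            }
        })
        .collect()
}

/// Sorts `n` scrambled events by timestamp and reports how long the sort took.
fn time_sort(n: usize) -> (Vec<ScheduledEvent>, Duration) {
    let mut events = scrambled_events(n);
    let start = Instant::now();
    events.sort_unstable_by_key(|e| e.time_in_samples);
    (events, start.elapsed())
}
```

Calling `time_sort(1_000)` in a release build and printing the returned `Duration` gives a ballpark figure to compare against the audio block's time budget.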
Once I get an animation API going for bevy_seedling, it should be pretty easy to blast it and see.
apologies if this is the wrong place to ask this... i used bevy seedling in my GMTK jam entry and it was great! It worked pretty well, although I did notice stuttering and glitching in the audio on WASM mobile builds (older devices). Probably not surprising given a lot of the particles etc were on CPU so without profiling I'd guess there was a CPU bottleneck
Is there a way in firewheel to play "raw" audio, i.e. audio I generate in real-time as an iter or vec of f32s?
Ah, I should probably document how to fix this in bevy_seedling! If you're okay with a little nightly, then you can move the audio processing into a separate, dedicated audio thread. This completely solves any stutters.
let me link it here
That's what custom nodes are for! You can do anything you want there, they are made specifically for generating and processing audio data.
#1236113088793677888 message
(and for the custom node, refer to this example: https://github.com/CorvusPrudens/bevy_seedling/blob/master/examples/custom_node.rs)
Ah perfect, thanks both! I imagine that will help the audio although I think then the particles will be the problem
Being new to thinking about audio I don't know what a lot of the terms are so I was imagining nodes were for volumes / filters etc not a source. I'll take a look at the example!
It's a bit abstract, so it's hard to wrap your head around, but the abstractness is the point, because it's what makes the audio graph versatile enough to support custom audio processing
Everything is a node in the audio engine, so yes, filtering and setting volume is done with nodes, but generating sound is just having a node with 0 inputs
You simply have to connect the output into the graph in a way that you can eventually hear it (ie. it is connected into a chain of nodes that ultimately connect to an output that then feeds your speakers), but I believe the seedling plugin takes care of that automatically (you set up connections through the ECS, and by default it goes directly to the main output AFAIK)
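To make the "a source is just a node with 0 inputs" idea concrete, here is a toy sketch. The `Node` trait below is invented for illustration and is deliberately much simpler than Firewheel's real node API (see the linked custom_node example for that); `RawSource` and `Gain` are likewise hypothetical names.

```rust
/// Hypothetical node interface for illustration only; not Firewheel's actual trait.
trait Node {
    /// Number of input buffers this node expects.
    fn num_inputs(&self) -> usize;
    /// Fills `output` from `inputs` (one slice per input).
    fn process(&mut self, inputs: &[&[f32]], output: &mut [f32]);
}

/// A source: zero inputs, it plays back samples it already owns.
struct RawSource {
    samples: Vec<f32>,
    cursor: usize,
}

impl Node for RawSource {
    fn num_inputs(&self) -> usize {
        0
    }

    fn process(&mut self, _inputs: &[&[f32]], output: &mut [f32]) {
        for out in output.iter_mut() {
            // Emit silence once the data runs out.
            *out = self.samples.get(self.cursor).copied().unwrap_or(0.0);
            self.cursor += 1;
        }
    }
}

/// An effect: one input, scaled by a fixed gain.
struct Gain {
    gain: f32,
}

impl Node for Gain {
    fn num_inputs(&self) -> usize {
        1
    }

    fn process(&mut self, inputs: &[&[f32]], output: &mut [f32]) {
        for (out, sample) in output.iter_mut().zip(inputs[0]) {
            *out = sample * self.gain;
        }
    }
}
```

Chaining the two (source output fed into the gain's single input) is the smallest possible "graph": generated audio goes in at a zero-input node and flows through effects toward the output.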
ah ok, that makes sense. You've given me a place to start thank you!
Heads-up: 0.17 uses wayland by default, so the feature list in the readme for bevy_seedling will need to be updated to use wayland instead of x11
@slate scarab finally migrating to seedling now. I'll fire some questions at you while I'm on it, hope you don't mind
First: is there a particular reason why Sample is not in the prelude?
I'm storing Handle<AudioSource> rn and it seems like the seedling equivalent is Handle<Sample>
If so, I'm used to bevy having all asset types I might need in the prelude, so it feels like I'm not supposed to use Sample?
Don't think that's the intention, but that's how it feels
the only reason is I figured it would be less commonly used
I'm using it because I preload all my assets
And because I have things like convenience functions that take in a Handle<Sample> and return an impl Bundle of the effects I want
A bit like a UI widget, but for audio
we could definitely export it as long as it's not too overloaded of a term
I think it's probably okay
yay!
Oh, I see you have examples for a settings menu and for spatial audio. Perfect, that's exactly the two things I was wondering about!
But I definitely wanted to avoid AudioSource because that's not really what the asset is for. I mean, yeah it is an audio source, but it's not "the way" to make an audio source in bevy_seedling. After all, you can write any kind of audio source as a node! Synthesizers, noise generators, whatever really.
makes sense
If Sample is too overloaded (I don't think it is), we could also do AudioSample
maybe that's better
In the next release, the default configuration will have a spatial pool already set up
so that should be convenient
it'll be the typical SFX/Music/Master setup by default
Bikeshedding a bit, but maybe AudioFile? The same way textures have the asset type Image, because it doesn't have to only be about textures, audio files don't necessarily have to be for samples
hmmm but it could be samples generated at runtime, and then packaged up into an asset
ha, I didn't think about this case, yeah
yeah, Sample feels the most generic actually
or maybe simply Audio? But that also feels too overloaded
ya and I feel it has a similar problem as this
Oh also keep in mind @rapid hedge that bevy_seedling despawns audio entities by default, unlike bevy_audio (if that's what you were using). It's maybe a little bold, but I personally think it's the correct default.
Wait, bevy_audio doesn't despawn them?
I always assumed it did lol
So every time I spawned a new sound effect, I leaked a bit of memory?

it's actually worse than that -- if the handles stick around, they slow down the audio processing a little bit
Wow
Yeah, despawning definitely seems like right call here
so when you spawn a ton of them, you can get underruns (which people have mentioned in issues from time to time)
I think most people assume audio entities despawn after finishing a one-shot audio source
Chainboom spawns like 3 per second over 15 mins 
but then they attach audio sources to their main entities and their mesh disappears randomly
yeah it's definitely a tricky trade off, but i figured this is kind of obvious (you can see when entities like this get despawned!)
maybe the default could be selected based on whether the source is spawned as a fresh entity or inserted onto an existing one, or maybe that's too much cognitive overhead
right
Another approach that was proposed is to remove all the audio components by default, and then despawn the entity if it's empty. However, I'm not sure I like that. If you insert any components that bevy_seedling doesn't know about, but you want and expect them all to be despawned, you'd end up with a lot of zombie entities.
yeah I think it should be taught to people to use fresh entities for audio, especially one-shot
And since audio entities would seem to be despawned when you don't insert any other components, changing the behavior in this way might be confusing.
tack them as children of the main entity, so that spatial audio still works as expected, but this way feels more ECS-y than just adding them as additional component to your existing entities
ya this definitely avoids a lot of issues
(also it feels like having a single entity with all the components on it is the ECS equivalent to a God Class in OO-based designs)
ya i think we should teach people to spawn spatial sounds as children of world-positioned entities
it's just a good rule of thumb
you would need to remember to insert a transform i suppose, but maybe we could find an ergonomic way to do that
One approach could be to make Transform a required component of the SpatialPool label. In the next release of bevy_seedling, the default graph configuration will have a spatial pool with that label. So the way we'll ease people into playing spatial sounds will look like:
commands.spawn((
SpatialPool,
SamplePlayer::new(server.load("sample.wav")),
));
Maybe SpatialPool can #[require(Transform)]?
ya
oh lol I was writing my message at the same time as yours
data race
in my Rust server?
There's also a trick for panicking when a different component is not specified by the user FYI
#[derive(Component)]
#[require(Transform = panic!("You need to provide a Transform"))]
struct Foo;
Although in this case, we'd actually want the default transform most of the time (assuming these will be children of entities with transforms)
@slate scarab I know why I assumed that bevy_audio despawns!
/// A sound effect audio instance.
pub(crate) fn sound_effect(handle: Handle<AudioSource>) -> impl Bundle {
(AudioPlayer(handle), PlaybackSettings::DESPAWN, SoundEffect)
}
It's my widget
This is useless now in bevy_seedling haha
oh haha yeah that makes sense (otherwise chainboom would have borked audio after 15 min probably)
hehe
How come volume is part of SamplePlayer but speed is part of PlaybackSettings?
And is it intentional that I can use a fluent API here:
SamplePlayer::new(sound_effect).with_volume(Volume::Linear(1.6))
but not here?
PlaybackSettings::default().with_speed(1.5)
Hm, does this section answer that? https://docs.rs/bevy_seedling/latest/bevy_seedling/sample/struct.SamplePlayer.html#playing-sounds
No intention there, I just haven't written those methods for PlaybackSettings. In anticipation of BSN, I'm expecting fluent apis to be less common and no longer the default way to do things. But of course we can still throw it in there.
makes sense, thanks
In terms of the ECS API yes, but more fundamentally, none of the parameters on the SamplePlayer can change in Firewheel after playback starts. (Not as true in the upcoming Firewheel versions, but really you should still follow the recommendations of the docs there.)
Is there a rule of thumb how SpatialScale translates to SpatialBasicNode or do I just tweak it by ear?
(I always hated SpatialScale)
oh really?
The docs on bevy_audio weren't clear on what the magic number meant when I first started using it haha
So I hold some residual bitterness to it
Also, I feel like my intuitive hearing does not match the numbers I use
Like, I think to myself how far away I want the audio to still be audible
But the result is almost always entirely different than what I initially wanted
So it always ends up with me cluelessly fiddling with constants
Like "oh, I want the footsteps to be audible until like 30 meters away, so I'll use SpatialScale::new(1.0 / 12.0) or something like that"
I like AudioSample
but now the footsteps are so loud that they always sound like they're right next to my ears
After some major fiddling, I end up with SpatialScale::new(1.0 / 3.6)
Which is a factor 4 off from my initial guess
Maybe this is just a skill issue though
Would you look at that, it's still like it on main: https://docs.rs/bevy_audio/latest/bevy_audio/struct.SpatialScale.html
A scale factor applied to the positions of audio sources and listeners for spatial audio.
Yes, this means nothing to me
I think I actually learned about what SpatialScale does by reading the docs of bevy_seedling, come to think of it
haha
mm i see
bevy_audio uses the inverse square law based on distance
this is physically accurate, but can sometimes be a little annoying to work with artistically
Firewheel uses exponential decay, which in some sense has the reverse pros and cons
really we should allow you to choose the fall off mode
but do keep in mind that they are different between the two, so things may sound a little weird without different compensation right now
Soooo if I have SpatialScale::new(1.0 / 3.6) in bevy_audio, what initial value should I set it to in bevy_seedling?
Of course, I'll still tweak it later
idk
But as a starting point
it depends
You can't get a single value because the two follow different laws
Inverse square will take longer to decay, so over bigger distance bevy_audio will be louder
ya it's two different curves
So maybe that's why SpatialScale never felt intuitive to me
I just went by the docs of bevy_seedling while using bevy_audio hahaha
But the initial drop is faster with exponential decay so close-by, seedling is louder
that explains so much
Sooo I just go by the default and completely retune everything? :/
technically you can get a conversion factor by having a set distance you want to match the two
unfortunately, yes
i did this for the audio demo
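As a sketch of what "match the two at a set distance" can look like numerically: the two gain functions below are textbook-style stand-ins (an inverse-square gain capped at 1, and a plain exponential), and are assumptions for illustration, not the exact formulas used by bevy_audio or Firewheel. All function names are hypothetical. The point is only that you can solve for a decay rate that agrees at one chosen distance, while the curves still diverge everywhere else.

```rust
/// Assumed inverse-square model: gain is 1 / (scale * distance)^2, capped at 1.
fn inverse_square_gain(distance: f32, scale: f32) -> f32 {
    let scaled = distance * scale;
    (1.0 / (scaled * scaled)).min(1.0)
}

/// Assumed exponential model: gain is e^(-rate * distance).
fn exponential_gain(distance: f32, rate: f32) -> f32 {
    (-rate * distance).exp()
}

/// Decay rate that makes the exponential curve hit the same gain as the
/// inverse-square curve at `match_distance` (which must be greater than 0).
fn matching_rate(match_distance: f32, scale: f32) -> f32 {
    // Solve e^(-rate * d) = g for rate, where g is the inverse-square gain at d.
    -inverse_square_gain(match_distance, scale).ln() / match_distance
}
```

With these assumed models, matching at 30 units and then sampling at 60 units shows the inverse-square curve still noticeably louder, which is the "takes longer to decay" behavior described above.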
Is there really not some dirty rule of thumb?
Just so it's not completely starting from scratch
Here's a graph https://www.desmos.com/calculator/lm9jpyssa4
you can change k to whatever you want, but you'll never get the two to line up
actually that's not technically what we're doing
(red is bevy_audio, blue is bevy_seedling)
I'm graphing the actual calculation used -- it's technically exponential decay but the constant is quite small
also the scale of the x-axis means nothing because there might be additional scaling factors in each implementation that I don't know
Is k the distance I want to match?
wait, hold on, this is a better one https://www.desmos.com/calculator/rrhbsw8ete
k is the factor you'll multiply your old SpatialScale by to get the new constant
Ah gotcha
And you mentioned doing this for a particular distance?
sorry scratch that, made another mistake
happens
https://www.desmos.com/calculator/qcfqfizzsz (final final fr fr)
fr fr
just playing around it seems to me like 1.4 is a good starting point?
Again, no need for this to be accurate
Just something so I don't start from literally zero
the k now properly changes the distance falloff curve for the seedling attenuation
Looks a bit like a bird looking to the left. Fitting.
So the falloff is much more gentle compared to inverse square
Oh wow that's a massive difference
This sounds perfect
My issue was precisely that the falloff was extremely jarring
Cool!
(here it is with the slider too https://www.desmos.com/calculator/zzhbhdjqzp )
this is incredibly useful
Not just for comparing the two, but for seeing what that parameter does in general
Maybe add that to the docs of the component
Is x the distance in meters(-ish)?
yes, I believe @dusky mirage wrote it with the intent of modeling 1 unit = 1 meter by default
Seriously, this is exactly what I wish I had when first designing my audio
because that's exactly what went through my head
"I want the footsteps to be silent at about 30 meters, how do I tweak this to make that happen?"
maybe we should include a little curve tool in the editor for this :3
Could you do a version of that where it's not k, but the spatial scale directly that you're modifying?
I think k is exactly what the spatial scale would do
ah, I thought it was the translation factor of bevy_audio to bevy_seedling
no, but it might help you set a good default scale to get the two to sound similar
Okay great
Again, this is perfect
Please please put this in the docs
Well, the version in the docs probably doesn't need the blue graph
until we add inverse square back in as an option 
Aaah this makes me so happy right now. No more struggling with SpatialScale!
Alright, now to port my audio menu
I believe @oak walrus was kind enough to write a little prototype somewhere
We have two candidates:
const VOLUME_STOP_DB: f32 = -60.0;
const VOLUME_STOP_POS: f32 = 0.01;
let position = slider.get_position();
let volume = if position < VOLUME_STOP_POS {
Volume::Linear(lerp(0.0, db_to_linear(VOLUME_STOP_DB), position / VOLUME_STOP_POS))
} else {
Volume::Decibels(lerp(VOLUME_STOP_DB, 0.0, (position - VOLUME_STOP_POS) / (1.0 - VOLUME_STOP_POS)))
};
and
fn position_to_volume(t: f32) -> Volume {
let curve = UnevenSampleAutoCurve::new([
(0.0, f32::NEG_INFINITY),
(0.01, -30.0),
(0.5, -7.0),
(1.0, 0.0),
]).unwrap();
Volume::Decibels(curve.sample(t.clamp(0.0, 1.0)).unwrap())
}
With the caveat that the second version treats the entire range of [0.0, 0.01] as negative infinity
I wish I found some reference of how other games do this
Like, this has to be a solved problem
namely, translating between some slider and a target volume
you could probably bump up the minimum to like -50 even
apparently discord made an entire repo for this XD https://github.com/discord/perceptual
to be fair, it's a very JavaScript thing to do, so
We should provide a function for this from bevy_seedling itself
Oh wow great find!
How did you find it? I tried googling but I sucked at it 
and it looks to be pretty close to my solution, but with an added "boost" range
i saw it in a gamedev subreddit
idek why it showed up
Yeah I'd say this is extremely standard functionality for settings, and I personally believe settings should be more widespread in Bevy apps for accessibility reasons
the curve solution will simply waste the first interval because the interpolation will be stuck at -inf, but otherwise they're both close too
Is this an accurate translation of what the library does?
pub(crate) fn apply(self, perceptual: f32) -> Volume {
if perceptual == 0.0 {
return Volume::Linear(0.0);
}
let db = if perceptual > self.normalized_max {
((perceptual - self.normalized_max) / self.normalized_max) * self.boost_range
} else {
(perceptual / self.normalized_max) * self.range - self.range
};
return Volume::Decibels(db);
}
Not sure what to do with that last normalized_max scale 
(this is for perceptual to amplitude)
Full code:
const DEFAULT_VOLUME_DYNAMIC_RANGE_DB: f32 = 50.0;
const DEFAULT_VOLUME_BOOST_DYNAMIC_RANGE_DB: f32 = 6.0;
/// Constructor for taking a user-presented control value and converting it to a volume
#[derive(Debug, Clone, Copy)]
pub(crate) struct PerceptualToVolume {
/// Normalization of perceptual value, choose 1 for decimals or 100 for percentages
pub(crate) normalized_max: f32,
/// Dynamic range of perceptual value from 0 to [`Self::normalized_max`]
pub(crate) range: f32,
/// Dynamic range of perceptual value from [`Self::normalized_max`] to 2 * [`Self::normalized_max`]
pub(crate) boost_range: f32,
}
impl Default for PerceptualToVolume {
fn default() -> Self {
Self {
normalized_max: 1.0,
range: DEFAULT_VOLUME_DYNAMIC_RANGE_DB,
boost_range: DEFAULT_VOLUME_BOOST_DYNAMIC_RANGE_DB,
}
}
}
impl PerceptualToVolume {
/// Converts a user-presented control value to a [`Volume`].
/// The input must be in the range \[0, 2 * [`Self::normalized_max`]\].
pub(crate) fn apply(self, perceptual: f32) -> Volume {
if perceptual == 0.0 {
return Volume::Linear(0.0);
}
let db = if perceptual > self.normalized_max {
((perceptual - self.normalized_max) / self.normalized_max) * self.boost_range
} else {
(perceptual / self.normalized_max) * self.range - self.range
};
return Volume::Decibels(db);
}
}
hm, tbh I would stick with @oak walrus's first approach, since the behavior near zero will be a little more well-behaved, and usually games don't have a boost setting to worry about
ah, good point
it could still benefit from the struct though
since you may want to configure the range and stop position
Okay I've added all these notes as issues. I should be able to implement them pretty quickly, and I can work towards another release.
I've been a little blocked on integrating animations. Initially I wanted something simple for stuff like fade ins / fade outs. However, a proper implementation is really just another animation crate, which is quite involved. And frankly, there's no point in writing another one unless it properly integrates with bevy_animation in some way, which is very difficult. So I think I'll put that aside for now.
(And in any case, you can animate the audio nodes like any other ECS type with any of the existing animation crates, so it's not like people have no options.)
Thanks for writing down my feedback
Very glad to have it! The overall feedback has been pretty sparse so far.
Oh really?
I was under the impression that a few people used it so far
But I guess they didn't have much to say
ya but mostly people have said either "ya seems nice" or nothing at all
Is there a better way to do db_to_linear than Volume::Decibels(foo).linear()?
hard to tell if everything's good or people just bounce off it 
The whole node system y'all set up seems genius
I hope I can do it justice at some point, I'm still quite the beginner in terms of audio
assuming foo is db as an f32, then that'd be the canonical way imo
would you prefer a specific conversion function for that?
I think it's fine, I was just thinking SolarLiner was referencing a specific function
Oh yeah, force of habit of another library I use, Volume::Decibels(...).linear() is strictly equivalent
Do you have some time to write the inverse out too?
I'm not familiar enough with the underlying maths to do so myself
i.e. convert a volume to the perceptual control value
Not at the computer right now
Alternatively you can have the perceptual value as a component, have an observer that changes the volume when that component is changed using the conversion above
Bonus is it's easily serializable
good idea!
no problem, I'll figure it out
How come that the perceptual value 1.0 corresponds to 0.0 dB, btw?
Is that arbitrary?
like, why not 3.0 dB?
generally when you work with dB (FS), the "full-scale" value is 0.0
TIL, thanks
so that's why it goes negative for things quieter than full-scale
and the actual calculation is 10f32.powf(0.05 * db)
so negative values don't produce a negative amplitude or anything
just a smaller one
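Spelled out as a pair of free functions, the conversion above and its inverse look like this. The names `db_to_linear` and `linear_to_db` are illustrative; in the chat's own code the same thing is reached through `Volume::Decibels(..).linear()`.

```rust
/// Converts decibels (relative to full scale) to a linear amplitude factor.
/// Equivalent to 10^(db / 20): 0 dB maps to 1.0, negative dB to values below 1.0.
fn db_to_linear(db: f32) -> f32 {
    10f32.powf(0.05 * db)
}

/// Inverse of `db_to_linear`. A linear amplitude of 0 maps to negative infinity,
/// which is why "silence" has to be special-cased or given a finite floor.
fn linear_to_db(linear: f32) -> f32 {
    20.0 * linear.log10()
}
```

For example, -20 dB is an amplitude factor of about 0.1, and every additional -20 dB divides the amplitude by ten again.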
That part I remembered, I just know from college that everyday sounds are above 0 dB, so I was wondering why we use that value for "full"
Does this look correct?
/// Constructor for taking a user-presented control value and converting it to a volume.
#[derive(Debug, Clone, Copy)]
pub(crate) struct PerceptualToVolume {
/// When the perceptual control value is below this value, the mapping will be linear between:
/// - 0 perceptual = 0 volume
/// - [`Self::pivot_pos`] perceptual = [`Self::pivot_volume`] volume
///
/// When above this value, the mapping will be exponential between:
/// - [`Self::pivot_pos`] perceptual = [`Self::pivot_volume`] volume
/// - 1.0 perceptual = 0 dB
pub(crate) pivot_pos: f32,
/// The volume to use at [`Self::pivot_pos`]
pub(crate) pivot_volume: Volume,
}
impl Default for PerceptualToVolume {
fn default() -> Self {
Self {
pivot_pos: 0.01,
pivot_volume: Volume::Decibels(-50.0),
}
}
}
impl PerceptualToVolume {
/// Converts a user-presented control value in \[0.0, 1.0\] to a [`Volume`].
pub(crate) fn apply(self, perceptual: f32) -> Volume {
if perceptual < self.pivot_pos {
let min = 0.0_f32;
let max = self.pivot_volume.linear();
let t = perceptual / self.pivot_pos;
Volume::Linear(min.lerp(max, t))
} else {
let min = self.pivot_volume.decibels();
let max = 0.0;
let t = (perceptual - self.pivot_pos) / (1.0 - self.pivot_pos);
Volume::Decibels(min.lerp(max, t))
}
}
}
oh ya that'd be dB SPL https://en.wikipedia.org/wiki/Sound_pressure
Aaaaaah gotcha. Thanks!
ya that looks right to me
thx
I'll add it to the issue in case you want to go with this version
I just needed something to use right now for my menu haha
Does this seem reasonable?
pub(crate) fn to_perceptual(self, volume: Volume) -> f32 {
if volume.linear() <= self.pivot_volume.linear() {
let vol = volume.linear();
let pivot = self.pivot_volume.linear();
let t = vol / pivot;
t * self.pivot_pos
} else {
let vol = volume.decibels();
let pivot = self.pivot_volume.decibels();
let t = (vol - pivot) / (0.0 - pivot);
self.pivot_pos + t * (1.0 - self.pivot_pos)
}
}
ya this also looks correct
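One way to gain confidence that `apply` and `to_perceptual` really are inverses is a round-trip check over the slider range. The sketch below re-states both functions against a minimal stand-in `Volume` enum so that it compiles on its own; the stand-in and the helper `max_round_trip_error` are assumptions for illustration, not the real bevy or bevy_seedling types.

```rust
/// Minimal stand-in for the `Volume` type used in the chat (illustration only).
#[derive(Debug, Clone, Copy)]
enum Volume {
    Linear(f32),
    Decibels(f32),
}

impl Volume {
    fn linear(self) -> f32 {
        match self {
            Volume::Linear(v) => v,
            Volume::Decibels(db) => 10f32.powf(0.05 * db),
        }
    }

    fn decibels(self) -> f32 {
        match self {
            Volume::Linear(v) => 20.0 * v.log10(),
            Volume::Decibels(db) => db,
        }
    }
}

#[derive(Debug, Clone, Copy)]
struct PerceptualToVolume {
    pivot_pos: f32,
    pivot_volume: Volume,
}

impl PerceptualToVolume {
    /// Slider position in [0, 1] to volume: linear below the pivot, dB above it.
    fn apply(self, perceptual: f32) -> Volume {
        if perceptual < self.pivot_pos {
            let t = perceptual / self.pivot_pos;
            Volume::Linear(self.pivot_volume.linear() * t)
        } else {
            let min = self.pivot_volume.decibels();
            let t = (perceptual - self.pivot_pos) / (1.0 - self.pivot_pos);
            Volume::Decibels(min + (0.0 - min) * t)
        }
    }

    /// Volume back to slider position, mirroring the two branches of `apply`.
    fn to_perceptual(self, volume: Volume) -> f32 {
        let pivot_linear = self.pivot_volume.linear();
        if volume.linear() <= pivot_linear {
            volume.linear() / pivot_linear * self.pivot_pos
        } else {
            let pivot_db = self.pivot_volume.decibels();
            let t = (volume.decibels() - pivot_db) / (0.0 - pivot_db);
            self.pivot_pos + t * (1.0 - self.pivot_pos)
        }
    }
}

/// Largest round-trip error over `steps + 1` evenly spaced slider positions
/// (`steps` must be greater than 0).
fn max_round_trip_error(map: PerceptualToVolume, steps: u32) -> f32 {
    (0..=steps)
        .map(|i| {
            let p = i as f32 / steps as f32;
            (map.to_perceptual(map.apply(p)) - p).abs()
        })
        .fold(0.0, f32::max)
}
```

Using 1000 steps with the default pivot of 0.01 exercises both the linear branch near zero and the dB branch above it.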
@slate scarab in bevy_audio, I had tags for music and SFX. Some of my SFX (well, most of them) are spatial
I assume that means I need two sampler pools?
isn't like -60dB treated as "silence" in standard audio tech?
it's a little arbitrary (discord seems to prefer -50). If you make the "silence" too low, the rest of the scaling won't be as nice
ya I'd recommend separating your music from other stuff as a good rule of thumb anyway
Followup:
I copy-pasted the pool setup from the menu example and have this:
commands.spawn((
Name::new("SFX audio sampler pool"),
SamplerPool(Sfx),
VolumeNode {
volume: DEFAULT_VOLUME,
},
));
But that gives me
WARN bevy_seedling::pool::queue: Queued sample "audio/sound_effects/run/Footsteps_Rock_Run_05.ogg" with effects in an effect-less pool.
So I changed my SFX pool to
commands.spawn((
Name::new("SFX audio sampler pool"),
SamplerPool(Sfx),
sample_effects![SpatialBasicNode::default()],
VolumeNode {
volume: DEFAULT_VOLUME,
},
));
But that gets me
ERROR bevy_seedling::pool::queue: Expected audio node in SampleEffects relationship

What part did I misunderstand?
haha it's a good thing i made that first warning
idk about the second part so i'll start with the first
When you explicitly create a pool, like the Sfx pool, you can give it sample_effects! that act as a template for all samples played in the pool. If you try to play a sample in that pool with effects that aren't in the pool, it simply removes them (and prints the warning).
The second snippet should work.
What does it look like where you spawn the sample player?
i.e. the footsteps
Also, I saw I never added this issue we discussed a while ago: https://github.com/CorvusPrudens/bevy_seedling/issues/40
The resource in the code looks private
if you're referring to what's on main
no it's on next
Is that on a branch?
Aaaah
the branch is called next
I thought you meant there's a function called next haha
alright
May I suggest using init_resource instead of insert_resource?
That way, if the user provides the resource before the plugin, it does not get silently overwritten
oh I thought that just always inserted the Default value
We had this conversation a while ago in the context of a new book chapter Alice was writing
I thought so too
Alice corrected me
that would be much better yes
Happy to hear I'm not the only one who didn't know that haha
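The difference boils down to "insert only if absent" versus "always overwrite". The sketch below models that with a tiny type-keyed map; `Resources` and `StreamConfig` are hypothetical stand-ins written so the example runs without Bevy, and they are not Bevy's `World` or any real bevy_seedling resource.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Tiny stand-in for a resource store; NOT Bevy's `World`.
#[derive(Default)]
struct Resources {
    map: HashMap<TypeId, Box<dyn Any>>,
}

impl Resources {
    /// Always stores `value`, replacing whatever was there before.
    fn insert_resource<R: Any>(&mut self, value: R) {
        self.map.insert(TypeId::of::<R>(), Box::new(value));
    }

    /// Stores `R::default()` only if no `R` is present yet.
    fn init_resource<R: Any + Default>(&mut self) {
        self.map
            .entry(TypeId::of::<R>())
            .or_insert_with(|| Box::new(R::default()) as Box<dyn Any>);
    }

    fn get<R: Any>(&self) -> Option<&R> {
        let boxed = self.map.get(&TypeId::of::<R>())?;
        boxed.downcast_ref::<R>()
    }
}

/// Hypothetical user-provided configuration resource.
#[derive(Debug, Default, PartialEq)]
struct StreamConfig {
    sample_rate: u32,
}
```

If the user stores a `StreamConfig` first and a plugin later calls `init_resource::<StreamConfig>()`, the user's value survives; a plugin calling `insert_resource(StreamConfig::default())` would silently replace it.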
Nice, that matches my intuition after reading https://docs.rs/bevy_seedling/latest/bevy_seedling/pool/struct.SamplerPool.html#playing-samples-in-a-pool
oh i see my in-example-comment is wrong
should be SampleEffects now
can i test my comments in my docs comments pls
hehe
the linked docs also give a touch more detail
Samples played in a pool don't need to respect the ordering or presence of effects; when a sample is queued, missing effects are inserted and the order of effects is corrected.
Well I play a multitude of sounds, and the ERROR is not telling which one it's angry about
But many of them are just spawn((SamplePlayer::new(handle), Sfx))
Here's the one from the stepping
Hm, yeah that should work -- I'd only expect an issue to arise if something like this happened:
commands.spawn((
SamplePlayer::new(server.load("sample.wav")),
sample_effects![
SpatialBasicNode::default(),
Transform::default(),
],
));
commands.entity(entity).with_child((
Transform::default(),
SamplePlayer::new(sound_effect).with_volume(Volume::Linear(1.6)),
PlaybackSettings {
speed: 1.5,
..default()
},
sample_effects![SpatialBasicNode::default(), SpatialScale::default()],
Sfx,
));
Oh?
commands.entity(entity).with_child((
Transform::default(),
SamplePlayer::new(sound_effect).with_volume(Volume::Linear(1.6)),
PlaybackSettings {
speed: 1.5,
..default()
},
sample_effects![(SpatialBasicNode::default(), SpatialScale::default())],
Sfx,
));
aaaaaah
Makes sense
yeah so since the second entity didn't have any registered audio node, it got angy
but that error message is downright arcane haha
Hm, what do you think it should be?
children: famously problematic
What does the non-parenthesized version translate into? It feels like the macro should encapsulate the extra parentheses?
At least it's a first impression kind of thought
I shoe-horned in location tracking for certain things before I realized Bevy had a more integrated solution. Could we use that to help improve this?
Basically, sample_effects! is a relationship kinda like children!, where each entity represents a different audio effect. That way, you can have multiple of the same audio effect if you want (chain a few LPFs to get a steeper slope, for example).
The parentheses version spawns the spatial node as one entity with a scale component.
Well, in an ideal world:
Entity {NameOrEntity} was passed a sample effect without an audio node. The offending sample effect bundle contains only a SpatialScale, but has a sibling SpatialBasicNode. Did you forget to enclose them in parentheses?
E.g. replace sample_effects![SpatialBasicNode::default(), SpatialScale::default()] with sample_effects![(SpatialBasicNode::default(), SpatialScale::default())]
The first one spawns two different entities, with the second entity having no audio effect.
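A rough sketch of the check behind a message like that, in plain Rust rather than real Bevy code. Everything here is hypothetical: `EffectEntry`, `AUDIO_NODES`, and `diagnose` are illustration-only names, and each `sample_effects!` entry is modeled as a list of component names instead of an actual bundle.

```rust
/// One entry of a `sample_effects!` list, modeled as the component names
/// in that entry's bundle. Hypothetical stand-in for a real Bevy bundle.
struct EffectEntry<'a> {
    components: &'a [&'a str],
}

/// Component names treated as audio nodes in this sketch (an assumption,
/// not how bevy_seedling actually registers nodes).
const AUDIO_NODES: &[&str] = &["SpatialBasicNode", "VolumeNode"];

fn has_audio_node(entry: &EffectEntry) -> bool {
    entry.components.iter().any(|c| AUDIO_NODES.contains(c))
}

/// Returns a diagnostic for the first entry that lacks an audio node, if any.
fn diagnose(entries: &[EffectEntry]) -> Option<String> {
    let idx = entries.iter().position(|e| !has_audio_node(e))?;
    let offending = entries[idx].components.join(", ");
    // Only suggest parentheses when some sibling entry does carry a node.
    let sibling_has_node = entries
        .iter()
        .enumerate()
        .any(|(i, e)| i != idx && has_audio_node(e));
    let mut msg = format!(
        "sample effect #{idx} contains only [{offending}] and no audio node."
    );
    if sibling_has_node {
        msg.push_str(" Did you forget to enclose it with a sibling effect in parentheses?");
    }
    Some(msg)
}
```

With the un-parenthesized list modeled as two entries (`["SpatialBasicNode"]` and `["SpatialScale"]`), `diagnose` flags the second entry and adds the parentheses hint; with a single grouped entry it returns `None`.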
Yeah that makes total sense
ahhh, right, both components are parameters for the same effect? Right that makes sense
ya the SpatialBasicNode is the effect, and it can be configured with the scale component.
right right right, pretty decent, nice
I think diagnostics could go scrambling through the relationship hierarchy for nice error messages, given that this seems like an easy-to-run-into footgun
I would have expected each effect to have all their settings within the component itself
It's a bit of a slippery slope when factoring in change detection
yeah that makes sense
For an extreme example, see the Window component
Change detection on that is basically impossible 
also more importantly in this case, SpatialBasicNode is a Firewheel type -- it's the whole thing
though we do have just the thing to make granular diffing of arbitrary structures in seedling :p
SpatialScale is a concept we added specifically for bevy_seedling
and ya we could add a param to the node for overall scale, but in a way that's less elegant
Does this mean that the way I set up stuff now, if I play an SFX without a SpatialBasicNode, it will insert it for me?
Okay, good news: Foxtrot compiles and runs!
Bad news: the spatial stuff seems a bit off
I set panning_threshold: 1.0 to debug what's going on, and even when the fox walks to the right of me, its sound plays on my left speaker sometimes 
ALSO there's another thing -- every node has an associated configuration struct. This is used once when the node is created, and often sets things up like channel count (which can only be determined once when the node is created). These can be added as additional components.
sample_effects![(
VolumeNode::default(),
VolumeConfig {
// add a 5-channel volume node
channels: NonZeroChannelCount::new(5).unwrap(),
..Default::default()
},
)]
So hopefully we'll guide users' expectations towards the idea that audio nodes can have multiple meaningful components.
I assume SpatialBasicNode uses GlobalTransform and not just Transform, right?
ya
I'll be honest -- I haven't tested in 3D yet. I had to adjust the coordinate system for 2D, so I'm not 100% sure it matches Bevy's coordinate system properly. Although that issue doesn't seem related to coordinate systems 

Oh, also it's not playing in my browser
Is there an error before that one?
(I'd expect that to happen if the audio context fails to initialize.)
well that's symphonia for you; they've since resolved that I think, but they've made no new releases
Ooof
Is there anything I can do?
bevy_audio uses symphonia too, right?
So I'm wondering why I'm not getting the same error there
well it's just a log (info), isn't it?
but I don't think bevy_audio uses symphonia by default for every format
Oh wait you're right haha
here's the actual error, sorry
ya that means it either doesn't have the proper headers or it wasn't compiled with the +atomics target feature
Hmm still same error
This should be using the same compilation options as in the minimal example I did before
Is my browser just caching the wrong headers? 
Let me do a cargo clean too just in case
To make absolutely sure it's being delivered with the right headers, you can check the response headers for the main document in the network tab of your browser's inspector.
If they're present, then it's almost certainly an issue with the Wasm's target features.
On it
Another thing, if I do this:
commands.spawn((
Name::new("SFX audio sampler pool"),
SamplerPool(Sfx),
sample_effects![(
SpatialBasicNode {
panning_threshold: 1.0,
..default()
},
SpatialScale(Vec3::splat(2.0))
)],
VolumeNode {
volume: DEFAULT_VOLUME,
},
));
Does that mean that all my Sfx will "inherit" those spatial basic nodes and spatial scales?
LGTM
Those are the exact same between projects 
"-Ctarget-feature=+bulk-memory,+sign-ext,+nontrapping-fptoint,+atomics"
(more than strictly necessary, I know)
and build-std = ["std", "core", "alloc", "panic_abort"]
Yes, we use entity cloning for missing effects, so things like this should be correctly cloned over. When overriding the effects, you will have to include a scale though.
got it 
So if I don't override the effects, the scale gets cloned?
Well, it's fine if you override any other effects
but if you specifically override the SpatialBasicNode, that's when you need to provide stuff like that again
(in this case there's only one effect, so
)
Could you take a look at my branch if you find time? My current two issues are
- No audio on web
- spatial audio has a weird interpretation of left and right
Yeah I should be able to take a look later today! Thanks for bearing with me
Thank you for giving me premium migration live guides haha
Hopefully we can iron this all out so the onboarding for everyone else is very smooth
should be no problem
@slate scarab alright here's the current status: https://github.com/janhohenheim/foxtrot/pull/407
Run it with bevy run
and run it in the web with bevy run web --headers="Cross-Origin-Opener-Policy:same-origin" --headers="Cross-Origin-Embedder-Policy:require-corp"
It uses WebGPU, so make sure you open it in a browser that supports that
Since I use a highly customized .cargo/config.toml, you'll have to add these to your setup yourself
At least, until the CLI is able to do additive rustflags with config.toml
btw does anyone have opinions on the name SamplerPool?
I'm specifically worried about the r in Sampler. In one sense it's a more accurate description: a SamplerPool is a pool of Sampler nodes, in which we can queue SamplePlayers.
However, I think most of the API surface surrounding samples uses Sample: SamplePlayer, SampleEffects, SamplePriority. I'm a little worried SamplerPool will be a bit of a stutter on the r. And SamplePool is honestly accurate enough imo.
Sampler with r sounds more logical to me, otherwise I'd go "huh, Sample is already an Asset, what do I need a pool for? Is that for reading them?"
Also Sample without r happens to be a verb
So SamplePool can wrongly be read as "Go and sample this pool"
Not a strong opinion tho
Correct (if I did my math right)
Hmm, now I'm second guessing my math.
Given that every doubling of distance is a 6.020599913 dB decrease, that means:
db = -6.020599913 * log2(distance)
And given that amplitude = 10^(db/20), that gives us:
amplitude = 10^((-6.020599913/20) * log2(distance))
converting to log base 10 gives us:
amplitude = 10^((-6.020599913/(20*log10(2))) * log10(distance))
and -6.020599913/(20*log10(2)) actually equals exactly -1 so:
amplitude = 10^(-log10(distance))
using the power rule for exponents:
amplitude = (10^(log10(distance)))^-1
which simplifies nicely to:
amplitude = 1/distance
And then assuming that a distance of 1 meter is the "maximum" sound level, that gives the final answer of:
amplitude = (1/distance).min(1)
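A quick numeric sanity check of that derivation, as a minimal sketch in plain Rust. The names `db_at_distance`, `db_to_amplitude`, and `inverse_gain` are made up for illustration and are not Firewheel or bevy_seedling APIs.

```rust
/// Decibel change per doubling of distance, as used in the derivation.
const DB_PER_DOUBLING: f32 = -6.020599913;

/// Level in dB relative to the level at distance 1 (hypothetical helper).
fn db_at_distance(distance: f32) -> f32 {
    DB_PER_DOUBLING * distance.log2()
}

/// Convert decibels to a linear amplitude factor: 10^(db/20).
fn db_to_amplitude(db: f32) -> f32 {
    10.0_f32.powf(db / 20.0)
}

/// Closed form from the derivation: 1/distance, capped at 1 so that
/// anything closer than distance 1 plays at full level.
fn inverse_gain(distance: f32) -> f32 {
    (1.0 / distance).min(1.0)
}
```

For distances of 1 or more, `db_to_amplitude(db_at_distance(d))` and `inverse_gain(d)` should agree up to floating-point error, which is what the derivation predicts.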
This is quite different from my previous calculation of:
amplitude = log10(-0.03 * distance)
(which I can't quite remember exactly how I derived it).
I should probably just research how other game engines do it.
Doubling the distance means quadrupling the surface area of the wavefront, so it is inverse square, not simple inverse 
Hmm, but 1/x^2 is an even sharper decline than 1/x.
I think the inverse square law applies to decibels, not raw amplitude. (6.02 is what you get when you apply the inverse square law for decibels).
ah so my graph might not have been taking that effect into account
The argument is over amplitude, not energy I'm pretty sure
either way, I think it would be nice to have a few models, with the most artistically useful being the default
the web audio api has three available: https://developer.mozilla.org/en-US/docs/Web/API/PannerNode/distanceModel
Oh, this is actually very helpful!
Yeah, we can add a parameter to switch between the different models.
Ah yeah, graphing the inverse model, it is equivalent to (1/distance).min(1).
Oh, and apparently the exponential model is equivalent to the inverse model when the rolloff factor is 1.
I went ahead and plotted all the models in desmos: https://www.desmos.com/calculator/eqdb7yuroi
are these the actual factor that we multiply the samples by, or do these values represent the dB?
I think it's the actual factor that we multiply the samples by (given that the inverse model matches exactly my derivation for amplitude).
Oh yeah, and treating those graphs as decibels gives weird results for amplitude. https://www.desmos.com/calculator/jnd6d1xkp4
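For reference, here is a minimal sketch of the three Web Audio distance models as plain Rust, returning the linear factor the samples get multiplied by (not dB). The formulas follow the PannerNode definitions as I understand them; `DistanceParams` and `DistanceModel` are hypothetical names rather than an existing Firewheel API, and this sketch assumes `ref_distance < max_distance` and skips the spec's extra clamping of the rolloff factor.

```rust
/// Parameters shared by the models; field names mirror the PannerNode
/// attributes refDistance, maxDistance, and rolloffFactor.
#[derive(Clone, Copy)]
struct DistanceParams {
    ref_distance: f32,
    max_distance: f32,
    rolloff: f32,
}

#[derive(Clone, Copy)]
enum DistanceModel {
    Linear,
    Inverse,
    Exponential,
}

impl DistanceModel {
    /// Linear gain factor for a source at `distance`.
    /// Assumes `p.ref_distance < p.max_distance` (clamp panics otherwise).
    fn gain(self, distance: f32, p: DistanceParams) -> f32 {
        match self {
            DistanceModel::Linear => {
                let d = distance.clamp(p.ref_distance, p.max_distance);
                1.0 - p.rolloff * (d - p.ref_distance) / (p.max_distance - p.ref_distance)
            }
            DistanceModel::Inverse => {
                let d = distance.max(p.ref_distance);
                p.ref_distance / (p.ref_distance + p.rolloff * (d - p.ref_distance))
            }
            DistanceModel::Exponential => {
                let d = distance.max(p.ref_distance);
                (d / p.ref_distance).powf(-p.rolloff)
            }
        }
    }
}
```

With `ref_distance = 1.0` and `rolloff = 1.0`, both `Inverse` and `Exponential` reduce to `(1/distance).min(1)`, matching the graphs.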
@slate scarab arrrrrrg RUSTFLAGS strikes again
Turns out that in Foxtrot, it comes in sneakily through one config as an env var
overriding my target settings in ~/.cargo/config.toml
Audio works in web now
oh nice!
The only thing left is that spatial audio behaves weird
I'd still be very grateful if you could take a look at it
is it uh... any better?
or does it need a heavier demo to cause stuttering with bevy_audio