#How do I actually get started using tracy?
The most basic integration looks like this:
• Add the Tracy repository to your project directory.
• The Tracy source files are in the project/tracy/public directory.
• Add TracyClient.cpp as a source file.
• Add tracy/Tracy.hpp as an include file.
• Include Tracy.hpp in every file you are interested in profiling.
• Define TRACY_ENABLE for the WHOLE project.
• Add the macro FrameMark at the end of each frame loop.
• Add the macro ZoneScoped as the first line of your function definitions to include them in the profile.
• Compile and run both your application and the profiler server.
• Hit Connect on the profiler server.
• Tada! You’re profiling your program!
The recommended way to integrate Tracy into an application is to create a git submodule in the repository
(assuming that you use git for version control). This way, it is straightforward to update Tracy to newly
released versions. If that’s not an option, all the files required to integrate your application with Tracy are
contained in the public directory.
With the source code included in your project, add the public/TracyClient.cpp source file to the IDE
project or makefile. You’re done. Tracy is now integrated into the application.
So, I copy the whole repo into my specific project?
In one terminal, run:
./capture-release -o my_capture.tracy
This will sit and wait for a tracy-instrumented application to start, and when it does, it will automatically connect and start capturing. Note that on Windows, the capture tool is called capture.exe.
Then run your application, enabling the trace_tracy feature: cargo run --release --features bevy/trace_tracy
From https://github.com/bevyengine/bevy/blob/main/docs/profiling.md#backend-trace_tracy
Fundamentally, I don't understand how to install the Tracy GUI referenced in Bevy's docs on Linux
(Ping @floral galleon when you have time; I know you've got this working and I'd like to improve Bevy's docs once I figure this out)
you clone tracy repo elsewhere and build it
there are released binaries for windows, but you have to compile it yourself for linux and macos
for added fun, you have to take the git tag of the correct version, 0.9 I believe
Okay, so I should clone this into just a random folder?
yup
Any tips for getting it to compile? I'm not exactly familiar with C++
I think you'll need to go to the directory profiler/build/unix then run make release
or capture/build/unix if you also want the capture tool
Okay, progress. I'm running into Package freetype2 was not found in the pkg-config search path.
Presumably I need to just apt-get it
On Unix systems you will need to install the pkg-config utility and the following libraries: glfw, freetype, capstone, dbus. Some Linux distributions will require you to add a lib prefix and a -dev, or -devel postfix to library names. You may also need to add a seemingly random number to the library name (for example: freetype2, or freetype6). Be aware that your package manager might distribute the deprecated master-branch version of capstone, and a build from source from the next-branch might be necessary for you. Have fun!
Aha, thank you!
On PopOS with the help of https://packages.ubuntu.com/:
pkg-config: pkgconfig
glfw: libglfw3 + libglfw3-dev
freetype: libfreetype-dev
capstone: libcapstone-dev
dbus: libdbus-1-dev
Okay @warm pier, the build process seems to have worked: I now have an obj/release directory with a Tracy-release binary in it
now run it 🙂
😅 Double-clicking on it does nothing, same with right click -> run
chmod +X Tracy-release in the correct directory also has no apparent effect 🤔
Run Tracy-release from the terminal. Maybe it’s outputting some error
alice@pop-os:~/Documents/Code/tracy/profiler/build/unix$ Tracy-release
Tracy-release: command not found
alice@pop-os:~/Documents/Code/tracy/profiler/build/unix$ chmod +X Tracy-release
No logs evident
probably ./Tracy-release
Aha
Cannot establish wayland display connection!
terminate called without an active exception
Aborted (core dumped)
Okay, plausible
wayland 💩 😄
calling make release LEGACY=1 now fails with glfw broken
Okay, installing that and it's building again
Progress!
nice, that works then 🎉
run a bevy program with tracy enabled and it should appear in the list
Okay, and the capture tool also seems to basically be working
Do I need to run the capture tool first?
you don't need the capture tool
you can connect directly with the UI
the capture tool is to save the trace to a file, or if you notice the UI is taking a lot of resources
Aha
I often use the UI directly for quick investigations, and capture to compare different runs
if you run ./capture/build/unix/capture-release -o v1.tracy -s 60 -f before your Bevy game, it will capture 60 seconds from start
i usually run the capture tool and then analyse the results after. i'm often comparing two runs using different code/configurations
Right 🙂
and i don't want the tracy GUI to be using GPU resources, competing with the application i'm tracing
I can see the rendering systems running while the main thread is being slept by the framerate limiter! Nice work @flint wharf!
Looks like I'm getting limited by ShadowPassNode and MainPassNode on the rendering side
digging in tracy is very fun 🙂
With a ton of time on check_visibility and check_light_visibility too
try the "statistics" button on top
And fixedupdate is just signal diffusion lol
Nodes for @floral galleon
And systems for @coral cobalt
that diffuse_signals system looks big 😄
Yeah lol
It's "all of the pathfinding and AI"
Implemented quite naively
With no parallelism
are you doing any batching/chunking yet? also, is your terrain static or destructible/animated somehow?
No batching / chunking yet
And this is for a small map still
Terrain is mostly static, but can be terraformed
i seem to remember you talking about ~4k entities?
Yeah, about that count
So merging the meshes and then rebuilding when terraforming changes in a chunk is probably the way
Will identical meshes be automatically batched in some way in the future?
in the future i think identical meshes will be instanced. variable meshes of the same material type will be batched
Okay. I may hold off on trying to roll my own chunking for now to manage implementation complexity, and try and get enough perf improvements out to test things by tuning shadows and improving signal diffusion
tempted to hack together merging of your meshes just to see what the perf difference is
though if you're using frame rate limiting...
Trivial to turn off 🙂
cool
Yeah I'd be super curious
so for now, we should be able to merge meshes that use the same material
Yep: for Emergence that means all of the columns can definitely be merged
We could merge all of the toppers of the same Id<Terrain> too, but I'm not sure if disconnected meshes are supported like that
And then you'd want to rebuild the mesh in change_height
i imagine they're all using triangle lists so they're supported
searching for change_height gives me nothing
ah respond_to_height_changes, sorry!
no worries 🙂
hmmmm
so if we start with merging the columns and ignoring the toppers, we should still see a perf boost from that
sorry, catching up, what's the question?
that should roughly halve the number of mesh entities i think? given the columns + toppers make up the majority
@neon mauve maybe our discussion should continue elsewhere
if the system is per-entity compute heavy, par_iter combined with ParallelCommands and a custom command to write the final results out can be quite effective.
I wanted to show you the list of bottlenecking systems on a realistic game 🙂
you treat the entire world as read-only during parallel iteration, and use ParallelCommands to defer writing the final results to the world at a later time
Yeah thankfully this is all stored in a single resource, so I expect I'll be able to parallelize directly 🙂
gotta be careful on the write-out
since you'll commonly need locks for resources
it's easier if it was per-entity
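The "read-only during the parallel phase, write out afterwards" pattern described above can be sketched without Bevy at all. This is a minimal illustration using only the Rust standard library, not Bevy's par_iter or ParallelCommands: Unit, MoveCommand, plan_moves and apply_moves are hypothetical names invented for the example. Workers only read a shared slice and return a list of deferred writes, and a single thread applies them afterwards, so no locks are needed.

```rust
use std::thread;

/// Hypothetical per-entity data; a stand-in, not a Bevy component.
#[derive(Clone, Copy)]
struct Unit {
    position: f32,
    target: f32,
}

/// A deferred write: which unit to update and the value to store.
struct MoveCommand {
    index: usize,
    new_position: f32,
}

/// Parallel phase: workers only read `units` and record commands.
fn plan_moves(units: &[Unit], workers: usize) -> Vec<MoveCommand> {
    let workers = workers.max(1);
    // Ceiling division, clamped so `chunks` never receives zero.
    let chunk_len = ((units.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = units
            .chunks(chunk_len)
            .enumerate()
            .map(|(chunk_index, chunk)| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .enumerate()
                        .map(|(offset, unit)| MoveCommand {
                            index: chunk_index * chunk_len + offset,
                            // Move halfway towards the target.
                            new_position: unit.position
                                + (unit.target - unit.position) * 0.5,
                        })
                        .collect::<Vec<MoveCommand>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker panicked"))
            .collect::<Vec<MoveCommand>>()
    })
}

/// Write-out phase: one thread applies all deferred writes.
fn apply_moves(units: &mut [Unit], commands: Vec<MoveCommand>) {
    for command in commands {
        units[command.index].position = command.new_position;
    }
}

fn main() {
    let mut units = vec![
        Unit { position: 0.0, target: 10.0 },
        Unit { position: 4.0, target: 0.0 },
        Unit { position: 2.0, target: 2.0 },
    ];
    let commands = plan_moves(&units, 2);
    apply_moves(&mut units, commands);
    for unit in &units {
        println!("{}", unit.position);
    }
}
```

Keeping the write-out on one thread is the simplest way to sidestep the locking concern raised above. If the results have to land in a shared resource from several threads instead, that is where a lock would be needed.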
Sorry I'm a bit out of the loop, what is this about? :)
We definitely need better documented patterns for going wide within a system
It's not all that intuitive for game dev
Which otherwise falls easily into traps that make it impossible to go wide
Tracy, the powerful but fussy-to-install profiling tool, is installed, and I've profiled Emergence
Awesome! That's also on my list of things to set up
Yeah, I've summarized what I learned from this thread in https://github.com/bevyengine/bevy/issues/8366
And since you're on PopOS it should be directly transferable
Thanks for all the help Francois ❤️
It’s not that hard to install… and most of what I’ve copied is from their pdf
Well, as far as c programs go
Agreed here: this was one of the better documented and straightforward to install C/C++ programs I've ever installed
|| but that bar is underground ||
@neon mauve fwiw instancing is surprisingly straightforward, and will let you trivially render tens of millions of instances, if you can express your tiles (or sets of tiles) that way.
Instead of chunking/merging/etc
I’m curious what’s the difference between Tracy and criterion
Does Tracy offer better features
tracy is an application profiler, criterion is a microbenchmark harness
Ah ok
The former is useful for measuring real-world application performance by instrumenting a running application
the latter is meant to collect large numbers of samples for small operations
both are useful for their own purposes
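To make the distinction concrete, here is a rough standard-library sketch of the two measurement styles. It is not how Tracy or criterion are implemented (criterion, for instance, also does warm-up and statistical analysis); timed_span and collect_samples are hypothetical helpers made up for this illustration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Profiler style: time one named span as part of a real application run.
fn timed_span<T>(
    name: &str,
    log: &mut Vec<(String, Duration)>,
    work: impl FnOnce() -> T,
) -> T {
    let start = Instant::now();
    let result = work();
    log.push((name.to_string(), start.elapsed()));
    result
}

/// Microbenchmark style: run one small operation many times, keep every sample.
fn collect_samples(samples: usize, mut op: impl FnMut()) -> Vec<Duration> {
    (0..samples)
        .map(|_| {
            let start = Instant::now();
            op();
            start.elapsed()
        })
        .collect()
}

fn main() {
    // One span, recorded where the work actually happens.
    let mut log = Vec::new();
    let total = timed_span("sum_squares", &mut log, || {
        (0..1_000u64).map(|n| n * n).sum::<u64>()
    });
    println!("{} took {:?} (result {})", log[0].0, log[0].1, total);

    // Many samples of one tiny operation; black_box keeps the work
    // from being optimized away.
    let samples = collect_samples(100, || {
        black_box((0..100u64).sum::<u64>());
    });
    let fastest = samples.iter().min().expect("at least one sample");
    println!("{} samples, fastest {:?}", samples.len(), fastest);
}
```

The span version tells you where time goes inside a running program. The sampling version tells you how fast one isolated operation is.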
Are there any pure Rust alternatives to Tracy?
right now? No
GitHub - EmbarkStudios/puffin: 🐦 Friendly little instrumentation profiler for Rust 🦀
Oh I’ve seen that
I used puffin for my previous game, it was nice
@neon mauve keep in mind tracy will only measure CPU performance. If you want to measure GPU performance, you'll need https://github.com/bevyengine/bevy/pull/8067
I.e., CPU performance for rendering is time spent recording rendering commands on the CPU, GPU performance is how long it actually takes to execute all the commands, rasterize triangles, shade pixels, etc on the GPU
Thank you!
Jasmine’s GPU timestamp queries PR gives span-like timings of how long each pass or node took. If you need more detail than that, then you need tools like Nsight Graphics, Radeon graphics analyzer, Intel graphics analyzer, or Xcode’s Instruments tools. RenderDoc can show some things, but iirc its timing information isn’t great. All the others show things like GPU occupancy over time, cache misses, bandwidth usage, etc.
Yes. NSight does way more advanced things like analyze wavefront occupancy, profile shaders down to the instruction, analyze synchronization stalls between CPU and GPU, etc.
memory bandwidth usage is another useful one 🙂
"variable meshes of the same material type will be batched"
Sorry to hijack this, what do you mean by material type?
I'm currently adding implementation for Valve (quake) maps, and I'm spawning a mesh per brush. Performance is fine until the scene gets large due to the number of drawcalls. I'm already merging planes to make the brushes a single mesh instead of e.g. 6 meshes in the case of a cuboid (6 planes), but I'm debating whether to go a step further and just merge every basic brush that shares the texture. I would much rather keep them separate if possible, though.
I wouldn't mind waiting for this optimization if it'll help my use case; what do you think? It's essentially just a bunch of meshes (not identical) that share the exact same material.
I mean the type that implements the Material trait. So all meshes with materials of type StandardMaterial have the potential to be batched together
I don’t know what brushes are in terms of quake maps. You mention a cuboid…
Brushes are basically the building blocks of how some old games did their geometry; in my case I'm converting the brush planes into meshes so I can use valve/quake level editors for my game with Bevy. So a very large map could end up with thousands or tens of thousands of meshes since you're basically modelling in the level editor. For example, the attached chunk of terrain is 16 brushes, which I end up turning into 16 meshes (In the editor I can move any of the chunks around, they're not part of the same "mesh" in the traditional sense).
They're all using the same material and texture that just tiles over itself. Does that make sense?
Yup. Makes sense. Thanks! Yes, this kind of thing should be highly batchable. But, are the meshes meant to be destructible? Or can you know if they are? If not then adding the vertices of each brush into meshes grouped by material instance should get you significant wins already
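The suggestion above, adding the vertices of each brush into meshes grouped by material, could look roughly like this. It is a sketch under stated assumptions rather than Bevy code: BrushMesh, MergedMesh and merge_by_material are hypothetical names, material_id stands in for whatever identifies a material instance, and positions are assumed to be in world space already and laid out as a triangle list.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one brush converted to a triangle-list mesh.
struct BrushMesh {
    material_id: u32,
    positions: Vec<[f32; 3]>,
    indices: Vec<u32>,
}

/// One combined mesh per material.
#[derive(Default)]
struct MergedMesh {
    positions: Vec<[f32; 3]>,
    indices: Vec<u32>,
}

/// Append every brush into the merged mesh for its material.
fn merge_by_material(brushes: &[BrushMesh]) -> BTreeMap<u32, MergedMesh> {
    let mut merged: BTreeMap<u32, MergedMesh> = BTreeMap::new();
    for brush in brushes {
        let target = merged.entry(brush.material_id).or_default();
        // Indices must be shifted by the vertices already in the target.
        let base = target.positions.len() as u32;
        target.positions.extend_from_slice(&brush.positions);
        target.indices.extend(brush.indices.iter().map(|i| i + base));
    }
    merged
}

fn main() {
    // Two single-triangle "brushes" share material 0, one uses material 1.
    let triangle = |material_id: u32, x: f32| BrushMesh {
        material_id,
        positions: vec![[x, 0.0, 0.0], [x + 1.0, 0.0, 0.0], [x, 1.0, 0.0]],
        indices: vec![0, 1, 2],
    };
    let brushes = vec![triangle(0, 0.0), triangle(0, 5.0), triangle(1, 9.0)];
    let merged = merge_by_material(&brushes);
    for (material_id, mesh) in &merged {
        println!(
            "material {}: {} vertices, {} indices",
            material_id,
            mesh.positions.len(),
            mesh.indices.len()
        );
    }
}
```

Here three meshes collapse into two, one per material. Destructible brushes would be left out of the merge and kept as separate meshes.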
Most of the time they're not destructible, and thankfully like you asked we know if they are when we're generating the map. I'm actually debating doing what you said of merging all the brushes that have the same material, I just don't know what kind of unexpected problems I might run into (I'm new to this sort of thing). Like if I have just one massive mesh that spans from one corner of the map to another, maybe I'll have issues with collision, or if I implement some form of render distance detecting what to hide, etc.
That's why I'm wondering if maybe it's just worth waiting for batching if I'm not hitting that bottleneck yet, you get me?
Ah if you need to use the mesh for collisions as well then yeah that could be problematic. Though you could use separate meshes for the rendering and collisions maybe
And then when batching is implemented you can simplify and use the same for both
Ohh not a bad idea, I'll do that if the levels get too big
Do you know if batching is likely to be in for 0.11-0.13 or is it more out of that scope? No worries if you don't, ETAs are hard lol
Thanks btw
Not going to land for 0.11, hopefully 0.12/0.13. Depends on maintainer bandwidth. Progress has kinda stalled atm.
Appreciate the response, thank you
Np. Feel free to help review existing PRs or contribute, more rendering help would be lovely :). Don't be scared to ask lots of questions in #reflection-dev , it's how I got started contributing to bevy.
Will do. I feel unqualified but I'll get there haha.
/shrug That's where everyone starts
Jasmine meant #rendering-dev 🙂
I was unqualified. I did stuff and time passed, and now I’m a maintainer and rendering SME 🙂 if you’ve got the time then learning and doing stuff is all it takes. And it’s not an individual effort either
Ah rip yeah
Yep, that was me. I started off with no Rust knowledge and minimal programming experience 🙂
Really?! Wow! 🙂