#Explain the static order initialization fiasco please.
not sure how helpful that is
this needs a code example maybe https://en.cppreference.com/w/cpp/language/siof
what does your code look like?
oh i interpreted this as a theoretical question
There are good examples and explanations online
If you have a specific question after you have read those we can help you
So, someone wants to use a variable that lives in some other translation unit, but no constructor has been called yet to actually initialize anything inside that piece of memory, so the result is a read from an unconstructed object, kinda like use-after-scope, or a read through an invalid pointer
And the fix, supposedly, is to reorder the translation units, a.k.a. the cpp files, and with them the order in which their globals are created
such a vital and crucial piece of knowledge for writing working programs, and yet barely a paragraph in cppreference, how is this possible
No
No what
That might work in some cases with some compilers, but that's not guaranteed
It doesn't actually come up that often. It comes up when people are trying to do too much during static init, which is pretty rare.
^ well, I don't really care about the probability of it occurring, I want to know how to detect and fix such a disaster if I ever see it
Last time I saw something vaguely similar, it was related to recursive types, and using some correctly written forward declarations to make the linker happy
Where struct A has pointer A and pointer B as fields, and struct B had a pointer B and pointer A as fields
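For the recursive-type situation described above, a forward declaration is usually all that is needed, and it is a compile-time matter rather than an initialization-order one. A minimal sketch, assuming hypothetical field names (`nextA`, `peerB`, `nextB`, `peerA`) and a hypothetical `linkPair` helper that are not from the thread:

```cpp
#include <cassert>

struct B;  // forward declaration: enough to declare a pointer to B

struct A {
    A* nextA = nullptr;  // pointer to the same type
    B* peerB = nullptr;  // pointer to the type that is not defined yet
};

struct B {
    B* nextB = nullptr;
    A* peerA = nullptr;
};

// Hypothetical helper: wire two objects to each other.
void linkPair(A& a, B& b) {
    a.peerB = &b;
    b.peerA = &a;
    assert(a.peerB->peerA == &a);
}
```

Nothing here has static storage duration, so the order of global construction never enters the picture.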
// Static initialization order problem
// File1.h
class A {
....
void doSomething() {
...
}
}
extern A aObj;
//File1.cpp
static A aObj;
// File2.cpp
class B {
B() {
aObj.doSomething();// Not okay! aObj may not have been constructed
}
....
}
static B bObj;
So I stole this snippet from FreeCodeCamp
let me run asan on it...
in theory I know that B::B() can crash, let me try triggering that crash
hahahaha these goddamn clowns can't even write an example that compiles
🤣 🤣 🤣
kay let me try to edit it...
$ find ./ -type f -exec echo $'\n'"//"{} ';' -exec cat {} ';'
//./build.sh
#!/bin/bash
flag="-fsanitize=address"
set -eEuo pipefail
g++ -std=c++11 -c File1.cpp $flag -o File1.o
g++ -std=c++11 -c File2.cpp $flag -o File2.o
g++ -std=c++11 -c main.cpp $flag -o main.o
g++ -std=c++11 File2.o File1.o main.o $flag -o a.out
//./File2.cpp
#include "File2.h"
#include "File1.h"
B::B()
{
    std::cout<<"B()\n";
    ::aObj.doSomething();
}
static B bObj;
//./File1.h
#pragma once
#include <string>
#include <iostream>
class A
{
    std::string s;
public:
    A() : s("something large enough to avoid small string optimization")
    {
        std::cout<<"A()\n";
    }
    void doSomething()
    {
        std::cout<<s<<"\n";
    }
};
extern A aObj;
//./File1.cpp
#include "File1.h"
A aObj;
//./main.cpp
#include "File1.h"
#include "File2.h"
int main()
{
    B instance;
}
//./File2.h
#pragma once
class B
{
public:
    B();
};
in the build script, at link time, if File2.o is listed first, we get a use-before-construct issue, as seen from the corrupted output
but, asan doesn't catch this???
What did I miss?
ASan isn't guaranteed to catch stuff like that
okay, and this object file ordering, does it guarantee the initialization order?
so, File1.o first and File2.o second is the fix?
No
Why would it?
The standard has no notion of such silliness.
bla bla, linker docs, bla bla, symbol table, bla bla
I don't know
The standard has no notion of any "symbol table"
yeah and the standard doesn't say anything about the linker either, but that doesn't mean we can ignore the linker
I didn't say anything about ignoring it
The standard also doesn't guarantee anything about the linker
so what you are telling me is that, in real life, no matter how you reorder your .o files, this doesn't guarantee anything, and the linker can arbitrarily decide "you know what? I am going to make you crash for the lulz"
is that it?
there has to be some linker arg to enforce order, the -l<libraryname> arg for dynamic libraries does enforce the order according to the linker that GCC uses
Not of static init
It's a lost cause, just don't do this lol
If you have to, define in the same TU
why didn't you say this earlier
Otherwise, use functions to enforce order
I did
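To illustrate the "define in the same TU" advice: within a single translation unit, ordinary (non-template, non-inline) globals are dynamically initialized in the order of their definitions. A minimal sketch, where `initLog` and `printInitLog` are hypothetical helpers added only to make the order observable:

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: the log is a function-local static, so it is
// constructed on first use, before any global constructor writes to it.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

struct A {
    std::string s;
    A() : s("constructed") { initLog().push_back("A"); }
};

struct B {
    explicit B(const A& a) { initLog().push_back("B sees A " + a.s); }
};

// Same translation unit: aObj is defined first, so it is fully
// constructed before bObj's constructor runs.
A aObj;
B bObj(aObj);

// Prints the recorded order; meant to be called from main().
void printInitLog() {
    assert(initLog().size() == 2);  // both globals were built before main()
    for (const std::string& entry : initLog()) std::cout << entry << '\n';
}
```

Swapping the two definitions would reintroduce the bug even inside one file, so this only helps when the dependency direction is obvious.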
so, if we get LUCKY, the linker will use the CORRECT order, but the linker docs do not say anything about initialization of extern globals
but, this is just bad
Idk, maybe they do, it'll vary linker to linker
Just don't do stuff like this
https://www.modernescpp.com/index.php/c-20-static-initialization-order-fiasco/
Someone here suggests "use static local variables", but like, this doesn't solve anything: if I run 2 threads, thread 1 initializes singleton 1, thread 2 initializes singleton 2, and singleton 1 depends on singleton 2 being alive, well, that same fiasco happens
It does solve things.
yes, the fix is, do not use globals in headers ever
what???
I gave an example, the globals will cause issues from the 2 threads
Do you know how static locals are initialized
lazily, once the first call reaches that line, the corresponding constructor is used for the static initialization
I explained that two threads initializing 2 singletons lazily can trigger the fiasco, thus a crash
why do you think this solves the fiasco
the mutex which guards these static locals in C++11 is a separate mutex for each of the variables, it's not some global holy divine mutex or something
if 3 threads try to touch the same singleton, okay that's fine
but if 2 threads touch 2 separate local static variables a.k.a. singletons, then we get the fiasco
How about you write out an example where using static locals has a static init order fiasco
should be pretty ez
Using static locals instead of globals
why do you make me do it?
Do it and we'll look at a concrete example
(And while you're writing it you may come to understand why it solves it)
Keep in mind you're asking us to spend our time helping you learn, you don't get to sigh at us
ok, so, since you aren't touching raw variables but are instead accessing lvalue references returned by function calls like getInstance, you can't get a read-before-construct, BUT you CAN get bottomless recursion between A::getInstance() and B::getInstance(), leading to a certain stack overflow and then a segfault
and, if we go back to the normal extern globals, if A::A() uses objB and also, B::B() uses objA we get again infinite recursion, but more importantly, no matter how we reorder the .o files this fiasco is impossible to fix.
was that the explanation?
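A minimal sketch of the function-local static ("construct on first use") approach discussed above. The class `A` mirrors the one from the thread's File1.h; `getA`, `getB`, `text` and `bSawConstructedA` are hypothetical names used for illustration:

```cpp
#include <cassert>
#include <string>

class A {
    std::string s;
public:
    A() : s("something large enough to avoid small string optimization") {}
    const std::string& text() const { return s; }
};

// The static local is initialized the first time control passes through
// its declaration; since C++11 that initialization is thread-safe.
A& getA() {
    static A instance;
    return instance;
}

class B {
    std::string seen;
public:
    // Calling getA() forces A to be constructed before it is read.
    B() : seen(getA().text()) { assert(!seen.empty()); }
    const std::string& text() const { return seen; }
};

B& getB() {
    static B instance;
    return instance;
}

// Hypothetical check: B copied A's fully constructed string.
bool bSawConstructedA() {
    return getB().text() == getA().text();
}
```

Whichever of `getA()` or `getB()` is called first, and from whichever translation unit, `A` is constructed before `B` reads it, because the dependency is expressed as a function call instead of link order.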
and the other thing the blogpost suggests is C++20 constinit, which well, doesn't work for complex globals which use pointers, heap and other non-compile-time things
It makes a difference where in the command you write this option; the linker searches and processes libraries and object files in the order they are specified. Thus, foo.o -lz bar.o searches library z after file foo.o but before bar.o. If bar.o refers to functions in z, those functions may not be loaded.
ummm, the docs show that the ordering really does work?
https://gcc.gnu.org/onlinedocs/gcc/gcc-command-options/options-for-linking.html
Same thing written at https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc.pdf
page 288
maybe for other linkers made by other compiler devs the rules are different, okay that's fine, these are the results for GCC at least
Ordering of link flags matters for searching as mentioned. That doesn't necessitate init order.
the linker searches and processes libraries and object files
mmmmm, I don't know how to interpret this quote
BUT you CAN get bottomless recursion
Yes, if you have a true circular reference, absolutely. So just write it better
- This actually doesn't cause a segfault in gcc, you get a clear error: https://godbolt.org/z/8eMxseG3P
- There are easy ways around this, namely don't reference B from A's constructor and vice versa
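One way to follow the second bullet is to keep the constructors free of cross references and touch the other singleton only in ordinary member functions. A minimal sketch, assuming hypothetical accessors `getA`/`getB` and hypothetical methods `name`, `describePeer` and `checkPeers`:

```cpp
#include <cassert>
#include <string>

class B;
B& getB();  // declared early so A's member function can call it later

class A {
public:
    A() {}  // deliberately does not call getB()
    std::string name() const { return "A"; }
    std::string describePeer() const;  // uses B after construction
};

class B {
public:
    B() {}  // deliberately does not call getA()
    std::string name() const { return "B"; }
    std::string describePeer() const;
};

A& getA() { static A instance; return instance; }
B& getB() { static B instance; return instance; }

// The cross references happen here, when both objects can be built
// independently, so initialization never re-enters itself.
std::string A::describePeer() const { return "A -> " + getB().name(); }
std::string B::describePeer() const { return "B -> " + getA().name(); }

// Hypothetical check that both directions work.
void checkPeers() {
    assert(getA().describePeer() == "A -> B");
    assert(getB().describePeer() == "B -> A");
}
```

The two objects can still refer to each other freely at runtime; only the construction step has to be acyclic.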
there's an exception for whaaaaaaaaaaat?
https://stackoverflow.com/a/6717376/15675011 this might help clarify things for you
__gnu_cxx::recursive_init_error
is this from the standard???
No, of course not
It's gcc being nice
https://godbolt.org/z/YG917M3rx clang also gives a great error
which response exactly
I linked a specific response
oh
this thing is regarding linker errors for undefined references, where you take your directed acyclic graph of dependencies, run one topological sort, grab one of the possible sorted results, and pass that ordering to the linker.
but, I can't decide if this is related to the initialization order we were discussing
It's not
As I said here ^
indeed
so, in real life, if some colleague tries to enforce init order with the .o file name ordering, and blindly believes that this fixes anything, and based on luck it seems to work correctly, and acts on the foolish principle "if it works don't touch it", do I need to read the assembly and decipher what ordering is used in the raw executable file? How do I check the ordering?
Or do I just say, fuck it, and rewrite the code without the globals and thus solve the fiasco easily?
In real life, if a colleague ever writes code susceptible to the static init order fiasco, it's your job to block that from making it into the codebase.
Nothing else needed
No assembly nonsense, nothing
You don't check the order
You just don't do this lol
and what if the code was written 5 years ago and is still in prod
(this is a fictional scenario, not real at my job lmao)
Then it's a miracle it hasn't broken in those 5 years. Fix it.
Nevermind, I am STOOPID
address sanitizer docs explain the flag that is needed to catch these
https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
let me try...
still won't find the error
damn
not even valgrind reports issues
you can try ubsan or msan
well, then we're back to it being best to just not write code like this in the first place
Thank you and let us know if you have any more questions!
This thread is now set to auto-hide after an hour of inactivity
In the docs/papers about C++ modules, do they mention a way to solve the fiasco by enforcing some kind of order?
No
I had to fix something like this involving static initialization across ~20ish shared libs back in the day, where the order that they load (on windows at least) isn't stable - not quite random, but never guaranteed to load in a specific order... and they all depended on one another in some way. the codebase (even just code within each library) was too big/risky to redesign so we built a dependency manager that would lazy load every library's primary resource management class in an explicit depth first order so the lowest level data was loaded first, then it walked up the tree from there until everything was safely initialized. It took a while to work out some of the crazy edge cases we came across, but it eventually stabilized. before implementing this there would be sporadic crashes on startup or teardown of the exe (since the same unstable ordering also applied to unloading the dlls). It's not the same exact issue, but this gave me flashbacks to sweating at my desk as a pretty junior dev at the time haha.
@modest swallow - question - why is this design needed? just so 1 translation unit contains the definition? i'd be curious if you just thought about redesigning to avoid this much static initialization (unless it's literally just the two classes in some specific valid use-case)... specifically to the extent where you need to redefine the extern variable as static.
if you're going to stick with this design regardless, you prob want to just avoid the strings (or anything else that needs to be dynamically initialized). really anything that isn't a primitive. constinit might help, but you may still run into deallocation issues even if initialization happens to work out. you can also declare them as pointers then initialize dynamically from a function that returns a reference to a static instance scoped to the function that only gets invoked once at runtime (best approach).
the last part i'll reiterate is that the only way for this not to haunt into the future will probably just be a redesign. i'd be extremely surprised if there isn't a way to do what you want without needing to resort to nonstandard functionality like this.
just came across this blog post that should help break down what's happening and how to fix it (the author recommends a lot of what was also discussed above):
https://pabloariasal.github.io/2020/01/02/static-variable-initialization/#the-red-zone---static-initialization-order-fiasco
isn't stable - not quite random, but never guaranteed to load in a specific order
Could that be some evil hidden environment variable that forces some bad order when starting an .exe file on windows compiled with msvc++ compiler and its linker?
was too big/risky to redesign
Some shitty dev left globals in their library, and then the dependency hell began growing like a tumor I guess? 🔥 ☠️ 🔥
so we built a dependency manager that would lazy load every library's primary resource management class in an explicit depth first order so the lowest level data was loaded first,
uhhhhhh, did you split your program into multiple .exe files or something? What exactly is this dependency manager? Or did you instead choose to explicitly open dlls and explicitly load functions from them?
but this gave me flashbacks sweating at my desk as a pretty junior dev at the time haha.
You solved this disaster on your own as a dev?
constinit might help
it doesn't work for non-constexpr things
you can also declare them as pointers then initialize dynamically from a function that returns a reference to a static instance scoped to the function that only gets invoked once at runtime (best approach).
You haven't read the chat from above, I already figured that out. Yes, this is nice except when some goof triggers an infinite recursion between A::getInstance() and B::getInstance()
will probably just be a redesign
Yea
to do what you want without needing to resort to nonstandard functionality like this.
There are workarounds for the fiasco, but as you found out, there's no guarantees, the only way is to just avoid the fiasco with normal local variables and pass-by-reference.
Could that be some evil hidden environment variable that forces some bad order when starting an .exe file on windows compiled with msvc++ compiler and its linker?
no, just windows. at least at the time it was. not sure if this behaves any different in modern versions when it comes to this one.
Some shitty dev left globals in their library, and then the dependency hell began growing like a tumor I guess?
no, it's rare that a single dev could do something like this singlehandedly. this was slowly done over like 20 years by dev teams. i don't think the issue appeared until shortly after I started working there, probably as the company grew in size and something was introduced that exposed the crashing on init/teardown or just made it happen often enough where it became a problem. some applications like our mobile apps didn't have to worry about this since they statically linked to all of those libraries instead.
uhhhhhh, did you split your program into multiple .exe files or something? What exactly is this dependency manager? Or did you instead choose to explicitly open dlls and explicitly load functions from them?
no, but it was already split into many different binaries (the libraries), and there were probably ~100ish executables that depended on them at the time (that number is probably double by now). all global statically initialized variables were converted to pointers and just properly dynamically initialized at runtime in the right order. it was just an explicit ordering that was defined so that no higher level library could initialize itself until everything it depended on in other libraries were also initialized already. this had to be handled per library in a way that each was aware of every other library it depended on.
You solved this disaster on your own as a dev?
no it was a pretty massive undertaking to make sure nothing was broken since the codebase is ~5M LOC of just the shared native c++ code. it was a mix of a few senior devs and a few jr devs. the tech lead for my team at the time took notice that i enjoyed the challenge of some tricky bugs so i just got a lot more of them after that haha.
it doesn't work for non-constexpr things
this one is a little tricky to understand since it's not super commonly used. it's not really the same as constexpr other than the compile time initialization part. constinit is just for statically initialized variables, but the variables declared with it don't have to be const and they don't need to support constexpr destructors. I added a link below that'll be worth a read or skim at least... the first thing you'll notice is it starts off by discussing static order init fiasco. the thing that causes it to break down for you is this sentence on cppreference's page for it: "If a variable declared with constinit has dynamic initialization (even if it is performed as static initialization), the program is ill-formed"... which means the std::string members probably break that rule. keep in mind the code you have in place is also ill-formed though.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1143r2.html
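A small sketch of what constinit buys you, under the assumption of a hypothetical `Counter` type with a constexpr constructor. The `SKETCH_CONSTINIT` macro is also hypothetical: it expands to `constinit` only when the compiler reports C++20 support, so the file still builds under older standards.

```cpp
#include <cassert>

// constinit is C++20. On older standards the macro expands to nothing;
// the variable below is constant-initialized either way, the keyword
// just makes the compiler enforce it.
#if defined(__cpp_constinit)
#define SKETCH_CONSTINIT constinit
#else
#define SKETCH_CONSTINIT
#endif

struct Counter {
    int value;
    constexpr explicit Counter(int v) : value(v) {}
    void bump() { ++value; }  // mutation is fine: constinit is not const
};

// Compile-time proof that the constructor is usable in a constant expression.
static_assert(Counter{41}.value == 41, "Counter must be constexpr-constructible");

// No constructor code runs at startup for this global, so there is no
// initialization order to get wrong.
SKETCH_CONSTINIT Counter gCounter{41};

// Hypothetical helper showing the global stays mutable at runtime.
int bumpAndRead() {
    int before = gCounter.value;
    gCounter.bump();
    assert(gCounter.value == before + 1);
    return gCounter.value;
}
```

A global whose initializer needs runtime work (heap allocation, calling a non-constexpr function) would be rejected with the keyword in place, which is the limitation raised earlier in the thread.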
There are workarounds for the fiasco, but as you found out, there's no guarantees, the only way is to just avoid the fiasco with normal local variables and pass-by-reference.
I guess my biggest question is why these even have to be statically initialized in the first place. describing what you're actually using these objects for might help paint the picture in my mind to get a better idea of what you're trying to do. the more info you can provide, the more any of us can help come up with something that might be a better design solution as a whole.
statically initialized variables were converted to pointers and just properly dynamically initialized at runtime in the right order.
haha it was really the only option with that codebase
wait, how did you do the globals cleanup? Did you just go:
LET THEM LEAK
haha, nah. it wouldn't be a leak in this case since unload order wasn't guaranteed either. if a shared library got unloaded while something that depended on it was still alive it'd crash the application on exit. deinit just happened in reverse order. initialization happened depth first in the dependency tree, and teardown just went in reverse order - starting with the root and working its way down from there
yeah, and someone needs to run a topological sort on the dependency tree, a.k.a. the directed acyclic graph, and use a reasonable deinit order?
kinda hard in real life tho
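A rough sketch of the idea described above: initialize depth first so dependencies come up before their dependents, then tear down in exactly the reverse order. Everything here (`Module`, `DependencyManager`, `runDemo`, the module names) is hypothetical and not the actual system from the thread; it is single-threaded and ignores DLL loading entirely.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Module {
    std::vector<std::string> deps;   // names of modules this one needs
    std::function<void()> init;
    std::function<void()> shutdown;
};

class DependencyManager {
    std::map<std::string, Module> modules_;
    std::vector<std::string> initOrder_;
    std::set<std::string> done_, inProgress_;

    // Depth-first visit: initialize all dependencies, then the module itself.
    void visit(const std::string& name) {
        if (done_.count(name)) return;
        if (!inProgress_.insert(name).second)
            throw std::logic_error("dependency cycle at " + name);
        Module& m = modules_.at(name);
        for (const std::string& dep : m.deps) visit(dep);
        m.init();
        inProgress_.erase(name);
        done_.insert(name);
        initOrder_.push_back(name);
    }

public:
    void add(const std::string& name, Module m) { modules_[name] = std::move(m); }
    void initAll() { for (auto& kv : modules_) visit(kv.first); }
    // Teardown walks the recorded init order backwards.
    void shutdownAll() {
        for (auto it = initOrder_.rbegin(); it != initOrder_.rend(); ++it)
            modules_.at(*it).shutdown();
        initOrder_.clear();
        done_.clear();
    }
    const std::vector<std::string>& order() const { return initOrder_; }
};

// Hypothetical demo: "app" needs "net" and "core", "net" needs "core".
std::vector<std::string> runDemo() {
    std::vector<std::string> events;
    auto mod = [&events](std::string name, std::vector<std::string> deps) {
        return Module{std::move(deps),
                      [&events, name] { events.push_back("init " + name); },
                      [&events, name] { events.push_back("shutdown " + name); }};
    };
    DependencyManager mgr;
    mgr.add("app", mod("app", {"net", "core"}));
    mgr.add("net", mod("net", {"core"}));
    mgr.add("core", mod("core", {}));
    mgr.initAll();
    assert(mgr.order().size() == 3);
    mgr.shutdownAll();
    return events;
}
```

The depth-first walk is effectively the topological sort mentioned above, and the `inProgress_` set turns a dependency cycle into an exception instead of unbounded recursion.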