# Managing `operator new` visibility in gcc

novel bronze
#

I have a program that uses a custom memory allocator via an arena that needs to be initialized on program start. This is built to run hundreds of thousands of test cases with 0 memory leaks. I would ideally like to compile this as a DLL, but the problem is that operator new is still visible, even when the visibility is explicitly set to hidden. From what I understand, this is a gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81107

So, my two options are:

  • Disable operator new for DLLs, resulting in memory leaks
  • Preserve operator new for DLLs, resulting in, well...
$ python
Python 3.12.7 (main, Oct  1 2024, 11:15:50) [GCC 14.2.1 20240910] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pynoko
[Heap.cc:139] WARN: HEAP ALLOC FAIL: Cannot allocate 8 from heap (nil)

zsh: segmentation fault (core dumped)  python

My ideal third option is that operator new is scoped to the DLL and isn't exposed to anything else. I would like to avoid switching to clang if at all possible - anyone have any ideas?

next badger
novel bronze
#

although I suppose the secret fourth option is to just fix the memory leaks and not rely on the custom allocator, but in the very long run fixing memory leaks is not good for perf

next badger
#

i guess no one knows... sad

balmy heart
# novel bronze I have a program that uses a custom memory allocator via an arena that needs to ...

just to make sure I understand the context

you initially had some dll built through gcc/g++
that dll had a variety of memory leaks, to fix them you replaced operator new to displace the standard allocation function to use an arena allocator without changing the rest of the code of the dll
that replacement operator new was made hidden/non-exported to try to limit it to the one dll, but it still displaced operator new in other binaries due to an "unexpected behaviour" between operator new and visibility

the options you're exploring are

  1. not fix the dll's memory leak
  2. keep your fix and make the dll break a bunch of binaries/apps that try to use the dll
  3. managing to use your replacement only for the dll in some way (original intended behaviour I assume)
  4. ditch the custom allocator/allocation function to fix the dll directly

did I get that right?

#

as for whether this behaviour with the visibility of operator new is a bug, it's sort of debatable, though I agree that it's most likely unexpected
everything that has to do with dll/so/dynamic libraries is more or less "outside the purview of the standard", so there are just a bunch of weird interactions which are more or less broken/unspecified/underspecified

normally from the moment you have some operator new anywhere, it's meant to be the replacement, no questions asked. there isn't really any phrasing for it being the replacement in one portion of the program but not another, and that's arguably a desirable property, because in general you cannot mix and match allocators
in the sense that if you new'd something through one allocator and try to delete it through another allocator, you're gonna have a bad time
admittedly you're expected to have the one "global allocator", and then use allocator aware entities for when you do not want to use that global allocator, and yes in your scenario that's not convenient, because that would probably mean changing a bunch of code throughout the dll
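
A minimal sketch of what "allocator aware entities" can look like, assuming C++17 and `std::pmr`. The names `arena_storage`, `in_arena` and `vector_uses_arena` are hypothetical and not from the project in question. Nothing here replaces the global `operator new`; only the container that is handed the arena draws from it:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <vector>

// Hypothetical arena: a fixed buffer handed to a monotonic resource.
alignas(std::max_align_t) static std::byte arena_storage[4096];

// Returns true if p points inside arena_storage.
bool in_arena(const void* p) {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    auto begin = reinterpret_cast<std::uintptr_t>(arena_storage);
    return addr >= begin && addr < begin + sizeof(arena_storage);
}

bool vector_uses_arena() {
    // null_memory_resource as upstream: exhausting the arena throws
    // std::bad_alloc instead of silently falling back to the global heap.
    std::pmr::monotonic_buffer_resource arena(
        arena_storage, sizeof(arena_storage), std::pmr::null_memory_resource());

    std::pmr::vector<int> values(&arena);  // allocator-aware container
    values.reserve(16);
    values.push_back(42);
    assert(values.size() == 1);

    return in_arena(values.data());
}
```

The cost is the one mentioned above: every container or object that should live in the arena has to be told so explicitly, which means touching the code that creates it.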

#

to address a different "issue", imo your fix isn't really a fix
from what I understand of your description, you're using a global arena allocator for your dll fix
once the program ends, the global is destroyed and ends up "properly" releasing any and all memory/allocation, including the leaky ones

if that's what you're doing, this is just a way to maybe silence warnings/sanitizers, it isn't inherently less "leaky"

next badger
#

it sounds like some arcane

#

replace operator new in entire dll

#

compiled code

#

i thought it would require changing the dll itself

balmy heart
# novel bronze although I suppose the secret fourth option is to just fix the memory leaks and ...

if you have "real" leaks, I would argue that for performance you probably want to fix them
a long running process that keeps on leaking memory will eventually run out of memory, and that can manifest in a variety of ways which usually are non-nice/undesirable
typically a crash
and arguably a crash is a worse outcome than a maybe slower but correct/complete execution

if you have sort-of-false-leaks, e.g. the kind where you don't keep on grabbing more memory uncontrollably but always stay under some known limit, maybe because you have some global objects for which you had no good way to know when to destroy them because you handle them through raw pointers, and you don't need to run their destructors, and you follow whichever guideline (I think it was one of google's) that recommends leaking them to avoid complex interactions between dlls and the static destruction order fiasco and all that, then from a practical pov those "leaks" don't matter/are intentional

next badger
#

is it a valid way to re-run the process using the start* unix api?
it will free all memory of the old process, right?

if that's possible, you could maybe rerun the app when it takes too much memory

balmy heart
#

to address the exact question you were asking

> My ideal third option is that operator new is scoped to the DLL and isn't exposed to anything else. I would like to avoid switching to clang if at all possible - anyone have any ideas?
never done that so no, no clue
if you figure out something I guess I'm a bit interested, but there's a point where if gcc doesn't support it, your hands are tied
the global replacement being global is arguably intentional

if my understanding of your actual problem is somewhat correct, arguably option 1: leaking is mostly the same in the end, up to tooling/sanitizer diagnostics
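
For reference, standard C++ does have one non-global form of replacement: a class-scoped `operator new`. The sketch below is an assumption-laden illustration, not the project's code; `g_arena`, `arena_alloc`, `ArenaObject` and `Kart` are hypothetical names. It does not answer the gcc visibility question, and it does not cover `new T[n]` or allocations made inside standard containers:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical bump arena standing in for the project's heap.
alignas(std::max_align_t) static unsigned char g_arena[1024];
static std::size_t g_arena_used = 0;

void* arena_alloc(std::size_t size) {
    assert(size != 0);
    constexpr std::size_t align = alignof(std::max_align_t);
    std::size_t rounded = (size + align - 1) / align * align;
    if (g_arena_used + rounded > sizeof(g_arena)) throw std::bad_alloc();
    void* p = g_arena + g_arena_used;
    g_arena_used += rounded;
    return p;
}

// Types deriving from ArenaObject allocate from the arena; everything else
// (including code in other binaries) keeps the default global operator new.
struct ArenaObject {
    static void* operator new(std::size_t size) { return arena_alloc(size); }
    static void operator delete(void*) noexcept {}  // memory is reclaimed in bulk
};

struct Kart : ArenaObject {
    int speed = 0;
};

bool kart_lives_in_arena() {
    Kart* kart = new Kart;  // resolves to ArenaObject::operator new
    auto addr = reinterpret_cast<std::uintptr_t>(kart);
    auto begin = reinterpret_cast<std::uintptr_t>(g_arena);
    bool inside = addr >= begin && addr < begin + sizeof(g_arena);
    delete kart;  // runs ~Kart, then the no-op class-scoped operator delete
    return inside;
}
```

Because the operators are class members, no global allocation symbol is defined at all, so there is nothing for another binary to pick up.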

next badger
#

you just need to save runtime data in some file

actually may be tough, but it works and eliminates any mem leak issue

unix provides some functions for querying free memory, but they're not 100% accurate, so you may want to rerun at some % of used memory

novel bronze
#

I feel as though a lot of things are being misrepresented here so I will go through them one by one

novel bronze
# balmy heart just to make sure I understand the context you initially had some dll built thr...

I say I'm compiling for DLL - it's a standalone program that other people are attempting to make into a library, and I am trying to accommodate them as best as I can

this standalone program uses a custom memory allocator for a few reasons, the first and foremost being because obviously I'm not perfect and leaks will happen. I'm partially joking when I say in the long term fixing memory leaks is not good for perf, but there's a grain of truth to that because frankly, it would be faster just to zero the entire memory space. the cleanup happens in such a way that it ensures there are no dangling pointers, and is thus safe

it also helps development because it provides consistent (though different) behavior in release and debug builds (release zeroes the memory and debug fills the memory with an arbitrary value)
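
A rough sketch of that release/debug split, assuming the build distinguishes the two via `NDEBUG`. The fill value `0xCD` and the names `kResetFill`, `reset_arena` and `reset_is_uniform` are hypothetical stand-ins:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fill values: zero in release, an arbitrary
// recognizable pattern in debug.
#ifdef NDEBUG
constexpr unsigned char kResetFill = 0x00;
#else
constexpr unsigned char kResetFill = 0xCD;
#endif

// Overwrites the whole arena so stale contents look the same on every run.
void reset_arena(unsigned char* base, std::size_t size) {
    assert(base != nullptr);
    std::memset(base, kResetFill, size);
}

// Dirties a small buffer, resets it, and checks every byte holds the fill.
bool reset_is_uniform() {
    unsigned char arena[64];
    std::memset(arena, 0x5A, sizeof(arena));
    reset_arena(arena, sizeof(arena));
    for (unsigned char byte : arena) {
        if (byte != kResetFill) return false;
    }
    return true;
}
```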

options 1 & 2 are the already existing options, which I am ideally hoping to avoid, because they both have drawbacks. I'm wondering about option 3 in particular because it's the best of both worlds

novel bronze
#

I was holding out some hope that someone who uses gcc would be familiar enough with this kind of problem to know how to get around it (since clang has been able to do this for ~6 years now)

balmy heart
#

> the first and foremost being because obviously I'm not perfect and leaks will happen.
as far as I'm concerned that's the worst reason you could have, but that's opinionated I guess

> I'm partially joking when I say in the long term fixing memory leaks is not good for perf, but there's a grain of truth to that because frankly, it would be faster just to zero the entire memory space. the cleanup happens in such a way that it ensures there are no dangling pointers, and is thus safe
this can work out fine in a variety of scenarios, yes, but that's really only for memory
because you're posting this in c++ related contexts, it has to be mentioned that memory leaks in c++ are very commonly also object leaks, and leaking an object that is not trivially destructible is a big meh
I cannot actually judge because I don't know what precisely you're leaking, and it's also unclear if your code depends on proper destruction of objects or not, but I assume it doesn't depend on that
so the next issue about memory leak is running out of usable memory, and that's also not clear if you have that problem or not

if some people are trying to use the code in your DLL and your DLL is to be provided as a library, it's unclear to me what the memory usage will end up like
what is it that the DLL exposes, one function that runs once and then cleans up everything once it's done? then it's not actually a leak by virtue of what your arena allocator does, I guess?
or can the user create some object whose lifetime you do not strictly control, where the cleanup may not happen until a long time after the fact, and maybe the object could grab more and more memory that it pseudo-leaks, creating issues?

they are never true memory leaks
can the user turn it into a true memory leak or can he not

novel bronze
#

> I'm partially joking when I say in the long term fixing memory leaks is not good for perf
without going into too much detail here, perf really matters for this project. the goal is to run Mario Kart Wii's physics simulations thousands of times more efficiently than e.g. an uncapped emulator

#

> because you're posting this in c++ related contexts, it has to be mentioned that memory leaks in c++ are very commonly also object leaks, and leaking an object that is not trivially destructible is a big meh
any objects that aren't trivially destructible typically get handled by a "Disposer" class. when the scene is exited, and the memory space gets zeroed, it will first iterate through all Disposers and properly call their destructors
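
A simplified sketch of that scheme, to make the ordering concrete. Only the name `Disposer` comes from the description above; `registry`, `Track`, `exit_scene` and `demo_exit_scene` are hypothetical, and the real class may work differently. Destructors run first, and only then is the memory zeroed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical base class: registers itself so the scene can find it later.
// For brevity the registry itself lives on the normal heap, not in the arena.
class Disposer {
public:
    Disposer() { registry().push_back(this); }
    virtual ~Disposer() = default;
    Disposer(const Disposer&) = delete;
    Disposer& operator=(const Disposer&) = delete;

    static std::vector<Disposer*>& registry() {
        static std::vector<Disposer*> live;
        return live;
    }
};

int g_destroyed = 0;  // counts destructor runs for the demo

struct Track : Disposer {
    ~Track() override { ++g_destroyed; }
};

// Runs every registered destructor in reverse order, then wipes the arena.
void exit_scene(unsigned char* arena, std::size_t size) {
    assert(arena != nullptr);
    auto& live = Disposer::registry();
    for (auto it = live.rbegin(); it != live.rend(); ++it) {
        (*it)->~Disposer();  // explicit virtual destructor call, no deallocation
    }
    live.clear();
    std::memset(arena, 0, size);
}

int demo_exit_scene() {
    alignas(std::max_align_t) static unsigned char arena[256];
    g_destroyed = 0;
    new (arena) Track;       // placement new into the arena
    new (arena + 64) Track;
    exit_scene(arena, sizeof(arena));
    return g_destroyed;
}
```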

#

> if some people are trying to use the code in your DLL and your DLL is to be provided as a library, it's unclear to me what the memory usage will end up like
it's rather up in the air for me I'm afraid. I don't work with DLLs at all, that behavior is rather unpredictable to me

#

> can the user turn it into a true memory leak or can he not
nope!

balmy heart
#

> it also helps development because it provides differing consistent behavior across release and debug builds (release zeroes the memory and debug fills the memory with an arbitrary value)
whatever value the memory ends up holding/representing arguably does not matter unless you have invalid memory accesses... usually I'd argue that only the property of "debug" filling the memory with canary values is relevant, but some defensive coding styles disagree
like there are a variety of other constructs in the language that are kinda broken even if you still zero out the memory, but there's a point where this can interact with your other custom constructs and toolings to bring about various benefits, so I guess? if you're sensitive to performance issues I'd rather not zero-out the memory though

> I'm wondering about option 3 in particular because it's the best of both worlds

> I was holding out some hope that someone who uses gcc would be familiar enough with this kind of problem to know how to get around it (since clang has been able to do this for ~6 years now)
well I can't say I'm an in-depth gcc user, there are definite drawbacks to displacing the allocator in only one binary, but I assume you're probably familiar with those issues
there's no way clang would be able to deal with such issues by itself, so you'd need to have your code structure bypass those issues

#

this sounds more and more like some ecs

novel bronze
balmy heart
#

ok maybe not, it's just that the word popped up in my mind

#

that thing yeah

novel bronze
#

ehh, kinda

balmy heart
novel bronze
#

it matches behavior. perf matters, but while the program is still being developed, accuracy matters more

balmy heart
#

matches behaviour... with what?

novel bronze
#

what's observed in Mario Kart Wii

balmy heart
#

I think I'll bow out then

#

I don't have an answer to your original question, and this setup fundamentally doesn't make sense to me

#

if zero-ing destroyed objects affects behaviour, imo it's an issue that objects that replace the destroyed one aren't initialized properly

#

at the very least that's how usual c++ is intended to behave

#

if it's not a matter of making new objects in the location of older objects, then you're trying to access non-existing objects

novel bronze
#

yes I think this has gone rather off-topic

#

I have not explained myself clearly (which I'm admittedly bad at in general) nor do I think I will be quite so capable of doing so

#

I think I agree that there is no solution to the original question, so I will mark as resolved

#

!solved

next badger
#

xD

#

well played