#binary streams, char_traits, how many bits?


dusk zephyr
#

While replacing C constants with C++ equivalents I ran into the problem where I assumed all chars were numerically the same, such that C's CHAR_BIT is 8 while C++'s std::numeric_limits<char>::digits gives me 7.

I then went down the rabbit hole of trying to shift types around until I get an 8 from numeric_limits. std::uint8_t seems like a decent candidate.

I was met by a whole bunch of compiler complaints about my chosen type not fitting an existing type covered by std::char_traits.

Do I just plow through with a magic 8 in my code? Is it wrong to assume char has eight digits?

Am I supposed to use a different streaming class from std::basic_iostream? I just want to read in bytes and treat them like eight-bit data morsels without the compiler yelling at me and without using magic numbers. Is my understanding of C++ completely broken?

jaunty forum
#

What are you even trying to do?

dusk zephyr
#

Sure. I am reading from a socket using POSIX api. I've wrapped the file descriptor with an admittedly hacky std::basic_streambuf derived class.

My problem is

```cpp
requestId = buf[2] << std::numeric_limits<char_type>::digits & buf[3];
```
#

std::numeric_limits<char_type>::digits unexpectedly evaluates to 7
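A minimal sketch of the byte-combining step, assuming the `&` in the snippet was meant to be `|` and that `buf[2]` is the high byte. The helper name `combine_be16` is hypothetical. Going through `unsigned char` keeps a negative `char` from sign-extending into the high byte when it is promoted to `int`.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint16_t

// Hypothetical helper: combine two bytes (high byte first) into 16 bits.
// Assumes 8-bit bytes, so that two of them fit in a std::uint16_t.
inline std::uint16_t combine_be16(char high, char low) {
    // Convert to unsigned char first: a plain char may be signed, and a
    // negative value would otherwise be sign-extended on promotion.
    const unsigned hi = static_cast<unsigned char>(high);
    const unsigned lo = static_cast<unsigned char>(low);
    return static_cast<std::uint16_t>((hi << CHAR_BIT) | lo);
}
```

With this, the line above could read `requestId = combine_be16(buf[2], buf[3]);`.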

jaunty forum
#

How about you use 8 instead?

dusk zephyr
#

That is literally my question. I don't want to use a magic number and the non-magic number that C++ provides is 7, whereas the non-magic number C provides is 8.

#

Why do these differ? Why is programming "properly" this hard?

jaunty forum
#

Apparently signed characters get 7 bits of precision, the sign does not count as precision.

#

And it's implementation-defined whether char is signed or unsigned, so it varies by compiler.

#

I think the issue is that you are looking for the number of bits and the tool you used gives you the precision. Those are not always the same and happen to differ here.
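The difference between width and precision can be checked at compile time. This is a sketch with hypothetical constant names. It relies on `digits` counting only non-sign bits, so `unsigned char` reports the full byte width while `signed char` reports one bit less.

```cpp
#include <climits>   // CHAR_BIT
#include <limits>    // std::numeric_limits

// Width of a byte in bits: every bit counts, including a sign bit.
constexpr int bits_in_char = CHAR_BIT;

// digits counts only value (non-sign) bits.
constexpr int value_bits_signed   = std::numeric_limits<signed char>::digits;
constexpr int value_bits_unsigned = std::numeric_limits<unsigned char>::digits;

static_assert(value_bits_unsigned == bits_in_char,
              "unsigned char has no sign bit");
static_assert(value_bits_signed == bits_in_char - 1,
              "signed char spends one bit on the sign");
```

So `std::numeric_limits<unsigned char>::digits` is one C++-only spelling of the same number that `CHAR_BIT` gives.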

dusk zephyr
#

Yes, I understand this. My problem is that I don't care about implementation details like that. I just want to process a byte stream using the classes provided by C++. If I use std::byte that's a can of worms because it has zero digits. If I use std::uint8_t that's a can of worms because basic_iostream wants the type to be covered by char_traits which it is not.

jaunty forum
#

I don't think going away from char is a good idea.

dusk zephyr
#

Then where do I get the number of bits? The translation guide for C constants to C++ constants is not precise.

jaunty forum
#

Stick with char and just keep using CHAR_BIT to pretend it's not hardcoded.

dusk zephyr
#

oh noes. then I will be tainting my C++ code with dirty nasty C constants.

jaunty forum
dusk zephyr
#

C++ posits that 8 is a universal constant that should be known and understood by everyone in existence regardless of context. got it.

jaunty forum
#

A universal constant for the number of bits in a byte.

#

If you use it for something else it might be less obvious.

dusk zephyr
#

That is my point. There is no name for it in C++, only C.

jaunty forum
#

C++ inherited almost everything from C for a reason 🤷

#

Replacing all of C with a C++-equivalent is not a goal.

dusk zephyr
#

Yet C++ has been duplicating just about every part of C in its own way since it popped into existence

#

I'm not here to argue with you about this. I am venting. Please excuse me.

jaunty forum
#

Whenever C++ thinks it can do better. It succeeded with strings and std::sort is an improvement over qsort, but it seems like a constant that defines the number of bits in a byte has not been improved upon yet.

dusk zephyr
#

Thank you for confirming my suspicions however.

jaunty forum
#

I vaguely remember that some earlier C++ (and C?) standards said a byte has at least 5 bits, but I believe that has been removed.

dusk zephyr
#

Conclusion: don't be a dummy and expect C++ to have an equivalent for every C constant. CHAR_BIT is the only place to "learn" that a byte indeed has eight bits.

#

!solved

jaunty forum
#

So, in theory there could be a system with 47 bit chars.

#

Though, I think combined with sizeof(char) == 1 there is not much chance of that.

dusk zephyr
#

Avid users of such systems are likely rolling in their graves at the mention of a char less than 36 bits.

jaunty forum
#

I don't think any exist, but the standard committee seems deathly afraid of the possibility of a platform C++ is not suited for.

dusk zephyr
#

I used C89 for 10 years, extensively. I then used C# for another 10 years. Now I am dabbling in C++ and I am easily frustrated by apparent-to-me deficiencies. There are all these fancy constants dangling off various classes in C# and C++, while C goes brrrr with preprocessor constants everywhere. Heaven forbid I try using ancient C code with C++ and have to explain in a comment every time I invoke reinterpret_cast or const_cast...

The headers I use are so old that read-only pointers aren't even marked const and I have to const_cast everywhere.

jaunty forum
#

That does sound like a huge pain.

dusk zephyr
#

C++ treating the underlying memory as pure and only accessible through special wizardry is kind of a pain. I get it, but it's still painful. I just want to come in like a wrecking ball as one does in C.

jaunty forum
#

For the specific code you're writing I would just not bother. I'd argue that your version is a lie. It pretends it can deal with platforms where bytes are not 8 bits, but it surely cannot. Thus, hardcoding the 8 and maybe adding a static_assert is better than writing misleading code.

dusk zephyr
#

i think i did that in one of my early projects. static_assert(CHAR_BIT == 8); and other nonsense

#

this code will likely not even run on a different platform, ever. i just like having the peace of mind that i'm not using magic numbers.
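The hardcode-plus-static_assert idea can still avoid a bare 8 by giving it a name. A small sketch that works in C++17, with `bits_per_octet` as a hypothetical name:

```cpp
#include <climits>   // CHAR_BIT

// Hypothetical name for the octet width the wire format assumes.
constexpr int bits_per_octet = 8;

// Refuse to compile on a platform whose bytes are not octets,
// instead of pretending the code would adapt to it.
static_assert(CHAR_BIT == bits_per_octet,
              "this code assumes 8-bit bytes");
```

The constant documents the protocol's assumption, and the assertion records that the platform has to match it.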

jaunty forum
dusk zephyr
#

that's like a C++20 or 23 thing right?

jaunty forum
#

probably

dusk zephyr
#

i'm stuck with 17 for the foreseeable future.

#

i'm lucky i'm not stuck with the previous compiler that was partial 11 support

#

partial 11 is THE WORST.

#

at least it's not pre-standard C++ (the one i learned in school)

#

we didn't even have smart pointers there

jaunty forum
dusk zephyr
#

good to know it's in the pipeline