catch - A simple fetch in C | Together C & C++ | Page 1

twilit perch May 17, 2024, 8:20 AM

#

I made a simple fetch in C for GNU+Linux systems.
It was designed to be easy to understand, fast, and memory safe.
I would love feedback on it

📎 catch.c

twilit perch May 17, 2024, 4:25 PM

#

@tardy wyvern Its this

#

according to valgrind, it is memory safe

#

super fast execution

#

its pretty much instant

tardy wyvern May 17, 2024, 4:29 PM

#

twilit perch @tardy wyvern Its this

Not bad; tbf I don't really write too many unix utilities myself so I am not sure of best practices but you may have better luck looking at say how neofetch is implemented 🙂

twilit perch May 17, 2024, 4:30 PM

#

tardy wyvern Not bad; tbf I don't really write too many unix utilities myself so I am not sur...

I just want to know if the code itself and the design is good

#

I designed it to be divided in functions, each function does 1 thing and is independent from everything else

#

the only exception would be free_character_arr(), which provides an easy way to free character arrays made by split()

tardy wyvern May 17, 2024, 4:33 PM

#

twilit perch I designed it to be divided in functions, each function does 1 thing and is inde...

Yea that's a good approach from a unix philosophy standpoint - have one thing only and have that one thing do the thing well or smth

tardy wyvern May 17, 2024, 4:34 PM

#

twilit perch the only exception would be `free_character_arr()` Which provides a way to free ...

That just seems like a cleanup function which is ok too

twilit perch May 17, 2024, 4:34 PM

#

tardy wyvern Yea that's a good approach from a unix philosophy standpoint - have one thing on...

I didn't exactly aim for that, but to lower code block sizes and not make main() a nonsensical and extremely long block of code

#

Just imagine main() if all that code was merged into it without the functions

tardy wyvern May 17, 2024, 4:35 PM

#

twilit perch I didnt exactly aim for that, but to lower code block sizes and not make `main()...

That's good; you should see how I coded my main function for my Cuda mergesort; I didn't bother too much with creating some auxiliary functions and have a main function that's like 150 lines long yamikek

twilit perch May 17, 2024, 4:36 PM

#

tardy wyvern That's good; you should see how I coded my main function for my Cuda mergesort; ...

emil

#

maintainability = -1

tardy wyvern May 17, 2024, 4:37 PM

#

twilit perch Just imagine main() if all that code was merged into it without the functions

use functions for more modularity? ❌

shove everything into main to save overhead of function calls? ✅

twilit perch May 17, 2024, 4:37 PM

#

speaking of maintainability, catch should be easy to maintain and to add code to. Each function is responsible for printing one specific piece of information, so all a dev who wants to contribute to catch has to do is write a function, verify that it works on their machine, and copy-paste it in

#

at least in theory

tardy wyvern May 17, 2024, 4:38 PM

#

twilit perch speaking of maintainability, `catch` should be easy to maintain, and add additio...

Do you know how to unit test?

twilit perch May 17, 2024, 4:39 PM

#

tardy wyvern Do you know how to unit test?

no

twilit perch May 17, 2024, 4:42 PM

#

tardy wyvern use functions for more modularity? ❌ shove everything into main to save overhea...

even with that overhead, catch prints out almost instantly

#

You should try running it on your own machine (If you're on GNU+Linux)

tardy wyvern May 17, 2024, 4:44 PM

#

twilit perch no

https://youtu.be/kZ2Nco4-39s

https://github.com/Snaipe/Criterion

Give these a shot and put into your program and see if they catch any bugs 🙂

YouTube

Charles Cabergs

Unit testing in C with criterion

Finally, a clean and simple library to do unit test in C and C++.

criterion: https://github.com/Snaipe/Criterion
criterion doc: https://criterion.readthedocs.io/en/master/index.html


GitHub

GitHub - Snaipe/Criterion: A cross-platform C and C++ unit testing ...

A cross-platform C and C++ unit testing framework for the 21st century - Snaipe/Criterion

tardy wyvern May 17, 2024, 4:44 PM

#

twilit perch You should try running it on your own machine (If you're on GNU+Linux)

I will once my stomach cooperates with me again 🙃

twilit perch May 17, 2024, 4:45 PM

#

tardy wyvern I will once my stomach cooperates with me again 🙃

welp, good luck

tardy wyvern May 17, 2024, 4:45 PM

#

twilit perch even with that overhead, `catch` prints out almost instantly

I was joking about the overhead; pushing and popping stuff on and off the stack is blazing fast anyway these days 😛

twilit perch May 17, 2024, 4:55 PM

#

tardy wyvern I was joking about the overhead pushing and popping stuff on and off the stack i...

emil Some of the functions use the heap

#

I love the heap though

#

the entire split() function returns a heap allocated array in which each element is a pointer to a heap allocated character array
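For readers without catch.c at hand, here is a minimal sketch of the layout described above: one heap-allocated array of pointers, each pointing at its own heap-allocated string, plus the matching cleanup function. The signatures, the NULL terminator at the end of the array, and the single-character delimiter are assumptions made for illustration and may not match the real split() and free_character_arr().

```c
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated array of heap-allocated strings.
 * Hypothetical signature; the real function in catch.c may differ. */
void free_character_arr(char **arr)
{
    if (arr == NULL)
        return;
    for (size_t i = 0; arr[i] != NULL; i++)
        free(arr[i]);
    free(arr);
}

/* Splits text on delim. Returns a NULL-terminated array of newly
 * allocated strings, or NULL if any allocation fails. */
char **split(const char *text, char delim)
{
    /* One field more than the number of delimiters. */
    size_t count = 1;
    for (const char *p = text; *p != '\0'; p++)
        if (*p == delim)
            count++;

    char **parts = malloc((count + 1) * sizeof *parts);
    if (parts == NULL)
        return NULL;

    const char *start = text;
    for (size_t i = 0; i < count; i++) {
        const char *end = strchr(start, delim);
        size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);

        parts[i] = malloc(len + 1);
        if (parts[i] == NULL) {
            /* parts[i] is NULL, so it already terminates the array. */
            free_character_arr(parts);
            return NULL;
        }
        memcpy(parts[i], start, len);
        parts[i][len] = '\0';

        if (end != NULL)
            start = end + 1;
    }
    parts[count] = NULL;
    return parts;
}
```

Every successful split() call is paired with exactly one free_character_arr() call, which is what keeps a tool like valgrind quiet about leaks.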

tardy wyvern May 17, 2024, 4:59 PM

#

twilit perch emil Some of the functions use the heap

Heap can be fast too if you're not crossing too many page boundaries and/or it's all contiguous in memory 🙂

twilit perch May 17, 2024, 5:00 PM

#

tardy wyvern Heap can be fast too if you're not crossing too many page boundaries and/or it's...

I mean its not like im making something where nanoseconds count

tardy wyvern May 17, 2024, 5:02 PM

#

twilit perch I mean its not like im making something where nanoseconds count

But you're going into embedded tho aren't you trollcrab

twilit perch May 17, 2024, 5:02 PM

#

tardy wyvern But you're going into embedded tho aren't you trollcrab

I have yet to use the heap with arduino

tardy wyvern May 17, 2024, 5:04 PM

#

twilit perch I have yet to use the heap with arduino

https://arduino.stackexchange.com/questions/682/is-using-malloc-and-free-a-really-bad-idea-on-arduino

👀

Arduino Stack Exchange

Is using malloc() and free() a really bad idea on Arduino?

The use of malloc() and free() seems pretty rare in the Arduino world. It is used in pure AVR C much more often, but still with caution.

Is it a really bad idea to use malloc() and free() with Ard...

twilit perch May 17, 2024, 5:05 PM

#

tardy wyvern https://arduino.stackexchange.com/questions/682/is-using-malloc-and-free-a-reall...

fragmentation is funny, isnt it?

#

I understood "Dont use Heap with arduino cuz you will run out of memory due to fragmentation" from that

tardy wyvern May 17, 2024, 5:07 PM

#

twilit perch I understood "Dont use Heap with arduino cuz you will run out of memory due to f...

Basically use malloc sparingly and only when absolutely necessary, and when using it just allocate absolutely everything you need in one go so that fragmentation is minimized 🙂
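One way to read "allocate everything in one go" is a bump allocator: a single malloc() up front, with later requests handed out from that block. The sketch below is an illustration in plain C11, not Arduino-specific code, and the names `struct arena`, `arena_init`, `arena_alloc` and `arena_free` are made up for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* One big block; allocations are carved off the front and never
 * freed individually, so the block itself cannot fragment. */
struct arena {
    unsigned char *base;
    size_t cap;
    size_t used;
};

/* Returns 0 on success, -1 if the single malloc() fails. */
int arena_init(struct arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->cap = (a->base != NULL) ? cap : 0;
    a->used = 0;
    return (a->base != NULL) ? 0 : -1;
}

/* Returns n bytes aligned for any object type, or NULL when full. */
void *arena_alloc(struct arena *a, size_t n)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;

    if (start > a->cap || n > a->cap - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Releases everything at once. */
void arena_free(struct arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = 0;
    a->used = 0;
}
```

The trade-off is that nothing can be returned to the arena piecemeal, which suits programs that allocate at startup and then run in a fixed footprint.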

twilit perch May 17, 2024, 5:07 PM

#

tardy wyvern Basically use malloc sparingly and only when absolutely necessary, and when usin...

Should i learn a specific coding style?

#

or should i continue with my own natural one?

tardy wyvern May 17, 2024, 5:09 PM

#

twilit perch or should i continue with my own natural one?

Not sure actually, sorry; I'm not far enough in coding to answer that even for myself 😔

twilit perch May 17, 2024, 5:10 PM

#

https://tenor.com/view/fade-away-oooooooooooo-aga-emoji-crumble-gif-20008708

Tenor

twilit perch May 17, 2024, 5:11 PM

#

tardy wyvern Not sure actually sorry I'm not far enough in coding either to answer that for m...

I wonder what my ChatGPT jailbreak will say

twilit perch May 17, 2024, 5:41 PM

#

tardy wyvern Not sure actually sorry I'm not far enough in coding either to answer that for m...

ChatGPT jailbreak said this:

#

I just got another idea to jailbreak ChatGPT

twilit perch May 18, 2024, 5:50 AM

#

tardy wyvern Yea that's a good approach from a unix philosophy standpoint - have one thing on...

It does have an issue though

#

since the printing functions are independent, they eventually become magic for the user, and even the person that wrote it (me)

#

Like seriously, I can just forget about how it works and just print_os() without a care in the world

#

is that good or bad

tardy wyvern May 18, 2024, 6:27 AM

#

twilit perch is that good or bad

sorry I'm not quite sure what you mean :/

tardy wyvern May 18, 2024, 6:32 AM

#

twilit perch super fast execution

got a compiler warning; program works tho 👀

twilit perch May 18, 2024, 7:09 AM

#

tardy wyvern got a compiler warning; program works tho 👀

that's because the program doesn't have to care what fread() returns

twilit perch May 18, 2024, 8:07 AM

#

tardy wyvern sorry I'm not quite sure what you mean :/

my own program (catch) became magic for me, is that good?

#

It's really easy to forget how the functions that print information work

tardy wyvern May 18, 2024, 2:30 PM

#

twilit perch my own program (catch) became magic for me, is that good?

no you really want to document why it works so that you don't forget down the line 🙂

tardy wyvern May 18, 2024, 2:33 PM

#

twilit perch that's because the program doesn't have to care what `fread()` returns

then you might just want to suppress the compiler warning via https://stackoverflow.com/questions/3378560/how-to-disable-gcc-warnings-for-a-few-lines-of-code but I wouldn't recommend it, as you're technically outsmarting the compiler at that point 🤷‍♂️ I would actually do something with what fread() returns, e.g. check for any potential errors
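A minimal sketch of using fread()'s return value instead of silencing the warning. The helper name `read_text` and its interface are hypothetical and not taken from catch.c; the point is only that the returned byte count tells you where to put the terminating NUL, and ferror() tells you whether a short read was an error or just end of file.

```c
#include <stdio.h>

/* Reads up to cap - 1 bytes from fp into buf and NUL-terminates it.
 * Returns the number of bytes read, or -1 on bad arguments or a
 * read error. Hypothetical helper for illustration. */
long read_text(FILE *fp, char *buf, size_t cap)
{
    if (fp == NULL || buf == NULL || cap == 0)
        return -1;

    size_t n = fread(buf, 1, cap - 1, fp);

    /* A short read is normal at end of file; it is only a failure
     * if the stream's error indicator is set. */
    if (n < cap - 1 && ferror(fp))
        return -1;

    buf[n] = '\0';
    return (long)n;
}
```

Without the count there is no reliable way to know how much of the buffer holds valid data, so printing it as a string may run past what was actually read.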

Stack Overflow

How to disable GCC warnings for a few lines of code

In Visual C++, it's possible to use #pragma warning (disable: ...). Also I found that in GCC you can override per file compiler flags. How can I do this for "next line", or with push/pop semantics ...

twilit perch May 18, 2024, 5:11 PM

#

@tardy wyvern Also, ive been thinking of making a very simple archive tool similar to tar

tardy wyvern May 18, 2024, 5:11 PM

#

twilit perch @tardy wyvern Also, ive been thinking of making a very simple archive to...

Good luck with that 😅

twilit perch May 18, 2024, 5:12 PM

#

tardy wyvern Good luck with that 😅

(I am slowly becoming mentally insane)

twilit perch May 18, 2024, 5:14 PM

#

tardy wyvern Good luck with that 😅

I mean it shouldn't be that hard, right?

tardy wyvern May 18, 2024, 5:15 PM

#

twilit perch I mean it shouldn't be that hard, right?

That's what the beginners usually say yes yamikek

twilit perch May 18, 2024, 5:17 PM

#

```
    Archive
{------------------------------}
|____________| |_| |______| |_|
   Data         |     |      Terminator (Signals EOF)
                |   Metadata
                Metadata terminator (signals beginning of metadata)
```

#

I was thinking of this way to make it work

#

(yes i spent a bunch of time drawing that)

tardy wyvern May 18, 2024, 5:25 PM

#

twilit perch Archive {------------------------------} |____________| |_| |______| |_...

Picking the right terminators and data representations will likely not be a trivial task; best understand something like zip or tar and try and reimplement something similar so that you don't experience full pain of reinventing not just a wheel but the whole ecosystem 🙂

twilit perch May 18, 2024, 6:23 PM

#

tardy wyvern Picking the right terminators and data representations will likely not be a triv...

the terminators themselves are gonna be hard though

#

since a file can have any kind of data, i need terminators that wont get mixed

tardy wyvern May 18, 2024, 6:26 PM

#

twilit perch since a file can have any kind of data, i need terminators that wont get mixed

Have the terminator be a unique sequence of characters, and also have a header which includes the payload size; that's maybe the simplest thing I can think of
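A small sketch of why a stored size is more robust than a terminator when the payload can contain any byte. The frame layout used here (an 8-byte little-endian length followed by the payload) and the function names are assumptions for illustration, not the Sar format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Terminator approach: the payload ends at the first terminator byte,
 * which is wrong as soon as the payload itself contains that byte. */
size_t length_by_terminator(const unsigned char *buf, size_t n,
                            unsigned char term)
{
    const unsigned char *hit = memchr(buf, term, n);
    return (hit != NULL) ? (size_t)(hit - buf) : n;
}

/* Length-prefix approach: [8-byte little-endian length][payload].
 * Returns the number of bytes written, or 0 if dst is too small. */
size_t frame_write(unsigned char *dst, size_t cap,
                   const void *payload, size_t len)
{
    if (cap < 8 || len > cap - 8)
        return 0;
    for (int i = 0; i < 8; i++)
        dst[i] = (unsigned char)((uint64_t)len >> (8 * i));
    memcpy(dst + 8, payload, len);
    return 8 + len;
}

/* Returns a pointer to the payload and stores its length in *len_out,
 * or returns NULL if the stored length exceeds the bytes available. */
const unsigned char *frame_read(const unsigned char *src, size_t avail,
                                size_t *len_out)
{
    if (avail < 8)
        return NULL;

    uint64_t len = 0;
    for (int i = 0; i < 8; i++)
        len |= (uint64_t)src[i] << (8 * i);

    if (len > avail - 8)
        return NULL;   /* truncated or tampered frame */
    *len_out = (size_t)len;
    return src + 8;
}
```

The bounds check in frame_read() is what stops a manipulated or truncated file from making the reader run past the end of its buffer.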

twilit perch May 18, 2024, 6:27 PM

#

tardy wyvern Have the terminator be a unique sequence of characters and also have a header wh...

What about generating a random 1 byte binary value for each archive and using that as the terminator? kekw

tardy wyvern May 18, 2024, 6:29 PM

#

twilit perch What about generating a random 1 byte binary value for each archive and using that as t...

Hey if you wanna really feel the pain of reinventing a whole new standard be my guest yamikek

twilit perch May 18, 2024, 6:30 PM

#

tardy wyvern Hey if you wanna really feel the pain of reinventing a whole new standard be my ...

it was meant to be a joke

#

What about unsigned chars as terminators?

twilit perch May 18, 2024, 6:31 PM

#

tardy wyvern Hey if you wanna really feel the pain of reinventing a whole new standard be my ...

Also, would it be possible to use binary as a terminator? (like 0xblablabla)

tardy wyvern May 18, 2024, 6:37 PM

#

twilit perch Also, would it be possible to use binary as a terminator? (like `0xblablabla`)

If you write in C then you'll probably have to use something like https://stackoverflow.com/questions/17598572/how-to-read-write-a-binary-file#17598785 to do the binary IO

Stack Overflow

How to read/write a binary file?

I'm trying to write to a binary file, read from it, and output to the screen.
I can write to a file, but when I try to read from it, it is not outputting correctly.

twilit perch May 18, 2024, 6:41 PM

#

tardy wyvern If you write in C then you'll probably have to use something like https://stacko...

I know how to do binary I/O, I just need a binary terminator

tardy wyvern May 18, 2024, 6:43 PM

#

twilit perch I know how to do binary I/O, I just need a binary terminator

I think you might be better off just explicitly storing payload labels and sizes...

twilit perch May 18, 2024, 6:43 PM

#

tardy wyvern I think you might be better off just explicitly storing payload labels and sizes...

💀

#

ive got an idea, i should be able to write hexadecimal binary

#

with unsigned chars

twilit perch May 18, 2024, 6:45 PM

#

tardy wyvern I think you might be better off just explicitly storing payload labels and sizes...

Question is: what should i name the project?

tardy wyvern May 18, 2024, 6:45 PM

#

twilit perch 💀

Bruh don't 💀 me when your stuff encounters buffer overflow or gets randomly truncated lol

tardy wyvern May 18, 2024, 6:46 PM

#

twilit perch Question is: what should i name the project?

ltar for learner's tar? Lol?

twilit perch May 18, 2024, 6:46 PM

#

tardy wyvern ltar for learner's tar? Lol?

what about sar which stands for "Stupid Archive"

tardy wyvern May 18, 2024, 6:46 PM

#

twilit perch what about `sar` which stands for "Stupid Archive"

Sure 🙂

twilit perch May 18, 2024, 6:46 PM

#

trollcrab

tardy wyvern May 18, 2024, 6:47 PM

#

twilit perch trollcrab

So you're not gonna be storing the file sizes explicitly?

tardy wyvern May 18, 2024, 6:47 PM

#

tardy wyvern So you're not gonna be storing the file sizes explicitly?

In your metadata that is?

twilit perch May 18, 2024, 6:48 PM

#

tardy wyvern So you're not gonna be storing the file sizes explicitly?

the filesize can be enumerated from the size of the data block, metadata is only for filename and type

tardy wyvern May 18, 2024, 6:50 PM

#

twilit perch the filesize can be enumerated from the size of the data block, metadata is only...

So basically if I went in and edited a sar file I can technically cause it to read the entire file in as a single data block bc I maliciously manipulated the terminators? 🙃

twilit perch May 18, 2024, 6:50 PM

#

twilit perch Archive {------------------------------} |____________| |_| |______| |_...

this is an example of how an archive with a single file is structured, the same structure can be copied over and over for a gigantic archive with many files

tardy wyvern May 18, 2024, 6:51 PM

#

twilit perch this is an example on how an archive with a singular file is structured, the sam...

I would also encourage you to store a checksum to detect tampering

twilit perch May 18, 2024, 6:51 PM

#

tardy wyvern I would also encourage you to store a checksum to detect tampering

Im pretty sure any attacker would also change the checksum too

#

for tampering detection, the users are better off using cryptographic methods such as signing

#

first im gonna try to write hexadecimal binary for terminators

tardy wyvern May 18, 2024, 7:07 PM

#

twilit perch Im pretty sure any attacker would also change the checksum too

You're basically saying "because an attacker will go to any lengths to do x" that you'll just do the least ideal thing possible and just be done with it; I don't really agree with that mentality but if you want to create a new archive standard where I can just truncate your archive or cause a buffer overflow trivially then ig that's on you 🤷‍♂️

twilit perch May 18, 2024, 7:12 PM

#

tardy wyvern You're basically saying "because an attacker will go to any lengths to do x" tha...

Im making it to learn

#

yeah boii

#

I can write hex binary

twilit perch May 20, 2024, 3:23 PM

#

its not a compression standard or anything, Im just putting file data in a big file along with their metadata such as name

#

it isnt rocket science

twilit perch May 20, 2024, 3:25 PM

#

tardy wyvern So basically if I went in and edited a `sar` file I can technically cause it to ...

I suppose i could reserve the first few bytes for archive metadata (which specifies where stuff is at)

#

instead of using terminators

twilit perch May 20, 2024, 8:29 PM

#

@tardy wyvern Also, does sizeof(long) change across machines or is it only a compiler portability thing?

tardy wyvern May 20, 2024, 9:21 PM

#

twilit perch @tardy wyvern Also, does `sizeof(long)` change across machines or is it...

It does change across machines esp. if you wanna switch to a micro-controller; if you explicitly want 64 bit int you'll have to #include <stdint.h> and then do sizeof(int64_t) as well as redeclare all your longs to int64_ts 🙂
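To make that concrete: on 64-bit Linux `long` is 8 bytes, while on 64-bit Windows and on AVR microcontrollers it is 4 bytes, so both the machine and the compiler's data model matter. A short sketch of the fixed-width alternative follows; the helper name `format_size` is made up for this example.

```c
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Fails at compile time on any platform where this does not hold. */
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64,
               "int64_t is exactly 64 bits");

/* Formats a 64-bit value portably. PRId64 expands to the correct
 * printf conversion for int64_t on the current platform.
 * Returns what snprintf() returns. */
int format_size(char *out, size_t cap, int64_t size)
{
    return snprintf(out, cap, "%" PRId64 " bytes", size);
}

/* Prints how wide the two types are on the machine running this. */
void print_type_sizes(void)
{
    printf("sizeof(long)    = %zu\n", sizeof(long));
    printf("sizeof(int64_t) = %zu\n", sizeof(int64_t));
}
```

Using `%ld` for an `int64_t` happens to work where `long` is 64 bits and breaks elsewhere, which is why the `PRId64` macro from `<inttypes.h>` is the safer choice.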

twilit perch May 21, 2024, 9:34 AM

#

tardy wyvern It does change across machines esp. if you wanna switch to a micro-controller; i...

emil

twilit perch May 21, 2024, 10:14 AM

#

tardy wyvern It does change across machines esp. if you wanna switch to a micro-controller; i...

Also im gonna make the part that reads first

#

After that ill make the archive maker

twilit perch May 21, 2024, 6:36 PM

#

@tardy wyvern btw, i made a github repo for Sar

#

im planning on Sar becoming a real thing rather than one for learning

#

https://github.com/U2C9727A4/Sar

GitHub

GitHub - U2C9727A4/Sar: Sar, short for "Stupid's archive".

Sar, short for "Stupid's archive". Contribute to U2C9727A4/Sar development by creating an account on GitHub.

tardy wyvern May 21, 2024, 6:44 PM

#

twilit perch im planning on Sar becoming a real thing rather than one for learning

then you better put the size of each data chunk within the metadata portion of the archive lol

twilit perch May 21, 2024, 6:48 PM

#

tardy wyvern then you *better* put the size of each data chunk within the metadata portion o...

uhh what?

#

the data portion is reserved for the file data inside of the archive, it doesn't even have a name. Why would i need to do that?

#

the size of the data block is defined in the reserved space of the uint64_ts

tardy wyvern May 21, 2024, 8:00 PM

#

twilit perch the size of the data block is defined in the reserved space of the uint64_ts

Exactly my point is to store data block sizes explicitly instead of implicitly lol

twilit perch May 21, 2024, 8:01 PM

#

tardy wyvern Exactly my point is to store data block sizes *explicitly* instead of implicitly...

thats what im doing

#

it is stored outside of metadata in a reserved portion called archive metadata

tardy wyvern May 21, 2024, 8:01 PM

#

twilit perch thats what im doing

👍🔥

tardy wyvern May 21, 2024, 8:02 PM

#

twilit perch it is stored outside of metadata in a reserved portion called archive metadata

metametadata kekw

twilit perch May 21, 2024, 8:03 PM

#

tardy wyvern 👍🔥

I called it "Stupid's Archive" for a reason

#

its because its stupidly simple

tardy wyvern May 21, 2024, 8:05 PM

#

twilit perch I called it "Stupid's Archive" for a reason

My dad is data scientist and told me before actually metametadata exists iirc 🙂

twilit perch May 21, 2024, 8:05 PM

#

tardy wyvern My dad is data scientist and told me before actually metametadata exists iirc 🙂

emil

tardy wyvern May 21, 2024, 8:06 PM

#

twilit perch emil

No not ur archive tool; it's metametadata I'm talking about lol

twilit perch May 21, 2024, 8:06 PM

#

i sent that emoji to metametadata

tardy wyvern May 21, 2024, 8:21 PM

#

twilit perch i sent that emoji to metametadata

So you store different data about each data block, and there can be multiple data blocks, and then the archive metadata includes data about the whole archive including data about the data about each data block? Is this your design and does that make sense?

twilit perch May 21, 2024, 8:23 PM

#

tardy wyvern So you store different data about each data block, and there can be multiple dat...

This is the structure of an archive with just a singular file

#

the same structure can be copied over and over again for an archive with many files

#

heck, you can even merge 2 different archive files into a single one

#

the design very well allows that and shouldn't require extra effort inside the source code

#

merging the archives shouldn't even have to do any processing, you just get arc1 and arc2 and just combine em' without doing anything special

#

you should be able to use dd to combine them too

tardy wyvern May 21, 2024, 8:49 PM

#

twilit perch This is the structure of an archive with just a singular file

Ok so does Archive metadata contain data that summarizes/describes what's in the metadata block if that makes sense?

tardy wyvern May 21, 2024, 8:51 PM

#

twilit perch you should be able to use dd to combine them too

That means that all you did was just put some extra wrappers around a file and called that an archive yamikek

twilit perch May 22, 2024, 4:58 AM

#

tardy wyvern That means that all you did was just put some extra wrappers around a file and c...

isnt that literally what an archive is?

twilit perch May 22, 2024, 5:04 AM

#

tardy wyvern That means that all you did was just put some extra wrappers around a file and c...

I was aiming for a literal "tape archive" (Its literally stitching together files with extra wrappers), so here we are

twilit perch May 22, 2024, 4:50 PM

#

tardy wyvern Ok so does Archive metadata contain data that summarizes/describes what's in the...

archive metadata is 4 uint64_t integers, they define where metadata block and data block starts and ends at
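A sketch of what those four integers could look like in C, together with the sanity check a reader would want before trusting them. The field names, the half-open `[start, end)` convention, and the rule that the two blocks must sit after a 32-byte header without overlapping are all assumptions made for this example; the actual Sar layout may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define SAR_HEADER_SIZE 32u   /* four uint64_t fields, 8 bytes each */

/* Hypothetical field names for the four uint64_t integers. */
struct sar_header {
    uint64_t meta_start;   /* offset of first metadata byte */
    uint64_t meta_end;     /* offset one past the last metadata byte */
    uint64_t data_start;   /* offset of first data byte */
    uint64_t data_end;     /* offset one past the last data byte */
};

/* A range is usable if it lies after the header, inside the file,
 * and does not run backwards. */
static bool range_ok(uint64_t start, uint64_t end, uint64_t file_size)
{
    return start >= SAR_HEADER_SIZE && start <= end && end <= file_size;
}

/* Rejects headers whose offsets point outside the file or whose
 * metadata and data blocks overlap. */
bool sar_header_valid(const struct sar_header *h, uint64_t file_size)
{
    if (!range_ok(h->meta_start, h->meta_end, file_size))
        return false;
    if (!range_ok(h->data_start, h->data_end, file_size))
        return false;
    return h->meta_end <= h->data_start || h->data_end <= h->meta_start;
}
```

Checking the offsets against the real file size before seeking covers the earlier concern about a hand-edited archive making the reader treat the whole file as one block.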

#

It's just reserved for explicitly defining where stuff is at

#

This also means an archive can be any size, up to the 64 bit limit.

#

while also allowing metadata block to be easily resizeable for any purpose, which makes adding new information to the metadata extremely easy

tardy wyvern May 22, 2024, 5:24 PM

#

twilit perch archive metadata is 4 uint64_t integers, they define where metadata block and da...

yep that's data about data about data for you lol i.e. meta-metadata 🤭

twilit perch May 22, 2024, 5:27 PM

#

tardy wyvern yep that's data about data about data for you lol i.e. meta-metadata 🤭

I called it archive metadata

twilit perch May 22, 2024, 5:47 PM

#

tardy wyvern yep that's data about data about data for you lol i.e. meta-metadata 🤭

aaand now the extremely complex part starts

#

aand i just completely avoided it

#

trollcrab

#

guess how i avoided it

#

seriously though, take a guess

tardy wyvern May 22, 2024, 5:54 PM

#

twilit perch seriously though, take a guess

oversimplification lol

twilit perch May 22, 2024, 5:56 PM

#

tardy wyvern oversimplification lol

The issue was to get a hash of a data block that doesn't exist yet, so i just reserved 512 bytes of space for the hash at write_metadata() call
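The reserve-then-backfill trick can be sketched with fseek(): write a placeholder, write the data, then seek back and fill in the hash. This is not the code from Sar. The function name `write_block_with_hash` is made up, and 64-bit FNV-1a is used only as a simple stand-in; it is not a cryptographic hash and would not detect deliberate tampering.

```c
#include <stdint.h>
#include <stdio.h>

#define HASH_SLOT_SIZE 512   /* reserved space, as in the message above */

/* 64-bit FNV-1a, a simple non-cryptographic checksum (stand-in). */
uint64_t fnv1a64(const unsigned char *data, size_t len)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= UINT64_C(0x100000001b3);
    }
    return h;
}

/* Writes [512-byte slot][data] at the current position, then seeks
 * back and stores the hash in the first 8 bytes of the slot
 * (little-endian). Returns 0 on success, -1 on any I/O error. */
int write_block_with_hash(FILE *fp, const unsigned char *data, size_t len)
{
    static const unsigned char zeros[HASH_SLOT_SIZE];

    long slot = ftell(fp);               /* remember where the slot is */
    if (slot < 0)
        return -1;
    if (fwrite(zeros, 1, sizeof zeros, fp) != sizeof zeros)
        return -1;
    if (fwrite(data, 1, len, fp) != len)
        return -1;

    uint64_t h = fnv1a64(data, len);
    unsigned char digest[8];
    for (int i = 0; i < 8; i++)
        digest[i] = (unsigned char)(h >> (8 * i));

    if (fseek(fp, slot, SEEK_SET) != 0)  /* jump back to the slot */
        return -1;
    if (fwrite(digest, 1, sizeof digest, fp) != sizeof digest)
        return -1;
    return (fseek(fp, 0, SEEK_END) == 0) ? 0 : -1;
}
```

Note that ftell() returns a `long`, so on platforms where `long` is 32 bits this sketch cannot address offsets past 2 GiB.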

tardy wyvern May 22, 2024, 6:01 PM

#

twilit perch The issue was to get a hash of a data block that doesn't exist yet, so i just res...

that's fine lol

twilit perch May 22, 2024, 6:01 PM

#

tardy wyvern that's fine lol

its stupid, but it works

#

hence the name of the archive project trollcrab

#

i called it "Stupid's Archive" for a reason

#

it implies the author (me) is stupid
it implies it is stupidly easy to use
it implies it is stupidly simple
it implies the code can be understood by someone stupid

#

just like me

twilit perch Jun 12, 2024, 4:35 PM

#

@tardy wyvern Aye

#

I found a really nice vscode extension

#

https://open-vsx.org/extension/llvm-vs-code-extensions/vscode-clangd
