dumper: making a GCC plugin
Hi everyone, it has been a while since I wrote something, so I hope you enjoy and test my program (it is a plugin, though!), and also read this blog post.
Dumper
You can read the README.org file in the source code; however, I'm going to give a quick summary anyway. Dumper is a GCC plugin that generates C code and also prints out some information about the C structures that it finds within the source files where you are using it.
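To give an idea of what a generated dump function (a "dumpfunc") could look like, here is a hand-written sketch. The structure `point`, the function name `dump_point`, and the output format are all made up for illustration; dumper's real output may look quite different.

```c
#include <stdio.h>

/* Hypothetical structure that a program might want to inspect. */
struct point {
    int x;
    int y;
};

/* Hypothetical dumpfunc: prints every field of a struct point to `out`.
   Returns whatever fprintf returns: the number of characters written,
   or a negative value on error. */
int dump_point(FILE *out, const struct point *p)
{
    return fprintf(out, "struct point { x = %d, y = %d }\n", p->x, p->y);
}
```

Calling something like `dump_point(stderr, &p)` at an interesting spot in the program is then enough to see the state of `p` without attaching a debugger.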
Here be dragons
The dragons' den is located where the documentation should be. In other words, the hardest part was the documentation: sometimes it was helpful and sometimes not, but it was seldom enough. A lot of people think that the source files are the documentation, and sometimes they are; however, they are not always helpful, and reading them may take too much effort for something that is not our main focus. So I think documentation must be important in any project, and even more so in a project that provides an API. Not everyone wants to, or has the time to, delve into the source code, and even having the time does not guarantee "productivity", since the code still has to be read and understood.
I think that C is a magical language, in the sense of how it makes me feel when I program (like being a wizard!); however, it is also magical because it can be used in some cryptic ways. Heck, that is why the IOCCC exists! The only limit is your knowledge of how the final binary can be manipulated from C. My point is… reading and understanding the source files is hard for technical reasons (knowledge of the standards, old ISO C, the underlying architecture and OS, etc.), but it may be even harder for anthropological reasons.
My friend and foe: recursion
Well… the GCC plugin infrastructure, and I guess almost all of the GCC code, at least from what I read (the GENERIC source files), is built in such a way that recursion is used and abused. Everything is a tree, and different kinds of tree provide different functionality, while the very same tree functions can be used on all of them. It becomes quite easy once you start reading the source files (because the documentation is not enough). The dumper code uses and abuses recursion too; I do not mind at all.
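The "everything is a tree" idea can be sketched with a toy model. To be clear, this is not GCC code: `struct node`, `enum node_kind`, and `count_scalars` are invented for this example. In GCC itself the corresponding pieces are the `tree` type and accessors such as `TREE_CODE` and `TYPE_FIELDS`, none of which are used here. The sketch only shows the shape of the recursion: a record is walked field by field, and a field that is itself a record is handled by the very same function.

```c
#include <stddef.h>

/* Toy stand-in for a compiler tree node (NOT GCC's real `tree`). */
enum node_kind { NODE_INT, NODE_POINTER, NODE_RECORD };

struct node {
    enum node_kind kind;
    const struct node *inner; /* first field when kind is NODE_RECORD */
    const struct node *next;  /* next sibling field, or NULL */
};

/* Counts the scalar (int or pointer) members reachable from `n`,
   descending recursively into nested records. */
size_t count_scalars(const struct node *n)
{
    if (n == NULL)
        return 0;

    switch (n->kind) {
    case NODE_INT:
    case NODE_POINTER:
        return 1;
    case NODE_RECORD: {
        size_t total = 0;
        /* Same function for every field, whatever its kind. */
        for (const struct node *f = n->inner; f != NULL; f = f->next)
            total += count_scalars(f);
        return total;
    }
    }
    return 0;
}
```

A dumpfunc generator would presumably do the same kind of walk, but emit a line of C for each member instead of counting it.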
My concern, multithreading
It pretty much works; nevertheless, I do not know whether there will be problems due to multithreading. It is working, though, at least on Linux on an x86_64 machine; this may be thanks to Linux buffering, and it may fail on any other architecture or OS.
Static analysis
Ultimately this is static analysis, and there may be problems if I misunderstood the standard; however, the static analysis is just a side effect of the real purpose, which is to generate dumpfuncs.
It seemed ad hoc-ish, so I refactored everything in one day
Check ac59552, it was a mess. After that I had to refactor the code to make it look cleaner and to separate some features. I thought that I could just write spaghetti code until it worked and then refactor. Nonetheless, I could not resist refactoring it: it was an incremental mess, harder and harder each time. Now it is better! I just wanted to rant about it.
Errata & disclaimer
Notice this is a direct excerpt from the README file.
I made it for my own sake. I have been developing and debugging other programs and I needed to write my own dumpfuncs when intrusive debugging was not enough or when it was being too intrusive for the programs. I have therefore tested dumper on some programs that I'm hacking on; it may not work, or it may output incorrect dumpfuncs. If that is the case, please let me know and I will check it out; however, I encourage you to check the code and hack on it.
Also notice that I am using it on Linux (x86_64) with GCC version 12.2, so it works with that version, and I'm not sure for now whether it will work on other OSes, architectures, and so on; just like before, let me know and I will take a look at the issue.
Future ideas
- A binary output, with tools that can generate the code from it later, rather than generating the final output right away.
- Add more flags.
- Perhaps embed the aforementioned binary output within the ELF data.
- Add more static analysis options.