wingolog: A mostly dorky weblog by Andy Wingo
https://wingolog.org/

abusing the c compiler
Andy Wingo, 2010-09-07
https://wingolog.org/2010/09/07/abusing-the-c-compiler

code reading

Today I found something really neat in Larceny's foreign function interface. The deal is that oftentimes you need to know the layout of a C structure or the value of a preprocessor definition, and man, parsing C makes a body feel lazy. What's a hacker to do?

Larceny has an amusing take on this problem. The code looks straightforward enough:

;; parse out ent->d_name as a string
(define (dirent->name ent)
  (define-c-info (include<> "dirent.h")
    (struct "dirent" (name-offs "d_name")))
  (%peek-string (+ ent name-offs)))

The define-c-info block calculates name-offs, which is the byte offset of the d_name field within the dirent structure. %peek-string is something internal to Larceny that takes the memory address of a NUL-terminated C string and returns a Scheme string.

I had imagined, looking at this, that they had some kind of database of the headers and such, and in a sense they do -- in the form of the C compiler. define-c-info is a macro that runs the C compiler at macro expansion time, compiling and running a generated C program that spits out the relevant information as an s-expression on its stdout.


[Figure: some people like diagrams]

So in this case, if the d_name field starts 11 bytes into the structure, the generated C program will print out (11) on its stdout, and that number gets read in and inserted into the program. In that way dirent->name expands to something like:

(define (dirent->name ent)
  (define name-offs 11)
  (%peek-string (+ ent name-offs)))

Cool, no? The C compiler is only needed at compile-time, not at run-time.

Further details can be found in Felix Klock's 2008 paper on Larceny's FFI.