5. Antiquoted C¶
WARNING: Cogent antiquotation currently does not have any sanity checks. So please follow the guide closely otherwise debugging could be a huge pain.
5.1. Cheatsheet¶
Note
New users: please skip this section.
Antiquotes | Explanation |
$id |
for function identifiers or type identifiers when defining them |
$ty |
refer to a Cogent type |
$exp |
call a Cogent function, which has to have exactly one argument, as in Cogent; any Cogent expressions |
$spec |
specify the Cogent type of a C function (using typecast syntax), which is applied to exactly one argument |
$esc |
macros that should not be preprocessed before antiquoted C is compiled |
5.2. Overview¶
Cogent is a restricted language; there are many things that Cogent cannot express and should be deferred to native C code. In the foreign function interfaces between Cogent and C, it can be seen that accessing C code from Cogent and the other direction are not symmetric, as Cogent will be generated to C code. To access C code from Cogent, programmers write abstract types (types without definitions) and abstract functions on the Cogent side , and on the C side define the implementation of the types and functions. For the compiler to realise which implementation is for which interface, we need to make sure that the name of both sides coincide. But there is a problem: Cogent is polymorphic, and is structurally types, whereas C is monomorphic and nominally typed. What it means is that the Cogent compiler will generate “random” names for the generated interfaces (of course they are not truly random, they are just so random that a programmer can barely guess what they will be; see #322 for more details). To make the names on both sides coincide, Cogent provides a way for programmers correctly “guess” the generated names—antiquotations. That’s why the C components one write for a Cogent program is generally referred to as antiquoted C (it’s because we borrow the syntax of antiquotes; see G. Mainland, Why it’s nice to be quoted: quasiquoting for Haskell). And for accessing Cogent code from C, programmers also need a way to refer to the Cogent names. In a nutshell, in antiquoted C, programmers can write Cogent snippets (e.g. Cogent types, polymorphic function names, and many other things) and the Cogent compiler will compile each snippet in the same way as it compiled the Cogent program, resulting in a coherent set of names generated.
5.3. Modes in the compiler¶
We have two different modes for handling antiquotation. One is type
mode, with command-line argument --infer-c-type
. In this mode,
users can only define abstract parametric Cogent types. The output will
be placed to pre-defined directory, one file per monomorphised type.
Each file is #include
-d by the generated .h
file. Another mode
is function mode, which is enabled by --infer-c-func
flag to the
compiler (note: these two modes are not mutually exclusive). This mode
is for everything else, and the output filename is derived from the
input filename.
5.4. Function definitions¶
We can define an abstract function which has been declared in Cogent. For example, in Cogent we have:
foo : all (a, b). a -> b
Then in C, we can define function foo
by
$ty:b $id:foo ($ty:a arg) {
// ...
}
$id
, $ty
are antiquotes for identifiers and types respectively.
If the antiquoted code consists of only one single identifier starting
with a lower-case letter, then no parenthesis are required (e.g.
$ty:acc
), otherwise we have to write parentheses around it, like
$ty:(R a b)
and $ty:(U32)
. Note: If the antiquoted type is unit
or a tuple, then we have to have at least two pairs of parentheses, the inner
one for the tuple, and the outer one for antiquotation.
Functions defined using antiquotation can be parametrically
polymorphic, ad hoc polymorphic or monomorphic.
$id
is mostly needed by poly-functions only. [1]
The reason behind it is, a mono-function will be generated to a C
function with exactly the same name (modulo unsupported identifier
characters), thus the function name will stay intact. For a
poly-function, as we can see from the example above, the user doesn’t
write all
quantification as in Cogent. Type parameters, if any, are
introduced implicitly by the corresponding declaration in Cogent,
namely a
and b
in the above instance. Besides, we need to
generate new function names according to the instantiation. For these
reasons, the compiler needs to know what and where the function name is,
in order to establish a connection to the Cogent counterpart and to
generate new function instantiations. Thus $id
is a needed as a
signal to the function name. The compiler then will generate one copy
for each monomorphisation.
In a function definition, other forms of antiquotes are supported. By
exp:f
, we can invoke Cogent function f
. (In fact, $exp
theoretically supports all kinds of Cogent expressions, but this feature
is not well tested and largely experimental. Use with care!) As in
Cogent, if this function f
is polymorphic, then it has to be fully
applied to types. To call higher-order functions, as the function and
its argument are usually not given by Cogent antiquotes (i.e. they are C
expressions), we cannot directly call it using the $exp
antiquotation as above. E.g. we have the following scenario:
void foo ($ty:((A -> B, A)) arg) {
// ...
}
To apply the first component of the pair arg.p1
to the second
arg.p2
, in order to generate the dispatch function, we have to give
the type of the function – arg.p1
– to the compiler. We write
(($spec:(A -> B)) arg.p1) (arg.p2); // the parens around type specifier and function is necessary!
The syntax is actually for typecasting in C, we hijack (or better, embed) our semantics in it. This satisfies our principle that everything inside an antiquote is valid Cogent code.
One thing also worth mentioning here is that, antiquoted functions (no matter first order or higher order) can only be applied to exactly one argument, as in Cogent. Otherwise it will generate totally nonsensical code and the error message from the C compiler will not help in general. We are trying to implement some sanity checks in the antiquotation part.
5.5. Type declarations / Typedef’s¶
Similarly, we can define abstract Cogent types using antiquotation. For example,
-- Cogent
type R a b
type T a b c
// Antiquoted-C
struct $id:(R a b) {
// ...
};
typedef struct $id:(T x y z) {
// ...
} $id:(T x y z);
typedef struct $id:(R a b) $id:(R a b);
Most of the knowledge about it can be deduced from previous section,
which will not be repeated here. One difference is that users need to
write fully applied type constructors, namely with type arguments, and
they have to be identical to those given in Cogent. When using
typedef
, only one synonym can be given, if it’s antiquoted. And it
has to be the same as the type it is defined to. Something like
typedef struct $id:(X a) $id:(Y a)
is invalid.
Non-parametric abstract types cannot be used in this way, otherwise they will be put to the wrong output file. In order to refer to any Cogent types in the definition, what the users can do is to NOT antiquote the type name, and use it in the function mode, as the type name in C will be exactly the same as that in Cogent (modulo namespace renaming). E.g.,
-- Cogent
type R
// Antiquoted-C
struct $id:(C) { ... }; // wrong!
struct C { ... }; // correct!
5.6. Escape sequences¶
Any C code which is beyond the reach of the Haskell C parser
(http://hackage.haskell.org/package/language-c-quote) should be wrapped
by a $esc
. In particular, if you have any #include
’ed files that
don’t want to be preprocessed (usually for the reason that they contain
some language extensions which our C parser does not support), use
$esc
antiquote to escape.
Cogent also supports conditional compilation in the style of cpp (C
preprocessor). Directives (e.g. #define
, #if
, etc.) should also
be wrapped in $esc
so that they are left to the C compiler, instead
of (mistakenly) being processed by Cogent’s C preprocessor. For
statement level directives, you need the alternative $escstm
antiquote specifier rather than $esc
.
5.7. Expressions¶
We can antiquote any valid Cogent expressions, using $exp
antiquote.
They will be turned to statement-expression in C.
Footnotes
[1] | One special case is that, if you have an abstract monomorphic function which
uses a parametric abstract type (but instantiated, of course) that is not used anywhere else in
your Cogent program, and if $id is not used on the function name,
then this function will not be processed by the monomorphiser, thus always
dumped to the final artifact. But the parametric type, because no Cogent
function uses it (or no Cogent function specified in the --entry-funcs flag uses it),
it will not be generated in the C code, leading to a used-but-not-defined type.
It happens when you import such a library, which contains functions that you
never use.
In this case, use the $id antiquote on the function name. It will treat the
function as a poly-function, thus processed by the monomorphiser. The compiler can then know
that you indeed don’t use this function and won’t produce it in the final C program. |