Evaluation
The semantics of C++ programs in BRiCk are written in weakest (liberal) pre-condition style. The general form of these rules is the following:
Note that wp
is a predicate in our separation logic (the fact that it returns a PROP
).
Informally you can think of it as capturing the pre-condition to the inputs (one of which is normally an expression) that are sufficient such that the code is safe and if the expression terminates, it terminates in a state in which its outputs satisfy the “continuation” (i.e. the final function argument to wp
).
Due to the structure of C++, BRiCk contains a separate weakest pre-condition modality for each syntactic category. These are defined in bedrock.lang.cpp.logic.wp.
Expressions
In BRiCk we break expression evaluation down into four weakest pre-condition assertions representing the different value categories of a C++ expression.
Modeling Temporaries
In the course of evaluating C++ programs, the language can construct objects that are later destroyed, this occurs for automatic storage duration variables (i.e. stack allocated variables and temporaries). C++ semantics guarantees that the lifetime of temporaries is well-bracketed, meaning that objects will be destroyed in the reverse order that they were constructed. In BRiCk we capture the stack of objects to be destroyed using the type t.
Inductive t : Set :=
| id (* = fun x => x *)
| delete (ty : decltype) (p : ptr) (* = delete_val ty p *)
(* ^^ this type has qualifiers but is otherwise a runtime type *)
| delete_va (va : list (type * ptr)) (p : ptr)
| seq (f g : t) (* = fun x => f $ g x *)
| par (f g : t) (* = fun x => Exists Qf Qg, f Qf ** g Qg ** (Qf -* Qg -* x) *)
.
Here id represents the identity, characterizing that nothing needs to be destroyed.
delete ty p
represents that the value at pointer p
(which should have type ty
) needs to be destroyed.
Note that to delete the value, the C++ abstract machine runs the destructor and reclaims the underlying memory.
Virtual dispatch is not used when invoking the destructor.
seq a b
means that the temporaries in a
must be destroyed before the temporaries in b
.
par a b
means that the temporaries in a
and the temporaries in b
are destroyed in an interleaved manner 1.
The meaning of these constructs is made precise by interpreting the syntax using interp.
#[local] Definition interp_body `{Σ : cpp_logic, σ : genv}
(interp : translation_unit -> FreeTemps -> epred -> mpred)
(tu : translation_unit) (free : FreeTemps) (Q : epred) : mpred :=
match free with
| FreeTemps.id => |={top}=> Q
| FreeTemps.seq f g => interp tu f $ interp tu g Q
| FreeTemps.par f g => |={top}=> Exists Qf Qg, interp tu f Qf ** interp tu g Qg ** (Qf -* Qg -* |={top}=> Q)
| FreeTemps.delete ty addr => destroy_val tu ty addr Q
| FreeTemps.delete_va va addr => |={top}=> addr |-> varargsR va ** Q
end.
mlock Definition interp `{Σ : cpp_logic, σ : genv}
: translation_unit -> FreeTemps -> epred -> mpred :=
fix interp tu free :=
interp_body interp tu free.
#[global] Arguments interp {_ _ _ _} _ free Q : assert. (* set names *)
l-values & x-values
l-values and x-values follow the same general structure.
Their weakest precondition rules are captured by wp_lval and wp_xval respectively (we show wp_lval
as our example).
(* [wp_lval σ E ρ e Q] evaluates the expression [e] in region [ρ]
* with mask [E] and continutation [Q].
*)
Parameter wp_lval
: forall {resolve:genv}, translation_unit -> region -> Expr -> M ptr.
pr-values
The final value category of C++ (pr-values) is slightly more complex than l-values and x-values. The C++ standard describes them as follows:
These are characterized by two predicate transformers.
Operands
wp_operand is used to evaluate a operand of a primitive operator. These operands are always primitives, since operators that accept aggregates are de-sugared to functions or methods.
(* evaluate a prvalue that "computes the value of an operand of an operator"
*)
Parameter wp_operand : forall {resolve:genv}, translation_unit -> region -> Expr -> M val.
Unlike wp_init
and wp_prval
, operands return vals.
Because the value returned does not have an identity, there is nothing to destroy, so the FreeTemps
returned to the continuation destroys only the temporaries created when evaluating the expression.
Initializing Aggregates
wp_init handles initialization of aggregates. The parameter is the following:
(* evaluate a prvalue that "initializes an object".
[wp_init tu ρ ty p e Q] evaluates [e] to initialize a value of type [ty]
at location [p] in the region [ρ]. The continuation [Q] is passed the
[FreeTemps.t] needed to destroy temporaries created while evaluating [e],
but does *not* include the destruction of [p].
The type [ty] and the type of the expression, i.e. [type_of e], are related
but not always the same. We call [ty] the *dynamic type* and [type_of e]
the *static type*. The two types should always be compatible, but the dynamic
type might have more information. For example, in the expression:
[[
int n = 7;
auto p = new C[n]{};
]]
When running the initializer to initialize the memory returned by [new],
the dynamic type will be [Tarray "C" 7], while the static type will be
[Tarray "C" 0] (the [0] is an artifact of clang).
The memory that is being initialized is already owned by the C++ abstract
machine. Therefore, schematically, a [wp_init ty addr e Q] looks like the
following: [[ addr |-> R ... 1 -* Q ]] This choice means that a thread
needs to give up the memory to the abstract machine when it transitions to
running a [wp_init]. In the case of stack allocation, there is nothing to
do here, but in the case of [new], the memory must be given up.
The C++ standard has many forms of initialization (see
<https://eel.is/c++draft/dcl.init>). The Clang frontend (and therefore our
AST) implements the different forms of initialization through elaboration.
For example, in aggregate pass-by-value the standard states that copy
initialization <https://eel.is/c++draft/dcl.init#general-14> is used;
however, the BRiCk AST contains an explicit [Econstructor] in the AST to
represent this. This seems necessary to have a uniform representation of
the various evoluations of initialization between different standards, e.g.
C++14, C++17, etc.
NOTE: this is morally [M unit], but we inline the definition of [M] and
ellide the [unit] value. *)
Parameter wp_init
: forall {resolve:genv}, translation_unit -> region ->
exprtype -> ptr -> Expr ->
(FreeTemps -> epred) -> (* free -> post *)
mpred. (* pre-condition *)
wp_init
takes a ptr that represents the location that the object is being constructed into.
Because of this, the post-condition does not consume a value but only a FreeTemps
used to destroy temporaries created during the evaluation of the expression.
To destroy the actual object constructed by wp_init ty into Q
, use FreeTemps.delete ty into
.
On top of wp_init
, we can define wp_prval
by universally quantifying the pointer that is being initialized and passing it to the continuation.
Definition wp_prval {resolve:genv} (tu : translation_unit) (ρ : region)
(e : Expr) (Q : ptr -> FreeTemps -> epred) : mpred :=
∀ p : ptr, wp_init tu ρ (type_of e) p e (Q p).
Note that the pointer p
is completely unconstrained in this definition.
In practice the C++ abstract machine will pick this pointer to be fresh and reserve it before proceeding to initialize during the evaluation of e
.
Initialization
Initialization is slightly more complex than wp_init
because you can initialize aggregates (using wp_init
), primitives (using wp_operand
), and references (using wp_lval
and wp_xval
).
To capture this, BRiCk defines wp_initialize
which provides the semantics to materialize a value into the C++ abstract machine state (which BRiCk reflects through separation logic assertions such as primR
).
wp_initialize
is defined by induction on the type of the value being initialized with special handling of const
qualifiers.
#[local] Definition wp_initialize_unqualified_body `{Σ : cpp_logic, σ : genv}
(u : bool) (tu : translation_unit) (ρ : region)
(cv : type_qualifiers) (ty : decltype)
(addr : ptr) (init : Expr) (Q : FreeTemps -> epred) : mpred :=
let UNSUPPORTED := funI m => |={top}=>?u UNSUPPORTED m in
if q_volatile cv then False%I
else
match ty with
| Tvoid =>
(*
[wp_initialize] is used to `return` from a function. The following
is legal in C++:
<<
void f();
void g() { return f(); }
>>
*)
letI* v, frees := wp_operand tu ρ init in
let qf := cQp.mk (q_const cv) 1 in
[| v = Vvoid |] **
(**
[primR] is enough because C++ code never uses the raw bytes
underlying an inhabitant of type void.
*)
(addr |-> primR Tvoid qf Vvoid -* |={top}=>?u Q frees)
| Tptr _
| Tmember_pointer _ _
| Tbool
| Tnum _ _
| Tchar_ _
| Tenum _
| Tfloat_ _
| Tnullptr =>
letI* v, free := wp_operand tu ρ init in
let qf := cQp.mk (q_const cv) 1 in
addr |-> tptsto_fuzzyR (erase_qualifiers ty) qf v -* |={top}=>?u Q free
(* non-primitives are handled via prvalue-initialization semantics *)
| Tarray _ _
| Tnamed _ => wp_init tu ρ (tqualified cv ty) addr init Q
| Tincomplete_array _ => UNSUPPORTED (initializing_type ty init)
| Tvariable_array _ _ => UNSUPPORTED (initializing_type ty init)
| Tref ty =>
let rty := Tref $ erase_qualifiers ty in
letI* p, free := wp_glval tu ρ init in (* TODO this needs to permit initialization from xval if [ty] is <<const>> *)
let qf := cQp.mk (q_const cv) 1 in
(*
[primR] is enough because C++ code never uses the raw bytes
underlying an inhabitant of a reference type.
TODO: [ref]s are never mutable, but we use [qf] here for
compatibility with [implicit_destruct_struct]
*)
addr |-> primR rty qf (Vref p) -* |={top}=>?u Q free
| Trv_ref ty =>
let rty := Tref $ erase_qualifiers ty in
letI* p, free := wp_xval tu ρ init in
let qf := cQp.mk (q_const cv) 1 in
(*
[primR] is enough because C++ code never uses the raw bytes
underlying an inhabitant of a reference type.
TODO: [ref]s are never mutable, but we use [qf] here for
compatibility with [implicit_destruct_struct]
*)
addr |-> primR rty qf (Vref p) -* |={top}=>?u Q free
| Tfunction _ => UNSUPPORTED (initializing_type ty init)
| Tqualified _ ty => |={top}=>?u False (* unreachable *)
| Tarch _ _ => UNSUPPORTED (initializing_type ty init)
| Tunsupported _ => UNSUPPORTED (initializing_type ty init)
| Tdecltype _ => UNSUPPORTED (initializing_type ty init)
| Texprtype _ => UNSUPPORTED (initializing_type ty init)
| Tparam _ | Tresult_param _ | Tresult_global _
| Tresult_unop _ _ | Tresult_binop _ _ _
| Tresult_call _ _ | Tresult_member_call _ _ _
| Tresult_member _ _ | Tresult_parenlist _ _ => UNSUPPORTED (initializing_type ty init)
end%I.
mlock
Definition wp_initialize_unqualified `{Σ : cpp_logic, σ : genv} :
∀ (tu : translation_unit) (ρ : region)
(cv : type_qualifiers) (ty : decltype)
(addr : ptr) (init : Expr) (Q : FreeTemps -> epred), mpred :=
Cbn (Reduce (wp_initialize_unqualified_body fupd_compatible)).
#[global] Arguments wp_initialize_unqualified {_ _ _ _} _ _ _ _ _ _ _%_I : assert. (* mlock bug *)
Definition wp_initialize `{Σ : cpp_logic, σ : genv} (tu : translation_unit) (ρ : region)
(qty : decltype) (addr : ptr) (init : Expr) (Q : FreeTemps -> epred) : mpred :=
qual_norm (wp_initialize_unqualified tu ρ) qty addr init Q.
#[global] Hint Opaque wp_initialize : typeclass_instances.
#[global] Arguments wp_initialize {_ _ _ _} _ _ !_ _ _ _ / : assert.
Function Call Semantics
The semantics for function calls is concerned with the way that we pass arguments to functions and (potentially) receive the return value. We note that it is important to handle the passing of primitives as well as aggregates, both of which are very common in C++. The semantics for function calls specifies how to pass arguments to functions and (potentially) get back the return value, both for primitives and for aggregates.
BRiCk follows the C++ standard by using initialization semantics to pass (and return) data to (and from) functions. It is also the style taken by Cerberus.
C++ leaves the lifetime of trivially destructible function arguments unspecified, as it is generally not visible to client programs. BRiCk follows the Itanium ABI as is documented in bedrock.lang.cpp.logic.call.
Destruction of function arguments is left implementation defined
in C++. See <https://eel.is/c++draft/expr.call#6> which states:
> It is implementation-defined whether the lifetime of a parameter
> ends when the function in which it is defined returns or at the
> end of the enclosing full-expression. The initialization and
> destruction of each parameter occurs within the context of the
> calling function.
In the BRiCk semantics, we follow the Itanium ABI which destroys
trivially destructible objects immediately and otherwise destroys
objects at the end of the full expression. See
<https://refspecs.linuxbase.org/cxxabi-1.83.html#call>.
Footnotes
- 1
par
can be used to describe the semantics of destruction in C, it is not needed for the semantics of C++ because C++ defines evaluation order as non-deterministic interleaving.