Blog: Project Evolution
We list some major developments this project has undergone since its inception, in reverse-chronological order.
Dec 4, 2023
New release is out! It contains all the features mentioned below in this blog. As always, both the source code and a Docker image for installation are provided. See the usage page for info on how to obtain and run CRAM.
Mid 2023
More advanced control-flow structures in Rust: CRAM now migrates C++'s ternary conditional operator (?:) into a Rust if expression. Switch statements are migrated into Rust pattern-matching (match) statements.
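As a hand-written illustration (not actual CRAM output; the function names abs_val and day_kind are made up for this sketch), the two migrations could look roughly like this, with the C++ originals shown in comments:

```rust
// C++: int abs_val(int x) { return x < 0 ? -x : x; }
fn abs_val(x: i32) -> i32 {
    // Rust has no ?: operator; `if` is itself an expression.
    if x < 0 { -x } else { x }
}

// C++: switch (day) {
//        case 1: case 2: case 3: case 4: case 5: return "weekday";
//        case 6: case 7: return "weekend";
//        default: return "invalid";
//      }
fn day_kind(day: u32) -> &'static str {
    match day {
        1..=5 => "weekday",
        6 | 7 => "weekend",
        _ => "invalid", // plays the role of `default:`
    }
}
```

Note that a Rust match has no fall-through, so C++ case labels that share a body are merged into a single arm.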
Also, CRAM now handles all kinds of C++ loops automatically, rather than asking for user assistance to detect the loop structure as in earlier versions. A C++ while loop behaves identically to a Rust while loop, including the break and continue escapes. C++ for loops are very powerful and don't exist in Rust in the same generality, so we detect many patterns that can be migrated to idiomatic Rust for loops that iterate over a sequence or a container. More complex C++ for loops whose idiomatic purpose we cannot abstract out of the code are migrated to Rust while loops.
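The three cases can be sketched as follows. This is a hand-written example under our own assumptions, not CRAM output, and the function names are hypothetical. The first two C++ loops match a recognizable pattern (counting over a range, walking a container); the third has a non-trivial update step and falls back to a while loop.

```rust
// C++: for (int i = 0; i < n; ++i) sum += i;
fn sum_below(n: i32) -> i32 {
    let mut sum = 0;
    for i in 0..n {
        sum += i;
    }
    sum
}

// C++: for (auto it = v.begin(); it != v.end(); ++it) total += *it;
fn total(v: &[i32]) -> i32 {
    let mut total = 0;
    for x in v {
        total += x;
    }
    total
}

// C++: for (unsigned long i = n; i != 1; i = (i % 2 == 0) ? i / 2 : 3 * i + 1) ++steps;
// No range or container is being iterated, so the loop becomes a `while`.
// Assumes n >= 1 (the loop would not terminate for n == 0, as in the C++ original).
fn collatz_steps(n: u64) -> u32 {
    let mut steps = 0;
    let mut i = n;
    while i != 1 {
        steps += 1;
        i = if i % 2 == 0 { i / 2 } else { 3 * i + 1 };
    }
    steps
}
```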
Early 2023
After a break, the project is back in full swing under a Phase-II contract.
C preprocessor handling: This is a beast and involves not only macros and conditional compilation, but also #include directives, which ultimately lead to the entire issue of code organization into multiple files, which is done very differently in Rust than in (legacy) C++.
We translate macros into compile-time constants or functions where possible. "Undisciplined" macros (those that are not independently parseable), such as #define INFO_MSG(M) "Info: " << (M), are inlined into the source.
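A minimal sketch of the three macro cases, written by hand for illustration. MAX_LANES, SQUARE, and info_line are hypothetical names; only INFO_MSG comes from the text above, and the exact code CRAM emits may differ.

```rust
// C++: #define MAX_LANES 8
// An object-like macro with a constant body becomes a compile-time constant.
const MAX_LANES: usize = 8;

// C++: #define SQUARE(x) ((x) * (x))
// A function-like macro whose body is a complete expression becomes a function.
fn square(x: i64) -> i64 {
    x * x
}

// C++: #define INFO_MSG(M) "Info: " << (M)
//      std::cout << INFO_MSG(name);
// The macro body is not an expression on its own, so it is expanded at the
// use site; the expanded stream output is written here with format!.
fn info_line(name: &str) -> String {
    format!("Info: {}", name)
}
```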
Regarding #includes: CRAM now first migrates legacy C++ projects, which consist of multiple translation units that each include header files, into C++ modules, which were introduced in C++20. There are no more #includes in the code after this refactoring. If you have access to a very recent C++ compiler, you can build this code. In any case, the C++ modules are then migrated to Rust modules, which requires a few changes because, as is almost always the case, same-named concepts in the two languages are not actually the same.
As a result, CRAM can now migrate C++ projects into Rust: programs that consist of many source files, with dependencies among them.
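To give a feel for the target of this migration, here is a small hand-written sketch with two hypothetical modules, geometry and report. Inline modules stand in for separate source files so that the example fits in one block; this layout is an assumption and not necessarily the one CRAM generates.

```rust
// Stand-in for a C++ module `geometry` (formerly a header plus a source file).
mod geometry {
    // `pub` plays the role of C++'s `export`.
    pub fn area(width: f64, height: f64) -> f64 {
        width * height
    }
}

// Stand-in for a second module that depends on the first.
mod report {
    // C++: import geometry;
    use super::geometry;

    pub fn describe(width: f64, height: f64) -> String {
        format!("area = {:.1}", geometry::area(width, height))
    }
}
```

One visible difference between the two languages: a Rust module names its dependency by path in the module tree (super::geometry here), whereas a C++ import refers to a module name that the build system resolves.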
Mid 2022
Intraprocedural refactoring of C++ const pointers into references: int* const p = &i becomes int& p = i. Such pointers essentially implement aliasing, which the refactored reference form makes explicit. It also prepares the code for a migration to Rust, where we also use references to implement such aliasing.
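Carried one step further to Rust, the refactored reference could end up as a mutable borrow. The following is a hand-written sketch (the function name bump is hypothetical), not CRAM output:

```rust
fn bump() -> i32 {
    let mut i = 1;
    // C++ before refactoring: int* const p = &i;  *p += 41;
    // C++ after refactoring:  int& p = i;          p += 41;
    let p = &mut i; // `p` itself cannot be re-pointed, just like the const pointer
    *p += 41;
    i
}
```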
Alias-nest resolution: Rust does not permit nests of reference variables pointing to the same memory location, unless they are all const. We implemented a refactoring that breaks up non-const alias nests by unifying the variable name used to refer to the memory location and eliminating all other reference variables.
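A small hand-written example of why this refactoring is needed (the names are hypothetical). The direct translation of the C++ alias nest, shown in the comment, is rejected by the Rust borrow checker; the unified form compiles.

```rust
// C++ before:
//   int total = 0;
//   int& a = total;
//   int& b = total;
//   a += 1;
//   b += 2;
//
// Direct Rust translation, rejected by the compiler:
//   let a = &mut total;
//   let b = &mut total; // error[E0499]: cannot borrow `total` as mutable more than once at a time
//   *a += 1;
//   *b += 2;
fn accumulate() -> i32 {
    let mut total = 0;
    // After alias-nest resolution, every access uses the one remaining name.
    total += 1;
    total += 2;
    total
}
```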
Mid 2022
Migrating C++ templates into Rust generics: this requires Rust traits that define the features any type instantiating a template parameter must provide. Sometimes Rust has such traits built in, such as the Add trait for all types defining a + binary operator. We can migrate templates requiring overloaded operators at this point.
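For instance, a template that only needs operator+ can be expressed as a generic function bounded by std::ops::Add. The sketch below is hand-written; sum3 and Vec2 are hypothetical names chosen for illustration.

```rust
use std::ops::Add;

// C++: template <typename T> T sum3(T a, T b, T c) { return a + b + c; }
// The implicit requirement "T has operator+" becomes an explicit trait bound.
fn sum3<T: Add<Output = T>>(a: T, b: T, c: T) -> T {
    a + b + c
}

// C++: struct Vec2 { double x, y; Vec2 operator+(const Vec2& o) const; };
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

// The overloaded operator becomes an implementation of the Add trait.
impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, other: Vec2) -> Vec2 {
        Vec2 { x: self.x + other.x, y: self.y + other.y }
    }
}
```

With these definitions, sum3 works both for built-in numeric types and for Vec2.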
Mid 2022
An "idiomaticity improvement": some C++ assignment statements copy their right-hand side but can actually be implemented in Rust as a move. This is roughly the case if the right-hand side variable is no longer used after the assignment. The potential improvement can be applied to assignments of the explicit x = y kind, but also to implicit assignments of function and method call arguments to formal parameters. (In some cases, we perform this improvement as a refactoring on the C++ side.)
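Both kinds of assignment can be illustrated with a short hand-written sketch (all names are hypothetical). In each case the source variable is dead after the assignment, so no clone() is needed on the Rust side.

```rust
// C++: std::vector<std::string> archive = names;  // copies; `names` is never used again
fn archive_names() -> Vec<String> {
    let names = vec![String::from("ann"), String::from("bob")];
    // Explicit assignment: `names` is not used afterwards, so it is moved.
    let archive = names;
    archive
}

// C++: size_t count(std::vector<std::string> items) { return items.size(); }
fn count(items: Vec<String>) -> usize {
    items.len()
}

fn count_fresh() -> usize {
    let items = vec![String::from("x"); 3];
    // Implicit assignment of an argument to a formal parameter: also a move,
    // because `items` is not used after the call.
    count(items)
}
```

If the source variable were still used later, the migration would need a clone() (or a borrow) instead.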
Early 2022
The trim_front procedure in the Valhalla Routing Library (in file src/midgard/util.cc) is migrated into correct and idiomatic Rust, mostly automatically! This migration covers function and struct definitions, a non-trivial vector traversal (modifying the container), as well as test cases and some (simple) I/O, which we added for assurance demonstration purposes. User assistance is currently requested for selecting the correct traversal pattern for the main loop in that procedure. CRAM correctly migrates iterators used in the traversal, calls the migration procedure recursively on the loop body, and plugs the result into the hole left for the loop body by the migrated loop pattern.
Early 2022
Container traversal as a language idiom: well-designed C++ code is assumed to rely nearly 100% on the STL for its data containers. The most common operation involving a container is perhaps to traverse it: forward or backward (if permitted by the container), modifying it or not, using one or two (or even more) iterators. Such fairly complex code patterns cannot be migrated line-by-line; this would at best lead to non-idiomatic Rust code. We built a library of idiomatic Rust code that implements various container traversal patterns. For now, the user specifies the pattern of the traversal to be migrated; CRAM performs the migration to Rust, by instantiating a Rust idiom.
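To make the notion of a traversal idiom concrete, here are three hand-written examples of C++ iterator loops and Rust counterparts that use standard library methods. They are illustrations only; the function names are hypothetical, and the idioms in CRAM's library are not necessarily these.

```rust
// Forward traversal that modifies the container.
// C++: for (auto it = v.begin(); it != v.end(); ) {
//        if (*it < 0) it = v.erase(it); else ++it;
//      }
fn drop_negatives(v: &mut Vec<i32>) {
    v.retain(|x| *x >= 0);
}

// Backward, read-only traversal.
// C++: for (auto it = v.rbegin(); it != v.rend(); ++it)
//        if (*it % 2 == 0) return *it;
fn last_even(v: &[i32]) -> Option<i32> {
    v.iter().rev().copied().find(|x| x % 2 == 0)
}

// Traversal with two adjacent iterators.
// C++: for (auto a = v.begin(), b = std::next(a); b != v.end(); ++a, ++b)
//        if (*a > *b) return false;
//      (guarded by a check that v is not empty)
fn is_non_decreasing(v: &[i32]) -> bool {
    v.windows(2).all(|w| w[0] <= w[1])
}
```

A line-by-line translation of the first loop would have to emulate iterator invalidation and erase by hand, which is exactly the kind of non-idiomatic result the pattern library is meant to avoid.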
Early 2022
An internal implementation issue: heterogeneous source code that contains elements from both C++ and Rust. This is inevitable in a gradual source-to-source translation process, and presents a challenge for a tool like ours that works with intermediate code representations meant to be pretty-printable to human-readable code.
Early 2022
Implemented first migration steps, i.e., steps that actually produce Rust code: basic declarations of variables and functions (this step relies on type migration from C++ to Rust, which is mostly straightforward, but not quite so for references and pointers), basic expressions and statements (including assignments, function calls, simple loops), and a proper main function.
Early 2022
Open-sourced CRAM! We are not generating Rust code yet, but our refactoring steps can be used to harden C++ programs in several ways, which is an intended by-product of this project.
Early 2022
Hardened the C++ code by introducing const qualifiers in variable declarations wherever possible. This facilitates later migration to Rust, as const (immutable) variables are subject to fewer restrictions than non-const (mutable) ones. For example, Rust permits clusters of references to a variable, but only if they, and the variable, are all immutable. In C++ there are no such restrictions, so mutable-reference clusters must either be broken up, or converted to immutable-reference clusters if possible.
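A tiny hand-written example of an immutable-reference cluster that Rust accepts as is (the names are hypothetical):

```rust
fn double_read() -> i32 {
    let limit = 10; // C++: const int limit = 10;
    let a = &limit; // C++: const int& a = limit;
    let b = &limit; // C++: const int& b = limit;
    // Any number of shared references may coexist because nothing is mutated.
    *a + *b
}
```

Had limit, a, or b been mutable, the borrow checker would reject the cluster, which is why the const hardening on the C++ side pays off later.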
Early 2022
Added tests to validate the correctness of the refactoring steps. A test consists of a small C++ program and a manually refactored version. Our CI process compares the manually refactored version verbatim against the one produced by CRAM.
Early 2022
Implemented some basic refactoring steps on the C++ source:
- introduced explicit casts for mixed-type expressions. Such casts are beneficial in C++ and required in Rust;
- refactored C++ STL list instances into vectors, to prepare migrating such code to using Rust vectors. The vector type is more flexible and more efficient in Rust and generally recommended. We expect to fine-tune this refactoring later.
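The first refactoring can be illustrated with a hand-written sketch (the function name average is hypothetical). Rust has no implicit numeric conversions, so the cast that the C++ refactoring makes explicit maps directly onto a Rust conversion.

```rust
// C++ before: double average(double sum, int n) { return sum / n; }  // n converted implicitly
// C++ after:  double average(double sum, int n) { return sum / static_cast<double>(n); }
fn average(sum: f64, n: i32) -> f64 {
    // `sum / n` alone does not compile in Rust (mismatched types f64 and i32).
    // f64::from is lossless for i32; `n as f64` would work as well.
    sum / f64::from(n)
}
```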
Late 2021
Successfully parsed C++ code, using the Valhalla Routing Library as testbed. Parsing relies heavily on the in-house Software Evolution Library, which creates an intermediate representation that can be pretty-printed back to human-readable, formatted code ("round-trip"). This is critical for any effort that aims at generating human-maintainable source code.
Late 2021
Created a precise software architecture for CRAM, which you can see on the landing page. The architecture consists of various muses inside CRAM's IDE, which is based on the Mnemosyne software development assistant. The muses are in charge of parsing the C++, refactoring it to prepare the migration to Rust, performing the migration based on detecting and translating idiomatic language elements into "sketches", and finally filling any holes in the sketches to arrive at the generated Rust code.
November 2021
Project started!