GrammaTech

Blog: Project Evolution

We list some major developments this project has undergone since its inception, in reverse-chronological order.

Dec 4, 2023

New release is out! It contains all the features mentioned below in this blog. As always, both the source code and a docker image for installation are provided. See the usage page for info on how to obtain and run CRAM.

Mid 2023

More advanced control-flow structures in Rust: CRAM now migrates C++'s ternary conditional operator (?:) into a Rust if expression. Switch statements are migrated into Rust match expressions, which use pattern matching.
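To make the two mappings concrete, here is a small sketch. The C++ inputs in the comments and the function names abs_val and status are made up for illustration; they are not taken from CRAM's test suite, and CRAM's actual output may look different.

```rust
// Illustrative C++ input (hypothetical):
//   int abs_val(int x) { return (x < 0) ? -x : x; }
//   const char* status(int code) {
//     switch (code) {
//       case 0: return "ok";
//       case 1:
//       case 2: return "retry";
//       default: return "fail";
//     }
//   }

/// The ternary operator becomes an `if` expression, which yields a value in Rust.
fn abs_val(x: i32) -> i32 {
    if x < 0 { -x } else { x }
}

/// `switch` becomes `match`: stacked case labels turn into an or-pattern,
/// and `default` turns into the wildcard arm.
fn status(code: i32) -> &'static str {
    match code {
        0 => "ok",
        1 | 2 => "retry",
        _ => "fail",
    }
}
```

Note that Rust requires a match to be exhaustive, so the wildcard arm is needed even where the C++ switch had no default.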

Also, CRAM now handles all kinds of C++ loops automatically, rather than asking for user assistance to detect the loop structure, as in earlier versions. C++'s while loop carries over directly to Rust's while, including the break and continue escapes. C++ for loops are very powerful and do not exist in Rust in the same generality, so we detect many patterns that can be migrated to idiomatic Rust for loops iterating over a sequence or a container. More complex C++ for loops whose idiomatic purpose we cannot abstract out of the code are migrated to Rust while loops.
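The three cases can be sketched as follows. The C++ loops in the comments and all function names are hypothetical examples chosen for this post, not actual CRAM input or output.

```rust
// C++: for (int i = 0; i < n; ++i) sum += i;
// A counting loop is recognized as iteration over a range.
fn sum_below(n: u32) -> u32 {
    let mut sum = 0;
    for i in 0..n {
        sum += i;
    }
    sum
}

// C++: for (const auto& x : v) total += x;
// A range-based loop becomes iteration over the container.
fn total(v: &[i32]) -> i32 {
    let mut total = 0;
    for x in v {
        total += x;
    }
    total
}

// C++: for (unsigned i = 1; i < limit; i *= 2) ++steps;
// The update step does not fit a range or container pattern,
// so the loop falls back to a `while` with an explicit update.
fn doubling_steps(limit: u32) -> u32 {
    let mut steps = 0;
    let mut i = 1;
    while i < limit {
        steps += 1;
        i *= 2;
    }
    steps
}
```

One thing to watch in the while fallback: a continue inside the body of the C++ for loop still runs the update step, so a faithful migration has to make sure the update is not skipped.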

Early 2023

After a break, the project is back in full swing under a Phase-II contract.

C preprocessor handling: This is a beast. It involves not only macros and conditional compilation, but also #include directives, which ultimately lead to the entire issue of organizing code into multiple files, which is done very differently in Rust than in (legacy) C++.

We translate macros into compile-time constants or functions, where possible, or inline them into the source, for "undisciplined" (not independently parseable) macros like #define INFO_MSG(M) "Info: " << (M).
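As a rough sketch of the three outcomes: MAX_RETRIES and SQUARE are hypothetical macros invented for this example, while INFO_MSG is the macro quoted above. The fixed i32 type for square and the use of format! at the inlined site are assumptions for illustration, not necessarily what CRAM emits.

```rust
// C++: #define MAX_RETRIES 3
// An object-like macro with a constant body becomes a compile-time constant.
const MAX_RETRIES: u32 = 3;

// C++: #define SQUARE(X) ((X) * (X))
// A function-like macro whose body is a complete expression becomes a function.
// (The macro is type-agnostic; a concrete type is assumed here.)
const fn square(x: i32) -> i32 {
    x * x
}

// C++: #define INFO_MSG(M) "Info: " << (M)
//      std::cout << INFO_MSG(name) << std::endl;
// The macro body is not an expression on its own, so it is expanded
// at the use site before migration. The inlined stream output is
// rendered here as a formatted string.
fn info_line(name: &str) -> String {
    format!("Info: {}", name)
}
```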

Regarding #includes: CRAM now first migrates legacy C++ projects, which consist of multiple translation units that each include header files, into C++ modules, which were introduced in C++20. There are no more #includes in the code after this refactoring. If you have access to a very recent C++ compiler, you can build this code. In any case, the C++ modules are then migrated to Rust modules, which requires a few changes because, as is almost always the case, same-named concepts in the two languages are not actually the same.
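A minimal sketch of where this ends up on the Rust side, using inline modules so that the example fits in one file. The module names geometry and report and their functions are hypothetical; in a real project each module would typically live in its own file.

```rust
// Illustrative C++20 module (hypothetical):
//   export module geometry;
//   double clamp_non_negative(double v);        // not exported
//   export double area(double w, double h);
mod geometry {
    // Items are private to the module unless marked `pub`,
    // which plays a role similar to `export`.
    fn clamp_non_negative(v: f64) -> f64 {
        v.max(0.0)
    }

    pub fn area(w: f64, h: f64) -> f64 {
        clamp_non_negative(w) * clamp_non_negative(h)
    }
}

// C++: import geometry;
mod report {
    // `use` brings the other module into scope; no textual inclusion happens.
    use super::geometry;

    pub fn doubled_area(w: f64, h: f64) -> f64 {
        2.0 * geometry::area(w, h)
    }
}
```

One of the differences alluded to above: Rust modules form a tree of paths within a crate, so the flat module names of C++ have to be placed somewhere in that tree.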

As a result, CRAM can now migrate C++ projects into Rust: programs that consist of many source files, with dependencies among them.

Mid 2022

Intraprocedural refactoring of C++ const pointers into references: int* const p = &i becomes int& p = i. Such pointers essentially implement aliasing, which the refactored reference form makes explicit. It also prepares the code for a migration to Rust, where we also use references to implement such aliasing.
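For readers less familiar with the Rust side, this is roughly where such a refactored reference lands. The function name and values are made up for illustration.

```rust
// C++ before: int i = 41; int* const p = &i; *p += 1;
// C++ after:  int i = 41; int& p = i;        p += 1;
fn increment_via_alias() -> i32 {
    let mut i = 41;
    {
        // `p` is bound without `mut`, so it cannot be re-pointed
        // at another variable, much like `int* const` or `int&`.
        let p = &mut i;
        *p += 1;
    }
    i
}
```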

Alias-nest resolution: Rust does not permit nests of reference variables pointing to the same memory location, unless they are all const. We implemented a refactoring that breaks up non-const alias nests, by unifying the variable name used to refer to the memory location, and eliminating all other reference variables.
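A tiny before-and-after sketch of this refactoring, with hypothetical variable names. The rejected variant is shown only in comments because it does not compile.

```rust
// C++ before (two mutable aliases of x):
//   int x = 0;
//   int& a = x;
//   int& b = x;
//   a += 1;
//   b += 2;
//
// A line-by-line Rust version is rejected by the borrow checker:
//   let a = &mut x;
//   let b = &mut x; // error: second mutable borrow while `a` is still in use
//   *a += 1;
//   *b += 2;
//
// After alias-nest resolution, every access goes through one name.
fn resolved() -> i32 {
    let mut x = 0;
    x += 1; // was: a += 1
    x += 2; // was: b += 2
    x
}
```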

Mid 2022

Migrating C++ templates into Rust generics: this requires Rust traits that define the features any type instantiating a template parameter must provide. Sometimes Rust has such traits built-in, such as the Add trait for all types defining a + binary operator. We can migrate templates requiring overloaded operators at this point.
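As an illustration, a template that only needs operator+ can be expressed with a bound on the standard Add trait. The template sum3 and the type Vec2 are hypothetical examples, not CRAM output.

```rust
use std::ops::Add;

// C++: template <typename T> T sum3(T a, T b, T c) { return a + b + c; }
// The implicit requirement "T supports +" becomes an explicit trait bound.
fn sum3<T: Add<Output = T>>(a: T, b: T, c: T) -> T {
    a + b + c
}

// C++: struct Vec2 { double x, y; };
//      Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

// The overloaded operator becomes an implementation of `Add`.
impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, other: Vec2) -> Vec2 {
        Vec2 { x: self.x + other.x, y: self.y + other.y }
    }
}
```

With this in place, sum3 works for built-in numeric types and for Vec2 alike, for example sum3(1, 2, 3).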

Mid 2022

An "idiomaticity improvement": C++ assignment statements that copy a value but can actually be implemented in Rust as a move. This is roughly the case if the right-hand side variable is no longer used after the assignment. The potential improvement can be applied to assignments of the explicit x = y kind, but also to implicit assignments of function and method call arguments to formal parameters. (In some cases, we perform this improvement as a refactoring on the C++ side.)
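Both kinds of assignment can be sketched like this. The names count and demo and the string values are invented for the example.

```rust
// C++: size_t count(std::vector<std::string> names) { return names.size(); }
fn count(names: Vec<String>) -> usize {
    names.len()
}

// C++:
//   std::string a = "route";
//   std::string b = a;            // copies, although a is never used again
//   std::vector<std::string> names = {"x", "y"};
//   size_t n = count(names);      // copies names into the parameter
fn demo() -> (String, usize) {
    let a = String::from("route");
    // Explicit assignment: `a` is dead after this line, so a move suffices.
    let b = a;

    let names = vec![String::from("x"), String::from("y")];
    // Implicit assignment to a formal parameter: `names` is moved into
    // `count`, because it is not used afterwards.
    let n = count(names);

    (b, n)
}
```

If the right-hand side were still used later, the faithful translation of the C++ copy would be an explicit .clone() instead.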

Early 2022

The trim_front procedure in the Valhalla Routing Library (in file src/midgard/util.cc) is migrated into correct and idiomatic Rust, mostly automatically! This migration covers function and struct definitions, a non-trivial vector traversal (modifying the container), as well as test cases and some (simple) I/O, which we added for assurance demonstration purposes. User assistance is currently requested for selecting the correct traversal pattern for the main loop in that procedure. CRAM correctly migrates iterators used in the traversal, calls the migration procedure recursively on the loop body, and plugs the result into the hole left for the loop body by the migrated loop pattern.

Early 2022

Container traversal as a language idiom: well-designed C++ code is assumed to rely nearly 100% on the STL for its data containers. The most common operation involving a container is perhaps to traverse it: forward or backward (if permitted by the container), modifying it or not, using one or two (or even more) iterators. Such fairly complex code patterns cannot be migrated line-by-line; this would at best lead to non-idiomatic Rust code. We built a library of idiomatic Rust code that implements various container traversal patterns. For now, the user specifies the pattern of the traversal to be migrated; CRAM performs the migration to Rust, by instantiating a Rust idiom.
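To give a flavor of what such idioms look like, here are two hypothetical traversal patterns and plausible Rust counterparts. These are illustrations written for this post under our own assumptions about the input loops; they are not entries copied from CRAM's idiom library.

```rust
// Forward traversal that modifies the container.
// C++:
//   for (auto it = v.begin(); it != v.end(); ) {
//     if (*it < 0) it = v.erase(it); else ++it;
//   }
// The erase-while-iterating pattern maps to `retain`.
fn drop_negatives(v: &mut Vec<i32>) {
    v.retain(|&x| x >= 0);
}

// Read-only traversal with two iterators over adjacent elements.
// C++ (assuming v is non-empty):
//   for (auto a = v.begin(), b = std::next(a); b != v.end(); ++a, ++b)
//     total += *b - *a;
// The iterator pair maps to sliding windows of length 2.
fn total_rise(v: &[i32]) -> i32 {
    v.windows(2).map(|w| w[1] - w[0]).sum()
}
```

A line-by-line translation of the first loop would need index bookkeeping to satisfy the borrow checker, which is exactly the kind of non-idiomatic result the idiom library is meant to avoid.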

Early 2022

An internal implementation issue: heterogeneous source code that contains elements from both C++ and Rust. This is inevitable in a gradual source-to-source translation process, and presents a challenge for a tool that, like ours, works with intermediate code representations meant to be pretty-printable to human-readable code.

Early 2022

Implemented the first migration steps, i.e., steps that actually produce Rust code: basic declarations of variables and functions (this step relies on type migration from C++ to Rust, which is mostly straightforward, but not quite so for references and pointers), basic expressions and statements (including assignments, function calls, simple loops), and a proper main function.
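To illustrate why references and pointers need more care than other types, here is one plausible mapping. The functions are hypothetical, and representing a nullable pointer as Option is an assumption made for this sketch rather than a statement about what CRAM generates.

```rust
// C++: void scale(std::vector<double>& v, double k) { for (auto& x : v) x *= k; }
// A non-const reference parameter maps to a mutable borrow.
fn scale(v: &mut Vec<f64>, k: f64) {
    for x in v.iter_mut() {
        *x *= k;
    }
}

// C++: size_t length_or_zero(const std::string* s) { return s ? s->size() : 0; }
// A pointer that may be null has no direct counterpart among Rust references;
// here it is modeled as an optional shared reference.
fn length_or_zero(s: Option<&String>) -> usize {
    s.map_or(0, |s| s.len())
}
```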

Early 2022

Open-sourced CRAM! We are not generating Rust code yet, but our refactoring steps can be used to harden C++ programs in several ways, which is an intended by-product of this project.

Early 2022

Hardened the C++ code, by introducing const qualifiers in variable declarations wherever possible. This facilitates later migration to Rust, as const (immutable) variables are subject to fewer restrictions than non-const (mutable) ones. For example, Rust permits clusters of references to a variable, but only if they, and the variable, are all immutable. In C++ there are no such restrictions, so mutable-reference clusters must either be broken up, or converted to immutable-reference clusters if possible.
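The payoff on the Rust side can be sketched as follows, with made-up variable names. Once the variable and all references to it are const in C++, the whole cluster carries over as shared borrows.

```rust
// C++ after hardening:
//   const int x = 10;
//   const int& a = x;
//   const int& b = x;
//   return a + b + x;
fn shared_readers() -> i32 {
    let x = 10; // immutable by default, like `const int`
    let a = &x; // any number of shared references may coexist
    let b = &x;
    *a + *b + x
}
```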

Early 2022

Added tests to validate the correctness of the refactoring steps. A test consists of a small C++ program and a manually refactored version. Our CI process compares the manually refactored version verbatim against the one produced by CRAM.

Early 2022

Implemented some basic refactoring steps on the C++ source:

Late 2021

Successfully parsed C++ code, using the Valhalla Routing Library as testbed. Parsing relies heavily on the in-house Software Evolution Library, which creates an intermediate representation that can be pretty-printed back to human-readable, formatted code ("round-trip"). This is critical for any effort that aims at generating human-maintainable source code.

Late 2021

Created a precise software architecture for CRAM, which you can see on the landing page. The architecture consists of various muses inside CRAM's IDE, which is based on the Mnemosyne software development assistant. The muses are in charge of parsing the C++, refactoring it to prepare the migration to Rust, performing the migration by detecting idiomatic language elements and translating them into "sketches", and finally filling any holes in the sketches to arrive at the generated Rust code.

November 2021

Project started!