Make C++ a better place #3: D as an alternative


The D programming language, commonly known as DLang, offers a fresh perspective on system-level programming by blending familiar features of C++ with modern mechanisms that enhance code safety, simplicity, and performance. Created by Walter Bright, it is meant to be a viable alternative to C++ for developers seeking to improve productivity and code safety while maintaining the low-level control that C++ offers.

The best of both worlds

DLang is a general-purpose, statically typed programming language with many parallels to C++. Like C++, it supports object-oriented programming, manual memory management and systems programming. However, DLang builds on these fundamentals by introducing features, such as garbage collection, that mitigate some of the complexities and pitfalls of C++.

Producer/consumer implementation with D

If you haven’t seen the first article of this series, please read it first: it explains what I want to check about each C++ alternative and how I’m going to do that.

Below you can see the implementation of the reference producer/consumer application written in D:

import std.stdio;                          // needed for writeln
import std.datetime.stopwatch : StopWatch; // needed for time measurement
import core.atomic;                        // needed for atomic operations
import core.sync.mutex;                    // needed for Mutex
import core.thread;                        // needed for Thread
// mutex for shared resource access synchronization
shared Mutex mtx;
// flag shared between the threads
shared bool is_finished = false;
// initialization of global variables at runtime
shared static this() {
    mtx = new shared Mutex();
}
// interface class
interface WorkerInterface
{
    // public class members
    void Run() shared;
}
// templated alias for a type
alias SharedContainerPtr(T) = T[]*;
// template class definition + inheritance
class Producer(T) : WorkerInterface
{
    // class constructor and members initialization
    // together with passing a const argument by value and the container by pointer
    this(const T[3] source_data, SharedContainerPtr!T shared_container)
    {
        source_data_ = source_data;
        shared_container_ = shared_container;
        // if statement
        if (!shared_container_) {
            // error reporting by throwing an exception
            throw new Exception("Given shared_container is null!");
        }
    }
    override shared void Run()
    {
        // indexed for loop
        for (uint i=0U; i<10_000U; i++) {
            // range based for loop
            foreach (const ref element; source_data_) {
                // print statement
                writeln("Putting ", element, " into the shared container");
                // blocking reading and writing from/to the shared container
                synchronized (mtx) {
                    *shared_container_ ~= element;
                }
            }
        }
        is_finished.atomicStore(true);
        writeln("Producer done");
    }
// private class members
private:
    // fixed-length array with a generic type
    T[3] source_data_;
    SharedContainerPtr!T shared_container_;
}

class Consumer(T) : WorkerInterface
{
    this(const uint id, SharedContainerPtr!T shared_container)
    {
        id_ = id;
        shared_container_ = shared_container;

        if (!shared_container_) {
            throw new Exception("Given shared_container is null!");
        }
        last_size_ = shared_container_.length;
    }
    override shared void Run()
    {
        // infinite while loop
        while (true) {
            // waiting for the input from another thread
            if (!(shared_container_.length > last_size_ || is_finished.atomicLoad())) {
                continue;
            }
            if (is_finished.atomicLoad()) {
                break;
            }
            // blocking reading from the shared container
            synchronized (mtx) {
                writeln("Consumer ", id_, " noticed new element: ", (*shared_container_)[$-1]);
                last_size_ = shared_container_.length;
            }
        }
        writeln("Consumer ", id_, " done");
    }

private:
    uint id_;
    size_t last_size_;
    SharedContainerPtr!T shared_container_;
}
// enum definition
enum WorkerType
{
    kProducer,
    kConsumer
}
// template function with a compile time argument and variadic arguments
WorkerInterface CreateWorker(WorkerType worker_type, Args...)(Args args)
{
    // compile time if statement excluding constructors of the classes to which
    // current arguments don't fit - an ordinary runtime if would cause a compilation error
    // because Producer's and Consumer's constructors have different signatures
    static if (worker_type == WorkerType.kProducer) {
        // dynamic memory allocation of the polymorphic type
        return new Producer!string(args);
    } else static if (worker_type == WorkerType.kConsumer) {
        return new Consumer!string(args);
    } else {
        throw new Exception("Unsupported worker type!");
    }
}
// application entry point
void main()
{
    // start measuring execution time
    auto sw = StopWatch();
    sw.start();
    try {
        // fixed-length array
        string[3] source_data = ["Hello", "world", "!!!"];
        // variable-length array
        string[] shared_container;
        // creation of the polymorphic types
        WorkerInterface producer = CreateWorker!(WorkerType.kProducer)(source_data, &shared_container);
        WorkerInterface first_consumer = CreateWorker!(WorkerType.kConsumer)(0, &shared_container);
        WorkerInterface second_consumer = CreateWorker!(WorkerType.kConsumer)(1, &shared_container);
        // start 3 threads
        auto producer_thread = new Thread(() => (cast(shared)producer).Run());
        auto first_consumer_thread = new Thread(() => (cast(shared)first_consumer).Run());
        auto second_consumer_thread = new Thread(() => (cast(shared)second_consumer).Run());

        producer_thread.start();
        first_consumer_thread.start();
        second_consumer_thread.start();

        producer_thread.join();
        first_consumer_thread.join();
        second_consumer_thread.join();
    }
    // error handling
    catch (Exception e) {
        writeln("Error: ", e.msg);
    }
    // measure execution time and display the result
    sw.stop();
    writeln("Time elapsed: ", sw.peek.total!"msecs", "ms");
}

According to the list of checks that I’m interested in, D gives us the following statistics:

  • length of the code – this code is 114 lines long (124 lines for C++)
  • build time – it takes 233ms to build this application (2149ms for C++) with command:
dmd -O -release -inline -boundscheck=off producer_consumer.d
  • execution time – it takes 522ms to run it (597ms for C++)
  • binary size – 1MB (105kB for C++)

What I like about this code?

Fewer imports

D requires fewer imports than C++. Additionally, whenever I used a function and forgot to add the associated import statement, the compiler suggested the full import path required for the given function.

Concise loop syntax

D has very concise loop syntax – no const auto reference mumbo-jumbo, just call foreach (element; container) and you’re done.
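
To make the comparison more concrete, here is a minimal sketch of the most common foreach forms. The function names sum and doubleAll are made up for this illustration and are not part of the reference application.

```d
import std.stdio;

// Sums the elements; `element` is a copy of each item and its type is inferred.
int sum(const int[] values)
{
    int total = 0;
    foreach (element; values) {
        total += element;
    }
    return total;
}

// Doubles every element in place; `ref` gives write access to the items.
void doubleAll(int[] values)
{
    foreach (ref element; values) {
        element *= 2;
    }
}

void main()
{
    int[] numbers = [1, 2, 3];
    doubleAll(numbers);
    // the index can be requested as an additional loop variable
    foreach (i, element; numbers) {
        writeln(i, ": ", element);
    }
    writeln("sum = ", sum(numbers));
}
```

The element type never has to be spelled out, and ref is only needed when the loop is supposed to modify the container.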

Synchronized block

The synchronized block is something that really caught my attention. The interesting part is that it can be used not only with mutexes, but with any class object (for example one held in a member variable) to ensure that the guarded block is executed by only one thread at a time.
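
As a rough sketch of that idea, the snippet below locks on an ordinary class instance instead of a Mutex. The Counter class and the countInParallel function are hypothetical names invented for this example, and the object is deliberately not declared shared to keep the sketch short.

```d
import core.thread;
import std.stdio;

// hypothetical class; its instance serves both as the data and as the lock
class Counter
{
    int value;
}

int countInParallel(int threadCount, int incrementsPerThread)
{
    auto counter = new Counter;
    Thread[] threads;
    foreach (t; 0 .. threadCount) {
        threads ~= new Thread({
            foreach (i; 0 .. incrementsPerThread) {
                // only one thread at a time may enter this block for `counter`
                synchronized (counter) {
                    counter.value++;
                }
            }
        });
    }
    foreach (thread; threads) thread.start();
    foreach (thread; threads) thread.join();
    return counter.value;
}

void main()
{
    writeln(countInParallel(4, 1000));
}
```

Without the synchronized block the increments from different threads could interleave and some of them could get lost.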

Easy polymorphism

Classes in D are reference types, so there is no need for explicit pointers when working with polymorphic types.
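
A small sketch of what this looks like in practice. Shape, Square, Rectangle and totalArea are hypothetical names used only for this illustration.

```d
import std.stdio;

interface Shape
{
    double area();
}

class Square : Shape
{
    double side;
    this(double side) { this.side = side; }
    override double area() { return side * side; }
}

class Rectangle : Shape
{
    double width, height;
    this(double width, double height) { this.width = width; this.height = height; }
    override double area() { return width * height; }
}

// the array holds references to the interface, no pointer syntax involved
double totalArea(Shape[] shapes)
{
    double total = 0;
    foreach (shape; shapes) total += shape.area();
    return total;
}

void main()
{
    Shape[] shapes;
    shapes ~= new Square(2);
    shapes ~= new Rectangle(2, 3);
    writeln(totalArea(shapes));
}
```

The objects are created with new and used through the interface directly, with no pointer or smart pointer wrapper in sight.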

Mandatory shared keyword

C++ allows the user to add a lot of member function specifiers like const, override, final, noexcept etc., but there is still nothing that would tell you, just by looking at the function’s signature, whether a certain member function is designed to be used from multiple threads. In D, member functions that are meant to be called on objects shared between threads must be marked with the shared keyword.

What I don’t like about this code?

The amount of code to write

Although the code is slightly shorter than the C++ version, the overall experience and the amount of writing are generally similar to C++, so D won’t save us time or effort when it comes to the sheer amount of typing.

Manual threads management

Dealing with threads in D required creating, starting and joining them manually. I read about (and tried to use) the message exchange mechanism based on send and receive, but it seems to be focused on sending a message directly to a specific thread rather than broadcasting information to all the other threads, as in my example.
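
For reference, this is roughly what the point-to-point style of std.concurrency looks like. It is only a sketch and not the code I tried in the reference application; the Stop struct and the functions doubler and doubleViaMessages are names made up for this illustration.

```d
import std.concurrency;
import std.stdio;

// hypothetical message type used only to tell the worker to stop
struct Stop {}

// worker thread: doubles every int it receives and sends the result back
void doubler(Tid owner)
{
    bool running = true;
    while (running) {
        receive(
            (int value) { owner.send(value * 2); },
            (Stop _) { running = false; }
        );
    }
}

int[] doubleViaMessages(int[] inputs)
{
    // spawn starts a new thread and returns its id, which is the address for send
    Tid worker = spawn(&doubler, thisTid);
    int[] results;
    foreach (value; inputs) {
        worker.send(value);
        results ~= receiveOnly!int();
    }
    worker.send(Stop());
    return results;
}

void main()
{
    writeln(doubleViaMessages([1, 2, 3]));
}
```

Every send call needs the Tid of one particular receiver, which is why this model did not map naturally onto a single producer notifying several consumers.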

Strange choice of some operators

I know that the decision regarding what various symbols represent in the programming language syntax is arbitrary, but even after spending some time with D, it still feels very odd to me when I use ~ as the operator for appending elements to an array – most likely because most languages use ~ for bitwise negation (C, C++, Java, Python) or regex matching (Ruby, Perl).

Using D to write C++ code for the existing code bases

D compiles directly to the machine code, so there’s no out-of-the-box way to generate C++ code out of it.

Using existing C++ code in D

To use C++ code inside D code, you must first put the declarations of the C++ objects into an extern (C++) block. However, I wasn’t able to use the C++ code “as is” – the template function IsEqual turned out to be problematic and I wasn’t able to build the program until I added a template specialization to the C++ source code:

template<>
bool IsEqual<Point>(const Point a, const Point b)
{
    return a == b;
}

That’s a weak point: since I don’t want the Point type to behave differently from the generic type, I essentially had to duplicate the function to be able to use it in D. Overall, I ended up with the following D implementation of our reference C++ library user:

import std.stdio;

extern (C++)
{
    struct Point
    {
        int x, y;
    }
    int Add(const int a, const int b);
    bool IsEqual(T)(const T a, const T b);
}

void main()
{
    Point p1 = Point(12, 24);
    Point p2 = Point(36, 48);
    const int adding_result = Add(p1.x, p2.x);
    const bool comparison_result = IsEqual(p1, p2);
    
    writeln("adding_result = ", adding_result);
    writeln("comparison_result = ", comparison_result);
}

It can be compiled using the following command:

g++ -c cpp_functionality.cpp &&
dmd lib_user.d cpp_functionality.o

Other interesting D features

DUB

DLang has a dedicated package manager called DUB, which can not only download dependencies but also work as a build system for the entire project. It provides a predefined directory structure and a project file (.json or .sdl) that describes the project and its dependencies, which allows DUB to download these dependencies on the fly.

An example dub.json configuration file might look like below:

{
    "name": "myproject",
    "description": "A simple DLang project",
    "dependencies": {
        "vibe-d": "~>0.9.4"
    }
}

Compile-time execution as a core philosophy

A distinctive feature of DLang is its emphasis on moving code execution from runtime to compile-time whenever possible. The language encourages developers to leverage compile-time evaluation to reduce runtime errors and performance overhead. For instance, a function does not have to be marked in any special way to be usable at compile time: when its result is needed in a compile-time context (for example as an enum initializer or a template argument) and the function can be evaluated at compile time, DLang executes it during compilation.
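
A minimal sketch of such compile-time function execution. The factorial function and the fact10 constant are hypothetical names chosen for this example.

```d
import std.stdio;

// an ordinary function, usable both at runtime and at compile time
ulong factorial(uint n)
{
    ulong result = 1;
    foreach (i; 2 .. n + 1) {
        result *= i;
    }
    return result;
}

// an enum initializer is a compile-time context, so the call is evaluated by the compiler
enum ulong fact10 = factorial(10);

// checked during compilation; a wrong value would stop the build
static assert(fact10 == 3_628_800);

void main()
{
    writeln(fact10);        // the value was computed during compilation
    uint n = 5;
    writeln(factorial(n));  // the same function called at runtime
}
```

The same function body serves both purposes; only the context of the call decides when it runs.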

Developers may also use so-called template constraints, which make it possible to impose compile-time conditions on the arguments of a templated function, e.g. to ensure that a given type is integral or that a value falls within a specific range. These constraints are written as an if clause placed under the function signature and are evaluated at compile time.

import std.traits;
import std.stdio;

T add(T)(T a, T b)
    if (isIntegral!T)
{
    return a + b;
}

void main() {
    writeln(add(3, 4));   // Works fine
    writeln(add(3.5, 4)); // Fails: 3.5 is not an integral type
}

Another example may be template blocks which enable you to create generic versions of entire code blocks containing various things like variables, functions, type definitions, type instantiations etc. The compiler then generates type-specific versions of the code for every required type.

import std.traits;
import std.stdio;

template MyBlock(T)
{
    T my_variable;

    void print_my_variable() {
        writeln(my_variable);
    }

    struct MyStruct {
        T member_value;
    }

    MyStruct struct_instance;

    auto print_member_value() {
        writeln(struct_instance.member_value);
    }
}

alias IntBlock = MyBlock!int;
alias FloatBlock = MyBlock!float;

void main() {
    FloatBlock.my_variable = 3.5;
    FloatBlock.print_my_variable();

    IntBlock.struct_instance.member_value = 12;
    IntBlock.print_member_value();
}

Additionally, DLang performs compile-time checks for errors like array bounds violations, catching potential issues earlier in the development process. For static arrays, if an out-of-bounds access with an index known at compile time is attempted, the compiler will report an error during compilation.

void main() {
    int[3] arr = [1, 2, 3];
    int outOfBounds = arr[12]; // Compile-time error
}

RDMD

DLang offers the convenience of interpreted languages through a tool called rdmd. This tool compiles and executes DLang code on the fly, making it feel like a scripting language. It’s particularly useful during development, allowing developers to quickly run and test code without explicitly managing the compilation process.

Here’s an example of an on-the-fly compiled DLang script:

#!/usr/bin/env rdmd
import std.stdio;

void main() {
    writeln("Running DLang with rdmd!");
}

Universal Function Call Syntax (UFCS)

One feature that significantly improves code readability is Universal Function Call Syntax. UFCS allows free functions to be called as if they were methods of their first argument, which simplifies the way chained function calls are written. For example, instead of writing parallel(iota(obj.Method())), you can write obj.Method.iota.parallel. This seems to be very useful in practice, e.g. when running a loop which can be parallelized – in such a case we can just add .parallel at the end of the container we’re iterating over.

import std.parallelism;
import std.stdio;

void main() {
    auto nums = [1, 2, 3, 4, 5];
    foreach (num; nums.parallel) {
        writeln(num);
    }
}
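
UFCS is not limited to library functions. The sketch below applies it to free functions defined in the same file; square, isEven and evenSquares are hypothetical names used only for this illustration.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.stdio;

// free functions, not members of int
int square(int x) { return x * x; }
bool isEven(int x) { return x % 2 == 0; }

int[] evenSquares(int[] values)
{
    // equivalent to: array(map!square(filter!isEven(values)))
    return values.filter!isEven.map!square.array;
}

void main()
{
    int n = 3;
    writeln(n.square);                  // same as square(n)
    writeln([1, 2, 3, 4].evenSquares);  // same as evenSquares([1, 2, 3, 4])
}
```

The chained version reads left to right in the order the operations are applied, while the nested version has to be read inside out.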

Flexible importing mechanisms

In addition to importing whole modules, developers can selectively import specific functions from a module. For example, you could import just the writeln function from std.stdio using:

import std.stdio : writeln;

void main() {
    writeln("Importing just writeln");
}

Moreover, imports can be scoped, limiting their visibility to specific blocks of code:

void main() {
    {
        import std.stdio;
        writeln("This writeln works");
    }

    writeln("Error, writeln is not defined in this scope");
}

Memory management and safety

When it comes to memory management and safety, DLang provides an interesting combination of garbage collection and manual memory management. By default, DLang uses garbage collection, freeing developers from manually managing memory allocation and deallocation. However, if fine-grained control is needed, developers can mark specific functions with @nogc, which makes the compiler reject any garbage-collected allocation inside them. Memory then has to be managed manually, with mechanisms known from C, so every allocation must be matched by an explicit deallocation in the code, otherwise there’s a memory leak.
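
A minimal sketch of a @nogc function that manages its memory the C way. The function name sumOfSquares is made up for this example, and it assumes that a positive count is passed in.

```d
import core.stdc.stdlib : free, malloc;
import std.stdio;

// @nogc: the compiler rejects garbage-collected allocations inside this function
int sumOfSquares(int count) @nogc nothrow
{
    // manual allocation, as in C (assumes count > 0)
    int* buffer = cast(int*) malloc(int.sizeof * count);
    if (buffer is null) {
        return -1;
    }

    foreach (i; 0 .. count) {
        buffer[i] = i * i;
    }
    int total = 0;
    foreach (i; 0 .. count) {
        total += buffer[i];
    }

    // int[] tmp = new int[count]; // would not compile in a @nogc function

    // manual deallocation; leaving this line out leaks the buffer
    free(buffer);
    return total;
}

void main()
{
    writeln(sumOfSquares(4));
}
```

Uncommenting the line with new is a quick way to see the compiler enforce the attribute.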

However, DLang provides some useful built-in mechanisms which help to make this manual memory management safer. For example, every function can be marked with a safety attribute; the default is @system. When we want safer code, we can mark a function as @safe, which prevents calling any @system function inside it, including malloc, which is marked as @system. Another example of an operation which is blocked in a @safe function is casting from void*:

import core.stdc.stdlib : malloc;

void safe_function(void* input) @safe {
    int* ptr1 = cast(int*)malloc(4); // Compilation error - malloc not allowed here
    int* ptr2 = cast(int*)input;     // Compilation error - cast from void* not allowed here
}

Function attributes are especially interesting in combination with the D improvement proposal DIP1021, which introduces the ability to detect memory leaks at compile time by adding @live to a function declaration. It says that:

At any point in the program, for each memory object, there is exactly one live mutable pointer to it or all the live pointers to it are read-only.

Example:

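import core.stdc.stdlib : malloc;

// the function name is added here only for illustration
@live void leaky_function() @nogc
{
    int* ptr = cast(int*)malloc(int.sizeof*5);
}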

The compilation of this function will fail with an error stating that ptr is left dangling (due to missing free). This syntax will also detect double allocations and double free of the same pointer. This makes memory management manual, but memory management checks automatic.

Another interesting feature that helps with safer memory management is scope guard statements. They execute some actions when the scope exits, which gives the user the ability, for example, to schedule memory deallocation directly after the allocation instead of at the end of the function. This is better than having to remember to call free at the end, but a scope guard statement can be forgotten just like free can, so it can’t be treated as a guarantee of no memory leaks. It does, however, seem nice and useful when interfacing with C code.

import core.stdc.stdlib;
import std.stdio;

void safe_function() {
    int* ptr = cast(int*)malloc(4);
    scope(exit) {
        writeln("Deallocating memory");
        free(ptr);
    }

    writeln("Doing something...");
}

void main() {
    safe_function();
}

The above code produces the following output (notice the order of logs):

Doing something...
Deallocating memory

Consistent and intuitive syntax for arrays

In DLang, arrays have a more intuitive syntax than in C++. Static arrays are declared with type[size] and dynamic arrays use just type[] without specifying the size. Initializing arrays also follows a more natural approach using square brackets [] rather than the curly braces {} in C++ or parentheses () in CppFront.

import std.stdio;

void main() {
    int[3] static_array = [1, 2, 3];
    int[] dynamic_array = [4, 5, 6];

    writeln(static_array);
    writeln(dynamic_array);
}

Immutability of variables

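DLang provides two ways of declaring immutability: const and immutable. The immutable keyword is supposed to work as a stronger const: const only promises that the data won’t be modified through the given reference, while immutable promises that the data never changes at all. However, the language still allows casting immutable away, which makes modifying the value possible (although the language specification treats such a modification as undefined behavior) and somewhat breaks the entire concept.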
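
The difference between the two keywords can be sketched as follows. The function name sum and the variable names are hypothetical, chosen only for this illustration.

```d
import std.stdio;

// a const parameter accepts mutable, const and immutable data alike
int sum(const int[] values)
{
    int total = 0;
    foreach (v; values) {
        total += v;
    }
    return total;
}

void main()
{
    int[] mutable_data = [1, 2, 3];
    immutable int[] frozen_data = [4, 5, 6];

    // const: a read-only view of data that somebody else may still modify
    const int[] view = mutable_data;
    mutable_data[0] = 10;
    writeln(view[0]); // the change is visible through the const view

    // view[0] = 1;        // compilation error: cannot modify through const
    // frozen_data[0] = 1; // compilation error: immutable data never changes

    writeln(sum(mutable_data), " ", sum(frozen_data));
}
```

In this sketch const describes what a particular reference is allowed to do, while immutable describes the data itself.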