C API

Tensogram exposes a flat C ABI through the tensogram-ffi crate. The generated header is tensogram.h. All public functions are prefixed tgm_, and all error codes are members of the tgm_error enum. Most public types follow the tgm_*_t pattern; the option structs TgmEncodeMaskOptions and TgmDecodeMaskOptions are PascalCase exceptions inherited from the underlying Rust types.

The C++ wrapper at cpp/include/tensogram.hpp is built directly on top of this C API; see C++ API for the higher-level RAII-and-exceptions front-end. This page covers the C side: how to get the library, how to link it, and what to be aware of regarding versioning.

Three install paths

| Path | When to use | Toolchain required |
|------|-------------|--------------------|
| Prebuilt binary tarball | Quick local install, no Rust toolchain | none (just tar, pkg-config, cc) |
| cargo cinstall | Custom prefix, or platform/arch we do not pre-build | Rust stable |
| Build from source | Hacking on the FFI itself | Rust stable + clone of the repo |

Prebuilt binary tarball

Each tagged release publishes two tarballs at https://github.com/ecmwf/tensogram/releases:

  • tensogram-ffi-<VERSION>-linux-x86_64.tar.gz
  • tensogram-ffi-<VERSION>-macos-aarch64.tar.gz

The Linux tarball is built on AlmaLinux 8 (glibc 2.28; ABI-compatible with the manylinux_2_28 wheel platform tag, works on RHEL 8 / Debian 11 / Ubuntu 20.04 and newer). The macOS tarball is built on Apple Silicon. For other platforms (linux-aarch64, macos-x86_64, etc.) build from source with cargo cinstall below.

Each tarball is rooted for /usr/local (the bundled tensogram.pc hard-codes prefix=/usr/local), and is packed with uid=0 / gid=0 so extraction under sudo produces root-owned files. The default install is:

VERSION=<release-version>          # e.g. 0.20.0
PLATFORM=linux-x86_64              # or macos-aarch64
ASSET=tensogram-ffi-${VERSION}-${PLATFORM}.tar.gz

curl -LO "https://github.com/ecmwf/tensogram/releases/download/${VERSION}/${ASSET}"
sudo tar --no-same-owner -C /usr/local -xzf "${ASSET}"
sudo ldconfig                      # Linux: refresh dynamic linker cache
pkg-config --modversion tensogram  # → ${VERSION}

After extraction the layout under /usr/local is:

/usr/local/
├── lib/
│   ├── libtensogram.so.0.20.0          (Linux: real shared library)
│   ├── libtensogram.so.0.20            → libtensogram.so.0.20.0
│   ├── libtensogram.so                 → libtensogram.so.0.20.0
│   ├── libtensogram.0.20.0.dylib       (macOS: real dylib)
│   ├── libtensogram.0.20.dylib         → libtensogram.0.20.0.dylib
│   ├── libtensogram.dylib              → libtensogram.0.20.0.dylib
│   ├── libtensogram.a
│   └── pkgconfig/tensogram.pc
└── include/
    └── tensogram/
        └── tensogram.h

Plus LICENSE, README.md, and INSTALL.md at the install root.

For a non-default prefix, use cargo cinstall (next section) — the .pc file embeds the prefix at build time, so simply extracting the tarball under a different directory will leave pkg-config returning broken paths.

cargo cinstall

cargo-c is a cargo subcommand that builds and installs the C-callable artefacts in one step. Unlike plain cargo build, it produces a properly versioned shared library with SONAME symlinks, a pkg-config .pc file, and an installed header at the right path.

# One-time: install cargo-c (uses your default stable toolchain).
cargo install cargo-c

# Build + install Tensogram FFI under any prefix you control.
# --libdir=lib pins the layout (otherwise Debian/Ubuntu multiarch
# would pick lib/<triplet>, breaking PKG_CONFIG_PATH guesses below).
cargo cinstall --release -p tensogram-ffi \
    --prefix="$HOME/.local" --libdir=lib

# Verify pkg-config can see it.
export PKG_CONFIG_PATH="$HOME/.local/lib/pkgconfig:$PKG_CONFIG_PATH"
pkg-config --modversion tensogram

The cargo-c metadata is committed in rust/tensogram-ffi/Cargo.toml, so no extra configuration is needed.

Build from source

Plain cargo build continues to work for in-tree development: it produces target/release/libtensogram_ffi.{a,so,dylib} (note the _ffi suffix). The C++ wrapper’s CMake integration uses this path.

cargo build --release -p tensogram-ffi
# Outputs:
#   target/release/libtensogram_ffi.a
#   target/release/libtensogram_ffi.so   (or .dylib on macOS)
#   rust/tensogram-ffi/tensogram.h       (regenerated by build.rs cbindgen)

There is no SONAME, no pkg-config file, and no header at a system include path — those are cargo-c’s job. cargo build is the contributor flow; cargo cinstall is the user flow.

Linking against the installed library

cc $(pkg-config --cflags tensogram) my_program.c \
   $(pkg-config --libs tensogram) \
   -o my_program

--cflags returns -I<includedir> so your code uses #include <tensogram/tensogram.h>. --libs returns -L<libdir> -ltensogram.

CMake

cmake_minimum_required(VERSION 3.16)
project(my_project C)

find_package(PkgConfig REQUIRED)
pkg_check_modules(TENSOGRAM REQUIRED IMPORTED_TARGET tensogram)

add_executable(my_program my_program.c)
target_link_libraries(my_program PRIVATE PkgConfig::TENSOGRAM)

Manual flags (when pkg-config is not available)

# Linux
cc -I/usr/local/include my_program.c \
   -L/usr/local/lib -ltensogram \
   -ldl -lpthread -lm \
   -o my_program

# macOS
cc -I/usr/local/include my_program.c \
   -L/usr/local/lib -ltensogram \
   -framework CoreFoundation -framework Security -framework SystemConfiguration \
   -lc++ -lm \
   -o my_program

The Libs.private field of tensogram.pc lists the platform-specific support libraries; pkg-config picks them up automatically when linking the static archive (pkg-config --static --libs tensogram).

Quick start

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <tensogram/tensogram.h>

int main(void) {
    /* Metadata: one 1-D float32 tensor with 4 elements, stored uncompressed. */
    const char *meta_json =
        "{\"descriptors\":[{"
        "\"type\":\"ntensor\",\"ndim\":1,\"shape\":[4],\"strides\":[4],"
        "\"dtype\":\"float32\",\"byte_order\":\"little\","
        "\"encoding\":\"none\",\"filter\":\"none\",\"compression\":\"none\""
        "}]}";

    /* Payload: one object, passed as parallel arrays of pointers and byte lengths. */
    float in[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    const uint8_t *ptrs[1] = { (const uint8_t*)in };
    const size_t lens[1] = { sizeof(in) };

    /* Encode. On success, enc must later be released with tgm_bytes_free. */
    tgm_bytes_t enc = {0};
    if (tgm_encode(meta_json, ptrs, lens, 1, "xxh3", 0, &enc) != TGM_ERROR_OK) {
        fprintf(stderr, "encode failed: %s\n", tgm_last_error());
        return 1;
    }

    /* Decode the encoded bytes into a caller-owned message handle. */
    tgm_message_t *msg = NULL;
    if (tgm_decode(enc.data, enc.len,
                   /*native_byte_order=*/1, /*threads=*/0,
                   /*verify_hash=*/0, &msg) != TGM_ERROR_OK) {
        fprintf(stderr, "decode failed: %s\n", tgm_last_error());
        tgm_bytes_free(enc);
        return 1;
    }

    /* out_data is borrowed from msg; it is valid only until msg is freed. */
    size_t out_len = 0;
    const uint8_t *out_data = tgm_object_data(msg, 0, &out_len);
    printf("decoded %zu bytes, equal=%s\n", out_len,
           (out_len == sizeof(in) && memcmp(out_data, in, out_len) == 0) ? "yes" : "no");

    /* Release the message handle first, then the encoded buffer. */
    tgm_message_free(msg);
    tgm_bytes_free(enc);
    return 0;
}
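
The quick start compares raw bytes only. To read the decoded payload as numbers, one option is to copy it into a typed array rather than cast the uint8_t pointer, since nothing on this page states how the returned buffer is aligned. The sketch below assumes the payload is native-endian float32 (as requested by native_byte_order=1 above); copy_f32 is a hypothetical helper, not part of tensogram.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of tensogram.h.
 * Copies a byte payload holding native-endian float32 values into dst.
 * Returns the number of floats copied, or 0 if an argument is NULL,
 * the length is not a multiple of sizeof(float), or dst is too small. */
size_t copy_f32(const uint8_t *src, size_t src_len,
                float *dst, size_t dst_cap)
{
    if (src == NULL || dst == NULL) return 0;
    if (src_len % sizeof(float) != 0) return 0;

    size_t n = src_len / sizeof(float);
    if (n > dst_cap) return 0;

    /* memcpy avoids alignment and strict-aliasing problems that a
     * direct (const float *) cast of src could cause. */
    memcpy(dst, src, src_len);
    return n;
}
```

With the quick start's variables, the call would be copy_f32(out_data, out_len, out, 4) for a float out[4], made before tgm_message_free(msg).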

Memory ownership

  • Handles produced by tgm_* constructors are owned by the caller (e.g. tgm_decode writes a tgm_message_t * through its tgm_message_t ** out-parameter). Free them with the matching tgm_*_free function.
  • Pointers returned by accessor functions (e.g. tgm_object_data, tgm_object_shape) are borrowed from the parent handle and are valid only until the parent is freed.
  • tgm_bytes_t returned by encode functions must be freed with tgm_bytes_free.
  • tgm_last_error() returns a thread-local pointer to the most recent error message. The string is owned by the FFI layer; do not free it. Treat it as valid only until the next tgm_* call on the same thread.
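
Because accessor results and the error string are borrowed, anything that has to outlive the parent handle or the next tgm_* call needs to be copied first. A minimal sketch of that, using two hypothetical helpers (dup_bytes and dup_cstr are not part of tensogram.h) and only the C standard library:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a borrowed byte range into heap memory
 * owned by the caller. Returns NULL if src is NULL, len is 0, or the
 * allocation fails. Release the result with free(). */
uint8_t *dup_bytes(const uint8_t *src, size_t len)
{
    if (src == NULL || len == 0) return NULL;
    uint8_t *copy = malloc(len);
    if (copy != NULL) memcpy(copy, src, len);
    return copy;
}

/* Hypothetical helper: duplicate a borrowed C string, such as the one
 * returned by tgm_last_error(). Returns NULL if s is NULL or the
 * allocation fails. Release the result with free(). */
char *dup_cstr(const char *s)
{
    if (s == NULL) return NULL;
    size_t n = strlen(s) + 1;          /* include the terminating NUL */
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}
```

For example, calling dup_bytes(out_data, out_len) before tgm_message_free(msg) yields a buffer that stays valid afterwards, and dup_cstr(tgm_last_error()) preserves a message across later tgm_* calls on the same thread.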

Error handling

Every fallible tgm_* function returns a tgm_error enum value. TGM_ERROR_OK (= 0) is success; any other value indicates failure and populates the thread-local error string accessible via tgm_last_error().

tgm_error rc = tgm_encode(...);
if (rc != TGM_ERROR_OK) {
    const char *msg = tgm_last_error();
    fprintf(stderr, "encode failed [code=%d]: %s\n",
            (int)rc, msg ? msg : "(no detail)");
    return 1;
}

The error enum is non-exhaustive; future Tensogram releases may add new variants without bumping the major version. Always treat any non-TGM_ERROR_OK value as failure and rely on tgm_last_error() for human-readable detail.
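
One way to stay correct against a non-exhaustive enum is to give every switch over the code a default branch that counts as failure. The sketch below uses a stand-in enum so that it compiles without tensogram.h; apart from the zero OK value, the variant name is invented for illustration and does not correspond to a real tgm_error member.

```c
/* Stand-in for tgm_error so the sketch builds without tensogram.h.
 * Only the zero "OK" value mirrors the real enum; DEMO_ERROR_KNOWN
 * is a made-up variant. */
typedef enum {
    DEMO_ERROR_OK = 0,
    DEMO_ERROR_KNOWN = 1
} demo_error;

/* Classify a return code. The parameter is an int so that values this
 * build has never seen can still be passed in; the default branch keeps
 * the caller correct when a newer library returns such a code. */
const char *classify(int rc)
{
    switch (rc) {
    case DEMO_ERROR_OK:    return "ok";
    case DEMO_ERROR_KNOWN: return "known failure";
    default:               return "unrecognised failure";
    }
}
```

In real code the default branch would typically log the numeric code together with tgm_last_error() and then take the same cleanup path as any other failure.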

Versioning policy

Tensogram is currently labelled “Emerging” software (pre-1.0). The ABI policy reflects that:

  • The C library SONAME is MAJOR.MINOR (e.g. libtensogram.so.0.20). Every minor release bumps the SONAME, so consumers must rebuild. Patch releases (0.20.x → 0.20.y) keep the SONAME stable.
  • The C ABI may change between minor releases without an explicit deprecation cycle.
  • The wire format version is independent of the SONAME and lives in the TGM_WIRE_VERSION constant exposed by tensogram.h. See plans/WIRE_FORMAT.md for the wire-level versioning policy.
  • When the project crosses 1.0, the SONAME suffix policy will be reviewed; the cargo-c version_suffix_components = 2 lock in Cargo.toml is the explicit mechanism for that decision.

If you depend on Tensogram from C, expect to rebuild on every minor release for now. The Rust API has the same caveat.
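
The SONAME rule can be expressed as a comparison on the first two version components. The following sketch uses hypothetical helpers (not provided by Tensogram) and the Linux file naming shown in the install layout above; it reports whether a binary built against one release shares a SONAME with another release, which under the current policy is what decides whether a rebuild is needed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the Linux SONAME for a "MAJOR.MINOR.PATCH"
 * version string into out. Returns 0 on success, -1 if the version
 * cannot be parsed or out is too small. */
int format_soname(const char *version, char *out, size_t out_cap)
{
    int major, minor;
    if (sscanf(version, "%d.%d", &major, &minor) != 2) return -1;
    int n = snprintf(out, out_cap, "libtensogram.so.%d.%d", major, minor);
    return (n < 0 || (size_t)n >= out_cap) ? -1 : 0;
}

/* Hypothetical helper: 1 if both versions share MAJOR.MINOR (same SONAME,
 * no rebuild expected), 0 if they differ, -1 on a parse failure. */
int same_soname(const char *built_against, const char *installed)
{
    int a_major, a_minor, b_major, b_minor;
    if (sscanf(built_against, "%d.%d", &a_major, &a_minor) != 2) return -1;
    if (sscanf(installed, "%d.%d", &b_major, &b_minor) != 2) return -1;
    return a_major == b_major && a_minor == b_minor;
}
```

Under this rule, 0.20.0 and 0.20.7 map to the same SONAME, while 0.20.0 and 0.21.0 do not.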

See also

  • C++ API — RAII / exceptions wrapper over the C API.
  • Error Handling — taxonomy of tgm_error variants and what each one means.
  • Internals — wire format, encoding pipeline.
  • rust/tensogram-ffi/README.md — crate-level overview.
  • The full C header: tensogram.h, regenerated by cbindgen on every cargo build. Source of truth for every type and function signature.