
Introduction

This book is conceptual documentation for Graviola: a schema-driven semantic CRUD framework used across several projects while it matures. It is written for developers who are comfortable with architecture, integration, and data modeling—not for end users.

What you will find here

What this book is not

  • Not the full framework API or package-by-package reference (that lives in the monorepo and will grow in separate technical docs).
  • Not a Storybook substitute: UI components and interactive examples belong in Storybook; this book points there when useful.
  • Not a single customer narrative: examples are illustrative across domains (heritage, internal tools, offline-first, etc.).

How to read progressively

  1. Start with The shape of a federated application if the problem is new.
  2. Read What Graviola is and Capabilities today for the current product story.
  3. Use Architecture and data flow as the map of layers and pipelines.
  4. Treat Architectural trajectory, Graviola in the age of generative tools, Outlook and open questions, and the Glossary as deepening material—optional until you need precision on lenses, sync, trust, generative workflows, or vocabulary.
  5. If authoring fragmentation (many files per model) matters to your team, read LinkML as an authoring source for schemas for a build-time pattern that leaves Graviola's runtime unchanged.

Canonical seed sources for this edition live under seed/ in the same repository; chapters here are the book-shaped rearrangement of that content.

See also

The shape of a federated application

A reader's primer to the problems Graviola addresses.


1. Where the difficulty begins

Most software is built on a comfortable assumption: there is one database, the application owns it, the schema is what the team decided, and every record is under the same roof. Frameworks, tooling, and conventional wisdom all rest on this picture. It works well for a great many systems and should not be abandoned where it suffices.

Some applications, however, cannot live inside that picture. A cataloging system needs to reference biographies that are maintained by a national library. A research tool needs to align its records with public datasets that change on their own schedule. A personal information system needs to make sense of files, messages, and bookmarks that arrived from many different programs over many years. In each of these cases, the application has data of its own — but it is surrounded by, and dependent on, data it did not produce.

This document is a guided tour of that surrounding landscape, written for readers who have not yet built systems of this shape. It introduces the conceptual terrain in two halves: the data side (where information comes from) and the representation side (how that information is given visible form). Both halves shape Graviola's design.


2. The data landscape

It is common to speak of data sources in terms of ownership — your data versus theirs — but ownership is a coarse instrument. The more useful question is how much interpretive work is required to bring data into your application's working model. The landscape can be sketched in roughly four bands.

flowchart LR
    A["Primary data<br/>under your control"]
    B["External, aligned<br/>peer triple stores, Solid pods,<br/>federation partners"]
    C["External, structured<br/>open data, public APIs,<br/>authority files"]
    D["External, unstructured<br/>PDFs, web pages,<br/>scanned documents"]

    A --> APP["Application's<br/>working model"]
    B -->|federated query| APP
    C -->|declarative mapping| APP
    D -->|extraction process| APP

Primary data is the application's own. Its schema is decided by the team that builds the application; its migrations are run on the team's terms; its reliability is the team's responsibility. This is the comfortable case.

External, aligned data is held by others but already speaks a vocabulary the application can understand. Another organization's triple store, a federation partner's Solid pod, a peer running the same software at a different site — in each case the data arrives in a form that requires identification but not interpretation. The work is to query it, to merge it, and to keep track of provenance.

External, structured-but-unaligned data is the largest band by volume. It is well-formed — public datasets, authority files such as Wikidata or the German Integrated Authority File, REST APIs returning JSON — but it speaks someone else's vocabulary. To enter the application's working model, it must be transformed: a birthDate in one schema becomes a dateOfBirth in another; a flat string is split into structured components; a nested array is flattened or restructured. This is the territory of declarative mapping.
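The idea of declarative mapping can be sketched in a few lines. The following TypeScript is illustrative only: all type and function names here (MappingEntry, applyMapping, and so on) are hypothetical stand-ins, not Graviola's actual mapping API, and the real mapping documents are JSON-LD-flavored rather than code.

```typescript
// Hypothetical sketch of a declarative mapping entry: pair a source path
// with a target path, optionally invoking a transformation strategy.
type MappingEntry = {
  source: string;                          // dot-path into the external record
  target: string;                          // dot-path into the local model
  strategy?: (value: unknown) => unknown;  // optional transformation
};

const getPath = (obj: any, path: string): unknown =>
  path.split(".").reduce((acc, key) => acc?.[key], obj);

const setPath = (obj: any, path: string, value: unknown): void => {
  const keys = path.split(".");
  const last = keys.pop()!;
  const parent = keys.reduce((acc, key) => (acc[key] ??= {}), obj);
  parent[last] = value;
};

// Apply a mapping document to one external record.
function applyMapping(entries: MappingEntry[], source: object): object {
  const result: object = {};
  for (const e of entries) {
    const raw = getPath(source, e.source);
    if (raw === undefined) continue;
    setPath(result, e.target, e.strategy ? e.strategy(raw) : raw);
  }
  return result;
}

// Example: a Wikidata-shaped record mapped into a local person schema.
const wikidataPerson = { claims: { birthDate: "1879-03-14" } };
const mapping: MappingEntry[] = [
  {
    source: "claims.birthDate",
    target: "dateOfBirth.year",
    strategy: (v) => parseInt(String(v).slice(0, 4), 10),
  },
];
const local = applyMapping(mapping, wikidataPerson);
// local is { dateOfBirth: { year: 1879 } }
```

The essential property is that the mapping is data, not code: it can be stored alongside the schema, inspected, and extended without touching an engine.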

External, unstructured data carries information with no schema at all: PDFs, scanned documents, web pages, audio transcripts, photographs of receipts, and so on. In order to become usable, this data must undergo a structuring process—whether that's a hand-written extractor, a rule-based pipeline, supervised machine learning, or (increasingly) a large language model. The goal of such processes is to output structured data, which then enters the same declarative mapping funnel as the structured-but-unaligned data described earlier.

The boundaries between these bands are rarely clear-cut. A federated peer's data may be perfectly aligned in some areas and completely foreign in others; a language-model extractor may yield structured output along with confidence scores or provenance metadata that require further interpretation. The point is not to classify sources obsessively, but to gauge the distance, in transformation work, between any given source and your application's working model.

This layered view has deep parallels with Tim Berners-Lee's 5-star deployment scheme for Linked Open Data, which describes a progression from raw data on the web, to structured formats, to standardized schemas, to linked data, and finally to full interlinking with external sources. But this is not a concern only for "open data" or public datasets: every application—whether its sources are open, closed, or internal—faces this gradient of alignment, structuring, and integration. The five-star model offers a lens for thinking about all data sources and the varying levels of effort required to bring them "home" into your application's ecosystem.


3. From having data to showing data

Once information has reached the application's working model, a second question opens. People do not consume models; they consume views of models. The same record will be encountered in a list of search results, a row in a table, a card in a sidebar, an entry on a map, a node in a graph, a full-page detail screen. Each appearance shows part of the same underlying entity, but the part shown — and the way it is shown — varies enormously.

The variety can be organized along two axes.

The first is the arrangement of many entities. A table arranges entities into rows and columns. A list arranges them vertically with custom layout per row. An explorer view arranges them as a folder hierarchy. A map arranges them by geographic coordinate. A timeline arranges them by date. A graph arranges them by relationship. Each is an answer to "how should many of these be shown together?" and each is appropriate to different data and different tasks.

The second axis is the size of the canvas given to a single entity. The same person record may need to appear:

flowchart TB
    E["A single entity<br/>e.g. a person record"]

    E --> Cell["Cell in a table<br/>name only, perhaps abbreviated"]
    E --> Chip["Chip in a query result<br/>label + icon + color"]
    E --> Card["Card in a sidebar<br/>portrait + summary + key facts"]
    E --> Page["Full detail page<br/>all fields, all links"]

Each of these is, in some sense, a "detail view" — but the term flattens an important distinction. A detail view is not a single thing. It is a family of representations of an entity, parameterized by available space, by the user's current task, and by the device on the other end of the screen. A chip has perhaps thirty pixels of width and must communicate identity in a glance: a label, perhaps an icon, perhaps a color band. A sidebar card has more room and can introduce an image, a brief summary, a few key facts. A full page is unconstrained and can show everything the schema describes.
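The "family of representations parameterized by available space" can be made concrete with a small sketch. Everything below is illustrative, not Graviola API: a single selection function standing in for what the framework does through its dispatch mechanism.

```typescript
// Illustrative only: one entity, several representations, chosen by the
// size of the canvas available. The names here are hypothetical.
type Person = { label: string; summary: string };

function representation(p: Person, widthPx: number): string {
  if (widthPx < 60) return p.label.slice(0, 8);                  // table cell
  if (widthPx < 160) return `[chip] ${p.label}`;                 // chip
  if (widthPx < 400) return `[card] ${p.label}: ${p.summary}`;   // sidebar card
  return `[page] ${p.label}\n${p.summary}`;                      // full detail page
}

const ada: Person = { label: "Ada Lovelace", summary: "Mathematician" };
console.log(representation(ada, 120)); // "[chip] Ada Lovelace"
```

A hard-coded cascade like this is exactly what does not scale across many entity types; the tester mechanism described in the next chapter replaces it with open-ended dispatch.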

The harder design question is not how to render any one of these. It is how to choose, among the many possible representations of a given entity, the one that fits the current context — and to do so in a way that does not require the application's authors to write a separate component for every entity type at every size.


4. How Graviola approaches representation

Graviola does not prescribe a fixed library of representations. It provides a dispatch mechanism that allows representations to be registered and selected based on the data they encounter and the role they are filling.

The mechanism is built around what Graviola, following the convention of JSON Forms, calls testers. A tester is a small function that examines a piece of data and a context, and reports how well it can render that data in that context. Multiple testers may claim the same data; the one that reports the best fit wins. New testers can be added to a Graviola application without modifying existing ones, and the dispatch table can be inspected, reordered, or overridden per deployment.

Testers operate at every level of the rendering surface. There are testers that decide how a single cell in a table should be rendered — whether the value is shown as plain text, as a link, as a colored badge, or hidden entirely if the column is irrelevant in the current context. There are testers that decide how an entity should be shown as a chip, with the limited vocabulary chips offer: one label, perhaps a popover for more detail, perhaps an icon, perhaps a pattern or color drawn from a category. There are testers that select among detail-view layouts when an entity is opened in a sidebar, a panel, or a full page.

The principle that unifies these uses is structural dispatch: testers match against the shape of the data, not against an entity's nominal type. A tester written to render any object with a latitude and longitude field will fire for places, events, and observations alike, without those types being declared as related. A tester written to render any object with a signedBy field will recognize signed records wherever they appear. The same principle scales from individual fields (where JSON Forms applies it) to whole entities (where Graviola extends it).
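Rank-based structural dispatch, in the spirit of JSON Forms testers, can be sketched as follows. The registry, tester signature, and renderer names below are illustrative assumptions, not Graviola's actual API.

```typescript
// Sketch of rank-based structural dispatch: testers report a fit score,
// the best-ranked renderer wins. Names are hypothetical.
type Tester = (data: unknown) => number; // higher = better fit, -1 = no fit
type Renderer = { tester: Tester; render: (data: any) => string };

const registry: Renderer[] = [];
const register = (r: Renderer) => registry.push(r);

// Dispatch by asking every tester and taking the best-ranked renderer.
function render(data: unknown): string {
  const best = registry
    .map((r) => ({ r, rank: r.tester(data) }))
    .filter((x) => x.rank >= 0)
    .sort((a, b) => b.rank - a.rank)[0];
  return best ? best.r.render(data) : String(data);
}

const isObj = (d: unknown): d is Record<string, unknown> =>
  typeof d === "object" && d !== null;

// Matches shape, not nominal type: anything with latitude/longitude.
register({
  tester: (d) => (isObj(d) && "latitude" in d && "longitude" in d ? 2 : -1),
  render: (d) => `map pin at ${d.latitude},${d.longitude}`,
});
// Generic fallback for any object.
register({
  tester: (d) => (isObj(d) ? 1 : -1),
  render: (d) => `plain object (${Object.keys(d).length} fields)`,
});

// Fires for places, events, and observations alike.
console.log(render({ latitude: 52.52, longitude: 13.4 }));
// "map pin at 52.52,13.4"
```

Note how adding the map renderer required no change to the fallback: conflicts are resolved by rank, which is what makes the layer composable rather than architected.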

The result is that a Graviola application's representation layer is composed, not architected. New representations are added incrementally, conflicts are resolved by ranking rather than by code change, and the same entity can be presented differently in different parts of the application without the application's authors enumerating those differences in advance.


5. Why both halves matter

Discussions of data federation often focus on the data side: how to query across sources, how to merge results, how to maintain provenance. These are real problems and Graviola addresses them. But the representation side is where federated applications most often fail to scale.

A system that brings together data from many sources, in many vocabularies, at many levels of structure, will encounter a corresponding multiplicity of entities and entity shapes. If each shape requires a hand-written representation for each role (cell, chip, card, page), the cost of maintaining the representation layer grows faster than the value of the data being represented. If, on the other hand, the representation layer is fixed — one card design, one detail page — the application loses the ability to show specialized data well.

The middle path is to make representation, like data, a layer that can be composed from declarative pieces and dispatched by shape. This is the design Graviola pursues. The data side and the representation side share a common discipline: in both, the framework's job is to provide structure for cooperation among many small contributions, not to produce a single answer that fits all situations.

A reader who carries away one observation from this primer should carry this: federation is not only the problem of bringing information together. It is also, and equally, the problem of giving that information form once it has arrived.


See also

What Graviola is

A semantic CRUD framework for schema-driven applications


Overview

Graviola is a TypeScript framework for building applications whose central abstraction at runtime is a JSON Schema (or Zod-derived JSON Schema) describing the shape of domain entities. Teams may maintain that schema by hand or generate it upstream in the build (for one documented pattern, see LinkML as an authoring source for schemas); the framework consumes the same artifact shapes either way. From the schema definition Graviola generates and operates: forms for creating and editing entities, tables for browsing them, queries against the storage backend, and validation of the data flowing in and out. The same schema drives the user interface, the persistence layer, and the integration layer.

The framework is storage-agnostic at its core. The same schemas, forms, and tables operate against an in-browser SPARQL store (Oxigraph compiled to WebAssembly), a remote SPARQL endpoint, a Prisma-backed relational database, a REST API, or an in-memory store for testing. This is not abstraction for its own sake: Graviola has been deployed in each of these configurations across different projects.

Graviola also includes a declarative mapping layer for ingesting structured data from external authority sources — Wikidata, the German Integrated Authority File (GND), DBpedia — into the application's local data model. This layer is the framework's most mature non-CRUD subsystem and is currently the primary mechanism by which Graviola handles cross-source data integration.

The framework is published as a monorepo of approximately fifty packages under the @graviola/ scope, designed to be consumed individually rather than as a bundle.


Why Graviola exists

A recurring pattern in domain-specific applications — cultural heritage catalogs, scientific data collection, internal tooling, knowledge management — is the gap between two competing needs:

  • The data model is rich and evolving: nested entities, references between records, multilingual fields, links to external authorities, schema changes over the lifetime of the project.
  • The development resources are bounded: the team cannot afford to hand-write a bespoke form, table, validation rule, and query for every entity type, and cannot afford to rewrite them every time the schema changes.

The conventional answers to this gap each fall short for one of Graviola's core use cases. ORM-driven scaffolding (Django admin, Rails forms, etc.) assumes a relational backend and a single deployed schema. Generic form libraries solve the form problem but not the persistence or query problem. Hand-rolled CRUD abstractions accumulate domain logic and resist reuse across projects.

Graviola's response is to take JSON Schema as the runtime single source of truth (Zod is supported where JSON Schema is derived from it) and derive everything else from it: the form (via JSON Forms), the table (via material-react-table with Graviola wrappers), the query (via the framework's schema-to-SPARQL translator or the equivalent for other backends), and the validation (via Ajv, against the same schema). The schema travels with the data; tooling built on Graviola can be ported between storage backends with minimal change.


See also

Capabilities today

This chapter describes Graviola as it exists in production today. Directions that are not yet implemented in the form described are kept in Architectural trajectory.


Schema-driven CRUD

Given a JSON Schema definition with @id and @type semantics, Graviola provides:

  • GenericForm — a top-level component that, given a schema and an entity IRI, generates a form, loads the entity from the configured store, manages dirty state and validation, and writes changes back. No per-entity-type code is required.
  • SemanticJsonForm — the lower-level component, used when explicit control over schema, UI schema, or data flow is needed.
  • CRUD hooks — useFormData, useFormEditor, useCRUDWithQueryClient, integrated with TanStack Query for caching and invalidation.

The CRUD pipeline translates JSON Schema definitions into store-appropriate operations. For SPARQL backends, this means generating CONSTRUCT queries for reads and INSERT/DELETE patterns for writes; for Prisma backends, it means typed ORM operations; for REST, configurable endpoint patterns.
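The flavor of schema-to-query translation can be illustrated with a deliberately naive sketch. This is not Graviola's sparql-schema translator, which handles nesting, prefixes, and dialect differences; it only shows the core move of deriving a CONSTRUCT query from a schema's properties.

```typescript
// Naive illustration: one OPTIONAL pattern per top-level data property.
// Hypothetical helper, not the framework's actual translator.
type JsonSchema = { properties: Record<string, object> };

function toConstructQuery(schema: JsonSchema, typeIri: string, baseIri: string): string {
  const props = Object.keys(schema.properties)
    .filter((p) => !p.startsWith("@")); // @id/@type are handled separately
  const template = props
    .map((p, i) => `  ?s <${baseIri}${p}> ?v${i} .`)
    .join("\n");
  const patterns = props
    .map((p, i) => `  OPTIONAL { ?s <${baseIri}${p}> ?v${i} . }`)
    .join("\n");
  return `CONSTRUCT {\n${template}\n} WHERE {\n  ?s a <${typeIri}> .\n${patterns}\n}`;
}

const personSchema: JsonSchema = {
  properties: { "@id": {}, "@type": {}, name: {}, birthDate: {} },
};
const q = toConstructQuery(
  personSchema,
  "http://example.org/Person",
  "http://example.org/"
);
// q is a CONSTRUCT query with one OPTIONAL block per data property
```

Because properties are OPTIONAL in the WHERE clause, partially filled records still round-trip, which matters for data ingested from external sources.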

Whether JSON Schema (and companion UI or mapping files) are authored by hand or generated in the application build — for example from LinkML — does not change this pipeline: Graviola consumes the same outputs at runtime.


Form rendering

Graviola uses JSON Forms as its UI rendering substrate. The framework ships a renderer registry covering:

  • Standard field types (text, number, date, boolean, enum)
  • Linked-data-aware renderers (entity pickers that query the configured store, authority lookup widgets)
  • Layout renderers (grids, tabs, sections)
  • Specialized renderers for color input, MapLibre GL maps, and Markdown editing

Renderers are registered once and dispatched by schema shape rather than by entity type. Adding a new entity type to a Graviola application typically requires no new renderer code.


SemanticTable

SemanticTable is a schema-driven table component providing:

  • Pagination, sorting, and filtering against the configured store
  • Soft-delete (move to trash, restore from trash)
  • CSV export
  • Column visibility configuration
  • Row selection and inline editing hooks

The table derives its columns and filters from the same JSON Schema used by the forms, so a change in the schema propagates to both surfaces without intervention.
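Deriving columns from the schema can be sketched as below. The Column shape loosely follows material-react-table's accessor/header convention, but the function and its heuristics are hypothetical, not the actual SemanticTable implementation.

```typescript
// Illustrative: table column definitions derived from the same schema
// that drives the form. Names and heuristics are hypothetical.
type PropSchema = { type: string; title?: string };
type Schema = { properties: Record<string, PropSchema> };
type Column = { accessorKey: string; header: string; sortable: boolean };

function columnsFromSchema(schema: Schema): Column[] {
  return Object.entries(schema.properties)
    .filter(([key]) => !key.startsWith("@")) // hide @id/@type by default
    .map(([key, prop]) => ({
      accessorKey: key,
      header: prop.title ?? key,
      // Arrays and objects need custom cells; scalars sort out of the box.
      sortable: prop.type !== "array" && prop.type !== "object",
    }));
}

const schema: Schema = {
  properties: {
    "@id": { type: "string" },
    name: { type: "string", title: "Name" },
    exhibitions: { type: "array" },
  },
};
const cols = columnsFromSchema(schema);
// [{ accessorKey: "name", header: "Name", sortable: true },
//  { accessorKey: "exhibitions", header: "exhibitions", sortable: false }]
```

Adding a field to the schema then changes both the form and the table, with no per-surface code.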


Declarative authority mapping

Graviola's mapping layer is the production-tested mechanism for transforming records from external authority sources into the application's local data model. Mappings are written as JSON-LD-flavored declarative documents, not code. Each mapping entry pairs a source path (JSONPath against the authority response) with a target path in the local schema, optionally invoking a named strategy for non-trivial transformations.

The strategy catalog includes operations for concatenation, first-match selection, date-string-to-integer conversion, entity creation with authoritative back-links, template substitution, and recursion into nested mappings. The catalog is extensible, and new strategies can be added without modifying the mapping engine.
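An extensible strategy catalog can be sketched as a registry of named functions. The strategy names below are illustrative stand-ins for the kinds of operations the text describes, not the actual catalog.

```typescript
// Hypothetical strategy registry: named transformations that mapping
// entries can invoke; new ones plug in without touching the engine.
type Strategy = (values: unknown[], options?: Record<string, unknown>) => unknown;

const strategies = new Map<string, Strategy>();

strategies.set("concatenate", (values, opts) =>
  values.map(String).join(String(opts?.separator ?? " ")));
strategies.set("firstMatch", (values) =>
  values.find((v) => v !== undefined && v !== null && v !== ""));
strategies.set("dateStringToYear", (values) =>
  parseInt(String(values[0]).slice(0, 4), 10));

// Extension point: register a new strategy, leave the engine unchanged.
strategies.set("upperCase", (values) => String(values[0]).toUpperCase());

function applyStrategy(
  name: string,
  values: unknown[],
  opts?: Record<string, unknown>
): unknown {
  const s = strategies.get(name);
  if (!s) throw new Error(`unknown strategy: ${name}`);
  return s(values, opts);
}

console.log(applyStrategy("concatenate", ["Ada", "Lovelace"])); // "Ada Lovelace"
console.log(applyStrategy("dateStringToYear", ["1815-12-10"])); // 1815
```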

This layer is currently used for ingestion from Wikidata, GND, and DBpedia in cultural heritage applications. It is documented and has been refined across multiple deployments.


Storage backends

Concrete AbstractDatastore implementations available today:

| Backend | Status | Typical use |
| --- | --- | --- |
| In-browser Oxigraph (WebAssembly) | Production | Local-first applications, no-server deployments |
| Remote SPARQL endpoint | Production | Federated data, existing institutional triple stores |
| Prisma (PostgreSQL, SQLite, others) | Production | Internal tools, classical web applications |
| REST API | Production | Integration with existing HTTP services |
| In-memory (Zustand) | Production | Testing, prototyping |

The SPARQL backend supports multiple dialects (standard SPARQL 1.1, Oxigraph, Blazegraph, Allegro) selectable per deployment.


Browser/server symmetry

Graviola's foundation and schema-to-query layers are constrained to be free of React, MUI, or any browser-only dependency. This constraint is enforced because the same packages are consumed by command-line tools (@graviola/edb-cli) and a REST API server (apps/edb-api) running on Bun. The translation from JSON Schema to SPARQL, the graph-to-JSON extraction, and the data-mapping engine all run identically in browser and server environments.

This symmetry is a load-bearing property of Graviola's design and shapes how new capabilities are added.


See also

Architecture and data flow

The framework is organized into six layers, each consuming only from layers below it:

graph TD
    L6["Layer 6 — UI Components<br/>SemanticTable, EntityFinder, advanced components"]
    L5["Layer 5 — Form Rendering<br/>SemanticJsonForm, GenericForm, JSON Forms renderers"]
    L4["Layer 4 — Store Providers<br/>SPARQL, Oxigraph, REST, Prisma, in-memory"]
    L3["Layer 3 — State Management<br/>React hooks, data mapping hooks"]
    L2["Layer 2 — Schema → Query Translation<br/>sparql-schema, graph-traversal, db-impl packages"]
    L1["Layer 1 — Foundation<br/>Core types, utils, JSON Schema utilities, JSON-LD utilities"]

    L6 --> L5
    L5 --> L4
    L5 --> L3
    L4 --> L3
    L3 --> L2
    L2 --> L1

Layers 1 and 2 are the server-safe core: no frontend dependencies, consumed by both browser applications and command-line tooling. Layers 3 and 4 introduce React and storage-specific code. Layers 5 and 6 are the user-facing surfaces.

The data flow for a typical read operation:

flowchart LR
    A["JSON Schema<br/>definition"] --> B["sparql-schema<br/>translator"]
    B --> C["SPARQL CONSTRUCT<br/>query"]
    C --> D["RDF graph<br/>from store"]
    D --> E["graph-traversal<br/>extractor"]
    E --> F["Typed JSON<br/>object"]
    F --> G["State hooks<br/>TanStack Query"]
    G --> H["React<br/>component"]

Writes follow the inverse pipeline: form data is validated against the schema, transformed into RDF triples (or the equivalent for non-RDF stores), and committed via INSERT/DELETE operations.
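The write direction can be made concrete with a minimal, hedged sketch: flattening a validated JSON object with @id/@type semantics into subject-predicate-object triples. The real pipeline emits proper RDF terms and INSERT/DELETE patterns; this illustration uses plain strings and a hypothetical helper.

```typescript
// Illustrative only: flatten an entity into string triples, recursing
// into nested entities that carry their own @id.
type Triple = [subject: string, predicate: string, object: string];

function toTriples(entity: Record<string, unknown>, base: string): Triple[] {
  const s = String(entity["@id"]);
  const triples: Triple[] = [];
  if (entity["@type"]) triples.push([s, "rdf:type", String(entity["@type"])]);
  for (const [key, value] of Object.entries(entity)) {
    if (key.startsWith("@") || value === undefined) continue;
    if (typeof value === "object" && value !== null && "@id" in (value as object)) {
      // Nested entity: link by IRI, then emit the nested entity's triples.
      const nested = value as Record<string, unknown>;
      triples.push([s, base + key, String(nested["@id"])]);
      triples.push(...toTriples(nested, base));
    } else {
      triples.push([s, base + key, JSON.stringify(value)]);
    }
  }
  return triples;
}

const entity = {
  "@id": "http://example.org/p1",
  "@type": "http://example.org/Person",
  name: "Ada Lovelace",
  birthPlace: { "@id": "http://example.org/london", label: "London" },
};
const triples = toTriples(entity, "http://example.org/");
// 4 triples: rdf:type, name, the birthPlace link, and the place's label
```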


See also

Deployment scenarios

The scenarios below describe shapes of deployment that Graviola has supported or is designed to support. Each scenario indicates which capabilities are involved and which are drawn from Architectural trajectory rather than current production.


Cultural heritage and library catalogs

The original driver of Graviola's design. A cataloging team needs to enter records about books, persons, places, exhibitions, or works, with frequent reference to external authorities (GND for German-language records, Wikidata for cross-domain links, VIAF for international author identifiers). The data model is rich, evolves slowly, and must produce valid RDF Linked Data for publication.

Graviola serves this scenario today through:

  • JSON Schema definitions with @id and @type semantics, enabling round-trip to RDF
  • GenericForm for manual data entry, with linked-data-aware renderers for authority lookups
  • SemanticTable for catalog browsing
  • The declarative mapping layer for ingesting authority records into the local model
  • A SPARQL endpoint as the storage backend, allowing the catalog to be queried as Linked Open Data

Trajectory capabilities relevant to this scenario: signed states for expert-curated records; lens-based migration as the schema evolves across project lifetimes.


Offline-first field deployments

A team operates in an environment with intermittent or absent connectivity — a research vessel, a field site, a remote installation. Multiple devices need to share a working data model and current data, without depending on a central server.

Graviola has been deployed in this configuration with a JSON Forms-based schema designer producing schemas at runtime, distributed alongside the data over a Yjs-based WebRTC transport. The application operates entirely offline; reconnection synchronizes both schema and data changes between peers.

The current implementation handles a single shared schema version per peer group. The trajectory direction extends this to peer-specific schema versions reconciled via lens application — a capability whose architectural shape is clear but whose implementation has not been completed.


Privacy-sensitive data collection

An application handles data whose disclosure to a server operator is unacceptable: personal records under regulatory protection, sensitive interview transcripts, internal information that must not be visible to infrastructure providers. The deployment requires that the server function as a transport and storage layer only, never gaining access to plaintext.

Graviola's browser/server symmetry is the load-bearing property here. The schema, the form, the validation, and the encryption all run in the browser before data leaves the device. Servers handle ciphertext. The same Graviola components used for non-sensitive applications operate in this mode without modification, given the appropriate AbstractDatastore implementation.
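The ciphertext-only-server idea can be sketched with Node's built-in crypto (AES-256-GCM). The helpers below are hypothetical, not part of the framework; a real deployment would implement this inside an AbstractDatastore, and a browser build would use Web Crypto rather than node:crypto.

```typescript
// Illustrative round-trip: encrypt on the device, so the server only
// ever stores an opaque blob. Helper names are hypothetical.
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const key = randomBytes(32); // stays on the device, never sent anywhere

function encrypt(plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // iv.tag.ciphertext, base64-encoded ("." never occurs in base64)
  return [iv, cipher.getAuthTag(), data].map((b) => b.toString("base64")).join(".");
}

function decrypt(blob: string): string {
  const [iv, tag, data] = blob.split(".").map((s) => Buffer.from(s, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // GCM authenticates as well as encrypts
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

const record = JSON.stringify({ "@id": "urn:x:1", note: "sensitive" });
const stored = encrypt(record);   // what the server sees: ciphertext only
const restored = decrypt(stored); // what the browser works with
// restored === record
```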


Internal tools with classical backends

Not every Graviola deployment requires the federated, peer-to-peer, schema-evolving model. The framework is also used as a productivity layer over conventional Prisma-backed PostgreSQL or MongoDB databases, where the team owns the model and uses standard migration tooling.

In this configuration, Graviola provides JSON Schema-driven forms, tables, and validation over a Prisma-managed store. Schema evolution is handled by Prisma migrations in the conventional way; the lens engine is not enabled. This deployment shape is an intentional first-class case, not a downgrade.


Authority-linked reference databases

A reference database — biographical, geographical, terminological — needs to maintain links between local entities and one or more external authorities, allowing data to be re-fetched or cross-referenced without losing local annotations.

Graviola provides this through its primary/secondary IRI distinction: each local entity carries a canonical local IRI and any number of sameAs links to authority entries. The mapping layer governs how authority data is transformed into local schema shape on initial ingestion; subsequent updates can re-fetch from the authority and merge changes against the local annotations.
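The merge step can be sketched with a hedged example. The function below is illustrative, not the framework's actual merge logic: authority-derived fields refresh on re-fetch, while fields the team has locally edited win.

```typescript
// Hypothetical merge of a re-fetched authority record into a local
// entity, preserving local annotations and the sameAs links.
type Entity = Record<string, unknown>;

function mergeAuthorityUpdate(
  local: Entity,
  fetched: Entity,
  locallyEdited: Set<string> // fields a cataloger has annotated or corrected
): Entity {
  const merged: Entity = { ...local };
  for (const [key, value] of Object.entries(fetched)) {
    if (!locallyEdited.has(key)) merged[key] = value;
  }
  return merged;
}

const localEntity: Entity = {
  "@id": "http://example.org/p1",             // canonical local IRI
  sameAs: ["https://www.wikidata.org/wiki/Q7259"],
  name: "Ada Lovelace",                       // corrected locally
  occupation: "mathematician",
};
const fetched: Entity = { name: "Augusta Ada King", occupation: "writer" };
const merged = mergeAuthorityUpdate(localEntity, fetched, new Set(["name"]));
// merged.name stays "Ada Lovelace"; merged.occupation becomes "writer"
```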

Trajectory capabilities relevant here: signed states allowing experts to annotate or correct authority-derived data with an audit trail.


See also

Limits, fit, and evaluation


What Graviola is not

Equally important to scope is what Graviola does not attempt:

  • Graviola is not a database. It is a layer over storage backends. The choice of triple store, relational database, or REST service is the application's, not the framework's.
  • Graviola is not a reasoner. The conceptual model is reasoning-shaped (property-driven class derivation, transitive sameAs), but inference, where required, is performed by the underlying store or by application code. The framework does not ship an OWL reasoner.
  • Graviola is not a CMS. There is no built-in role model, publication workflow, asset pipeline, or page composition system. Applications building such features do so on top of Graviola's CRUD primitives.
  • Graviola is not a complete frontend stack. It provides components for forms and tables, but page routing, application shell, theming, and authentication are application concerns. The example application (apps/testapp) demonstrates one way to compose these but does not prescribe it.
  • Graviola is not a substitute for a hand-tuned schema in performance-critical scenarios. Schema-driven query generation introduces overhead. For high-throughput services with stable schemas, Graviola is appropriate at the application layer but should not be assumed performant in the inner loop of a search engine or analytics system.

Evaluating Graviola for a project

The framework is most appropriate when the following hold:

  • The application is built around a domain data model with multiple related entity types.
  • The data model is expected to evolve, or already exists in multiple representations across data sources.
  • Forms and tables for these entity types would otherwise need to be hand-written and maintained.
  • JSON Schema is acceptable as the central description language.
  • The deployment can accept TypeScript on the application side.

The framework is less appropriate when:

  • The data model is fixed, simple, and unlikely to change.
  • The application is dominated by a single bespoke interaction surface (a custom editor, a domain-specific visualization) rather than CRUD over structured records.
  • The existing technology stack is not JavaScript/TypeScript and crossing that boundary is undesirable.

For teams considering Graviola, the recommended starting point is apps/testapp in the framework's monorepo. It is a minimal Vite + React application demonstrating GenericForm over a small schema with nested entities. The application is approximately one screen of code and exercises the core CRUD path end-to-end.


Repository and reference

The framework is published at github.com/gravio-la/graviola-framework. The monorepo contains approximately fifty packages under the @graviola/ scope, organized by the layer architecture described in Architecture and data flow. The canonical example application is apps/testapp.

A separate Glossary defines the framework's terminology, with references to the literature and prior projects underlying each concept.


See also

Graviola in the age of generative tools

This chapter addresses a question readers may bring from the current tooling landscape: whether a schema-driven framework like Graviola remains relevant when generative models can produce working application code from a single prompt.


1. The question worth asking

A reader could be forgiven for asking why a framework like Graviola should exist at all in a moment when a single prompt can produce a working application. The trajectory of generative coding tools has been steep, and the comfortable assumption that hand-written software is the durable form of an application is being tested in real time. If an LLM can write the form, the validator, the database access, the storage layer, and the UI in one shot, the case for a structured framework is at least worth re-examining.

This chapter takes the question seriously rather than waving it away. It argues, briefly, that the moment generative tools become genuinely capable of producing application code is precisely the moment a framework like Graviola becomes more valuable to its users — not less. The argument rests on what kind of artifact is produced, who can revise it, and how AI assistance can be layered onto a structured system in ways that are difficult to layer onto a hand-rolled monolith.

A working starting point for this chapter is the assisted-forms-designer, an existing project in the Graviola orbit. It is a WYSIWYG editor for JSON Forms (and Graviola forms) that has recently been extended with an AI assistant. The assistant can produce a full schema and form from a prose description of what the application needs, can take an existing schema and propose a form layout, or can offer incremental suggestions while a domain expert builds a form by hand. The project is small, it is real, and it sketches the shape of a broader pattern.


2. The economics of generation

When generation is cheap, the question shifts from can the system produce code to what should the produced artifact look like. Two extremes are worth contrasting.

In one direction, a generative tool produces an entire bespoke application — its own forms, its own validation rules, its own database access, its own UI. The application is a self-contained monolith. Reviewing it requires reading all of it. Modifying it requires understanding how its pieces fit together, none of which has been factored against any external convention. Regenerating part of it risks invalidating the rest. The result is fast to produce and slow to evolve.

In the other direction, a generative tool produces a small set of declarative artifacts — a schema, a form definition, a few annotations, perhaps a custom tester or two — that plug into an existing framework with known semantics. The framework supplies the form rendering, the validation, the persistence, the query engine, the UI components, the storage abstraction. The generated surface is small, its boundaries are clean, and each part can be regenerated independently. Reviewing the generated artifacts is the same task as reviewing hand-written ones. Domain experts who could not read application code can read a JSON Schema, or a form layout, or an annotation set.

The second pattern is what Graviola enables. The framework's architecture — JSON Schema as source of truth, structural dispatch for representation, declarative mappings for integration, the entire structure described in earlier chapters — is precisely the structure that makes generative assistance tractable. The model an LLM is asked to produce is small, well-defined, and reviewable. Most of the application is supplied by the framework, not by the model.
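To make the contrast concrete, here is a hypothetical sketch of the entire generated surface in the second pattern: a JSON Schema and a JSON Forms-style UI schema. All names are illustrative; everything else (rendering, validation, persistence, querying) would be supplied by the framework.

```typescript
// Hypothetical sketch: the whole reviewable surface of a small generated
// application. Property names and the $id are illustrative, not framework API.
const schema = {
  $id: "https://example.org/schemas/person",
  type: "object",
  properties: {
    forename: { type: "string" },
    surname: { type: "string" },
    birthDate: { type: "string", format: "date" },
  },
  required: ["forename", "surname"],
};

// A JSON Forms-style UI schema binding layout elements to schema nodes by scope.
const uischema = {
  type: "VerticalLayout",
  elements: [
    { type: "Control", scope: "#/properties/forename" },
    { type: "Control", scope: "#/properties/surname" },
    { type: "Control", scope: "#/properties/birthDate" },
  ],
};
```

Reviewing these two objects is the whole review task; regenerating one does not invalidate the other.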

This is a less glamorous claim than the headline that AI will write entire applications. It is also closer to what teams actually need.


3. Three layers of assistance

A schema in a Graviola application has a lifecycle. It is authored, then it is used to fill in data, and over its lifetime it accumulates relationships with external records that need to be mapped into its terms. AI assistance can attach at each of these stages, doing different work in each, but always against the same schema.

flowchart TB
    subgraph LIFECYCLE ["The schema's lifecycle in a Graviola application"]
        A["Authoring<br/><i>schema, form,<br/>annotations</i>"]
        F["Filling<br/><i>creating instances<br/>against the schema</i>"]
        I["Integration<br/><i>mapping external<br/>records into the model</i>"]
        A --> F
        F --> I
    end

    AID_A["AI assistance:<br/>generate or refine<br/>schema and form"] -.-> A
    AID_F["AI assistance:<br/>guide the filler<br/>using field annotations"] -.-> F
    AID_I["AI assistance:<br/>suggest mappings<br/>for unfamiliar records"] -.-> I

3.1 Authoring assistance

This is where assisted-forms-designer already operates. A domain expert — a librarian, a curator, a researcher, a small-NGO administrator — describes what their application needs. The assistant produces a draft schema and form. The expert reviews the draft in a WYSIWYG editor, adjusts what the assistant got wrong, and adds the constraints that only a domain expert can know. The result is a schema and form definition that the framework consumes directly.

The crucial property is that the artifact under review is the deliverable. The expert is not reviewing generated code that will then be deployed; they are reviewing a schema that is itself the description of the application. If the assistant misunderstood the domain, the expert sees the misunderstanding in a form they can read, and can correct it in the same WYSIWYG editor. There is no opaque code layer between the description and the running application.

3.2 Filling assistance

Once the schema and form exist, the application's users are not the same people who authored it. A field researcher uses the form to enter observations. A volunteer enters event registrations. A cataloger enters bibliographic records. These users are domain-knowledgeable but may not be familiar with every field, every constraint, or every edge case the schema admits.

A second layer of AI assistance attaches here, drawing its instructions from the same annotations the authors placed on form fields. An annotation might say "describe the substrate's texture as rough, smooth, or granular"; the assistant uses this to help a researcher whose hands are full convert a verbal description into a structured field value. Another annotation might say "this field expects the canonical English title; if the source uses a translated title, prefer the original"; the assistant uses this to flag an inconsistency in what the user entered.

The pattern is that the schema's annotations become the assistant's instructions. The application author writes guidance for human users; the same guidance, read by an AI assistant, helps users follow it. No separate authoring effort is required for the AI layer. The annotations exist because human users benefit from them, and the AI assistance is a derivative use of the same content.
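A hypothetical sketch of the derivative use: the description an author writes for human users is read back as an assistant instruction. The key used here (description) is standard JSON Schema; Graviola's richer annotation vocabulary may differ, and the wrapper function is illustrative.

```typescript
// The annotation the author wrote for humans, on a single form field.
const textureField = {
  type: "string",
  enum: ["rough", "smooth", "granular"],
  description:
    "Describe the substrate's texture as rough, smooth, or granular",
};

// Derivative use: hand the same guidance to an AI assistant. No separate
// "AI configuration" layer exists; this function is a hypothetical example.
function assistantInstruction(field: { description?: string }): string {
  return field.description
    ? `Help the user satisfy this guidance: ${field.description}`
    : "No authored guidance for this field.";
}

const instruction = assistantInstruction(textureField);
```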

3.3 Integration assistance

The third layer is the one most familiar to projects that work with linked data. A cataloger encounters a record from an external authority — a Wikidata entry, a GND person, a record from a partner institution — and needs to bring it into the local model. The existing declarative mappings cover the common cases, but the cataloger has found a record that does not quite fit. Some fields are present in unfamiliar shapes; some have no obvious counterpart; some carry information at a different level of granularity than the local schema expects.

An AI assistant placed at this point in the workflow has access to the local schema, the existing mapping configurations, and the unfamiliar record. It can suggest a candidate mapping, flag which fields would be lossy, and propose either a one-off transformation for this record or a new general mapping rule for the project. The cataloger reviews the suggestion in the same way an expert would review a draft from a junior colleague — accepting, refining, or rejecting — and the accepted result becomes part of the project's mappings, available to the next cataloger who encounters a similar record.

This is the kind of integration work that is genuinely tedious for humans, genuinely tractable for AI assistants, and genuinely consequential when it goes wrong. The framework's existing structure makes it possible to assist without taking over: the assistant proposes against an explicit, reviewable schema; the human's role is judgment, not transcription.
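A hypothetical sketch of what an accepted suggestion might look like. The Wikidata property IDs are real (P735 given name, P734 family name, P569 date of birth), but the suggestion shape and the apply step are illustrative, not Graviola's mapping configuration format.

```typescript
// An external record the existing mappings did not cover.
const externalRecord: Record<string, string> = {
  P735: "Albert",     // given name
  P734: "Einstein",   // family name
  P569: "1879-03-14", // date of birth
};

// The assistant's proposal: per-field rules, each flagged for lossiness,
// reviewable by the cataloger before acceptance. Illustrative shape only.
type MappingRule = { from: string; to: string; lossy: boolean };

const suggestion: MappingRule[] = [
  { from: "P735", to: "forename", lossy: false },
  { from: "P734", to: "surname", lossy: false },
  { from: "P569", to: "birthDate", lossy: false },
];

// Applying an accepted suggestion yields a record in the local model's terms.
function applyMapping(record: Record<string, string>, rules: MappingRule[]) {
  const local: Record<string, string> = {};
  for (const rule of rules) {
    if (record[rule.from] !== undefined) local[rule.to] = record[rule.from];
  }
  return local;
}

const local = applyMapping(externalRecord, suggestion);
```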


4. Why this configuration works

Three properties of Graviola's existing design make the layered assistance pattern viable, and none of them was added with AI assistance in mind. They are consequences of the framework's structural-dispatch and schema-as-source-of-truth choices.

Small surface area for generation. An LLM asked to produce a Graviola application produces a schema, possibly a form definition, possibly some annotations, and possibly a custom tester. It does not produce the form rendering, the validation, the database access, the query engine, or the UI scaffolding. The model's output is small, its shape is well-defined, and its correctness can be checked by reading the artifact rather than by running it.

Reviewable artifacts. The artifacts the assistant produces — schemas, form definitions, annotations, mappings — are the same artifacts a domain expert authors by hand. They are not intermediate representations or scaffolds for code that will be generated next. The expert reviews the actual deliverable. When the assistant is wrong, the wrongness is visible at the level the expert can correct.

Attachment points for guidance. The same annotations that drive UI rendering, that mark calculated fields, that declare authorization rules, also serve as the natural places to attach guidance for human users — and, by extension, instructions for AI assistants helping those users. The annotation surface is unified; there is no separate "AI configuration" layer.

These properties are independent. A framework could have any one of them without the others. Graviola has all three because they fall out of the same design discipline.


5. What this future is not

Equally important to the vision is the boundary on what the framework will not become.

This is not a pivot to AI-first development. Graviola's primary commitment remains to applications that domain experts can build, evolve, and own without AI assistance. The pattern described here is additive: applications that never use any AI assistance run identically to applications that use it at every stage.

This is not autonomous agents replacing human authors or users. At every stage described in section 3, a human reviews and accepts the assistant's output. The assistant proposes; the human decides. The framework's audit trail (the schemas, the mappings, the annotations) reflects the human's decisions, not the assistant's suggestions.

This is not a claim that AI will replace the framework's structural choices. Reasoning, dispatching, validating, and querying are still the framework's responsibilities. Generative tools change what is supplied to the framework, not what the framework does with it.

This is not a roadmap of features. The assisted-forms-designer is the only piece of this picture currently implemented. The form-filling assistance and integration assistance described in sections 3.2 and 3.3 are directional: they require building, not just enabling. They are sketched here because the framework's existing structure makes them feasible without architectural change, not because they are imminent.


6. A modest closing claim

The most defensible claim about Graviola in the age of generative tools is the modest one: a framework whose central artifact is a small, reviewable, declarative schema is well-positioned for a world in which schemas can be drafted, refined, and used with AI assistance. The same properties that make the framework approachable for human authors — small artifacts, explicit annotations, structural dispatch — make it approachable for assistants working alongside human authors.

The earlier chapters of this book describe what Graviola is today and where its architecture is heading. This chapter sits beside them rather than in front of them: the future glimpsed here does not require the framework to become something it is not. It requires the framework to remain what it has been — small, structured, schema-driven, oriented toward domain experts — while letting new tools attach themselves to the surfaces that already exist for human use.

The first place to look, for readers wanting to see this in motion, is the assisted-forms-designer repository. It is the smallest concrete instance of the pattern this chapter describes, and it is the foundation on which the rest can be built.


See also

Architectural trajectory

The capabilities below are not yet implemented in production in the form described. They represent the architectural direction of the framework, informed by both prior research and the requirements of Graviola's existing users. Each is documented here so that current development decisions remain compatible with these directions.

The discipline applied to this section: a capability is described here only when its shape is clear enough that the team has chosen not to foreclose it through current design choices.

For what ships today, see Capabilities today.

Authoring versus trajectory: build-time modeling choices (for example generating JSON Schema and UI schema from LinkML as an authoring source for schemas) are separate from the capabilities below. LinkML is an optional application build step; it does not move unfinished runtime features into production.

Generative tooling versus trajectory: assistance that drafts or refines schemas, forms, and mappings (see Graviola in the age of generative tools) attaches to the same declarative surfaces Graviola already uses; it does not substitute for the runtime capabilities sketched below.


Schema evolution via lenses

Schemas evolve over the lifetime of an application. In Graviola's current deployments, this is handled either by classical migration scripts (where the application owns its database) or by manual rewriting of mapping configurations (where data is ingested from a versioned authority).

The architectural direction is to express version-to-version transformations as bidirectional lenses — small, composable, declarative documents that describe how to migrate data forward to a newer schema and, where possible, backward to an older one. This is a well-studied pattern; the closest existing implementation is Project Cambria from Ink & Switch.

In Graviola's intended model, each entity carries a gra:version property identifying the schema version under which it was authored. A consumer encountering an entity at a different version applies the appropriate lens chain at query time. The lens engine is an opt-in capability of an AbstractDatastore implementation, not a requirement.

Related glossary entries: Lens, Entity version, Schema drift, Lens-as-data.


Calculated fields

Some schema properties are best expressed as derivations rather than stored values: a person's full name from forename and surname; an aggregate computed across linked entities; a status flag derived from temporal conditions. The intended mechanism is a declarative formula language (HyperFormula-shaped), with each calculated field declaring its dependencies, its complexity class, and the resources it requires to evaluate.

The complexity and capability annotations matter because Graviola's deployment targets range from in-browser applications on commodity hardware to server-side processes with substantial compute. A calculated field that is acceptable on a server may be prohibitive in a browser; the runtime will choose between eager and lazy evaluation, or refuse to evaluate, based on the host's declared capabilities.
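A sketch of the intended shape, assuming a declared complexity class per field and a host-declared ceiling. All names here are illustrative, not the planned formula language.

```typescript
// A calculated field declares its dependencies and its complexity class.
type Complexity = "O(1)" | "O(n)" | "O(n^2)";

type CalcField = {
  dependencies: string[];
  complexity: Complexity;
  evaluate: (deps: Record<string, any>) => any;
};

const fullName: CalcField = {
  dependencies: ["forename", "surname"],
  complexity: "O(1)",
  evaluate: (d) => `${d.forename} ${d.surname}`,
};

// An in-browser host might declare a ceiling of "O(1)"; a server "O(n^2)".
// The runtime checks the declaration before evaluating, and can refuse.
function canEvaluate(field: CalcField, hostCeiling: Complexity): boolean {
  const order: Complexity[] = ["O(1)", "O(n)", "O(n^2)"];
  return order.indexOf(field.complexity) <= order.indexOf(hostCeiling);
}

const value = fullName.evaluate({ forename: "Ada", surname: "Lovelace" });
```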

Related glossary entries: Calculated field, Capability context, IVM.


Signed states and authoritative value

For applications where the credibility of data matters — historical databases, cultural heritage catalogs, expert-curated reference works — the framework's intended trust model is built on signed states: cryptographically signed snapshots of an entity (or of a lens, or of a schema) attesting that a named party vouches for its correctness at a moment in time. Multiple signatures, weighted by the trust graph among signers, contribute to a computed authoritative value used to surface plausible versus contested entries.

The cryptographic substrate is intended to be the W3C Verifiable Credentials Data Model.

Related glossary entries: Signed state, Authoritative value.


Schema and lens as syncable data

Graviola's existing storage layer treats data as documents. The intended extension is to treat schemas and lenses themselves as documents — JSON-LD documents with stable @ids, syncing through the same transport (Yjs, Solid, SPARQL endpoints) as application data. This generalizes a pattern observed in field deployments where domain experts authored schemas via JSON Forms-based designers and distributed them peer-to-peer alongside the data.

When schemas, lenses, and data all flow through one transport, signing extends uniformly to all three.

Related glossary entries: Schema-as-data, Federated sync layer.


See also

LinkML as an authoring source for schemas

This chapter describes an optional authoring path for teams using Graviola: adopt LinkML as a single document they edit, then run a build-time generator that emits the same artifacts Graviola already consumes — JSON Schema or Zod, JSON Forms UI schema, mapping and other declarative configuration. Graviola at runtime is unchanged and takes no LinkML dependency.

For what ships today in the framework, see Capabilities today. Planned features mentioned below (for example calculated fields) are described under Architectural trajectory.


Context

Graviola is built around JSON Schema at runtime. The choice is deliberate: JSON Schema is widely understood, well-tooled, and used across many communities outside the semantic-data world. JSON Forms consumes it directly. Validators are abundant. The translation from JSON Schema to SPARQL, to relational queries, and to TypeScript types is well-explored in the framework. Some applications use Zod 4 instead, deriving JSON Schema from Zod where the framework requires it; the framework supports both shapes.

In practice, however, a Graviola application's model rarely lives in a single file. Around the central JSON Schema (or Zod schema) accumulate complementary declarations for the same conceptual model:

  • A UI schema giving JSON Forms rendering hints that the schema alone cannot supply.
  • Declarative mappings for transforming data from external authorities (Wikidata, GND, DBpedia) into the local model.
  • Application-specific configuration for authorization, calculated fields, default views, and other cross-cutting concerns.
  • Occasional JSON Schema extensions (x-* keys) for things the standard does not express — for example the inverse of a property.

At runtime these pieces are unified by a shared addressing convention: scopes (JSON Pointer-like paths), mapping-layer selectors, and type IRIs for entity classes. That co-existence is intentional.

What is awkward at authoring time is fragmentation: domain experts may edit several files in several formats, each with its own naming and reuse conventions.

One concrete example: an x-inverseOf extension can declare that one property is the reverse of another. JSON Schema has no native inverse; the extension lets the query planner and graph-to-JSON extractor know which side is canonical. It works, but it pushes semantic detail into a vocabulary that was not designed for it. Similar pressures appear as the framework grows.
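As a sketch, such an extension might look like the following in plain JSON Schema. The pointer-valued form of x-inverseOf shown here is an assumption for illustration, not the framework's documented syntax.

```json
{
  "$defs": {
    "Work": {
      "type": "object",
      "properties": {
        "author": { "$ref": "#/$defs/Person" }
      }
    },
    "Person": {
      "type": "object",
      "properties": {
        "authoredWorks": {
          "type": "array",
          "items": { "$ref": "#/$defs/Work" },
          "x-inverseOf": "#/$defs/Work/properties/author"
        }
      }
    }
  }
}
```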


What LinkML offers

LinkML is a modeling language (YAML-based, linked-data aware) aimed at describing models richly enough that many downstream representations can be generated: JSON Schema, OWL, SHACL, RDF, documentation, and more.

For Graviola-oriented authoring, four properties stand out:

  1. Native constructs for several things JSON Schema only covers by extension — for example inverse:, equals_expression:, identifier: true, multivalued slots, slot reuse across classes. The inverse-property case can be expressed as first-class LinkML instead of only as x-inverseOf in JSON Schema.
  2. Namespaced annotations on classes, slots, and types — for example ui.label, auth.read, calc.complexity. LinkML carries them; a project-specific generator decides how they map to emitted files.
  3. Compile-first workflow — author one schema, generate artifacts per consumer. JSON Schema becomes an output, not necessarily the hand-maintained source.
  4. Single-document legibility — for humans and for tooling (including LLM-assisted authoring), one file can hold structure, relationships, presentation hints, and cross-cutting metadata in one place.

Further reading:


Build-time pattern

The authored source is a LinkML schema. The application author runs a generator in the build pipeline. It emits artifacts that would otherwise be maintained by hand: JSON Schema (or Zod), JSON Forms UI schema, mapping configuration, authorization rules, calculated-field declarations (when the application adopts the trajectory described in Architectural trajectory), and any other agreed outputs.

flowchart LR
    subgraph buildLayer [Application build]
        LK["LinkML schema"]
        M["Generator"]
        JS["JSON Schema or Zod"]
        UI["UI schema"]
        MC["Mapping config"]
        AC["Authorization config"]
        CC["Calculated field declarations"]
    end

    subgraph runtimeLayer [Application runtime]
        APP["Graviola"]
    end

    LK --> M
    M --> JS
    M --> UI
    M --> MC
    M --> AC
    M --> CC
    JS --> APP
    UI --> APP
    MC --> APP
    AC --> APP
    CC --> APP

The generator is an application concern — CLI, build script, bundler plugin, or small program — not part of Graviola. Generated files can be committed for review or produced in CI; either fits the framework.

Two consequences:

  • No framework change is required for this path. CRUD, forms, SemanticTable, and mapping keep consuming the same artifact shapes; Graviola does not care whether they were handwritten or generated.
  • Hand authoring remains fully supported. LinkML is one possible upstream; others are equally valid.

Annotated example

The following LinkML sketch shows classes and slots with annotations in several namespaces. A real project's generator maps each namespace to its target format.

id: https://example.org/schemas/library
name: LibrarySchema
description: Persons and works in a small library catalog
prefixes:
  ex: https://example.org/
  linkml: https://w3id.org/linkml/
default_prefix: ex
imports:
  - linkml:types

classes:
  Person:
    description: A natural person
    tree_root: true
    slots:
      - id
      - forename
      - surname
      - fullName
      - authoredWorks
    annotations:
      ui.list_renderer: chip
      ui.detail_layout: two_column
      auth.read: public
      auth.write: "role:editor"

  Work:
    description: A book, article, or other authored work
    tree_root: true
    slots:
      - id
      - title
      - author
      - publicationYear
    annotations:
      ui.list_renderer: card
      auth.read: public
      auth.write: "role:editor"

slots:
  id:
    identifier: true
    range: uriorcurie

  forename:
    range: string
    required: true
    annotations:
      ui.label: "First name"
      ui.detail.priority: 10

  surname:
    range: string
    required: true
    annotations:
      ui.label: "Last name"
      ui.detail.priority: 10

  fullName:
    range: string
    equals_expression: "{forename} + ' ' + {surname}"
    annotations:
      calc.complexity: "O(1)"
      calc.cached: false
      ui.detail.priority: 1
      ui.label: "Full name"

  authoredWorks:
    range: Work
    multivalued: true
    inverse: author
    annotations:
      ui.list.collapsed: true
      ui.label: "Works"

  author:
    range: Person
    required: true
    annotations:
      ui.label: "Author"

  title:
    range: string
    required: true
    annotations:
      ui.detail.priority: 1

  publicationYear:
    range: integer
    annotations:
      ui.label: "Year of publication"
      ui.detail.priority: 5

Emitted JSON Schema (or Zod) — ranges, identifiers, required and multivalued flags, and class structure carry over in the usual way for your chosen generator.

Emitted UI schema — from ui.* annotations: labels, layout hints, list renderers, field ordering, collapsed lists.

Inverse — inverse: author on authoredWorks replaces a hand-maintained x-inverseOf-style declaration. The generator emits whatever companion shape the project uses for the query planner and graph-to-JSON layer; runtime still does not read LinkML.

Calculated fields — equals_expression and calc.* annotations can feed generated declarations aligned with the direction in Calculated fields.

Authorization — auth.* maps into the application's own rule format; the vocabulary is project-defined.
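For orientation, one plausible shape of the generated bundle for Person follows. Identifier handling, $ref targets, how ui.detail.priority becomes element order, and the options key are all generator decisions, not fixed Graviola conventions.

```json
{
  "schema": {
    "$id": "https://example.org/schemas/library/Person",
    "type": "object",
    "properties": {
      "id": { "type": "string", "format": "uri" },
      "forename": { "type": "string" },
      "surname": { "type": "string" },
      "fullName": { "type": "string", "readOnly": true },
      "authoredWorks": { "type": "array", "items": { "$ref": "Work" } }
    },
    "required": ["forename", "surname"]
  },
  "uischema": {
    "type": "VerticalLayout",
    "elements": [
      { "type": "Control", "scope": "#/properties/fullName", "label": "Full name" },
      { "type": "Control", "scope": "#/properties/forename", "label": "First name" },
      { "type": "Control", "scope": "#/properties/surname", "label": "Last name" },
      { "type": "Control", "scope": "#/properties/authoredWorks", "label": "Works", "options": { "collapsed": true } }
    ]
  }
}
```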


Implementation footprint

Adopting this pattern is bounded work for the application author:

  • A LinkML reader — often the official linkml tooling from a build subprocess, or a TypeScript reader for the subset the project uses.
  • A generator that walks parsed LinkML and emits the chosen artifact set, ideally as small per-namespace handlers (ui, auth, calc, view, …) so new concerns add handlers rather than entangling the whole pipeline.
  • A documented registry of annotation namespaces and meanings for the team.
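The per-namespace handler idea can be sketched in a few lines of TypeScript. The parsed-annotation shape and the output layout are illustrative; a real generator would walk the full parsed LinkML model.

```typescript
// One handler per annotation namespace; adding a concern adds a handler.
type Handler = (target: string, key: string, value: unknown, out: any) => void;

const handlers: Record<string, Handler> = {
  // ui.* annotations feed the generated UI schema
  ui: (target, key, value, out) => {
    out.uischema = out.uischema || {};
    out.uischema[`${target}.${key}`] = value;
  },
  // auth.* annotations feed the application's own rule format
  auth: (target, key, value, out) => {
    out.auth = out.auth || {};
    out.auth[`${target}.${key}`] = value;
  },
};

// Walk one slot's annotations, dispatching each to its namespace handler.
// Unknown namespaces are skipped here; a real generator should report them.
function emitSlot(slot: string, annotations: Record<string, unknown>, out: any) {
  for (const [full, value] of Object.entries(annotations)) {
    const dot = full.indexOf(".");
    const handler = handlers[full.slice(0, dot)];
    if (handler) handler(slot, full.slice(dot + 1), value, out);
  }
}

const out: any = {};
emitSlot("forename", { "ui.label": "First name", "auth.read": "public" }, out);
```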

Graviola packages continue to consume the same outputs as before.


Boundaries

This is a new authoring path, not a mandate.

  • Per-project or per-schema LinkML adoption is fine; the same monorepo can mix LinkML-backed and hand-authored apps.
  • The generator runs at build time; Graviola at runtime sees only generated (or handwritten) artifacts.
  • The generator should translate only a documented subset of LinkML plus agreed annotations. Constructs outside that subset should be ignored or reported, not silently mis-translated.

The generator stays small by design: its contract is what Graviola and the application can act on, not everything LinkML can express.


Summary

Runtime Graviola stays centered on JSON Schema (or Zod-derived JSON Schema) plus companion declaratives, coordinated by scopes, selectors, and type IRIs. Optional LinkML authoring reduces authoring-surface fragmentation: one edited document and a build step that produces the same artifact bundle the framework already expects — with no runtime LinkML dependency and no requirement to abandon hand-maintained schemas.


See also

Outlook and open questions

This chapter collects unresolved design tensions and research-style questions that arise from combining Graviola's trajectory (lenses, calculated fields, signing, federated sync) with real deployments. It is intentionally separate from the Glossary, which stays focused on definitions and stable vocabulary.

For capabilities that are directionally chosen but not yet production guarantees, see Architectural trajectory. For how generative tooling may attach to schema-driven workflows without changing the framework's core contract, see Graviola in the age of generative tools.


Cross-version calc sync

How to reconcile a Calculated field computed on a peer at V_a with one computed on a peer at V_b when the underlying schemas are linked by a Lossy lens. Likely requires the calc to declare its valid version range and the runtime to skip cross-version cache reuse.


Lens inference

Whether (and to what extent) lenses between adjacent schema versions can be inferred from a structural diff of the schemas themselves, rather than authored by hand. Promising for trivial cases (rename, add-with-default); intractable in general.


Provenance through lenses

How Signed state survives forward-and-back migration. A signature over Person_V1 is not a signature over Person_V2 — but if the lens is signed and well-behaved, the trust can be transitively reconstructed. Design unclear.


Calc migration across lossy boundaries

When a Calculated field reads a field that gets split or merged by a Lossy lens, the formula no longer references valid sources in the new schema. Auto-rewriting formulas across lossy boundaries silently produces wrong results; the safe default is to mark the calc as invalidated under that migration and surface it. A better answer is open.


See also

  • Architectural trajectory — where each frontier topic connects to intended architecture.
  • Glossary — formal definitions and references for terms used above.

Graviola Glossary

Navigation: This glossary deepens vocabulary used across the book. For the product overview first, see What Graviola is and Capabilities today. For future direction, see Architectural trajectory. For unresolved design questions that span multiple terms, see Outlook and open questions.

A working vocabulary for the Graviola framework: federated, schema-evolving, local-first semantic data infrastructure. This glossary names the concepts Graviola relies on, points at the literature and prior projects that defined or refined each one, and gives short examples grounded in Graviola's actual use cases (cultural heritage, personal information management, offline-first field deployments).

The glossary is organized in layers, working roughly from foundations outward. Cross-references between entries use bold on first mention. Each entry has a short definition, an Example where one helps, See also cross-references, and References with links to literature or prior projects.

A note on optionality. Graviola is built from small, composable libraries. Most of the machinery described here — lenses, IVM, signed states, reasoning — is optional. Many Graviola applications use a single fixed schema with classical migrations (see Classical Migration) and never touch the lens engine; others use only the UI dispatch layer over a Prisma-backed PostgreSQL or MongoDB store. The architecture is designed so that none of these advanced concepts becomes a blocker for the simple cases.


1. Foundations

1.1 Schema-as-Data

Schemas (LinkML, JSON Schema, SHACL shapes) are not ambient configuration baked into deploys but first-class documents with @ids that travel through the same sync layer as the data they describe. A consequence: schemas can be authored, versioned, signed, and migrated by the same machinery as any other entity.

Example: On a ship with intermittent connectivity, a domain expert authors a JSON Schema via JSON Forms. The schema syncs across peers via Yjs/WebRTC alongside the data conforming to it.

See also: Lens-as-Data, Federated Sync Layer, Entity Version.

References:


1.2 Structural Dispatch

The architectural principle that behavior — UI rendering, mapping, validation, calculation, lens application — is bound to the shape declared by a schema (or to a property carried by an entity), not to a nominal type or class. This single pattern recurs at every layer of Graviola and is what allows components to survive Schema Drift without code changes.

Dispatch is itself a schema-level operation: testers match against Scopes (schema-node pointers) rather than Binding Paths (data traversals). This is why a renderer registered for #/properties/birthDate works against any Person instance without further configuration.

Example: JSON Forms resolves a renderer for {type: "string", format: "date"} regardless of which entity type contains the field. The same principle drives lens dispatch by Entity Version and class derivation by property (see Derived Versioned Class).
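The dispatch can be sketched minimally. The ranking idea mirrors JSON Forms testers, but the API shown here is a didactic illustration, not the library's.

```typescript
// A tester ranks a schema node by shape; higher wins, -1 means no match.
type Tester = (schemaNode: any) => number;

const dateControlTester: Tester = (node) =>
  node && node.type === "string" && node.format === "date" ? 2 : -1;

const textControlTester: Tester = (node) =>
  node && node.type === "string" ? 1 : -1;

// Pick the highest-ranked renderer for a schema node. Note that nothing
// here mentions an entity type: dispatch is purely structural.
function pickRenderer(node: any, testers: Array<[string, Tester]>): string {
  let best = "none";
  let bestRank = -1;
  for (const [name, test] of testers) {
    const rank = test(node);
    if (rank > bestRank) {
      best = name;
      bestRank = rank;
    }
  }
  return best;
}

const chosen = pickRenderer({ type: "string", format: "date" }, [
  ["text", textControlTester],
  ["date", dateControlTester],
]);
```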

See also: Scope, JSON Forms, Declarative Mapping.

References:


1.3 Federated Sync Layer

The transport-and-replication substrate (in Graviola: Yjs, optionally over WebRTC, WebSocket, or Solid Pods) that moves both data and schema documents between peers without assuming a central authority. The sync layer makes no semantic decisions; it only guarantees eventual consistency of opaque documents.

Example: Ship deployment where servers are pure relays and have no plaintext access; peers sync end-to-end-encrypted documents and resolve schema versions locally on reconnection.

References:

  • Yjs
  • Local-first software (Kleppmann, Wiggins, van Hardenberg, McGranaghan, Ink & Switch, 2019)
  • Shapiro, Preguiça, Baquero, Zawirski, "Conflict-Free Replicated Data Types" (2011)

1.4 Reasoning-Compatible, Reasoner-Optional

Graviola's conceptual model is shaped by description-logic and rule-based reasoning (property-driven class derivation, entailment, transitive sameAs), but Graviola does not ship a reasoner. Where the underlying datastore supports reasoning (e.g., a triple store with OWL RL or a SHACL-AF engine), derivations can be materialized; where it does not, the same derivations can be computed at extraction time by Graviola's pipeline. Both paths produce the same query semantics for the cases Graviola cares about.

See also: Derived Versioned Class.

References:


1.5 Scope

A pointer into a schema document — a JSON Pointer such as #/properties/name — that identifies a schema node: a property definition, a type definition, or a rule. Scopes exist at schema-compile time and are absolute within a single schema document; they answer the question "where in the schema does this apply?" JSON Forms uses scopes to bind UI elements to schema nodes (scope: "#/properties/name" means "this UI element corresponds to this schema node"). A scope navigates the schema document, not the data graph; it never crosses a $ref to another entity and has no concept of a current instance.

In knowledge-representation terms a scope is a TBox pointer: it addresses the schema, not the data.

Example: A renderer registry tester matching scope: "#/properties/birthDate" selects the schema property declaration regardless of which Person instance is rendered. The same scope is correct on the empty form, on a half-filled form, and on a saved entity.
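The resolution a scope performs can be shown in a few lines; this is a didactic sketch, not the JSON Forms implementation.

```typescript
// Resolve a scope (a JSON Pointer into the schema document) to a schema node.
// The traversal touches only the schema; no instance data is involved.
function resolveScope(schema: any, scope: string): any {
  return scope
    .replace(/^#\//, "")
    .split("/")
    .reduce((node, seg) => (node == null ? node : node[seg]), schema);
}

const personSchema = {
  properties: { birthDate: { type: "string", format: "date" } },
};

const node = resolveScope(personSchema, "#/properties/birthDate");
// node is the property declaration itself, answered by the schema alone.
```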

See also: Binding Path, Structural Dispatch, JSON Forms.

References:


1.6 Binding Path

A traversal through instance data, guided by the schema, that resolves to one or more data values relative to a current entity. A binding path such as patch.lane.owner starts at "the current instance," follows declared relations across $ref boundaries, and may fan out to many values when any hop is an array. Paths exist at runtime, against a live store; they answer the question "what value do I need from the data graph?"

In knowledge-representation terms a binding path is an ABox traversal, shaped by the TBox: the schema declares which hops are valid; the data supplies the values.

Example: A formula binding EQ(ownerId, currentUserId) resolves ownerId by following lane.owner.id from the current Patch. Substituting a Scope here would be a category error: schema nodes have no runtime values to compare.

Rule of thumb: use a Scope when you are describing something about the schema structure itself — which property a rule applies to, which field a renderer corresponds to, which definition an annotation targets. Use a binding path when you are describing a traversal through data — what value to retrieve, what relation to follow, what to compute over. Scopes are answered by the schema alone; binding paths are answered by the schema plus the data.
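
A minimal runtime sketch of such a traversal, assuming plain nested objects instead of a live store (a real implementation would also dereference $ref hops against the datastore); `resolvePath` and the `patch` shape are illustrative:

```typescript
// Hypothetical helper: walk a dot-separated binding path through instance data.
type Entity = Record<string, any>;

function resolvePath(entity: Entity, path: string): any[] {
  let frontier: any[] = [entity];
  for (const hop of path.split(".")) {
    const next: any[] = [];
    for (const node of frontier) {
      const value = node == null ? undefined : node[hop];
      if (value === undefined) continue;
      if (Array.isArray(value)) {
        next.push(...value); // array hops fan out to many values
      } else {
        next.push(value);
      }
    }
    frontier = next;
  }
  return frontier;
}

// Invented instance data standing in for a resolved data graph.
const patch = {
  lane: { owner: { id: "user-42" }, reviewers: [{ id: "u1" }, { id: "u2" }] },
};

console.log(resolvePath(patch, "lane.owner.id"));     // → ["user-42"]
console.log(resolvePath(patch, "lane.reviewers.id")); // → ["u1", "u2"]
```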

See also: Scope, Calculated Field, Structural Dispatch.


2. Schema Evolution & Versioning

2.1 Schema Version

A specific, content-addressable state of a schema document, identified by an @id and typically a semver tag. Schema versions are themselves documents and live in the same sync layer as data (see Schema-as-Data).

See also: Entity Version, Lens.


2.2 Entity Version

A property — gra:version or equivalent — carried by each named entity, recording which Schema Version the entity was authored under. Version is a property of the entity, not (only) of its container or store. This is what makes mixed-version data within a single store the normal case rather than an exception: a query for entities of a given conceptual type returns instances of all versions, and refinements by version are just property filters.

Example: A query against a federated person index returns {@id: ..., @type: ex:Person, gra:version: "0.3.2", name: ...} and {@id: ..., @type: ex:Person, gra:version: "0.4.5", forename: ..., surname: ...} side by side. The consumer's tool decides whether to apply a Lens based on the gra:version it sees.

See also: Derived Versioned Class, Schema Drift.


2.3 Derived Versioned Class

A conceptual subclass of an entity type, derived at runtime by the value of a property — most importantly Entity Version, but the pattern is general. ex:Person_V2_3_0 is the (conceptual) subclass of ex:Person whose members carry gra:version "2.3.0". Such derived classes can drive query refinement, dispatch in the graph-to-JSON extraction pipeline, and lens selection.

This is the same pattern as deriving ex:Author from ex:Person by the presence of an authored work, or ex:SignedDocument from ex:Document by the presence of a Signed State — property-driven subclass derivation, well-understood in description logics. Graviola applies it to versioning.

Example: The graph-to-JSON pipeline for a visualization plugin declared at V0_3_8 selects entities that are either ex:Person_V0_3_8 directly or are reachable via lens composition from another version. The selection is expressed as a query over gra:version, not as a separate negotiation step.

See also: Structural Dispatch, Reasoning-Compatible, Reasoner-Optional.

References:

  • Baader, Calvanese, McGuinness, Nardi, Patel-Schneider, The Description Logic Handbook (Cambridge University Press, 2003) — for the general theory of property-driven class derivation.

2.4 Schema Drift

The condition in which entities within a federation — or within a single store — carry different values of Entity Version for the same conceptual type. Drift is the normal case in Graviola, not an error to recover from. Tools handle drift either by applying a Lens chain, by restricting their query to a specific Derived Versioned Class, or by falling back to Classical Migration where the application owns the model.

References:

  • COPE / Edapt (Herrmannsdoerfer et al.)
  • Curino, Moon, Zaniolo, "Graceful database schema evolution: the PRISM workbench" (VLDB 2008).

2.5 Lens

A pair of transformations between two schema versions (or between two parallel schemas) consisting of a forward function (get) and a reverse function (put). Lenses are Graviola's primary mechanism for handling Schema Drift. The lens engine is an optional, pluggable component — a concrete AbstractDatastore implementation may or may not enable it, and many Graviola applications run without it.

Example: A lens from Person_V1 (single name field) to Person_V2 (forename, surname) defines how to split forward and how to recombine backward.
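
As a concrete sketch of the get/put pair for this example (the split-on-last-space strategy and the field names are assumptions for illustration, not Graviola's lens format):

```typescript
type PersonV1 = { name: string };
type PersonV2 = { forename: string; surname: string };

const nameLens = {
  // get: forward, V1 → V2; split on the last space (an assumed strategy).
  get(src: PersonV1): PersonV2 {
    const idx = src.name.lastIndexOf(" ");
    return idx < 0
      ? { forename: src.name, surname: "" }
      : { forename: src.name.slice(0, idx), surname: src.name.slice(idx + 1) };
  },
  // put: reverse, V2 back to V1; recombine the parts.
  put(view: PersonV2): PersonV1 {
    return { name: [view.forename, view.surname].filter(Boolean).join(" ") };
  },
};

console.log(nameLens.get({ name: "Ada Lovelace" }));
// → { forename: "Ada", surname: "Lovelace" }
console.log(nameLens.put({ forename: "Ada", surname: "Lovelace" }));
// → { name: "Ada Lovelace" }
```

A last-space split mishandles particles like "van der Berg", which is what the Lossy Lens and Witness Preservation entries below are about.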

See also: Asymmetric Lens, Symmetric Lens, Lens Law, Lens Composition.

2.6 Asymmetric Lens

A lens where one side is canonical and the other is a view. The reverse direction reconstructs the source from the view plus the original source. Most version-pair migrations are asymmetric.


2.7 Symmetric Lens

A lens where neither side is a strict view of the other; both sides may hold information the other lacks. Necessary for cross-vocabulary alignment (e.g., Graviola's local model ↔ Wikidata) where each side has fields the other doesn't.

2.8 Lens Law

A property a well-behaved lens must satisfy. The three canonical laws:

  • GetPut: getting a view and putting it back unchanged yields the original source.
  • PutGet: putting a view, then getting, yields what was put.
  • PutPut: putting twice equals putting once with the latest value (very well-behaved lenses only).

Lenses that fail PutGet are Lossy and require Witness Preservation to round-trip safely.
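
The laws can be checked mechanically. A sketch using a toy projection lens (types and data invented for illustration):

```typescript
type Source = { name: string; age: number };
type View = { name: string };

// Asymmetric lens: project { name, age } down to { name }.
const lens = {
  get: (s: Source): View => ({ name: s.name }),
  put: (v: View, orig: Source): Source => ({ ...orig, name: v.name }),
};

const eq = (a: unknown, b: unknown) => JSON.stringify(a) === JSON.stringify(b);

const src: Source = { name: "Ada", age: 36 };
const view: View = { name: "Grace" };

// GetPut: put(get(s), s) = s
const getPut = eq(lens.put(lens.get(src), src), src);
// PutGet: get(put(v, s)) = v
const putGet = eq(lens.get(lens.put(view, src)), view);

console.log({ getPut, putGet }); // → { getPut: true, putGet: true }
```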


2.9 Lens-as-Data

Lenses are themselves serializable JSON-LD documents with @ids, syncing through the same Federated Sync Layer as schemas and data. A lens can be authored, versioned, and signed independently of code.

Example: A historian signs a lens migrating heritage Person_V1 → Person_V2, vouching that the split of name into forename/surname was performed correctly for their corpus. The signature itself is a Signed State over the lens document.

2.10 Lens Composition

The act of chaining lenses (A→B, B→C, C→D) into a single lens (A→D). Composition is associative; well-behavedness composes. In Graviola, a peer encountering data at V0_3_2 while running V0_4_5 assembles the migration chain by composition.
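
Composition itself is a few lines once a lens shape is fixed. The `Lens` type and the rename lenses below are toy assumptions for illustration, not Graviola's engine:

```typescript
type Lens<S, V> = { get(s: S): V; put(v: V, orig: S): S };

// Compose A→B with B→C into A→C. The reverse direction pushes the C view
// back through B, using ab.get(a) as the "original source" for the inner put.
function compose<A, B, C>(ab: Lens<A, B>, bc: Lens<B, C>): Lens<A, C> {
  return {
    get: (a) => bc.get(ab.get(a)),
    put: (c, a) => ab.put(bc.put(c, ab.get(a)), a),
  };
}

// Toy rename lens: { from: x } ↔ { to: x }.
function rename(from: string, to: string): Lens<Record<string, string>, Record<string, string>> {
  return {
    get: (s) => ({ [to]: s[from] }),
    put: (v) => ({ [from]: v[to] }),
  };
}

const v1toV2 = rename("name", "fullName");
const v2toV3 = rename("fullName", "label");
const v1toV3 = compose(v1toV2, v2toV3); // associativity: grouping is irrelevant

console.log(v1toV3.get({ name: "Ada" }));                 // → { label: "Ada" }
console.log(v1toV3.put({ label: "Ada" }, { name: "Grace" })); // → { name: "Ada" }
```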

See also: Lens Fusion.


2.11 Lens Fusion

Static algebraic simplification of a composed lens chain before execution: a rename followed by a rename of the same field collapses; an add followed by a remove cancels. Graviola's "compile to fast runtime struct" step is fusion, not codegen.

2.12 Lens Operator Catalog

The fixed, small set of lens primitives from which all migrations are built. Cambria's catalog: rename, hoist, plunge, wrap, head, add, remove. Graviola's catalog will likely overlap heavily; the design question is granularity (more primitives = more fusion opportunities; fewer = simpler authoring).


2.13 Lossy Lens

A lens whose forward direction discards information that the reverse direction cannot recover from the view alone. Splitting name → (forename, surname) is lossy in reverse if the original whitespace, ordering, or particle handling matters.

See also: Witness Preservation.


2.14 Witness Preservation

The technique of carrying a small companion record alongside migrated data that records what would otherwise be lost. The reverse lens consults the witness when reconstructing the source.

Example: Forward migration of "van der Berg, Jan" to {forename: "Jan", surname: "van der Berg"} emits a witness {originalName: "van der Berg, Jan", splitStrategy: "comma-first"}.
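
The example can be sketched end to end (the witness shape and the comma-first strategy name come from the example above; everything else is invented):

```typescript
type Witness = { originalName: string; splitStrategy: string };
type PersonV2 = { forename: string; surname: string };

// Forward: lossy "comma-first" split, emitting a witness for the reverse lens.
function forward(name: string): { value: PersonV2; witness: Witness } {
  const [surname, forename] = name.split(", ");
  return {
    value: { forename, surname },
    witness: { originalName: name, splitStrategy: "comma-first" },
  };
}

// Reverse: consult the witness when present; otherwise approximate.
function reverse(v: PersonV2, w?: Witness): string {
  if (w && w.splitStrategy === "comma-first") return w.originalName;
  return `${v.surname}, ${v.forename}`;
}

const migrated = forward("van der Berg, Jan");
console.log(migrated.value);
// → { forename: "Jan", surname: "van der Berg" }
console.log(reverse(migrated.value, migrated.witness));
// → "van der Berg, Jan"
```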


3. Mapping & Integration

3.1 Declarative Mapping

A JSON-LD-flavored DSL (Graviola's existing implementation) describing how to transform a source document into a target document via path-based source/target pairs and optional named Mapping Strategies. Used today primarily for ingesting authority data (GND, Wikidata, DBpedia) into the local model.

Example: The wikidataPersonMapping in Graviola's existing codebase: source $.claims.P569[*].mainsnak.datavalue.value.time → target birthDate via the dateStringToSpecialInt strategy.
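
As a rough illustration of the path-based shape such an entry takes. The types below only approximate the idea and are not Graviola's actual DeclarativeMapping definitions; the second entry is hypothetical:

```typescript
// Approximate, invented types; not Graviola's real DeclarativeMapping contract.
type MappingEntry = {
  source: { path: string };  // JSONPath into the foreign document
  target: { path: string };  // property of the local model
  strategy?: { id: string; options?: Record<string, unknown> };
};

// Entries in the spirit of the wikidataPersonMapping example.
const wikidataPersonMapping: MappingEntry[] = [
  {
    source: { path: "$.claims.P569[*].mainsnak.datavalue.value.time" },
    target: { path: "birthDate" },
    strategy: { id: "dateStringToSpecialInt" },
  },
  {
    source: { path: "$.labels.en.value" }, // hypothetical second entry
    target: { path: "name" },
    strategy: { id: "takeFirst" },
  },
];

console.log(wikidataPersonMapping.map((e) => e.target.path));
// → ["birthDate", "name"]
```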

See also: Mapping Strategy, R2RML / RML.


3.2 Mapping Strategy

A named, reusable transformation function (concatenate, takeFirst, createEntity, dateStringToSpecialInt, etc.) referenced by id from a Declarative Mapping entry. Strategies receive the source value, the current target value, options, and a Strategy Context.


3.3 Strategy Context

The runtime environment passed to a mapping strategy: logger, IRI minter, authority access, secondary-IRI resolver, mapping table, and a createDeeperContext continuation for recursive mapping into nested entities.


3.4 Migration Lens vs. Cross-Source Query vs. Tool Projection

Three distinct mapping shapes that Graviola deliberately keeps separate:

  • Migration Lens: between two versions of the same conceptual schema; bidirectional in principle; used for Schema Drift.
  • Cross-Source Query: assembles a target document from one or more foreign sources (Wikidata, GND); typically forward-only; the existing Graviola Declarative Mapping is this.
  • Tool Projection: narrows a canonical entity to the fields a specific tool needs (e.g., a filelight visualization needs only {path, size, parent}); read-only; cheap.

Conflating these is a known failure mode of "universal data integration" projects.


3.5 Mediated Schema (LAV / GAV / GLAV)

The classical data-integration framings:

  • GAV (Global-as-View): the global schema is defined as views over local sources. Adding a new source requires updating the global schema.
  • LAV (Local-as-View): each local source is described as a view over the global schema. New sources join without touching the mediator. Best fit for Graviola.
  • GLAV: a hybrid.

3.6 R2RML / RML

W3C-standard declarative mapping languages from relational (R2RML) or heterogeneous (RML) sources to RDF. Graviola's declarative mapping is a JSON-LD cousin of these, optimized for JSON Linked Data rather than serialization-level transformation.

3.7 Ontology Alignment

The (largely separate) problem of relating concepts across vocabularies, e.g., declaring that schema:Person and foaf:Person refer to the same class. Often expressed via owl:sameAs, skos:exactMatch, skos:closeMatch. Distinct from Lens-based migration: alignment is about identity of concepts, lenses are about transformation of representations.

References:

  • Euzenat & Shvaiko, Ontology Matching (Springer, 2nd ed. 2013).

4. Calculated Fields & Reactivity

4.1 Calculated Field

A schema property whose value is derived from other fields by a declarative formula (HyperFormula-style or similar) rather than stored directly. Structurally equivalent to a one-directional lens (get only).

Formula inputs are addressed by Binding Paths, not by Scopes: a calc must read live values from the current entity (and its declared relations), so its references are ABox traversals shaped by the schema. Schema-node pointers would be a category error here — there is nothing to compute over until a path is resolved against actual data.

Example: Person.fullName calculated as CONCAT(forename, " ", surname); an aggregate calc on a Patch reads lane.owner.id via a binding path that crosses a $ref boundary.

See also: Binding Path, Stratification, Incremental View Maintenance.

References:

  • HyperFormula — the formula engine Graviola's calc layer is patterned after.

4.2 Dependency Graph

The DAG of which calculated fields read which other fields. Used to determine recomputation order and to detect cycles.


4.3 Stratification

The ordering of a Dependency Graph into layers such that each layer depends only on previous layers. Required for safe evaluation of recursive or aggregate calculations in Datalog-style systems.
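
Stratification is essentially layered topological sorting of the dependency graph. A sketch, assuming a simple field-to-dependencies map in which stored (non-calculated) fields are simply absent:

```typescript
type DepGraph = Record<string, string[]>; // calculated field → fields it reads

function stratify(deps: DepGraph): string[][] {
  const remaining = new Set(Object.keys(deps));
  const done = new Set<string>();
  const layers: string[][] = [];
  while (remaining.size > 0) {
    // A field is ready when everything it reads is stored or already computed.
    const layer = [...remaining].filter((f) =>
      deps[f].every((d) => done.has(d) || !(d in deps))
    );
    if (layer.length === 0) throw new Error("dependency cycle detected");
    for (const f of layer) {
      remaining.delete(f);
      done.add(f);
    }
    layers.push(layer.sort());
  }
  return layers;
}

const graph: DepGraph = {
  fullName: ["forename", "surname"], // reads stored fields only
  greeting: ["fullName"],            // reads one calculated field
  banner: ["greeting", "fullName"],
};

console.log(stratify(graph)); // → [["fullName"], ["greeting"], ["banner"]]
```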

4.4 Incremental View Maintenance (IVM)

The technique of updating a derived view in response to input changes by computing only the delta, rather than re-evaluating from scratch. The performance backbone of any nontrivial Calculated Field system at scale.
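
The idea can be shown with a trivial aggregate: a per-lane count of patches maintained from deltas instead of rescanning the store (all shapes invented for illustration):

```typescript
type Delta = { op: "add" | "remove"; lane: string };

// Maintain counts incrementally: O(1) work per delta, never a full rescan.
class LaneCountView {
  private counts = new Map<string, number>();

  apply(delta: Delta): void {
    const n = this.counts.get(delta.lane) ?? 0;
    this.counts.set(delta.lane, n + (delta.op === "add" ? 1 : -1));
  }

  get(lane: string): number {
    return this.counts.get(lane) ?? 0;
  }
}

const view = new LaneCountView();
const deltas: Delta[] = [
  { op: "add", lane: "review" },
  { op: "add", lane: "review" },
  { op: "add", lane: "done" },
  { op: "remove", lane: "review" },
];
deltas.forEach((d) => view.apply(d));

console.log(view.get("review"), view.get("done")); // → 1 1
```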

References:

  • Gupta & Mumick, "Maintenance of Materialized Views: Problems, Techniques, and Applications" (IEEE Data Eng. Bulletin, 1995).
  • Differential Dataflow (McSherry et al.).

4.5 Differential Dataflow

The modern, industrial-grade form of IVM: a dataflow framework that maintains the result of arbitrarily complex relational and iterative computations under input changes, with provable efficiency. Likely overkill for in-browser Graviola but the right reference point for server-side calc-heavy workloads.

4.6 Complexity Annotation

A declarative tag on a Calculated Field describing its computational cost class (e.g., O(1), O(n), O(n²)) and optionally its memory footprint. Used by the runtime to choose between eager and lazy evaluation strategies and to decide whether the calc is admissible in a given Capability Context.


4.7 Capability Context

The set of resources available to the current Graviola host: memory, persistence, network, server-presence, GPU, etc. Calculated fields and visualizations declare what they need; the runtime matches and either runs, degrades, or refuses.

Example: A calc that needs {memory: "high", server: true} is skipped on the encrypted-ship deployment and surfaced as "unavailable in this environment."
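
A sketch of the matching step, with an invented capability vocabulary (the real set of keys and the degradation policy are application concerns):

```typescript
type Capabilities = { memory: "low" | "high"; server: boolean; gpu: boolean };

// A calc is admissible when every declared need is met by the host.
function admissible(needs: Partial<Capabilities>, host: Capabilities): boolean {
  if (needs.memory === "high" && host.memory !== "high") return false;
  if (needs.server && !host.server) return false;
  if (needs.gpu && !host.gpu) return false;
  return true;
}

const encryptedShip: Capabilities = { memory: "low", server: false, gpu: false };
const workstation: Capabilities = { memory: "high", server: true, gpu: true };
const heavyCalc: Partial<Capabilities> = { memory: "high", server: true };

console.log(admissible(heavyCalc, encryptedShip)); // → false ("unavailable here")
console.log(admissible(heavyCalc, workstation));   // → true
```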


4.8 Calc-as-Pure-Derivation vs. Calc-as-Cached-Materialized-View

The unresolved design tension for federated calculated fields:

  • Pure derivation: each peer recomputes locally from synced inputs. Clean, always consistent, potentially expensive.
  • Cached materialized view: results are computed once (e.g., server-side) and synced; must invalidate correctly across version skew.

Genuinely an open problem when combined with Schema Drift across CRDT-synced peers. See Cross-version calc sync.


5. Authority, Trust, & Provenance

5.1 Authority

An external data source treated as a reference for entity identity and attributes (Wikidata, GND, DBpedia, VIAF). Graviola's Declarative Mapping layer was built primarily to ingest from authorities into the local model.


5.2 Authority Link

A link from a local entity to its corresponding entry in an Authority, typically expressed as owl:sameAs or via a domain-specific property. Enables later re-fetching, cross-referencing, and trust evaluation.

Example: A local Person with sameAs http://www.wikidata.org/entity/Q42 for Douglas Adams.


5.3 Primary IRI / Secondary IRI

Graviola's distinction between the local canonical IRI of an entity (primary) and any external Authority IRI it is linked to (secondary). The getPrimaryIRIBySecondaryIRI resolver in the Strategy Context mediates this.


5.4 Signed State

A cryptographically signed snapshot of an entity (or a subset of its fields, or a Lens document) attesting that a named party (historian, expert, institution) vouches for its correctness at a moment in time. Multiple signatures on the same data raise its Authoritative Value.

See also: Lens-as-Data.

5.5 Authoritative Value

A computed score for a piece of data based on the number, identity, and reputation of its Signed States (and possibly the trust graph among signers). Used in open historical databases to surface plausible vs. contested entries. Concrete formula is application-defined.


6. Architecture & Deployment

6.1 Local-First

The architectural stance, articulated by Ink & Switch, that user data lives primarily on user devices and remains available, editable, and useful without a central server. Graviola is local-first by default; servers, when present, are transports or accelerators, not authorities.

6.2 Browser/Server Symmetry

The Graviola constraint that core layers (lens engine, validator, IVM) run identically in browser and server environments. Drives the choice of pure JS / WASM implementations and forbids server-only dependencies in core packages.


6.3 Classical Migration

The traditional path of evolving a schema by writing imperative migration scripts run in staging and production, typically against a relational or document database via an ORM. Graviola supports this path explicitly: where an application has strong authorship over its data model and runs a centralized backend (e.g., Prisma on PostgreSQL or MongoDB), the Lens machinery is unnecessary and the application uses Prisma migrations directly. The lens engine is plugged into a concrete AbstractDatastore implementation only when Schema Drift across uncoordinated peers is actually a concern.

This dual path is deliberate. Lenses solve a real but specific problem (federated, uncoordinated, version-skewed peers); classical migration solves the common case (one team owns the database). Graviola treats both as first-class.


6.4 Spine vs. Tissue Packages

A monorepo discipline distinguishing spine packages (interfaces, contracts, types — versioned slowly, broadly depended on) from tissue packages (implementations — versioned freely, narrowly depended on). Reduces the sync burden of a 100-package monorepo.

Example: @graviola/mapping-contracts (spine) defines the DeclarativeMapping types; @graviola/mapping-strategies-cultural-heritage (tissue) implements specific strategies for that domain.


6.5 JSON Forms

The schema-driven form rendering library Graviola uses for UI generation. Embodies Structural Dispatch: a renderer registry resolves shape→component at runtime, decoupling UI from concrete entity types.

UI schema elements bind to schema nodes via a Scope (e.g., scope: "#/properties/name"). Scopes here address the schema, not the data — a property a Graviola application reuses when it adds annotations or rules at the same surface. Operations that read or write entity values (calculated fields, formula bindings, traversals across $ref) use Binding Paths instead.

See also: Scope, Binding Path.

6.6 AbstractDatastore

Graviola's interface contract for a concrete data backend. Implementations include in-memory stores, Yjs-backed stores, SPARQL endpoints, Prisma-backed relational stores, and others. The lens engine, IVM layer, and signing layer are opt-in features that a given AbstractDatastore implementation may or may not expose; consumers of an AbstractDatastore discover available capabilities through its declared interface.

See also: Capability Context, Classical Migration.


Appendix: Reading Order for Newcomers

For a developer new to Graviola who wants to understand the conceptual stack, roughly in this order:

  1. Local-first software (Kleppmann et al., 2019) — the why.
  2. Project Cambria — the closest existing system.
  3. Foster et al., Combinators for Bidirectional Tree Transformations — the lens foundations.
  4. Lenzerini, Data Integration: A Theoretical Perspective — the federation framing.
  5. Existing Graviola Declarative Mapping code and example mappings (Wikidata person, GND).
  6. JSON Forms documentation — for the UI dispatch pattern that mirrors the data layer.

Further reading

This chapter collects curated pointers for deepening beyond this book. It does not duplicate framework API documentation or Storybook; those remain the right places for component-level detail.


From the glossary

The glossary's Appendix: Reading Order for Newcomers lists a sensible sequence of external papers and tools. Start there if you want a single ordered list.

For unresolved design questions that are not tied to a single glossary entry, see Outlook and open questions. For how generative assistants can attach to schema-driven workflows, see Graviola in the age of generative tools.


Framework and examples

  • Graviola framework monorepo — packages, layers, and apps/testapp as the minimal CRUD walkthrough.
  • Storybook (in the monorepo) — interactive documentation for JSON Forms renderers, tables, and some conceptual demos (e.g. mapping); intended to be cleaned up over time, but already useful for experiencing behavior.

Concepts touched in this book

  • Local-first: Local-first software (Ink & Switch)
  • Schema evolution / lenses: Project Cambria, Pierce et al. on lenses (linked from Glossary)
  • Data integration: Lenzerini, Data Integration: A Theoretical Perspective (linked from Glossary)
  • UI structural dispatch: JSON Forms
  • Generative tooling + schema-first UI: Graviola in the age of generative tools, assisted-forms-designer

See also