ICT:Data Model and Media Architecture

From Costa Sano MediaWiki
Revision as of 16:45, 17 January 2026 by Mngr
Data Modelling & Media Management Architecture

This page documents the architectural decisions and working principles used for structuring data, relationships, and digital media in this MediaWiki installation.

It reflects conclusions reached during the design phase and serves as a reference for future development and maintenance.

Purpose

The goals of this architecture are to:

  • Support complex research data with many relationships
  • Separate internal research work from public presentation
  • Manage digital media (files, derivatives, metadata) in a controlled and scalable way
  • Avoid schema drift and ad-hoc solutions
  • Remain compatible with MediaWiki core and extensions

Core Technologies

The system is based on the following components:

  • Cargo – authoritative data storage (database schema)
  • Page Schemas – page structure and data-entry UI
  • Forms – controlled data input
  • Namespaces – separation of concerns and access control
  • File namespace – physical file storage (MediaWiki core)

Each component has a clearly defined responsibility.

Authority and Responsibility

Component           Responsibility
Cargo tables        Define and own the database schema
Page Schemas        Define page types and map fields to Cargo
Forms               Control how editors enter data
Templates           Store data via Cargo
Database (MySQL)    Implementation detail only

Important: Database tables must not be created or modified directly in MySQL. The Cargo table pages are the single source of truth for schema definition.
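
As an illustration of this division of responsibility, the following is a minimal sketch of a template that both declares a Cargo table and stores data into it. It assumes the declaration lives on a template page, which is the usual Cargo pattern. The template name, table name, and field names are hypothetical and are not taken from this installation's data model.

```wikitext
<!-- Hypothetical page: Template:HeritageObject (table and field names are illustrative) -->
<noinclude>
{{#cargo_declare:_table=HeritageObjects
|Title=String
|Description=Text
|DateCreated=Date
}}
</noinclude><includeonly>
{{#cargo_store:_table=HeritageObjects
|Title={{{Title|}}}
|Description={{{Description|}}}
|DateCreated={{{DateCreated|}}}
}}
</includeonly>
```

In this sketch the schema is expressed only in the `#cargo_declare` call, so a schema change is made by editing the wiki page and never by editing MySQL directly.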

Workflow Principle

The general workflow is:

  1. Design the model (DBML / diagrams)
  2. Create or update Cargo tables
  3. Activate Cargo tables by saving the Cargo page
  4. Create Page Schemas using the UI
  5. Generate forms
  6. Enter and test real data
  7. Iterate carefully

Namespaces and Their Roles

Namespaces are used to separate concerns and control access.

Namespace    Purpose                                         Visibility
(Main)       Public research results and narratives          Public
HO:          Structured Heritage Objects                     Club members
DA:          Digital Asset metadata and relationships        Club members
File:        Physical file storage                           Uploads restricted
ICT:         Technical and architectural documentation       Club members

Namespaces must not be redefined. The built-in File: namespace remains unchanged.

File Management Strategy

MediaWiki’s file model is page-centric: each uploaded file is tied to a single File: page, which is limiting for complex media workflows. To address this, a clear separation is enforced:

  • File: pages store physical files only
  • DA: pages store semantic metadata about digital assets

Digital Assets (DA) act as the central abstraction layer.

Internal vs Public Files

Files fall into two categories:

  • Internal files:
    • High-resolution masters
    • OCR outputs
    • Working derivatives
  • Public files:
    • Curated, approved derivatives
    • Downscaled or watermarked versions

Internal and public files are managed by convention, not by redefining namespaces.

Upload Permissions

  • Uploading files is restricted to club members
  • Viewing files remains public (to support public pages)
  • Editors must ensure that only approved files are embedded on public pages

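Public pages must never depend on internal-only files. Because file viewing remains public, this rule is upheld by editorial practice rather than by access control.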

Digital Assets (DA)

A Digital Asset represents a conceptual media object and may reference:

  • One or more File: pages
  • A parent Digital Asset (for derivatives)
  • One or more Heritage Objects

DA pages are internal and never directly exposed to the public.
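
The three kinds of reference listed above could be modelled in Cargo roughly as follows. This is a sketch under stated assumptions, not the installation's actual schema: the template name `DigitalAsset`, the table name `DigitalAssets`, all field names, and the example page and file names are hypothetical. It also assumes that values of Cargo's `File` type are given without the `File:` prefix.

```wikitext
<!-- Hypothetical page: Template:DigitalAsset (all table and field names are illustrative) -->
<noinclude>
{{#cargo_declare:_table=DigitalAssets
|Title=String
|Files=List (;) of File
|ParentAsset=Page
|HeritageObjects=List (;) of Page
}}
</noinclude><includeonly>
{{#cargo_store:_table=DigitalAssets
|Title={{{Title|}}}
|Files={{{Files|}}}
|ParentAsset={{{ParentAsset|}}}
|HeritageObjects={{{HeritageObjects|}}}
}}
</includeonly>

<!-- Hypothetical call on a page such as DA:Harbour map (web derivative) -->
{{DigitalAsset
|Title=Harbour map (web derivative)
|Files=Harbour map web.jpg
|ParentAsset=DA:Harbour map (master scan)
|HeritageObjects=HO:Harbour map
}}
```

Leaving `ParentAsset` empty would mark an asset as a master rather than a derivative in this sketch.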

Handling Many-to-Many Relationships

Relational join tables from the original data model are implemented as:

  • Cargo subtables
  • Repeatable sections in Page Schemas

Example use cases:

  • Object–Person relationships
  • Object–Set memberships
  • Provenance chains

This avoids creating a separate wiki page for every relationship row (page explosion) and keeps each relationship on the page it belongs to.
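
One possible reading of this approach, for the Object–Person case, is a small template that is called once per relationship on a Heritage Object page and stores each call as a row in its own Cargo table. The template name `ObjectPerson`, the table name `ObjectPersons`, the field names, and the person name in the query are all hypothetical. The sketch relies on Cargo recording the calling page in the built-in `_pageName` field, which stands in for the object side of the join table.

```wikitext
<!-- Hypothetical page: Template:ObjectPerson, called once per relationship on an HO: page -->
<noinclude>
{{#cargo_declare:_table=ObjectPersons
|Person=Page
|Role=String
}}
</noinclude><includeonly>
{{#cargo_store:_table=ObjectPersons
|Person={{{Person|}}}
|Role={{{Role|}}}
}}
</includeonly>

<!-- Hypothetical query: every object linked to one person, with the role played -->
{{#cargo_query:tables=ObjectPersons
|fields=_pageName=Object,Person,Role
|where=Person="Jane Example"
}}
```

A repeatable section in Page Schemas would then correspond to multiple calls of this template within one form.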

Page Schemas vs Cargo Tables

Cargo tables may be created in two ways:

  • Manually (Cargo-first)
  • Via Page Schemas UI (UI-first)

In both cases:

  • The Cargo table page must be saved manually to activate the database table
  • Page Schemas do not execute database changes automatically

For complex, stable models, a Cargo-first approach is preferred.

Design Principles

The following principles guide all future development:

  • Cargo owns structure, Page Schemas own usability
  • Internal complexity is allowed; public simplicity is required
  • Publication is a controlled act, not a permission toggle
  • Add fields freely; rename or remove fields carefully
  • Prefer conventions over hacks
  • Avoid direct database manipulation

Status

This architecture is considered the current baseline.

Changes must be documented and reviewed before implementation.