ICT:Data Model and Media Architecture
Data Modelling & Media Management Architecture
This page documents the architectural decisions and working principles used for structuring data, relationships, and digital media in this MediaWiki installation.
It reflects conclusions reached during the design phase and serves as a reference for future development and maintenance.
Purpose
The goal of this architecture is to:
- Support complex research data with many relationships
- Separate internal research work from public presentation
- Manage digital media (files, derivatives, metadata) in a controlled and scalable way
- Avoid schema drift and ad-hoc solutions
- Remain compatible with MediaWiki core and extensions
Core Technologies
The system is based on the following components:
- Cargo – authoritative data storage (database schema)
- Page Schemas – page structure and data-entry UI
- Forms – controlled data input
- Namespaces – separation of concerns and access control
- File namespace – physical file storage (MediaWiki core)
Each component has a clearly defined responsibility.
Authority and Responsibility
| Component | Responsibility |
|---|---|
| Cargo tables | Define and own the database schema |
| Page Schemas | Define page types and map fields to Cargo |
| Forms | Control how editors enter data |
| Templates | Store data via Cargo |
| Database (MySQL) | Implementation detail only |
Important: Database tables must not be created or modified directly in MySQL. The Cargo table pages are the single source of truth for schema definition.
Workflow Principle
The general workflow is:
1. Design the model (DBML / diagrams)
2. Create or update Cargo tables
3. Activate Cargo tables by saving the Cargo page
4. Create Page Schemas using the UI
5. Generate forms
6. Enter and test real data
7. Iterate carefully
Namespaces and Their Roles
Namespaces are used to separate concerns and control access.
| Namespace | Purpose | Visibility |
|---|---|---|
| (Main) | Public research results and narratives | Public |
| HO: | Structured Heritage Objects | Club members |
| DA: | Digital Asset metadata and relationships | Club members |
| File: | Physical file storage | Public viewing; uploads restricted to club members |
| ICT: | Technical and architectural documentation | Club members |
Namespaces must not be redefined. The built-in File: namespace remains unchanged.
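The namespace roles can also be written down as a small lookup table, for example in a maintenance script that needs to know who may see a given page. The following is a minimal sketch, not part of the installation: `NamespaceRole`, `NAMESPACE_ROLES`, `namespace_of`, and `visibility_of` are hypothetical names, the visibility values are summarised from the table above, and the title parsing is deliberately simplistic (it ignores MediaWiki's case and alias rules).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceRole:
    purpose: str
    visibility: str


# Summarised from the namespace table; "" stands for the (Main) namespace.
NAMESPACE_ROLES = {
    "": NamespaceRole("Public research results and narratives", "Public"),
    "HO": NamespaceRole("Structured Heritage Objects", "Club members"),
    "DA": NamespaceRole("Digital Asset metadata and relationships", "Club members"),
    "File": NamespaceRole("Physical file storage", "Public viewing, uploads restricted"),
    "ICT": NamespaceRole("Technical and architectural documentation", "Club members"),
}


def namespace_of(title: str) -> str:
    """Return the known namespace prefix of a page title, or "" for (Main)."""
    prefix, sep, _rest = title.partition(":")
    if sep and prefix in NAMESPACE_ROLES:
        return prefix
    # A colon inside an ordinary title does not create a namespace.
    return ""


def visibility_of(title: str) -> str:
    """Look up the documented visibility for the namespace of a page title."""
    return NAMESPACE_ROLES[namespace_of(title)].visibility
```

With these definitions, `visibility_of("HO:Vase 12")` gives `"Club members"`, while a main-namespace title such as `"History: an overview"` falls back to `"Public"`.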
File Management Strategy
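MediaWiki’s file model is page-centric: each uploaded file is tied to a single File: page, which is limiting for workflows that involve masters, derivatives, and shared metadata. To address this, a clear separation is enforced: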
- File: pages store physical files only
- DA: pages store semantic metadata about digital assets
Digital Assets (DA) act as the central abstraction layer.
Internal vs Public Files
Files fall into two classes, which are handled differently:
- Internal files:
  - High-resolution masters
  - OCR outputs
  - Working derivatives
- Public files:
  - Curated, approved derivatives
  - Downscaled or watermarked versions
Internal and public files are managed by convention, not by redefining namespaces.
Upload Permissions
- Uploading files is restricted to club members
- Viewing files remains public (to support public pages)
- Editors must ensure that only approved files are embedded on public pages
Public pages must never depend on internal-only files.
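Because the rule above relies on editors rather than on permissions, a simple check can help catch mistakes. The sketch below scans the wikitext of a public page for embedded files and reports any that are not in an approved set. It is an illustration under stated assumptions: the function names, the idea of an explicit `approved` set, and the example file names are all hypothetical, and the pattern only covers plain `[[File:...]]` embeds (not galleries or files pulled in through templates).

```python
import re

# Matches [[File:Name.ext]] and [[File:Name.ext|options]].
# "Image:" is accepted as well because MediaWiki treats it as an alias of "File:".
# Links of the form [[:File:Name.ext]] are not embeds and are not matched.
FILE_EMBED = re.compile(r"\[\[\s*(?:File|Image)\s*:\s*([^\]|]+)", re.IGNORECASE)


def embedded_files(wikitext: str) -> list[str]:
    """Return the file names embedded in the given wikitext, in order."""
    # Underscores and spaces are interchangeable in MediaWiki titles.
    return [name.strip().replace("_", " ") for name in FILE_EMBED.findall(wikitext)]


def unapproved_embeds(wikitext: str, approved: set[str]) -> list[str]:
    """Return embedded files that are not in the approved (public) set."""
    return [name for name in embedded_files(wikitext) if name not in approved]


# Hypothetical example page and approval list.
page = "Intro [[File:Vase_12_web.jpg|thumb|Vase]] and [[File:Vase 12 master.tif]]"
approved = {"Vase 12 web.jpg"}
print(unapproved_embeds(page, approved))  # prints ['Vase 12 master.tif']
```

How the approved set is maintained (a category, a naming convention, or a Cargo field on the DA page) is left open here; the source only states that the distinction is managed by convention.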
Digital Assets (DA)
A Digital Asset represents a conceptual media object and may reference:
- One or more File: pages
- A parent Digital Asset (for derivatives)
- One or more Heritage Objects
DA pages are internal and never directly exposed to the public.
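The three kinds of reference listed above can be pictured as a small data structure. The following sketch models a Digital Asset and walks the parent chain from a derivative back to its master. The class, its field names, and the `master_of` helper are hypothetical and only mirror the relationships described here; they do not reflect the actual Cargo field names.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DigitalAsset:
    title: str                                        # DA page title
    files: list[str] = field(default_factory=list)    # referenced File: pages
    parent: DigitalAsset | None = None                # set for derivatives only
    heritage_objects: list[str] = field(default_factory=list)  # referenced HO: pages


def master_of(asset: DigitalAsset) -> DigitalAsset:
    """Follow parent links until an asset without a parent is reached."""
    seen: set[int] = set()
    current = asset
    while current.parent is not None:
        if id(current) in seen:
            # Guard against accidental cycles in the parent links.
            raise ValueError(f"cycle in parent chain at {current.title}")
        seen.add(id(current))
        current = current.parent
    return current


# Hypothetical example: a web derivative of a master scan.
master = DigitalAsset("DA:Vase 12 scan", files=["File:Vase 12 master.tif"],
                      heritage_objects=["HO:Vase 12"])
web = DigitalAsset("DA:Vase 12 scan (web)", files=["File:Vase 12 web.jpg"],
                   parent=master)
print(master_of(web).title)  # prints DA:Vase 12 scan
```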
Handling Many-to-Many Relationships
Relational join tables from the original data model are implemented as:
- Cargo subtables
- Repeatable sections in Page Schemas
Example use cases:
- Object–Person relationships
- Object–Set memberships
- Provenance chains
This avoids page explosion and keeps relationships contextual.
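To make the idea concrete, the sketch below shows an Object–Person join table in plain relational form and how its rows group under one object, which is roughly what a repeatable section on a single HO: page captures. It uses an in-memory SQLite database purely for illustration; the table name, column names, and sample rows are hypothetical, and this is not Cargo's actual storage layout. On the live wiki, such tables are defined through Cargo and never created directly in the database.

```python
import sqlite3

# In-memory database for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object_person (
    object_page TEXT NOT NULL,  -- the HO: page that holds the repeatable section
    person      TEXT NOT NULL,
    role        TEXT NOT NULL
);
INSERT INTO object_person VALUES
    ('HO:Vase 12', 'A. Maker', 'creator'),
    ('HO:Vase 12', 'B. Owner', 'former owner'),
    ('HO:Bowl 3',  'A. Maker', 'creator');
""")


def people_for(conn: sqlite3.Connection, object_page: str) -> list[tuple[str, str]]:
    """Return (person, role) pairs attached to one object page."""
    rows = conn.execute(
        "SELECT person, role FROM object_person "
        "WHERE object_page = ? ORDER BY person",
        (object_page,),
    )
    return rows.fetchall()


print(people_for(conn, "HO:Vase 12"))
# prints [('A. Maker', 'creator'), ('B. Owner', 'former owner')]
```

Each relationship row stays attached to the object it describes, so no separate wiki page is needed per relationship.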
Page Schemas vs Cargo Tables
Cargo tables may be created in two ways:
- Manually (Cargo-first)
- Via Page Schemas UI (UI-first)
In both cases:
- The Cargo table page must be saved manually to activate the database table
- Page Schemas do not execute database changes automatically
For complex, stable models, a Cargo-first approach is preferred.
Design Principles
The following principles guide all future development:
- Cargo owns structure, Page Schemas own usability
- Internal complexity is allowed; public simplicity is required
- Publication is a controlled act, not a permission toggle
- Add fields freely; rename or remove fields carefully
- Prefer conventions over hacks
- Avoid direct database manipulation
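The principle "add fields freely; rename or remove fields carefully" can be turned into a simple review aid. The sketch below compares two versions of a table's field list and flags changes that deserve review. It is a hypothetical helper, not an existing tool: the function names and example fields are assumptions, and a field list is represented simply as a mapping from field name to type name.

```python
def classify_schema_change(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two field-name -> type mappings and group the differences."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    retyped = sorted(name for name in set(old) & set(new) if old[name] != new[name])
    return {"added": added, "removed": removed, "retyped": retyped}


def needs_review(change: dict[str, list[str]]) -> bool:
    """Additions are considered safe; removals and type changes are not."""
    return bool(change["removed"] or change["retyped"])


# Hypothetical example: one field added, one field renamed.
old_fields = {"title": "String", "maker": "Page", "made_on": "Date"}
new_fields = {"title": "String", "creator": "Page", "made_on": "Date", "notes": "Text"}

change = classify_schema_change(old_fields, new_fields)
print(change)
# prints {'added': ['creator', 'notes'], 'removed': ['maker'], 'retyped': []}
print(needs_review(change))  # prints True
```

Note that a rename appears as one removed field plus one added field, so it is flagged for review like any other removal.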
Status
This architecture is considered the current baseline.
Changes must be documented and reviewed before implementation.