Multi-AI Offline/Online Setup

This runbook sets up a local-first content pipeline so multiple AI agents can coordinate around the same source of truth while offline, then mirror that content online when network services are available.

Goals

  • Keep coordination-critical content available locally at all times.
  • Produce deterministic offline bundles for handoff/replay.
  • Optionally mirror bundle artifacts online to Hugging Face datasets.

Components Already in This Repo

  • Offline governance baseline: docs/OFFLINE_MODE_SPEC.md
  • Session continuity store: docs/map-room/session_handoff_latest.md
  • Notion sync source map: scripts/sync-config.json
  • Doc ingest pipeline: scripts/ingest_docs_to_training_jsonl.py
  • Manifest verifier: training/doc_verifier.py
  • HF uploader: scripts/push_to_hf.py
  • One-command orchestrator: scripts/run_multi_ai_content_sync.py
  • Cloud verification and shipping pipeline: scripts/cloud_kernel_data_pipeline.py

Prerequisites

  • Python 3.10+
  • Node.js (for optional Notion sync)
  • Optional env for online mirror: HF_TOKEN
  • Optional env for Notion sync: NOTION_API_KEY (a preflight sketch follows this list)
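
Before the first run, a quick preflight can confirm the interpreter version and which optional pieces are available. A minimal sketch; it checks only the items named in the list above:

import os
import shutil
import sys

# Offline-only runs need none of the optional env vars.
print("python:", f"{sys.version_info.major}.{sys.version_info.minor}",
      "(need 3.10+)" if sys.version_info < (3, 10) else "OK")
print("node:", shutil.which("node") or "not found (Notion sync unavailable)")
for var in ("HF_TOKEN", "NOTION_API_KEY"):
    print(f"{var}:", "set" if os.environ.get(var) else "not set (optional)")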

Offline-Only Setup

Run from repo root:

python scripts/run_multi_ai_content_sync.py

PowerShell wrapper:

.\scripts\run_multi_ai_content_sync.ps1

Offline + Notion Refresh

Set the Notion key (PowerShell syntax shown), then run:

$env:NOTION_API_KEY = "your_notion_key"
python scripts/run_multi_ai_content_sync.py --sync-notion

Sync only specific mapped pages:

python scripts/run_multi_ai_content_sync.py --sync-notion --notion-config-key multiAiDevelopmentCoordination --notion-config-key hydraMultiAgentCoordinationSystem
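
To discover which --notion-config-key values exist, you can list the keys in scripts/sync-config.json. A minimal sketch, assuming the mapped page entries sit at the top level of that file (the authoritative schema is whatever the sync script reads):

import json
from pathlib import Path

# Print candidate values for --notion-config-key.
# Assumes mapped pages are top-level entries in sync-config.json;
# check the file itself if the layout differs.
config = json.loads(Path("scripts/sync-config.json").read_text(encoding="utf-8"))
for key in config:
    print(key)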

Offline + Online Mirror (Hugging Face)

Set the Hugging Face token (PowerShell syntax shown), then run:

$env:HF_TOKEN = "your_hf_token"
python scripts/run_multi_ai_content_sync.py --sync-notion --hf-dataset-repo issdandavis/scbe-multi-ai-corpus
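
The orchestrator handles the upload itself, but if you ever need to mirror a single artifact by hand, a minimal sketch with huggingface_hub (the repo id matches the command above; the local path is illustrative):

import os
from huggingface_hub import HfApi

# Upload one bundle artifact to the dataset repo.
# The run directory below is illustrative; point it at a real run.
api = HfApi(token=os.environ["HF_TOKEN"])
api.upload_file(
    path_or_fileobj="training/runs/multi_ai_sync/20260101T000000/doc_manifest.json",
    path_in_repo="doc_manifest.json",
    repo_id="issdandavis/scbe-multi-ai-corpus",
    repo_type="dataset",
)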

Outputs

Each run creates a timestamped bundle:

  • training/runs/multi_ai_sync/<timestamp>/offline_docs.jsonl
  • training/runs/multi_ai_sync/<timestamp>/doc_manifest.json
  • training/runs/multi_ai_sync/<timestamp>/run_summary.json
  • training/runs/multi_ai_sync/<timestamp>.zip
  • Pointer to latest run: training/ingest/latest_multi_ai_sync.txt (resolved in the sketch below)
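
Consumers should resolve the latest bundle through the pointer file rather than guessing timestamps. A minimal sketch, assuming the pointer file contains the run path as plain text:

import json
from pathlib import Path

# Resolve the most recent bundle via the pointer file.
# Assumes the pointer holds a single path to the run directory.
pointer = Path("training/ingest/latest_multi_ai_sync.txt")
run_dir = Path(pointer.read_text(encoding="utf-8").strip())

summary = json.loads((run_dir / "run_summary.json").read_text(encoding="utf-8"))
print("latest run:", run_dir.name)
print("summary keys:", sorted(summary))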

How Agents Use It

  • Offline lane uses filesystem MCP to read bundle files (see the verification sketch after this list).
  • Offline lane uses scbe_map_room_read_latest and scbe_map_room_write_latest for continuity.
  • Offline lane uses scbe_decide_offline for deterministic policy gating.
  • Online lane mirrors JSON/manifest artifacts to an HF dataset.
  • Online lane keeps GitHub as the canonical versioned source.
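
Before an agent consumes a bundle, it can confirm that offline_docs.jsonl still matches doc_manifest.json. The repo's own checker is training/doc_verifier.py; the stand-alone sketch below is only illustrative, and assumes the manifest maps file names to SHA-256 hex digests (the "files" and "sha256" field names are hypothetical, not the verifier's actual schema):

import hashlib
import json
from pathlib import Path

# Compare the JSONL's SHA-256 against the manifest entry.
# "files" and "sha256" are hypothetical field names; prefer
# training/doc_verifier.py for the real schema and checks.
pointer = Path("training/ingest/latest_multi_ai_sync.txt")
run_dir = Path(pointer.read_text(encoding="utf-8").strip())
manifest = json.loads((run_dir / "doc_manifest.json").read_text(encoding="utf-8"))
digest = hashlib.sha256((run_dir / "offline_docs.jsonl").read_bytes()).hexdigest()
entry = manifest.get("files", {}).get("offline_docs.jsonl", {})
expected = entry.get("sha256") if isinstance(entry, dict) else entry
print("match" if digest == expected else f"mismatch: {digest} != {expected}")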

Suggested Cadence

  • During active development: run at the end of every major session.
  • For shared teams: run on a scheduler (for example every 4-6 hours) with --sync-notion and an optional HF mirror.

Cloud-First Variant

If you want cloud storage as the primary dataset source for model training, use:

  • Runbook: docs/CLOUD_KERNEL_DATA_PIPELINE.md
  • Script: scripts/cloud_kernel_data_pipeline.py
  • Workflow: .github/workflows/cloud-kernel-data-pipeline.yml
