Download the Dataset
The full 4ORBS knowledge base as structured JSON: every theory, claim, evidence artifact, and reference extracted from 649 analyzed videos. Free for research and non-commercial use.
Dataset Overview
4ORBS maintains a structured knowledge base extracted from YouTube video analyses by Ashton Forbes. Each video is processed through an AI analysis pipeline that extracts theories, claims, evidence, scientific references, key figures, and technical terminology. The resulting dataset is the largest structured collection of MH370 investigation research available.
Provenance
All data is derived from publicly available YouTube content. AI-extracted claims carry confidence ratings; no editorial claims of absolute truth are made.
Freshness
Data exports are regenerated with every site build. New videos are ingested regularly through the automated pipeline. Last generated: April 13, 2026.
Methodology
See the Methodology page for details on how content is extracted, classified, and quality-scored.
Collections
Each collection is a JSON array of objects. Files are pretty-printed for readability and served with gzip compression.
Theories
Structured hypotheses extracted from video analyses: name, description, supporting points, cluster, tags, quality tier, and investigation status.
Claims
Specific factual assertions with cited evidence and confidence ratings (definitive, strong, speculative). The atomic building blocks of theories.
Videos
YouTube metadata for all analyzed videos: title, duration, view count, publish date, summary, and embedded claims, theories, and people arrays.
People
Key figures mentioned across the archive: scientists, whistleblowers, officials, and researchers with their roles and appearance contexts.
Evidence
Documents, footage, testimony, patents, and data artifacts cited as proof: typed, described, and linked to source videos.
References
Scientific topics, papers, and research programs categorized by field and mainstream acceptance status (mainstream, emerging, speculative, classified).
Glossary
Technical terms and jargon defined in plain language, tagged by field (physics, aviation, intelligence, etc.).
Clusters
The 5 topic clusters that organize theories: MH370, Energy & Physics, Disclosure, Consciousness, and Technology.
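A minimal Python sketch of loading any collection by name. The `<name>.json` URL pattern is taken from the usage examples on this page, which show it for theories, claims, videos, and people; assuming the same pattern holds for evidence, references, glossary, and clusters is a guess. The helper names (`collection_url`, `decode_collection`, `load_collection`) are hypothetical, and the gzip check is only a defensive measure in case a compressed body arrives undecoded.

```python
import gzip
import json
import urllib.request

# Base path as it appears in the usage examples on this page.
BASE_URL = "https://4orbs.com/data"


def collection_url(name: str) -> str:
    """Build a collection URL, assuming the <name>.json pattern."""
    return f"{BASE_URL}/{name}.json"


def decode_collection(body: bytes) -> list:
    """Parse a response body into a list, un-gzipping it first if needed."""
    if body[:2] == b"\x1f\x8b":  # gzip magic number
        body = gzip.decompress(body)
    data = json.loads(body)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of objects")
    return data


def load_collection(name: str) -> list:
    """Download one collection and return it as a list of dicts."""
    with urllib.request.urlopen(collection_url(name)) as resp:
        return decode_collection(resp.read())
```

Calling `load_collection("theories")` would then return the theories array as a Python list; the type check fails early if a file turns out not to be a JSON array.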
Schema Reference
Field documentation for each collection. Fields marked with ? are optional and may not be present on all items.
Theories
| Field | Type | Description |
|---|---|---|
| slug | string | URL-safe identifier |
| name | string | Theory name/title |
| description | string | Full description of the theory |
| supportingPoints | string[] | List of supporting arguments |
| relatedTheories | string[] | Slugs of related theories |
| cluster | string | Topic cluster (e.g. "MH370 Investigation") |
| tags | string[] | Classification tags |
| sourceVideo | object \| null | Source video {title, url, videoId} |
| qualityTier | string? | One of: featured, enhanced, stub |
| investigation_status | string? | One of: active, strong_evidence, circumstantial, speculative, debated |
| enhanced_description | string? | Editorial-enhanced description |
| key_insight | string? | Core insight summary |
| critical_context | string? | Important contextual information |
| editorial_tags | string[]? | Editorial classification tags |
| connections_narrative | string? | How this theory connects to others |
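Because the `?` fields may be missing, code that reads theories is safer using `.get()` for them and direct indexing only for required fields. The sketch below illustrates that pattern; `TheorySummary`, `summarize_theory`, and the sample record are hypothetical, and falling back from `enhanced_description` to `description` is one possible choice rather than something the dataset prescribes.

```python
from typing import Optional, TypedDict


class TheorySummary(TypedDict):
    slug: str
    name: str
    tier: Optional[str]
    status: Optional[str]
    text: str


def summarize_theory(theory: dict) -> TheorySummary:
    """Read required fields directly and optional ones with .get()."""
    return {
        "slug": theory["slug"],
        "name": theory["name"],
        "tier": theory.get("qualityTier"),
        "status": theory.get("investigation_status"),
        # Prefer the editorial description when present and non-empty.
        "text": theory.get("enhanced_description") or theory["description"],
    }


# Made-up record for illustration; not taken from the dataset.
sample_theory = {
    "slug": "example-theory",
    "name": "Example theory",
    "description": "Base description",
}
summary = summarize_theory(sample_theory)
print(summary["tier"], summary["text"])  # None Base description
```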
Claims
| Field | Type | Description |
|---|---|---|
| slug | string | URL-safe identifier |
| claim | string | The assertion being made |
| evidenceCited | string | Evidence cited in support |
| confidence | string | One of: definitive, strong, speculative |
| category | string | Normalized category |
| sourceVideo | object \| null | Source video {title, url, videoId} |
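One common use of the `confidence` field is keeping only claims at or above a threshold. The sketch below assumes the ordering speculative < strong < definitive; the schema lists the three values but does not rank them, so treat that ranking as an assumption. The names `CONFIDENCE_RANK` and `claims_at_least` and the sample claims are hypothetical.

```python
# Assumed ordering, weakest to strongest; the schema does not rank the values.
CONFIDENCE_RANK = {"speculative": 0, "strong": 1, "definitive": 2}


def claims_at_least(claims: list, minimum: str) -> list:
    """Return claims whose confidence is at or above `minimum`."""
    threshold = CONFIDENCE_RANK[minimum]
    return [
        c for c in claims
        # Unknown or missing confidence values rank below everything.
        if CONFIDENCE_RANK.get(c.get("confidence"), -1) >= threshold
    ]


# Made-up records for illustration; not taken from the dataset.
sample_claims = [
    {"slug": "c1", "confidence": "definitive"},
    {"slug": "c2", "confidence": "speculative"},
    {"slug": "c3", "confidence": "strong"},
]
print([c["slug"] for c in claims_at_least(sample_claims, "strong")])  # ['c1', 'c3']
```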
Videos
| Field | Type | Description |
|---|---|---|
| videoId | string | YouTube video ID |
| title | string | Video title |
| url | string | Full YouTube URL |
| duration | number | Duration in seconds |
| durationString | string | Human-readable duration (e.g. "1:23:45") |
| viewCount | number | View count at time of analysis |
| description | string | YouTube video description |
| thumbnailUrl | string | Thumbnail image URL |
| hasAnalysis | boolean | Whether AI analysis was performed |
| hasTranscript | boolean | Whether transcript is available |
| summary | string | AI-generated content summary |
| publishDate | string? | ISO date string |
| tags | string[] | Content tags |
| claims | object[] | Embedded claims extracted from this video |
| theories | object[] | Theory references found in this video |
| people | object[] | People mentioned in this video |
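Since each video embeds its own `claims` array, the nested structure can be flattened into (videoId, claim) pairs for analysis. This is a sketch only: the shape of the embedded claim objects is not documented in the table above, so they are passed through as opaque dicts, and `embedded_claims` and the sample data are hypothetical.

```python
from typing import Iterator, Tuple


def embedded_claims(videos: list) -> Iterator[Tuple[str, dict]]:
    """Yield (videoId, claim) for every claim embedded in every video."""
    for video in videos:
        for claim in video.get("claims", []):
            yield video["videoId"], claim


# Made-up records for illustration; not taken from the dataset.
sample_videos = [
    {"videoId": "vid1", "claims": [{"claim": "A"}, {"claim": "B"}]},
    {"videoId": "vid2", "claims": []},
    {"videoId": "vid3", "claims": [{"claim": "C"}]},
]
pairs = list(embedded_claims(sample_videos))
print(len(pairs))  # 3
```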
People
| Field | Type | Description |
|---|---|---|
| slug | string | URL-safe identifier |
| name | string | Person's name |
| appearances | object[] | List of appearances with role, context, and sourceVideo |
Evidence
| Field | Type | Description |
|---|---|---|
| slug | string | URL-safe identifier |
| type | string | Normalized evidence type (document, video, testimony, patent, etc.) |
| description | string | What this evidence is |
| source | string | Where this evidence comes from |
| significance | string | Why this evidence matters |
| sourceVideo | object \| null | Source video {title, url, videoId} |
| confidence | string? | One of: definitive, strong, speculative |
| counterargument | string? | Known counter-arguments |
| counterargumentSource | string? | Source of counter-arguments |
| enhancedDescription | string? | Editorial-enhanced description |
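The `sourceVideo` field can be null here, as in the theories, claims, and references collections, so grouping items by video needs an explicit null check. A minimal sketch follows, with the hypothetical helper `group_by_source_video` and made-up sample records; items without a source video are collected under the key `None`.

```python
from collections import defaultdict


def group_by_source_video(items: list) -> dict:
    """Map videoId -> items; items with a null sourceVideo go under None."""
    groups = defaultdict(list)
    for item in items:
        source = item.get("sourceVideo")  # documented as object or null
        video_id = source.get("videoId") if source else None
        groups[video_id].append(item)
    return dict(groups)


# Made-up records for illustration; not taken from the dataset.
sample_evidence = [
    {"slug": "e1", "sourceVideo": {"title": "T", "url": "U", "videoId": "abc"}},
    {"slug": "e2", "sourceVideo": None},
    {"slug": "e3", "sourceVideo": {"title": "T", "url": "U", "videoId": "abc"}},
]
groups = group_by_source_video(sample_evidence)
print({k: [i["slug"] for i in v] for k, v in groups.items()})
# {'abc': ['e1', 'e3'], None: ['e2']}
```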
References
| Field | Type | Description |
|---|---|---|
| slug | string | URL-safe identifier |
| topic | string | Research topic name |
| description | string | What this reference covers |
| field | string | Scientific field (physics, engineering, etc.) |
| mainstreamStatus | string | One of: mainstream, emerging, speculative, classified |
| scientistsOrPapers | string | Key scientists or papers |
| sourceVideo | object \| null | Source video {title, url, videoId} |
Glossary
| Field | Type | Description |
|---|---|---|
| slug | string | URL-safe identifier |
| term | string | The technical term |
| definition | string | Plain-language definition |
| field | string | Subject field (physics, aviation, intelligence, etc.) |
Clusters
| Field | Type | Description |
|---|---|---|
| id | string | Cluster identifier |
| name | string | Cluster name |
| description | string | What this cluster covers |
| color | string | Hex color code for visualization |
| theoryCount | number | Number of theories in this cluster |
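As a sanity check, `theoryCount` can be compared against the theories collection. The sketch below assumes that a theory's `cluster` string equals the cluster's `name`; this page does not confirm that the two match exactly, so a reported mismatch may simply reflect a naming difference. `count_mismatches` and the sample data are hypothetical.

```python
from collections import Counter


def count_mismatches(clusters: list, theories: list) -> dict:
    """Return {cluster name: (declared, observed)} where the counts differ.

    Assumes theory["cluster"] matches cluster["name"] exactly.
    """
    observed = Counter(t["cluster"] for t in theories)
    return {
        c["name"]: (c["theoryCount"], observed.get(c["name"], 0))
        for c in clusters
        if c["theoryCount"] != observed.get(c["name"], 0)
    }


# Made-up records for illustration; not taken from the dataset.
sample_clusters = [
    {"id": "a", "name": "Alpha", "theoryCount": 2},
    {"id": "b", "name": "Beta", "theoryCount": 3},
]
sample_theories = [
    {"slug": "t1", "cluster": "Alpha"},
    {"slug": "t2", "cluster": "Alpha"},
    {"slug": "t3", "cluster": "Beta"},
]
print(count_mismatches(sample_clusters, sample_theories))  # {'Beta': (3, 1)}
```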
License
Creative Commons Attribution-NonCommercial-ShareAlike 4.0
You are free to share and adapt this data for non-commercial purposes, provided you give appropriate credit and distribute any derivative works under the same license.
Usage Examples
Quick examples for working with the data in common tools and languages.
curl
# Download all theories
curl -o theories.json https://4orbs.com/data/theories.json
# Check manifest for latest counts
curl -s https://4orbs.com/data/manifest.json | python3 -m json.tool
Python
import json, urllib.request
url = "https://4orbs.com/data/theories.json"
theories = json.loads(urllib.request.urlopen(url).read())
# Filter by cluster
mh370 = [t for t in theories
         if "MH370" in t["cluster"]]
print(f"{len(mh370)} MH370 theories")
JavaScript
const res = await fetch("https://4orbs.com/data/claims.json");
const claims = await res.json();
// Count by confidence level
// (Object.groupBy needs a recent runtime: Node.js 21+ or a current browser)
const counts = Object.groupBy(claims, c => c.confidence);
console.log(Object.keys(counts).map(
  k => `${k}: ${counts[k].length}`
));
jq
# Top 10 most-viewed videos
curl -s https://4orbs.com/data/videos.json | \
  jq 'sort_by(-.viewCount) | .[0:10] |
      .[] | {title, viewCount}'
# People with 5+ appearances
curl -s https://4orbs.com/data/people.json | \
  jq '[.[] | select((.appearances | length) >= 5)] |
      sort_by(-(.appearances | length)) |
      .[] | {name, count: (.appearances | length)}'