
Applied Research Roadmap:
Foundation Models · Data Schemas · Agentic Systems

The operational blueprint for building production-grade now-casting engines, scenario planning platforms, and real-time alert & decision support systems using EO-AI, Industrial AI, and Research Commercialisation infrastructure.

9 Architecture Parts | 5-Year Roadmap | L1–L5 Maturity Framework | FM · Schemas · Agents · DSS | Led by Dr. Abhay Gupta
🎯
Part I — Strategic Context & Capability Architecture

EO-AI, Industrial AI, and Research Commercialisation are converging on a common intelligence stack. The thesis: the satellite data exists, the AI methods exist, the compute exists. What's missing is organised, well-framed research that connects them to decisions that matter — at the speed those decisions need to be made. Now-casting, scenario planning, and real-time alert systems are the three highest-value commercialisation surfaces at this convergence.

Tri-Domain Convergence

Domain 01
EO-AI
Planetary sensing backbone — continuous, multi-modal, physically grounded observations of Earth systems. Commercialisation surfaces: now-casting dashboards, environmental compliance APIs, crop/disaster intelligence SaaS.
Domain 02
Industrial AI
Operational reasoning layer — process models, supply chain intelligence, infrastructure monitoring, real-time control. Commercialisation surfaces: predictive maintenance, operational decision support, industrial scenario planning.
Domain 03
EO-AI Labs (Research Commercialisation)
Translational bridge — converting research-grade models into reproducible, auditable, scalable products with defensible IP and revenue models. Commercialisation surfaces: licensed FMs, data products, consulting-led deployments.
Maturity Gate
5-Level Capability Model
L1 Data Ingestion → L2 Feature Intelligence → L3 Fusion & Now-Casting → L4 Scenario Planning → L5 Autonomous Decision Support. Each level has quantified gate conditions. Skipping levels creates compounding technical debt.
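
The no-skipping rule can be sketched as follows. This is a minimal illustration, assuming each gate reduces to a single pass/fail flag; the quantified gate conditions themselves are not specified here, and the function name and example status are hypothetical.

```python
LEVELS = [
    "L1 Data Ingestion",
    "L2 Feature Intelligence",
    "L3 Fusion & Now-Casting",
    "L4 Scenario Planning",
    "L5 Autonomous Decision Support",
]


def current_maturity(gates_passed: dict) -> int:
    """Return the highest level reached without skipping any lower gate."""
    level = 0
    for name in LEVELS:
        if not gates_passed.get(name, False):
            break  # a failed gate blocks every level above it
        level += 1
    return level


# Hypothetical status: L3 passes its gate but L2 does not, so L3 does not count.
status = {LEVELS[0]: True, LEVELS[1]: False, LEVELS[2]: True}
print(current_maturity(status))  # prints 1
```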
🧠
Part II — Foundation Model Architecture

Four-Tier Model Taxonomy

Foundation models in the EO-AI stack are not monolithic. Four model tiers (Sensing, Fusion, Reasoning, and Action) each have distinct training paradigms, fine-tuning protocols, and latency budgets. Because the tiers are decoupled, components can be upgraded independently without cascading failures.

Tier 1
Sensing FM
Self-supervised geospatial encoder pretrained on 100M+ global EO patches. Architecture: Vision Transformer with spectral token embeddings (ViT-B width, matching the 768-dim output; a standard ViT-L emits 1024-dim tokens). Pretraining: Masked Autoencoder (MAE) on Sentinel-1/2, Landsat, MODIS, ISRO ResourceSat. Output: 768-dim patch embeddings. Examples: Prithvi-100M, SatMAE, ScaleMAE.
Tier 2
Fusion FM
Cross-modal attention transformer aligning representations across sensors, timesteps, and spatial resolutions. Learnable positional encodings for spatial + temporal + spectral axes. Harmonises Sentinel-1 SAR with Sentinel-2 optical and MODIS temporal context in a unified latent space.
Tier 3
Reasoning FM
Domain-tuned vision-language model for geospatial QA, change captioning, and scenario interpretation. Architecture: Multimodal LLM with EO adapter (e.g., LLaVA-GEO or fine-tuned LLaMA-3). Capabilities: NL query → spatial analysis, change event narration, policy brief generation from model outputs.
Tier 4
Action FM
ReAct / Toolformer-style agentic model translating inference outputs into structured action recommendations, alert payloads, and workflow triggers. Output: JSON-structured action plans with confidence, urgency, and stakeholder routing. All P1 actions require HITL confirmation.
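
The masking step behind the Tier 1 MAE pretraining can be sketched as below. This covers only the random patch selection, not the encoder or decoder. The 224-pixel tile, 16-pixel patch, and 75% mask ratio are assumptions borrowed from common MAE practice, not values stated in this roadmap, and `mae_mask` is a hypothetical helper.

```python
import random


def mae_mask(image_size: int, patch_size: int, mask_ratio: float, seed: int = 0):
    """Split a square tile into patches and choose which patches to hide."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    num_patches = per_side * per_side
    num_masked = int(num_patches * mask_ratio)

    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # uniform random choice of patches

    masked = sorted(order[:num_masked])    # reconstructed by the decoder
    visible = sorted(order[num_masked:])   # the only patches the encoder sees
    return visible, masked


visible, masked = mae_mask(image_size=224, patch_size=16, mask_ratio=0.75)
print(len(visible), len(masked))  # prints 49 147
```

Only the visible patches reach the encoder, which is what keeps label-free pretraining affordable at this scale.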

Pretraining Protocol

Stage 1
Global Pretraining
Data: 500M+ global EO patches (Sentinel-1/2, Landsat, MODIS, SAR) · Objective: Self-supervised MAE / contrastive learning; no labels required · Compute: 4,000–16,000 GPU-hours (A100); one-time cost
Stage 2
Domain Adaptation
Data: 50M+ domain patches (e.g., India crops, Arctic ice, tropical forest) · Objective: Continued MAE + domain-specific contrastive pairs · Compute: 500–2,000 GPU-hours; repeat per domain
Stage 3
Task Fine-Tuning
Data: 10K–500K labelled examples per task · Objective: Supervised head training (segmentation, detection, regression) · Compute: 10–200 GPU-hours per task; iterate frequently
Stage 4
RLHF / RLAIF
Data: Expert analyst feedback + automated reward signals (NDVI correlation, physics constraints) · Objective: Align outputs with domain expert expectations and physical laws · Compute: 50–500 GPU-hours; monthly cadence
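
The stage budgets above combine into a programme-level estimate with simple arithmetic. The sketch below copies the GPU-hour ranges from the four stages. The plan of 3 domains, 10 tasks, and 12 monthly alignment cycles is a hypothetical example, and each task is counted as one fine-tuning pass.

```python
# GPU-hour ranges (low, high) copied from the four stages above.
STAGE_HOURS = {
    "global_pretraining": (4_000, 16_000),  # one-time
    "domain_adaptation": (500, 2_000),      # per domain
    "task_fine_tuning": (10, 200),          # per task
    "rlhf_rlaif": (50, 500),                # per monthly cycle
}


def budget(domains: int, tasks: int, rlhf_months: int):
    """Return (low, high) total GPU-hours for a given plan."""
    multipliers = {
        "global_pretraining": 1,
        "domain_adaptation": domains,
        "task_fine_tuning": tasks,
        "rlhf_rlaif": rlhf_months,
    }
    low = sum(STAGE_HOURS[stage][0] * m for stage, m in multipliers.items())
    high = sum(STAGE_HOURS[stage][1] * m for stage, m in multipliers.items())
    return low, high


# Hypothetical plan: 3 domains, 10 tasks, 12 monthly alignment cycles.
print(budget(domains=3, tasks=10, rlhf_months=12))  # prints (6200, 30000)
```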
🗂️
Part III — Canonical Data Schemas

Data interoperability is the single largest bottleneck in EO-AI production systems. The five-layer schema stack below runs from raw ingestion to decision payloads and is STAC-compatible, OGC-compliant, and JSON-LD serialisable.

Layer 0
Raw Ingestion
Sensor metadata (platform, acquisition datetime, orbit, cloud cover %), raw band values (uint16/float32), CRS (EPSG), tile reference (MGRS/H3), processing baseline version, provenance URI (STAC Item href).
Layer 1
Analysis-Ready Data
Surface reflectance / sigma-naught (SAR), quality band (SCL/cloud mask), per-pixel uncertainty, harmonised to a common 10 m UTM grid. Schema: STAC Item + EO Extension + SAR Extension + View Geometry Extension (source of view:sun_elevation). Required: eo:bands[], sar:polarizations[], view:sun_elevation, created, updated.
Layer 2
Feature Store
Computed features keyed by (tile_id, timestamp, feature_set_version). Stores: spectral indices (NDVI, NDWI, NBR, RVI), GLCM texture, temporal statistics (mean, std, 10th/90th percentile), model embeddings (768-dim vector). Storage: columnar GeoParquet + vector index (FAISS/pgvector).
Layer 3
Inference Outputs
Per-scene inference results. Fields: model_id, inference_datetime, input_tile_ids[], output_class or output_value, confidence_score (0–1), uncertainty_band (aleatoric + epistemic), spatial_mask (GeoJSON), anomaly_score, change_flag (bool), lineage_hash. Schema enforced by Pydantic v2.
Layer 4
Decision Payload
Structured output consumed by downstream systems. Fields: alert_id (UUID), event_type (taxonomy code), severity (P1–P4), geo_extent (GeoJSON), trigger_model_id, evidence_tiles[], recommended_actions[] (enum), stakeholder_routes[] (role-based), expiry_datetime, human_review_required (bool), audit_trail_url.
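
The hand-off from a Layer 3 record to a Layer 4 payload could look roughly like this. The stack above names Pydantic v2 for enforcement. This sketch uses standard-library dataclasses to stay dependency-free and covers only a subset of the listed fields. The `to_payload` helper, the 24-hour expiry, and all example values are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class InferenceOutput:
    """Layer 3 record (subset of the fields listed above)."""
    model_id: str
    inference_datetime: datetime
    input_tile_ids: list
    output_class: str
    confidence_score: float
    change_flag: bool

    def __post_init__(self):
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence_score must lie in [0, 1]")
        if not self.input_tile_ids:
            raise ValueError("input_tile_ids must not be empty")


@dataclass(frozen=True)
class DecisionPayload:
    """Layer 4 record (subset of the fields listed above)."""
    event_type: str
    severity: str
    trigger_model_id: str
    evidence_tiles: list
    expiry_datetime: datetime
    human_review_required: bool
    alert_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self):
        if self.severity not in {"P1", "P2", "P3", "P4"}:
            raise ValueError("severity must be one of P1, P2, P3, P4")


def to_payload(record, event_type, severity, ttl_hours=24):
    """Wrap an inference record in a decision payload (hypothetical helper)."""
    return DecisionPayload(
        event_type=event_type,
        severity=severity,
        trigger_model_id=record.model_id,
        evidence_tiles=list(record.input_tile_ids),
        expiry_datetime=record.inference_datetime + timedelta(hours=ttl_hours),
        human_review_required=(severity == "P1"),  # all P1 actions need HITL
    )


record = InferenceOutput(
    model_id="flood-seg-v1",  # hypothetical model identifier
    inference_datetime=datetime(2024, 7, 1, 6, 0, tzinfo=timezone.utc),
    input_tile_ids=["44RQQ"],  # hypothetical tile reference
    output_class="flood",
    confidence_score=0.91,
    change_flag=True,
)
payload = to_payload(record, event_type="FLOOD_EXTENT", severity="P1")
print(payload.severity, payload.human_review_required)  # prints: P1 True
```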

Data Standards Compliance

Standard
STAC 1.1
Canonical metadata for all EO assets. All ingested datasets must have valid STAC Items; STAC API endpoint required for all data lakes.
Standard
GeoParquet 1.1
High-performance columnar storage for feature store and inference outputs. H3 resolution-8 spatial index embedded in schema.
Standard
OGC API — Features / Coverages
Interoperable query interface for vector and raster outputs. All alert polygons and inference outputs exposed via OGC API.
Standard
CF Conventions 1.11 + FAIR
All gridded now-cast and scenario outputs written as CF-compliant NetCDF4. All training datasets assigned DOI; data cards published alongside model cards.
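
For the STAC requirement, a minimal Item and a structural pre-check might look like the sketch below. It tests only the core top-level fields and `properties.datetime`. It is not a substitute for validation against the official JSON Schema, which also allows a null geometry without a bbox. The item id, coordinates, and asset href are made up for illustration.

```python
REQUIRED_TOP_LEVEL = (
    "type", "stac_version", "id", "geometry", "bbox",
    "properties", "links", "assets",
)


def check_item(item: dict) -> list:
    """Return a list of structural problems (empty list means none found)."""
    problems = [f"missing field: {k}" for k in REQUIRED_TOP_LEVEL if k not in item]
    if item.get("type") != "Feature":
        problems.append("type must be 'Feature'")
    if "datetime" not in item.get("properties", {}):
        problems.append("properties.datetime is required")
    return problems


item = {
    "type": "Feature",
    "stac_version": "1.1.0",
    "id": "example-s2-tile-20240101",  # hypothetical identifier
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[77.0, 28.0], [78.0, 28.0], [78.0, 29.0],
                         [77.0, 29.0], [77.0, 28.0]]],
    },
    "bbox": [77.0, 28.0, 78.0, 29.0],
    "properties": {"datetime": "2024-01-01T05:30:00Z"},
    "links": [],
    "assets": {
        "B04": {"href": "s3://example-bucket/B04.tif"},  # hypothetical location
    },
}

print(check_item(item))  # prints []
```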
🤖
Part IV — Agentic Framework Design

The agentic layer transforms passive model inference into active decision support via a Perceive–Reason–Act–Learn loop, continuously processing EO data streams, reasoning over multi-step analytical chains, triggering actions in connected systems, and learning from feedback.

PRAL Loop Architecture

Perceive
Continuous ingestion of EO inference outputs (Layer 3 records), industrial sensor streams, weather model outputs, and external event feeds (GDACS, USGS, IMD). Implements stream processors (Apache Kafka / Flink) with configurable trigger rules. Output: prioritised event queue ranked by anomaly score × asset exposure.
Reason
Multi-step LLM-based reasoning chain (ReAct pattern) that interprets event queue items, queries relevant historical context from feature store, invokes specialised analytical tools, and formulates a structured situation assessment. Uses chain-of-thought with physical constraint checking.
Act
Converts situation assessment into structured Decision Payloads (Layer 4) and routes them to appropriate stakeholder systems: alert APIs, dashboard state updates, email/SMS/WhatsApp triggers, GIS platform updates, SCADA system inputs, and human analyst review queues. All actions logged to immutable audit trail.
Learn
Collects feedback signals: analyst confirmations/rejections, outcome observations (was the predicted flood confirmed?), false positive reports. Feeds into RLHF pipeline for Tier 4 Action FM. Updates alert threshold calibration tables weekly. Generates automated performance reports for model governance.
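
The Perceive output, an event queue ranked by anomaly score × asset exposure, can be sketched with a binary heap. This is an in-memory illustration only. A production system would sit on the stream processor named above, and the class names, event ids, and scores are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedEvent:
    sort_key: float
    event_id: str = field(compare=False)


class EventQueue:
    """Priority queue that returns the highest-priority event first."""

    def __init__(self):
        self._heap = []

    def push(self, event_id: str, anomaly_score: float, asset_exposure: float):
        priority = anomaly_score * asset_exposure
        # heapq is a min-heap, so store the negated priority.
        heapq.heappush(self._heap, QueuedEvent(-priority, event_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap).event_id

    def __len__(self):
        return len(self._heap)


queue = EventQueue()
queue.push("flood-A", anomaly_score=0.9, asset_exposure=0.2)  # priority 0.18
queue.push("fire-B", anomaly_score=0.6, asset_exposure=0.8)   # priority 0.48
queue.push("crop-C", anomaly_score=0.3, asset_exposure=0.5)   # priority 0.15

order = [queue.pop() for _ in range(len(queue))]
print(order)  # prints ['fire-B', 'flood-A', 'crop-C']
```

Note that a strong anomaly over a lightly exposed area (flood-A) ranks below a moderate anomaly over heavily exposed assets (fire-B), which is the intent of the product ranking.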

Multi-Agent Orchestration Patterns

Pattern 1
Supervisor–Worker
A Supervisor Agent decomposes complex requests into sub-tasks dispatched to specialist Worker Agents (damage detector, exposure calculator, relief route planner). Results aggregated into unified situation report. Suitable for: post-disaster response, multi-hazard compound events.
Pattern 2
Pipeline Chain
Agents chained in fixed sequence: Ingest → Preprocess → Detect → Assess → Route. Each agent passes structured output to the next via the canonical schema stack. Suitable for: routine now-casting pipelines, scheduled monitoring products, regulatory compliance reports.
Pattern 3
Debate & Consensus
Multiple specialised agents independently analyse the same event (SAR agent, optical agent, hydrological model agent) and a Consensus Agent resolves disagreements using Bayesian ensemble weighting. Suitable for: high-stakes alerts where false positives have large costs (dam safety, evacuation orders).
Pattern 4
Human-in-the-Loop
Agent generates draft assessment and flags for human analyst review before any external action is taken. Analyst can approve, modify, or reject. Rejection triggers explanation request and updates RLHF feedback queue. Mandatory for: all P1 alerts, novel event types outside training distribution, regulatory submissions.
Pattern 5
Background Monitor
Always-on agent subscribes to EO data streams and maintains a live world model state per AOI and asset class. Triggers event-driven alerts when state change exceeds configured thresholds. Suitable for: persistent monitoring of critical infrastructure, early warning systems, environmental compliance.
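
One simple reading of the Debate & Consensus pattern is sketched below. Each agent's vote is weighted by a Beta-posterior estimate of its past reliability. This is an assumption about what "Bayesian ensemble weighting" could mean in practice, not a description of the actual Consensus Agent. The track records and probabilities are invented.

```python
def reliability(confirmed: int, rejected: int) -> float:
    """Posterior mean of a Beta(1, 1) prior updated with past outcomes."""
    return (confirmed + 1) / (confirmed + rejected + 2)


def consensus(votes) -> float:
    """Reliability-weighted mean of agent probabilities.

    votes: iterable of (probability, confirmed_count, rejected_count).
    """
    votes = list(votes)
    weights = [reliability(c, r) for _, c, r in votes]
    weighted_sum = sum(w * p for w, (p, _, _) in zip(weights, votes))
    return weighted_sum / sum(weights)


# Hypothetical flood assessment from three specialised agents.
votes = [
    (0.9, 18, 2),  # SAR agent: strong track record
    (0.4, 5, 5),   # optical agent: mixed record (often cloud-affected)
    (0.8, 8, 2),   # hydrological model agent
]
p_flood = consensus(votes)
print(round(p_flood, 2))
```

The result sits above the unweighted mean of 0.70 because the agents with better track records both report a high flood probability.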
⏱️
Part V — Now-Casting Engine

Now-casting produces high-confidence estimates of current Earth system state (0–72 hour horizon), filling the gap between the last satellite overpass and the present moment by combining recent EO observations with NWP model outputs, sensor fusion, and learned temporal dynamics.

5-Module Architecture

Module 1
Observation Fusion
Ingests all available observations from the past 24–48 hours: SAR (cloud-penetrating), optical (cloud-free), passive microwave, VIIRS NTL, and in-situ IoT sensors. Harmonises to common spatial grid (1 km default; 10 m for priority AOIs). Outputs a gapless composite via weighted spatial interpolation using observation quality flags.
Module 2
Temporal Dynamics
LSTM or Temporal Transformer trained on 5+ years of historical feature time series per land cover class. Predicts expected current state based on recent trajectory and seasonal climatology. Outputs prior estimate with uncertainty for each gridcell, serving as background field for data assimilation.
Module 3
Data Assimilation
Ensemble Kalman Filter (EnKF) or learned 4D-Var analogue merging observation fusion outputs with temporal dynamics model prior. Produces posterior now-cast state estimate with full covariance matrix. Correctly propagates observation uncertainty and model uncertainty into the final product.
Module 4
NWP Downscaling
Statistical or physics-constrained GAN downscaling of NWP fields (temperature, humidity, wind, precipitation) from 10–50 km to 1 km. Conditioned on high-resolution topography and land cover. Provides atmospheric forcing for surface state extrapolation beyond last EO overpass.
Module 5
Output Renderer
Assembles final now-cast product as CF-compliant NetCDF + GeoTIFF + STAC Item. Computes skill scores against persistence and climatology baselines (CRPS, FSS, ACC). Publishes to OGC API — Coverages endpoint and pushes update events to subscribed downstream agents.
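
The core of Module 3 can be illustrated with a scalar Kalman analysis step for a single gridcell. This is a deliberate simplification: an EnKF estimates the prior variance from an ensemble and handles spatial covariance, neither of which is shown here. The soil-moisture numbers are hypothetical.

```python
def kalman_update(prior_mean, prior_var, obs, obs_var):
    """Merge a model prior with one observation for one gridcell."""
    gain = prior_var / (prior_var + obs_var)  # trust in the observation
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var       # never exceeds the prior variance
    return post_mean, post_var


# Hypothetical gridcell: Module 2 prior vs. a fused observation from Module 1.
mean, var = kalman_update(prior_mean=0.30, prior_var=0.04, obs=0.20, obs_var=0.01)
print(round(mean, 3), round(var, 3))  # prints 0.22 0.008
```

Because the observation is four times more certain than the prior, the posterior moves 80% of the way towards it, and the posterior variance falls below both inputs.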

Now-Cast Product Catalogue

Product · Daily
Vegetation Stress Index (VSI)
250 m / 1 km. Primary users: Agriculture ministries, crop insurance, commodity traders. Feeds PM-FASAL and PMFBY systems.
Product · 6-hourly
Flood Inundation Extent
20–30 m (SAR pass). Primary users: DMA/NDMA, emergency managers, infrastructure operators. Brahmaputra / Ganga / coastal deltas.
Product · Hourly
Active Fire & Smoke Plume
375 m / 1 km (VIIRS/Himawari). Primary users: Forest departments, aviation, air quality agencies. Feeds SAFAR and CPCB systems.
Product · 12-hourly
Surface Soil Moisture
1 km (downscaled SMAP). Primary users: Farmers, irrigation agencies, drought monitors. Feeds NDWRS and IMD advisories.
Product · 3-hourly
Urban Heat Stress Index
70 m (ECOSTRESS). Primary users: Public health authorities, smart city operators. Feeds NDMA heat action plans.
Product · Daily
Industrial Emission Plume
3.5 × 5.5 km (TROPOMI). Primary users: Regulators (CPCB, EPA), ESG auditors, climate disclosure platforms. Feeds NCAP monitoring.
🔮
Part VI — Scenario Planning Platform

Scenario planning extends now-casting into probabilistic futures under alternative policy, climate, or operational assumptions. The platform ingests now-cast state as its initial condition and propagates it forward using an ensemble of models conditioned on user-defined scenario parameters.

6-Module Platform Architecture

Scenario Parameter Engine
Accepts structured scenario definitions via API or natural language. Parameters include: climate forcing (SSP1-1.9 / SSP2-4.5 / SSP5-8.5), policy levers (deforestation ban, irrigation quota), extreme event forcings (Category 4 cyclone, 1-in-100-yr drought), and operational decisions (infrastructure expansion, crop switching). Outputs validated JSON Scenario Specification.
Ensemble Model Runner
Executes N=50–500 ensemble members per scenario using cloud HPC. Near-term (<1 month): perturbed initial condition ensembles. Medium-term (1–12 months): multi-model ensembles (EO-AI + NWP + hydrological). Long-term (1–50 yr): CMIP6-downscaled statistical emulators. All runs output CF-compliant NetCDF to shared object storage.
Impact Assessment Module
Post-processes ensemble outputs through sector-specific impact models: (1) Agricultural yield response functions (DSSAT/APSIM surrogates), (2) Flood damage curves (HAZUS-calibrated, country-specific), (3) Carbon flux models (flux tower-calibrated), (4) Infrastructure failure probability (engineering fragility curves). Outputs: 10th/50th/90th percentile impact estimates per scenario.
Scenario Comparison Engine
Computes pairwise divergence between scenarios using Earth Mover's Distance on spatial distributions. Identifies 'tipping point' scenarios where small parameter changes produce disproportionate impact shifts. Flags scenarios outside the training distribution of the underlying models (OOD warning).
Counterfactual Generator
Produces 'what if?' counterfactuals using DoWhy causal inference graphs, ensuring physical consistency and policy relevance. Examples: What if Cyclone Fani had made landfall 50 km north? What if stubble burning were eliminated from Punjab-Haryana?
Scenario Narrative Writer
LLM-based module (Reasoning FM, Tier 3) converting ensemble output statistics into structured scenario narratives: executive summary (200 words), sector impact assessment (500 words/sector), uncertainty statement, recommended contingency actions, and monitoring indicators. Output: structured markdown + auto-formatted PDF brief.
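
Two of the post-processing steps above, percentile impact estimates and pairwise scenario divergence, can be sketched on one-dimensional ensemble outputs. The platform description applies Earth Mover's Distance to spatial distributions. The version below is the 1-D special case for equal-sized ensembles and is only a simplified stand-in. The ensemble values are synthetic.

```python
import statistics


def impact_percentiles(members):
    """10th/50th/90th percentile of an ensemble of impact values."""
    deciles = statistics.quantiles(members, n=10)  # nine cut points
    return {"p10": deciles[0], "p50": deciles[4], "p90": deciles[8]}


def emd_1d(a, b):
    """Earth Mover's Distance between two equal-sized 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal ensemble sizes")
    pairs = zip(sorted(a), sorted(b))
    return sum(abs(x - y) for x, y in pairs) / len(a)


# Synthetic ensembles, N=50: a baseline and a scenario shifted by +5 units.
baseline = [float(i) for i in range(1, 51)]
scenario = [x + 5.0 for x in baseline]

summary = impact_percentiles(baseline)
print(summary["p50"])              # prints 25.5
print(emd_1d(baseline, scenario))  # prints 5.0
```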
🚨
Part VII — Real-Time Alert & Decision Support Systems

Alert Pipeline Architecture

Event Detection
Stream processor (Apache Flink / Kafka Streams) applying three detection modes: (1) Threshold crossing (flood depth >0.5 m in populated area), (2) Rate-of-change alert (NDVI decline >2σ in 7 days), (3) Compound event (drought + locust presence + crop stress co-occurring). All events logged with nanosecond timestamp, model provenance, and raw trigger values.
Triage & Deduplication
Merges spatially overlapping alerts within a 1-hr window. Applies geospatial clustering (DBSCAN on alert centroids) and temporal windowing. Assigns unified Compound Alert ID. Applies pre-configured suppression rules for known false-positive patterns (SAR layover artefacts in mountains, cloud shadows misclassified as flood).
Severity Classification
P1 — Life-safety imminent; auto-route to emergency services (HITL required) · P2 — High economic/environmental impact; route to sector decision-makers · P3 — Monitoring advisory; route to operational teams · P4 — Background information; push to dashboards, no active notification. Classification uses: population exposure × asset value at risk × confidence × time-to-impact.
Stakeholder Routing
Configurable routing table mapping event_type × severity × geography to notification lists. Supports: REST webhooks, SMS (Twilio), WhatsApp (WABA), email, ESRI ArcGIS push, Slack/Teams channels, SCADA alarm bus (OPC-UA), and national emergency portals (IEMS, NDEMS). All routing decisions logged and auditable.
Alert Verification
Every alert has a defined verification pathway: automated verification (a subsequent EO observation confirms or denies it), human analyst verification (mandatory for P1/P2), or stakeholder field confirmation. Verified outcomes feed back into the Learn Layer. Alert closure triggers generation of a structured post-event report by the Reasoning FM.
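
The three detection modes can be expressed as plain predicates. This sketch runs on in-memory values rather than a Flink or Kafka Streams job. It reads "NDVI decline >2σ in 7 days" as a 7-day change falling more than two standard deviations below the historical mean 7-day change, which is an interpretation rather than a stated definition. Function names and sample data are hypothetical.

```python
import statistics


def threshold_crossing(flood_depth_m: float, populated: bool, limit_m: float = 0.5) -> bool:
    """Mode 1: flood depth above the limit in a populated area."""
    return populated and flood_depth_m > limit_m


def rate_of_change_alert(history, latest_change: float, n_sigma: float = 2.0) -> bool:
    """Mode 2: latest 7-day change is an unusually large decline."""
    mean = statistics.fmean(history)
    sigma = statistics.stdev(history)
    return latest_change < mean - n_sigma * sigma


def compound_event(flags: dict,
                   required=("drought", "locust_presence", "crop_stress")) -> bool:
    """Mode 3: all required conditions co-occur."""
    return all(flags.get(name, False) for name in required)


# Hypothetical history of 7-day NDVI changes for one gridcell.
ndvi_changes = [-0.02, 0.01, 0.00, -0.01, 0.02, 0.00, -0.01, 0.01]

print(threshold_crossing(0.7, populated=True))    # prints True
print(rate_of_change_alert(ndvi_changes, -0.08))  # prints True
print(rate_of_change_alert(ndvi_changes, -0.02))  # prints False
print(compound_event({"drought": True, "crop_stress": True}))  # prints False
```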

DSS Design Principles

Explainability First
Every recommendation must include: specific evidence tiles and model outputs that triggered it, confidence score decomposition (aleatoric vs. epistemic uncertainty), comparable historical events from the training database, and a physical mechanism explanation (1–3 sentences). Unexplainable alerts are automatically downgraded to P4.
Graceful Degradation
DSS must produce useful outputs even when primary data sources are unavailable. Degradation hierarchy: (1) latest SAR + optical fusion; (2) SAR-only (cloud events); (3) climatological prior + NWP downscaling; (4) persistence-based alert hold. System must always communicate current data availability with explicit staleness indicators.
Human Authority Preservation
The system recommends; humans decide. All P1 alerts require human analyst confirmation before external notification is sent. The system must never autonomously trigger physical-world actions (valve closure, evacuation orders, financial transactions) without explicit human authorisation. This constraint is hardcoded at the Act Layer and cannot be overridden by agent reasoning.
Equity Auditing
Alert systems must be audited quarterly for spatial equity: Are P1 alerts equally responsive in low-income vs. high-income areas? Are monitoring AOIs representative of vulnerable populations? Audit reports must be published. Alert thresholds must be calibrated separately for urban and rural contexts to prevent systematic under-alerting in data-sparse regions.
Immutable Audit Trail
All alerts, agent actions, analyst decisions, and feedback events written to an append-only audit log (cryptographically signed, WORM-compliant storage). Enables: post-event investigation, regulatory reporting, model performance attribution, and legal defensibility. Retention: 7 years minimum.
Sovereign Data Compliance
Data routing must respect sovereignty constraints. Indian EO data (Cartosat, RISAT) must be processed within Indian cloud infrastructure (MeitY-approved). EU data (Copernicus) must comply with Copernicus Data Policy. Cross-border data flows must have documented legal basis. DSS architecture must support data residency configuration per AOI.
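
Human Authority Preservation and the Immutable Audit Trail can be combined in a small sketch: a dispatch gate that holds P1 alerts until a human approves, and a log in which each entry is chained to the previous one with an HMAC. An HMAC chain is a tamper-evident stand-in, assuming a shared secret key. It is not a full digital signature scheme, and WORM storage is out of scope. All names and values are hypothetical.

```python
import hashlib
import hmac
import json


class AuditLog:
    """Append-only log where each entry authenticates its predecessor."""

    def __init__(self, key: bytes):
        self._key = key
        self._entries = []

    def _mac(self, prev: str, body: str) -> str:
        return hmac.new(self._key, (prev + body).encode(), hashlib.sha256).hexdigest()

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["mac"] if self._entries else ""
        body = json.dumps(event, sort_keys=True)
        mac = self._mac(prev, body)
        self._entries.append({"body": body, "prev": prev, "mac": mac})
        return mac

    def verify(self) -> bool:
        prev = ""
        for entry in self._entries:
            expected = self._mac(prev, entry["body"])
            if entry["prev"] != prev or not hmac.compare_digest(expected, entry["mac"]):
                return False
            prev = entry["mac"]
        return True


def dispatch(alert: dict, approved_by, log: AuditLog) -> bool:
    """Send an alert externally only if the HITL rule is satisfied."""
    if alert["severity"] == "P1" and approved_by is None:
        log.append({"alert_id": alert["alert_id"], "action": "held_for_review"})
        return False
    log.append({"alert_id": alert["alert_id"], "action": "sent",
                "approved_by": approved_by})
    return True


log = AuditLog(key=b"demo-key-not-for-production")
alert = {"alert_id": "a-001", "severity": "P1"}

print(dispatch(alert, approved_by=None, log=log))          # prints False
print(dispatch(alert, approved_by="analyst-7", log=log))   # prints True
print(log.verify())                                        # prints True
```

Keeping the gate in ordinary code outside the agent, rather than in a prompt, is one way to make the constraint impossible for agent reasoning to override.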
🏢
Part VIII — EO-AI Labs: Research Commercialisation Roadmap

Phased Commercialisation Plan

Phase 0
Months 0–6
Foundation
  • Cloud HPC + MLOps infrastructure
  • STAC data lake + GeoParquet store
  • 3 FM checkpoints begun
  • Schema stack v1.0 published
  • 2 PoC partner agreements
  • IP disclosure framework
  • DSIR recognition application
10M+ tiles · STAC <500ms · FM loss converging
Phase 1
Months 6–18
Intelligence
  • Fusion FM trained
  • Reasoning FM fine-tuned
  • Feature store operational
  • First now-cast product live
  • Basic alert engine (P3–P4)
  • 3 publications submitted
  • Provisional patent filed
MRR >₹5L or grant >₹1Cr · POD >0.85
Phase 2
Months 18–36
Operationalise
  • Action FM (agentic layer)
  • P1–P2 alerts with HITL
  • Scenario planning MVP
  • DSS dashboard v1.0
  • Industrial AI integration
  • ISO 27001 certification
ARR >₹2Cr · FAR <0.20 · 2+ enterprise clients
Phase 3
Months 36–60
Scale
  • 5+ domain verticals
  • FM licensing programme
  • Multi-country deployment
  • ESA BIC / ADB Ventures
  • Series A or Govt. contract
ARR >₹5Cr · FM licensed in 3+ products
Phase 4
Year 3–5
Lead
  • Global EO-AI Lab network
  • Sovereign FM deployments
  • Net Zero MRV platform
  • L5 autonomous monitoring
  • Patent portfolio >10
ARR >₹15Cr · 3+ countries · Top-3 ranking

Funding Architecture

DST-SERB / ANRF
₹50L – ₹2Cr · 3-year
Core research grant. Best fit Phase 0–1: foundational model research, dataset creation.
ISRO RESPOND / SAC
₹25L – ₹1Cr · 2-year
Mission-linked deliverables. Best fit Phase 0–1: India-specific EO application development.
MeitY TIDE 2.0
₹50L – ₹1.5Cr
Incubation grant + mentorship. Best fit Phase 1: product MVP, first enterprise pilot.
ESA BIC / Copernicus
€50K – €200K equity-free
2-year programme. Best fit Phase 1–2: international product validation, European clients.
USAID DIV
$100K – $2M milestone
Humanitarian use case focus. Best fit Phase 1–2: disaster management, food security.
ADB Ventures / GEF
$250K – $2M
Climate/environment focus. Best fit Phase 2: climate adaptation products, SIDS deployment.
Venture Capital
₹3Cr – ₹25Cr Series A
15–25% equity. Best fit Phase 2–3: commercial scale-up, international expansion.
Enterprise SaaS
₹50L – ₹5Cr/yr ARR
Per-client ARR model. Best fit Phase 1 onwards: primary commercial revenue stream.
FM Data Licensing
₹10L – ₹50L/yr
Passive revenue per licensee. Best fit Phase 2 onwards: scale without proportional cost growth.
💻
Part IX — Reference Technology Stack
ML Training & MLOps
Primary: PyTorch 2.x + HuggingFace Accelerate + FSDP · MLflow + DVC + Weights & Biases · Alternatives: JAX/Flax, Kubeflow, SageMaker, Vertex AI
Data Infrastructure
Primary: STAC-fastapi + rio-tiler + Dask · S3/GCS + GeoParquet + pgvector · Alternatives: OpenEO, Azure Planetary Computer, Databricks Delta Lake
Stream Processing
Primary: Apache Kafka + Apache Flink · Alternatives: AWS Kinesis, Apache Pulsar, Confluent Cloud
Agentic Orchestration
Primary: LangGraph + LangChain Tools + FastAPI · Alternatives: AutoGen, CrewAI, Haystack, DSPy
LLM Serving
Primary: vLLM + TGI on GPU cluster · Alternatives: AWS Bedrock, Azure OpenAI Service, Ollama (edge)
Geospatial API & UI
Primary: pygeoapi + TiTiler + FastAPI · Mapbox GL JS + React + Deck.gl · Alternatives: GeoServer, ESRI ArcGIS Online, Leaflet + MapLibre
Alert Delivery
Primary: Twilio (SMS) + Firebase (push) + REST webhooks · SCADA via OPC-UA · Alternatives: AWS SNS, PagerDuty, Grafana OnCall
Observability & Governance
Primary: Prometheus + Grafana + Evidently AI (model drift) · HashiCorp Vault + OPA + MeitY-approved cloud (sovereign) · Alternatives: Datadog, Arize AI, Azure Purview