NodeRun
Execution Infrastructure for AI Agents
Reproducible, auditable, cost-efficient distributed execution layer—taking AI Agents from demos to production scale.
Abstract
As LLM-powered AI Agents transition from theory to practice, a structural bottleneck has emerged: execution. This paper examines the inherent tensions in current Agent execution approaches—cost, security, scalability, and reproducibility—and presents NodeRun's architecture and technical implementation in detail.
NodeRun is an execution layer purpose-built for AI Agents. By decoupling compute from the LLM and placing it in a stateless, isolated, auditable, and cost-efficient distributed runtime, NodeRun provides the critical infrastructure for scaling AI Agents from demos to production. For deterministic workloads, reproducible execution is available as an opt-in feature.
The NodeX Labs Ecosystem: NodeRun & NodeHub Synergy
NodeRun is the AI Agent execution layer product from NodeX Labs, designed to work in tight synergy with NodeHub, the company's core infrastructure. Understanding this ecosystem positioning is key to grasping NodeRun's strategic value and technical advantages.
NodeX Labs Product Portfolio
NodeX Labs is building next-generation distributed compute infrastructure. The core products include:
- •NodeHub: The flagship product—a decentralized compute network with 11,000+ globally distributed nodes. NodeHub provides elastic, low-cost compute capacity for diverse workloads and serves as the infrastructure layer for the entire NodeX Labs ecosystem.
- •NodeRun: The AI Agent execution layer built on NodeHub. NodeRun abstracts NodeHub's distributed compute into a standardized, Agent-friendly execution service—NodeX Labs' strategic product for the AI Agent market.
Technical Synergy Between NodeRun & NodeHub
| Layer | NodeHub Provides | NodeRun Abstracts |
|---|---|---|
| Compute Supply | 11,000+ globally distributed nodes with elastic, redundant capacity | Intelligent scheduling that routes tasks to optimal nodes for low latency and high availability |
| Cost Structure | Structural cost advantages from distributed supply—far below centralized cloud providers | Pay-per-run, sub-second billing that passes cost savings directly to Agent developers |
| Trust Mechanisms | Node registration, reputation systems, and economic incentives as foundational trust framework | Proof-Lite / Proof-Strong execution attestation system providing auditability for AI Agents |
| Network Coverage | Global node distribution across multiple geographic regions | Proximity-based routing and IP diversity for compliance and performance in public web interactions |
Ecosystem Role: The "Execution Gateway" for AI Agents
Within the NodeX Labs ecosystem, NodeRun serves as the "execution gateway for AI Agents":
- •Upstream: Provides plug-and-play execution capabilities for AI Agent frameworks (LangChain, AutoGPT, Claude Computer Use, etc.) through standard interfaces like MCP.
- •Downstream: Translates Agent execution demands into NodeHub distributed compute calls, fully leveraging NodeHub's scale and cost advantages.
- •Lateral: Integrates with other NodeX Labs ecosystem products (storage, data services, etc.) to provide comprehensive infrastructure support for AI Agents.
NodeX Labs Ecosystem
The Structural Tension: LLM Probabilism vs. Agent Determinism
The AI Agent paradigm centers on the "perceive-reason-act" loop. While "reasoning" is powered by LLM capabilities, "acting" requires a reliable execution environment. The problem: LLM characteristics fundamentally conflict with traditional compute execution requirements, creating architectural friction in current Agent systems.
| Dimension | LLM Characteristics | Execution Requirements |
|---|---|---|
| Output | Probabilistic: Same input doesn't guarantee identical output; prone to hallucination. | Deterministic: Requires isolated, stateless environments for predictable results; reproducible execution available for closed-form tasks. |
| State | Stateful: Relies on large context windows to maintain conversation and reasoning coherence. | Stateless: Each execution should run in a clean, isolated environment to prevent cross-task state pollution. |
| Cost Model | Token-based: Cost correlates with input/output text length, not computational complexity. | Resource-based: Cost should correlate with actual compute resources consumed (CPU, memory, time). |
| Trust | Implicit: Users trust the model provider's brand and opaque internal processes. | Explicit Audit: Execution process and results must be auditable, ideally with cryptographic proofs for integrity and reproducibility. |
Force-coupling these fundamentally incompatible modules leads to predictable problems:
- •Cost Explosion: Having LLMs "simulate" execution or pass large intermediate data through context consumes massive tokens, making high-frequency tasks economically unviable.
- •Scalability Ceiling: Relying on LLM context for execution state management creates fragile, complex systems that resist large-scale, high-concurrency Agent deployment.
- •Audit Impossibility: In finance, automated testing, and other high-stakes domains, probabilistic outputs and opaque processes are deal-breakers.
Use Cases: NodeRun in Real-World Applications
NodeRun's design philosophy stems from real business needs. The following are typical application scenarios demonstrating how NodeRun solves practical challenges faced by AI Agents in production environments.
Multi-Agent Collaboration Systems
In complex business workflows, multiple AI Agents need to work together. For example, an automated research system might include: Data Collection Agent, Analysis Agent, Report Generation Agent, and Quality Review Agent.
Challenges
- •How to avoid resource contention and state pollution when multiple Agents execute simultaneously?
- •How to prevent fault propagation when one Agent crashes?
- •How to precisely track each Agent's resource consumption and costs?
NodeRun Solution
- •Independent Sandboxes: Each Agent executes in a completely isolated environment
- •Fault Isolation: Single Agent failures don't affect other Agents
- •Cost Attribution: Each execution has independent billing records for precise cost allocation
Financial Data Analysis & Compliance
The financial industry has strict compliance requirements for data processing. AI Agents performing financial analysis must ensure data security and operational auditability.
Challenges
- •How to prove Agent analysis results are computed from original data, not fabricated?
- •How to meet regulatory audit requirements for data processing?
- •How to ensure sensitive financial data doesn't leak outside the execution environment?
NodeRun Solution
- •Proof-Lite Attestation: Generate cryptographic proofs for each execution, ensuring verifiable results
- •Complete Audit Logs: Record all inputs, outputs, and intermediate states for compliance
- •Data Isolation: Sandbox destroyed immediately after execution, no data residue
Security Penetration Testing
AI Agents are increasingly used for automated security testing. These tasks require executing potentially dangerous code and need diverse IP addresses to simulate realistic attack traffic.
Challenges
- •How to safely execute test scripts that may contain malicious code?
- •How to obtain diverse IP addresses to test target system defenses?
- •How to ensure the test environment doesn't affect production systems?
NodeRun Solution
- •Complete Isolation: Test code runs in fully isolated sandboxes, unable to affect host systems
- •11K+ Residential IPs: Globally distributed residential IP network simulating real user access
- •Ephemeral Environments: Environment automatically destroyed after each test, leaving no traces
Content Collection & Operations
Content operations teams use AI Agents to automate content collection, processing, and publishing. These tasks typically involve large volumes of network requests and data processing.
Challenges
- •How to avoid IP bans from target websites due to frequent access?
- •How to handle large concurrent collection tasks?
- •How to control collection task costs?
NodeRun Solution
- •IP Rotation: Automatic rotation through different residential IPs to avoid bans
- •Elastic Scaling: Auto-scale based on task volume for high concurrency
- •Pay-per-Run: Only pay for actually executed tasks, predictable costs
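The IP rotation behavior described above can be sketched in a few lines. This is an illustrative toy, assuming a static pool; in practice, exit-IP assignment would be handled by the NodeRun scheduler, not by client code.

```python
from itertools import cycle

# Hypothetical pool of residential exit IPs (documentation-range addresses);
# real assignments would come from the scheduler, not a static list.
IP_POOL = ["203.0.113.10", "198.51.100.24", "192.0.2.77"]

def rotating_sessions(task_urls):
    """Pair each outbound request with the next IP in the rotation."""
    ips = cycle(IP_POOL)
    return [(url, next(ips)) for url in task_urls]

assignments = rotating_sessions([
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
])
# The fourth request wraps around to the first IP again.
```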
Trusted AI Inference
In high-stakes decision scenarios (healthcare, legal, finance), AI inference processes must be verifiable. Users need confidence that AI conclusions are based on correct data and logic.
Challenges
- •How to prove AI computation processes haven't been tampered with?
- •How to enable third parties to independently verify AI inference results?
- •How to prove computation correctness without exposing raw data?
NodeRun Solution
- •Proof-Strong Attestation: TEE-based trusted execution environment providing hardware-level execution proofs
- •Reproducible Execution: For deterministic tasks, anyone can re-execute and verify results
- •Zero-Knowledge Proofs (Roadmap): Prove computation correctness without exposing raw data
Data Processing Pipelines
Large-scale data processing tasks typically need to be split into multiple subtasks for parallel execution. AI Agents can intelligently orchestrate these tasks but need a reliable execution environment.
Challenges
- •How to efficiently execute thousands of data processing tasks in parallel?
- •How to handle partial task failures?
- •How to control costs for large-scale parallel tasks?
NodeRun Solution
- •Auto-Parallel: Intelligently distribute tasks to global nodes, maximizing parallelism
- •Fault Tolerance: Single task failures don't affect others, with automatic retry support
- •Cost Optimization: Select optimal nodes based on task characteristics, balancing cost and performance
| Metric | Traditional Approach | NodeRun |
|---|---|---|
| Parallelism | Limited by single-machine resources | Global 11K+ nodes |
| Fault Impact | May cause entire pipeline failure | Only affects single task |
| Cost Model | Pay for server time | Pay for actual execution |
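The fan-out-with-retry pattern behind the fault-tolerance row above can be sketched as follows. The `run_task` body is a stand-in for a real NodeRun execution call; the retry loop shows how a single failed chunk is retried without affecting the rest of the pipeline.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_task(chunk):
    # Stand-in for a NodeRun execution call; here we just transform data.
    if chunk is None:
        raise ValueError("bad chunk")
    return sum(chunk)

def fan_out(chunks, max_retries=2, workers=8):
    """Dispatch chunks in parallel; retry failed chunks individually so one
    failure never takes down the whole pipeline."""
    results, pending = {}, {i: c for i, c in enumerate(chunks)}
    for _attempt in range(max_retries + 1):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = {pool.submit(run_task, c): i for i, c in pending.items()}
            for fut in as_completed(futures):
                i = futures[fut]
                try:
                    results[i] = fut.result()
                    del pending[i]
                except Exception:
                    pass  # stays in `pending` for the next attempt
    return results, sorted(pending)

results, failed = fan_out([[1, 2], [3, 4], None])
# → results == {0: 3, 1: 7}; chunk 2 is reported failed, the rest succeed.
```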
Current Execution Approaches: Technical Trade-offs Analyzed
To build a better execution layer, we must first understand the specific trade-offs in current mainstream approaches. These aren't poorly designed—they're rational choices under specific constraints and goals. But these choices also determine why they can't serve as general-purpose, scalable Agent execution infrastructure.
Model-Embedded Execution: Google Gemini Code Interpreter
- •Implementation: Bundling a code interpreter as a built-in tool delivers "works out of the box" UX, but at the cost of ecosystem lock-in and opaque pricing.
- •Trade-offs: The execution environment is a "black box" tightly coupled to the model. Costs are bundled with expensive token usage—for high-frequency tasks, this hidden cost adds up fast.
Platform-Outsourced Cloud Sandboxes: Manus + E2B
- •Implementation: General-purpose Agent platforms integrate third-party cloud sandboxes for execution capabilities.
- •Trade-offs: Double cost structure. Users pay platform subscription fees while indirectly bearing the platform's high cloud sandbox costs—poor unit economics for end users.
Interaction Layer Optimization: Anthropic's "Code-as-Tool" Pattern
- •Implementation: Having Agents "write code" instead of "call tools directly" cleverly reduces token consumption.
- •Trade-offs: This solves "how to invoke" but sidesteps the core question of "where to execute"—leaving execution complexity to developers.
NodeRun Architecture: Purpose-Built for AI Agent Execution
NodeRun's design philosophy is "back to fundamentals." We don't build a sprawling general-purpose platform—we focus on doing one thing exceptionally well: providing AI Agents with an ultra-efficient, low-cost, trustworthy execution layer. The entire architecture is built around this singular goal.
NodeRun Architecture Overview
MCP-First Entry Point: The NodeRun Gateway
NodeRun's entry point is an ultra-lightweight gateway bridging Agents to the backend execution network. We've chosen MCP (Model Context Protocol) as the primary, native integration protocol—any MCP-compliant Agent can seamlessly add NodeRun as a standard "execution tool" with zero integration cost.
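To make the MCP integration concrete, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the shape MCP defines. The tool name `noderun_execute` and its argument schema are assumptions for illustration; the actual NodeRun tool contract may differ.

```python
import json

def mcp_tool_call(code, runtime="python3.11", timeout_s=30, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request as defined by MCP.
    Tool name and argument fields below are hypothetical."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "noderun_execute",      # hypothetical tool name
            "arguments": {
                "code": code,
                "runtime": runtime,
                "timeout_s": timeout_s,
                "network_enabled": False,   # sandboxes are offline by default
            },
        },
    })

payload = json.loads(mcp_tool_call("print(2 + 2)"))
```

An MCP-compliant Agent would send this payload to the NodeRun gateway exactly as it would to any other MCP tool server.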
Distributed Scheduling System
This is NodeRun's "brain." The control plane (gateway, scheduling, policies) is centralized to ensure SLA guarantees and rapid iteration; the execution plane is distributed to capture structural cost advantages and global coverage.
Execution Sandbox: Docker-Based Cross-Platform Isolation
Sandbox Technology Choice: NodeRun's cross-platform execution foundation is Docker. Standardized container technology ensures high consistency between development and production environments. On this foundation, we've built a multi-layer security model:
- •Default Secure Mode (GA): All tasks run in rootless Docker containers with strict seccomp, cgroups, and namespace configurations—limiting process privileges, resource usage, and system visibility for solid baseline isolation.
- •Optional Hardened Mode (Roadmap): For scenarios requiring stronger isolation guarantees, NodeRun plans to offer pluggable hardening engines. On Linux nodes, gVisor can provide stronger kernel-level isolation; for tasks requiring full virtualization, MicroVM (e.g., Firecracker) may be supported as a higher-cost, higher-isolation option.
Extreme Lightweight Design: NodeRun uses lightweight container sandboxes with layered image caching and Warm Pool mechanisms. For common runtimes (Python/Node) in warm-start scenarios, typical sandbox startup is in the hundreds of milliseconds (target: 100-500ms); for cold starts, pre-fetching and local image caching keep overhead to sub-second or low single-digit seconds.
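The Warm Pool mechanism can be sketched as a queue of pre-created sandboxes with a cold-start fallback. This is a minimal illustration of the scheduling idea, not the production implementation; sandbox IDs and refill policy are invented for the example.

```python
import collections
import itertools

class WarmPool:
    """Toy warm pool: hand out pre-created sandbox IDs when available,
    fall back to a (slower) cold start otherwise."""
    def __init__(self, prewarm=2):
        self._ids = itertools.count()
        self._warm = collections.deque(self._create() for _ in range(prewarm))

    def _create(self):
        return f"sbx-{next(self._ids)}"

    def acquire(self):
        if self._warm:
            return self._warm.popleft(), "warm"   # target: 100-500 ms path
        return self._create(), "cold"             # sub-second to a few seconds

    def refill(self):
        # Sandboxes are never reused; the pool is topped up with fresh ones.
        self._warm.append(self._create())

pool = WarmPool(prewarm=2)
a = pool.acquire()   # served from the warm pool
b = pool.acquire()   # served from the warm pool
c = pool.acquire()   # pool empty: cold start
```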
Stateless Execution Model: Isolation & Determinism
NodeRun strictly adheres to stateless and run-to-completion execution models. Every execution runs in complete isolation, ensuring deterministic and auditable results. For closed-form compute tasks (fixed inputs, no external state dependencies), reproducible execution is available as an opt-in feature supporting third-party re-computation and verification.
Each task runs in a freshly created, just-in-time sandbox instance. Upon completion, the sandbox and all its state are immediately and completely destroyed. This provides a solid foundation for code behavior reproducibility and simplifies system design.
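The create-run-destroy lifecycle described above can be sketched with a `try/finally` so destruction is unconditional. The in-memory registry stands in for real container management and is an assumption of this sketch.

```python
import uuid

SANDBOXES = {}  # stand-in for the node's live-sandbox registry

def create_sandbox():
    sid = uuid.uuid4().hex
    SANDBOXES[sid] = {"state": "clean"}   # every run starts from a clean slate
    return sid

def destroy_sandbox(sid):
    SANDBOXES.pop(sid, None)              # all state dies with the sandbox

def run_to_completion(task):
    """Each task gets a freshly created sandbox; destruction happens in
    `finally`, so no state can leak into the next run even on failure."""
    sid = create_sandbox()
    try:
        return task(SANDBOXES[sid])
    finally:
        destroy_sandbox(sid)

out = run_to_completion(lambda sbx: sbx["state"])
# After the call returns, no sandbox survives in the registry.
```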
Execution Proofs: Building Auditability & Trust
This is what fundamentally differentiates NodeRun from traditional cloud execution services. NodeRun doesn't claim that any single execution is "mathematically provable"—instead, we build auditability through layered engineering and economic mechanisms, progressively increasing overall execution trustworthiness.
NodeRun's proof system operates at two levels:
Execution Proof System
Proof-Lite: Integrity & Reproducibility Foundation
This lightweight proof is generated by default for all NodeRun executions. Its core goal is ensuring execution integrity and auditability, while providing the foundation for third-party re-computation. It contains these key fields:
- •`input_hash`: Combined hash of code, parameters, and resource limits.
- •`output_hash`: Combined hash of `stdout`/`stderr` and all output artifacts.
- •`runtime_image_hash`: Container image digest used for execution—ensuring absolute environment consistency.
- •`dependency_lock_hash`: Hash of dependency lock files (`poetry.lock`, `package-lock.json`)—ensuring precise third-party library version reproducibility.
- •`sandbox_policy_hash`: Hash of the sandbox policies applied (network, file permissions, syscall restrictions).
- •`node_id` and `timestamp`: The executing node's identity and the time of execution.
- •`signature`: The executing node's digital signature over all the above fields.
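A minimal sketch of assembling such a record is shown below. It uses SHA-256 for the hash fields and an HMAC as a stand-in for the node's signature; real nodes would sign with asymmetric keys, and the exact field serialization here is an assumption, not the shipped schema.

```python
import hashlib
import hmac
import json
import time

NODE_KEY = b"demo-node-secret"  # stand-in; real nodes use asymmetric key pairs

def h(*parts: bytes) -> str:
    """SHA-256 over the concatenation of the given byte strings."""
    digest = hashlib.sha256()
    for p in parts:
        digest.update(p)
    return digest.hexdigest()

def proof_lite(code, params, stdout, image_digest, lockfile, policy, node_id):
    record = {
        "input_hash": h(code.encode(), json.dumps(params, sort_keys=True).encode()),
        "output_hash": h(stdout.encode()),
        "runtime_image_hash": image_digest,
        "dependency_lock_hash": h(lockfile.encode()),
        "sandbox_policy_hash": h(json.dumps(policy, sort_keys=True).encode()),
        "node_id": node_id,
        "timestamp": int(time.time()),
    }
    # Sign a canonical serialization of all fields above.
    canonical = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(NODE_KEY, canonical, hashlib.sha256).hexdigest()
    return record

p = proof_lite("print(2+2)", {}, "4\n", "sha256:abc123",
               "lockfile-contents", {"network": False}, "node-42")
```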
Replay Bundle: Optional Reproducible Execution
For closed-form compute tasks (fixed inputs, no external state dependencies—math calculations, code execution, data transformations, etc.), NodeRun offers an optional Replay Bundle feature. This self-contained package includes everything needed to reproduce execution:
- •Original code and input parameters.
- •Exact container image reference (`runtime_image_hash`).
- •Complete dependency lock files.
- •Sandbox policy configuration.
- •The original execution's `Proof-Lite` record.
Third-party auditors can download this Bundle, re-execute the task in their own environment using standard Docker commands, and compare the newly generated `output_hash` against the original record to independently verify reproducibility.
Note: For open-form tasks (tasks depending on external state—web scraping, API calls, real-time data queries, etc.), inputs change over time, making reproducible Replay Bundles impossible. However, Proof-Lite attestations and complete audit logs are still provided.
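The verification step itself is a straightforward hash comparison, sketched below. The auditor re-runs the bundle locally (e.g. `docker run` against the pinned image), then checks the recomputed output hash against the `output_hash` field of the original Proof-Lite record; the record shape here is an assumption for illustration.

```python
import hashlib

def verify_replay(original_proof: dict, replay_stdout: str) -> bool:
    """Compare the hash of locally reproduced output against the
    `output_hash` recorded in the original Proof-Lite record."""
    recomputed = hashlib.sha256(replay_stdout.encode()).hexdigest()
    return recomputed == original_proof["output_hash"]

# Illustrative record containing only the field this check needs.
proof = {"output_hash": hashlib.sha256(b"4\n").hexdigest()}

ok = verify_replay(proof, "4\n")    # faithful reproduction
bad = verify_replay(proof, "5\n")   # divergence detected
```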
Node Signing Key Management
The trustworthiness of Proof-Lite signatures depends on robust node key management. NodeRun implements strict lifecycle management:
- •Registration: Nodes generate key pairs when joining the network and register public keys with the centralized control plane.
- •Rotation: Keys rotate on policy-defined schedules; old keys expire after transition periods.
- •Revocation (CRL): Nodes detected cheating or going offline have their keys added to the Certificate Revocation List (CRL)—their signatures are no longer trusted.
- •Reputation Isolation: Node reputation scores are bound to their keys. Low-reputation nodes are restricted to low-value tasks, and their signatures carry reduced "trust weight."
Proof-Strong: Economic Mechanisms for Enhanced Trust
We recognize that Proof-Lite alone can't fully prevent malicious nodes. For high-value or high-risk tasks, NodeRun's roadmap includes the Proof-Strong mechanism.
It achieves probabilistic trustworthiness through sampled redundant execution (N-of-M) and arbitration: a configurable fraction of regular tasks is randomly sampled and dispatched to multiple nodes for redundant execution. When the nodes' `output_hash` values disagree, broader re-computation and arbitration are automatically triggered, with economic penalties (e.g., stake slashing) for malicious nodes.
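The N-of-M agreement check can be sketched as a majority vote over the redundant nodes' output hashes. This is a minimal illustration of the mechanism, assuming a simple quorum rule; the shipped protocol, quorum sizes, and penalty handling may differ.

```python
from collections import Counter

def arbitrate(output_hashes, quorum=2):
    """Accept if at least `quorum` redundant executions agree on one hash;
    otherwise escalate to broader re-computation and arbitration.
    Returns (status, accepted_hash_or_None, dissenting_node_indices)."""
    tally = Counter(output_hashes)
    winner, votes = tally.most_common(1)[0]
    dissenters = [i for i, out in enumerate(output_hashes) if out != winner]
    if votes >= quorum:
        # Majority agrees; dissenting nodes face re-computation and
        # potential economic penalties (e.g., stake slashing).
        return "accept", winner, dissenters
    return "arbitrate", None, dissenters

status, accepted, flagged = arbitrate(["abc", "abc", "xyz"])
# → ("accept", "abc", [2]): node 2 is flagged for penalty review.
```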
Public Web Interaction Layer: Hard Boundaries
NodeRun's distributed architecture provides unique advantages as an efficient, robust public web interaction layer. However, distributed node networks also face potential abuse risks: malicious users might attempt DDoS attacks, pollute node IP reputations, or use nodes as proxies to access illegal sites.
We enforce these hard boundary rules for compliance and security:
- •Default Network Isolation with Explicit Opt-in: All sandboxes are offline by default. Network access is only enabled through a transparent proxy when explicitly declared in the task definition (`network_enabled: true`).
- •Dynamic Reputation Firewall: NodeRun includes a lightweight firewall that validates only the destination domains and IP addresses of outbound requests. Business content is not collected by default—only the minimal metadata needed for billing and abuse prevention.
- •Multi-Source Threat Intelligence & Blocklists: The system uses a pluggable, multi-source threat intelligence architecture with blocklist mechanisms to intercept requests to known botnets, phishing sites, or illegal content.
- •No Persistent Identity Credentials (GA): Sandbox environments are designed to be "memoryless." All cookies and tokens are physically wiped on sandbox destruction. GA explicitly does not support account hosting or persistent login state.
- •Distributed Anti-Abuse Rate Limiting: The scheduling system maintains global-view access counters. Abnormally high-frequency requests to a single target automatically trigger circuit breakers.
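The egress gate implied by these rules can be sketched as a short decision chain: opt-in networking, blocklist lookup, then a global per-target counter acting as a circuit breaker. Hostnames, limits, and the verdict strings are invented for illustration.

```python
from collections import Counter

BLOCKLIST = {"malware.example", "phish.example"}  # fed by threat-intel feeds
RATE_LIMIT = 3                                    # illustrative per-target cap
_counters = Counter()                             # global-view access counters

def check_outbound(task: dict, host: str) -> str:
    """Decide whether an outbound request from a sandbox may proceed."""
    if not task.get("network_enabled", False):
        return "deny: network not enabled for this task"
    if host in BLOCKLIST:
        return "deny: destination on blocklist"
    _counters[host] += 1
    if _counters[host] > RATE_LIMIT:
        return "deny: circuit breaker tripped"
    return "allow"

task = {"network_enabled": True}
verdicts = [check_outbound(task, "api.example.com") for _ in range(5)]
# The first three requests pass, then the circuit breaker trips.
```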
Unit Economics: Technical Analysis of Order-of-Magnitude Cost Advantages
NodeRun's unit economics must consider both compute costs and non-compute costs.
Order-of-Magnitude Compute Cost Advantages
Through distributed supply, Warm Pools with intelligent scheduling, layered image and dependency caching, and a stateless short-lifecycle design, NodeRun fundamentally reshapes the compute cost curve: per-execution compute cost drops from the "minute-level VM rental" of traditional cloud sandboxes to "second-level container runtime plus distributed network scheduling costs"—an order-of-magnitude advantage.
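The arithmetic behind sub-second, resource-based billing is simple to illustrate. The rates below are made-up numbers for the example, not NodeRun's actual pricing; the point is that a burst task is billed for its actual 0.8 s of runtime rather than a 60 s VM minimum.

```python
def per_run_cost(duration_s, cpu_cores, mem_gb,
                 core_s_rate=0.00002, gb_s_rate=0.000004):
    """Resource-based billing sketch: cost scales with CPU-seconds and
    GB-seconds actually consumed. Rates are illustrative only."""
    return duration_s * (cpu_cores * core_s_rate + mem_gb * gb_s_rate)

# A 0.8 s run on 1 core with 0.5 GB, vs. a 60 s minimum rental of the same
# resources: the stateless model bills only what was actually consumed.
burst = per_run_cost(0.8, cpu_cores=1, mem_gb=0.5)
vm_minimum = per_run_cost(60.0, cpu_cores=1, mem_gb=0.5)
```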
Non-Compute Cost Considerations
We also recognize that a commercially viable service must account for non-compute costs. These are offered as premium, billable capabilities:
- •Network & IP Costs: For tasks requiring public web interaction, network egress and usage of specific geographic locations or high-quality IP profiles are significant cost components.
- •Abuse & Risk Control Costs: Maintaining a healthy, compliant network requires ongoing investment in abuse monitoring and risk control.
- •High-Trust Level Costs: Tasks requiring `Proof-Strong` involve redundant execution, re-computation, and arbitration—additional compute and scheduling overhead.
- •Failure Retry Costs: In distributed networks, automatic retries of failed tasks (to maintain SLA) also incur additional cost.
Execution-as-a-Service: Product Contract & Roadmap
NodeRun delivers not just technology, but a clear, trustworthy "Execution-as-a-Service" product contract. We define our service through standard APIs and quantifiable metrics.
Service Contract
| Service Dimension | Contract Details |
|---|---|
| Quotas & Limits | Clear QPS, concurrency limits, and domain-specific rate limits based on subscription tier. |
| Service Level Objectives (SLO) | We're committed to high-availability, low-latency execution. Specific SLAs will be progressively defined based on product maturity and user feedback, published at commercial launch. |
| Audit & Trust | All executions include Proof-Lite by default for integrity and reproducibility auditing. Proof-Strong available as an optional capability for enterprise or high-risk tasks. |
Evolution Roadmap
NodeRun's development path strictly follows its core value proposition, avoiding the heavy-asset "cloud computer" narrative.
- •Phase A (Current): Stateless Execution: Focus on high-frequency, one-shot, stateless execution capabilities—NodeRun's foundational value.
- •Phase B: Public Web Interaction: Building on Phase A, develop compliant, robust public web interaction capabilities with clear cost models.
- •Phase C: Enhanced Trust (Proof-Strong): Progressively launch and refine the economically backed `Proof-Strong` mechanism for high-stakes scenarios.
- •Phase D (Optional Branch): Long-Running Tasks / VM Mode: We recognize market demand for long-running, stateful tasks, but this is a high-cost branch product line that won't enter NodeRun's mainline narrative or pricing structure.
Product Roadmap
Conclusion: Building Critical Infrastructure for Next-Gen AI Applications
NodeRun isn't an incremental improvement on existing cloud compute models—it's a paradigm shift targeting AI Agent execution requirements. By combining distributed compute, lightweight sandbox technology, and cryptographic proofs, we've built an execution layer that architecturally satisfies all five requirements: reproducibility, statelessness, auditability, low cost, and easy integration.