Yak Robotics

A Protocol for Verified Physical-World Task Execution
Technical Overview — yakroboticsgarage.com
Problem Statement

Digital service procurement is structured: an API call carries a machine-readable specification, execution is metered, delivery is verified algorithmically, and settlement occurs in milliseconds. Infrastructure for this exists at scale (Stripe, AWS, MCP).

Physical service procurement is unstructured: a phone call carries a verbal scope, executor selection is relationship-based, delivery is verified by human inspection, and settlement follows a 30-60 day invoice cycle. No equivalent infrastructure exists.

The structural gap is not in any single step. It is in the absence of machine-readable task specification, algorithmic executor matching, automated completion verification, and programmatic settlement for physical-world work. Each of these components exists in isolation in adjacent domains. They have not been integrated into a protocol for physical tasks.

Converging Technical Conditions
1. Autonomous data collection is production-deployed.
DroneDeploy operates docked drones on 100+ construction projects as of October 2025 (Horizons 2025 announcement). Komatsu's SMARTCONSTRUCTION platform is deployed at 35,000 sites in Japan (Bentley Systems press release, November 2025). These are operational deployments, not pilots.
2. Design models are becoming contractual deliverables.
Japan's MLIT i-Construction program achieved 86% ICT adoption on government earthwork projects within 6 years of mandate (2016-2022). Norway's Statens vegvesen V770 handbook established model-based delivery as standard practice, producing 50%+ change order reduction across pilot projects.
3. Regulatory specifications encode tolerances as numerical values.
MDOT Standard Specifications for Construction (2020 ed.), Section 205.03.N defines subgrade tolerance as ±1 inch with subbase. Section 302.03.A defines aggregate base tolerance as ±½ inch. These are published regulatory values, deterministic and machine-parseable. Verified against source documents.
4. The specialist workforce is in structural demographic decline.
Average US Professional Land Surveyor age is 57-60 years (ASCE, NSPS). 51% of engineering firms report turning down work due to staffing shortages (ACEC Q4 2024). The PLS licensing pipeline requires a minimum of 8 years (NCEES). Equivalent shortages are documented in Japan, South Korea, Australia, and Europe.
5. AI agents are acquiring structured tool-use capabilities.
Anthropic's Model Context Protocol (released 2024) has produced thousands of community tool servers enabling structured AI-to-service interaction. Agent-mediated procurement of physical services does not exist today. The infrastructure for it is being built.
Preconditions for Physical-World Task Liquidity

A physical-world task becomes algorithmically procurable when four conditions are simultaneously met:

Structured specification
The task is describable in machine-readable terms sufficient to determine completion criteria.
MDOT Section 205.03.N: tolerance ±0.0833 ft
Qualified executor discovery
Credentials, equipment, and availability are verifiable against authoritative databases.
FAA Part 107 + state PLS license + insurance COI
Machine-verifiable completion
An algorithm can determine whether the task was completed within specification without human inspection.
RMSE against checkpoints; surface subtraction
Programmatic settlement
Payment can be triggered by verified completion without manual invoice processing.
Escrow release on verification pass
Construction survey on DOT highway projects satisfies all four conditions with currently available technology. Other domains satisfy subsets. The number of domains satisfying all four is increasing as sensor accuracy improves and regulatory standards are digitized.
How Robotic Execution Changes Transaction Structure

The distinction between human and robotic task execution is not primarily about labor cost or speed. It is about the information properties of the execution, which determine whether the task can be procured algorithmically.

Human executor
Execution varies with skill, judgment, and fatigue. Quality assessment requires a separate human reviewer. Trust is relational, built through repeated engagement over years. Availability is opaque and requires direct inquiry.
Robotic executor
Execution follows a programmed mission with instrument-level variance. Quality is computed from the data itself (RMSE, checkpoint validation). Trust is systemic: output is verified algorithmically. Availability is a queryable system state.
A platform that dispatches human operators and manages logistics improves efficiency. A platform that dispatches robots and verifies their output algorithmically changes the procurement structure from relationship-based to verification-based. The trust model is different, not just the cost model.
Sample Protocol Stack
ProtocolRuntime procurement and verified execution of physical-world tasks
Task classEnd-to-end procurement for physical-world ground truth measurement
IndustryConstruction survey, inspection, material delivery, environmental compliance
Product jobContinuous verification that digital designs are based on accurate world models and as-built reporting matches physical developments
OfferingCertified data provision with integrated deviation analysis
Service DomainMDOT highway construction — selected for verification loop tightness
Each layer narrows the one above it without contradicting it. The protocol is domain-agnostic. The initial domain was selected because it has the most fully structured specification-to-verification loop: numerical tolerances, arithmetic comparison, and machine-computable proof.
Service Domain Case Study: Michigan Department of Transportation Highway Construction

The initial domain was selected by evaluating which physical-world task class has the tightest specification-to-verification loop. MDOT highway construction survey scores highest because:

The specifications are published regulatory documents. MDOT Standard Specifications for Construction (2020 ed.) is publicly available and defines acceptance criteria as numerical values with section references.

The tolerances are deterministic. Subgrade tolerance is ±0.75 inches without subbase (Section 205.03.N). Aggregate base tolerance is ±0.5 inches (Section 302.03.A). These are not subjective assessments. They are arithmetic thresholds verified against source documents.

The verification operation is subtraction. As-built surface elevation minus design surface elevation at each measurement point. The result is a deviation value compared against the tolerance threshold. This is TIN interpolation followed by arithmetic, implemented with standard computational geometry libraries.

The quality proof is a computed metric. Root mean square error against checkpoints, per ASPRS Positional Accuracy Standards (2015), is a machine-computable number that does not require human interpretation.

Other construction domains involve less structured verification loops. Bridge inspection requires subjective defect severity classification. Environmental compliance involves regulatory interpretation. Material delivery involves logistics coordination. Ground truth measurement on DOT projects is arithmetic against published numbers.
Protocol Components and Development Status (v1.4.1)

The protocol is implemented as seven modules. All are functional alpha. The verification engine is the domain-specific component that varies by vertical; all others are domain-agnostic.

Coordination Layer: MCP-native interface (FastMCP over Streamable HTTP). Reverse auction engine with 11-state task lifecycle, 4-factor weighted bid scoring, geographic filtering, and busy-state tracking. Deployed on Fly.io. Alpha, functional.

Task Registry: 18 task categories with structured capability requirements (ASPRS accuracy classes, CRS EPSG codes, sensor requirements, deliverable formats). RFP-to-structured-spec conversion via LLM with keyword fallback. Alpha, functional.

Executor Registry: ERC-8004 robot identity on Base (mainnet + Sepolia). MCP endpoints, sensor capabilities, and geographic metadata on-chain/IPFS. Discovery via The Graph. EAS attestation. 1 production + 100 demo registrations. Alpha, deployed on mainnet.

Credential Verification: Six compliance document types. SAM.gov Exclusions API (live) for debarment checks. Bond verification against Treasury Circular 570 (live, 500+ surety companies). PLS license and insurance: schema-complete, stub implementations. SAM.gov and bond live; PLS and insurance stubs.

Verification Engine: Four-level QA framework (schema validation, standards compliance, PLS review) with eight category-specific delivery schemas. Geospatial surface comparison, specification database, and compliance report generation are the next development phase. QA framework functional; geospatial verification not implemented.

Settlement Layer: Stripe PaymentIntents (authorize/capture), Connect Express for operator payouts, ACH bank transfer. USDC on Base via EIP-3009 (gasless). Four-mode settlement abstraction designed. Mode 1 operational; modes 2-4 designed.

Robot Federation: MCPRobotAdapter bridges protocol to individual robot MCP servers. Nine category simulator servers deployed on Fly.io. Dynamic tool resolution for heterogeneous robot capabilities. One physical robot (Tumbller) connected. Simulators operational; no production survey robots.

System Architecture
Vertical application layers (domain-specific, per vertical)
Construction Survey | Bridge Inspection | Environmental | Mining
Each adds: specification database + verification function + compliance report template
Protocol Layer (Alpha, Functional)
Task Registry • Executor Registry (ERC-8004) • Coordination (MCP tools) • Settlement (Stripe + USDC) • Credentials (SAM.gov, Treasury Circular 570)
Robot federation layer
Category MCP servers • MCPRobotAdapter • On-chain discovery via The Graph • EAS attestation
The protocol and federation layers are domain-agnostic. Adding a vertical application does not require changes to the protocol. The vertical layer is where the domain-specific verification logic, specification databases, and report formats reside.
Vertical Feasibility Assessment

Each domain is assessed against the four preconditions for task liquidity. Feasibility depends on whether all four are met with currently available technology.

DomainSpecification sourceVerification methodAssessment
Highway surveyState DOT tolerancesSurface subtraction, RMSEAll four preconditions met
Environmental monitoringEPA/EGLE permit thresholdsSensor reading vs. thresholdMet for quantifiable parameters
Solar/wind inspectionIEC standardsThermal anomaly detectionProduction-deployed (Zeitview)
Mining/quarryMSHA + operational specsVolumetric analysisSame sensor stack as construction
AgriculturalUSDA/NRCS standardsSpectral index comparisonMet for remote-sensing parameters
Bridge inspectionNBI/AASHTO element-levelPoint-to-model + CV classificationPartially met; defect severity requires human judgment
Each feasible vertical requires encoding its specification database and implementing its verification function as a module in the verification engine. The protocol layer is shared. The engineering cost per vertical is the specification encoding and comparison function, not protocol infrastructure.
Vertical Expansion and Market Context

Expansion path by specification encoding

The protocol expands by encoding additional specification databases. Each state DOT publishes its own standard specifications with its own tolerance values and section references. The schema is the same; the data content differs per state.

Michigan MDOT is the initial domain ($2.6B FY2026 highway program; survey at approximately 1% = ~$26M/yr). Adjacent state DOTs (Ohio, Indiana, Illinois, Wisconsin) represent approximately $150M/yr combined. National DOT survey at 1-2% of $120B/yr highway spending is $1.2-2.4B/yr.

Cross-domain expansion (bridge inspection, environmental, mining) follows the same pattern: encode the domain's specification database and implement the comparison function. The protocol does not change.

Market context

US construction rework attributable to poor or inaccessible project data is estimated at $14.3B/yr (FMI/PlanGrid 2018). This figure represents the total value pool that improved verification addresses, not the directly addressable market for the protocol.

The Robot-as-a-Service market is estimated at $28.5B (2024), growing at 17.9% CAGR to $76.6B by 2030. The protocol operates within this broader transition from owned equipment to service-based robotic execution.

Sources: MDOT FY2026 Five-Year Transportation Program. FMI/PlanGrid 2018. Survey at ~1% per primary interview (Virgil Klebba, April 2026), consistent with FHWA benchmarks.

Ecosystem Integration Surfaces

The protocol is designed to be source-agnostic: it accepts measurement data from any field instrument or platform and adds specification-aware verification. This positions it as a complementary layer within existing tool ecosystems rather than a replacement for any installed system.

PlatformRelationship to protocolIntegration surface
DroneDeployData sourceDocked drone and ground robot data as verification input. Vertical commercial products built on the protocol would be able to verify against regulatory requirements such as DOT specifications.
Propeller AeroData sourceAeroPoints and DirtMate surface data as verification input. Propeller provides visualization; protocol-based applications add specification-referenced compliance judgment.
Trimble / SITECHEquipment ecosystemTBC surface exports as verification input. Machine control as-built data as continuous monitoring source. SITECH dealer network as potential distribution channel.
Komatsu / EARTHBRAINPlatform integrationSMARTCONSTRUCTION dashboard data as verification input. Bentley partnership (Nov 2025) creates interoperability path.
Procore / AutodeskWorkflow integrationStructured compliance reports delivered to project management platforms. Deviation data populates issue tracking systems.
Robot operatorsExecutor supplyLicensed drone operators (FAA Part 107), ground robot operators, and survey firms serve as the qualified executor pool. Examples in Michigan: DroneView Technologies, Emmet Drones, Midwest Aero Technologies, Drone Brothers (350-pilot managed network).
Protocol Roadmap
v1.4.1 (current) — Protocol layer with simulated fleet
MCP-native protocol interface, auction engine, on-chain executor identity, multi-modal settlement, federated robot communication, credential verification. Nine category simulators. No domain verification engine. No production survey robots connected.
v1.5 — Domain verification engine and settlement abstraction
Construction survey verification engine: geospatial data ingestion, surface comparison, MDOT specification database, compliance report generation. Settlement modes 2-4 (batched, private). Task encryption (ECDH + AES-256-GCM). Sealed-bid auction scoring. First real operator integration.
Gate: v1.4 complete (met) and initial market validation conversations (in progress).
v2.0 — Multi-robot coordination and operator tooling
Compound survey workflows (aerial + ground + GPR in a single task). Operator-facing dashboard with demand visibility. Live credential API connections (PLS license boards, insurance verification). Weather-aware scheduling. Agent-initiated payment (Stripe MPP).
Gate: v1.5 complete and 5+ active operators on the network.
v2.5-3.0 — Vertical expansion and privacy infrastructure
Mining/quarry specification database. Infrastructure inspection (NBI bridge deliverables, 23 CFR 650). Privacy layer: ZK proofs for government contract compliance (SP1), Horizen L3 for private settlement (EU AMLR Article 79 constraints evaluated), BBS+ anonymous reputation credentials.
Gate: v2.0 complete, 15+ robots per metro area, Horizen L3 testnet evaluation complete.
Technical Challenges

Privacy and settlement

Task privacy: ECDH key exchange with AES-256-GCM encryption for RFP contents. On-chain memos carry only H(request_id || salt), never raw identifiers (design decision FD-4).

Settlement privacy: four-mode abstraction supports transparent and private settlement. Horizen L3 is under evaluation for Mode 2. EU AMLR Article 79 eliminates fully private chains from consideration, constraining the design space.

Reputation privacy: BBS+ anonymous credentials are planned for v3.0 to enable operator quality attestation without revealing individual transaction history.

Scale and onboarding

Operator supply density: the protocol requires sufficient qualified operators per geographic region. Michigan has 15+ identified drone operators doing construction work. Density in rural highway corridors is uncertain.

Credential API fragmentation: PLS licensing is administered state-by-state across 50 separate authorities with varying levels of digital accessibility. Insurance verification varies by carrier. No federated credential API exists. Each connection requires individual integration work.

Trust bootstrapping: engineering firms select subcontractors through qualifications-based selection (Brooks Act, 23 CFR 172). Platform-dispatched operators must establish trust through managed curation and demonstrated quality, not through marketplace ratings.

DOT acceptance: MDOT has not formally accepted drone-derived data for pay quantity determination. Acceptance is growing through the MTRI UAV program but is not yet standard practice in contract specifications.

Protocol Interface: Human and Agent Procurement

The protocol exposes the same MCP tool interface regardless of whether the buyer is a human or an AI agent. The data model, tool signatures, and lifecycle state machine are identical in both cases. The difference is in who initiates the tool calls and how much autonomy the buyer exercises at each decision point.

Current: human buyer via MCP

PM posts task via MCP tool
Protocol discovers qualified executors
PM reviews bids and quality scores
PM accepts bid
Robot executes mission
Verification engine produces compliance report
Settlement on PM approval

Future: agent buyer via MCP

Agent posts task via same MCP tool
Protocol discovers qualified executors
Agent evaluates quality scores programmatically
Agent presents candidates to PM for approval
PM approves; robot executes
Verification engine produces compliance report
Auto-settlement on verification pass
The transition from human to agent buyer does not require protocol changes. The structured data architecture serves both interfaces from day one. No rebuild is required when agents become procurement actors.
YakRobotics Garage

YakRobotics Garage is the development organization building the YakRover protocol. The project is open-source and available at github.com/YakRoboticsGarage/robot-marketplace.

The organization's focus is on the protocol layer for verified physical-world task execution, with construction survey as the initial domain application. The protocol is designed to be extended by third parties who build domain-specific verification layers for their own verticals.

The current development team has backgrounds in institutional governance system design (ZKsync Association), protocol research (Summer of Protocols, Ethereum Foundation), and materials science engineering (University of Pennsylvania). Domain access to Michigan heavy civil construction is through family connections with 40+ years of industry tenure.

The protocol demo is available at yakrobot.bid/demo.

Seeking Partners and Contributors

YakRobotics Garage is seeking organizations and individuals to collaborate on bringing agent-mediated procurement of physical-world tasks to production. The protocol is open-source and designed for extension by domain experts.

Agent runtime services: Organizations building AI agent infrastructure, MCP tool ecosystems, agent-to-agent communication protocols, or runtime procurement systems. The protocol's MCP-native interface is designed for interoperability with emerging agent frameworks. Collaboration areas include agent-mediated task discovery, programmatic credential verification, and structured settlement triggered by machine verification.

Autonomous robot development and dispatch: Teams developing autonomous drone platforms, ground survey robots, or robotic fleet management systems. The protocol's federated robot communication layer (MCP-to-MCP) is designed to integrate any robot that exposes a structured tool interface. Collaboration areas include real robot integration, autonomous mission planning, multi-robot coordination, and sensor payload standardization.

Vertical domain experts and service procurers: Engineering firms, DOT agencies, inspection companies, environmental consultants, mining operators, and agricultural service providers who procure physical-world verification tasks and can contribute domain-specific specification databases, verification criteria, and compliance report requirements. These partners define what "correct" means in their domain.

Researchers: Academics and applied researchers working on construction robotics, geospatial computation, computational geometry for surface comparison, autonomous navigation in unstructured environments, privacy-preserving settlement (ZK proofs, BBS+ credentials), or the economics of agent-mediated physical-world marketplaces.

Join us to bring agent-mediated procurement of physical-world tasks to production.
yakroboticsgarage.com
yakrobot.bid/demo
github.com/YakRoboticsGarage/robot-marketplace