Automated Software Research

Software that
researches itself

Give ASR Labs any codebase with a measurable metric. It runs structured AI research cycles overnight — analysing literature, designing targeted changes, and rigorously evaluating results. You wake up to better software. First application: combinatorial optimization solvers.

~100
Research Cycles / Night
AI Research Agents
40,000
Lines of Rust (Testbed)
0%
Vendor Lock-in

Software improves once,
then stagnates

Every performance-critical codebase follows the same arc: build, tune, ship, forget. Improvement requires scarce experts, and their knowledge leaves when they do.

The Status Quo

Human-driven optimization

Expert engineers hand-tune algorithms. Months of work. Good results — until the expert leaves, the codebase fossilizes, and the literature moves on without you.

$500K+/year to maintain
The Alternative

Off-the-shelf frameworks

Generic solutions get you 80% of the way. Performance plateaus. You can't tune what you don't own. The last 20% is where the competitive advantage lives.

5–50% gap to what's possible
Automated Software Research

What if your codebase could research its own improvements?

ASR Labs treats software improvement as a research problem. Given a codebase and a metric, AI agents run structured research cycles — reading literature, designing changes, testing rigorously, archiving what they learn. Continuous improvement without continuous headcount.

→ Research-grade improvement, zero expert headcount
First testbed: gap to best-known solutions on VRPTW benchmarks (lower is better)
Timefold: 10–25% gap, days
OR-Tools: 5–15% gap, days
Hexaly: 3–8% gap, hours
ASR Labs: ~1% gap, hours → ∞
Custom (HGS): <1% gap, months

Two loops, one outcome:
software that researches itself

Karpathy's autoresearch lets AI agents iterate on ML training code overnight. We generalise the pattern: any codebase with a measurable metric can be continuously improved through structured research cycles.

01

Genesis

Problem → Baseline in Hours

Describe your problem in structured YAML. ASR Labs classifies it, selects proven components from a catalog, assembles a working codebase, verifies correctness, and establishes a baseline metric.

The output is a project you own. Not an API call. Not an opaque runtime. Source code.

YAML PRD → classify → select components → assemble → verify → baseline
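The page says the input is a structured YAML problem description. A hypothetical sketch of what such a PRD might look like — every field name here is an illustration, not ASR Labs' actual schema:

```yaml
# Hypothetical problem description (field names are illustrative, not a real schema)
problem:
  class: vrptw                 # vehicle routing with time windows
  metric: total_travel_time    # the measurable objective to minimise
constraints:
  - capacity
  - time_windows
  - skills
fleet:
  vehicles: 12
  shift_hours: 8
benchmark:
  suite: solomon-100           # evaluate against a known benchmark suite
```

From a description like this, the pipeline would classify the problem class, pick matching components from the catalog, and assemble a baseline project.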
02

Research

Structured Research Cycle

Not random mutation — a structured research cycle. AI agents study the literature and production data to identify a goal, design a targeted change informed by domain knowledge and past reflections from the archive, implement and test it, benchmark with statistical rigor, then evaluate what worked and what didn't, feeding the insights back into the archive.

~100 research cycles per night. ~$10–50 in API cost. Each cycle is smarter than the last because the archive accumulates.

analyse → design → experiment → evaluate → approve/reject → archive → repeat

40,000 lines of battle-tested Rust

The first codebase under ASR is a production VRP solver that already competes with state-of-the-art academic implementations. The platform is generic — this is where we prove it works.

25
Search Operators
9 destroy, 5 repair, 5 intra-route, 6 inter-route local search
7
Constraint Types
Capacity, time windows, skills, sequence, breaks — enum-dispatched, zero overhead
31
Property Tests
Delta-vs-recompute invariants for every operator via proptest
O(1)
TW Feasibility
Savelsbergh timing arrays — the hardest piece to get right
6
Benchmark Suites
TSP, TSPTW, CVRP, CVRPTW, Skill-VRP, Multi-Shift with BKS tracking
3
Solver Strategies
ILS for single-vehicle, full ALNS for multi, decomposition for 3000+ jobs
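The O(1) time-window check above is credited to Savelsbergh-style timing arrays. A simplified sketch of the core idea — a single backward pass precomputes, for each stop, the latest feasible service start, so any proposed start time can be tested in constant time. All data and names below are invented for illustration; the real solver's structures are not shown here.

```rust
// Per-stop data: time window [ready, due], service duration, and travel
// time to the next stop. All i64 — no floats in hot paths.
struct Stop { ready: i64, due: i64, service: i64, travel_next: i64 }

/// Backward pass: latest[i] is the latest time service can start at stop i
/// without violating any time window later in the route.
/// latest[i] = min(due[i], latest[i+1] - service[i] - travel_next[i])
fn latest_starts(route: &[Stop]) -> Vec<i64> {
    assert!(!route.is_empty());
    let n = route.len();
    let mut latest = vec![0i64; n];
    latest[n - 1] = route[n - 1].due;
    for i in (0..n - 1).rev() {
        let slack = latest[i + 1] - route[i].service - route[i].travel_next;
        latest[i] = route[i].due.min(slack);
    }
    latest
}

/// O(1) check once the timing array exists: no forward re-simulation needed.
/// Arriving early is fine (the vehicle waits), so only [ready, latest] matters.
fn start_feasible(route: &[Stop], latest: &[i64], i: usize, t: i64) -> bool {
    t >= route[i].ready && t <= latest[i]
}

fn main() {
    let route = vec![
        Stop { ready: 0, due: 100, service: 10, travel_next: 5 },
        Stop { ready: 20, due: 40, service: 10, travel_next: 5 },
        Stop { ready: 0, due: 60, service: 10, travel_next: 0 },
    ];
    let latest = latest_starts(&route);
    println!("latest feasible starts: {latest:?}");
    assert!(start_feasible(&route, &latest, 0, 25));
    assert!(!start_feasible(&route, &latest, 0, 26));
}
```

The array costs O(n) to build once per route change; after that, every insertion or shift candidate is a constant-time comparison instead of a forward simulation.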
Architecture
Dependency rule: inner layers never import outer layers
api/
HTTP layer — Axum handlers, DTOs, distance matrix integration, Swagger
imports: all layers
metaheuristic/
ALNS engine — search loop, simulated annealing, adaptive weights, parallel execution
imports: core, operators
operators/
Search operators — 9 destroy, 5 repair, 5 intra-route LS, 6 inter-route LS
imports: core only
core/
Domain model — Problem, Route, Solution, Objective, 7 constraints
imports: nothing
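The dependency rule can be sketched as plain Rust modules — core imports nothing, operators import core only, the metaheuristic sits above both. The module contents below are invented stand-ins, not the solver's real types:

```rust
// Sketch of the layering rule. Names mirror the layers on the page;
// the bodies are illustrative only.
mod core_model {
    // "core/": domain model, imports nothing
    pub struct Solution { pub cost: i64 }
}
mod operators {
    // "operators/": imports core only
    use super::core_model::Solution;
    pub fn improve(s: &mut Solution) { s.cost -= 1; } // stand-in local search
}
mod metaheuristic {
    // "metaheuristic/": imports core and operators, never api
    use super::{core_model::Solution, operators};
    pub fn search(mut s: Solution) -> Solution {
        for _ in 0..10 { operators::improve(&mut s); }
        s
    }
}
// "api/" would sit on top, importing all layers.
fn main() {
    let s = metaheuristic::search(core_model::Solution { cost: 100 });
    println!("{}", s.cost); // prints 90
}
```

Because the compiler enforces the import direction, a change to the HTTP layer can never reach into the domain model's invariants.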
Performance Conventions
i64
Integer arithmetic: no f64 in hot paths — distances, times, and loads are all integers
enum
Enum dispatch: no Box<dyn Trait> in operator loops — zero virtual-dispatch overhead
0
Zero allocation: pre-allocate and reuse buffers — no malloc in inner loops
Δ
Delta evaluation: incremental cost updates — every delta verified by a property test
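Three of these conventions can be shown in one small sketch: an enum-dispatched operator (no trait objects) computes an i64 cost delta, and the delta is verified against a full recompute — the same invariant the property tests check. The operator, matrix, and route below are invented for illustration:

```rust
/// Distance lookup on an integer matrix (i64 everywhere, no f64).
fn dist(m: &[Vec<i64>], a: usize, b: usize) -> i64 { m[a][b] }

/// Full O(n) recompute of a route's cost: the ground truth.
fn route_cost(m: &[Vec<i64>], route: &[usize]) -> i64 {
    route.windows(2).map(|w| dist(m, w[0], w[1])).sum()
}

/// Enum dispatch: the match below compiles to a direct branch, not a vtable call.
enum Operator {
    /// Remove the customer at `pos` (not an endpoint).
    RemoveAt { pos: usize },
}

impl Operator {
    /// O(1) delta: cost change if the operator were applied.
    fn delta(&self, m: &[Vec<i64>], route: &[usize]) -> i64 {
        match self {
            Operator::RemoveAt { pos } => {
                let pos = *pos;
                let (p, c, n) = (route[pos - 1], route[pos], route[pos + 1]);
                dist(m, p, n) - dist(m, p, c) - dist(m, c, n)
            }
        }
    }
    fn apply(&self, route: &mut Vec<usize>) {
        match self {
            Operator::RemoveAt { pos } => { route.remove(*pos); }
        }
    }
}

fn main() {
    let m = vec![
        vec![0, 4, 7, 3],
        vec![4, 0, 2, 6],
        vec![7, 2, 0, 5],
        vec![3, 6, 5, 0],
    ];
    let mut route = vec![0usize, 1, 2, 3];
    let op = Operator::RemoveAt { pos: 2 };
    let before = route_cost(&m, &route);
    let delta = op.delta(&m, &route);
    op.apply(&mut route);
    // The property-test invariant: delta must equal the recompute difference.
    assert_eq!(route_cost(&m, &route) - before, delta);
}
```

In the real solver this invariant is exercised for every operator over randomized inputs via proptest; the sketch shows the check for a single hand-picked case.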

Field service scheduling

The first domain where ASR proves its value. Field service optimization is the ideal testbed: measurable metrics, rich constraints, and real production data.

Your problem

Technician scheduling — 20–80 jobs per day with variable service times

Skills matching — technicians have certifications, jobs need specific qualifications

Break compliance — lunch breaks, shift limits, labor regulations

Wide time windows — "morning" or "afternoon" slots, not exact times

Heterogeneous fleet — different vehicle types, home starts, equipment

What you get

Custom solver tuned to your exact constraint set — not a generic framework

Overnight research cycles on your actual customer data — goal-driven improvements, not random mutations

Source code ownership — no vendor lock-in, no API dependency, full IP

Per-archetype tuning — dense urban, rural spread, emergency replan each get their own profile

No OR headcount — the system replaces the need for scarce optimization engineers

Your software gets better
every single night

Let's run a proof of concept on your codebase.