Automated Software Research

Software that
researches itself

Give ASR Labs any codebase with a measurable metric. It runs structured AI research cycles overnight — analysing literature, designing targeted changes, and rigorously evaluating results. You wake up to better software. First application: combinatorial optimization solvers.

~100
Research Cycles / Night
AI Research Agents
40,000
Lines of Rust (Testbed)
0%
Vendor Lock-in

Software improves once,
then stagnates

Every performance-critical codebase follows the same arc: build, tune, ship, forget. Improvement requires scarce experts, and their knowledge leaves when they do.

The Status Quo

Human-driven optimization

Expert engineers hand-tune algorithms. Months of work. Good results — until the expert leaves, the codebase fossilizes, and the literature moves on without you.

$500K+/year to maintain
The Alternative

Off-the-shelf frameworks

Generic solutions get you 80% of the way. Performance plateaus. You can't tune what you don't own. The last 20% is where the competitive advantage lives.

5–50% gap to what's possible
Automated Software Research

What if your codebase could research its own improvements?

ASR Labs treats software improvement as a research problem. Given a codebase and a metric, AI agents run structured research cycles — reading literature, designing changes, testing rigorously, archiving what they learn. Continuous improvement without continuous headcount.

→ Research-grade improvement, zero expert headcount
First testbed: gap to best-known solutions on VRPTW benchmarks (lower is better)
Timefold: 10–25% gap, days
OR-Tools: 5–15% gap, days
Hexaly: 3–8% gap, hours
ASR Labs: ~1% gap, hours → ∞
Custom (HGS): <1% gap, months

Two loops, one outcome:
software that researches itself

Karpathy's autoresearch lets AI agents iterate on ML training code overnight. We generalise the pattern: any codebase with a measurable metric can be continuously improved through structured research cycles.

01

Genesis

Problem → Baseline in Hours

Describe your problem in structured YAML. ASR Labs classifies it, selects proven components from a catalog, assembles a working codebase, verifies correctness, and establishes a baseline metric.

The output is a project you own. Not an API call. Not an opaque runtime. Source code.

YAML PRD → classify → select components → assemble → verify → baseline
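The page says the input is a structured YAML problem description. A hypothetical sketch of what such a PRD might look like — every field name here is an illustration, not ASR Labs' actual schema:

```yaml
# Hypothetical problem description (field names are illustrative, not a real schema)
problem:
  class: vrptw                 # vehicle routing with time windows
  metric: total_travel_time    # the measurable objective to minimise
constraints:
  - capacity
  - time_windows
  - skills
fleet:
  vehicles: 12
  shift_hours: 8
benchmark:
  suite: solomon-100           # evaluate against a known benchmark suite
```

From a description like this, the pipeline would classify the problem class, pick matching components from the catalog, and assemble a baseline project.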
02

Research

Structured Research Cycle

Not random mutation — a structured research cycle. AI agents study the literature and production data to identify a goal, design a targeted change informed by domain knowledge and past reflections from the archive, implement and test it, benchmark with statistical rigor, then evaluate what worked and what didn't, feeding the insights back into the archive.

~100 research cycles per night. ~$10–50 in API cost. Each cycle is smarter than the last because the archive accumulates.

analyse → design → experiment → evaluate → approve/reject → archive → repeat

40,000 lines of battle-tested Rust

The first codebase under ASR is a production VRP solver that already competes with state-of-the-art academic implementations. The platform is generic — this is where we prove it works.

25
Search Operators
9 destroy, 5 repair, 5 intra-route, 6 inter-route local search
7
Constraint Types
Capacity, time windows, skills, sequence, breaks — enum-dispatched, zero overhead
31
Property Tests
Delta-vs-recompute invariants for every operator via proptest
O(1)
TW Feasibility
Savelsbergh timing arrays — the hardest piece to get right
6
Benchmark Suites
TSP, TSPTW, CVRP, CVRPTW, Skill-VRP, Multi-Shift with BKS tracking
3
Solver Strategies
ILS for single-vehicle, full ALNS for multi, decomposition for 3000+ jobs
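The O(1) time-window check above is credited to Savelsbergh-style timing arrays. A simplified sketch of the core idea — a single backward pass precomputes, for each stop, the latest feasible service start, so any proposed start time can be tested in constant time. All data and names below are invented for illustration; the real solver's structures are not shown here.

```rust
// Per-stop data: time window [ready, due], service duration, and travel
// time to the next stop. All i64 — no floats in hot paths.
struct Stop { ready: i64, due: i64, service: i64, travel_next: i64 }

/// Backward pass: latest[i] is the latest time service can start at stop i
/// without violating any time window later in the route.
/// latest[i] = min(due[i], latest[i+1] - service[i] - travel_next[i])
fn latest_starts(route: &[Stop]) -> Vec<i64> {
    assert!(!route.is_empty());
    let n = route.len();
    let mut latest = vec![0i64; n];
    latest[n - 1] = route[n - 1].due;
    for i in (0..n - 1).rev() {
        let slack = latest[i + 1] - route[i].service - route[i].travel_next;
        latest[i] = route[i].due.min(slack);
    }
    latest
}

/// O(1) check once the timing array exists: no forward re-simulation needed.
/// Arriving early is fine (the vehicle waits), so only [ready, latest] matters.
fn start_feasible(route: &[Stop], latest: &[i64], i: usize, t: i64) -> bool {
    t >= route[i].ready && t <= latest[i]
}

fn main() {
    let route = vec![
        Stop { ready: 0, due: 100, service: 10, travel_next: 5 },
        Stop { ready: 20, due: 40, service: 10, travel_next: 5 },
        Stop { ready: 0, due: 60, service: 10, travel_next: 0 },
    ];
    let latest = latest_starts(&route);
    println!("latest feasible starts: {latest:?}");
    assert!(start_feasible(&route, &latest, 0, 25));
    assert!(!start_feasible(&route, &latest, 0, 26));
}
```

The array costs O(n) to build once per route change; after that, every insertion or shift candidate is a constant-time comparison instead of a forward simulation.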
Architecture
Dependency rule: inner layers never import outer layers
api/
HTTP layer — Axum handlers, DTOs, distance matrix integration, Swagger
imports: all layers
metaheuristic/
ALNS engine — search loop, simulated annealing, adaptive weights, parallel execution
imports: core, operators
operators/
Search operators — 9 destroy, 5 repair, 5 intra-route LS, 6 inter-route LS
imports: core only
core/
Domain model — Problem, Route, Solution, Objective, 7 constraints
imports: nothing
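The dependency rule can be sketched as plain Rust modules — core imports nothing, operators import core only, the metaheuristic sits above both. The module contents below are invented stand-ins, not the solver's real types:

```rust
// Sketch of the layering rule. Names mirror the layers on the page;
// the bodies are illustrative only.
mod core_model {
    // "core/": domain model, imports nothing
    pub struct Solution { pub cost: i64 }
}
mod operators {
    // "operators/": imports core only
    use super::core_model::Solution;
    pub fn improve(s: &mut Solution) { s.cost -= 1; } // stand-in local search
}
mod metaheuristic {
    // "metaheuristic/": imports core and operators, never api
    use super::{core_model::Solution, operators};
    pub fn search(mut s: Solution) -> Solution {
        for _ in 0..10 { operators::improve(&mut s); }
        s
    }
}
// "api/" would sit on top, importing all layers.
fn main() {
    let s = metaheuristic::search(core_model::Solution { cost: 100 });
    println!("{}", s.cost); // prints 90
}
```

Because the compiler enforces the import direction, a change to the HTTP layer can never reach into the domain model's invariants.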
Performance Conventions
i64
Integer arithmetic: no f64 in hot paths — distances, times, and loads are all integers
enum
Enum dispatch: no Box<dyn Trait> in operator loops — zero virtual-dispatch overhead
0
Zero allocation: pre-allocate and reuse buffers — no malloc in inner loops
Δ
Delta evaluation: incremental cost updates — every delta verified by a property test
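Three of these conventions can be shown in one small sketch: an enum-dispatched operator (no trait objects) computes an i64 cost delta, and the delta is verified against a full recompute — the same invariant the property tests check. The operator, matrix, and route below are invented for illustration:

```rust
/// Distance lookup on an integer matrix (i64 everywhere, no f64).
fn dist(m: &[Vec<i64>], a: usize, b: usize) -> i64 { m[a][b] }

/// Full O(n) recompute of a route's cost: the ground truth.
fn route_cost(m: &[Vec<i64>], route: &[usize]) -> i64 {
    route.windows(2).map(|w| dist(m, w[0], w[1])).sum()
}

/// Enum dispatch: the match below compiles to a direct branch, not a vtable call.
enum Operator {
    /// Remove the customer at `pos` (not an endpoint).
    RemoveAt { pos: usize },
}

impl Operator {
    /// O(1) delta: cost change if the operator were applied.
    fn delta(&self, m: &[Vec<i64>], route: &[usize]) -> i64 {
        match self {
            Operator::RemoveAt { pos } => {
                let pos = *pos;
                let (p, c, n) = (route[pos - 1], route[pos], route[pos + 1]);
                dist(m, p, n) - dist(m, p, c) - dist(m, c, n)
            }
        }
    }
    fn apply(&self, route: &mut Vec<usize>) {
        match self {
            Operator::RemoveAt { pos } => { route.remove(*pos); }
        }
    }
}

fn main() {
    let m = vec![
        vec![0, 4, 7, 3],
        vec![4, 0, 2, 6],
        vec![7, 2, 0, 5],
        vec![3, 6, 5, 0],
    ];
    let mut route = vec![0usize, 1, 2, 3];
    let op = Operator::RemoveAt { pos: 2 };
    let before = route_cost(&m, &route);
    let delta = op.delta(&m, &route);
    op.apply(&mut route);
    // The property-test invariant: delta must equal the recompute difference.
    assert_eq!(route_cost(&m, &route) - before, delta);
}
```

In the real solver this invariant is exercised for every operator over randomized inputs via proptest; the sketch shows the check for a single hand-picked case.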

Field service scheduling

The first domain where ASR proves its value. Field service optimization is the ideal testbed: measurable metrics, rich constraints, and real production data.

Your problem

Technician scheduling — 20–80 jobs per day with variable service times

Skills matching — technicians have certifications, jobs need specific qualifications

Break compliance — lunch breaks, shift limits, labor regulations

Wide time windows — "morning" or "afternoon" slots, not exact times

Heterogeneous fleet — different vehicle types, home starts, equipment

What you get

Custom solver tuned to your exact constraint set — not a generic framework

Overnight research cycles on your actual customer data — goal-driven improvements, not random mutations

Source code ownership — no vendor lock-in, no API dependency, full IP

Per-archetype tuning — dense urban, rural spread, emergency replan each get their own profile

No OR headcount — the system replaces the need for scarce optimization engineers

Your software gets better
every single night

Let's run a proof of concept on your codebase.