
AI Coding Revolution | NextBigFuture.com

By Brian Wang


Super VC Marc Andreessen talks with Blake Masters and Amjad Masad, CEO and co-founder of Replit, a cloud-based coding platform. They talk about the transformative role of AI in democratizing programming, technical breakthroughs in agentic AI, historical parallels in computing evolution, debates on AGI timelines, economic implications, and Masad's personal journey from Jordan to Silicon Valley.

AI is magical but there are limitations and AI coding is at the vanguard of progress.

There is a paradoxical sentiment toward AI: it's the "most amazing technology ever," achieving feats unimaginable 5-10 years ago, yet users feel it's not moving fast enough and is on the verge of stalling.

Masad attributes this to mismatched expectations -- AI operates at person speed, not instantaneous computer speed.

He vividly compares watching an AI agent code to observing John Carmack (legendary programmer behind Doom and id Software) on stimulants: hyper-productive but deliberate, with pauses for reflection, tool use (web searches for compatibility issues), and self-verification.

This sets the stage for Replit's mission: making software creation as intuitive as describing an idea.

Last year AI crossed the 3-5 minute coherent-reasoning threshold. Replit made a bet that long-horizon reasoning time would keep increasing and that they could use this to solve more complex programming problems with AI. Long-context reasoning lets you roll out trajectories: a trajectory is a step-by-step reasoning chain that arrives at a solution.
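A trajectory can be pictured as a recorded chain of (thought, action, observation) steps that ends in a checkable outcome. A minimal sketch, with the step contents and the toy success check invented purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class Step:
    thought: str       # the agent's internal reasoning
    action: str        # the tool call it chose
    observation: str   # what the environment returned

def rollout() -> list[Step]:
    """Toy trajectory: each step conditions on everything before it."""
    return [
        Step("App needs persistence", "provision_postgres", "db ready"),
        Step("Schema missing users table", "run_migration", "migration ok"),
        Step("Verify signup flow", "run_tests", "3 passed"),
    ]

traj = rollout()
solved = traj[-1].observation == "3 passed"  # the binary outcome RL can reward
```

The point of recording the whole chain, not just the final answer, is that RL can later credit every step that led to a verified success.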

Replit's Vision is to take us from Accidental Complexity to English as the Programming Language.

Masad recounts pitching Replit ~7-10 years ago with a vision of universal software creation, inspired by Fred Brooks' distinction between "essential complexity" (core business logic) and accidental complexity (setup drudgery like environments and packages).

Replit abstracted the latter over nearly a decade, supporting any language via robust infrastructure. The breakthrough came last year: code syntax itself was the remaining barrier.

Now, users -- novices or Excel macro tinkerers -- input plain English prompts ("I want to sell crepes online") or a paragraph-long startup idea.

The AI agent parses it, auto-selects the optimal stack (Python/Streamlit for data viz, JS/Postgres/Stripe for e-commerce), and builds iteratively.

For non-experts, the experience is seamless: no dev setup nonsense. Prompts can be casual ("I want to sell crepes") or specified (e.g., "in Python for school").
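The prompt-to-stack routing described above can be sketched as a simple dispatch; the keyword heuristics here are purely illustrative stand-ins (a real agent would use an LLM to classify the intent, not string matching):

```python
def pick_stack(prompt: str) -> dict:
    """Illustrative prompt-to-stack routing; real agents classify with an LLM."""
    p = prompt.lower()
    if any(w in p for w in ("sell", "store", "payments")):
        # e-commerce intent -> JS/Postgres/Stripe per the article's example
        return {"runtime": "Node.js", "db": "Postgres", "payments": "Stripe"}
    if any(w in p for w in ("chart", "dashboard", "data viz")):
        # data-viz intent -> Python/Streamlit per the article's example
        return {"runtime": "Python", "ui": "Streamlit"}
    return {"runtime": "Python", "ui": "Flask"}  # generic fallback (assumed default)

stack = pick_stack("I want to sell crepes online")
```

The user never sees this decision unless they ask; they can also override it explicitly ("in Python for school").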

It supports major languages like Japanese, leveraging AI's multilingual prowess.

Masad ties this to Grace Hopper's 1950s compiler invention, which aimed to replace machine code with English-like programming (COBOL). Higher-level languages (Python, JS) were steps forward, but AI completes the arc: typing "thoughts" instead of syntax.

Resistance persists -- assembly coders scorned BASIC kids in the 1970s; JS purists hated React (which Masad helped build at Facebook). Now, veterans decry AI as "sloppy." Yet abstractions democratize, echoing Masad's JS revolution.

The Agentic Workflow: Building, Testing, and Deploying Apps

Once prompted, the agent builds a shared understanding via a task list ("Set up Postgres DB with migrations"; "integrate Stripe for payments").

Users choose: iterate on UI design or full build (20-40 minutes). The agent executes autonomously -- writing SQL, provisioning resources, testing in a spun-up browser, and iterating on failures -- then notifies: "App ready; test on phone." Bugs? Describe in English; it fixes.

Publish with two clicks: cloud VM, production DB deployed.

What took days (local env, AWS signup, CI/CD pipelines) now takes minutes, empowering kids or laypeople.

Replit's IDE heritage shines: Inspect files, Git diffs, push to GitHub, or open in VS Code/Emacs.

A key shift: The agent is now the programmer, using tools like file edits, package installs, and DB/object storage provisioning -- mirroring humans but bot-like.
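The tool surface the agent calls -- file edits, package installs, resource provisioning -- can be sketched as a dispatch table. All names and the in-memory state are hypothetical:

```python
files: dict = {}
packages: list = []
resources: list = []

def edit_file(path: str, content: str) -> str:
    files[path] = content
    return f"wrote {path}"

def install_package(name: str) -> str:
    packages.append(name)
    return f"installed {name}"

def provision(resource: str) -> str:
    resources.append(resource)
    return f"{resource} provisioned"

TOOLS = {"edit_file": edit_file, "install_package": install_package,
         "provision": provision}

def dispatch(call: dict) -> str:
    """Route a model-emitted tool call to the matching handler."""
    return TOOLS[call["tool"]](*call["args"])

out = dispatch({"tool": "provision", "args": ["postgres"]})
```

The "agent as programmer" shift is exactly this: the model emits structured calls into such a table instead of a human typing the commands.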

Anecdote: Post-launch, Asian latency worsened because U.S.-hosted AIs became the remote worker, routing requests across oceans.

Technical Deep Dive: Coherence, RL, and the Verification Loop

The holy grail is long-horizon reasoning: agents maintaining coherence over extended runs (5-200+ minutes) without derailing into errors, rabbit holes, or derangement (hallucinating in Chinese).

Metric: Real-user success (paid publishes signaling economic value), not just benchmarks.

Enablers: LLMs' context window (up to ~200K tokens effectively, despite 1M claims) with compression (summarizing logs/DB setups).
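The compression step mentioned here can be sketched as summarizing older turns once a token budget is exceeded; the 4-characters-per-token estimate and the truncation-based "summarizer" are crude illustrative stand-ins for what would really be an LLM call:

```python
def approx_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic: ~4 characters per token

def compress(history: list[str], budget: int, keep: int = 2) -> list[str]:
    """If over budget, collapse all but the last `keep` turns into one summary line."""
    if sum(approx_tokens(h) for h in history) <= budget:
        return history
    old, recent = history[:-keep], history[-keep:]
    summary = f"summary({len(old)} turns): {old[0][:30]}..."  # LLM call in reality
    return [summary] + recent

history = ["postgres setup logs " * 40, "package install logs " * 40,
           "recent test failure", "current task: fix signup bug"]
trimmed = compress(history, budget=100)
```

Keeping the most recent turns verbatim while summarizing the rest is one common pattern for staying inside the effective context window.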

Internal "self-talk" drives the reasoning: "Need a DB? Use the Postgres tool," then read the tool's feedback.

Core breakthrough: Reinforcement Learning (RL) from code execution.

Pre-training (predict next word) lacks reasoning.

RL rolls out "trajectories" (step-by-step chains) in environments like Replit, rewarding successful solutions (e.g., bug fixes verified by GitHub PRs/unit tests).
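The reward signal from code execution can be sketched as: run the project's tests against the agent's patch and reward only a fully green run. The result dictionaries here are toy stand-ins for a real test harness:

```python
def reward(result: dict) -> float:
    """Binary execution reward: 1.0 only if every test passes."""
    passed, total = result["tests_passed"], result["tests_total"]
    return 1.0 if total > 0 and passed == total else 0.0

# Toy rollouts: only the verified-green trajectory gets reinforced.
rollouts = [
    {"tests_passed": 5, "tests_total": 5},  # all tests pass
    {"tests_passed": 4, "tests_total": 5},  # one failure -> no reward
]
rewards = [reward(r) for r in rollouts]
```

This is what makes coding such a strong RL domain: the verifier is cheap, automatic, and binary.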

This extends chains, per nonprofit METR's benchmark (doubling coherent minutes every ~7 months -- faster in practice).

Verification loops amplify: Nvidia's 2025 paper showed verifiers enabling 20-min runs for optimized GPU kernels.

Replit's multi-agent relay: Agent A builds (20 min); Agent B tests (browser/computer use), flags bugs, and compresses a summary to seed Agent C's new trajectory. The relay can continue indefinitely; coherence holds at 2-3 hours.

Speed: Faster than humans, with visible diffs/reflections -- fascinating to watch.

Evolution from "Stochastic Parrots" to Verifiable Reasoning

Early LLMs were "stochastic parrots": fluent at sonnets and conversation but failing basics (counting the three R's in "strawberry," simple math).

Critique: Mirroring inputs without logic.

RL + verifiers (AlphaGo-style: neural gen + discrete tree search) unlocks reasoning in verifiable domains -- where truth is binary (true/false, compiles/outputs correctly).

Coding surges: SWE-Bench scores jumped from 5% (early 2024) to 82% (Claude 3.5).

Near-saturation via GitHub corpora/synthetic data.

Human experts generate verified tasks for RL loops.

Softer fields (law/healthcare) remain too "squishy" for this: there is no way to auto-run a diagnosis and check it.

Concrete problems (math proofs in Lean, physics sims, protein folding, robotics outcomes) accelerate.

Transfer learning weak -- custom RL per domain.

Foundation firms hire experts for data; synthetic gen scales but finite.

By 2026, lay users could match senior Google engineers via multi-agents working in parallel ("Add social features to the storefront; refactor the DB").

Multimodal UIs (visuals/charts) for creative oversight.

Transfer, Bitter Lesson, and Functional Equivalents

Hype vs. reality:

U.S. economy bets on AGI (human-level generality), but no cross-domain transfer (code gains don't auto-boost bio).

The Bitter Lesson (scale compute/data indefinitely) is questioned -- Sutskever and Sutton point to human data exhaustion (the fossil-fuel analogy) and annotation dependency.

Humans themselves are poor at transfer (economists on fax vs. internet; Einstein on Stalinism).

AGI idealized as above-human at everything is a standard humans themselves don't meet.

AI excels at verifiable tasks (40-page econ syntheses rival PhD work) but stalls on controversy (COVID origins, WTC7) -- RLHF censors.

Great for steelmanning but not first-principles truth amid propaganda.

True AGI means efficient continual learning: drop an agent into an environment and it learns, say, driving within months.

Masad is bearish: good-enough economics (Replit thrives without AGI) traps the field in a local maximum, relieving pressure. RL is exciting but has been known for 10+ years.

Advice on empowerment: AI tools let kids bypass traditional gates.

The 2025 AI Index Report by Stanford HAI (2025): Annual benchmark shows compute doubling every 5 months, datasets every 8 -- faster than Moore's Law -- validating Masad's RL trajectory claims. Economic angle: $33.9B genAI investment (up 18.7%), projecting $4.4T productivity gains by 2030, but warns of $1.4T U.S. power infra spend. Pace: Verifiable domains (code/math) lead, with 2026 as "agent year" for 90% synthetic content.

The Projected Impact of Generative AI on Future Productivity Growth by Wharton Budget Model (Sep 8, 2025). Models AI's deficit reduction ($400B, 2026-2035) via coding automation, aligning with Replit's "layperson = senior engineer" thesis. Explores scaling: Inference costs drop 30x via distillation, but energy bottlenecks (gigawatt-scale training) cap growth. 2026 prediction: $2T AI services market, with coding agents capturing 15% via verifiable ROI.

Exponential agent autonomy will reach full-day runs by mid-2026, but transfer lags in soft domains. $100B AI software market by end-2025 (34.9% CAGR), with coding's verifiability driving 60% of returns from 3% of investments, per VC data.
