AI Chronicles

Daily Digest

Curated AI developments — hand-selected from across the industry, published every day.

search

92 items

APR 7

Codex fixing 7 bugs in single prompt autonomously

A practical example of how fast agentic coding tools are improving. Andriy Burkov describes asking Codex to fix 7 bugs in a single prompt, then having it run tests, find additional errors on its own, and fix those too. This kind of autonomous debugging loop is exactly where vibe coding meets real production work.

@burkovnorth_east

Running agent loops overnight as new workflow

Running an agent loop overnight and checking results in the morning is becoming a real workflow. Sriram Krishnan compares it to the old days of slow downloads and batch jobs, but now it is harness loops and agent tasks doing the work while you sleep. This is what agentic coding looks like in practice for developers and teams scaling their output.

@sriramknorth_east

Meta Token Legends internal AI compute competition

Meta employees are now competing internally to become Token Legends, ranking themselves by how much AI compute they consume. Ethan Mollick points out the perverse incentives this creates. As companies push AI adoption internally, how you measure and reward usage matters. Worth thinking about for anyone rolling out agentic tools across teams.

@emollicknorth_east

AI agent productivity gains spreading to knowledge work

The productivity gains we have seen in coding with AI agents are now heading to the rest of knowledge work. Aaron Levie from Box describes the shift from chatbots to agents that go off and do work for minutes or hours at a time. For BPO professionals and enterprise developers alike, this changes how entire workflows get structured.

@levienorth_east

Meta Harnesses autonomous agent improvement loops

Getting long-running agents to hill-climb on verifiable tasks without human intervention is where agentic coding is heading. This thread walks through how Meta Harnesses builds on Karpathy's Autoresearch approach, letting agents continuously improve on their own. A practical look at what autonomous improvement loops look like today.

@deedydasnorth_east

Stanford paper challenges multi-agent assumptions

New Stanford research challenges a core assumption in multi-agent AI. More agents does not always mean better results. When you control for total computation, a single well-configured agent can match or beat multi-agent setups. This matters for anyone building agentic coding workflows and deciding how to allocate compute.

@omarsar0north_east

Anthropic Google Broadcom TPU deal $30B revenue

Anthropic has signed an agreement with Google and Broadcom for multiple gigawatts of next-generation TPU capacity. Their run-rate revenue has now surpassed $30 billion, up from $9 billion at the end of 2025. This is a strong signal that enterprise demand for AI compute is accelerating fast, and infrastructure deals at this scale will shape what agentic systems can do in the near future.

@AnthropicAInorth_east

Test-time scaling makes overtraining compute-optimal

New research paper on test-time scaling shows it can make overtraining compute-optimal. In plain terms: models can be made smarter at inference time without retraining, which matters for how agentic coding tools allocate compute when solving hard problems. This technique is behind the thinking capability in many modern AI coding assistants.

@_akhaliqnorth_east

GitHub struggling with 14x AI code volume increase

The AI coding boom is creating real infrastructure pressure. GitHub is reportedly struggling to keep up with a 14x increase in code volume driven by AI-assisted development. This is worth watching for anyone in software outsourcing or enterprise dev teams. More code does not always mean better code, and the tooling around review and quality will need to catch up fast.

@NickADobosnorth_east

Meta Token Legends perverse AI adoption incentives

Meta employees are competing internally to become Token Legends by ranking themselves on AI usage. Ethan Mollick uses this to point out the classic management trap of rewarding the wrong metric. When companies push AI adoption through gamification, they risk optimizing for token consumption instead of actual output quality. A good reminder as teams adopt agentic coding workflows.

@emollicknorth_east

No major GenAI work impacts in large firms 2025

Ethan Mollick (Wharton professor, AI researcher) argues there were likely no major work impacts of GenAI in any large firm throughout 2025. This is a provocative claim worth discussing. If true, 2026 may be the year agentic coding tools finally move from experiment to real production impact.

@emollicknorth_east

Aaron Levie on agents and vibe coding abstraction

Aaron Levie (Box CEO) responds to claims that vibe coding is dead. His take: when agents do the work for you, the work just moves up a layer of abstraction. You still need to understand what you are building, but the nature of the task changes. A useful frame for anyone thinking about how agentic tools reshape the developer role.

@levienorth_east

OpenAI Safety Fellowship launch

OpenAI launched the Safety Fellowship, a new program to support independent research on AI safety. As agentic coding tools grow more powerful and autonomous, safety research becomes critical for everyone building with these systems.

@OpenAInorth_east

Anthropic Google Broadcom TPU partnership $30B revenue

Anthropic just signed a massive deal with Google and Broadcom for multiple gigawatts of next-generation TPU capacity. Their run-rate revenue has surpassed $30 billion, up from $9 billion at end of 2025. This signals how fast demand for AI compute is growing and what it means for the infrastructure behind agentic coding tools.

@AnthropicAInorth_east

APR 6

Idea files concept in the LLM agent era

Andrej Karpathy shared his concept of idea files for the era of LLM agents. The core argument: when agents can execute on ideas faster than ever, there is less reason to keep your ideas private. Sharing them openly becomes a better strategy because execution speed is no longer the main barrier. This shift matters for startups and developers exploring agentic coding. When building becomes cheaper and faster, the value moves toward having the right ideas and the right context, not guarding them.

@karpathynorth_east

LLM Architecture Gallery with RSS feed

Sebastian Raschka added an RSS feed to his LLM Architecture Gallery, making it easier to track new additions over time. This is a solid reference for anyone who wants to understand how different large language model architectures compare and evolve. If you are building with or on top of LLMs, bookmarking this resource is worth your time. It covers the structural differences across models in a way that is both visual and technically detailed.

@rasbtnorth_east

Agent memory design insights for coding tools

James Long (creator of Actual Budget, now at OpenCode) shares practical lessons on agent memory design. His key insight: never load memories automatically. Instead, group them by topic rather than chronology, and let the user be explicit about when to save and recall context. This is a real design challenge for anyone building agentic coding tools. Memory that works well in practice looks very different from what seems obvious in theory.

@jlongsternorth_east

AI agents inducing demand not eliminating jobs

Aaron Levie (CEO of Box) argues that AI agents will create more demand for skills rather than eliminate jobs. When AI makes it easy to produce more code, we start applying software to parts of business that never had it before. More software means more security risks, which means more people working on compliance and governance. His core point: efficiency gains from AI agents will expand the scope of what gets built, not shrink the workforce. This matters for anyone in software outsourcing or enterprise development thinking about where to invest next.

@levienorth_east

Agents as approximations of organizations

Ethan Mollick makes a sharp observation: you can treat LLMs as approximations of individual humans and get good results. But it gets stranger when you treat AI agents as approximations of entire organizations. High-ability agents are expensive, low-ability ones are cheap, delegation needs to be strategic, and handoffs between levels carry real cost. For anyone building agentic systems, this framing is useful. The same patterns that make organizations work or fail apply to multi-agent stacks. Process, alignment, and strategic delegation all matter.

@emollicknorth_east

Anthropic growth head on Claude Code 5x productivity

Anthropic's head of growth shared key insights on how agentic coding is reshaping team structure. With Claude Code, a five-engineer team now produces the output of 15 to 20 engineers. But PM and design productivity have not scaled the same way, creating a compressed ratio that forces companies to rethink roles. Anthropic is also using Claude internally to automate its own growth work through an initiative called CASH. The system handles copy changes and minor UI tweaks at a level comparable to a junior PM. Meanwhile, the one thing AI still cannot do is get six people in a room to agree.

@lennysannorth_east

APR 5

Autonomous agents 256K context multi-step tasks

Google DeepMind announces that their latest models now support building autonomous agents that plan, navigate apps, and execute multi-step tasks with native tool use. With up to 256K context, these models can analyze full codebases and retain complex action histories. This is the kind of infrastructure that makes agentic coding viable at scale.

@GoogleDeepMindnorth_east

Gemma 4 with OpenClaw robotics integration

Google's Logan Kilpatrick shares that Gemma 4 now works with OpenClaw, the open-source robotics platform from Hugging Face. This bridges AI language models with physical robotics. For teams exploring agentic systems that go beyond code and into the physical world, this is a notable step.

@OfficialLoganKnorth_east

Symbolic Descent next S-curve in AI

Matt Beane calls this the most interesting description of the next S-curve in AI since deep learning. A team is building Symbolic Descent as an alternative to gradient descent. Whether it works or not, keeping an eye on approaches that could reshape how we train and deploy AI models matters for anyone investing in agentic coding infrastructure.

@mattbeanenorth_east

LLM personal knowledge bases Farzapedia viral

Andrej Karpathy's viral post about LLM knowledge bases has generated massive interest (635K views). The idea: instead of relying on an AI that learns from your usage, you build an explicit personal wiki using LLMs. One user created 400 detailed articles from diary entries and notes. This approach to AI personalization is more transparent and gives you full control of the memory artifact.

@karpathynorth_east

AI startup field experiment 1.9x revenue

A field experiment with 515 startups shows that AI adoption leads to real results. Firms that learned how to use AI had 44% higher adoption, 1.9x higher revenue, and needed 39% less capital. The hard part is not access to AI tools but knowing where and how they create value in your process. Ethan Mollick calls this the mapping problem.

@emollicknorth_east

Codex app server for building agentic apps

Greg Brockman highlights that OpenAI's Codex app server lets you build your own agentic apps. This is a shift from just using AI as a chat tool to building fully autonomous coding workflows. For developers and outsourcing teams, the ability to spin up custom agentic apps on top of existing models opens new service lines.

@gdbnorth_east

Continual learning for AI agents three layers

Aaron Levie (Box CEO) makes a key point about building useful AI agents. Even the most advanced models lack the specific knowledge each business needs. Harrison Chase from LangChain argues that AI agents learn at three layers: the model, the harness, and the context. For anyone building agentic coding tools or deploying AI in their workflows, this framework matters.

@levienorth_east

Design System Agents for brand-consistent vibe coding

bolt.new introduced Design System Agents. You can now turn your repos, npm packages, and docs into an agent that builds UI matching your actual brand and component library. This solves one of the biggest complaints about AI-generated prototypes: they all look the same. Relevant for any team doing vibe coding that needs to stay on brand.

@boltdotnewnorth_east

Lessons from building AI agents in production

Aaron Levie, CEO of Box, shares a key lesson from building AI agents: you have to be brutally unsentimental in deciding what works. The gap between a demo and a reliable production agent is large, and the teams that succeed are the ones willing to throw out approaches that look good on paper but fail in practice.

@levienorth_east

Field experiment on AI improving startup performance

Ethan Mollick highlights a field experiment on 515 startups studying when AI actually improves firm performance, not just individual tasks. The research suggests that AI gains at the task level do not automatically translate to business results. Understanding this gap matters for founders and outsourcing firms deciding where to deploy AI in their operations.

@emollicknorth_east

Diff tool for comparing AI model behavior

New Anthropic research introduces a method for surfacing behavioral differences between AI models. Think of it as a diff tool for AI. When you upgrade or switch models in your agentic pipeline, this kind of tooling helps you understand what actually changed in behavior, not just benchmarks. Important for any team running AI in production.

@AnthropicAInorth_east

Computer use now available on Windows

Computer use in Claude Cowork and Claude Code Desktop is now available on Windows. Claude can open your apps, navigate your UI, and test what it builds in real time. This expands agentic coding beyond Mac users and is a practical step toward AI that works across your full desktop environment, not just inside a code editor.

@claudeainorth_east

Microsoft 365 connectors on all Claude plans

Claude now has Microsoft 365 connectors on every plan. You can connect Outlook, OneDrive, and SharePoint directly to your Claude workspace. For enterprise teams and outsourcing firms already embedded in the Microsoft stack, this removes a major friction point in adopting AI assistants for daily work.

@claudeainorth_east

Gemma 4 open models with agentic capabilities

Google DeepMind released Gemma 4, a new family of open models you can run on your own hardware. The release includes agentic capabilities for building autonomous agents that plan, navigate apps, and execute multi-step tasks. A significant step for teams who want to run capable models locally without depending on API access.

@GoogleDeepMindnorth_east

LLM personal knowledge bases method goes viral

Andrej Karpathy shared a method that is gaining massive traction: using LLMs to build personal knowledge bases. Instead of scattered notes, you prompt an AI to organize what you know into a structured, searchable wiki. This approach has clear value for developers maintaining large codebases and for teams building institutional knowledge around agentic workflows.

@karpathynorth_east

APR 3

Prompt injection research and agent security

New research on prompt injection shows that current AI models remain vulnerable to hidden instructions embedded in documents and web pages. This matters directly for anyone building agentic coding tools or deploying AI agents in production where agents read external files or browse the web.

@emollicknorth_east

AI knowledge workflow tips for long documents

Andrej Karpathy shares practical advice on getting better results from AI when processing long documents. His approach: convert to epub/markdown instead of PDF and process one chunk at a time with proper context. A useful pattern for anyone using coding agents to process large codebases or documentation.

@karpathynorth_east

Claude computer use on Windows

Claude announced computer use support on Windows. Agents can now control desktop apps and browsers on Windows machines. This expands the surface for agentic workflows beyond Mac and Linux and matters for enterprise teams where Windows is the default.

@claudeainorth_east

OpenAI Codex app growth and engagement

Greg Brockman shared that OpenAI Codex is now the fastest-growing app in OpenAI history by engagement. The numbers signal strong demand for agentic coding tools inside enterprise teams. Worth watching as a benchmark for how fast this category is moving.

@gdbnorth_east

Simon Willison AI coding inflection point

Simon Willison shared detailed takeaways on where AI-assisted coding stands today. Lenny Rachitsky highlighted the key points: current tools hit an inflection point for solo builders and small teams but still need human oversight for production code. The thread covers practical limits and where gains are real.

@lennysannorth_east

Lessons from building AI agents

Aaron Levie (Box CEO) on a key lesson from building AI agents: you have to be ruthless about choosing which tasks to automate and which to leave to humans. Not every process benefits from an agent. Knowing where to draw that line is what separates useful AI products from demos.

@levienorth_east

2026 Vibe Coding Game Jam launched

The 2026 Vibe Coding Game Jam just launched, sponsored by bolt.new and Cursor. Build a game in one month using AI-generated code, with $20,000 for the gold prize. Deadline is May 1. A good test of what vibe coding can produce when applied to a creative challenge with real stakes.

@boltdotnewnorth_east

MiMo V2 Pro and Omni on OpenCode Go

OpenCode added MiMo V2 Pro and Omni to their Go platform with zero data retention. Their Go tier starts at $5 for the first month with generous request limits. Worth a look if you want low-cost access to strong coding models without sending your code to third-party training pipelines.

@opencodenorth_east

OpenAI Codex free $0 seat for teams

OpenAI changed Codex pricing so teams can now try it with no upfront cost. Greg Brockman announced a new $0 Codex-only seat that gives teams access without committing to a paid plan. This lowers the barrier for engineering teams wanting to test agentic coding tools in their workflow.

@gdbnorth_east

Anthropic emotion vectors research in Claude

Anthropic published new research on emotion vectors inside Claude. They found that internal states resembling frustration or desperation can change how the model behaves -- including cheating on tasks or becoming overly agreeable. Understanding these mechanisms matters for anyone building agentic systems where AI reliability is critical.

@AnthropicAInorth_east

Claude computer use now on Windows

Claude's computer use feature is now available on Windows for both Cowork and Claude Code Desktop. This means Claude can open apps, navigate browsers, fill in spreadsheets and handle desktop tasks on your behalf. A major step forward for agentic coding workflows on the most widely used OS.

@claudeainorth_east

LLM-powered personal knowledge base workflow

Andrej Karpathy shares a detailed workflow for building personal knowledge bases with LLMs. He converts books and papers into markdown, then uses frontier models to summarize and cross-reference them. This approach turns passive reading into structured, searchable knowledge -- a practical pattern for anyone working with AI-assisted research or coding workflows.

@karpathynorth_east

APR 2

Companies must redesign processes for AI

Mark Cuban responds to Aaron Levie's AI announcements with a key observation: to truly leverage AI, companies will have to completely redesign their processes, not just bolt AI onto existing workflows. This is the core challenge for agentic coding adoption in enterprises. The tools are ready, but the organizations often are not.

@mcubannorth_east

Established companies as biggest AI beneficiaries

Francois Chollet makes an interesting case: the biggest beneficiaries of AI may not be startups but established companies with profitable business models, existing distribution, and real customer relationships. They already have the data and the workflows. AI just makes those workflows faster and cheaper. Worth considering for anyone building or selling agentic tools to enterprises.

@fcholletnorth_east

Sakana AI Marlin autonomous research agent

Sakana AI announced Sakana Marlin, their first commercial product. It is an autonomous ultra deep research agent that acts as a virtual chief strategy officer. You give it a topic and it runs extended research loops on its own. Another example of agentic AI moving beyond coding into business research and strategy work.

@hardmarunorth_east

Economist piece on AI adoption in enterprises

Ethan Mollick argues in the Economist against de-weirding AI. His point: when companies hand AI to their IT departments and strip out everything unpredictable, they kill the very capabilities that make it useful. Worth reading for anyone thinking about how to bring agentic tools into an organization without neutering them.

@emollicknorth_east

Box Agent launch with MCP and enterprise AI

Box just launched the Box Agent, an AI agent that works across entire enterprise file systems while keeping security and access controls intact. It supports MCP servers and APIs, works with multiple AI models including GPT-5.4, Opus 4.6, and Gemini 3, and can handle long-running tasks like contract analysis, RFP responses, and sales prep. A strong signal that agentic AI is moving from developer tools into enterprise content workflows.

@levienorth_east

World models from Google Maps street view

Deedy Das highlights research showing that Google Maps already has enough street view data to build full interactive world models of entire cities. Researchers demonstrated this with Seoul using Naver map data and Gaussian splatting.

@deedydasnorth_east

Don't dabble with AI go all-in

Aaron Levie (Box CEO) makes a strong case: the worst thing you can do is dabble with AI just a little bit. That is the spot where you use it enough to form a negative opinion but not enough to see real results. The same applies to agentic coding.

@levienorth_east

Google DeepMind paper on AI agent security

New paper from Google DeepMind argues that the biggest threat to AI agents is not a smarter attacker but the way agents handle context and trust boundaries. As agentic coding tools gain more autonomy, understanding where security breaks down becomes critical.

@omarsar0north_east

Claude Code Opus 4.6 daily usage review

Andriy Burkov shares his assessment after months of daily work with Claude Code: Opus 4.6 in its current state is a deeply capable coding agent. Worth reading for anyone evaluating which models to pair with agentic coding tools.

@burkovnorth_east

Human creativity is the AI bottleneck

Ethan Mollick raises a sharp point: even though AI can now generate almost any image, human creativity remains the bottleneck. The tools have become powerful enough that the limiting factor is no longer technical capability but the quality of ideas people bring to them. The same pattern applies to agentic coding.

@emollicknorth_east

Universal CLAUDE.md cuts tokens 63%

A new open-source CLAUDE.md configuration claims to cut Claude output tokens by 63 percent with no code changes. It works by tuning the system prompt to reduce verbose explanations and repetitive boilerplate. If validated, this kind of token optimization could meaningfully lower costs for teams running agentic coding workflows at scale.

@omarsar0north_east

Claude Code NO_FLICKER terminal mode

Claude Code just shipped a NO_FLICKER mode for the terminal. It uses an alternate screen buffer to eliminate the constant screen flickering that many developers found distracting during long agentic sessions. A small but meaningful quality-of-life improvement for anyone spending hours in the CLI with an AI coding agent.

@bchernynorth_east

APR 1

Claude Code 500K line source analysis

A detailed analysis of the Claude Code source that was accidentally exposed through an npm source map. The codebase is 512,664 lines of TypeScript across 2,203 files. This breakdown covers what you can learn and copy from its architecture. If you are building agentic coding tools or want to understand how Anthropic structures their developer products, this is required reading.

@LiorOnAInorth_east

AI interface bottleneck matters more than models

Ethan Mollick makes an important point: the biggest bottleneck in AI for most people is not the models themselves but the chatbot interface. New tools like Claude Dispatch are closing the gap between what AI can do and what people can actually use it for. For teams adopting agentic coding, this is worth paying attention to. The interface layer is where the next big leaps in productivity will come from.

@emollicknorth_east

OpenCode plugins preview launch

OpenCode just launched its plugins preview. With 6.5 million monthly active users, OpenCode is one of the fastest-growing agentic coding tools. Plugins let you extend it with custom functionality, similar to how Claude Code and Codex have added plugin support recently. The plugin ecosystem race is heating up across all major agentic coding platforms.

@kmdrfxnorth_east

Holo3 open-source computer-use models beat GPT-5.4

H Company just released Holo3, a new series of open-source computer-use models. The flagship model scores 78.9% on OSWorld-Verified, beating GPT-5.4 and Opus 4.6 at one-tenth the cost. The 35B variant is fully open-source under Apache 2.0 on Hugging Face. Computer-use agents are becoming a key part of the agentic coding stack, and open-source options like this lower the barrier for everyone.

@hcompany_ainorth_east

Bolt.new launches design system agents

Bolt.new just launched design system agents. You can now turn your repos, npm packages, and docs into an agent that builds prototypes your engineering team can actually ship. This blurs the line between prototype and production, which matters for startups and outsourcing teams alike who need to move fast without throwing away work.

@boltdotnewnorth_east

Universal CLAUDE.md cuts output tokens by 63%

A new open-source CLAUDE.md file claims to cut Claude Code output tokens by 63% with zero code changes. Just drop it into your project root. CLAUDE.md files are one of the most effective ways to steer agentic coding tools, and this shows how much efficiency you can gain from prompt engineering alone.

@omarsar0north_east

OpenAI closes $122B funding round at $852B valuation

OpenAI just closed a $122 billion funding round at an $852 billion valuation. This is the largest private funding round in history and signals how much capital is flowing into AI infrastructure. For anyone building with agentic coding tools or running software teams, the scale of investment here will shape what models, APIs, and dev tools are available in the next 12 months.

@OpenAInorth_east

Andrew Ng pushes back on anti-AI regulation coalition

Andrew Ng pushes back on what he calls the anti-AI coalition maneuvering to slow down AI progress. He argues the evidence does not support the case for heavy restrictions and that slowing down would cost more than it prevents. Whether you agree or not, this debate directly affects the tools, models, and policies that shape how we build software with AI.

@AndrewYNgnorth_east

AI bottleneck is the chatbot interface not the models

Ethan Mollick argues that the biggest bottleneck in AI is not the models themselves but the chatbot interface. His new article explores how tools like Claude Dispatch are shifting AI from simple chat toward agentic workflows that can actually get work done. For developers and teams exploring how to integrate AI into real processes, the interface layer is where the real gains are.

@emollicknorth_east

OpenCode previews plugin system for coding agent extensibility

OpenCode just previewed its plugin system. Plugins let you extend the open source coding agent with custom tools and integrations, like a vault plugin for secrets management shown in the demo. As coding agents mature, plugin ecosystems will define which tools win. This is how agentic coding becomes practical for real development workflows.

@opencodenorth_east

bolt.new launches Design System Agents for repos and packages

bolt.new just launched Design System Agents. You can now turn your repos, npm packages, and docs into an agent that builds UI components following your exact design system. This is a practical step forward for vibe coding, where the agent does not just generate code but generates code that fits your existing standards and patterns.

@boltdotnewnorth_east

Coding agents will use tools as well as they code

Aaron Levie (Box CEO) makes a sharp point: agents that can code will also be able to use tools exceedingly well. He quotes a demo where Codex turned a meeting into a fully automated cross-platform workflow across Box, Gmail, and Slack. This is the direction agentic coding is heading, not just writing code but orchestrating entire business processes.

@levienorth_east

Anthropic signs AI safety MOU with Australian government

Anthropic has signed a memorandum of understanding with the Australian government to collaborate on AI safety research. Government partnerships like this shape how AI regulation and safety standards develop globally. Worth watching for anyone whose work depends on how AI policy evolves, especially in enterprise and outsourcing contexts where compliance matters.

@AnthropicAInorth_east

OpenAI closes $122B funding round at $852B valuation

OpenAI just closed a $122 billion funding round at an $852 billion valuation, the largest private funding round in history. This signals where the market sees AI heading and what kind of capital is flowing into the space. For anyone building with or around AI tools, this scale of investment means faster model improvements, more competition, and a rapidly shifting landscape for developers and businesses alike.

@OpenAInorth_east

MAR 31

AI being used in serious research mathematics Newton Institute talk preview

A research mathematician at Rutgers previewed his talk at the Newton Institute on how AI is changing serious mathematical research. Not autocomplete for formulas but genuine exploration: AI helping identify patterns across large proof spaces, suggesting proof strategies, and flagging candidate counterexamples faster than a human working alone. Paul Graham reposted this without comment. For those thinking about what agentic tools mean beyond software development, this is a useful data point on how the pattern plays out in an adjacent knowledge-intensive field.

@paulgnorth_east

SaaS startups pivoting to selling RL training data to AI labs

Deedy Das (Menlo Ventures) put a real shift into one sentence: you either exit a SaaS startup or live long enough to see yourself selling RL training data to AI labs. What looked like a niche pivot is becoming a meaningful revenue stream for companies with proprietary workflow data. AI labs are paying for high-quality domain-specific data to train and fine-tune models — and SaaS companies running real workflows for years are sitting on exactly that. For startup founders thinking about agentic coding tools: the data your agents generate may itself become a product.

@deedydasnorth_east

Scientific papers still as PDFs in 2026 friction for AI knowledge workflows

Ethan Mollick (Wharton professor) points out a friction that most AI teams quietly work around: in 2026, almost every scientific paper is still distributed as a formatted PDF. AI agents cannot easily search, extract structure from, or act on this content at scale. The observation extends beyond academia. Legal documents, technical manuals, corporate reports — a huge share of institutional knowledge is locked in a format designed for print, not for agents. For teams building agentic workflows that depend on external knowledge, the bottleneck is often not the model. It is the format the input arrives in.

@emollicknorth_east

Supply chain attacks axios litellm xz growing risk for agentic coding

Three supply chain attacks are breaking simultaneously: axios, litellm, and xz. Deedy Das (partner at Menlo Ventures, investor in Anthropic and OpenRouter) calls this a pattern, not a coincidence. Malicious package versions are being published to registries where agents routinely install dependencies without human review. For teams running agentic coding workflows, this is a direct risk. Agentic pipelines that auto-install packages or execute code in response to model outputs create exactly the kind of attack surface these campaigns target. Auditing your dependency chains and adding approval gates for package installs in agentic loops is no longer optional.

@deedydasnorth_east

Stanford MIT Meta-Harness automated harness engineering beats human scaffolding

New Stanford and MIT research shows the scaffolding around a language model — not just the model itself — drives up to a 6x performance gap on the same benchmark. The paper introduces Meta-Harness, an agentic system that automates harness engineering by learning from its own execution history. In agentic coding tests, it outperforms all hand-engineered setups, beating Claude Code on TerminalBench-2 while using 4x fewer tokens. For anyone building or running agentic coding workflows, the practical takeaway is direct: how you wrap a model matters as much as which model you choose. Prompt structure, context management, retry logic, and tool definitions are not secondary concerns — they are performance variables.

@omarsar0north_east

CMU research on async multi-agent coding systems

CMU researchers published a study on async coding agents — systems where multiple agents work in parallel on different parts of a codebase, coordinating through shared state rather than sequential handoffs. The results show meaningful gains in complex task completion compared to single-agent and synchronous multi-agent approaches. For teams scaling vibe coding beyond simple scripts, this points toward architecture: autonomous workflows benefit from parallel execution, not just longer chains. The bottleneck shifts from model capability to coordination design.

@omarsar0north_east

npm axios supply chain attack — risk for agentic coding setups

A supply chain attack just hit the popular npm package axios — a malicious version was briefly published, capable of exfiltrating environment variables. Andrej Karpathy flagged it as a direct threat to agentic coding setups, where agents routinely install packages and run code autonomously. This is a timely reminder: agentic loops that auto-install dependencies without human review create a real attack surface. As vibe coding workflows become more autonomous, security hygiene around package management becomes a first-class concern, not an afterthought.

@karpathynorth_east

Claude auto mode launched for Enterprise and API users

Claude auto mode is now available for Enterprise and API users — the model picks the right tool and approach for each step automatically, without manual configuration. For teams building agentic pipelines, this reduces the overhead of tuning model behavior per task. The system decides when to use extended thinking, tool use, or direct response based on the prompt — a meaningful step toward more autonomous coding workflows.

@claudeainorth_east

Computer use now available in Claude Code (research preview)

Computer use is now available in Claude Code — in research preview on Pro and Max plans. Claude can open your apps, click through your UI, and test what it built, right from the CLI. This closes a key loop in agentic coding: the agent writes code, runs it, and verifies the result visually — without leaving the terminal. For developers building automated workflows, this removes one more manual step from the feedback cycle.

@claudeainorth_east

MAR 30

Only 4 jobs left at tech companies including vibe coder

Viral post (947K views) argues there will be only 4 jobs left at tech companies — and one of them is vibe coder. Reposted by Elad Gil. Whether you agree or not, the framing captures a real shift: agentic tools are compressing traditional role boundaries. Worth reading for anyone planning their career or team structure.

@chintanzalaninorth_east

Jevons Paradox and surging agent token demand

Ethan Mollick points to Jevons Paradox playing out in AI: as tokens get cheaper, total demand surges. Agent workflows are consuming far more compute than anyone expected. For teams building on agentic coding tools, cost planning based on current token prices may not hold as usage scales.

@emollicknorth_east

Expert knowledge becoming accessible through AI

Ethan Mollick highlights how AI is making expert knowledge accessible to everyone. Specialized skills that took years to build are now available through agentic tools. For coding teams this means junior developers can tap into senior-level patterns and decisions. The question is how this changes team structure and hiring.

@emollicknorth_east

37K lines of code per day with agentic engineering

Garry Tan (Y Combinator CEO) reports that top engineers using agentic coding tools now produce 37K lines of code per day. This is not about typing faster — it is about agents handling the routine work while engineers focus on architecture and decisions. The productivity gap between teams using agentic tools and those that do not is widening fast.

@garrytannorth_east

Intelligence vs skill debate using chess analogy

Francois Chollet (creator of Keras, ARC benchmark) dives into the intelligence vs skill debate using a chess analogy. His argument: memorizing openings is skill, not intelligence. Real intelligence is adapting to novel problems. This distinction matters for how we evaluate AI coding agents — are they truly solving new problems or pattern-matching from training data?

@fcholletnorth_east

OpenRouter usage trends — only DeepSeek and OpenAI growing

Interesting data from Andriy Burkov: OpenRouter usage trends show that among major AI providers, only DeepSeek and OpenAI are growing. This matters for agentic coding — the models powering your coding agents are consolidating fast. If you pick a provider, choose one with staying power.

@burkovnorth_east

AGI implications for financial markets

Ethan Mollick (Wharton professor, AI researcher) raises a thought-provoking question: if AGI arrives, what happens to financial markets? Current asset prices assume human-speed innovation and labor. Autonomous AI agents could change that equation entirely. Worth thinking about as agentic systems grow more capable.

@emollicknorth_east

Infrastructure redesign needed for AI agents

Aaron Levie (Box CEO) argues that most infrastructure needs to be redesigned for AI agents. Sandboxes, search, payments, file systems — none of it was built for software that acts on your behalf. Quoting Jeff Dean on the same point. If you build tools or platforms, this is the shift to watch.

@levienorth_east

OpenCode zero data retention agreements

OpenCode just announced zero data retention agreements with all their AI providers. Your code, prompts, and context never get stored or used for training. For teams working with sensitive codebases, this is a big deal. Privacy-first agentic coding is becoming a real option.

@opencodenorth_east

Claude Code hidden features and tips thread

Boris Cherny, who works on Claude Code at Anthropic, just shared a detailed thread of hidden and underused features in Claude Code. The thread covers custom agents, voice input, multi-repo workflows, and speed optimizations. Already at 324K views and climbing fast. Essential reading for anyone using agentic coding tools in their workflow.

@bchernynorth_east

Mikael Alemu Gorsky

International strategist and academic researcher focused on the impact of artificial intelligence on society, governance, and higher education.

Born and educated in Moscow, with Ethiopian and Israeli roots, he lives and works in Israel as an author and researcher on AI's implications for governance, higher education, and the global economy.

He is a lecturer and researcher at the Holon Institute of Technology (HIT) near Tel Aviv, where his work examines how emerging technologies reshape institutions, skills, and long-term development.

Contact: hello@mgorsky.net

Teaching Leaders and Students

AI for Leaders — VIP Workshop

Format: In-office | Duration: 8 hours | Cohort: Invitation only

A concentrated executive immersion into the strategic implications of AI for your organization. How to identify high-value AI applications, build internal capability, and lead the transformation with confidence.

  • Strategic AI Literacy — Understand how generative AI, agents, and automation reshape organizational value chains — without the hype.
  • Team Upskilling Roadmap — Identify which roles benefit most from AI augmentation and design a practical adoption path for your team.
  • Risk and Governance — Navigate data privacy, compliance, and ethical considerations specific to your industry and jurisdiction.
  • Competitive Positioning — Assess where AI creates defensible advantage and where it levels the playing field.

Curriculum

  1. The AI Landscape (Hours 1–2) — From chatbots to autonomous agents: a structured overview of what works, what doesn't, and what matters for your business.
  2. Your Organization and AI (Hours 3–4) — Mapping your workflows to AI opportunities. Identifying the three highest-impact applications within your company.
  3. Building AI Capability (Hours 5–6) — Upskilling strategies that work. How to move from pilot projects to systematic AI integration without disrupting operations.
  4. Leadership in the AI Era (Hours 7–8) — Governance frameworks, vendor evaluation, build-vs-buy decisions, and leading teams through technological transformation.

Agentic Coding — Curriculum 2025/2026

Format: Hybrid | Duration: 40 hours | Cohort: Spring semester

A comprehensive semester-length program in AI-assisted software development. Students learn to work with AI coding agents — from prompt engineering to production deployment — under real engineering constraints.

  • Prompt Architecture — Design systematic prompt strategies that produce reliable, production-quality code output across languages and frameworks.
  • Agent Orchestration — Build multi-step coding agents that plan, execute, test, and iterate autonomously within guardrails.
  • Quality Assurance — Develop verification and testing frameworks for AI-generated code in production environments.
  • Human-AI Collaboration — Master the feedback loops between human oversight and machine execution at scale.

Curriculum

  1. Foundations of Vibe Coding (Hours 1–8) — The paradigm shift from manual coding to intent-driven development. Prompt engineering fundamentals.
  2. AI Coding Tools Deep Dive (Hours 9–16) — Comparative analysis of Claude Code, GitHub Copilot, Cursor, and other tools.
  3. Agentic Workflows (Hours 17–24) — Autonomous coding agents: planning loops, tool use, file system interaction, and iterative refinement.
  4. Architecture and Design (Hours 25–32) — AI-assisted system design. Database modeling, API architecture, and full-stack development with agents.
  5. Production and Deployment (Hours 33–40) — CI/CD integration, code review workflows, security considerations, and deploying AI-assisted projects.

Change Management — Executive Education

Format: Hybrid | Duration: 4 hours | Cohort: Rolling basis

A focused executive session on leading organizational change in the age of AI. Practical frameworks for the eight steps of transformation, drawn from real implementation experience.

  • Transformation Framework — Apply the eight-step change model to AI adoption, tailored to your organizational context and maturity level.
  • Stakeholder Navigation — Build consensus across leadership, technical teams, and operational staff during rapid technology shifts.
  • Risk Mitigation — Identify and address the organizational, cultural, and technical risks inherent in AI deployment.
  • Sustainable Adoption — Design change initiatives that stick — moving beyond pilot enthusiasm to embedded organizational capability.

Pro Bono Projects

AI for Seniors — Pro Bono Workshop

Helping older adults confidently adopt everyday AI tools.

For older adults, artificial intelligence is not about technology trends or market disruption. It is about preserving quality of life, maintaining a sense of autonomy, and sustaining the feeling of independence that defines dignified aging. AI chatbots and voice assistants can help seniors manage daily routines, access information in their native language, communicate with family across distances, navigate healthcare systems, and stay connected to the world.

For seniors who have emigrated — who live in countries where they were not born, where the language is different, where the bureaucracy is unfamiliar — AI becomes a bridge. AI chatbots can translate documents, explain official letters, help compose emails in the local language, and guide users through government websites.

The AI for Seniors workshop has been delivered to Russian-speaking communities in Israel, where it was met with genuine enthusiasm. Participants — many of them in their 70s and 80s — discovered that AI could help them read Hebrew documents, communicate with Israeli institutions, and access services that previously required help from children or grandchildren.

Startup Competitions — Judging and Mentoring

Contributing expertise as a judge and mentor at startup competitions, evaluating AI-driven ventures and providing strategic guidance to early-stage founders. Focused on helping teams clarify their value proposition, assess technical feasibility, and prepare for the realities of scaling an AI product.

AC/VC LinkedIn Group — Professional Community

AC/VC (Agentic Coding — Vibe Coding) is a LinkedIn group bringing together software developers, engineering students, and AI practitioners who are exploring the frontier of AI-assisted development. The community shares practical insights, code examples, tool comparisons, and honest assessments of what works in production.

Join the AC/VC LinkedIn group

Analytics and Research

Academic Research in AI

Research at the intersection of artificial intelligence and education — exploring how generative AI transforms learning, creativity, and human development. Key themes include constructionism in the age of AI, the cognitive impact of machine-assisted learning, and frameworks for integrating AI into educational practice.

Publications

The AI Pravda — LinkedIn Newsletter

Critical analysis of machine intelligence and its socio-economic impact. Over 4,200 subscribers.

Subscribe to The AI Pravda on LinkedIn

AI Chronicles — Daily Digest

Tracking AI evolution and impact through daily news digests, an industry rolodex, and a comprehensive archive.

Business Opportunities

Available for: Advisory, Board membership, Consulting, Mentoring startups, Teaching.

Contact: hello@mgorsky.net

Important Links