📊 Full opportunity report: Engineering Is Automated. Research Is the Residual. on ThorstenMeyerAI.com — validation score, market gap, and execution plan.
TL;DR
AI has achieved near-complete automation of core engineering tasks in AI development, according to recent research. Research itself remains less automated, but progress suggests it may soon become the next target for automation. The development could accelerate AI innovation and reshape research practices.
Jack Clark’s Import AI #455 argues that AI systems can now automate the core engineering tasks involved in AI development, leaving research as the remaining unautomated component. If that holds, it marks a significant shift in the AI development landscape, with the potential to accelerate innovation and reduce reliance on human engineering effort.
Thorsten Meyer’s analysis of Jack Clark’s recent essay highlights that six key benchmarks measuring AI’s capabilities in AI R&D tasks are approaching saturation. For example, CORE-Bench, which tests AI’s ability to reproduce research papers, has reached a 95.5% success rate, and the benchmark’s author has declared it ‘solved.’ Similarly, MLE-Bench, which assesses performance on Kaggle competitions, has reached a 64.4% success rate, comparable to mid-tier human performance. These markers suggest that AI can now handle most of the engineering tasks involved in research and development, such as reproducing experiments and optimizing kernels.
Clark’s framing distinguishes between engineering (building and implementing systems) and research (creating new knowledge). The evidence indicates that engineering is effectively automated, while research remains less so. The structural pattern, however, suggests that research may itself be a form of large-scale engineering, which would make the residual challenge smaller than previously thought.
Engineering is automated.
Research is the residual.
Six skill benchmarks. Edison’s framing. The question Clark leaves open is whether research is just engineering at scale.
Jack Clark’s Import AI #455 catalogs six benchmarks measuring AI capability on AI R&D tasks and concludes “AI can today automate vast swatches, perhaps the entirety, of AI engineering.” The residual question is research. The structural read on the residual: it may not be a permanent moat.
Six skills. One trajectory.
Clark catalogs six benchmarks measuring AI capability on AI R&D-relevant tasks. Each individual benchmark could be noise. Six benchmarks moving together is a curve. The pattern is the cascade observed across the broader Clark series — visible here in the specific R&D-skill domain.

Three data points. Mixed signal.
Clark provides three data points on the creative-spark question. Yes-evidence: Erdős-1051, centaur math discovery, sporadic Move-37-style moments. No-evidence: low yield, framing dependence, absence of acceleration. The mixed signal is the honest read.
The data supports two readings. Pessimistic: rare moments suggest creative insight is qualitatively distinct from engineering work. Optimistic: rare moments are an artifact of low-volume exploration, and more shots on goal yield more discoveries. Both readings are consistent with Clark’s “vast swatches, perhaps the entirety” claim. They differ on the residual.
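The optimistic reading can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that each exploration attempt is independent and yields a discovery with a small fixed probability. Neither the independence assumption nor the value of `P_PER_ATTEMPT` comes from Clark's essay; both are hypothetical.

```python
def chance_of_discovery(p: float, attempts: int) -> float:
    """Probability of at least one discovery in `attempts` independent tries."""
    return 1.0 - (1.0 - p) ** attempts


def expected_discoveries(p: float, attempts: int) -> float:
    """Expected number of discoveries across `attempts` independent tries."""
    return p * attempts


# Hypothetical per-attempt success rate; not a figure from Clark's essay.
P_PER_ATTEMPT = 0.001

for attempts in (10, 1_000, 100_000):
    print(
        f"{attempts:>7} attempts: "
        f"P(at least one) = {chance_of_discovery(P_PER_ATTEMPT, attempts):.3f}, "
        f"expected = {expected_discoveries(P_PER_ATTEMPT, attempts):.2f}"
    )
```

Under these assumptions, volume alone turns rare moments into routine ones. The pessimistic reading amounts to rejecting the model itself: if creative insight is qualitatively distinct, there may be no stable per-attempt probability for extra attempts to multiply.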

Five dimensions Clark gestures at but leaves underdeveloped.
Clark’s section is rigorous on the empirical evidence. Five strategic dimensions matter for the institutional response that the Clark series synthesis argues is structurally inadequate.

Two readings. Different equilibria.
The structural question Clark leaves open: is research a permanent moat that bounds automated AI R&D, or is it engineering at scale that dissolves with more shots on goal? Both readings are consistent with the current data. They differ by orders of magnitude in consequences.
[Figure: the two readings and their equilibria. Surviving labels: “Productivity multiplier years” and “Recursive loop operational”.]

Five audiences. Asymmetric cost of being wrong.
The institutional response should not bet on inspiration being a permanent moat. If the distinction holds, capacity built is still useful. If it closes, capacity is necessary. Asymmetric cost-of-being-wrong points toward building now.
The five audiences: industry, academia, policymakers, investors, and everyone else.
Engineering is automated. The residual is the question. The institutional response should not bet on inspiration being a permanent moat.
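The asymmetric cost-of-being-wrong argument can be sketched as a small decision table. Every number in `COSTS` is a hypothetical, unitless placeholder chosen only to show the shape of the argument; the labels `build`, `wait`, `moat_holds`, and `moat_closes` are illustrative names, not terms from Clark or Meyer.

```python
# Hypothetical, unitless costs for each (action, outcome) pair.
COSTS = {
    ("build", "moat_holds"): 1.0,    # capacity built is still useful
    ("build", "moat_closes"): 1.0,   # capacity turns out to be necessary
    ("wait", "moat_holds"): 0.0,     # nothing spent, nothing needed
    ("wait", "moat_closes"): 10.0,   # caught without capacity
}


def expected_cost(action: str, p_closes: float) -> float:
    """Expected cost of `action` if the moat closes with probability `p_closes`."""
    holds = COSTS[(action, "moat_holds")]
    closes = COSTS[(action, "moat_closes")]
    return (1.0 - p_closes) * holds + p_closes * closes


def break_even_probability() -> float:
    """Probability of closure at which building and waiting cost the same."""
    extra_if_holds = COSTS[("build", "moat_holds")] - COSTS[("wait", "moat_holds")]
    saved_if_closes = COSTS[("wait", "moat_closes")] - COSTS[("build", "moat_closes")]
    return extra_if_holds / (extra_if_holds + saved_if_closes)


print(f"Break-even probability of closure: {break_even_probability():.2f}")
for p in (0.05, 0.25, 0.50):
    print(f"p={p:.2f}: build={expected_cost('build', p):.2f}, "
          f"wait={expected_cost('wait', p):.2f}")
```

With placeholder costs this lopsided, building becomes the cheaper option at a fairly low probability of closure. Different cost assumptions would move the break-even point, which is why the numbers should be read as a shape, not an estimate.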
Implications for AI Development Speed and Research Practices
The automation of AI engineering tasks could dramatically accelerate AI development cycles, reducing costs and dependence on human engineers. This shift may also influence research methodologies, favoring iterative, automated experimentation over traditional exploratory approaches. However, it raises questions about the future role of human researchers and the nature of innovation in AI.
Recent Advances in AI Capabilities and Benchmark Saturation
Over the past 18 months, several independent lines of evidence (CORE-Bench, MLE-Bench, and kernel design research) have shown rapid progress toward saturation. CORE-Bench, which measures research reproduction, has seen a 4.4× improvement, with the latest systems handling dependencies and hardware issues at a postdoc level. MLE-Bench, built on Kaggle competitions, has reached a level comparable to mid-tier human practitioners, prompting the leaderboard to pause submissions for fairness adjustments. Meanwhile, automated kernel design tools suggest that AI is moving toward production-grade capability in infrastructure optimization.
This convergence suggests that AI’s engineering capabilities are nearing or at the point of full automation, with research likely to follow as a natural extension.
“The pattern across multiple benchmarks indicates that AI can now automate most engineering tasks involved in AI research, leaving research as the residual challenge.”
— Thorsten Meyer
Uncertainties About the Automation of AI Research
It remains unclear how much of AI research—beyond engineering—can be automated, especially aspects involving creativity, hypothesis generation, and theoretical insight. While engineering tasks are nearing full automation, the residual research component may still require significant human input, and the timeline for this transition is uncertain.
Next Steps in AI Automation and Research Development
Researchers and industry observers will monitor ongoing benchmark progress, with particular attention to whether research tasks beyond engineering begin to saturate. AI systems may soon take on more creative and theoretical roles, potentially transforming research workflows. Continued development in automated kernel design and experiment reproduction will likely accelerate, with the next 32 months critical for observing whether research automation becomes feasible at scale.
Key Questions
What specific engineering tasks has AI automated?
AI has automated tasks such as reproducing research experiments, optimizing hardware kernels, and handling dependencies and code execution at a level comparable to experienced researchers.
Does this mean AI can now do research independently?
Not yet. While engineering tasks are approaching full automation, the creative and hypothesis-driven aspects of research remain less automated, though progress suggests they may soon follow.
What are the implications for human researchers?
Automation could reduce the need for manual engineering, potentially freeing researchers to focus on higher-level theory and innovation, but it may also reshape job roles and research workflows.
When might AI fully automate the research process?
It is uncertain; current trends suggest significant automation could occur within the next few years, but the timeline depends on breakthroughs in automating creative and theoretical tasks.
Are there risks associated with this automation?
Potential risks include over-reliance on AI for research, reduced human oversight, and challenges in ensuring AI-generated research maintains quality and safety standards.
Source: ThorstenMeyerAI.com