Daily News
AI research, safety, product, and engineering links
Latest AI Reading
Source-dated posts from the last 14 days
Updated 2026-05-10 06:52 UTC
Sawtooth Problems
Red Button, Blue Button: On April 24th, 2026, Tim Urban put forth the following poll on Twitter/X: Everyone in the world has to take a private vote by pressing a red or blue button. If more than 50% of people press the...
Control Debt
Notes on the gap: what control evaluations assume vs. implementation in labs. It is 2027, and a frontier lab grew suspicions: plausibly, their model is scheming. Not a surprise for the control team. For more than a year,...
Could Frontier AI Researchers Collectively Slow the Race? A Conditional Pledge Mechanism
Overview: This is a project proposal and early research on the question of how and whether frontier AI researchers (not companies themselves) might take on personal risk and pledge to conditionally pause AI development....
The Goblins Are the Paperclips
Last week OpenAI published Where the goblins came from, explaining why their models started slipping creature metaphors into unrelated outputs. The story has been treated as a quirky anecdote: endearing, slightly embar...
Avoid alienating the marginal audience member
I urge everyone reading this to improve their public speaking and presentation skills. If you think your presentations have even a small chance of reducing existential risk, whether directly or indirectly, it is worth i...
Somerville Porchfest 2026
This afternoon Cecilia and I played for Somerville Porchfest, with Harris calling and Danner running sound. There was rain, but not enough to keep us from playing, or to keep folks from dancing: We were originally plannin...
The AI Industrial Explosion — Part 2: Transition Dynamics
This is Part 2 of a series on post-AGI economic growth. Part 1 established that a fully automated economy could double roughly every year using current technology. But the US economy does not currently look like a self-...
International Law Cannot Prevent Extinction Either
The context for this post is primarily Only Law Can Prevent Extinction, but after first drafting a half-assed comment, I decided to get off my ass and write a whole-assed post. I agree with Eliezer's main thesis that i...
Do capabilities generalize across propensities?
Thanks to Alex Mallen, Arjun Khandelwal, Arun Jose, Keshav Shenoy, & Sam Marks for helpful discussion on this experiment. This experiment is inspired by a proposal by Sam Marks. Summary: These are some results from an ex...
Neural Networks learn Bloom Filters
Overview: We train a tiny ReLU network to output sparse top-...
"OncoAgent: A Dual-Tier Multi-Agent Framework for Privacy-Preserving Oncology Clinical Decision Support"
[AINews] Anthropic growing 10x/year while everyone else is laying off >10% of their workforce
A quiet day lets us reflect on an interesting dichotomy in the economy.
Building realistic electric transmission grid dataset at scale: a pipeline from open dataset
Microsoft Research is excited to release an open dataset of approximate transmission topology of the U.S. power grid derived from publicly available data. The ability to study transmission-level power grid behavior is e...
Improving Bash Generation in Small Language Models with Grammar-Constrained Decoding
Bash is one of the most flexible and powerful interfaces exposed to AI agents. In the right system, a model that emits grep, curl, tar, or a shell pipeline is...
Streaming Tokens and Tools: Multi-Turn Agentic Harness Support in NVIDIA Dynamo
An agentic exchange must preserve a structured interaction: assistant turns interleave reasoning with one or more tool calls, and subsequent user turns return...
Optimize Supply Chain Decision Systems Using NVIDIA cuOpt Agent Skills
Modern supply chains operate under the constant pressures of fluctuating demand, volatile costs, constrained capacity, and interdependent decision-making....
EMO: Pretraining mixture of experts for emergent modularity
Halliburton enhances seismic workflow creation with Amazon Bedrock and Generative AI
In this post, we'll explore how we built a proof-of-concept that converts natural language queries into executable seismic workflows while providing a question-answering capability for Halliburton's Seismic Engine tools...
Adaptive Parallel Reasoning: The Next Paradigm in Efficient Inference Scaling
[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs
OpenAI continues deploying GPT-5 everywhere
Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer
Model quantization is an effective method to reduce VRAM usage and improve inference performance on consumer devices such as NVIDIA GeForce RTX GPUs. By...
Achieving Peak System and Workload Efficiency on NVIDIA GB200 NVL72 with Slurm Block Scheduling
NVIDIA GB200 NVL72 introduces a fundamentally new way to build GPU clusters by extending NVIDIA NVLink coherence across an entire rack. This design enables...
Powering the Next American Century: US Energy Secretary Chris Wright and NVIDIA’s Ian Buck on the Genesis Mission
AI will help build the energy it needs. That’s the case U.S. Energy Secretary Chris Wright and NVIDIA Vice President of Hyperscale and High-Performance Computing Ian Buck made Thursday morning at the SCSP AI+ Expo. The...
A review of “Investigating the consequences of accidentally grading CoT during RL”
Last week, OpenAI staff shared an early draft of Investigating the consequences of accidentally grading CoT during RL with Redwood Research staff.
Real-Time Performance Monitoring and Faster Debugging with NCCL Inspector and Prometheus
Distributed deep learning depends on fast, reliable GPU-to-GPU communication using the NVIDIA Collective Communication Library (NCCL). When training slows down,...
Secure short-term GPU capacity for ML workloads with EC2 Capacity Blocks for ML and SageMaker training plans
In this post, you will learn how to secure reserved GPU capacity for short-term workloads using Amazon Elastic Compute Cloud (Amazon EC2) Capacity Blocks for ML and Amazon SageMaker training plans. These solutions can a...
Overcoming reward signal challenges: Verifiable rewards-based reinforcement learning with GRPO on SageMaker AI
In this post, you will learn how to implement reinforcement learning with verifiable rewards (RLVR) to introduce verification and transparency into reward signals to improve training performance. This approach works bes...
Notes from inside China's AI labs
Lessons from my trip to talk to most of the leading AI labs in China.
Linked and Loaded: Gaijin Single Sign-On Now Available on GeForce NOW
Less typing, more tanking. Faster logins mean more time in the gaming action — and this week provides GeForce NOW members with a smoother path straight into the battlefield. Cloud gaming is all about instant access to t...
Agents that transact: Introducing Amazon Bedrock AgentCore payments, built with Coinbase and Stripe
Today, we're announcing a preview of Amazon Bedrock AgentCore Payments, a new set of features in Amazon Bedrock AgentCore that enables AI agents to instantly access and pay for what they use. AgentCore Payments was deve...
The World Inside Neural Networks
How neural geometry will unlock understanding and control of AI
Steering Along Manifolds to Control Neural Networks
The Neural Geometry Series
[AINews] Anthropic-SpaceXai's 300MW/$5B/yr deal for Colossus I, ARR growth is 8000% annualized
And the kingmaker picks a side.
Investigating the consequences of accidentally grading CoT during RL
We found limited accidental CoT grading in some released models, fixed the affected reward pathways, and found no clear evidence that monitorability degraded.
vLLM V0 to V1: Correctness Before Corrections in RL
Paper Summary: Interpreting Language Model Parameters
Cost effective deployment of vision-language models for pet behavior detection on AWS Inferentia2
Tomofun, the Taiwan-headquartered pet-tech startup behind the Furbo Pet Camera, is redefining how pet owners interact with their pets remotely. To reduce costs and maintain accuracy, Tomofun turned to EC2 Inf2 instances...
NVIDIA Spectrum-X — the Open, AI-Native Ethernet Fabric — Sets the Standard for Gigascale AI, Now With MRC
The race to build the world’s most powerful AI factories demands networking that keeps pace with the ambitions of AI itself. NVIDIA Spectrum-X Ethernet scale-out infrastructure stands at the forefront of that race as th...
AlphaEvolve: How our Gemini-powered coding agent is scaling impact across fields
Explore how AlphaEvolve's Gemini-powered algorithms are driving impact across business, infrastructure, and science.
[AINews] Silicon Valley gets Serious about Services
A series of announcements line up to a big theme: Services are the next big opportunity.
Adding Benchmaxxer Repellant to the Open ASR Leaderboard
Interpreting Language Model Parameters
🔬Doing Vibe Physics — Alex Lupsasca, OpenAI
The full story of how GPT‑5.x derived new results in theoretical physics and quantum gravity.
NVIDIA and ServiceNow Partner on New Autonomous AI Agents for Enterprises
Enterprise AI has learned to generate. It has learned to reason. Now companies are asking the next question: How should AI act? Early agent systems have shown what’s possible, moving beyond simple prompts to take on mor...
How Hapag-Lloyd uses Amazon Bedrock to transform customer feedback into actionable insights
Hapag-Lloyd's Digital Customer Experience and Engineering team, distributed between Hamburg and Gdańsk, drives digital innovation by developing and maintaining customer-facing web and mobile products. In this post, we w...
Streamlining generative AI development with MLflow v3.10 on Amazon SageMaker AI
Today, we’re excited to announce that Amazon SageMaker AI MLflow Apps now support MLflow version 3.10, bringing enhanced capabilities for generative AI development and streamlined experiment tracking to your generative...
Introducing OS Level Actions in Amazon Bedrock AgentCore Browser
We’re announcing OS Level Actions for AgentCore Browser. This new capability unblocks these scenarios by exposing direct OS control through the InvokeBrowser API, so agents can interact with content visible on the scree...
Microsoft at NSDI 2026: Advances in large-scale networked systems
Microsoft researchers share advances in building and operating large-scale distributed systems, spanning datacenters, networking, and the growing intersection with AI during NSDI ’26.
How to Build In-Vehicle AI Agents with NVIDIA: From Cloud to Car
The automotive cockpit is undergoing a fundamental shift from rule-based interfaces to agentic, multimodal AI systems capable of reasoning, planning, and...
Building for the Rising Complexity of Agentic Systems with Extreme Co-Design
Generative AI’s explosive first chapter was defined by humans sending requests and models responding. The agentic chapter is different. Agents don't...
Secure AI agents with Amazon Bedrock AgentCore Identity on Amazon ECS
AI agents in production require secure access to external services. Amazon Bedrock AgentCore Identity, available as a standalone service, secures how your AI agents access external services whether they run on compute p...
Intelligence-driven message defense and insights using Amazon Bedrock
In this post, you will learn how you can use Amazon Nova Foundation Models in Amazon Bedrock to apply generative AI techniques for both business protection and enhancement. You can identify obvious and disguised attempt...
[AINews] The Other vs The Utility
A quiet day lets us reflect on the nature of AI "character" in the Clippy vs Anton debate.
Verbalized Eval Awareness Inflates Measured Safety
The distillation panic
‘Distillation attacks’ is a horrible term for what is happening right now.
Import AI 455: AI systems are about to start building themselves.
The first step towards recursive self improvement
Import AI 455: Automating AI Research
Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe. AI systems are about to start building themselves....
[AINews] AI Engineer World's Fair — Autoresearch, Memory, World Models, Tokenmaxxing, Agentic Commerce, and Vertical AI Call for Speakers
A quiet day lets us make a call for speakers!
Risk from fitness-seeking AIs: mechanisms and mitigations
Fitness-seeking is increasingly what misalignment looks like in practice—how should we respond?
Catalyzing scientific impact through global partnerships and open resources
Data Mining & Modeling
How to Build, Run, and Scale High-Quality Creator Workflows in ComfyUI
Creative and visualization teams today produce more assets, in more formats, with leaner teams. Generative AI can accelerate that work – compressing tasks...
[AINews] Agents for Everything Else: Codex for Knowledge Work, Claude for Creative Work
A quiet day lets us reflect on coding agents "breaking containment".
Red-teaming a network of agents: Understanding what breaks when AI agents interact at scale
Safe agents don’t guarantee a safe ecosystem of interconnected agents. Microsoft Research examines what breaks when AI agents interact and why network-level risks require new approaches.
Nemotron Labs: What OpenClaw Agents Mean for Every Organization
By early 2026, the open source project OpenClaw had become a phenomenon. In January, its GitHub star count crossed 100,000 as developer interest surged.
Auto-review of agent actions without synchronous human oversight
Auto-review offers a safer default for deploying coding agents, using a separate agent to approve or deny boundary-crossing actions.
NVIDIA Nemotron 3 Nano Omni Powers Multimodal Agent Reasoning in a Single Efficient Open Model
Agentic systems often reason across screens, documents, audio, video, and text within a single perception‑to‑action loop. However, they still rely on...
It’s Gonna Be May: 16 Games Hit the Cloud This Month, With More NVIDIA GeForce RTX 5080 Power
[Editor’s note] The blog has been updated to note that GeForce RTX 5080-power expansion also extends to the Install-to-Play library. It’s gonna be May — and the cloud’s in full festival mode. 16 games are joining GeForc...
Enabling a new model for healthcare with AI co-clinician
Researching the path to AI-augmented care and development of an AI co-clinician.
[AINews] The Inference Inflection
A quiet day lets us reflect on the growing implications of the inference age.
Four ways Google Research scientists have been using Empirical Research Assistance
Data Mining & Modeling
Probe-Based Data Attribution: Surfacing and Mitigating Undesirable Behaviors in LLM Post-Training
Granite 4.1 LLMs: How They’re Built
[AINews] not much happened today
A quiet day.
Research Sabotage in ML Codebases
One of the main hopes for AI safety is using AIs to automate AI safety research. However, if models are misaligned, then they may sabotage the safety research. For example, misaligned AIs may try to:
DeepInfra on Hugging Face Inference Providers 🔥
Recursive forecasting
Eliciting long-term forecasts from myopic fitness-seekers
NVIDIA Launches Nemotron 3 Nano Omni Model, Unifying Vision, Audio and Language for up to 9x More Efficient AI Agents
AI agent systems today juggle separate models for vision, speech and language — losing time and context as they pass data from one model to the other. Unveiled today, NVIDIA Nemotron 3 Nano Omni is an open multimodal mo...
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
Into the Omniverse: Manufacturing’s Simulation-First Era Has Arrived
Manufacturing’s traditional design-build-test cycle rested on a single assumption: Real-world testing was the only reliable test environment.
Sources
Frontier Labs
OpenAI Alignment Research Blog
OpenAI alignment research.
OpenAI Engineering
OpenAI engineering articles and system-building notes.
Anthropic Research
Anthropic research on alignment, interpretability, evaluations, and societal impacts.
Anthropic Engineering
Anthropic engineering team posts and system-building articles.
Anthropic Alignment Science
Anthropic alignment science, interpretability, and risk evaluation articles.
Google DeepMind Blog
Google DeepMind official blog.
Google Research Blog
Official Google Research blog.
Meta AI Blog
Meta AI official blog.
Microsoft Research Blog
Microsoft Research blog.
AI Safety and Governance
LessWrong
AI alignment, rationality, and AI risk discussion.
AI Alignment Forum
Technical AI alignment research community.
AI Alignment
Alignment essays and research posts.
MIRI Blog
Machine Intelligence Research Institute blog.
METR
Model evaluation and frontier-risk research.
METR Evaluations
METR evaluation reports.
Apollo Research Blog
Frontier AI risk, scheming, and evaluations.
Redwood Research Blog
AI risk and safety research.
FAR.AI Blog
AI safety and alignment research.
CAIS Blog
Center for AI Safety updates.
Goodfire Blog
Interpretability and model control.
Goodfire Research
Goodfire research index.
Epoch AI Blog
AI trends, compute, data, economics, and forecasting.
Epoch AI Latest
Unified stream for papers, newsletters, data insights, and podcasts.
AI Companies and Research Labs
Hugging Face Blog
Open-source models, Transformers, applications, and research.
Hugging Face Daily Papers
Daily AI paper discovery.
NVIDIA Technical Blog
GPU, CUDA, AI systems, and technical engineering posts.
NVIDIA Blog
NVIDIA news and applications.
Amazon Science Blog
Amazon research posts, including AI and machine learning.
AWS Machine Learning Blog
AWS ML engineering, product, and practice posts.
Apple Machine Learning Research
Apple machine learning research.
Cohere Blog
Cohere official blog.
Cohere Research
Cohere research posts.
Mistral AI News
Mistral official news and releases.
xAI News
xAI official news.
Academic Labs
BAIR Blog
Berkeley AI Research blog.
Stanford AI Lab Blog
Stanford AI Lab blog.
Stanford HAI News
Stanford HAI news and blog posts.
MIT CSAIL News
MIT CSAIL news.
MIT LINGO Blog
MIT Language and Intelligence group blog.
Personal Blogs and Newsletters
Lilian Weng - Lil'Log
Long-form posts on reinforcement learning, LLMs, agents, and alignment.
Jay Alammar Blog
Visual explanations for machine learning and Transformers.
Andrej Karpathy - Bear Blog
Andrej Karpathy's newer blog.
Andrej Karpathy - Old Blog
Older Karpathy blog posts.
Sebastian Ruder Blog
NLP and machine learning blog.
Sebastian Ruder Newsletter
NLP news and newsletter.
Import AI
Jack Clark's AI research and industry newsletter.
Import AI Substack
Substack version of Import AI.
Interconnects
Nathan Lambert's frontier AI research and industry newsletter.
Latent Space
AI Engineer newsletter and podcast.
DeepLearning.AI - The Batch
AI news digest.
Distill
Classic visual and explanatory machine learning articles.
The Gradient
AI research, society, and commentary.
