Product Launch · Xiaohu Explains

Anthropic Launches Claude Science: An AI Workbench for Scientists, with 60+ Built-in Research Skills

Now in beta. A coordinating agent marshals a team of specialists to do the work, and a reviewer agent at the end hunts down errors in citations and numbers; compute is outsourced to the AI, while raw data never leaves your machine.

One-Minute Overview

Anthropic has launched Claude Science, an AI workbench app for scientists, now in beta for Pro, Max, Team, and Enterprise users, usable locally on macOS/Linux or remotely via SSH/HPC login nodes.
The app ships with 60+ pre-configured skills and connectors spanning genomics, single-cell, proteomics, structural biology, and cheminformatics, wired into hundreds of specialized data sources (UniProt, PDB, Ensembl, and more) plus journals and preprint resources.
It can autonomously draft compute jobs and, with the user's consent, submit them to the user's own HPC cluster or Modal cloud GPUs, scaling analyses from a single GPU to hundreds — while raw data always stays inside the user's own systems.
A built-in reviewer agent continuously checks whether citations in the generated output are real, whether the numbers trace back to the computation, and whether charts match the code that produced them, fixing problems automatically when it finds them.
Real cases already exist: an Allen Institute researcher has produced about ten reviews (several over 100 pages), where a single review used to take up to two years; a UCSF team cut its end-to-end germline variant analysis to roughly one-tenth of the original time, with the results independently verified by the lab.

⚑ Editorial note: This piece is based on Anthropic's official announcement and reflects the vendor's own account. The performance figures cited (one-tenth the time, about ten reviews, etc.) come from partner labs and the vendor; the UCSF case is said to be independently verified by the lab, while the other numbers have not been checked by a third party.

1 · Scientists now have their own workbench

Scientists Now Have Their Own AI Workbench

Anthropic recently launched Claude Science, an AI workbench app that pulls scientists' everyday tools, databases, and compute resources into a single environment, now in beta for Pro, Max, Team, and Enterprise users.

It's an app that runs on your own computer or server: you ask an AI a scientific question in plain language, and it marshals dozens of specialized tools to pull data, run analyses, draw charts, and write manuscripts — with every output traceable back to how it was made. You can use it locally (macOS/Linux) much like Jupyter Notebook, or on a remote machine via SSH or an HPC login node.

⚡ Why it matters: the UCSF Brain Tumor Center team put it to the test, running an end-to-end germline variant analysis for glioma and cutting the time to roughly one-tenth of what it took before. The lab independently double-checked the results.

Signature diagram: a central coordinating agent gathers the once-scattered skills, databases, journals, specialized models, and three tiers of compute (laptop/HPC/GPU cloud) into a single conversation.

2 · How fragmented the old way was

Just How Much of a Headache Science Really Is

Everyday research is full of tedious chores. Researchers hop between dozens of databases, each with its own data structure (schema); the file formats they run into often need dedicated processing pipelines and viewers; and the list of tools they switch between is long: PubMed, Jupyter, R, cluster terminals.

PubMed · Jupyter · R · Cluster Terminal · Per-DB schemas · Dedicated File Viewers · Custom Data Pipelines

Just wiring these tools together and getting data to flow between them eats up a huge amount of a researcher's energy. What Claude Science aims to do is gather these scattered pieces into one environment, where you can go from searching the literature all the way to producing a manuscript.

3 · One coordinator, a team of specialists

One Lead Coordinating Agent, a Team of Specialists Behind It

The one you talk to is a general-purpose coordinating agent. It holds those 60+ pre-configured skills and connectors, can spin up domain-specialist sub-agents, and can also call the custom agents you've created yourself. They fan out to pull data, run analyses, and produce results.

What are skills and connectors

Think of it as an app store plus shortcuts for your phone: whichever database or software you need to work with, you install the matching "skill pack" on the coordinating agent and it knows how to drive that tool. Connectors, meanwhile, bring in the tools your lab already uses.

The key piece is that at the end of the chain stands a role dedicated to catching errors — the reviewer agent. It watches the other agents' output, checks it item by item, and fixes problems on its own when it finds them.

You (ask in plain language) → Coordinating agent → Specialist sub-agent / Your custom agent → Reviewer agent (checks whether citations are real, whether numbers trace back to the computation, and whether charts match the code that produced them) → Finds a problem, fixes it automatically

Core Innovation · 1

This reviewer agent is like a peer reviewer on call the whole way through: it fixates on whether citations really have a source, whether the numbers given can be traced back to the original computation, and whether charts line up with the code that generated them. Find an error, and it fixes it itself rather than leaving the problem to you. This takes direct aim at that old flaw where AI-generated content loves to make things up with a straight face.

This pairing is called actor-critic

One agent generates content — the "actor"; the other is dedicated to checking accuracy and the credibility of citations — the "critic," and the two divide the labor and keep each other in check. It's like one reporter writing the story and a dedicated fact-checking editor going over it line by line, with neither vouching for the other.
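The division of labor described above can be sketched in a few lines. This is only an illustration of the generate-then-check loop, not Anthropic's implementation: `Draft`, `critic`, `actor_revise`, and `actor_critic_loop` are hypothetical names, and the "trusted source set" stands in for whatever the reviewer agent actually checks against.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    text: str
    citations: list[str] = field(default_factory=list)


def critic(draft: Draft, known_sources: set[str]) -> list[str]:
    """Return the citations that cannot be found in the trusted source set."""
    return [c for c in draft.citations if c not in known_sources]


def actor_revise(draft: Draft, rejected: list[str]) -> Draft:
    """Hypothetical revision step: drop every citation the critic rejected."""
    kept = [c for c in draft.citations if c not in rejected]
    return Draft(text=draft.text, citations=kept)


def actor_critic_loop(draft: Draft, known_sources: set[str], max_rounds: int = 3) -> Draft:
    """Alternate critic checks and actor revisions until the critic has no objections."""
    for _ in range(max_rounds):
        rejected = critic(draft, known_sources)
        if not rejected:
            break
        draft = actor_revise(draft, rejected)
    return draft


# Placeholder identifiers, invented for this example.
sources = {"ref-a", "ref-b"}
first = Draft("draft text", ["ref-a", "ref-unknown"])
final = actor_critic_loop(first, sources)
```

Here the critic never edits the text itself; it only reports what failed, and the actor decides how to revise. Keeping the two roles in separate functions is what lets neither vouch for the other.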

4 · Every chart traces back

Every Chart It Generates Traces Back to Its Code

Research leans heavily on visuals, so when Claude Science produces charts and manuscripts, it hands over the code that generated them alongside. It can also natively render research-specific visualization formats — 3D protein structures, genome browser tracks, chemical structures, and more — without opening a separate dedicated viewer.


Claude Science natively renders proteins, molecules, and structures; every result is reproducible and traceable to the code that generated it. Image: Anthropic

When it generates a chart, it attaches, all together: the exact code and runtime environment that produced it, a one-line note on how it came to be, and the full conversation record. That means when you look back months later, you can still figure out what was fed in, how the result was verified, and how to reproduce it.

A chart carries with it

The code that produced it
Runtime environment
A one-line note on its origin
The full conversation record

So you can

See every input
Verify anytime
Reproduce it months later
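One way to picture such a record is as a small data structure attached to each chart. The sketch below is an assumption about shape only: `ChartProvenance`, its field names, and the `matches` helper are invented for illustration, and the announcement does not describe the actual storage format.

```python
import hashlib
import platform
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartProvenance:
    code: str                # exact plotting code that produced the chart
    environment: dict        # runtime environment, e.g. interpreter version
    note: str                # one-line note on how the chart came to be
    conversation_id: str     # pointer to the full conversation record

    def fingerprint(self) -> str:
        """SHA-256 of the plotting code, so later edits are detectable."""
        return hashlib.sha256(self.code.encode("utf-8")).hexdigest()


def matches(record: ChartProvenance, code_on_disk: str) -> bool:
    """True if the code found later is byte-identical to the recorded code."""
    return record.fingerprint() == hashlib.sha256(code_on_disk.encode("utf-8")).hexdigest()


record = ChartProvenance(
    code="ax.bar(labels, values)\nax.grid(False)\n",
    environment={"python": platform.python_version()},
    note="Bar chart of values per label, gridlines removed",
    conversation_id="session-001",  # placeholder identifier
)
```

Calling `matches(record, some_code)` months later answers the narrow question "is this the code that made the chart?"; reproducing the chart would additionally need the recorded environment and inputs.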

Changing a chart doesn't mean touching code yourself either. Tell it in plain language to "remove the gridlines" or "switch the y-axis to a log scale," and it goes and edits the code it wrote and re-renders the chart.

Before and after: the original bar chart with gridlines, and the same chart after the instruction "remove the gridlines," with the agent editing its own plotting code directly.

5 · The AI drives your supercomputer

The AI Drives Your Supercomputer Itself — Yet the Data Never Moves

Big analyses are a hassle: folding a protein, running a genomics pipeline over a massive dataset — researchers often have to drop the scientific question at hand to configure the compute job, wait for it to queue onto the cluster, watch whether it succeeds, and then pull the results back. Claude Science takes this whole routine off your hands.


Claude Science automatically sets up runtime environments and schedules compute on your laptop, cluster, or on-demand GPUs. Image: Anthropic

It first drafts a plan and asks you before drawing on new resources — you can review or even revoke any decision. Only with your consent does it write the job and submit it to the compute your lab already uses: your own HPC cluster over SSH, or on-demand cloud GPUs through your Modal account. The scale can flex from a single GPU to hundreds.

Draft a plan → Ask consent (review/revoke) → Write & submit job → Your HPC (SSH) / Modal GPU → Return only needed context
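The consent step in this flow amounts to a gate in front of the submit call. A minimal sketch, assuming a callback-based design: `JobPlan`, `submit_with_consent`, and `fake_submit` are hypothetical names, the `target` labels are invented, and no real scheduler or Modal API is called.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobPlan:
    description: str   # what the job does, in plain language
    target: str        # e.g. "hpc" or "cloud-gpu" (labels invented here)
    gpus: int          # number of GPUs requested


def submit_with_consent(
    plan: JobPlan,
    ask_user: Callable[[JobPlan], bool],
    submit: Callable[[JobPlan], str],
) -> Optional[str]:
    """Submit the job only if the user approves the drafted plan."""
    if not ask_user(plan):
        return None          # consent withheld: nothing is submitted
    return submit(plan)      # consent given: hand off to the backend


submitted = []


def fake_submit(plan: JobPlan) -> str:
    """Stand-in for a real scheduler call; records the plan and returns an ID."""
    submitted.append(plan)
    return f"job-{len(submitted)}"


plan = JobPlan("Fold one protein", target="hpc", gpus=1)
declined = submit_with_consent(plan, ask_user=lambda p: False, submit=fake_submit)
approved = submit_with_consent(plan, ask_user=lambda p: True, submit=fake_submit)
```

Because the backend is passed in as a function, the same gate would sit in front of an SSH submission or a cloud GPU request without changing the consent logic.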

Core Innovation · 2

The whole process runs on your lab's own infrastructure — your laptop, a Linux machine, or an HPC login node. So large, sensitive datasets never have to leave the systems they already live on; each step passes Claude only the sliver of context that step's analysis needs. Compute can be outsourced to the AI to schedule, while the raw data stays put.
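The idea of passing along only what a step needs can be illustrated with a local reduction: raw records are summarized where they live, and only the summary moves on. This is a sketch under assumptions, since the announcement does not say how context is selected; `summarize_locally` is a hypothetical helper and the rows are toy data.

```python
from statistics import mean


def summarize_locally(rows: list[dict], column: str) -> dict:
    """Reduce raw rows to a small aggregate; the rows themselves are not returned."""
    values = [row[column] for row in rows]
    return {
        "column": column,
        "n": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


# Toy records invented for this example. They stay in local memory;
# only `context` would be passed along to the model.
raw_rows = [
    {"sample": "s1", "depth": 30},
    {"sample": "s2", "depth": 42},
    {"sample": "s3", "depth": 36},
]
context = summarize_locally(raw_rows, "depth")
```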

Because these agents work inside a session that holds context in memory, even a massive dataset only has to be loaded once. As jobs run, that reviewer agent checks the output in step, catching wrong citations, numbers that can't be traced, and charts that don't match the code — self-correcting as it goes.

What is a forked session

Halfway through a job, you can spin off a parallel branch and run each branch with a different method. The two do not affect each other, and the original conversation thread is preserved. It's like saving one document as two versions and editing each separately: botch one and the original stays untouched.
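The isolation property of a fork can be shown with a deep copy of the session state. This is a simplified model, not the product's mechanism: the `Session` class and its `fork` method are hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Session:
    history: list[str] = field(default_factory=list)   # conversation so far
    state: dict = field(default_factory=dict)          # loaded data, parameters

    def fork(self) -> "Session":
        """Return an independent copy; changes to it never touch the original."""
        return copy.deepcopy(self)


main = Session(history=["load dataset"], state={"method": None})
branch_a = main.fork()
branch_b = main.fork()
branch_a.state["method"] = "method-A"
branch_b.state["method"] = "method-B"
branch_a.history.append("run method-A")
```

A deep copy is simply the shortest way to demonstrate that branches cannot interfere with each other; it says nothing about how loaded data is actually shared between branches in the app.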

6 · Pre-loaded domain firepower

Expert Out of the Box: Databases and Domain Models Already Wired In

Scientific knowledge is scattered across hundreds of specialized sources. In biology alone, the relevant data may be spread across UniProt, PDB, Ensembl, Reactome, ClinVar, ChEMBL, and GEO — each with its own structure and query language — plus journals, preprint servers, and domain-specific open-source models. You ask one question in plain language, and specialist agents query and synthesize across these sources, sparing you from poking at them one by one.


Built-in pre-configuration for genomics, single-cell, proteomics, and cheminformatics, with 60+ skills wiring you into a range of scientific data sources. Image: Anthropic

60+: pre-loaded research skills and connectors, covering genomics, single-cell, proteomics, structural biology, and cheminformatics

Hundreds: specialized data sources where scientific knowledge is scattered (UniProt, PDB, Ensembl, and more), plus journals and preprints

1 → hundreds: GPUs a compute job can elastically scale to

UniProt · PDB · Ensembl · Reactome · ClinVar · ChEMBL · GEO

It also plugs into NVIDIA's BioNeMo Agent Toolkit, natively connecting to the life-science models and libraries in BioNeMo, including Evo 2, Boltz-2, and OpenFold3. And the models, datasets, and pipelines scientists already trust can be brought in too: any pipeline can be saved as a reusable skill, any go-to tool hooked up with a connector, and later sessions inherit them automatically. You don't have to ditch the toolchain you already trust just to use AI.

7 · What three labs found

What Three Labs Have Already Gotten Out of It

Over the past few months, researchers have used the beta for single-cell RNA sequencing analysis, CRISPR screen design, protein structure prediction, cheminformatics, and more. Three cases best show what it does in practice.

Lab	What they did with it	Quantified result
Manifold Bio	End-to-end screening of targets for tissue-targeted drugs, assessing surface expression, in vivo transport, and safety one by one, ranking by criteria learned from their own proprietary data	Ran the whole pipeline in one go; the key difference from a general-purpose coding assistant is that it finds the right data itself and makes judgments carrying experience from past projects
Allen Institute neuroscientist Jérôme Lecoq	Built a multi-agent "computational review template" of about 20 custom skills: sub-agents read thousands of papers, extract core arguments and key quantitative findings into an evidence store, then write the review section by section, each section handed to a dedicated sub-agent that writes and checks in tandem using an actor-critic pairing	A single review used to take up to two years; he has now produced about 10, several over 100 pages, with citations all checked by the reviewer agent
UCSF Brain Tumor Center epidemiology associate professor Stephen Francis	Studying the molecular epidemiology of glioma: how thousands of small-effect germline variants add up to shape individual susceptibility, running a comprehensive germline analysis across multiple methods	Cut the time to roughly one-tenth; his team independently double-checked the results, confirming they were both fast and solid

Chart: UCSF run time, before (10×) and after (1×, or 1/10); Allen reviews, before (per 2 yrs) and now (~10).

8 · How to get it now

Who Can Use It Now, and How

The Claude Science app is now available in beta on macOS and Linux for Pro, Max, Team, and Enterprise users. Team and Enterprise users need an admin to enable it. Anthropic says it released early so scientists can get hands-on with real problems first and then feed back on how to refine it.

For active labs at academic and nonprofit research institutions, there's also a Team plan with discounted seats.

There's also funding for science projects

Anthropic will support up to 50 Claude Science "AI for Science" projects, each with up to $30,000 in credits; Modal separately provides up to $2,000 in compute for selected projects. Priority goes to biology and biomedical research. Applications are open until July 15, 2026, with notifications by July 31, and a project period running September 1 to December 1, 2026.

Every output carries an auditable record of how it was generated, so you can verify and reproduce the results. Anthropic, "Claude Science, an AI workbench for scientists"

This piece is an interpretation based on Anthropic's official announcement, "Claude Science, an AI workbench for scientists" (claude.com/science). The data, cases, and product capabilities cited all come from that announcement and reflect the vendor's own account; the UCSF case is said to be independently verified by the lab. Images are from Anthropic's official announcement page.