
arxiv-search

Verified

by langchain-ai · Updated Apr 16, 2026

opus · 11 in · 2.0K out · $0.20 · 1m 20s

Searches arXiv for preprints and academic papers, retrieves abstracts, and filters by topic. Use when the user asks to find research papers, search arXiv, look up preprints, find academic articles in physics, math, CS, biology, statistics, or related fields.

arxiv · research · academic-search · papers · python

See It In Action

1 turn · claude
The project codebase is in your current working directory. Explore the files to understand the code before making changes. need a quick arxiv brief for this afternoon’s retrieval architecture review — pull the 6 best papers on retrieval-augmented generation for production systems

The script doesn't print its results. Let me fix the call:

Here's your briefing — 6 papers on RAG for production systems, sorted by relevance:


RAG Architecture Review — arXiv Brief

1. Engineering the RAG Stack: Architecture and Trust Frameworks for RAG Systems

Focus: Comprehensive systematic review (2018–2025) consolidating RAG techniques into a unified taxonomy. Covers retrieval strategies, fusion mechanisms, orchestration, trust/alignment, and deployment of secure, domain-adaptable RAG systems. Why it matters: This is your anchor paper — a practical reference for production RAG architecture decisions.

2. FAIR-RAG: Faithful Adaptive Iterative Refinement for RAG

Focus: Agentic framework for multi-hop queries using Structured Evidence Assessment (SEA) — decomposes queries into evidence checklists, identifies gaps, and iteratively refines retrieval. Achieves SOTA on HotpotQA (F1 0.453, +8.3 pts over best iterative baseline). Why it matters: Directly addresses production failure mode of complex queries returning incomplete/noisy context.

3. EVOR: Evolving Retrieval for Code Generation

Focus: Dynamic RAG pipeline with synchronous evolution of queries and knowledge bases across diverse sources. 2-4x execution accuracy gains over static approaches (Reflexion, DocPrompting). Why it matters: Demonstrates the production need for living knowledge bases rather than static snapshots — relevant pattern for any domain with frequently updated data.

4. Riddle Me This! Stealthy Membership Inference for RAG

Focus: Security — demonstrates membership inference attacks against RAG datastores using natural-text queries (30 queries, <$0.02/doc). 2x TPR improvement while evading detection 76x better than prior attacks. Why it matters: Critical for production threat modeling. If you're deploying RAG over sensitive corpora, this paper outlines what adversaries can extract.

5. AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

Focus: Patch-level retrieval at each autoregressive generation step (rather than a single static retrieval). Two frameworks: training-free (DAiD) and parameter-efficient fine-tuning (FAiD). Why it matters: While image-domain, the core insight — dynamic, step-level retrieval beats one-shot retrieval — transfers directly to text RAG pipeline design.

6. Automated Literature Review Using NLP and LLM-Based RAG

Focus: Compares frequency-based (spaCy), transformer (T5), and RAG (GPT-3.5-turbo) approaches for automated literature review. RAG achieves highest ROUGE-1 (0.364). Why it matters: Practical end-to-end RAG application benchmark; useful as a baseline comparison for document synthesis pipelines.


Key themes for discussion: iterative/adaptive retrieval vs. single-pass, knowledge base evolution, security/membership inference risks, and unified architectural taxonomies for production deployment.

Claude Sonnet 4.6

External Tools

Tool      Type
python3   binary
pip       binary

Permissions

Scope   Description
network:outbound
process:spawn
filesystem:read

SKILL.md

arXiv Search Skill

Usage

Run the bundled Python script using the absolute skills directory path from your system prompt:

.venv/bin/python [YOUR_SKILLS_DIR]/arxiv-search/arxiv_search.py "your search query" [--max-papers N]
  • query (required): Search query string
  • --max-papers (optional): Maximum results to retrieve (default: 10)
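
The interface described above could be parsed with `argparse` roughly as follows. This is a minimal sketch, not the bundled script's actual source: the function name `build_parser` is hypothetical, and only the positional query, the `--max-papers` flag, and its default of 10 come from the usage text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented CLI (not the real script)."""
    parser = argparse.ArgumentParser(description="Search arXiv for papers.")
    # Required positional argument: the search query string.
    parser.add_argument("query", help="Search query string")
    # Optional cap on the number of results; the documented default is 10.
    parser.add_argument(
        "--max-papers",
        type=int,
        default=10,
        help="Maximum results to retrieve (default: 10)",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["deep learning drug discovery", "--max-papers", "5"])` yields an object whose `query` and `max_papers` attributes hold the two values; `argparse` converts the dash in the flag name to an underscore.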

Example

.venv/bin/python ~/.deepagents/agent/skills/arxiv-search/arxiv_search.py "deep learning drug discovery" --max-papers 5

Returns title and abstract for each matching paper, sorted by relevance.
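
The output described above (title and abstract per paper, sorted by relevance) might be produced along these lines. This is a sketch under assumptions: the helper names `search_arxiv` and `format_papers` and the output layout are hypothetical, and the bundled script may be structured differently. `arxiv.Client`, `arxiv.Search`, and `arxiv.SortCriterion.Relevance` belong to the `arxiv` package's public API; the package is imported lazily so that the formatter can be used without it.

```python
from typing import Iterable, List, Tuple


def search_arxiv(query: str, max_papers: int = 10) -> List[Tuple[str, str]]:
    """Return (title, abstract) pairs for a query, sorted by relevance."""
    import arxiv  # third-party; imported here so formatting works without it

    search = arxiv.Search(
        query=query,
        max_results=max_papers,
        sort_by=arxiv.SortCriterion.Relevance,
    )
    client = arxiv.Client()
    # Each result exposes .title and .summary (the abstract).
    return [(r.title, r.summary) for r in client.results(search)]


def format_papers(papers: Iterable[Tuple[str, str]]) -> str:
    """Render papers as plain text; the exact layout is an assumption."""
    blocks = []
    for title, abstract in papers:
        # Collapse the hard line breaks that arXiv abstracts often contain.
        summary = " ".join(abstract.split())
        blocks.append(f"Title: {title}\nSummary: {summary}")
    return "\n\n".join(blocks)
```

Calling `print(format_papers(search_arxiv("deep learning drug discovery", max_papers=5)))` requires network access and the `arxiv` package. Keeping the formatter separate from the network call lets it be checked offline.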

Dependencies

Requires the arxiv Python package. If missing, install with:

.venv/bin/python -m pip install arxiv
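
Whether the package is missing can be checked before running the install command. The sketch below uses only the standard library; the helper names `missing_packages` and `pip_install_command` are hypothetical and not part of the skill. It builds the install command without running it.

```python
import importlib.util
import sys
from typing import List, Sequence


def missing_packages(names: Sequence[str]) -> List[str]:
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_install_command(
    packages: Sequence[str], python: str = sys.executable
) -> List[str]:
    """Build (but do not run) a `python -m pip install ...` command."""
    return [python, "-m", "pip", "install", *packages]
```

For example, `pip_install_command(missing_packages(["arxiv"]), python=".venv/bin/python")` gives the argument list for the command shown above when `arxiv` is not importable; it could then be passed to `subprocess.run(..., check=True)`.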

FAQ

What does arxiv-search do?

Searches arXiv for preprints and academic papers, retrieves abstracts, and filters by topic. Use when the user asks to find research papers, search arXiv, look up preprints, find academic articles in physics, math, CS, biology, statistics, or related fields.

When should I use arxiv-search?

Use it when you need a repeatable workflow that produces a text response.

What does arxiv-search output?

In the evaluated run it produced a text response.

How do I install or invoke arxiv-search?

Ask the agent to use this skill when the task matches its documented workflow.

Which agents does arxiv-search support?

Agent support is inferred from the source, but not explicitly declared.

What tools, channels, or permissions does arxiv-search need?

It uses python3, pip; channels commonly include text; permissions include network:outbound, process:spawn, filesystem:read.

Is arxiv-search safe to install?

Static analysis marked this skill as low risk; review side effects and permissions before enabling it.

How is arxiv-search different from an MCP or plugin?

A skill packages instructions and workflow conventions; tools, MCP servers, and plugins are dependencies the skill may call during execution.

Does arxiv-search outperform not using a skill?

About arxiv-search

When to use arxiv-search

  • When you need to find preprints on arXiv by topic or keyword.
  • When you want quick access to paper abstracts in fields like CS, math, physics, or biology.
  • When screening a small set of relevant papers before deeper literature review.

When arxiv-search is not the right choice

  • When you need access to paywalled journals or databases beyond arXiv.
  • When you need full-text analysis or citation network exploration rather than title-and-abstract search.

What it produces

Produces a text response.