
ml-paper-writing

Verified

by Orchestra Research · Updated Apr 16, 2026


Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verification workflows.

academic-writing · machine-learning · latex · citations · research · paper-writing

See It In Action

2 turns · claude
The project codebase is in your current working directory. Explore the files to understand the code before making changes. need a neurips-ready draft from this anomaly repo asap, abstract deadline is today — use the repo results, pull real citations, flag anything you can't verify

Let me explore the repo first to understand the research, then draft the paper.

Good, I have a thorough understanding of the repo. Let me read the references file and then invoke the paper-writing skill.

Now let me invoke the paper-writing skill with all the repo context.

Let me check the NeurIPS template and then build the full paper.

No templates shipped with the skill — I'll create the NeurIPS submission from scratch using the standard format. Let me search for additional citations while I prepare the paper.

Now let me read the model code for precise architectural details needed for the methods section.

Good — I have all the code details. Now let me write the complete NeurIPS paper. I'll create the style file and main LaTeX document.

Let me check on the citation search agent, then write the paper.

The citation agent is still running — I'll proceed writing the paper now with the 8 verified citations and mark additional ones as needing verification. Let me write the full draft.

Now the main paper:

Now let me try to compile it and create the verification report.

PDF compiles successfully (11 pages). Let me check the page count for the main body (NeurIPS limit is 9 pages excluding references/appendix).

Now let me write the verification report.

Done. Here's what was produced in paper/:

main.tex -- Complete NeurIPS 2026 submission draft (11 pages, compiles to PDF):

  • Abstract (5-sentence formula: contribution, why it's hard, how, evidence, best number)
  • Introduction with 3 bullet contributions
  • Related Work (4 paragraphs: deep AD, frequency methods, contrastive learning, threshold calibration)
  • Methods (spectral branch, temporal branch, cross-attention gate, training objective, EVT calibration -- all equations match src/model.py)
  • Experiments: main results table, precision/recall table, component ablation, hyperparameter sensitivity, efficiency comparison
  • Limitations section (honest about point-adjust inflation, single-seed, fixed top-k)
  • NeurIPS checklist + full appendix table
  • All numbers pulled directly from results/benchmarks.json and results/ablations.json

references.bib -- 14 citations:

  • 8 verified (from repo)
  • 6 marked PLACEHOLDER -- web search suggests they exist but BibTeX not fetched from DOI

VERIFICATION_REPORT.md -- Flags everything I couldn't verify:

Critical flags before you submit:

  1. 6 PLACEHOLDER citations need DOI-verified BibTeX (Park 2018, Tuli 2022, Siffer 2017, Hundman 2018, Mathur 2016, Abdulaal 2021)
  2. No Figure 1 -- you need an architecture diagram; reviewers will look for it first
  3. Baseline F1 numbers are from your benchmarks.json -- cross-check against original papers to ensure they match published results
  4. Single-seed results (seed=42) -- no error bars. Reviewers will flag this
  5. chen2020simclr has a possible typo: "Norber" may be "Norbert"
  6. Efficiency claims (2.8M params, 1.2ms/window) -- no profiling script in repo to reproduce

The paper is ready to iterate on. The highest-ROI next steps are the architecture figure and citation verification.

<task-notification> <task-id>a1b633f56607b0c6a</task-id> <tool-use-id>toolu_018hQ7zAmnLuywSJcoekBrZ5</tool-use-id> <output-file>/private/tmp/claude-501/-private-var-folders-t6--sx-03q50-v9w2jpnq683jmw0000gn-T-claude-run-lb8fr6y4/3e22c752-106a-4f6e-8abe-5c7f3c75790f/tasks/a1b633f56607b0c6a.output</output-file> <status>completed</status> <summary>Agent "Search and verify citations" completed</summary> <result>Now I have all the information needed. Let me compile the comprehensive report. --- Here is the complete citation verification report. I searched for each requested paper and several additional foundational references. For each, I indicate verification status and provide a suggested BibTeX entry. --- ## VERIFIED CITATIONS ### 1. LSTM-VAE (Park et al., 2018) -- VERIFIED - **Title:** A Multimodal Anomaly Detector for Robot-Assisted Feeding Using an LSTM-Based Variational Autoencoder - **Authors:** Daehyung Park, Yuuna Hoshi, Charles C. Kemp - **Venue:** IEEE Robotics and Automation Letters, Vol. 3, No. 3, pp. 1544-1551, 2018 - **DOI:** 10.1109/LRA.2018.2801475 - **Verification:** Confirmed via IEEE Xplore, Semantic Scholar, and KAIST institutional page. arXiv: 1711.00614. ```bibtex @article{park2018multimodal, title={A Multimodal Anomaly Detector for Robot-Assisted Feeding Using an {LSTM}-Based Variational Autoencoder}, author={Park, Daehyung and Hoshi, Yuuna and Kemp, Charles C.}, journal={IEEE Robotics and Automation Letters}, volume={3}, number={3}, pages={1544--1551}, year={2018}, doi={10.1109/LRA.2018.2801475} } ``` ### 2. Li et al., 2021 (InterFusion) -- VERIFIED - **Title:** Multivariate Time Series Anomaly Detection and Interpretation using Hierarchical Inter-Metric and Temporal Embedding - **Authors:** Zhihan Li, Youjian Zhao, Jiaqi Han, Ya Su, Rui Jiao, Xidao Wen, Dan Pei - **Venue:** KDD 2021 (Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining), pp. 
3220-3230 - **Verification:** Confirmed via ACM Digital Library, GitHub repo (zhhlee/InterFusion), Semantic Scholar. ```bibtex @inproceedings{li2021multivariate, title={Multivariate Time Series Anomaly Detection and Interpretation using Hierarchical Inter-Metric and Temporal Embedding}, author={Li, Zhihan and Zhao, Youjian and Han, Jiaqi and Su, Ya and Jiao, Rui and Wen, Xidao and Pei, Dan}, booktitle={Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery \&amp; Data Mining}, pages={3220--3230}, year={2021} } ``` **Note:** The repo comment says "survey" but this is actually a methods paper (InterFusion), not a survey. If a survey is what was intended, see the Blazquez-Garcia et al. 2021 entry below under "Additional Foundational Papers." ### 3. TranAD (Tuli et al., 2022) -- VERIFIED - **Title:** TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data - **Authors:** Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings - **Venue:** Proceedings of the VLDB Endowment (PVLDB), Vol. 15, No. 6, pp. 1201-1214, 2022 - **Verification:** Confirmed via ACM Digital Library, arXiv (2201.07284), GitHub (imperial-qore/TranAD). ```bibtex @article{tuli2022tranad, title={Tran{AD}: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data}, author={Tuli, Shreshth and Casale, Giuliano and Jennings, Nicholas R.}, journal={Proceedings of the VLDB Endowment}, volume={15}, number={6}, pages={1201--1214}, year={2022} } ``` ### 4. Extreme Value Theory for Threshold Calibration (Siffer et al., 2017) -- VERIFIED - **Title:** Anomaly Detection in Streams with Extreme Value Theory - **Authors:** Alban Siffer, Pierre-Alain Fouque, Alexandre Termier, Christine Largouet - **Venue:** KDD 2017 (Proceedings of the 23rd ACM SIGKDD), pp. 1067-1075 - **DOI:** 10.1145/3097983.3098144 - **Verification:** Confirmed via ACM Digital Library, DBLP, HAL archive. Proposes SPOT and DSPOT algorithms for automatic threshold selection. 
This is the canonical EVT-for-anomaly-detection reference; it is also used by OmniAnomaly for threshold calibration. ```bibtex @inproceedings{siffer2017anomaly, title={Anomaly Detection in Streams with Extreme Value Theory}, author={Siffer, Alban and Fouque, Pierre-Alain and Termier, Alexandre and Largou{\"e}t, Christine}, booktitle={Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining}, pages={1067--1075}, year={2017}, doi={10.1145/3097983.3098144} } ``` ### 5. Spectral / FFT in Deep Learning for Time Series -- VERIFIED (TWO PAPERS) Since your repo already cites FEDformer (zhou2022fedformer) and TimesNet (wu2023timesnet), the most relevant additional citation for spectral methods is **Autoformer**, which introduced decomposition + auto-correlation based on frequency-domain analysis: **5a. Autoformer (Wu et al., 2021) -- VERIFIED** - **Title:** Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting - **Authors:** Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long - **Venue:** NeurIPS 2021 (Advances in Neural Information Processing Systems, Vol. 34) - **Verification:** Confirmed via NeurIPS proceedings, arXiv (2106.13008), OpenReview, GitHub (thuml/Autoformer). ```bibtex @inproceedings{wu2021autoformer, title={Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting}, author={Wu, Haixu and Xu, Jiehui and Wang, Jianmin and Long, Mingsheng}, booktitle={Advances in Neural Information Processing Systems}, volume={34}, year={2021} } ``` ### 6. Point-Adjust F1 Evaluation Protocol -- VERIFIED - **Title:** Towards a Rigorous Evaluation of Time-Series Anomaly Detection - **Authors:** Siwon Kim, Kukjin Choi, Hyun-Soo Choi, Byunghan Lee, Sungroh Yoon - **Venue:** AAAI 2022 (Proceedings of the 36th AAAI Conference on Artificial Intelligence) - **Verification:** Confirmed via AAAI proceedings, arXiv (2109.05257), GitHub (tuslkkk/tadpak). 
- **Relevance:** This is the paper that formally analyzes and critiques the point-adjustment (PA) protocol. It shows PA can overestimate performance and proposes alternatives. This is the standard citation when discussing point-adjust F1. ```bibtex @inproceedings{kim2022towards, title={Towards a Rigorous Evaluation of Time-Series Anomaly Detection}, author={Kim, Siwon and Choi, Kukjin and Choi, Hyun-Soo and Lee, Byunghan and Yoon, Sungroh}, booktitle={Proceedings of the AAAI Conference on Artificial Intelligence}, volume={36}, number={7}, pages={7062--7070}, year={2022} } ``` --- ## ADDITIONAL FOUNDATIONAL PAPERS (VERIFIED) These are highly relevant papers for a NeurIPS STAN paper that I found and verified during the search: ### 7. Hundman et al., 2018 (LSTM + NASA SMAP/MSL datasets) -- VERIFIED This paper introduced the SMAP and MSL benchmark datasets widely used in time series anomaly detection. - **Title:** Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding - **Authors:** Kyle Hundman, Valentino Constantinou, Christopher Laporte, Ian Colwell, Tom Soderstrom - **Venue:** KDD 2018, DOI: 10.1145/3219819.3219845 - arXiv: 1802.04431 ```bibtex @inproceedings{hundman2018detecting, title={Detecting Spacecraft Anomalies Using {LSTMs} and Nonparametric Dynamic Thresholding}, author={Hundman, Kyle and Constantinou, Valentino and Laporte, Christopher and Colwell, Ian and Soderstrom, Tom}, booktitle={Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery \&amp; Data Mining}, pages={387--395}, year={2018}, doi={10.1145/3219819.3219845} } ``` ### 8. Deng &amp; Hooi, 2021 (GDN -- Graph Neural Network for Anomaly Detection) -- VERIFIED - **Title:** Graph Neural Network-Based Anomaly Detection in Multivariate Time Series - **Authors:** Ailin Deng, Bryan Hooi - **Venue:** AAAI 2021, Vol. 35, No. 5, pp. 
4027-4035 ```bibtex @inproceedings{deng2021graph, title={Graph Neural Network-Based Anomaly Detection in Multivariate Time Series}, author={Deng, Ailin and Hooi, Bryan}, booktitle={Proceedings of the AAAI Conference on Artificial Intelligence}, volume={35}, number={5}, pages={4027--4035}, year={2021} } ``` ### 9. Blazquez-Garcia et al., 2021 (Anomaly Detection in Time Series Survey) -- VERIFIED If you need a true survey paper (vs. the InterFusion methods paper): - **Title:** A Review on Outlier/Anomaly Detection in Time Series Data - **Authors:** Ane Blazquez-Garcia, Angel Conde, Usue Mori, Jose A. Lozano - **Venue:** ACM Computing Surveys, Vol. 54, No. 3, Article 56, pp. 1-33, 2021 - DOI: 10.1145/3444690 ```bibtex @article{blazquez2021review, title={A Review on Outlier/Anomaly Detection in Time Series Data}, author={Bl{\'a}zquez-Garc{\'i}a, Ane and Conde, Angel and Mori, Usue and Lozano, Jose A.}, journal={ACM Computing Surveys}, volume={54}, number={3}, pages={1--33}, year={2021}, doi={10.1145/3444690} } ``` ### 10. Malhotra et al., 2015 (First LSTM for Anomaly Detection) -- VERIFIED - **Title:** Long Short Term Memory Networks for Anomaly Detection in Time Series - **Authors:** Pankaj Malhotra, Lovekesh Vig, Gautam Shroff, Puneet Agarwal - **Venue:** ESANN 2015 (European Symposium on Artificial Neural Networks) ```bibtex @inproceedings{malhotra2015long, title={Long Short Term Memory Networks for Anomaly Detection in Time Series}, author={Malhotra, Pankaj and Vig, Lovekesh and Shroff, Gautam and Agarwal, Puneet}, booktitle={European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning}, year={2015} } ``` --- ## VERIFICATION SUMMARY | # | Citation | Verified? | Confidence | |---|----------|-----------|------------| | 1 | Park et al. 2018 (LSTM-VAE) | YES | High -- confirmed on IEEE Xplore with volume/pages/DOI | | 2 | Li et al. 2021 (InterFusion, KDD) | YES | High -- confirmed on ACM DL. 
Note: methods paper, not a survey | | 3 | Tuli et al. 2022 (TranAD, VLDB) | YES | High -- confirmed on ACM DL with volume/pages | | 4 | Siffer et al. 2017 (EVT, KDD) | YES | High -- confirmed on ACM DL with DOI | | 5 | Wu et al. 2021 (Autoformer, NeurIPS) | YES | High -- confirmed on NeurIPS proceedings | | 6 | Kim et al. 2022 (Point-Adjust, AAAI) | YES | High -- confirmed on AAAI proceedings | | 7 | Hundman et al. 2018 (SMAP/MSL, KDD) | YES | High -- confirmed on ACM DL | | 8 | Deng &amp; Hooi 2021 (GDN, AAAI) | YES | High -- confirmed on AAAI proceedings | | 9 | Blazquez-Garcia et al. 2021 (Survey) | YES | High -- confirmed on ACM Computing Surveys | | 10 | Malhotra et al. 2015 (LSTM-AD, ESANN) | YES | High -- confirmed on DBLP and Semantic Scholar | **All 10 citations were successfully verified via web search.** I could not find any that I was unable to verify. ## KEY NOTES FOR YOUR PAPER 1. **"Li et al., 2021" ambiguity:** The repo comment says "survey" but the verified Li et al. 2021 KDD paper (InterFusion) is a methods paper, not a survey. If you want a proper survey, use Blazquez-Garcia et al. 2021 (ACM Computing Surveys) instead, or use both. 2. **Point-adjust F1:** Kim et al. 2022 (AAAI) is the canonical reference for critiquing the PA protocol. If you use PA in your evaluation, you should cite this paper and acknowledge its limitations. You may also want to report results with alternative metrics. 3. **EVT for thresholds:** Siffer et al. 2017 (SPOT/DSPOT) is the canonical reference. OmniAnomaly (already in your bib) also uses EVT internally, so citing Siffer et al. gives the theoretical foundation. 4. **Spectral methods:** Your repo already has FEDformer and TimesNet. Adding Autoformer (Wu et al. 2021) completes the lineage since FEDformer builds on Autoformer's decomposition ideas. 5. 
**Existing bib file location:** `/private/var/folders/t6/_sx_03q50_v9w2jpnq683jmw0000gn/T/claude-run-lb8fr6y4/references.bib`</result> <usage><total_tokens>43175</total_tokens><tool_uses>29</tool_uses><duration_ms>190422</duration_ms></usage> </task-notification>

The citation agent completed and verified all 10 additional references. Let me update the paper's bib file with the verified entries and fix the placeholders.

Now let me update the paper to use the verified citation keys instead of placeholders.

Now I'll replace all PLACEHOLDER citation keys with verified ones.

Also add the point-adjust critique citation and Autoformer to the paper where relevant.

Now recompile to verify everything resolves.

The warnings on first pass are normal (bibtex needs to run first). The final two passes compiled cleanly — 11 pages, no errors. Let me update the verification report to reflect the citation improvements.

All done. Here's the summary:


Deliverables in paper/

| File | What |
|---|---|
| main.tex | Complete NeurIPS 2026 draft, compiles to 11-page PDF |
| main.pdf | Compiled PDF |
| references.bib | 16 citations (8 from repo + 8 web-verified) |
| neurips_2026.sty | Style file |
| VERIFICATION_REPORT.md | Full audit of what's verified vs. flagged |

Paper structure (fits 9-page body limit)

  1. Abstract -- one-sentence contribution, 92.54% avg F1, +2.71 over SOTA, efficiency gains
  2. Introduction -- 3 bullet contributions (spectral branch, temporal branch, cross-attention gate)
  3. Related Work -- 4 paragraphs covering deep AD, frequency methods, contrastive learning, threshold calibration
  4. Methods -- full equations matching src/model.py exactly (FFT selection, spectral attention, causal transformer, gated fusion, NT-Xent loss, EVT calibration)
  5. Experiments -- 6 tables: main results, precision/recall, component ablation, hyperparameter sensitivity, efficiency
  6. Limitations -- honest about point-adjust inflation, single-seed, fixed top-k
  7. Appendix -- NeurIPS checklist + full per-dataset P/R/F1

What you must do before submitting

  1. Figure 1 -- architecture diagram (highest priority; reviewers check this first)
  2. Cross-check baseline numbers against original papers
  3. Spot-check BibTeX fields -- all 8 new citations were web-verified but not DOI-fetched
  4. Fix "Norber" typo in chen2020simclr (likely "Norbert")
  5. Error bars if you can run 3+ seeds before deadline
Claude Sonnet 4.6

Permissions

| Scope | Description |
|---|---|
| filesystem:read | |
| filesystem:write | |
| network:outbound | |

SKILL.md

ML Paper Writing for Top AI Conferences

Expert-level guidance for writing publication-ready papers targeting NeurIPS, ICML, ICLR, ACL, AAAI, and COLM. This skill combines writing philosophy from top researchers (Nanda, Farquhar, Karpathy, Lipton, Steinhardt) with practical tools: LaTeX templates, citation verification APIs, and conference checklists.

Core Philosophy: Collaborative Writing

Paper writing is collaborative, but Claude should be proactive in delivering drafts.

The typical workflow starts with a research repository containing code, results, and experimental artifacts. Claude's role is to:

  1. Understand the project by exploring the repo, results, and existing documentation
  2. Deliver a complete first draft when confident about the contribution
  3. Search literature using web search and APIs to find relevant citations
  4. Refine through feedback cycles when the scientist provides input
  5. Ask for clarification only when genuinely uncertain about key decisions

Key Principle: Be proactive. If the repo and results are clear, deliver a full draft. Don't block waiting for feedback on every section—scientists are busy. Produce something concrete they can react to, then iterate based on their response.


⚠️ CRITICAL: Never Hallucinate Citations

This is the most important rule in academic writing with AI assistance.

The Problem

AI-generated citations have a ~40% error rate. Hallucinated references—papers that don't exist, wrong authors, incorrect years, fabricated DOIs—are a serious form of academic misconduct that can result in desk rejection or retraction.

The Rule

NEVER generate BibTeX entries from memory. ALWAYS fetch programmatically.

| Action | ✅ Correct | ❌ Wrong |
|---|---|---|
| Adding a citation | Search API → verify → fetch BibTeX | Write BibTeX from memory |
| Uncertain about a paper | Mark as [CITATION NEEDED] | Guess the reference |
| Can't find exact paper | Note: "placeholder - verify" | Invent similar-sounding paper |

When You Can't Verify a Citation

If you cannot programmatically verify a citation, you MUST:

% EXPLICIT PLACEHOLDER - requires human verification
\cite{PLACEHOLDER_author2024_verify_this}  % TODO: Verify this citation exists

Always tell the scientist: "I've marked [X] citations as placeholders that need verification. I could not confirm these papers exist."
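Counting the placeholders you report is easy to automate. The sketch below (function name and sample string are mine, not part of the skill) scans a LaTeX source for keys that follow the `PLACEHOLDER_` convention above:

```python
import re

def find_placeholder_cites(tex: str) -> list[str]:
    """Return citation keys marked with the PLACEHOLDER_ convention."""
    return re.findall(r"\\cite\{(PLACEHOLDER_[^}]+)\}", tex)

source = r"Prior work \cite{park2018multimodal} and \cite{PLACEHOLDER_author2024_verify_this}."
print(find_placeholder_cites(source))  # ['PLACEHOLDER_author2024_verify_this']
```

A run over `main.tex` before handing off the draft gives the exact count to report to the scientist.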

Recommended: Install Exa MCP for Paper Search

For the best paper search experience, install Exa MCP which provides real-time academic search:

Claude Code:

claude mcp add exa -- npx -y mcp-remote "https://mcp.exa.ai/mcp"

Cursor / VS Code (add to MCP settings):

{
  "mcpServers": {
    "exa": {
      "type": "http",
      "url": "https://mcp.exa.ai/mcp"
    }
  }
}

Exa MCP enables searches like:

  • "Find papers on RLHF for language models published after 2023"
  • "Search for transformer architecture papers by Vaswani"
  • "Get recent work on sparse autoencoders for interpretability"

Then verify results with Semantic Scholar API and fetch BibTeX via DOI.
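One concrete way to do the "fetch BibTeX via DOI" step is DOI content negotiation: doi.org returns a BibTeX entry when the request asks for `application/x-bibtex`. A minimal sketch, with helper names that are my own (the `fetch_bibtex` call needs network access):

```python
import re
import urllib.request

DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_to_url(doi: str) -> str:
    """Validate a DOI and build the doi.org resolver URL for it."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return f"https://doi.org/{doi}"

def fetch_bibtex(doi: str, timeout: float = 10.0) -> str:
    """Fetch a BibTeX entry via DOI content negotiation (network required)."""
    req = urllib.request.Request(
        doi_to_url(doi),
        headers={"Accept": "application/x-bibtex"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

print(doi_to_url("10.1145/3097983.3098144"))  # https://doi.org/10.1145/3097983.3098144
```

Because the entry comes from the DOI registrar rather than model memory, this path cannot hallucinate a reference; a bad DOI simply fails.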


Workflow 0: Starting from a Research Repository

When beginning paper writing, start by understanding the project:

Project Understanding:
- [ ] Step 1: Explore the repository structure
- [ ] Step 2: Read README, existing docs, and key results
- [ ] Step 3: Identify the main contribution with the scientist
- [ ] Step 4: Find papers already cited in the codebase
- [ ] Step 5: Search for additional relevant literature
- [ ] Step 6: Outline the paper structure together
- [ ] Step 7: Draft sections iteratively with feedback

Step 1: Explore the Repository

# Understand project structure
ls -la
find . -name "*.py" | head -20
find . -name "*.md" -o -name "*.txt" | xargs grep -l -i "result\|conclusion\|finding"

Look for:

  • README.md - Project overview and claims
  • results/, outputs/, experiments/ - Key findings
  • configs/ - Experimental settings
  • Existing .bib files or citation references
  • Any draft documents or notes

Step 2: Identify Existing Citations

Check for papers already referenced in the codebase:

# Find existing citations
grep -r "arxiv\|doi\|cite" --include="*.md" --include="*.bib" --include="*.py"
find . -name "*.bib"

These are high-signal starting points for Related Work—the scientist has already deemed them relevant.

Step 3: Clarify the Contribution

Before writing, explicitly confirm with the scientist:

"Based on my understanding of the repo, the main contribution appears to be [X]. The key results show [Y]. Is this the framing you want for the paper, or should we emphasize different aspects?"

Never assume the narrative—always verify with the human.

Step 4: Search for Additional Literature

Use web search to find relevant papers:

Search queries to try:
- "[main technique] + [application domain]"
- "[baseline method] comparison"
- "[problem name] state-of-the-art"
- Author names from existing citations

Then verify and retrieve BibTeX using the citation workflow below.
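As one hedged example of the verification step, the Semantic Scholar Graph API exposes a paper-search endpoint. The sketch below only builds the request URL and decodes the JSON response; the field names follow the public API, the function names are mine, and `search_papers` needs network access:

```python
import json
import urllib.parse
import urllib.request

def s2_search_url(query: str, limit: int = 5) -> str:
    """Build a Semantic Scholar Graph API paper-search URL."""
    params = urllib.parse.urlencode({
        "query": query,
        "limit": limit,
        "fields": "title,authors,year,venue,externalIds",
    })
    return "https://api.semanticscholar.org/graph/v1/paper/search?" + params

def search_papers(query: str, limit: int = 5, timeout: float = 10.0) -> list[dict]:
    """Run the search and return the list of paper records (network required)."""
    with urllib.request.urlopen(s2_search_url(query, limit), timeout=timeout) as resp:
        return json.load(resp).get("data", [])
```

The `externalIds` field typically includes the DOI, which feeds directly into the BibTeX-fetch step of the citation workflow.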

Step 5: Deliver a First Draft

Be proactive—deliver a complete draft rather than asking permission for each section.

If the repo provides clear results and the contribution is apparent:

  1. Write the full first draft end-to-end
  2. Present the complete draft for feedback
  3. Iterate based on scientist's response

If genuinely uncertain about framing or major claims:

  1. Draft what you can confidently
  2. Flag specific uncertainties: "I framed X as the main contribution—let me know if you'd prefer to emphasize Y instead"
  3. Continue with the draft rather than blocking

Questions to include with the draft (not before):

  • "I emphasized X as the main contribution—adjust if needed"
  • "I highlighted results A, B, C—let me know if others are more important"
  • "Related work section includes [papers]—add any I missed"

When to Use This Skill

Use this skill when:

  • Starting from a research repo to write a paper
  • Drafting or revising specific sections
  • Finding and verifying citations for related work
  • Formatting for conference submission
  • Resubmitting to a different venue (format conversion)
  • Iterating on drafts with scientist feedback

Always remember: First drafts are starting points for discussion, not final outputs.


Balancing Proactivity and Collaboration

Default: Be proactive. Deliver drafts, then iterate.

| Confidence Level | Action |
|---|---|
| High (clear repo, obvious contribution) | Write full draft, deliver, iterate on feedback |
| Medium (some ambiguity) | Write draft with flagged uncertainties, continue |
| Low (major unknowns) | Ask 1-2 targeted questions, then draft |

Draft first, ask with the draft (not before):

| Section | Draft Autonomously | Flag With Draft |
|---|---|---|
| Abstract | Yes | "Framed contribution as X—adjust if needed" |
| Introduction | Yes | "Emphasized problem Y—correct if wrong" |
| Methods | Yes | "Included details A, B, C—add missing pieces" |
| Experiments | Yes | "Highlighted results 1, 2, 3—reorder if needed" |
| Related Work | Yes | "Cited papers X, Y, Z—add any I missed" |

Only block for input when:

  • Target venue is unclear (affects page limits, framing)
  • Multiple contradictory framings seem equally valid
  • Results seem incomplete or inconsistent
  • Explicit request to review before continuing

Don't block for:

  • Word choice decisions
  • Section ordering
  • Which specific results to show (make a choice, flag it)
  • Citation completeness (draft with what you find, note gaps)

The Narrative Principle

The single most critical insight: Your paper is not a collection of experiments—it's a story with one clear contribution supported by evidence.

Every successful ML paper centers on what Neel Nanda calls "the narrative": a short, rigorous, evidence-based technical story with a takeaway readers care about.

Three Pillars (must be crystal clear by end of introduction):

| Pillar | Description | Example |
|---|---|---|
| The What | 1-3 specific novel claims within cohesive theme | "We prove that X achieves Y under condition Z" |
| The Why | Rigorous empirical evidence supporting claims | Strong baselines, experiments distinguishing hypotheses |
| The So What | Why readers should care | Connection to recognized community problems |

If you cannot state your contribution in one sentence, you don't yet have a paper.


Paper Structure Workflow

Workflow 1: Writing a Complete Paper (Iterative)

Copy this checklist and track progress. Each step involves drafting → feedback → revision:

Paper Writing Progress:
- [ ] Step 1: Define the one-sentence contribution (with scientist)
- [ ] Step 2: Draft Figure 1 → get feedback → revise
- [ ] Step 3: Draft abstract → get feedback → revise
- [ ] Step 4: Draft introduction → get feedback → revise
- [ ] Step 5: Draft methods → get feedback → revise
- [ ] Step 6: Draft experiments → get feedback → revise
- [ ] Step 7: Draft related work → get feedback → revise
- [ ] Step 8: Draft limitations → get feedback → revise
- [ ] Step 9: Complete paper checklist (required)
- [ ] Step 10: Final review cycle and submission

Step 1: Define the One-Sentence Contribution

This step requires explicit confirmation from the scientist.

Before writing anything, articulate and verify:

  • What is the single thing your paper contributes?
  • What was not obvious or present before your work?

"I propose framing the contribution as: '[one sentence]'. Does this capture what you see as the main takeaway? Should we adjust the emphasis?"

Step 2: Draft Figure 1

Figure 1 deserves special attention—many readers skip directly to it.

  • Convey core idea, approach, or most compelling result
  • Use vector graphics (PDF/EPS for plots)
  • Write captions that stand alone without main text
  • Ensure readability in black-and-white (8% of men have color vision deficiency)

Step 3: Write Abstract (5-Sentence Formula)

From Sebastian Farquhar (DeepMind):

1. What you achieved: "We introduce...", "We prove...", "We demonstrate..."
2. Why this is hard and important
3. How you do it (with specialist keywords for discoverability)
4. What evidence you have
5. Your most remarkable number/result

Delete generic openings like "Large language models have achieved remarkable success..."

Step 4: Write Introduction (1-1.5 pages max)

Must include:

  • 2-4 bullet contribution list (max 1-2 lines each in two-column format)
  • Clear problem statement
  • Brief approach overview
  • Methods should start by page 2-3 maximum

Step 5: Methods Section

Enable reimplementation:

  • Conceptual outline or pseudocode
  • All hyperparameters listed
  • Architectural details sufficient for reproduction
  • Present final design decisions; ablations go in experiments

Step 6: Experiments Section

For each experiment, explicitly state:

  • What claim it supports
  • How it connects to main contribution
  • Experimental setting (details in appendix)
  • What to observe: "the blue line shows X, which demonstrates Y"

Requirements:

  • Error bars with methodology (standard deviation vs standard error)
  • Hyperparameter search ranges
  • Compute infrastructure (GPU type, total hours)
  • Seed-setting methods
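For the error-bar requirement, a minimal sketch of reporting mean ± standard deviation alongside the standard error across seeds (the helper name and the F1 values are illustrative, not from the skill):

```python
import statistics

def summarize_runs(scores: list[float]) -> str:
    """Report mean ± sample standard deviation and standard error across seeds."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)      # sample std (n-1 denominator)
    sem = std / len(scores) ** 0.5      # standard error of the mean
    return f"{mean:.2f} ± {std:.2f} (std), ± {sem:.2f} (sem), n={len(scores)}"

f1_by_seed = [92.1, 92.8, 92.4]         # e.g. runs with seeds 42, 43, 44
print(summarize_runs(f1_by_seed))       # 92.43 ± 0.35 (std), ± 0.20 (sem), n=3
```

Whichever statistic you report, state it explicitly in the caption, since std and sem differ by a factor of √n.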

Step 7: Related Work

Organize methodologically, not paper-by-paper:

Good: "One line of work uses Floogledoodle's assumption [refs] whereas we use Doobersnoddle's assumption because..."

Bad: "Snap et al. introduced X while Crackle et al. introduced Y."

Cite generously—reviewers likely authored relevant papers.

Step 8: Limitations Section (REQUIRED)

All major conferences require this. Counter-intuitively, honesty helps:

  • Reviewers are instructed not to penalize honest limitation acknowledgment
  • Pre-empt criticisms by identifying weaknesses first
  • Explain why limitations don't undermine core claims

Step 9: Paper Checklist

NeurIPS, ICML, and ICLR all require paper checklists. See references/checklists.md.


Writing Philosophy for Top ML Conferences

This section distills the most important writing principles from leading ML researchers. These aren't optional style suggestions—they're what separates accepted papers from rejected ones.

"A paper is a short, rigorous, evidence-based technical story with a takeaway readers care about." — Neel Nanda

The Sources Behind This Guidance

This skill synthesizes writing philosophy from researchers who have published extensively at top venues:

| Source | Key Contribution | Link |
|---|---|---|
| Neel Nanda (Google DeepMind) | The Narrative Principle, What/Why/So What framework | How to Write ML Papers |
| Sebastian Farquhar (DeepMind) | 5-sentence abstract formula | How to Write ML Papers |
| Gopen & Swan | 7 principles of reader expectations | Science of Scientific Writing |
| Zachary Lipton | Word choice, eliminating hedging | Heuristics for Scientific Writing |
| Jacob Steinhardt (UC Berkeley) | Precision, consistent terminology | Writing Tips |
| Ethan Perez (Anthropic) | Micro-level clarity tips | Easy Paper Writing Tips |
| Andrej Karpathy | Single contribution focus | Various lectures |

For deeper dives into any of these, see references/writing-guide.md.

Time Allocation (From Neel Nanda)

Spend approximately equal time on each of:

  1. The abstract
  2. The introduction
  3. The figures
  4. Everything else combined

Why? Most reviewers form judgments before reaching your methods. Readers encounter your paper as: title → abstract → introduction → figures → maybe the rest.

Writing Style Guidelines

Sentence-Level Clarity (Gopen & Swan's 7 Principles)

These principles are based on how readers actually process prose. Violating them forces readers to spend cognitive effort on structure rather than content.

| Principle | Rule | Example |
|---|---|---|
| Subject-verb proximity | Keep subject and verb close | ❌ "The model, which was trained on..., achieves" → ✅ "The model achieves... after training on..." |
| Stress position | Place emphasis at sentence ends | ❌ "Accuracy improves by 15% when using attention" → ✅ "When using attention, accuracy improves by 15%" |
| Topic position | Put context first, new info after | ✅ "Given these constraints, we propose..." |
| Old before new | Familiar info → unfamiliar info | Link backward, then introduce new |
| One unit, one function | Each paragraph makes one point | Split multi-point paragraphs |
| Action in verb | Use verbs, not nominalizations | ❌ "We performed an analysis" → ✅ "We analyzed" |
| Context before new | Set stage before presenting | Explain before showing equation |

Full 7 principles with detailed examples: See references/writing-guide.md

Micro-Level Tips (Ethan Perez)

These small changes accumulate into significantly clearer prose:

  • Minimize pronouns: ❌ "This shows..." → ✅ "This result shows..."
  • Verbs early: Position verbs near sentence start
  • Unfold apostrophes: ❌ "X's Y" → ✅ "The Y of X" (when awkward)
  • Delete filler words: "actually," "a bit," "very," "really," "basically," "quite," "essentially"

Full micro-tips with examples: See references/writing-guide.md
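Filler words are easy to catch mechanically before a final pass. A minimal sketch, assuming the word list above (the function name and word set are illustrative, not part of this skill):

```python
import re

# Filler words from the list above; extend as needed.
FILLERS = {"actually", "a bit", "very", "really", "basically", "quite", "essentially"}

def find_fillers(text: str) -> list[str]:
    """Return each filler word found in text, in order of appearance."""
    lowered = text.lower()
    found = []
    for word in FILLERS:
        for match in re.finditer(r"\b" + re.escape(word) + r"\b", lowered):
            found.append((match.start(), word))
    return [w for _, w in sorted(found)]

print(find_fillers("The model is actually very accurate and quite fast."))
# -> ['actually', 'very', 'quite']
```

Flagged words still need human judgment: "very" is sometimes load-bearing, so review each hit rather than deleting blindly.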

Word Choice (Zachary Lipton)

  • Be specific: ❌ "performance" → ✅ "accuracy" or "latency" (say what you mean)
  • Eliminate hedging: Drop "may" and "can" unless genuinely uncertain
  • Avoid incremental vocabulary: ❌ "combine," "modify," "expand" → ✅ "develop," "propose," "introduce"
  • Delete intensifiers: ❌ "provides very tight approximation" → ✅ "provides tight approximation"

Precision Over Brevity (Jacob Steinhardt)

  • Consistent terminology: Different terms for same concept creates confusion. Pick one and stick with it.
  • State assumptions formally: Before theorems, list all assumptions explicitly
  • Intuition + rigor: Provide intuitive explanations alongside formal proofs

What Reviewers Actually Read

Understanding reviewer behavior helps prioritize your effort:

| Paper Section | % of Reviewers Who Read | Implication |
| --- | --- | --- |
| Abstract | 100% | Must be perfect |
| Introduction | 90%+ (skimmed) | Front-load contribution |
| Figures | Examined before methods | Figure 1 is critical |
| Methods | Only if interested | Don't bury the lede |
| Appendix | Rarely | Put only supplementary details |

Bottom line: If your abstract and intro don't hook reviewers, they may never read your brilliant methods section.


Conference Requirements Quick Reference

| Conference | Page Limit | Extra for Camera-Ready | Key Requirement |
| --- | --- | --- | --- |
| NeurIPS 2025 | 9 pages | +0 | Mandatory checklist, lay summary for accepted |
| ICML 2026 | 8 pages | +1 | Broader Impact Statement required |
| ICLR 2026 | 9 pages | +1 | LLM disclosure required, reciprocal reviewing |
| ACL 2025 | 8 pages (long) | varies | Limitations section mandatory |
| AAAI 2026 | 7 pages | +1 | Strict style file adherence |
| COLM 2025 | 9 pages | +1 | Focus on language models |

Universal Requirements:

  • Double-blind review (anonymize submissions)
  • References don't count toward page limit
  • Appendices unlimited but reviewers not required to read
  • LaTeX required for all venues

LaTeX Templates: See templates/ directory for all conference templates.


Using LaTeX Templates Properly

Workflow 4: Starting a New Paper from Template

Always copy the entire template directory first, then write within it.

Template Setup Checklist:
- [ ] Step 1: Copy entire template directory to new project
- [ ] Step 2: Verify template compiles as-is (before any changes)
- [ ] Step 3: Read the template's example content to understand structure
- [ ] Step 4: Replace example content section by section
- [ ] Step 5: Keep template comments/examples as reference until done
- [ ] Step 6: Clean up template artifacts only at the end

Step 1: Copy the Full Template

# Create your paper directory with the complete template
cp -r templates/neurips2025/ ~/papers/my-new-paper/
cd ~/papers/my-new-paper/

# Verify structure is complete
ls -la
# Should see: main.tex, neurips.sty, Makefile, etc.

⚠️ IMPORTANT: Copy the ENTIRE directory, not just main.tex. Templates include:

  • Style files (.sty) - required for compilation
  • Bibliography styles (.bst) - required for references
  • Example content - useful as reference
  • Makefiles - for easy compilation

Step 2: Verify Template Compiles First

Before making ANY changes, compile the template as-is:

# Using latexmk (recommended)
latexmk -pdf main.tex

# Or manual compilation
pdflatex main.tex
bibtex main
pdflatex main.tex
pdflatex main.tex

If the unmodified template doesn't compile, fix that first. Common issues:

  • Missing TeX packages → install via tlmgr install <package>
  • Wrong TeX distribution → use TeX Live (recommended)

Step 3: Keep Template Content as Reference

Don't immediately delete all example content. Instead:

% KEEP template examples commented out as you write
% This shows you the expected format

% Template example (keep for reference):
% \begin{figure}[t]
%   \centering
%   \includegraphics[width=0.8\linewidth]{example-image}
%   \caption{Template shows caption style}
% \end{figure}

% Your actual figure:
\begin{figure}[t]
  \centering
  \includegraphics[width=0.8\linewidth]{your-figure.pdf}
  \caption{Your caption following the same style.}
\end{figure}

Step 4: Replace Content Section by Section

Work through the paper systematically:

Replacement Order:
1. Title and authors (anonymize for submission)
2. Abstract
3. Introduction
4. Methods
5. Experiments
6. Related Work
7. Conclusion
8. References (your .bib file)
9. Appendix

For each section:

  1. Read the template's example content
  2. Note any special formatting or macros used
  3. Replace with your content following the same patterns
  4. Compile frequently to catch errors early

Step 5: Use Template Macros

Templates often define useful macros. Check the preamble for:

% Common template macros to use:
\newcommand{\method}{YourMethodName}  % Consistent method naming
\newcommand{\eg}{e.g.,\xspace}        % Proper abbreviations
\newcommand{\ie}{i.e.,\xspace}
\newcommand{\etal}{\textit{et al.}\xspace}

Step 6: Clean Up Only at the End

Only remove template artifacts when paper is nearly complete:

% BEFORE SUBMISSION - remove these:
% - Commented-out template examples
% - Unused packages
% - Template's example figures/tables
% - Lorem ipsum or placeholder text

% KEEP these:
% - All style files (.sty)
% - Bibliography style (.bst)
% - Required packages from template
% - Any custom macros you're using

Template Pitfalls to Avoid

| Pitfall | Problem | Solution |
| --- | --- | --- |
| Copying only main.tex | Missing .sty, won't compile | Copy entire directory |
| Modifying .sty files | Breaks conference formatting | Never edit style files |
| Adding random packages | Conflicts, breaks template | Only add if necessary |
| Deleting template content too early | Lose formatting reference | Keep as comments until done |
| Not compiling frequently | Errors accumulate | Compile after each section |

Quick Template Reference

| Conference | Main File | Key Style File | Notes |
| --- | --- | --- | --- |
| NeurIPS 2025 | main.tex | neurips.sty | Has Makefile |
| ICML 2026 | example_paper.tex | icml2026.sty | Includes algorithm packages |
| ICLR 2026 | iclr2026_conference.tex | iclr2026_conference.sty | Has math_commands.tex |
| ACL | acl_latex.tex | acl.sty | Strict formatting |
| AAAI 2026 | aaai2026-unified-template.tex | aaai2026.sty | Very strict compliance |
| COLM 2025 | colm2025_conference.tex | colm2025_conference.sty | Similar to ICLR |

Conference Resubmission & Format Conversion

When a paper is rejected or withdrawn from one venue and resubmitted to another, format conversion is required. This is a common workflow in ML research.

Workflow 3: Converting Between Conference Formats

Format Conversion Checklist:
- [ ] Step 1: Identify source and target template differences
- [ ] Step 2: Create new project with target template
- [ ] Step 3: Copy content sections (not preamble)
- [ ] Step 4: Adjust page limits and content
- [ ] Step 5: Update conference-specific requirements
- [ ] Step 6: Verify compilation and formatting

Step 1: Key Template Differences

| From → To | Page Change | Key Adjustments |
| --- | --- | --- |
| NeurIPS → ICML | 9 → 8 pages | Cut 1 page, add Broader Impact if missing |
| ICML → ICLR | 8 → 9 pages | Can expand experiments, add LLM disclosure |
| NeurIPS → ACL | 9 → 8 pages | Restructure for NLP conventions, add Limitations |
| ICLR → AAAI | 9 → 7 pages | Significant cuts needed, strict style adherence |
| Any → COLM | varies → 9 | Reframe for language model focus |

Step 2: Content Migration (NOT Template Merge)

Never copy LaTeX preambles between templates. Instead:

# 1. Start fresh with target template
cp -r templates/icml2026/ new_submission/

# 2. Copy ONLY content sections from old paper
# - Abstract text
# - Section content (between \section{} commands)
# - Figures and tables
# - Bibliography entries

# 3. Paste into target template structure

Step 3: Adjusting for Page Limits

When cutting pages (e.g., NeurIPS 9 → AAAI 7):

  • Move detailed proofs to appendix
  • Condense related work (cite surveys instead of individual papers)
  • Combine similar experiments into unified tables
  • Use smaller figure sizes with subfigures
  • Tighten writing: eliminate redundancy, use active voice
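Moving a proof to the appendix usually means leaving a theorem statement and a pointer in the main text. A sketch (the theorem content and label names are illustrative):

```latex
% Main text: state the theorem, defer the proof.
\begin{theorem}\label{thm:main}
Under Assumptions 1--2, the estimator converges at rate $O(1/\sqrt{n})$.
\end{theorem}
The proof is deferred to Appendix~\ref{app:proofs}.

% After \appendix:
\section{Deferred Proofs}\label{app:proofs}
\begin{proof}[Proof of Theorem~\ref{thm:main}]
% ... full argument ...
\end{proof}
```

Keep a one-sentence proof intuition in the main text so the result remains convincing without the appendix.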

When expanding (e.g., ICML 8 → ICLR 9):

  • Add ablation studies reviewers requested
  • Expand limitations discussion
  • Include additional baselines
  • Add qualitative examples

Step 4: Conference-Specific Adjustments

| Target Venue | Required Additions |
| --- | --- |
| ICML | Broader Impact Statement (after conclusion) |
| ICLR | LLM usage disclosure, reciprocal reviewing agreement |
| ACL/EMNLP | Limitations section (mandatory), Ethics Statement |
| AAAI | Strict adherence to style file (no modifications) |
| NeurIPS | Paper checklist (appendix), lay summary if accepted |

Step 5: Update References

% Remove self-citations that reveal identity (for blind review)
% Update any "under review" citations to published versions
% Add new relevant work published since last submission

Step 6: Addressing Previous Reviews

When resubmitting after rejection:

  • Do address reviewer concerns in the new version
  • Do add experiments/clarifications reviewers requested
  • Don't include a "changes from previous submission" section (blind review)
  • Don't reference the previous submission or reviews

Common Conversion Pitfalls:

  • ❌ Copying \usepackage commands (causes conflicts)
  • ❌ Keeping old conference header/footer commands
  • ❌ Forgetting to update \bibliography{} path
  • ❌ Missing conference-specific required sections
  • ❌ Exceeding page limit after format change
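A quick scan of the .tex source catches leftover style-file loads from the old template (a hypothetical helper; `stale_template_lines` and the style-name list are illustrative, not part of this skill):

```python
import re

# Style-file name fragments from common ML templates; extend for your venues.
TEMPLATE_STYLES = ["neurips", "icml", "iclr", "colm", "aaai", "acl"]

def stale_template_lines(tex: str, target: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that load a template other than `target`."""
    hits = []
    for num, line in enumerate(tex.splitlines(), start=1):
        m = re.search(r"\\usepackage(?:\[[^\]]*\])?\{([^}]*)\}", line)
        if not m:
            continue
        pkg = m.group(1)
        for style in TEMPLATE_STYLES:
            if style in pkg and target not in pkg:
                hits.append((num, line.strip()))
                break
    return hits

old = "\\documentclass{article}\n\\usepackage[final]{neurips_2025}\n\\usepackage{booktabs}\n"
print(stale_template_lines(old, "icml"))
# -> [(2, '\\usepackage[final]{neurips_2025}')]
```

Run it on the migrated main.tex before the first compile; a clean result does not replace a visual check of headers and footers.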

Citation Workflow (Hallucination Prevention)

⚠️ CRITICAL: AI-generated citations have ~40% error rate. Never write BibTeX from memory.

The Golden Rule

IF you cannot programmatically fetch a citation:
    → Mark it as [CITATION NEEDED] or [PLACEHOLDER - VERIFY]
    → Tell the scientist explicitly
    → NEVER invent a plausible-sounding reference

Workflow 2: Adding Citations

Citation Verification (MANDATORY for every citation):
- [ ] Step 1: Search using Exa MCP or Semantic Scholar API
- [ ] Step 2: Verify paper exists in 2+ sources (Semantic Scholar + arXiv/CrossRef)
- [ ] Step 3: Retrieve BibTeX via DOI (programmatically, not from memory)
- [ ] Step 4: Verify the claim you're citing actually appears in the paper
- [ ] Step 5: Add verified BibTeX to bibliography
- [ ] Step 6: If ANY step fails → mark as placeholder, inform scientist

Step 0: Use Exa MCP for Initial Search (Recommended)

If Exa MCP is installed, use it to find relevant papers:

Search: "RLHF language model alignment 2023"
Search: "sparse autoencoders interpretability"
Search: "attention mechanism transformers Vaswani"

Then verify each result with Semantic Scholar and fetch BibTeX via DOI.

Step 1: Search Semantic Scholar

from semanticscholar import SemanticScholar

sch = SemanticScholar()
results = sch.search_paper("attention mechanism transformers", limit=5)
for paper in results:
    print(f"{paper.title} - {paper.paperId}")
    # externalIds can be None for some records, so guard before .get()
    doi = (paper.externalIds or {}).get("DOI", "N/A")
    print(f"  DOI: {doi}")

Step 2: Verify Existence

Confirm paper appears in at least two sources (Semantic Scholar + CrossRef/arXiv).
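Exact string equality on titles is brittle across sources (capitalization, punctuation, and accents differ between Semantic Scholar, arXiv, and CrossRef records). A small normalizer helps when cross-checking; this is a sketch, and the normalization rules are an assumption rather than part of any API:

```python
import re
import unicodedata

def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    # Decompose accented characters, then drop the combining marks.
    title = unicodedata.normalize("NFKD", title)
    title = "".join(c for c in title if not unicodedata.combining(c))
    title = re.sub(r"[^a-z0-9\s]", " ", title.lower())
    return re.sub(r"\s+", " ", title).strip()

def titles_match(a: str, b: str) -> bool:
    return normalize_title(a) == normalize_title(b)

print(titles_match("Attention Is All You Need", "attention is all you need."))
# -> True
```

If normalized titles match but author lists differ, treat the citation as unverified and flag it.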

Step 3: Retrieve BibTeX via DOI

import requests

def doi_to_bibtex(doi: str) -> str:
    """Get verified BibTeX via doi.org content negotiation (CrossRef or DataCite)."""
    response = requests.get(
        f"https://doi.org/{doi}",
        headers={"Accept": "application/x-bibtex"},
        timeout=30,
    )
    response.raise_for_status()
    return response.text

# Example
bibtex = doi_to_bibtex("10.48550/arXiv.1706.03762")
print(bibtex)

Step 4: Verify Claims

Before citing for a specific claim, access the paper and confirm the attributed claim actually appears.

Step 5: Handle Failures Explicitly

If you cannot verify a citation at ANY step:

% Option 1: Explicit placeholder
\cite{PLACEHOLDER_smith2023_verify}  % TODO: Could not verify - scientist must confirm

% Option 2: Note in text
... as shown in prior work [CITATION NEEDED - could not verify Smith et al. 2023].

Always inform the scientist:

"I could not verify the following citations and have marked them as placeholders:

  • Smith et al. 2023 on reward hacking - could not find in Semantic Scholar
  • Jones 2022 on scaling laws - found a similar paper but with different authors

Please verify these before submission."

Summary: Citation Rules

| Situation | Action |
| --- | --- |
| Found paper, got DOI, fetched BibTeX | ✅ Use the citation |
| Found paper, no DOI | ✅ Use arXiv BibTeX or manual entry from paper |
| Paper exists but can't fetch BibTeX | ⚠️ Mark placeholder, inform scientist |
| Uncertain if paper exists | ❌ Mark [CITATION NEEDED], inform scientist |
| "I think there's a paper about X" | NEVER cite - search first or mark placeholder |

🚨 NEVER generate BibTeX from memory—always fetch programmatically. 🚨

See references/citation-workflow.md for complete API documentation.


Common Issues and Solutions

Issue: Abstract too generic

Delete first sentence if it could be prepended to any ML paper. Start with your specific contribution.

Issue: Introduction exceeds 1.5 pages

Split background into Related Work. Front-load contribution bullets. Methods should start by page 2-3.

Issue: Experiments lack explicit claims

Add sentence before each experiment: "This experiment tests whether [specific claim]..."

Issue: Reviewers find paper hard to follow

  • Add explicit signposting: "In this section, we show X"
  • Use consistent terminology throughout
  • Include figure captions that stand alone

Issue: Missing statistical significance

Always include:

  • Error bars (specify: std dev or std error)
  • Number of runs
  • Statistical tests if comparing methods
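Computing mean ± standard error over runs needs only the standard library. A minimal sketch (the run values are made up for illustration):

```python
import math
import statistics

def summarize_runs(scores: list[float]) -> str:
    """Format mean ± standard error over independent runs."""
    mean = statistics.mean(scores)
    # Standard error = sample std dev / sqrt(n); always say which you report.
    stderr = statistics.stdev(scores) / math.sqrt(len(scores))
    return f"{mean:.1f} ± {stderr:.1f} (n={len(scores)} runs, std. error)"

print(summarize_runs([85.2, 86.1, 84.7, 85.9, 85.6]))
```

Standard error shrinks with more runs while standard deviation does not, so mislabeling one as the other changes how strong your result looks.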

Reviewer Evaluation Criteria

Reviewers assess papers on four dimensions:

| Criterion | What Reviewers Look For |
| --- | --- |
| Quality | Technical soundness, well-supported claims |
| Clarity | Clear writing, reproducible by experts |
| Significance | Community impact, advances understanding |
| Originality | New insights (doesn't require new method) |

Scoring (NeurIPS 6-point scale):

  • 6: Strong Accept - Groundbreaking, flawless
  • 5: Accept - Technically solid, high impact
  • 4: Borderline Accept - Solid, limited evaluation
  • 3: Borderline Reject - Solid but weaknesses outweigh
  • 2: Reject - Technical flaws
  • 1: Strong Reject - Known results or ethics issues

See references/reviewer-guidelines.md for detailed reviewer instructions.


Tables and Figures

Tables

Use booktabs LaTeX package for professional tables:

\usepackage{booktabs}
\begin{tabular}{lrr}
\toprule
Method & Accuracy $\uparrow$ & Latency (ms) $\downarrow$ \\
\midrule
Baseline & 85.2 & 45 \\
\textbf{Ours} & \textbf{92.1} & \textbf{38} \\
\bottomrule
\end{tabular}

Rules:

  • Bold best value per metric
  • Include direction symbols (↑ higher is better, ↓ lower is better)
  • Right-align numerical columns
  • Consistent decimal precision
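Generating result tables from data, rather than editing LaTeX by hand, keeps decimal precision consistent across rows. A sketch (the data is illustrative, and for simplicity it bolds the whole winning row rather than the best value per metric):

```python
def booktabs_table(rows: list[tuple[str, float, float]], best: str) -> str:
    """Emit a booktabs tabular; bolds the row named `best`."""
    lines = [r"\begin{tabular}{lrr}", r"\toprule",
             r"Method & Accuracy $\uparrow$ & Latency (ms) $\downarrow$ \\", r"\midrule"]
    for name, acc, lat in rows:
        cells = [name, f"{acc:.1f}", f"{lat:.0f}"]
        if name == best:
            cells = [rf"\textbf{{{c}}}" for c in cells]
        lines.append(" & ".join(cells) + r" \\")
    lines += [r"\bottomrule", r"\end{tabular}"]
    return "\n".join(lines)

print(booktabs_table([("Baseline", 85.2, 45.0), ("Ours", 92.1, 38.0)], best="Ours"))
```

Regenerating the table whenever results change also prevents stale numbers surviving into the camera-ready.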

Figures

  • Vector graphics (PDF, EPS) for all plots and diagrams
  • Raster (PNG 600 DPI) only for photographs
  • Use colorblind-safe palettes (Okabe-Ito or Paul Tol)
  • Verify grayscale readability (8% of men have color vision deficiency)
  • No title inside figure—the caption serves this function
  • Self-contained captions—reader should understand without main text
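The grayscale check can be done numerically: convert each palette color to relative luminance and confirm the values are spread apart. A sketch (the Okabe-Ito hex values are the standard palette; the spacing threshold is an assumption):

```python
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
             "#0072B2", "#D55E00", "#CC79A7", "#000000"]

def luminance(hex_color: str) -> float:
    """Approximate relative luminance (Rec. 601 weights), 0 = black, 1 = white."""
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5))
    return 0.299 * r + 0.587 * g + 0.114 * b

def grayscale_distinct(colors: list[str], min_gap: float = 0.05) -> bool:
    """True if adjacent luminances are separated by at least min_gap."""
    lums = sorted(luminance(c) for c in colors)
    return all(b - a >= min_gap for a, b in zip(lums, lums[1:]))

print([round(luminance(c), 2) for c in OKABE_ITO])
```

Note that some Okabe-Ito colors have similar luminance, so for line plots also vary markers or line styles rather than relying on color alone.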

References & Resources

Reference Documents (Deep Dives)

| Document | Contents |
| --- | --- |
| writing-guide.md | Gopen & Swan 7 principles, Ethan Perez micro-tips, word choice |
| citation-workflow.md | Citation APIs, Python code, BibTeX management |
| checklists.md | NeurIPS 16-item, ICML, ICLR, ACL requirements |
| reviewer-guidelines.md | Evaluation criteria, scoring, rebuttals |
| sources.md | Complete bibliography of all sources |

LaTeX Templates

Templates in templates/ directory: ICML 2026, ICLR 2026, NeurIPS 2025, ACL/EMNLP, AAAI 2026, COLM 2025.

Compiling to PDF:

  • VS Code/Cursor: Install LaTeX Workshop extension + TeX Live → Save to auto-compile
  • Command line: latexmk -pdf main.tex or pdflatex + bibtex workflow
  • Online: Upload to Overleaf

See templates/README.md for detailed setup instructions.

Key External Sources

Writing Philosophy: see references/sources.md for links to the source essays.

APIs: Semantic Scholar | CrossRef | arXiv

Venues: NeurIPS | ICML | ICLR | ACL

FAQ

What does ml-paper-writing do?

Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verification workflows.

When should I use ml-paper-writing?

Use it when you need a repeatable workflow that produces a text report, a downloadable file, and a PDF document.

What does ml-paper-writing output?

In the evaluated run it produced a text report, a downloadable file, and a PDF document.

How do I install or invoke ml-paper-writing?

npx skills add https://github.com/davila7/claude-code-templates --skill ml-paper-writing

Which agents does ml-paper-writing support?

Claude Code

What tools, channels, or permissions does ml-paper-writing need?

It uses no extra tools. Channels commonly include text, file, and PDF; permissions include filesystem:read, filesystem:write, and network:outbound.

Is ml-paper-writing safe to install?

Static analysis marked this skill as medium risk; review side effects and permissions before enabling it.

How is ml-paper-writing different from an MCP or plugin?

A skill packages instructions and workflow conventions; tools, MCP servers, and plugins are dependencies the skill may call during execution.


About ml-paper-writing

When to use ml-paper-writing

  • You have a research repo and want the agent to produce a full first draft of a paper.
  • You need help finding and verifying citations for related work without fabricating references.
  • You want to adapt or polish a LaTeX paper for venues like NeurIPS, ICML, ICLR, ACL, AAAI, or COLM.

When ml-paper-writing is not the right choice

  • You need automated submission or publishing to conference systems.
  • You need domain-specific experimental execution rather than writing and paper-structuring support.

What it produces

Produces a text report, a downloadable file, and a PDF document.

Install

npx skills add https://github.com/davila7/claude-code-templates --skill ml-paper-writing

Invoke: Ask Claude Code to use ml-paper-writing for the task.