Scrape arXiv with AI
Extract arXiv research papers, abstracts, author data, and citation info using Apify and Claude Code.
How scraped arXiv data flows across your company
One scrape generates intelligence for every department — automatically
- → Identify prospects from scraped data
- → Track competitor activity
- → Source outreach targets
- → Build lead lists
- → Content research and ideation
- → Competitor strategy analysis
- → Trend monitoring
- → Audience insights
- → Market sizing and analysis
- → Engagement benchmarking
- → Growth opportunity identification
- → Platform trend tracking
- → Data records stored
- → Engagement metrics indexed
- → Source attribution tagged
- → Historical data tracked
Cancel your Semantic Scholar Pro subscription
Semantic Scholar Pro
- × Subscription fees
- × Data locked in their dashboard
- × Per-seat pricing
- × Export limits
SoloStack + Claude Code
- ✓ Pay-per-use, no subscription
- ✓ Your data in your repo
- ✓ Zero vendor lock-in
- ✓ Unlimited exports
What this skill file teaches Claude
Drop one markdown file into your repo. Claude Code learns how to run this entire workflow.
Data Extraction
Pull key data points from arXiv, including papers, abstracts, author data, and metadata.
Search & Filter
Search by keywords, categories, or specific URLs to target exactly what you need.
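As a rough illustration of keyword and category targeting, here is a minimal sketch that builds a search URL for the official arXiv API. It assumes the public query syntax (`all:` and `cat:` field prefixes, `start`, `max_results`, `sortBy`, `sortOrder`), not the input schema of the Apify actor, which is not shown on this page. The function name `build_query_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the official arXiv API endpoint, not the Apify actor.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(keywords, categories=(), start=0, max_results=100):
    """Return an arXiv API URL matching every keyword within any listed category."""
    # Each keyword becomes a quoted phrase searched across all fields.
    query = " AND ".join(f'all:"{kw}"' for kw in keywords)
    if categories:
        cats = " OR ".join(f"cat:{c}" for c in categories)
        query = f"({query}) AND ({cats})" if query else cats
    params = {
        "search_query": query,
        "start": start,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url(["diffusion models"], ["cs.LG", "cs.CV"], max_results=50))
```

Sorting by submission date, newest first, is one reasonable default for monitoring; relevance sorting may suit one-off searches better.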
Engagement Metrics
Capture the signals available for each paper, such as citation info. arXiv itself has no views, likes, shares, or comments.
Bulk Processing
Process hundreds or thousands of records in a single run with automatic pagination.
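How the scraper paginates internally is not described here. The sketch below shows one common approach, offset-based paging, under the assumption that the data source accepts a start offset and a page size (as the official arXiv API does with `start` and `max_results`). `fetch_page` is a hypothetical callable you would supply.

```python
def page_offsets(total, page_size=100):
    """Yield (start, max_results) pairs that together cover `total` records."""
    for start in range(0, total, page_size):
        yield start, min(page_size, total - start)


def fetch_all(fetch_page, total, page_size=100):
    """Collect records page by page.

    `fetch_page(start, max_results)` is a hypothetical callable that returns
    a list of records for one page.
    """
    records = []
    for start, size in page_offsets(total, page_size):
        batch = fetch_page(start, size)
        if not batch:  # stop early if the source runs out of results
            break
        records.extend(batch)
    return records


if __name__ == "__main__":
    data = list(range(250))  # stand-in for a remote result set
    got = fetch_all(lambda s, n: data[s:s + n], total=1000)
    print(len(got))
```

Stopping on the first empty page keeps the loop safe when the requested total exceeds what the source actually holds.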
Export & Integration
Output clean JSON ready for CRM import, analysis, or integration with other tools.
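One way to get "clean JSON" is to deduplicate by arXiv ID before export, since the same paper can appear under several version suffixes (`v1`, `v2`, ...). This is a sketch only: the record fields `id` and `title` are assumptions, and the actor's real output schema may differ.

```python
import json
import re

_VERSION = re.compile(r"v(\d+)$")


def base_id(arxiv_id):
    """Strip any URL prefix and version suffix: '.../abs/2101.00001v2' -> '2101.00001'."""
    return _VERSION.sub("", arxiv_id.rsplit("/abs/", 1)[-1])


def dedupe(records):
    """Keep one record per paper, preferring the highest version number."""
    best = {}
    for rec in records:
        match = _VERSION.search(rec["id"])
        version = int(match.group(1)) if match else 0
        key = base_id(rec["id"])
        if key not in best or version > best[key][0]:
            best[key] = (version, rec)
    return [rec for _, rec in best.values()]


def to_json(records):
    """Deduplicate, collapse stray whitespace in titles, and serialize."""
    cleaned = [{**r, "title": " ".join(r["title"].split())} for r in dedupe(records)]
    return json.dumps(cleaned, indent=2, ensure_ascii=False)
```

Titles in arXiv metadata often contain line breaks from the original submission, which is why the whitespace is collapsed before serializing.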
epctex/arxiv-scraper · ~$0.10 per 1,000 papers
Build it with plain English
Tell Claude Code what to do. It handles the rest.
Processing arXiv data...
✓ Data extracted successfully
✓ 234 records collected
✓ Cleaned and deduplicated
✓ Ready for CRM import
Data saved to scrape-arxiv-results.json

Processing arXiv data...
✓ Data extracted successfully
✓ 567 records collected
✓ Cleaned and deduplicated
✓ Ready for CRM import
Data saved to scrape-arxiv-results.json

Processing arXiv data...
✓ Data extracted successfully
✓ 89 records collected
✓ Cleaned and deduplicated
✓ Ready for CRM import
Data saved to scrape-arxiv-results.json
What you can build with this
Research trend monitoring
Track new paper volume by topic to identify emerging research areas before they hit mainstream.
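A minimal sketch of volume tracking: count papers per category per month, then compare consecutive months. The field names `published` (an ISO timestamp) and `categories` (a list of arXiv category codes) are assumptions about the scraped records, not a documented schema.

```python
from collections import Counter


def monthly_counts(records):
    """Count papers per (category, 'YYYY-MM') pair."""
    counts = Counter()
    for rec in records:
        month = rec["published"][:7]  # 'YYYY-MM' from an ISO timestamp
        for cat in rec["categories"]:
            counts[(cat, month)] += 1
    return counts


def growth(counts, category, prev_month, month):
    """Relative change in paper volume between two months, or None if no baseline."""
    prev = counts[(category, prev_month)]
    cur = counts[(category, month)]
    return None if prev == 0 else (cur - prev) / prev
```

A paper listed in several categories is counted once in each, so category totals can exceed the number of distinct papers.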
Competitive R&D tracking
Monitor papers from competitor company researchers to understand their R&D direction.
AI/ML trend analysis
arXiv is ground zero for AI research. Track model announcements, benchmark results, and new techniques.
Content creation
Summarize trending research for non-technical audiences in blog posts and newsletters.
Things to know
arXiv has a public API and bulk data access. Use official channels when possible.
Paper quality varies — arXiv is a preprint server with no peer review.
Citation counts lag publication by months. Use for trend detection, not impact measurement.
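For the official channel mentioned above, the public arXiv API returns results as an Atom XML feed. The sketch below parses such a feed with the Python standard library. The output field names are illustrative choices, and note that the feed carries no citation counts, so those would have to come from another source.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds, including arXiv API responses.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def parse_feed(xml_text):
    """Turn an arXiv API Atom response into a list of plain dicts."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        summary = entry.findtext("atom:summary", default="", namespaces=ATOM)
        papers.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM),
            "title": " ".join(title.split()),  # collapse embedded line breaks
            "abstract": " ".join(summary.split()),
            "published": entry.findtext("atom:published", default="", namespaces=ATOM),
            "authors": [
                a.findtext("atom:name", default="", namespaces=ATOM)
                for a in entry.findall("atom:author", ATOM)
            ],
            "categories": [c.get("term") for c in entry.findall("atom:category", ATOM)],
        })
    return papers

# Usage (network call, not run here):
#   from urllib.request import urlopen
#   url = "http://export.arxiv.org/api/query?search_query=cat:cs.LG&max_results=5"
#   papers = parse_feed(urlopen(url).read())
```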
Get the full skill file
Everything above is 80% of the skill file. Download the complete version with full implementation details, agent prompts, and ready-to-run scripts.
Keep building your stack
Related Solutions
More tools and workflows from across SoloStack
Free CRM
Unlimited contacts, zero per-seat pricing. AI-managed CRM in your repo.
Free Email Marketing
Send campaigns with Resend API. No monthly fees, no subscriber limits.
Free Scheduling
Booking pages with Google Calendar sync. Replace Cal.com for $0/mo.
Free Website Builder
Build with Astro + AI. Static, fast, SEO-optimized, fully customizable.
Ready to automate?
SoloStack gives you every skill pre-installed — scraping, marketing, sales, CRM, and more. One repo. Every department.
Book a Call →