Zotadata

A Zotero plugin that enhances your research workflow with intelligent metadata discovery and automated file management.

⚠️ This version is specifically designed for Zotero 7.x and will not work with Zotero 6.x

Demo

Features

🔍 Intelligent Reference Management

Attachment Validation: Automatically detect and remove broken file links while preserving valid PDFs and weblinks
Smart Cleanup: Bulk processing to maintain clean, working attachments across your library

📚 Advanced Metadata Discovery

Multi-API Metadata Fetching: Comprehensive metadata updates using 6+ APIs (CrossRef, OpenAlex, Semantic Scholar, OpenLibrary, Google Books, DBLP)
Automatic DOI/ISBN Discovery: Find missing identifiers through intelligent title and author matching
Support for Multiple Item Types: Journal articles, conference papers, preprints, and books
Fallback Strategies: Multiple search approaches when primary methods fail
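
As a rough illustration of title matching for identifier discovery, the sketch below compares normalized title tokens with a Jaccard similarity. The function names, the candidate shape (`{ title, doi }`), and the 0.8 threshold are assumptions made for this example; the plugin's actual matching logic may differ.

```javascript
// Hypothetical helper: lowercase, strip accents and punctuation, split into unique tokens.
function titleTokens(title) {
  const words = String(title || "")
    .toLowerCase()
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining accent marks
    .replace(/[^a-z0-9\s]/g, " ")    // punctuation becomes whitespace
    .split(/\s+/)
    .filter(Boolean);
  return new Set(words);
}

// Jaccard similarity of the two token sets: |A ∩ B| / |A ∪ B|, in the range 0..1.
function titleSimilarity(a, b) {
  const ta = titleTokens(a);
  const tb = titleTokens(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  for (const token of ta) {
    if (tb.has(token)) shared++;
  }
  return shared / (ta.size + tb.size - shared);
}

// Return the best-scoring candidate at or above the threshold, or null.
// The 0.8 default is an assumed value, not one taken from the plugin.
function bestTitleMatch(title, candidates, threshold = 0.8) {
  let best = null;
  let bestScore = threshold;
  for (const candidate of candidates) {
    const score = titleSimilarity(title, candidate.title || "");
    if (score >= bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

A candidate returned by a search API would only be accepted when its title clears the threshold; an author comparison could be layered on top as a second check.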

📄 Comprehensive PDF Retrieval

Multi-Source File Search: Access content from 8+ sources including:
- Open Access: Unpaywall, CORE, Internet Archive
- Preprint Servers: arXiv (highly reliable)
- Academic Repositories: Library Genesis, Sci-Hub
- Custom Resolvers: Multiple mirror support with automatic fallback
Smart Download Logic: Only downloads when needed, avoids duplicates
Stored File Creation: All downloads create local stored files (never links)
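
The multi-source search with automatic fallback can be pictured as an ordered chain of resolvers. This is a minimal sketch under assumptions: the `{ name, resolve }` resolver shape and the function name `findPdfUrl` are hypothetical and not taken from the plugin's source.

```javascript
// Try each resolver in order and return the first PDF URL found.
// Each resolver is assumed to be { name, resolve }, where resolve(item)
// is an async function returning a URL string or null.
async function findPdfUrl(item, resolvers) {
  const errors = [];
  for (const { name, resolve } of resolvers) {
    try {
      const url = await resolve(item);
      if (url) return { source: name, url, errors };
    } catch (err) {
      // A failing source must not stop the chain: record the error and move on.
      errors.push({ source: name, message: String(err && err.message ? err.message : err) });
    }
  }
  return { source: null, url: null, errors };
}
```

Ordering the resolvers from most to least reliable keeps the common case fast, and the collected `errors` can feed a per-item failure summary.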

Retrieval Flow Diagram

The retrieval flow follows a diagram (image not reproduced here) that was inspired by a Reddit post about accessing scientific papers.

🧬 arXiv & Preprint Intelligence

Published Version Discovery: Automatically find journal publications of arXiv preprints
Smart Type Conversion: Convert arXiv items that were saved as journal articles to the proper preprint format
Version Management: Handle transitions from preprint to published versions
Metadata Synchronization: Update bibliographic information when published versions are found
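
Preprint handling starts with recognizing an arXiv identifier in an item's URL or DOI. The sketch below is an assumption-laden example, not the plugin's code: `extractArxivId` is a hypothetical name, and only new-style identifiers (such as `2301.01234`, optionally with a version suffix) are handled; old-style identifiers like `hep-th/9901001` are ignored.

```javascript
// New-style arXiv identifiers: four digits, a dot, four or five digits, optional version.
const ARXIV_ID = /(\d{4}\.\d{4,5})(v\d+)?/;

// Extract { id, version } from a URL or DOI string, or return null.
function extractArxivId(text) {
  if (!text) return null;
  // Only trust the numeric pattern when the string actually mentions arXiv.
  if (!/arxiv/i.test(text)) return null;
  const match = ARXIV_ID.exec(text);
  return match ? { id: match[1], version: match[2] || null } : null;
}
```

This covers both abstract-page URLs (`https://arxiv.org/abs/...`) and arXiv-issued DOIs of the form `10.48550/arXiv....`.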

⚡ Efficient Batch Operations

Concurrent Processing: Handle multiple items simultaneously with intelligent rate limiting
Progress Tracking: Real-time progress dialogs for large batch operations
Error Resilience: Continue processing even when individual items fail
Detailed Reporting: Comprehensive success/failure summaries with actionable insights
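
Concurrent processing with error resilience can be sketched as a small worker pool. This is a hedged illustration: `processBatch`, its default limit of 3, and the result shape are assumptions for the example. It bounds the number of calls in flight but does not enforce a per-second rate.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight.
// Failures are captured per item, so one bad item never aborts the batch.
async function processBatch(items, worker, limit = 3, onProgress = () => {}) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item
  let done = 0; // number of finished items, for progress reporting

  async function lane() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index]) };
      } catch (err) {
        results[index] = { ok: false, error: String(err && err.message ? err.message : err) };
      }
      onProgress(++done, items.length);
    }
  }

  // Start up to `limit` lanes; each lane pulls items until none remain.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);

  return {
    results,
    succeeded: results.filter((r) => r.ok).length,
    failed: results.filter((r) => !r.ok).length,
  };
}
```

The `onProgress` callback is where a progress dialog would be updated, and the returned counts map directly onto a success/failure summary.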

🛠️ User Experience

One-Click Access: Right-click context menu integration
Email Configuration: Simple setup for API access requirements
Minimal Configuration: Works out-of-the-box with optional email for enhanced features
Multilingual Support: English and Chinese locales included

Installation

From XPI File (Zotero 7.x)

Download the latest release XPI file
In Zotero 7, go to Tools → Plugins (labeled Add-ons in earlier versions)
Click the gear icon and select "Install Add-on From File..."
Select the downloaded XPI file
Restart Zotero

Note: This extension requires Zotero 7.0 or later. For Zotero 6.x compatibility, use an earlier version of this extension.

Manual Installation (Development)

Clone or download this repository
Install dependencies: npm install
Run ./build.sh to create the XPI package
Install as described above

Configuration

Right-click on any item in your Zotero library
Select Zotadata → Configure Email
Enter your email address (required for the Unpaywall API)

Note: Your email is stored locally in Zotero preferences and only used for API requests to services like Unpaywall. The plugin will prompt you for an email the first time you use features that require it.

Usage

Context Menu

Right-click on selected items in your Zotero library to access:

Validate References: Check and clean up attachments for the selected items. Broken file links are removed; valid PDFs and weblinks are preserved.
Update Metadata: Fetch and update metadata for journal articles, conference papers, preprints, and books using multiple APIs (CrossRef, OpenAlex, Semantic Scholar, OpenLibrary, Google Books). Missing DOIs and ISBNs can be discovered automatically.
Retrieve Files: Search for and download missing PDF files from multiple sources (Unpaywall, arXiv, CORE, Library Genesis, Sci-Hub, Internet Archive). Only items without an existing PDF are processed.
Process Preprints: Handle arXiv papers by finding published versions, updating metadata, and downloading published PDFs, or by converting the item to the proper preprint format when no published version exists.

Batch Operations

Select multiple items to process them all at once. A progress dialog will show the status of each operation.

Success Rates & Expectations

PDF Retrieval Reality

File retrieval success varies significantly by source type:

High Success Rate:

arXiv Preprints: Very reliable due to arXiv's open access mandate and stable infrastructure
Open Access Articles: Good success via Unpaywall for legitimately open access content

Moderate to Low Success Rate:

Paywalled Journal Articles: More challenging due to publisher restrictions and legal considerations
Books: Particularly difficult to obtain, especially recent publications
Recent Papers: Sci-Hub has significantly reduced new uploads due to ongoing legal challenges

Alternative Workflows

For difficult-to-find content, consider these community-recommended approaches:

Anna's Archive: A promising free source; expect a wait of about 5 minutes for link generation.
Google: A general web search is often worthwhile, since the resource may have been shared on Reddit, GitHub, or a niche forum.

Note: This plugin automates the search across legitimate and widely-used academic sources. For content not available through these channels, manual research through additional academic resources may be necessary.

API Integration

This plugin integrates with several external APIs and services:

Metadata APIs

CrossRef API

Purpose: Fetch metadata for DOIs
Rate Limit: 50 requests/second (polite pool)
Authentication: None required (email recommended)
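
For orientation, the sketch below builds the two kinds of CrossRef request URLs a plugin like this would need: a lookup by DOI and a bibliographic search by title and author. The function names are hypothetical; the `mailto` and `query.bibliographic` parameters belong to the public CrossRef REST API, and passing `mailto` is how a client identifies itself for the polite pool.

```javascript
const CROSSREF_BASE = "https://api.crossref.org/works";

// URL for fetching the metadata record of a known DOI.
function crossrefDoiUrl(doi, email) {
  const url = new URL(`${CROSSREF_BASE}/${encodeURIComponent(doi)}`);
  if (email) url.searchParams.set("mailto", email);
  return url.toString();
}

// URL for discovering a missing DOI from free-text bibliographic details.
function crossrefSearchUrl(title, author, email, rows = 5) {
  const url = new URL(CROSSREF_BASE);
  const query = [title, author].filter(Boolean).join(" ");
  url.searchParams.set("query.bibliographic", query);
  url.searchParams.set("rows", String(rows));
  if (email) url.searchParams.set("mailto", email);
  return url.toString();
}
```

Either URL can be passed to whatever HTTP client the host environment provides; the response parsing is left out of this sketch.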

OpenAlex API

Purpose: Comprehensive academic work metadata and DOI discovery
Rate Limit: Very generous
Authentication: None required

Semantic Scholar API

Purpose: AI-powered paper search and metadata
Rate Limit: Reasonable limits for academic use
Authentication: None required

OpenLibrary & Google Books APIs

Purpose: Book metadata and ISBN discovery
Rate Limit: Standard API limits
Authentication: None required for basic use

PDF Sources

Unpaywall API

Purpose: Find open access PDF links
Rate Limit: 100,000 requests/day
Authentication: Email address required
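
The email requirement shows up directly in the request URL. The sketch below builds an Unpaywall v2 lookup URL and picks a direct PDF link out of a parsed response. The function names are hypothetical; the `is_oa`, `best_oa_location`, `oa_locations`, and `url_for_pdf` fields are part of Unpaywall's documented response format, but the selection order shown here is an assumption.

```javascript
// Build the lookup URL for one DOI. Unpaywall rejects requests without an email.
function unpaywallUrl(doi, email) {
  if (!email) throw new Error("Unpaywall requires an email address");
  // Encode each DOI segment but keep the slashes between them.
  const path = doi.split("/").map(encodeURIComponent).join("/");
  return `https://api.unpaywall.org/v2/${path}?email=${encodeURIComponent(email)}`;
}

// Pick a direct PDF link from a parsed Unpaywall record, or return null.
function pickPdfUrl(record) {
  if (!record || !record.is_oa) return null;
  // Prefer the best location, then fall back to any other location with a PDF.
  const locations = [record.best_oa_location, ...(record.oa_locations || [])];
  for (const location of locations) {
    if (location && location.url_for_pdf) return location.url_for_pdf;
  }
  return null;
}
```

Some open access records only expose a landing page, so a `null` result here does not necessarily mean the article is paywalled.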

arXiv API

Purpose: Search and download arXiv papers
Rate Limit: 3 seconds between requests
Authentication: None required
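
A fixed gap between requests, like the 3-second spacing listed above, can be enforced with a minimum-interval throttle. This is a minimal sketch, not the plugin's implementation: `createThrottle` is a hypothetical name, and the clock and sleep functions are injectable only so the logic can be exercised without real waiting.

```javascript
// Minimum-interval throttle: tasks start at least `intervalMs` apart.
function createThrottle(
  intervalMs,
  now = Date.now,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let nextAllowed = 0; // earliest timestamp at which the next task may start

  return async function throttled(task) {
    const wait = Math.max(0, nextAllowed - now());
    // Reserve the slot before sleeping so concurrent callers queue up in order.
    nextAllowed = Math.max(now(), nextAllowed) + intervalMs;
    if (wait > 0) await sleep(wait);
    return task();
  };
}
```

With `const arxivThrottle = createThrottle(3000)`, every arXiv request would be wrapped as `arxivThrottle(() => sendRequest(url))`, where `sendRequest` stands in for the real HTTP call.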

CORE API

Purpose: Search academic papers for full-text access
Rate Limit: 10,000 requests/month (free tier)
Authentication: API key required for higher limits (API key support is not yet implemented in this plugin)

Library Genesis

Purpose: Academic paper and book repository
Rate Limit: Subject to site availability
Authentication: None required

Sci-Hub

Purpose: Academic paper access service
Rate Limit: Subject to site availability and blocking
Authentication: None required

Internet Archive

Purpose: Open access books and historical documents
Rate Limit: Standard API limits
Authentication: None required

File Structure

zotero-zotadata/
├── manifest.json            # Plugin metadata (Zotero 7 format)
├── bootstrap.js             # Plugin bootstrap for Zotero 7
├── prefs.js                 # Default preferences
├── assets/                  # Documentation assets
│   ├── images/             # Screenshots and diagrams
│   └── workflows/          # Workflow diagrams and flowcharts
├── content/
│   └── zotadata.js          # Main logic
├── chrome/content/
│   ├── preferences.xul      # Settings dialog
│   └── progress.xul         # Progress window
├── locale/
│   ├── en-US/               # English translations
│   └── zh-CN/               # Chinese translations
├── skin/default/
│   └── zotadata.css         # Styles
└── README.md                # This file

Development

Requirements

Node.js (for build tools and dependencies)
Zotero 7.0 or later
Firefox 115+ based platform

Setup

Clone the repository
Install dependencies: npm install
Make your changes to the source files

Building

Make changes to the source files
Run ./build.sh to create XPI package
Test in Zotero 7 development environment

Testing

Unit test the API integration functions
Test with various item types and DOI formats
Verify UI responsiveness and error handling
Test with both Zotero 7 stable and beta versions

Zotero 7 Migration

This version has been completely rewritten for Zotero 7 compatibility:

Extension Format: Migrated from install.rdf to manifest.json
Architecture: Changed from XUL overlays to a bootstrapped extension
APIs: Updated to use Zotero 7 compatible APIs
Window Management: Adapted to new Zotero 7 window lifecycle
Preferences: Moved to root-level prefs.js file

Zotero 7 Compatibility Notes

When developing this plugin for Zotero 7, ensure the following in your manifest.json:

manifest_version: Must be set to 2. Despite Zotero 7 being based on a newer Firefox core that uses Manifest V3 for web extensions, Zotero's own bootstrapped plugins still expect manifest_version: 2.
applications key: Zotero-specific properties (like id, strict_min_version, strict_max_version, and update_url) must be within an applications.zotero object.
update_url: This field within applications.zotero is mandatory for Zotero 7.0.15+ (and possibly earlier Zotero 7 versions). Even for local development, a placeholder URL (e.g., "https://example.com/update.json") must be provided, otherwise the plugin installation will fail with an "Extension is invalid" error.

Failure to include update_url will result in an error message in the Zotero debug log similar to: ERROR Loading extension '[email protected]': Reading manifest: applications.zotero.update_url not provided
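
The three requirements above can be checked before packaging. The sketch below expresses a manifest as a plain object and validates it; the plugin id, version strings, and URL are placeholders, and `checkManifest` is a hypothetical helper rather than anything shipped with Zotero or this plugin.

```javascript
// Example manifest content. Every value here is a placeholder for illustration.
const exampleManifest = {
  manifest_version: 2,
  name: "Zotadata",
  version: "1.0.0",
  applications: {
    zotero: {
      id: "zotadata@example.com",
      update_url: "https://example.com/update.json",
      strict_min_version: "7.0",
      strict_max_version: "7.0.*",
    },
  },
};

// Return a list of problems; an empty list means the checks passed.
function checkManifest(manifest) {
  const problems = [];
  if (manifest.manifest_version !== 2) {
    problems.push("manifest_version must be 2");
  }
  const zotero = manifest.applications && manifest.applications.zotero;
  if (!zotero) {
    problems.push("applications.zotero is missing");
    return problems;
  }
  if (!zotero.id) problems.push("applications.zotero.id not provided");
  if (!zotero.update_url) problems.push("applications.zotero.update_url not provided");
  return problems;
}
```

Running such a check in the build script would surface a missing `update_url` before Zotero reports the extension as invalid.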

Contributing

Fork the repository
Create a feature branch
Make your changes
Test thoroughly with Zotero 7
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.
