Nugget Annotator User Guide
Getting Started
1. Configure Your LLM API Key
Before creating nuggets, you need an OpenRouter API key for LLM-powered features.
- Look for the key icon 🔑 in the top-right corner
- Click it to open the API key prompt
- Enter your OpenRouter API key
- The indicator turns green when the key is set
The key is stored in your browser's localStorage and persists across sessions.
2. Enter Your Username
Enter your name in the Username field (top-right). This is recorded as the creator of any nuggets you create.
Navigation
Queries Panel (Left Sidebar)
- Lists all evaluation queries/topics
- Shows completion status (checkmark if annotated)
- Shows nugget count badge for each query
- Click a query to select it
Reports Panel (Creation Phase)
- Shows all system reports for the selected query
- Click a report to view its content in the middle panel
- In Creation phase, the first report auto-selects when you choose a query
Creating Your First Nugget
Step 1: Select Text Spans
In Creation phase:
- Select text in the report with your mouse
- The draft dialog opens automatically with your selection as a "span"
- Select additional spans if needed - each appears as a chip in the draft card
- Remove unwanted spans by clicking the x on their chip
You can also select text from the Query box at the top.
Step 2: Add Notes (Optional)
Use the Freetext notes area to describe what this nugget should capture. This helps the LLM generate a better nugget question.
Step 3: Canonicalize
Click Canonicalize to have the LLM generate a formal nugget question from your spans and notes.
- The button shows progress while working
- Once complete, the generated question appears
- Click Re-canonicalize to regenerate if unsatisfied
Step 4: Choose Category
| Category | Meaning |
| Must Have | Critical information - reports lacking this are poor |
| Should Have | Important but not critical |
| Avoid | Information that should NOT appear (e.g., hallucinations) |
Step 5: Check Impact (Recommended)
Click Check Impact to preview how this nugget grades across all reports:
- Progress bar shows completion (e.g., "5/8")
- Results show each report with its grade and supporting quote
- Click a quote to navigate to that report while keeping the draft open
- This helps you verify the nugget is discriminative before committing
Step 6: Commit
Click Commit to save the nugget permanently. It appears in the Nuggets panel on the right.
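As a rough mental model, the six steps above assemble a record like the one sketched below. This is only an illustration: the class and field names (`Nugget`, `spans`, `notes`, `question`, `category`, `creator`) are hypothetical and are not taken from the tool's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    """The three categories offered in Step 4."""
    MUST_HAVE = "Must Have"
    SHOULD_HAVE = "Should Have"
    AVOID = "Avoid"


@dataclass
class Nugget:
    # Hypothetical field names; the tool's real schema may differ.
    question: str                                   # generated in Step 3 (Canonicalize)
    category: Category                              # chosen in Step 4
    creator: str                                    # the Username entered at startup
    spans: list[str] = field(default_factory=list)  # text selected in Step 1
    notes: str = ""                                 # optional freetext notes from Step 2


# Example draft as it might look just before clicking Commit.
draft = Nugget(
    question="Does the report state the launch year?",
    category=Category.MUST_HAVE,
    creator="alice",
    spans=["launched in 2019"],
)
```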
Editing and Deleting Nuggets
Edit a Nugget
- Find the nugget in the right panel
- Click the Edit button (pencil icon)
- Modify spans, notes, or category
- Click Re-canonicalize if you changed the spans or notes
- Click Commit to save changes
Viewing Nugget Quotes
Find Quote Button States
| State | Action |
| Find Quote (green) | Click to extract quote from current report |
| Show Quote (blue) | Click to highlight the quote in the source |
| Hide Quote (blue) | Click to remove highlight |
| No Quote (gray) | No supporting quote found |
Grade All
The Grade All button (top-right) grades all enabled nuggets against all reports for the current query.
- Click Grade All
- Watch progress (e.g., "12/32")
- When complete, a brief "Done!" confirmation appears
- Grades appear next to each nugget in the right panel
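The progress counter can be read as one step per (nugget, report) pair, so a count like "12/32" would correspond to, say, 4 enabled nuggets graded against 8 reports. The sketch below illustrates that reading. It assumes the total is the number of enabled nuggets times the number of reports; `grade_all`, `grade_fn`, and the dictionary keys are hypothetical names, and `grade_fn` stands in for whatever LLM call the tool actually makes.

```python
from itertools import product


def grade_all(nuggets, reports, grade_fn, on_progress=None):
    """Grade every enabled nugget against every report.

    nuggets / reports: lists of dicts with an "id" key (hypothetical shape).
    grade_fn(nugget, report): returns a grade; a stand-in for the LLM call.
    on_progress: optional callback that receives strings such as "12/32".
    """
    enabled = [n for n in nuggets if n.get("enabled", True)]
    pairs = list(product(enabled, reports))
    grades = {}
    for done, (nugget, report) in enumerate(pairs, start=1):
        grades[(nugget["id"], report["id"])] = grade_fn(nugget, report)
        if on_progress is not None:
            on_progress(f"{done}/{len(pairs)}")
    return grades
```

For example, passing `on_progress=print` with 4 nuggets and 8 reports prints the counter from "1/32" up to "32/32".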
QC Phase: Quality Control
Switch to QC phase using the tab at the top.
Adjust Category Weights
| Category | Range | Default | Effect |
| Must Have | 0 to 10 | 1.0 | Higher = more important in ranking |
| Should Have | 0 to 10 | 1.0 | Higher = more important in ranking |
| Avoid | -10 to 0 | -1.0 | Negative = penalizes reports containing this |
Enable/Disable Nuggets
Use the checkbox next to each nugget to include or exclude it from ranking calculations.
Solo Mode
Click the dot button next to a nugget to "solo" it:
- Only that nugget is used for ranking
- Useful for testing individual nugget discriminativeness
- Click again or click Unsolo to exit solo mode
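How the checkboxes and solo mode combine can be pictured as a simple filter over the nugget list. The sketch below assumes that a soloed nugget is used even if its checkbox is unticked, which the guide does not confirm; `active_nuggets`, `solo_id`, and the dictionary keys are hypothetical names.

```python
def active_nuggets(nuggets, solo_id=None):
    """Return the nuggets that would feed into ranking.

    nuggets: list of dicts with "id" and optional "enabled" keys (hypothetical).
    solo_id: id of the soloed nugget, or None when solo mode is off.
    """
    if solo_id is not None:
        # Solo mode: only the soloed nugget counts (assumed to ignore "enabled").
        return [n for n in nuggets if n["id"] == solo_id]
    # Normal mode: every nugget whose checkbox is ticked.
    return [n for n in nuggets if n.get("enabled", True)]
```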
Ranking Table Columns
| Column | Meaning |
| # | Rank position |
| System | Report/system name |
| Nug | Satisfied/Total nuggets |
| Avg | Average grade (0-5) |
| Cov | Coverage % (share of nuggets graded 4 or higher) |
| Score | Weighted score |
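The guide does not give the exact formulas behind these columns, so the following is only one plausible reading, useful for building intuition about how weights move the Score. It assumes that a nugget counts as "satisfied" at grade 4 or higher, that Avoid nuggets are counted in Nug, Avg, and Cov like any other nugget, and that Score sums each grade scaled to 0-1 and multiplied by its category weight. `ranking_row` and `SATISFIED_GRADE` are hypothetical names.

```python
DEFAULT_WEIGHTS = {"Must Have": 1.0, "Should Have": 1.0, "Avoid": -1.0}
SATISFIED_GRADE = 4  # assumed threshold, taken from "grade 4+" in the Cov column


def ranking_row(grades, weights=DEFAULT_WEIGHTS):
    """Compute one ranking-table row for a single report.

    grades: list of (category, grade) pairs, with grade on the 0-5 scale.
    The scoring rule is an assumption, not the tool's documented formula.
    """
    total = len(grades)
    if total == 0:
        return {"nug": "0/0", "avg": 0.0, "cov": 0.0, "score": 0.0}
    satisfied = sum(1 for _, g in grades if g >= SATISFIED_GRADE)
    avg = sum(g for _, g in grades) / total
    # Scale each grade to 0-1, then apply the category weight.
    score = sum(weights[cat] * g / 5 for cat, g in grades)
    return {
        "nug": f"{satisfied}/{total}",
        "avg": avg,
        "cov": 100 * satisfied / total,
        "score": score,
    }
```

Under this reading, raising the Must Have weight makes reports that satisfy Must Have nuggets climb, while a more negative Avoid weight pushes down reports that contain unwanted content.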
Observe Phase: Analysis
Switch to Observe phase for read-only analysis.
Toggle All Queries
Click the All Queries / This Query button to switch between single-query and cross-query rankings.
Overview Panel Metrics
| Metric | Meaning | Ideal |
| Discriminative | Covered by 10-80% of systems | High count |
| Universal | Covered by >80% of systems | May be too easy |
| Hard | Covered by <10% of systems | May need refinement |
Good nugget bank: Mostly discriminative nuggets, few universal/hard extremes.
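The three buckets in the table reduce to two threshold checks on the share of systems that cover a nugget. A minimal sketch, assuming the 10% and 80% boundaries themselves count as discriminative (as the "10-80%" range suggests); `classify_nugget` is a hypothetical helper, not part of the tool.

```python
def classify_nugget(covered, total_systems):
    """Bucket a nugget by how many systems cover it.

    covered: number of systems whose report covers the nugget.
    total_systems: number of systems evaluated (must be positive).
    """
    if total_systems <= 0:
        raise ValueError("total_systems must be positive")
    # Integer comparisons avoid floating-point surprises at the boundaries.
    if 100 * covered > 80 * total_systems:
        return "Universal"
    if 100 * covered < 10 * total_systems:
        return "Hard"
    return "Discriminative"
```

With 10 systems, a nugget covered by 9 is Universal, one covered by none is Hard, and anything from 1 to 8 is Discriminative.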
Exporting Your Work
Click Export to download a JSONL file containing all nuggets, grades, and metadata.
The exported file integrates with nugget-based judges and other evaluation tooling.
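JSONL stores one JSON object per line, so the export can be inspected with a few lines of standard-library Python. The sketch below makes no claim about the export's actual schema: the `category` key is a hypothetical field name used only for illustration, and the file name in the usage note is a placeholder.

```python
import json
from collections import Counter


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def count_by_key(records, key="category"):
    """Count records by a field; "category" is an assumed field name."""
    return Counter(r.get(key, "<missing>") for r in records)
```

A typical use would be `count_by_key(load_jsonl("export.jsonl"))` to check the category mix before handing the file to downstream tooling.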
Tips for Effective Annotation
- Start with obvious nuggets - Key facts that good reports should cover
- Use Check Impact - Verify nuggets discriminate before committing
- Mix categories - Include Must Have, Should Have, and Avoid nuggets
- Iterate in QC - Adjust weights to see ranking changes
- Watch for extremes - Too many universal or hard nuggets indicate problems
- Export regularly - Save your work to avoid browser data loss