SCIENTIST - Research Agent
The SCIENTIST agent conducts research on the selected competition. It analyzes leaderboards, searches for relevant papers and notebooks, and summarizes strategies.
Role
- Analyze leaderboard score distribution
- Surface relevant notebooks and research
- Summarize dataset characteristics
- Recommend modeling approaches
Tools
| Tool | Purpose |
|---|---|
| analyze_leaderboard | Summarize leaderboard stats |
| get_kaggle_notebooks | Find top notebooks for the competition |
| analyze_data_characteristics | Inspect dataset structure |
| compute_baseline_estimate | Estimate a baseline score |
| kaggle_* toolset | Kaggle API helper tools |
| web_search (builtin) | Retrieve papers and discussions |
| memory (builtin) | Shared notes (Anthropic only) |
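To make the leaderboard summary concrete, here is a minimal sketch of the kind of statistics such a tool might produce. It is not the analyze_leaderboard implementation: the function name `summarize_scores` and the `higher_is_better` flag are hypothetical, and only the `top_score` and `median_score` keys mirror fields of the LeaderboardAnalysis model.

```python
from statistics import median


def summarize_scores(scores: list[float], higher_is_better: bool = True) -> dict[str, float]:
    """Return the best and median score from a list of leaderboard scores.

    Hypothetical helper for illustration only; not part of agent_k.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # "Best" depends on the metric: accuracy-style metrics are maximized,
    # error-style metrics are minimized.
    top = max(scores) if higher_is_better else min(scores)
    return {"top_score": top, "median_score": median(scores)}


summary = summarize_scores([0.81, 0.79, 0.77, 0.76, 0.62])
print(summary)
```

Whether a higher score is better depends on the competition metric, so a real tool would presumably need that information from the competition metadata.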
Basic Usage
from agent_k.agents.scientist import ScientistDeps, scientist_agent

# Assumes `http` (an httpx.AsyncClient) and `kaggle_adapter` already exist,
# and that this code runs inside an async function.
competition = await kaggle_adapter.get_competition("titanic")

deps = ScientistDeps(
    http_client=http,
    platform_adapter=kaggle_adapter,
    competition=competition,
)

run_result = await scientist_agent.run(
    "Research the provided competition and summarize approaches",
    deps=deps,
)
output = run_result.output
print(output.recommended_approaches)
Dependencies
from dataclasses import dataclass, field
from typing import Any

import httpx

# PlatformAdapter, Competition, and LeaderboardEntry are agent_k types;
# their import paths are not shown here.


@dataclass
class ScientistDeps:
    """Dependencies for the SCIENTIST agent."""

    http_client: httpx.AsyncClient
    platform_adapter: PlatformAdapter
    competition: Competition
    leaderboard: list[LeaderboardEntry] = field(default_factory=list)
    research_cache: dict[str, Any] = field(default_factory=dict)
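The `field(default_factory=...)` defaults above mean that each deps instance gets its own empty list and dict instead of sharing one mutable object. The sketch below illustrates this with a hypothetical stand-in class, `DepsSketch`, which keeps only the defaulted fields and replaces the agent_k types with plain ones; the competition identifiers are placeholders.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DepsSketch:
    """Hypothetical stand-in for ScientistDeps (not the real class)."""

    competition_id: str
    leaderboard: list[Any] = field(default_factory=list)
    research_cache: dict[str, Any] = field(default_factory=dict)


first = DepsSketch(competition_id="titanic")
second = DepsSketch(competition_id="example-competition")

# Writing to one instance's cache does not affect the other instance.
first.research_cache["notebooks"] = ["example-notebook"]
print(second.research_cache)
```

In practice this suggests that a cache populated during one run is only reused if the same deps object is passed to the next run.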
Output Model
# BaseModel is assumed to be Pydantic's.
from pydantic import BaseModel


class ResearchFinding(BaseModel):
    """Individual research finding."""

    category: str
    title: str
    summary: str
    relevance_score: float
    sources: list[str]


class LeaderboardAnalysis(BaseModel):
    """Leaderboard statistics summary."""

    top_score: float
    median_score: float
    score_distribution: str
    common_approaches: list[str]
    improvement_opportunities: list[str]


class ResearchReport(BaseModel):
    """Output from SCIENTIST research."""

    competition_id: str
    domain_findings: list[ResearchFinding]
    technique_findings: list[ResearchFinding]
    leaderboard_analysis: LeaderboardAnalysis | None
    recommended_approaches: list[str]
    estimated_baseline_score: float | None
    key_challenges: list[str]
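A consumer of the report will often want only the most relevant findings. The sketch below shows one way to filter and rank them by `relevance_score`. It uses a hypothetical dataclass stand-in, `FindingSketch`, so it runs without agent_k or Pydantic, and it assumes scores fall between 0 and 1, which the model itself does not state.

```python
from dataclasses import dataclass


@dataclass
class FindingSketch:
    """Stand-in mirroring a few ResearchFinding fields (not the real model)."""

    category: str
    title: str
    relevance_score: float


def top_findings(
    findings: list[FindingSketch], min_score: float = 0.5, limit: int = 3
) -> list[FindingSketch]:
    """Keep findings at or above min_score, most relevant first, capped at limit."""
    kept = [f for f in findings if f.relevance_score >= min_score]
    return sorted(kept, key=lambda f: f.relevance_score, reverse=True)[:limit]


# Placeholder findings for illustration; not real research output.
findings = [
    FindingSketch("technique", "Gradient boosting baseline", 0.9),
    FindingSketch("domain", "Missing-value patterns", 0.4),
    FindingSketch("technique", "Feature engineering ideas", 0.7),
]
for finding in top_findings(findings):
    print(finding.title, finding.relevance_score)
```

The same function should work on real ResearchFinding objects, since it only reads the `relevance_score` attribute. Note that `leaderboard_analysis` and `estimated_baseline_score` may be `None`, so check them before use.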
Notes
- The mission graph converts the SCIENTIST output into a simplified ResearchFindings object.
- The memory tool is only available for Anthropic models.