Scientific Graph Interpreter
Interpret and explain scientific graphs, charts, and data visualizations for research publications, clinical presentations, and academic communications with precision and clarity.
Quick Start
from scripts.graph_interpreter import GraphInterpreter
interpreter = GraphInterpreter()
# Comprehensive graph analysis
analysis = interpreter.interpret(
image_path="figure_1.png",
graph_type="kaplan_meier",
context="oncology_phase3_trial",
audience="clinicians"
)
print(analysis.statistical_summary)
print(analysis.clinical_significance)
print(analysis.suggested_caption)
Core Capabilities
1. Multi-Type Graph Analysis
analysis = interpreter.analyze(
graph_type="forest_plot",
data={
"studies": ["Study A", "Study B", "Study C"],
"effect_sizes": [1.2, 0.8, 1.5],
"confidence_intervals": [[1.0, 1.4], [0.6, 1.0], [1.2, 1.8]],
"overall_effect": 1.15,
"heterogeneity_p": 0.04
}
)
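How a pooled effect and a heterogeneity statistic can be derived from forest-plot inputs like those above is sketched below. This is a minimal illustration, not the interpreter's internal method: `pool_fixed_effect` is a hypothetical helper. It assumes the effect sizes are ratio measures (such as odds or hazard ratios) with 95% confidence intervals, and it uses inverse-variance fixed-effect pooling on the log scale. The illustrative `overall_effect` in the call above is not expected to match its output.

```python
import math

def pool_fixed_effect(effect_sizes, confidence_intervals, z=1.96):
    """Inverse-variance fixed-effect pooling of ratio measures on the log scale.

    Hypothetical helper for illustration; assumes 95% CIs and ratio-scale effects.
    """
    log_effects = [math.log(e) for e in effect_sizes]
    # Recover each standard error from the CI width on the log scale.
    std_errors = [(math.log(hi) - math.log(lo)) / (2 * z)
                  for lo, hi in confidence_intervals]
    weights = [1 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled_log = sum(w * y for w, y in zip(weights, log_effects)) / total_w
    pooled_se = math.sqrt(1 / total_w)
    # Cochran's Q and I-squared describe between-study heterogeneity.
    q = sum(w * (y - pooled_log) ** 2 for w, y in zip(weights, log_effects))
    df = len(effect_sizes) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": math.exp(pooled_log),
        "ci_95": (math.exp(pooled_log - z * pooled_se),
                  math.exp(pooled_log + z * pooled_se)),
        "q": q,
        "i_squared": i_squared,
        "weights_pct": [100 * w / total_w for w in weights],
    }

result = pool_fixed_effect(
    [1.2, 0.8, 1.5],
    [[1.0, 1.4], [0.6, 1.0], [1.2, 1.8]],
)
print(result)
```

When I² is high, a random-effects model is usually the more defensible choice; the fixed-effect version is shown only because it is the shortest to read.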
Supported Graph Types:
| Graph Type | Common Use | Key Elements to Extract |
|------------|------------|------------------------|
| Kaplan-Meier | Survival analysis | Median survival, HR, 95% CI, log-rank p |
| Forest Plot | Meta-analysis | Effect size, CI, heterogeneity (I²), weights |
| ROC Curve | Diagnostic accuracy | AUC, sensitivity, specificity, optimal cutoff |
| Box Plot | Distribution comparison | Median, IQR, outliers, whiskers |
| Scatter Plot | Correlation | R², p-value, trend line, outliers |
| Bar Chart | Group comparisons | Means, SEM/SD, significance indicators |
| Heatmap | Expression/omics | Scale, clustering, row/column annotations |
| Volcano Plot | Differential analysis | Fold change, p-value, FDR threshold |
2. Statistical Interpretation
stats = interpreter.extract_statistics(
graph_data,
extract=[
"p_values",
"confidence_intervals",
"effect_sizes",
"sample_sizes",
"statistical_tests"
]
)
Statistical Reporting Standards:
# Example output structure
{
"primary_outcome": {
"measure": "Hazard Ratio",
"value": 0.72,
"ci_95": [0.58, 0.89],
"p_value": 0.003,
"interpretation": "28% reduction in hazard"
},
"secondary_outcomes": [...],
"significance_level": 0.05,
"multiple_comparison_adjusted": True
}
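The `interpretation` field follows from the hazard ratio itself: the relative change in hazard is |1 - HR| × 100%, so an HR of 0.72 corresponds to a 28% reduction. A minimal sketch of that arithmetic is below. `describe_hazard_ratio` is a hypothetical helper written for illustration, not part of the interpreter's API, and it assumes the null value for a ratio measure is 1.

```python
def describe_hazard_ratio(hr, ci_low, ci_high):
    """Turn a hazard ratio and its CI into a short plain-language summary.

    Hypothetical helper for illustration only.
    """
    if not ci_low <= hr <= ci_high:
        raise ValueError("point estimate must lie inside its confidence interval")
    # Relative change in hazard, as a whole-number percentage.
    change_pct = round(abs(1 - hr) * 100)
    direction = "reduction" if hr < 1 else "increase"
    # A ratio CI that does not contain 1 excludes "no difference".
    ci_excludes_null = ci_high < 1 or ci_low > 1
    return {
        "interpretation": f"{change_pct}% {direction} in hazard",
        "ci_excludes_null": ci_excludes_null,
    }

summary = describe_hazard_ratio(0.72, 0.58, 0.89)
print(summary)
```

The same check covers intervals that cross 1: in that case `ci_excludes_null` is `False`, and the point estimate should be reported alongside the uncertainty rather than on its own.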
3. Audience-Specific Explanations
explanations = interpreter.generate_multi_audience(
analysis,
audiences=["researchers", "clinicians", "patients", "policy_makers"]
)
Explanation Templates:
For Researchers:
"The Kaplan-Meier analysis demonstrates a statistically significant survival advantage for the experimental arm (HR 0.72, 95% CI 0.58-0.89, p=0.003). Median survival improved from 14.2 to 19.6 months. A test of the proportional hazards assumption showed no evidence of violation (p=0.42)."
For Clinicians:
"This trial shows that median survival was about 5 months longer on the new treatment than on standard care (19.6 vs. 14.2 months). The 28% reduction in the risk of death is statistically significant and clinically meaningful. Consider this option for eligible patients."
For Patients:
"The study found that people taking the new treatment lived longer than those on standard treatment. Half of the people on the new treatment were still alive after about 20 months, compared with about 14 months on standard treatment. Side effects were manageable."
4. Figure Caption Generation
caption = interpreter.generate_caption(
analysis,
style="journal", # or "presentation", "poster"
word_limit=250,
include_statistics=True
)
Caption Structure:
Figure X. [Brief title]. [What is shown: X-axis shows..., Y-axis shows...,
lines/bars represent...]. [Key finding: Group A showed... compared to
Group B...]. [Statistics: HR 0.72 (95% CI 0.58-0.89), p=0.003].
[Conclusion: This demonstrates...].
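The caption structure above can be assembled mechanically once each bracketed part is written. The sketch below is one way to do that with a word-limit check. `build_caption` and its parameters are hypothetical names chosen for illustration, and the example title and sentences are placeholders rather than output from the interpreter.

```python
def build_caption(number, title, description, key_finding,
                  statistics, conclusion, word_limit=250):
    """Join the caption parts in template order and enforce a word limit.

    Hypothetical helper for illustration only.
    """
    parts = [
        f"Figure {number}. {title}.",
        description,
        key_finding,
        statistics,
        conclusion,
    ]
    # Skip empty parts so optional sections can be omitted.
    caption = " ".join(p.strip() for p in parts if p and p.strip())
    word_count = len(caption.split())
    if word_count > word_limit:
        raise ValueError(
            f"caption has {word_count} words; limit is {word_limit}"
        )
    return caption

caption = build_caption(
    number=1,
    title="Overall survival by treatment arm",
    description="The x-axis shows months since randomization and the "
                "y-axis shows the proportion of patients alive.",
    key_finding="The experimental arm showed longer survival than control.",
    statistics="HR 0.72 (95% CI 0.58-0.89), p=0.003.",
    conclusion="",
)
print(caption)
```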
5. Critical Appraisal
appraisal = interpreter.critical_appraisal(
graph_data,
check=[
"appropriate_graph_type",
"axis_scaling",
"error_bars_present",
"sample_size_adequate",
"confounding_controlled",
"generalizability"
]
)
Common Graph Pitfalls:
| Issue | Problem | Better Approach |
|-------|---------|-----------------|
| Truncated y-axis | Exaggerates differences | Start at 0 or clearly indicate break |
| No error bars | Hides variability | Include SD, SEM, or 95% CI |
| 3D effects | Distorts perception | Use 2D with clear labels |
| Dual y-axes | Confusing comparison | Separate graphs or normalized scale |
| p-hacking indicators | Multiple comparisons | Adjusted p-values, Bonferroni |
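The truncated y-axis row can be made concrete with a little arithmetic. When bars start from a baseline above zero, the ratio of their drawn heights no longer equals the ratio of the values. The sketch below computes that distortion factor. `bar_exaggeration` is a hypothetical helper for illustration and assumes positive values on a linear axis.

```python
def bar_exaggeration(value_a, value_b, axis_min):
    """Return how much a truncated baseline inflates the apparent ratio b/a.

    A result of 1.0 means no distortion. Hypothetical helper; assumes
    positive values and a linear y-axis starting at axis_min.
    """
    if value_a <= 0 or value_b <= 0:
        raise ValueError("values must be positive")
    if axis_min >= min(value_a, value_b):
        raise ValueError("axis_min must lie below both values")
    true_ratio = value_b / value_a
    # Drawn bar heights are measured from the axis baseline, not from zero.
    shown_ratio = (value_b - axis_min) / (value_a - axis_min)
    return shown_ratio / true_ratio

# Two groups with means 50 and 55 (a 10% difference).
print(bar_exaggeration(50, 55, axis_min=0))   # baseline at zero
print(bar_exaggeration(50, 55, axis_min=45))  # baseline truncated at 45
```

With the baseline at 45, the second bar is drawn twice as tall as the first even though the values differ by only 10%.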
CLI Usage
# Comprehensive analysis
python scripts/graph_interpreter.py \
--image survival_curve.png \
--type kaplan_meier \
--context "phase_3_oncology" \
--audience clinicians \
--output analysis.json
# Generate publication caption
python scripts/graph_interpreter.py \
--image forest_plot.png \
--type forest_plot \
--generate caption \
--journal-style nature \
--word-limit 200
# Batch process figures
python scripts/graph_interpreter.py \
--batch figures/ \
--output report.html \
--template comprehensive
Common Patterns
Pattern 1: Clinical Trial Primary Endpoint
# Analyze survival curve
analysis = interpreter.interpret(
graph_type="kaplan_meier",
primary_endpoint="overall_survival",
treatment_arms=["Experimental", "Control"],
key_metrics=["median_os", "hr", "ci", "p_value"]
)
# Generate regulatory-ready summary
regulatory_summary = interpreter.generate_regulatory_summary(
analysis,
guideline="ICH_E3"
)
Pattern 2: Meta-Analysis Forest Plot
# Interpret meta-analysis
analysis = interpreter.interpret_forest_plot(
studies=included_studies,
check_heterogeneity=True,
assess_publication_bias=True
)
# Generate GRADE assessment
grade_rating = interpreter.generate_grade_rating(analysis)
Pattern 3: Diagnostic Accuracy ROC
# Analyze diagnostic test
analysis = interpreter.interpret_roc(
curves=["Test A", "Test B", "Combined"],
optimal_cutoffs=True,
clinical_utility=True
)
# Clinical decision support
decision_aid = interpreter.generate_decision_aid(analysis)
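The quantities read off a ROC curve (AUC and an optimal cutoff) can be computed from the curve's points directly. The sketch below uses the trapezoidal rule for AUC and Youden's J statistic (sensitivity + specificity - 1) to pick a cutoff. `roc_summary` is a hypothetical helper and the points are illustrative. Youden's J is one common criterion, not necessarily the one the interpreter applies.

```python
def roc_summary(fpr, tpr, thresholds):
    """AUC by the trapezoidal rule plus the cutoff maximizing Youden's J.

    fpr, tpr, thresholds: equal-length sequences describing ROC points.
    Hypothetical helper for illustration only.
    """
    points = sorted(zip(fpr, tpr, thresholds), key=lambda p: (p[0], p[1]))
    auc = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        # Area of the trapezoid between two neighboring ROC points.
        auc += (x1 - x0) * (y0 + y1) / 2
    # Youden's J = sensitivity + specificity - 1 = TPR - FPR.
    best = max(points, key=lambda p: p[1] - p[0])
    return {
        "auc": auc,
        "optimal_cutoff": best[2],
        "sensitivity": best[1],
        "specificity": 1 - best[0],
    }

summary = roc_summary(
    fpr=[0.0, 0.1, 0.2, 0.5, 1.0],
    tpr=[0.0, 0.6, 0.8, 0.9, 1.0],
    thresholds=[1.0, 0.8, 0.6, 0.4, 0.0],
)
print(summary)
```

Youden's J weights false positives and false negatives equally. In a clinical setting where missing a case is costlier than a false alarm, a cutoff with higher sensitivity may be preferable.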
Quality Checklist
Before Interpretation:
- [ ] Graph type appropriate for data
- [ ] Axes clearly labeled with units
- [ ] Sample sizes indicated
- [ ] Statistical tests specified
- [ ] Confidence intervals present
During Interpretation:
- [ ] Effect size calculated
- [ ] Clinical significance assessed
- [ ] Confidence intervals interpreted
- [ ] Limitations noted
- [ ] Generalizability considered
After Interpretation:
- [ ] Explanation appropriate for audience
- [ ] Statistical terms explained
- [ ] Uncertainty communicated
- [ ] Actionable insights highlighted
Best Practices
Statistical Communication:
- Always report confidence intervals with point estimates
- Distinguish statistical from clinical significance
- Note limitations and generalizability
- Avoid causal language in observational studies
Visual Analysis:
- Check axis scales for distortion
- Note truncated axes or breaks
- Identify outliers and their impact
- Verify error bar representation (SD vs SEM)
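The SD versus SEM distinction matters because the two describe different things: SD describes the spread of individual observations, while SEM (SD divided by the square root of n) describes the precision of the mean and shrinks as the sample grows. The sketch below computes both from the same data. `error_bar_summary` is a hypothetical helper, the values are illustrative, and the 95% CI uses a normal approximation (a t-based interval is more appropriate for small samples).

```python
import math
import statistics

def error_bar_summary(values, z=1.96):
    """Compute the quantities commonly drawn as error bars.

    Hypothetical helper; the CI uses a normal approximation.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation
    sem = sd / math.sqrt(n)         # standard error of the mean
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci_95": (mean - z * sem, mean + z * sem),
    }

bars = error_bar_summary([2, 4, 4, 4, 5, 5, 7, 9])
print(bars)
```

Because SEM bars are always narrower than SD bars for n > 1, a figure that does not state which one it shows can make the data look far less variable than they are.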
Common Pitfalls
❌ Correlation = Causation: "X causes Y because they're correlated"
✅ Cautious Interpretation: "X is associated with Y; other factors may explain this"
❌ Overstating Significance: Treating "highly significant (p<0.001)" as evidence of a large effect
✅ Proper Framing: "Statistically significant but modest effect size (d=0.2)"
❌ Ignoring Confidence Intervals: Reporting the point estimate only
✅ Interval Reporting: "Effect: 1.5 (95% CI: 0.9-2.4), suggesting uncertainty"
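The gap between statistical significance and effect size is easy to demonstrate numerically: with large samples, even a small standardized difference yields a very small p-value. The sketch below computes Cohen's d and an approximate two-sided p-value from summary statistics. `effect_vs_significance` is a hypothetical helper, the inputs are made-up numbers, and the p-value uses a large-sample normal approximation rather than an exact t-test.

```python
import math
from statistics import NormalDist

def effect_vs_significance(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Return Cohen's d and an approximate two-sided p-value.

    Hypothetical helper; the p-value is a large-sample z approximation.
    """
    # Pooled standard deviation for Cohen's d.
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    d = (mean_a - mean_b) / math.sqrt(pooled_var)
    # Standard error of the difference in means.
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / se_diff
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"cohens_d": d, "p_value": p_value}

# Same small effect (d = 0.2) at two sample sizes per group.
large = effect_vs_significance(10.2, 10.0, 1.0, 1.0, 2000, 2000)
small = effect_vs_significance(10.2, 10.0, 1.0, 1.0, 20, 20)
print(large)
print(small)
```

The effect size is identical in both calls. Only the sample size changes, which is why a p-value alone says nothing about how large or clinically important a difference is.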
Skill ID: 209 | Version: 1.0 | License: MIT