# Developer Assessment Evaluator

Evaluate technical assessments (take-home assignments, live coding exercises, system design discussions) using structured rubrics, while detecting over-indexing on trivia at the expense of core engineering skills.
## Purpose
Provide objective, fair evaluation of candidate technical work that:
- Assesses problem-solving ability over memorization
- Detects over-reliance on algorithmic trivia
- Evaluates code quality, testing, documentation
- Ensures consistent scoring across candidates
- Identifies signal vs. noise in technical performance
## When to Use
Invoke when:
- Reviewing completed take-home assignments
- Scoring live coding interviews
- Evaluating system design discussions
- Detecting assessment anti-patterns
- Training interviewers on evaluation criteria
- Calibrating scoring across interview team
## Core Evaluation Framework

### 1. Problem-Solving Process (40%)

Assess HOW the candidate approaches the problem:
- Asks clarifying questions before coding
- Breaks problem into manageable steps
- Considers edge cases and constraints
- Iterates when hitting obstacles
- Explains thought process clearly
Scoring:
- 5: Methodical approach, clear reasoning, handles ambiguity well
- 3: Adequate approach, some structure
- 1: Jumps to code without planning, struggles with unknowns
Anti-Pattern Detection: If the candidate knows the optimal algorithm immediately, it may be memorized; probe deeper with follow-up questions.
### 2. Code Quality (30%)
Evaluate production-readiness:
- Readable variable/function names
- Appropriate abstractions
- No obvious bugs or edge case failures
- Handles errors gracefully
- Follows language idioms
Scoring:
- 5: Production-ready code, well-organized
- 3: Functional code, some quality issues
- 1: Poor structure, hard to maintain
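To anchor the scale, here is the same kind of function at roughly a 1 and roughly a 5. Both versions are invented for illustration, not taken from any real submission:

```python
# Roughly a 1: opaque names, no docstring, no guard against missing keys
# or division by zero.
def f(d, k):
    return d[k] / d["t"]

# Roughly a 5: descriptive names, a docstring, and graceful handling of
# the empty case.
def pass_rate(results: dict, category: str) -> float:
    """Fraction of results in `category`, or 0.0 if no results were recorded."""
    total = results.get("total", 0)
    if total == 0:
        return 0.0
    return results.get(category, 0) / total
```

Both compute a ratio, but only the second survives contact with real input.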
### 3. Testing & Validation (20%)
Check verification approach:
- Writes test cases (unit, integration)
- Tests edge cases (empty input, large input, null)
- Validates assumptions
- Handles error conditions
Scoring:
- 5: Comprehensive tests, edge cases covered
- 3: Basic tests present
- 1: No testing or validation
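As a concrete picture of what a 5 looks like here, consider a hypothetical `dedupe` helper (invented for illustration): a strong candidate verifies the happy path and the edges, not just one example.

```python
def dedupe(items):
    """Remove duplicates while preserving first-seen order."""
    seen = set()
    return [x for x in items if not (x in seen or seen.add(x))]

# Edge cases a strong candidate checks, beyond the happy path:
assert dedupe([]) == []                                     # empty input
assert dedupe([1, 1, 1]) == [1]                             # all duplicates
assert dedupe([3, 1, 3, 2]) == [3, 1, 2]                    # order preserved
assert dedupe(list(range(10_000))) == list(range(10_000))   # large input
```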
### 4. Communication (10%)
Assess explanation clarity:
- Explains design decisions
- Articulates trade-offs
- Responds to feedback
- Documents approach
Scoring:
- 5: Clear, thorough explanations
- 3: Adequate communication
- 1: Unclear or defensive
## Detecting Trivia Over-Indexing
Warning Signs:
- Assessment requires obscure algorithm knowledge
- Solution depends on memorizing specific pattern
- Time pressure favors memorization over problem-solving
- No partial credit for good process but wrong algorithm
Example - Bad Assessment: "Implement Dijkstra's algorithm from memory in 45 minutes" → Tests memorization, not problem-solving
Example - Good Assessment: "Design a route-finding system for our delivery app. Consider real-world constraints." → Tests applied problem-solving
Rebalancing:
- Allow candidates to look up algorithms
- Value process over perfect solution
- Give hints/guidance during interview
- Accept multiple valid approaches
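A crude way to flag trivia-heavy prompts is keyword screening. The sketch below is illustrative only: the signal lists are assumptions to tune for your own question bank (the bundled `scripts/detect-trivia.py` may work differently), and any flag still needs human review.

```python
# Hypothetical signal lists; tune these for your own question bank.
TRIVIA_SIGNALS = ["from memory", "without looking up", "implement dijkstra",
                  "reverse a linked list", "in 45 minutes"]
SKILL_SIGNALS = ["design", "real-world", "trade-off", "constraints"]

def trivia_risk(prompt: str) -> str:
    """Flag prompts whose wording leans toward recall over applied skill."""
    p = prompt.lower()
    trivia = sum(s in p for s in TRIVIA_SIGNALS)
    skill = sum(s in p for s in SKILL_SIGNALS)
    if trivia > skill:
        return "high"
    return "low" if skill > trivia else "review"

print(trivia_risk("Implement Dijkstra's algorithm from memory in 45 minutes"))  # high
print(trivia_risk("Design a route-finding system for our delivery app. "
                  "Consider real-world constraints."))                          # low
```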
## Evaluation Rubric Template
```json
{
  "candidate": "Name",
  "assessment_type": "take-home|live-coding|system-design",
  "evaluation": {
    "problem_solving": {
      "score": 4,
      "weight": 0.40,
      "evidence": "Methodical approach, asked good clarifying questions, handled edge cases"
    },
    "code_quality": {
      "score": 3,
      "weight": 0.30,
      "evidence": "Functional code, some naming could be clearer"
    },
    "testing": {
      "score": 5,
      "weight": 0.20,
      "evidence": "Comprehensive unit tests, tested edge cases thoroughly"
    },
    "communication": {
      "score": 4,
      "weight": 0.10,
      "evidence": "Clear explanations, good documentation"
    }
  },
  "weighted_score": 3.9,
  "recommendation": "Strong Hire",
  "feedback": "Excellent problem-solving and testing. Could improve variable naming.",
  "trivia_concerns": false
}
```
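The weighted score is the sum of each dimension's score times its weight. A minimal sketch of that calculation (illustrative only; the bundled `scripts/score-assessment.py` may compute it differently), using the example dimension scores above:

```python
def weighted_score(evaluation: dict) -> float:
    """Weighted sum of dimension scores, rounded to one decimal place."""
    return round(sum(d["score"] * d["weight"] for d in evaluation.values()), 1)

# Example dimension values: 4*0.40 + 3*0.30 + 5*0.20 + 4*0.10
evaluation = {
    "problem_solving": {"score": 4, "weight": 0.40},
    "code_quality":    {"score": 3, "weight": 0.30},
    "testing":         {"score": 5, "weight": 0.20},
    "communication":   {"score": 4, "weight": 0.10},
}
print(weighted_score(evaluation))  # 3.9
```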
## Providing Constructive Feedback
Feedback Structure:

```markdown
## Strengths
- [Specific strength with example]
- [Specific strength with example]

## Areas for Growth
- [Constructive suggestion with example]
- [Constructive suggestion with example]

## Overall Assessment
[Summary and recommendation]
```
Best Practices:
- Be specific: cite concrete code examples
- Balance positive feedback with constructive criticism
- Focus on behaviors, not the person
- Suggest improvements; don't just criticize
## Using Supporting Resources
### Templates

- `templates/rubric-template.json` - Assessment rubric schema
- `templates/feedback-template.md` - Candidate feedback structure
### References

- `references/anti-patterns.md` - Common assessment anti-patterns
- `references/trivia-vs-skills.md` - Distinguishing memorization from ability
### Scripts

- `scripts/detect-trivia.py` - Analyze an assessment for trivia over-indexing
- `scripts/score-assessment.py` - Calculate weighted scores
Progressive Disclosure: Detailed anti-patterns, trivia-detection techniques, and feedback examples are in `references/`.