Description:
Mercor, a global AI talent network headquartered in San Francisco, connects elite technical and creative professionals with leading AI research labs across the world. Backed by visionary investors including Benchmark, General Catalyst, Peter Thiel, Adam D’Angelo, Larry Summers, and Jack Dorsey, Mercor empowers skilled individuals to contribute to next-generation artificial intelligence systems.
Mercor is seeking an AI Research Evaluator to join its research validation and auditing team. The successful candidate will play a key role in ensuring the quality, accuracy, and logical consistency of AI-generated outputs and research data. The role contributes directly to AI model evaluation, with a focus on reliable reasoning and performance across a variety of academic and technical domains.
Responsibilities:
Review and audit AI-generated data, including prompts, solutions, references, final outputs, and metadata, ensuring factual and logical correctness.
Evaluate AI task quality using structured rubrics at both the turn-level and task-level, maintaining strict standards for clarity, completeness, and accuracy.
Verify complex reasoning processes—including chain-of-thought and solution steps—to identify logical flaws, gaps, or inconsistencies.
Assess coding-related tasks by validating test cases, evaluating code performance, and confirming the correctness and efficiency of algorithmic solutions.
Provide concise and well-reasoned written feedback to justify evaluations and support continuous improvement of AI systems.
Work independently and asynchronously within Mercor’s remote collaboration frameworks, using the provided tools and following the established workflows.
Required:
PhD, current PhD candidacy, or Master’s degree in Mathematics, Computer Science, Applied Mathematics, or a related STEM discipline.
Deep understanding of one or more domains such as:
Mathematics: algebra, geometry, number theory, probability, combinatorics, graph theory
Coding: data structures, algorithms, application frameworks, testing, and QA
General AI domains: summarization, reasoning tasks, open/closed Q&A, classification
Strong analytical and logical evaluation skills, particularly for multi-step problem-solving and reasoning validation.
Excellent written communication skills with precision and clarity in feedback documentation.
Ability to work independently and maintain consistency in a fully remote environment.
Preferred:
Familiarity with Python and testing frameworks such as Pytest or JUnit.
Type: Independent Contractor
Compensation: $50–$60 per hour (based on expertise)
Commitment: Approximately 30 hours per week
Location: Fully Remote
Payment: Weekly via Stripe Connect
Start Date: Immediate (applications reviewed daily)
Mercor’s streamlined evaluation process is designed to identify skilled professionals capable of performing advanced analytical evaluations.
Upload your resume to Mercor’s AI platform.
Complete an AI-led interview tailored to your professional background.
Submit the application form for review.
Estimated total time: 20–30 minutes
Learn more about Mercor’s platform and hiring process:
https://talent.docs.mercor.com/welcome/welcome
For support or technical assistance, contact support@mercor.com
Mercor is a cutting-edge technology platform that connects elite professionals with the most innovative AI research organizations globally. With a strong commitment to advancing artificial intelligence reliability, research validation, and model interpretability, Mercor provides a collaborative space for data scientists, engineers, and researchers to contribute to frontier-level AI projects.
Working at Mercor means joining a global network of AI experts driving advancements in reasoning, accuracy, and research integrity for next-generation models.
| Organization | Mercor |
| Industry | IT / Telecom / Software Jobs |
| Occupational Category | AI Research Evaluator |
| Job Location | Montreal, Canada |
| Shift Type | Morning |
| Job Type | Full Time |
| Gender | No Preference |
| Career Level | Intermediate |
| Experience | 2 Years |
| Posted at | 2025-10-31 3:20 pm |
| Expires on | 2025-12-15 |