AI Research Evaluator

Description:

Mercor, a global AI talent network headquartered in San Francisco, connects elite technical and creative professionals with leading AI research labs across the world. Backed by visionary investors including Benchmark, General Catalyst, Peter Thiel, Adam D’Angelo, Larry Summers, and Jack Dorsey, Mercor empowers skilled individuals to contribute to next-generation artificial intelligence systems.

Mercor is seeking an AI Research Evaluator to join its research validation and auditing team. The successful candidate will play a key role in ensuring the quality, accuracy, and logical consistency of AI-generated outputs and research data. This role offers a unique opportunity to contribute directly to AI model evaluation, ensuring reliable reasoning and performance across a variety of academic and technical domains.


Key Responsibilities:

  • Review and audit AI-generated data, including prompts, solutions, references, final outputs, and metadata, ensuring factual and logical correctness.

  • Evaluate AI task quality using structured rubrics at both the turn level and the task level, maintaining strict standards for clarity, completeness, and accuracy.

  • Verify complex reasoning processes—including chain-of-thought and solution steps—to identify logical flaws, gaps, or inconsistencies.

  • Assess coding-related tasks by validating test cases, evaluating code performance, and confirming the correctness and efficiency of algorithmic solutions.

  • Provide concise and well-reasoned written feedback to justify evaluations and support continuous improvement of AI systems.

  • Work independently and asynchronously within Mercor’s remote collaboration frameworks, using the provided tools and workflows.


Required Skills & Qualifications:

  • PhD, current PhD candidacy, or Master’s degree in Mathematics, Computer Science, Applied Mathematics, or a related STEM discipline.

  • Deep understanding of one or more domains such as:

    • Mathematics: algebra, geometry, number theory, probability, combinatorics, graph theory

    • Coding: data structures, algorithms, application frameworks, testing, and QA

    • General AI domains: summarization, reasoning tasks, open/closed Q&A, classification

  • Strong analytical and logical evaluation skills, particularly for multi-step problem-solving and reasoning validation.

  • Excellent written communication skills with precision and clarity in feedback documentation.

  • Ability to work independently and maintain consistency in a fully remote environment.

Preferred:

  • Familiarity with Python and testing frameworks such as pytest or JUnit.


Engagement Details:

  • Type: Independent Contractor

  • Compensation: $50–$60 per hour (based on expertise)

  • Commitment: Approximately 30 hours per week

  • Location: Fully Remote

  • Payment: Weekly via Stripe Connect

  • Start Date: Immediate (applications reviewed daily)


Application Process:

Mercor’s streamlined evaluation process is designed to identify skilled professionals capable of performing advanced analytical evaluations.

  1. Upload your resume to Mercor’s AI platform.

  2. Complete an AI-led interview tailored to your professional background.

  3. Submit the application form for review.

Estimated total time: 20–30 minutes


Resources & Support:

  • Learn more about Mercor’s platform and hiring process:
    https://talent.docs.mercor.com/welcome/welcome

  • For support or technical assistance, contact support@mercor.com


About the Company:

Mercor is a cutting-edge technology platform that connects elite professionals with the most innovative AI research organizations globally. With a strong commitment to advancing artificial intelligence reliability, research validation, and model interpretability, Mercor provides a collaborative space for data scientists, engineers, and researchers to contribute to frontier-level AI projects.

Working at Mercor means joining a global network of AI experts driving advancements in reasoning, accuracy, and research integrity for next-generation models.

Organization: Mercor
Industry: IT / Telecom / Software Jobs
Occupational Category: AI Research Evaluator
Job Location: Montreal, Canada
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Intermediate
Experience: 2 Years
Posted at: 2025-10-31 3:20 pm
Expires on: 2025-12-15