About the Role:
We are hiring a Full-Stack Developer with a strong back-end focus to help us build a high-impact platform for automated adversarial testing, vulnerability detection, and model benchmarking of generative AI systems.
This platform empowers subject matter experts and enterprise clients to test and evaluate large language models (LLMs) across a wide range of data types and task taxonomies, without the need for manual evaluation. Your work will directly contribute to improving the safety, robustness, and alignment of modern AI systems deployed in production environments.
Responsibilities:
Develop and maintain full-stack features with a strong focus on back-end development
Build scalable batch pipelines that automate LLM testing and integrate third-party evaluators
Process and transform multi-modal data via ETL workflows and store results in MySQL and Elasticsearch
Create and manage stored procedures, job schedulers, and retry mechanisms for API pipelines
Design REST APIs to support front-end dashboards, filters, and benchmarking tools
Collaborate closely with front-end developers, QA engineers, DevOps, and product leads in an agile environment
Ensure systems are performant, fault-tolerant, and secure
Platform Capabilities:
Supported Data Types: Image, Video, Sensor (LiDAR), Audio, Speech, Document, Code
Task Taxonomies: Summarization, Image Evaluation, Image Reasoning, Q&A, Question Understanding, Entity Relation Classification, Text-to-Code, Logic & Semantics, Question Rewriting, Translation
Feedback Types: DPO (Direct Preference Optimization), Simple RLHF, Complex RLHF, Nominal Feedback
Techniques Tested: Payload Smuggling, Prompt Injection, Persuasion and Manipulation, Conversational Coercion, Hypotheticals, Roleplaying, One-/Few-shot Learning
Tech Stack:
Node.js, TypeScript, React
MySQL (including stored procedures), Elasticsearch
REST APIs, OAuth 2.0, JWT
Docker, GitHub Actions, Kubernetes (optional)
Job orchestration tools (Cron, node-cron, BullMQ or similar)
Requirements:
3–5+ years of full-stack development experience, with a strong back-end orientation
Proficiency in Node.js and TypeScript; working experience with React
Strong experience integrating and orchestrating REST APIs at scale
Experience building ETL workflows and handling multi-modal data
Solid database development skills in MySQL and Elasticsearch
Familiarity with OAuth 2.0, JWT, and secure API development
Comfortable working in a remote team with a 7:30 AM EST start time
Nice to Have:
Experience with LLM APIs, AI/ML workflows, or evaluation techniques (e.g., DPO, RLHF)
Familiarity with adversarial testing methods such as prompt injection and roleplaying
Experience with CI/CD, Docker, Kubernetes, or distributed system architecture
Background in AI safety or model evaluation frameworks
We’re an equal opportunity employer committed to increasing diversity and inclusion in today’s workforce. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. Minorities, women, LGBTQ candidates, and individuals with disabilities are encouraged to apply. If you require an accommodation, please review our accessibility policy and reach out to our accessibility officer with any questions.