About Us:
Zenity is the first and only holistic platform built to secure and govern AI Agents from buildtime to runtime. We help organizations defend against security threats, meet compliance, and drive business productivity. Trusted by many of the world’s F500 companies, Zenity provides centralized visibility, vulnerability assessments, and governance by continuously scanning business-led development environments. We recently raised $38 million in a Series B funding, solidifying our position as a leader in the industry and enabling us to accelerate our mission of securing AI Agents everywhere.
About The Role:
This is a
research‑first role focused on deeply understanding LLM internals to improve the security of AI agents. You’ll design careful experiments on activations and interpretable features- e.g., probing, attribution & ablation/patching, representation‑geometry analyses-to uncover mechanisms behind jailbreak, indirect prompt injection, and other attacks. Then translate those insights into signals that can be used for detection and analysis of a model response.
The field of LLM interpretability at scale is exploding, with several major publications in the last months, and major opportunities for innovation.
What You’ll Do
- Investigate model internals, including activation/features analysis, unsupervised clustering, discovery of directions in latent space, etc. It may also require training specific model parts to improve interpretability metrics.
- Design security‑grounded evaluations: curate datasets for different attack types, evaluate performance of different white box (model internals) methods compared to black box (input/output only) baselines.
- Publish and share: produce Zenity Labs posts and open artifacts; when the work is strong, aim for tier‑1 ML venues (NeurIPS, ICML, etc.) and security forums. A publication of code and/or trained models in cases of community relevant novelty.
- Build tools: Several open source libraries exist (like Anthropic’s attribution graphs infra), but the research in the field is very dynamic, which will require you to build and adapt tools to your own research directions. This also includes agents to automate research work and distill knowledge from designed experiments.
Requirements:
- Deep learning expertise with a track record of non‑trivial research (industry or academia) in LLMs or other domains (e.g., CV, speech). We care that you’ve changed models or methods in meaningful ways (architecture/training/eval), not just used them.
- Strong experimental design and scientific writing; comfort pre‑registering hypotheses, testing causal claims, proposing novel directions in a fast-changing field.
- PhD or equivalent research experience in the industry (5+ years in a leading research team). Publication record or a portfolio of high‑impact open artifacts will make you stand out from the crowd.
- Familiarity with AI frameworks (e.g., HuggingFace Transformers, LangChain, scikit-learn, PyTorch); Experience with a production grade codebase with several contributors is a bonus.
- Experience in data analysis: visualization, exploration, cleanup.
- Knowledge in GenAI tools such as LLM Orchestrations and integration packages, Agents, RAG systems - a bonus.