
Senior AI Engineer (LLM & Multi-Agent Systems)

Skills
  • Python
  • Elasticsearch
  • Asyncio
  • AWS Bedrock
  • FastAPI
  • LangGraph
  • LangSmith
  • OpenAI API
  • Pydantic
  • Embedding models
  • Vector databases
  • LangChain

Join a Company That Invests in You

Seeking Alpha is the world’s leading community of engaged investors. We’re the go-to destination for investors looking for actionable stock market opinions, real-time market analysis, and unique financial insights. We’re also dedicated to creating a workplace where our team thrives, and passionate about fostering a flexible, balanced environment with remote work options and an array of perks that make a real difference.

Here, your growth matters. We prioritize your development through ongoing learning and career advancement opportunities, helping you reach new milestones. Join Seeking Alpha to be part of a company that values your unique journey, supports your success, and champions both your personal well-being and professional goals.


What We're Looking For

Role Overview: We are developing Ask Seeking Alpha, a high-throughput financial analysis system powered by Large Language Models. The architecture rests on complex multi-agent orchestration using LangGraph, FastAPI, and Elasticsearch.

We are looking for a Senior Backend Engineer specializing in Generative AI to design agent workflows, optimize interactions with models (OpenAI, AWS Bedrock), and ensure the reliability of non-deterministic systems in production.


Tech Stack: Python (Asyncio), FastAPI, LangChain, LangGraph, Pydantic, Elasticsearch, AWS Bedrock / OpenAI API, LangSmith.


What You'll Do

Agent Architecture: Design and implement complex agent orchestration logic using LangGraph. You will define state management, conditional routing, and error handling within the agent graph.
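In miniature, the pattern behind this responsibility looks like the sketch below: nodes mutate shared state and return the name of the next node, with a conditional retry edge standing in for error handling. This is a framework-agnostic, pure-Python illustration; all names are hypothetical and this is not LangGraph's actual API.

```python
from dataclasses import dataclass

@dataclass
class AgentState:
    question: str
    draft: str = ""
    retries: int = 0
    done: bool = False

def plan(state: AgentState) -> str:
    state.draft = f"plan for: {state.question}"
    return "generate"                 # unconditional edge to the next node

def generate(state: AgentState) -> str:
    if state.retries == 0:            # simulate one failed generation
        state.retries += 1
        return "generate"             # conditional edge: route back and retry
    state.done = True
    return "END"

NODES = {"plan": plan, "generate": generate}

def run(state: AgentState, entry: str = "plan") -> AgentState:
    node = entry
    while node != "END":
        node = NODES[node](state)     # each node updates state and picks the successor
    return state
```

A real LangGraph implementation expresses the same idea declaratively (typed state, registered nodes, conditional edges) instead of a hand-rolled loop.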

Tool Engineering: Build and optimize the tool layer (function calling) that allows LLMs to interact with internal financial APIs and databases accurately.
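The core of a tool layer is a registry that pairs each function with the JSON schema advertised to the model, plus a dispatcher that validates model-emitted calls before executing them. A minimal sketch, with a hypothetical `get_quote` tool standing in for a real internal financial API:

```python
import json

TOOLS = {}

def tool(name, schema):
    """Register a function together with the JSON schema shown to the LLM."""
    def wrap(fn):
        TOOLS[name] = {"fn": fn, "schema": schema}
        return fn
    return wrap

@tool("get_quote", {"type": "object",
                    "properties": {"ticker": {"type": "string"}},
                    "required": ["ticker"]})
def get_quote(ticker: str) -> dict:
    # stand-in for a real internal financial API call
    return {"ticker": ticker, "price": 123.45}

def dispatch(call_json: str):
    """Execute a model-emitted call like
    '{"name": "get_quote", "arguments": {"ticker": "AAPL"}}',
    rejecting calls that omit required arguments."""
    call = json.loads(call_json)
    entry = TOOLS[call["name"]]
    missing = [k for k in entry["schema"].get("required", [])
               if k not in call["arguments"]]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return entry["fn"](**call["arguments"])
```

Validating before dispatch is what makes tool calls "accurate": malformed calls surface as errors the agent can recover from, rather than bad data flowing downstream.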

Performance Optimization:

  • Reduce end-to-end latency through asynchronous processing and streaming (SSE).
  • Implement semantic caching strategies to minimize API costs and response time.
  • Optimize token usage without sacrificing answer quality.
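Semantic caching, in particular, can be sketched in a few lines: embed each query, and if a new query's vector is close enough (by cosine similarity) to one already answered, reuse that answer instead of calling the LLM. The embedder and threshold below are toy stand-ins; production systems use a real embedding model and a vector store.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class SemanticCache:
    """Reuse a cached answer when a new query embeds close to an old one."""
    def __init__(self, embed, threshold=0.92):
        self.embed = embed            # maps query text -> vector
        self.threshold = threshold
        self.entries = []             # list of (vector, answer)

    def get(self, query):
        v = self.embed(query)
        for vec, answer in self.entries:
            if cosine(v, vec) >= self.threshold:
                return answer         # semantic hit: skip the LLM call
        return None                   # miss: caller must query the LLM

    def put(self, query, answer):
        self.entries.append((self.embed(query), answer))

# toy embedder standing in for a real embedding model
TOY = {"price of AAPL": [1.0, 0.0],
       "AAPL price?":   [0.97, 0.24],
       "weather today": [0.0, 1.0]}
cache = SemanticCache(TOY.get)
cache.put("price of AAPL", "AAPL trades at $123.45")
```

The threshold is the key tuning knob: too low and unrelated queries share answers; too high and near-duplicates still pay for a model call.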

Observability & Evaluation: Implement automated evaluation pipelines using LangSmith. You will be responsible for setting up regression testing for prompts and agents to measure quality (correctness, faithfulness) before deployment.
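The shape of such a regression gate, stripped to its essentials: run the agent over a fixed golden dataset, grade each answer, and block deployment when the mean score drops below a threshold. The agent and grader below are toy stand-ins; in practice LangSmith manages the dataset, traces, and evaluators.

```python
def run_regression(agent, golden_set, grader, min_score=0.9):
    """Score the agent on a golden dataset; fail the gate if the
    mean grade falls below min_score."""
    scores = [grader(case["expected"], agent(case["question"]))
              for case in golden_set]
    mean = sum(scores) / len(scores)
    return {"mean": round(mean, 3), "passed": mean >= min_score}

# toy stand-ins: a lookup-table "agent" and an exact-match grader
golden = [{"question": "2+2?", "expected": "4"},
          {"question": "capital of France?", "expected": "Paris"}]
report = run_regression(
    lambda q: {"2+2?": "4", "capital of France?": "Paris"}[q],
    golden,
    lambda expected, got: 1.0 if expected == got else 0.0,
)
```

Real graders for correctness and faithfulness are usually LLM-as-judge evaluators rather than exact match, but the gating logic is the same.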

Advanced RAG: Refine retrieval strategies: implement hybrid search (keyword + vector), re-ranking, and query expansion to feed the most relevant context to the model.
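One common way to fuse the keyword and vector result lists in hybrid search is Reciprocal Rank Fusion (RRF). A minimal sketch; the doc ids and retriever outputs are toy data:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked doc-id lists
    (e.g. one from BM25, one from kNN) into a single ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc1", "doc2", "doc3"]   # toy BM25 ranking
vector_hits  = ["doc3", "doc1", "doc4"]   # toy kNN ranking
fused = rrf([keyword_hits, vector_hits])
```

Documents that rank well in both lists (here `doc1` and `doc3`) rise to the top, which is exactly the behavior hybrid search is after; recent Elasticsearch versions also support RRF natively.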


Requirements

Python Expert: Strong proficiency in modern Python. Deep understanding of asynchronous programming (asyncio) patterns is mandatory, as our entire I/O pipeline (Network, DB, LLM) is non-blocking. Experience with FastAPI and Pydantic (v2).
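The "non-blocking I/O pipeline" point boils down to fanning out independent awaitables concurrently rather than awaiting them one by one. A small self-contained sketch (the `call` coroutine is a stand-in for real network, DB, and LLM requests):

```python
import asyncio
import time

async def call(source: str, delay: float) -> str:
    # stand-in for a non-blocking network / DB / LLM request
    await asyncio.sleep(delay)
    return f"{source}:ok"

async def gather_context() -> list:
    # fan the three I/O calls out concurrently; total wall time is
    # roughly max(delays), not their sum
    return await asyncio.gather(
        call("search", 0.05),
        call("db", 0.05),
        call("llm", 0.05),
    )

start = time.perf_counter()
results = asyncio.run(gather_context())
elapsed = time.perf_counter() - start
```

Three sequential awaits would take ~0.15 s here; the concurrent version finishes in ~0.05 s, which is the latency win the posting's pipeline depends on.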

Agentic Frameworks: Production experience with LangChain. Hands-on experience or deep conceptual understanding of LangGraph (or similar state-machine based agent frameworks).

Deep LLM Expertise (What we mean by "Deep"):

Non-determinism Management: Strategies for handling LLM hallucinations and ensuring reliable outputs (e.g., self-correction loops, specific prompting techniques like CoT/ReAct).

Structured Outputs: Experience forcing LLMs to adhere to strict schemas (Pydantic/JSON mode) for reliable downstream processing.
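The essence of this requirement is validating model output against a strict schema before anything downstream consumes it. Pydantic does this declaratively; the sketch below shows the same check by hand, with a hypothetical stock-rating schema:

```python
import json

# hypothetical schema for a stock-rating response
SCHEMA = {"ticker": str, "rating": str, "target_price": float}

def parse_structured(raw: str) -> dict:
    """Validate a model's JSON output against a strict schema so that
    malformed output fails loudly instead of corrupting downstream steps."""
    data = json.loads(raw)
    for field, typ in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], typ):
            raise TypeError(f"{field} must be {typ.__name__}")
    return data
```

In production the raised error typically feeds a retry loop: the validation message is sent back to the model so it can correct its own output.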

Context Optimization: Advanced strategies for managing limited context windows (summarization chains, sliding windows, selective context injection) beyond simple truncation.
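Selective context injection, the simplest of these strategies, can be sketched as a budgeted packing step: rank chunks by relevance and admit them until the token budget is spent, instead of truncating the prompt blindly. The whitespace token counter is a toy stand-in for a real tokenizer.

```python
def fit_context(scored_chunks, token_budget,
                count_tokens=lambda s: len(s.split())):
    """Pack the highest-relevance chunks into the token budget.
    scored_chunks: iterable of (relevance_score, text) pairs."""
    picked, used = [], 0
    for score, text in sorted(scored_chunks, reverse=True):
        cost = count_tokens(text)
        if used + cost <= token_budget:   # admit only what still fits
            picked.append(text)
            used += cost
    return picked
```

Summarization chains and sliding windows extend the same idea: spend the fixed budget on the most useful representation of the history, not the most recent bytes.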

Inference Economics: Understanding the trade-offs between model size, latency, and cost (e.g., when to route to GPT-4 vs. a smaller/faster model).
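A routing decision like this is often just a cheap heuristic in front of the model call. A hypothetical sketch; the model names and thresholds are placeholders, not real endpoints or recommended values:

```python
def pick_model(question: str, tools_needed: int) -> str:
    """Hypothetical cost/latency router: simple single-step queries go to
    a small fast model; multi-tool or long queries go to the large one."""
    is_complex = tools_needed > 1 or len(question.split()) > 40
    return "large-reasoning-model" if is_complex else "small-fast-model"
```

Production routers may also consider expected output length, user tier, or a learned classifier, but the trade-off being managed is the same: pay for the large model only where it changes the answer.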


Nice to Have

Experience with Elasticsearch (DSL queries, analyzers).

Knowledge of vector databases and embedding models.

Background in FinTech or familiarity with financial data structures.
