Senior Voice Engineer

Skills
  • AWS
  • backend services
  • frontend
  • mobile
  • mobile SDKs
  • realtime audio systems
  • streaming
  • STT
  • telephony
  • TTS
  • web
  • A2A
  • LLM-driven agents
  • MCP
  • multi-agent orchestration

About TechSee:

TechSee is the global leader in AI-powered visual assistance, helping the world’s largest service providers transform how customers and technicians solve complex device and connectivity issues. Our Visual AI platform - trusted by Vodafone, Orange, Hitachi and dozens of Fortune 500 enterprises - combines computer vision, augmented reality, and conversational AI to resolve millions of support interactions every year. TechSee is backed by Salesforce Ventures, Telus, Scale Venture Partners, and OurCrowd.



The opportunity:

Voice is how most people will first reach our new AI assistant. They will pick up the phone, open a mobile app, or tap a voice button, and within seconds expect a real conversation that understands them, their home, and their problem.

As Senior Voice Engineer, you will own that experience end-to-end - from the frontend UI decisions, through the realtime backend, into the agent that actually solves the problem. You will build a voice stack that feels human, runs at consumer scale, and bridges seamlessly into visual and chat channels when the conversation needs to evolve.



Key Responsibilities:


Own the Voice Channel End-to-End

  • Architect and build the realtime voice pipeline: STT, TTS, turn-taking, VAD, barge-in, latency tuning.
  • Integrate with telephony, mobile SDKs, and messaging channels (WhatsApp, iMessage, in-app voice).
  • Design seamless multi-modal handoffs - from voice into camera-based visual sessions and back.


Full-Stack Delivery

  • Ship the mobile and web frontends that capture and render the voice experience.
  • Build the backend services that drive realtime audio, session state, and integration with our agent platform.
  • Make hard calls on protocols, codecs, streaming strategies, and provider trade-offs.


Make It Feel Human

  • Obsess over latency, prosody, interruption handling, and recovery from misrecognition.
  • Build instrumentation that surfaces voice quality issues before users complain.
  • Partner with AI engineers to make sure the agent behind the voice is actually conversational, not transactional.


Set the Bar

  • Define the team’s standards for realtime systems, audio quality, and observability.
  • Mentor engineers across frontend and backend on voice-first thinking.



Qualifications:

  • Senior-level experience building voice-based products in the conversational agent space.
  • Strong background across both frontend (mobile and/or web) and backend.
  • Hands-on experience with realtime audio systems - STT/TTS providers, streaming, telephony, or equivalent stacks.
  • Solid grasp of conversational design pitfalls and how to engineer around them.
  • B.Sc. or higher in Computer Science or a related field.



Advantage:

  • Track record on large-scale production systems.
  • Experience with AWS or other major cloud platforms.
  • Background with LLM-driven agents, MCP/A2A, or multi-agent orchestration.



Why Work With Us?

At TechSee, we combine cutting-edge innovation with a people-first philosophy. We are looking for high performers who are driven by excellence, collaboration, and the desire to make a tangible impact on the future of AI.

  • A voice product that matters. Millions of consumers, real problems, no scripts - your work will be heard, literally.
  • Greenfield stack. Pick the right tools, set the right patterns, build it the way it should be built.
  • Cross-disciplinary team. Sit between AI, mobile, infra, and domain experts who actually know the field.
  • Autonomy and bold execution. We value individuals who are proactive and results-oriented.
  • Hybrid by design. Herzliya office, flexible remote days, ownership over your craft.

