About TechSee:
TechSee is the global leader in AI-powered visual assistance, helping the world’s largest service providers transform how customers and technicians solve complex device and connectivity issues. Our Visual AI platform - trusted by Vodafone, Orange, Hitachi and dozens of Fortune 500 enterprises - combines computer vision, augmented reality, and conversational AI to resolve millions of support interactions every year. TechSee is backed by Salesforce Ventures, Telus, Scale Venture Partners, and OurCrowd.
The opportunity:
Voice is how most people will first reach our new AI assistant. They will pick up the phone, open a mobile app, or tap a voice button, and within seconds expect a real conversation that understands them, their home, and their problem.
As Senior Voice Engineer, you will own that experience end-to-end - from the frontend UI decisions, through the realtime backend, into the agent that actually solves the problem. You will build a voice stack that feels human, runs at consumer scale, and bridges seamlessly into visual and chat channels when the conversation needs to evolve.
Key Responsibilities:
Own the Voice Channel End-to-End
- Architect and build the realtime voice pipeline: STT, TTS, turn-taking, VAD, barge-in, latency tuning.
- Integrate with telephony, mobile SDKs, and messaging channels (WhatsApp, iMessage, in-app voice).
- Design seamless multi-modal handoffs - from voice into camera-based visual sessions and back.
Full-Stack Delivery
- Ship the mobile and web frontends that capture and render the voice experience.
- Build the backend services that drive realtime audio, session state, and integration with our agent platform.
- Make hard calls on protocols, codecs, streaming strategies, and provider trade-offs.
Make It Feel Human
- Obsess over latency, prosody, interruption handling, and recovery from misrecognition.
- Build instrumentation that surfaces voice quality issues before users complain.
- Partner with AI engineers to make sure the agent behind the voice is actually conversational, not transactional.
Set the Bar
- Define the team’s standards for realtime systems, audio quality, and observability.
- Mentor engineers across frontend and backend on voice-first thinking.
Qualifications:
- Senior-level experience building voice-based products in the conversational agent space.
- Strong background across both frontend (mobile and/or web) and backend.
- Hands-on experience with realtime audio systems - STT/TTS providers, streaming, telephony, or equivalent stacks.
- Solid grasp of conversational design pitfalls and how to engineer around them.
- B.Sc. or higher in Computer Science or a related field.
Advantage:
- Track record on large-scale production systems.
- Experience with AWS or other major cloud platforms.
- Background with LLM-driven agents, MCP/A2A, or multi-agent orchestration.
Why Work With Us?
At TechSee, we combine cutting-edge innovation with a people-first philosophy. We are looking for high performers who are driven by excellence, collaboration, and the desire to make a tangible impact on the future of AI.
- A voice product that matters. Millions of consumers, real problems, no scripts - your work will be heard, literally.
- Greenfield stack. Pick the right tools, set the right patterns, build it the way it should be built.
- Cross-disciplinary team. Sit between AI, mobile, infra, and domain experts who actually know the field.
- Autonomy and bold execution. We value individuals who are proactive and results-oriented.
- Hybrid by design. Herzliya office, flexible remote days, ownership over your craft.