Computer Vision Engineer

Overview
Skills
  • C++
  • Python
  • Data pipelines
  • Profiling
  • Performance tuning
  • NVIDIA Jetson
  • Inference pipelines
  • Embedded systems
  • Edge systems
  • Debugging
  • Logging
  • Low-level optimization
  • Monitoring
  • Multi-sensor fusion
  • ONNX Runtime
  • Production ML model deployment
  • Pruning
  • Quantization
  • Radar
  • TensorRT
  • Triton Inference Server
At Orca AI, we build AI-powered vision systems that enhance safety and decision-making for some of the world’s largest vessels.

Our platform processes live video streams from multiple onboard cameras to provide real-time situational awareness, detecting and tracking marine objects, even in low visibility and highly congested environments. These systems directly support navigational decisions and help prevent collisions, reduce human error, and improve operational efficiency.

Our systems are already deployed across thousands of vessels and have processed hundreds of millions of nautical miles of real-world data, operating in unpredictable and safety-critical conditions.

This role sits at the intersection of AI and high-performance systems engineering, focused on solving real-world problems under strict constraints. You will work on systems where performance and reliability are critical and where improvements have a direct, measurable impact on real-world safety.

This is a senior, systems-focused role with end-to-end ownership over performance and reliability of production computer vision pipelines. You will define optimization strategies, identify bottlenecks across the system, and drive improvements under real-world constraints.

What You’ll Do

  • Build and optimize real-time computer vision pipelines running on edge systems processing live maritime video streams (e.g., NVIDIA Jetson, Triton Inference Server)
  • Take models from research and turn them into production-ready, reliable components deployed on vessels
  • Optimize model inference using techniques such as quantization, pruning, and graph-level optimization (e.g., TensorRT, ONNX Runtime)
  • Profile and improve end-to-end system performance across multi-camera video ingestion, preprocessing, inference, and postprocessing
  • Identify and resolve bottlenecks across CPU, GPU, memory, and pipeline coordination
  • Make and justify tradeoffs between latency, accuracy, stability, and resource utilization
  • Design and implement robust data and inference pipelines (video -> model -> actionable output for crew)
  • Develop benchmarking and evaluation workflows to measure performance end-to-end and support release gating
  • Build and improve observability tools, including logging, monitoring, and debugging workflows for production systems
  • Define and maintain clear interfaces between research code and production systems
  • Work closely with research and backend teams to integrate new models into production systems
  • Continuously improve system efficiency and reliability under hardware and runtime constraints
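The benchmarking and bottleneck-hunting work described above usually starts with per-stage latency measurement. A minimal sketch of that idea in Python (all stage names and timings here are illustrative placeholders, not Orca AI's actual pipeline):

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class StageTimer:
    """Accumulates wall-clock time per pipeline stage to expose bottlenecks."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean_ms(self, name):
        # Mean latency per invocation, in milliseconds.
        return 1000.0 * self.totals[name] / max(self.counts[name], 1)

timer = StageTimer()
for _ in range(3):  # stand-in for frames from a live camera stream
    with timer.stage("preprocess"):
        time.sleep(0.001)   # placeholder work
    with timer.stage("infer"):
        time.sleep(0.002)   # placeholder work
    with timer.stage("postprocess"):
        time.sleep(0.0005)  # placeholder work

for name in ("preprocess", "infer", "postprocess"):
    print(f"{name}: {timer.mean_ms(name):.2f} ms/frame")
```

In a real deployment the timed blocks would wrap GPU-synchronized inference calls and the results would feed the monitoring and release-gating workflows the role owns.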

Requirements

  • 5+ years of software engineering experience, with a strong focus on systems and performance
  • Hands-on experience working with computer vision or deep learning systems in production
  • Strong programming skills in Python and/or C++
  • Experience working with edge or embedded systems (e.g., NVIDIA Jetson platforms)
  • Strong understanding of system bottlenecks, including CPU, GPU, memory, and latency constraints
  • Strong intuition for profiling-driven optimization and performance tuning
  • Experience debugging complex systems and reasoning about behavior in real-world, noisy environments

Strong advantage

  • Experience working with custom high-performance data or inference pipelines
  • Familiarity with multi-sensor fusion (e.g., combining vision with radar or other signals)
  • Experience deploying and maintaining ML models in production environments
  • Experience with low-level optimization and/or C++ performance tuning
  • Proven experience optimizing model inference (e.g., TensorRT, ONNX Runtime, quantization, pruning, or similar techniques)

Additional Context

The role is primarily system-focused with responsibility for optimizing inference, improving pipeline performance, and ensuring production reliability. Some interaction with models is expected (e.g., quantization, pruning, architecture-aware optimizations), while model development and training are primarily owned by the research team.
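As a rough illustration of the model-level work mentioned above, post-training int8 quantization maps floating-point weights onto an 8-bit integer grid via a per-tensor scale. This toy sketch is pure Python with no framework; production work would instead use calibration tooling in TensorRT or ONNX Runtime:

```python
# Toy symmetric int8 quantization (illustrative only).

def quantize_int8(weights):
    """Map float weights to int8 using a per-tensor symmetric scale."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid zero scale
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(scale, 5), round(max_err, 5))
```

The quantization error surfaced here is exactly the latency/accuracy tradeoff the role is expected to reason about when deciding how aggressively to compress a model for edge hardware.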

You will be working in environments where failures are often caused by real-world conditions rather than clean lab assumptions - such as low visibility, cluttered scenes, and dynamic environments - and where understanding system behavior in production is key to delivering robust solutions.