Founding Deep Learning Deployment Engineer (Senior/Forward Deployment Engineer — Founding Team)
Indore, India
Full-time
₹20-25 Lakh/year
5 - 10 yrs
Posted on: May 5, 2026
Skills
python
mypy
pydantic
pypi
c++
rust
c programming
pytorch
onnx runtime
python libraries
cloud platforms
aws
gcp
azure
pulumi
python sdk
ci/cd pipelines
ci/cd tools: jenkins, github actions, gitlab ci
gpu
Role: Founding Deep Learning Deployment Engineer (Senior/Forward Deployment Engineer — Founding Team)
Location: On-site, Vijaynagar, Indore, MP, India
Availability: Immediate joiners, or candidates who can join within 20-30 days
Budget: Up to 25 LPA CTC, depending on skills and experience
About Us & This Role:
We're an early-stage company building a product around deep learning for audio and signal processing. The founding team comes from research and management backgrounds, so we have strong instincts about the product, the science, and the market — but we're explicitly looking for a technical co-founder who owns the engineering side of the company.
There is no existing platform. No deployment pipeline. No SDK. No infrastructure. You will build all of it from scratch.
This is a founding engineering role. You will not be inheriting a codebase or slotting into an existing team — you will be defining how this company ships software. The decisions you make in the first six months will shape the product for years. If you're the kind of engineer who finds that prospect exciting rather than terrifying, please keep reading.
What You'll Be Doing:
Building the Product (Hands-On, Greenfield)
Take research models from our research team — typically Jupyter notebooks, PyTorch checkpoints, and unoptimized code — and turn them into clean, production-grade software.
Optimize models for inference using pruning, quantization, distillation, and graph optimization with ONNX, TensorRT, and Triton Inference Server.
Design and build our deployment pipeline from scratch — there is none today. You decide how it works.
Build our Python SDK — handle packaging, versioning, publishing to PyPI, and design the developer experience for whoever consumes it.
Stand up our cloud infrastructure using Infrastructure as Code (Terraform, Pulumi, or similar). You'll choose the cloud provider and the patterns we use.
Build local-deployment paths so smaller models run efficiently on CPU on Windows and macOS, with platform-specific optimizations.
Helping Build the Company
Beyond writing code, you'll be helping us figure out:
What our engineering culture looks like — code review standards, testing philosophy, how we ship.
What to build vs. what to buy — every infrastructure choice has cost, complexity, and lock-in trade-offs we'll make together.
Who to hire next — you'll have a real voice in growing the engineering team, and eventually you may lead it.
How research and engineering collaborate — defining the handoff process between research and production.
The technical roadmap — the founders are honest about not knowing the production side. We need your judgment, not just your hands.
You will not be a hired pair of hands executing a plan. You'll be a partner in figuring out what the plan should be.
Technical Requirements:
We've kept this section detailed on purpose. Read each item honestly — if most of these describe work you've actually done (not just heard of), you're a strong fit.
Python & Software Engineering (Must-Have):
5+ years writing production Python, with a deep understanding of the language: typing (mypy, pydantic), async/await, context managers, decorators, generators, and the data model.
Strong grasp of modern Python tooling: pyproject.toml, setuptools/hatchling/poetry-core as build backends, virtual environments, and at least one of poetry, uv, hatch, or pdm.
Published Python packages to PyPI or a private index — you understand wheels vs. sdists, platform tags (manylinux, macosx, win_amd64), entry points, and dependency resolution.
Comfortable writing C/C++/Rust extensions or bindings when needed (via pybind11, nanobind, cffi, or PyO3) — or willing to learn quickly.
Strong testing discipline: pytest, fixtures, parameterization, mocking, and writing tests that catch real bugs without becoming a maintenance burden.
Familiarity with code quality tooling: ruff, black, mypy, pre-commit hooks.
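As a concrete calibration point for the Python-fluency bullets above, here is a stdlib-only sketch touching typing, dataclasses, generators, context managers, and decorators. All names here (`Chunk`, `chunks`, `timed`) are illustrative, not part of any product:

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Chunk:
    """One fixed-size slice of a longer signal (illustrative)."""
    start: int
    data: list[float]


def chunks(signal: list[float], size: int) -> Iterator[Chunk]:
    """Generator: lazily yield fixed-size chunks of a signal."""
    for start in range(0, len(signal), size):
        yield Chunk(start, signal[start:start + size])


@contextmanager
def timed(label: str) -> Iterator[None]:
    """Context manager: print wall-clock time for the enclosed block."""
    t0 = time.perf_counter()
    try:
        yield
    finally:
        print(f"{label}: {time.perf_counter() - t0:.4f}s")


with timed("chunking"):
    sizes = [len(c.data) for c in chunks([0.0] * 10, 4)]
print(sizes)  # chunk lengths: [4, 4, 2]
```

A candidate comfortable with this section should be able to explain every construct above, and how `mypy` would type-check it.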
Deep Learning & Model Optimization (Must-Have):
3–4+ years of hands-on deep learning experience, including production deployment — not just training.
Fluent in PyTorch (and ideally familiar with TensorFlow or JAX); able to read research code and understand custom layers, autograd, and training loops.
Model compression and optimization techniques, with real production experience in at least three of:
Post-training quantization (INT8, FP16, BF16)
Quantization-aware training
Structured and unstructured pruning
Knowledge distillation
Operator fusion and graph optimization
ONNX: exporting models from PyTorch, debugging unsupported ops, simplifying graphs (onnx-simplifier), and inspecting models with Netron.
ONNX Runtime: execution providers (CPU, CUDA, CoreML, OpenVINO, DirectML), session options, and performance tuning.
TensorRT: building engines, working with plugins, calibration for INT8, and debugging precision issues.
Triton Inference Server: model repository structure, ensemble models, dynamic batching, configuration of multiple backends (PyTorch, ONNX, TensorRT, Python).
Understanding of inference performance fundamentals: latency vs. throughput trade-offs, batch size effects, memory bandwidth limits, and how to profile a model end-to-end.
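The latency-vs-throughput bullet above can be made concrete with a toy profiling harness. This is a sketch only: `fake_model` is a stand-in for a real inference call, modeled as fixed per-batch overhead plus per-item cost, which is why larger batches trade higher per-batch latency for higher throughput:

```python
import time
from statistics import mean


def fake_model(batch: list[list[float]]) -> list[float]:
    """Stand-in for an inference call: fixed overhead + per-item work."""
    time.sleep(0.001 + 0.0002 * len(batch))
    return [sum(x) for x in batch]


def profile(batch_size: int, n_items: int = 32) -> tuple[float, float]:
    """Return (mean latency per batch in s, throughput in items/s)."""
    items = [[0.0] * 8 for _ in range(n_items)]
    latencies = []
    t0 = time.perf_counter()
    for i in range(0, n_items, batch_size):
        t = time.perf_counter()
        fake_model(items[i:i + batch_size])
        latencies.append(time.perf_counter() - t)
    total = time.perf_counter() - t0
    return mean(latencies), n_items / total


for bs in (1, 8, 32):
    lat, thr = profile(bs)
    print(f"batch={bs:2d}  latency/batch={lat * 1e3:6.2f} ms  "
          f"throughput={thr:7.1f} items/s")
```

Real profiling would of course use the runtime's own tooling (ONNX Runtime profiling, TensorRT's `trtexec`, Triton's perf analyzer), but the batch-size trade-off shown here is the same.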
Local/On-Device Deployment (Must-Have):
Deploying small DL models on CPU for Windows and macOS, with experience in at least two of:
ONNX Runtime with platform-specific execution providers
OpenVINO for x86 CPU optimization
Core ML / coremltools for macOS and Apple Silicon
Apple's Accelerate framework / Metal Performance Shaders
ggml-style runtimes or other quantized inference libraries
Understanding of CPU-specific optimization: SIMD (AVX2, AVX-512, NEON), threading models, and memory layout.
Experience packaging models for distribution in desktop applications — handling model versioning, downloading, caching, and integrity checks.
Awareness of cross-platform build challenges: code signing on macOS, MSVC vs. MinGW on Windows, universal binaries for Apple Silicon.
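A minimal stdlib-only sketch of the download/cache/integrity-check pattern the distribution bullet describes; `ensure_model` and the `fetch` callable are hypothetical names for illustration, not an existing API:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, streamed in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 16), b""):
            h.update(block)
    return h.hexdigest()


def ensure_model(cache_dir: Path, name: str, version: str,
                 expected_sha256: str, fetch) -> Path:
    """Return a verified cached model file, downloading it if needed.

    `fetch` is any callable returning the raw model bytes — a
    stand-in here for an HTTP download.
    """
    path = cache_dir / name / version / "model.onnx"
    if path.exists() and sha256_of(path) == expected_sha256:
        return path  # cache hit, integrity already verified
    path.parent.mkdir(parents=True, exist_ok=True)
    data = fetch()
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError("downloaded model failed integrity check")
    path.write_bytes(data)
    return path
```

Versioning the cache path by model name and version, and verifying the digest on both the cached file and the fresh download, is the core of what a desktop distribution path needs.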
Audio & Signal Processing (Must-Have):
Solid DSP fundamentals: sampling theory, FFT, STFT, filtering, windowing, and aliasing.
Hands-on experience with audio data: working with WAV/FLAC/MP3, resampling, normalization, and common audio feature extraction (MFCCs, mel-spectrograms, log-mel).
Familiar with audio Python libraries: librosa, torchaudio, soundfile, scipy.signal.
Awareness of real-time audio constraints: streaming inference, chunking strategies, lookahead vs. causal models, latency budgets.
Bonus: experience with audio-specific model architectures (Conformer, Wav2Vec2, Whisper, RNN-T, diffusion-based audio models, etc.).
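To illustrate the framing/windowing/STFT fundamentals above, here is a deliberately naive stdlib-only STFT (a real pipeline would use librosa or torchaudio's optimized FFTs); the test tone and all function names are illustrative:

```python
import cmath
import math


def hann(n: int) -> list[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]


def stft_frame(frame: list[float]) -> list[float]:
    """Magnitude spectrum of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, hann(n))]
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def stft(signal: list[float], n_fft: int = 64, hop: int = 32) -> list[list[float]]:
    """Frame the signal with 50% overlap and transform each frame."""
    return [stft_frame(signal[i:i + n_fft])
            for i in range(0, len(signal) - n_fft + 1, hop)]


# A 1 kHz tone at 8 kHz sample rate: exactly 8 cycles per 64-sample
# frame, so energy concentrates in DFT bin 8 (= 1000 * 64 / 8000).
sr, n_fft = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(256)]
frames = stft(tone, n_fft=n_fft, hop=32)
peak_bin = max(range(len(frames[0])), key=frames[0].__getitem__)
print(peak_bin)  # → 8
```

A candidate strong in this section should be able to explain the bin-to-frequency mapping, why the Hann window leaks energy into neighboring bins, and how hop size trades time resolution for compute.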
Cloud & Infrastructure (Must-Have):
Production experience with at least one major cloud: AWS, GCP, or Azure.
Infrastructure as Code with hands-on experience in at least one of:
Terraform (preferred) — modules, state management, workspaces
Pulumi — preferably with the Python SDK
AWS CDK or equivalent
Containers: writing efficient Dockerfiles, multi-stage builds, GPU-enabled images (nvidia/cuda base images), and understanding image size optimization.
Kubernetes: at least basic working knowledge — deployments, services, configmaps, secrets, and ideally GPU scheduling.
CI/CD with GitHub Actions, GitLab CI, or similar — you've built pipelines, not just used them.
Familiar with GPU instance types, costs, and trade-offs across cloud providers.
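One way the multi-stage GPU Dockerfile bullet might look in practice. This is a sketch only: the base-image tags, file paths, and entry point are assumptions, not a prescribed setup:

```dockerfile
# Builder stage: full CUDA toolchain for compiling wheels/extensions.
FROM nvidia/cuda:12.1.1-devel-ubuntu22.04 AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Runtime stage: slimmer CUDA runtime image, no compilers shipped.
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 && rm -rf /var/lib/apt/lists/*
COPY --from=builder /install /usr/local
COPY src/ /app/src/
WORKDIR /app
ENTRYPOINT ["python3", "-m", "src.server"]
```

Keeping the `devel` toolchain out of the final image is the main lever for image-size optimization that the containers bullet refers to.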
Personal Traits (Must-Have):
Comfort owning ambiguity — you can make good decisions without complete information.
Pragmatism over perfectionism — you ship things that work, then improve them.
Strong written communication — you can document decisions and explain technical trade-offs to non-technical founders.
Builder's instinct — the absence of existing infrastructure is a feature for you, not a bug.
Nice-to-Have:
Founding or early-stage startup experience (employees 1–10).
Experience designing and shipping SDKs that external developers actually use — including documentation, examples, and versioning policies.
Real-time/streaming inference experience — WebSockets, gRPC bidirectional streaming, or audio streaming protocols.
Experience with model registries (MLflow, Weights & Biases Artifacts, DVC) and reproducible model versioning.
Familiarity with WebAssembly/WASM for in-browser model inference.
Mobile deployment experience (iOS Core ML, Android NNAPI, TensorFlow Lite).
Open-source contributions to ML, audio, or inference frameworks.
Experience mentoring or leading engineers — you'll likely be doing this within a year.
Self-Assessment:
A rough guide to whether this role is right for you:
Strong fit: You match nearly all the must-haves and several nice-to-haves. You should apply.
Reasonable fit: You match most of the must-haves but have gaps in 1–2 areas (e.g., strong on deployment but light on audio, or vice versa). Apply and tell us in your cover note where you'd grow into the role.
Not yet a fit: You match the deep learning side but haven't shipped models to production, or you're strong on infrastructure but haven't worked with deep learning models. This role will be frustrating for you — wait until you have more deployment experience under your belt.
What You Should Know Before Applying
We want to be honest about this role:
It is not a comfortable senior IC role at an established company. There are no existing systems, no senior engineers above you, no established playbook. You will figure things out.
The founders are not engineers. We will lean on your judgment heavily for technical decisions. We're good collaborators, but we won't be debugging your Kubernetes config.
The work will be broad. Some weeks you'll be deep in TensorRT optimization. Other weeks you'll be writing docs, choosing a cloud provider, or interviewing candidates.
The trade-off is real ownership. You'll have a level of impact, autonomy, and equity that's not available at a larger company. The founding decisions will be yours.
If this kind of scope sounds like a chance to do the most meaningful work of your career, you're who we're looking for. If it sounds stressful and unstructured, this role will not make you happy, and we'd rather know that now.
Our Interview Process:
We've kept the process focused and respectful of your time. There are three main interviews:
Coding Round (Pair Programming): We'll solve a coding problem together. We care more about how you think and communicate than whether you reach the perfect answer.
Deep Learning Deep-Dive: A technical conversation on deep learning concepts, model optimization techniques, and deployment trade-offs. Expect discussion on pruning, quantization, runtime selection, and real production scenarios.
Founding Fit & Problem Discussion: A two-way conversation about the actual problems you'll be solving, the company we're building, and whether this role fits your goals. We'll be honest about where we are; we expect you to be honest about what you want.
Compensation:
This is a founding role and the compensation reflects that. We offer competitive salary plus meaningful equity — the kind that matters if the company succeeds. Specifics are open and we'll discuss them transparently with serious candidates.
If you've ever wanted to build the deployment and production foundation of a deep learning company from scratch — this is that role. We'd love to talk.
HireTo Co.
Indore, Madhya Pradesh, India
Founded in 2019, HireTo is the AI-powered staffing and recruitment arm of Kuvaka Tech, dedicated to delivering high-quality IT hiring solutions for startups and mid-sized companies. Leveraging advanced AI-driven technical assessments, domain expertise, and a deep understanding of emerging technologies, HireTo specializes in sourcing, evaluating, and placing top-tier talent across Cloud, Backend, Frontend, Mobile Development, DevOps, Data Engineering, Data Science, AI/ML, and Prompt Engineering roles. Backed by Kuvaka Tech's strong foundation in Blockchain, Artificial Intelligence, and SaaS solutions, HireTo combines precision, speed, and scalability, ensuring the right candidates reach clients on time, every time.