Walmart
Senior, Software Engineer - AI Systems
Entry Level, Remote, Full-time
Location
Sunnyvale, CA
Salary
Not listed
Experience
4+ years
Posted
Today
Job Description
Location: (USA) BELLEVUE WALMART GLOBAL TECH WA BELLEVUE Home Office
Position Summary...
We’re seeking a Software Engineer to design and build AI-first systems with a focus on agentic AI, high-performance data/compute frameworks, and scalable, production-grade services. You’ll work across model-driven features and platform layers: integrating LLMs and agents, orchestrating pipelines with Ray, accelerating data science workloads with RAPIDS, and delivering robust APIs and services that power high-impact AI applications at scale.
The ideal candidate blends strong software engineering fundamentals with practical ML systems exposure and a passion for performance, reliability, and developer experience.
What you'll do...
Key Responsibilities
AI Systems & Agentic Workflows
Build agentic AI services (planning, tool use, retrieval, feedback loops) and integrate them with internal systems and APIs.
Implement orchestration, memory, tooling, evaluation, and guardrails for agentic workflows.
Collaborate with DS/MLE partners to productionize models (LLMs, GNNs, embedding services) behind stable APIs and SDKs.
Accelerated Compute & Data Pipelines
Develop GPU-accelerated pipelines using RAPIDS (cuDF/cuML/cuGraph) and optimize end-to-end performance.
Use Ray (or similar) for distributed compute, batch/stream processing, and scalable workflow orchestration.
Profile and optimize bottlenecks across CPU/GPU, memory, and I/O layers; implement caching, vectorization, and async patterns.
Service & Platform Engineering
Design and maintain reliable microservices for training/inference, vector indexing, and real-time decisioning.
Implement observability (tracing/metrics/logging), fault tolerance, auto-scaling, and cost-aware execution.
Create internal SDKs/CLIs to streamline developer workflows, testing, and reproducibility.
Quality, Security & MLOps Integration
Establish CI/CD for AI services (unit/integration/e2e tests, canaries, blue/green, rollback).
Integrate with feature stores, vector databases, artifact registries, and model catalogs.
Enforce security, privacy, and compliance (data minimization, PII handling, governance, auditability).
Collaboration & Influence
Partner with product, platform, and DS/MLE teams to align requirements, SLAs, and success metrics.
Document systems thoroughly; contribute to design reviews and engineering best practices.
Mentor peers on AI systems patterns, distributed compute, and performance engineering.
Minimum Qualifications
Bachelor’s/Master’s in CS, Engineering, or equivalent industry experience.
4+ years building production backend or platform services (preferably in AI/ML contexts).
Proficiency in:
Languages: Python (primary), plus one of Go/Java/C++ for performance-critical services.
Distributed frameworks: Ray, Spark, or Dask.
Accelerated compute: RAPIDS (cuDF/cuML/cuGraph) and GPU-aware programming concepts (streams, memory).
Service frameworks: FastAPI/Flask (Python), Kubernetes (K8s), and containerization (Docker).
Strong foundations in data structures/algorithms, concurrency, networking, and systems design.
Preferred Qualifications
Production experience with agent frameworks (e.g., LangGraph-style planners, tool-use patterns, retrieval and memory components).
Experience with vector databases (e.g., FAISS, Milvus, pgvector, Pinecone) and feature stores.
Familiarity with LLM and embedding services, prompt/tooling patterns, and evaluation harnesses.
Hands-on experience with Kubernetes, autoscaling (HPA/KEDA), and GPU scheduling/operators.
Performance profiling with PyTorch Profiler, Nsight, line_profiler, and the Ray dashboard.
Experience with vLLM, Triton Inference Server, ONNX Runtime, or TensorRT for high-throughput inference.
Soft Skills & Leadership
Pragmatic problem solver with a bias for measurable outcomes (latency, throughput, reliability).
Excellent communicator able to translate between research goals and production constraints.
Drives clarity in ambiguous problem spaces; mentors others and uplifts engineering standards.
About Walmart Global Tech
Additional Locations: Bellevue, WA, (USA) Crossman Excellence Building CA SUNNYVALE Home Office