Staff Engineer - Data
Pune, IN
About our group:
The Global Wafer Systems (GWS) group is responsible for the design, development, and support of essential business solutions across factory locations in Asia, Europe, and the United States. The team specializes in the advancement and maintenance of factory control systems, utilizing Artificial Intelligence in fields including image processing, automated data monitoring, recommendation, and automated decision-making.
About the role - you will:
● Define data platform strategy, AI-ready data foundations, and engineering best practices at scale.
● Lead the architecture and evolution of enterprise-scale data platforms supporting smart factory operations, including operational, parametric, sensor, and image data.
● Own end-to-end data architecture for AI and GenAI use cases, including LLM/RAG pipelines, feature stores, vector-ready datasets, metadata frameworks, and data lineage.
● Define and enforce data engineering standards for data modeling, ETL/ELT patterns, data quality, observability, governance, and lifecycle management.
● Design scalable and resilient data pipelines across batch and streaming workloads using Hadoop, distributed SQL engines, and cloud/on‑prem hybrid architectures.
● Partner with AI/ML, application, and product teams to translate business problems into robust data solutions and reusable data assets.
● Mentor and technically guide junior and senior data engineers, providing architectural direction, code reviews, and design feedback.
● Lead investigations of complex data problems, driving root-cause analysis and systemic fixes rather than local workarounds.
● Influence roadmap and technical strategy across teams and sites, aligning data capabilities with factory, analytics, and AI transformation goals.
● Drive GenAI readiness at scale, ensuring datasets are discoverable, well-documented, semantically consistent, and safe for AI/agent consumption.
About you:
● Strong passion for data platform excellence, AI-ready data, and GenAI enablement.
● Ability to think architecturally while remaining deeply technical.
● Comfort influencing without authority across global, cross-functional teams.
● A proactive mindset toward data quality, reliability, and long-term maintainability.
● Willingness to operate in a global environment with flexible working hours when required.
Your experience includes:
● Advanced expertise in SQL, including performance tuning and optimization in distributed data platforms.
● Deep experience designing and building large-scale ETL/ELT pipelines and data architectures.
● Strong proficiency in Python for data engineering, automation, and framework development.
● Expert-level data modeling skills, including dimensional, wide-table, and AI/feature-oriented models.
● Hands-on experience with Oracle PL/SQL, including complex procedures and performance optimization.
● Proven experience preparing data for AI/GenAI systems, including:
o Vector-friendly transformations
o Metadata extraction and enrichment
o Structured and unstructured data processing
o Data validation for AI consumption
● Experience leading technical design discussions and making architectural trade-offs.
● Experience with GenAI frameworks and patterns, including RAG architectures, embeddings, inference optimization, and prompt/data observability.
● Strong hands-on experience with distributed query engines (e.g., Trino) and big data ecosystems.
● Experience with workflow orchestration tools such as Airflow or NiFi at scale.
● Experience designing streaming ingestion frameworks using Spark Streaming, Kafka, or equivalent.
● Background in data governance, data quality frameworks, and observability tooling.
Location:
Our site in Pune is dynamic, both in its cutting-edge, innovative work and in its vibrant on-site food, athletic, and personal development opportunities for employees. You can enjoy breakfast, lunch, or dinner at one of four cafeterias in the park. Take a break from your workday to join one of our many walkathons, or compete against your colleagues in carrom, chess, and table tennis. Learn about a technical topic outside your area of expertise at our monthly Technical Speaker Series, or attend one of the frequent on-site cultural festivals, celebrations, and community volunteer opportunities.
Location: Pune, India
Travel: None