Big Data Engineer - Intern
Wuxi, CN
About our group:
We are a proactive, highly solutions-oriented and collaborative team that works with business groups across the organization. Our purpose in capturing massive amounts of data is to transform this vital information into concrete and valuable insights that allow Seagate to make better and more strategic business decisions.
About the role - you will:
· Be part of a team of 10-12 Platform Engineers that develops and maintains Big Data (Data Lake, Data Warehouse and Data Integration) and advanced analytics platforms at Seagate
· Apply your hands-on subject matter expertise in the architecture and administration of Big Data platforms - Data Warehouse Appliances, open data lakes (AWS EMR, Hortonworks), and data lake technologies (AWS S3/Databricks/other) - along with ML and Data Science platforms (Spark ML, H2O, KNIME)
· Develop and manage Spark ETL frameworks, orchestrate data pipelines with Airflow, and support building Presto/Trino queries for key stakeholders
· Design, scale and deploy Machine Learning pipelines.
· Collaborate with Application Architects and Business SMEs to design and develop end-to-end data pipelines and supporting infrastructure.
· Establish and maintain productive relationships with peer organizations, partners, and software vendors
About you:
· Excellent coding skills in any language and a deep desire to learn new skills and technologies.
· You’re a passionate professional who is up to the challenge of blending the fast-changing technology landscape of Big Data analytics with the complex, high-impact space of high-tech and manufacturing analytics
· As a motivated self-starter, you have experience working in a dynamic environment
· Exceptional data engineering skills in large, high-scale data platforms and applications using cloud and big data technologies such as the Hadoop ecosystem and Spark
· Strong appetite for constant learning, thinking outside the box, and questioning problems and solutions with the intent to understand and solve them better
· Excellent interpersonal skills to develop relationships with different teams and peers in the organization
Your experience includes:
· Knowledge of big data processing frameworks: Spark, Hadoop, Hive, Kafka, EMR
· Big data solutions in the cloud (AWS or other)
· Advanced, hands-on experience in the architecture and administration of big data platforms
· Data Warehouse Appliances, Hadoop (AWS EMR), data lake technologies (AWS S3/GCS/other), and ML and Data Science platforms (Spark ML, H2O, KNIME)
· Python, Java, Scala
· DevOps, Continuous Delivery, and Agile development
· Creating a culture of technical excellence by leading code and design reviews, promoting mentorship, and identifying and promoting educational opportunities for engineers
You might also have:
· Strong understanding of microservices and container-based development using the Docker and Kubernetes ecosystem is a big plus
· Experience working in a software product development environment is a big plus
Location:
Located in the high technology district, our site in Wuxi is a global manufacturing hub, where nearly 5,000 employees deliver technical services, customer test, manufacturing, and support capabilities. On site you can grab breakfast, lunch, dinner and snacks at our canteen, grab-and-go market and coffee shop. We offer basketball, badminton, yoga clubs and group exercise classes. We also have music, dance, photography, and literature clubs galore, along with an active Toastmasters International chapter, and frequently have on-site festivals, celebrations and community volunteer opportunities.
Location: Wuxi, China
Travel: None