Data Engineer, Scaling Analytics
OpenAI
Listing data may include public employer ATS feeds and Jobs by Adzuna.
Before you apply
The decision-making details job seekers want first
We pulled the strongest signals from the listing so you can quickly judge fit, compensation, and what the company expects before opening the full source post.
Compensation
Salary & market context
354% above the BLS national median
BLS national median: $74,680
Requirements
Top requirements
- 5+ years of experience building and maintaining production data pipelines and analytical systems.
- Strong proficiency in SQL and experience designing scalable data models.
- Proficiency in Python or another programming language commonly used for data engineering.
Perks & setup
Benefits candidates care about
- About OpenAI: OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
Start here
Requirements
- 5+ years of experience building and maintaining production data pipelines and analytical systems.
- Strong proficiency in SQL and experience designing scalable data models.
- Proficiency in Python or another programming language commonly used for data engineering.
- Experience working with modern data warehouses (e.g., Snowflake, BigQuery, Redshift) and orchestration frameworks (e.g., Airflow, Dagster).
- Experience designing reliable ETL/ELT workflows with a focus on maintainability, performance, and operational excellence.
- Experience partnering with cross-functional stakeholders to translate business requirements into technical solutions.
- Experience implementing data quality checks, monitoring, and observability practices in production environments.
Responsibilities
What you'll do
- Design, build, and maintain scalable data pipelines supporting infrastructure deployment, operations, capacity planning, and supply chain functions.
- Develop trusted datasets and reporting systems that provide visibility into hardware inventory, deployment status, site readiness, capacity utilization, and operational performance.
- Partner with cross-functional stakeholders to define metrics, establish data standards, and improve decision-making across infrastructure organizations.
- Create scalable data models that enable consistent reporting and analytics across multiple data sources and operational systems.
- Improve data quality, lineage, observability, and governance practices across critical infrastructure datasets.
- Support executive reporting, operational reviews, forecasting exercises, and strategic planning initiatives through reliable analytical foundations.
- Collaborate with engineering teams to integrate new data sources and operational telemetry into existing analytics ecosystems.
Role snapshot
About the role
About the Team
OpenAI, in close collaboration with our capital partners, is building the world's most advanced AI infrastructure ecosystem. The Scaling Analytics team serves as the data backbone for this effort, enabling leaders and operators to make informed decisions across infrastructure deployment, hardware operations, supply chain, capacity planning, and site execution.
As OpenAI’s Industrial Compute expands across an increasing number of global data center campuses, the complexity of managing infrastructure capacity, hardware health, supply flows, and operational performance continues to grow. Scaling Analytics develops the data models, pipelines, metrics, and reporting systems that transform fragmented operational data into actionable insights, helping OpenAI operate infrastructure at unprecedented scale.
We are seeking a Data Engineer to help build and scale the analytical foundations that power OpenAI's infrastructure organization. This individual will partner closely with Hardware Operations, Capacity Planning, Supply Chain, Infrastructure Delivery, Finance, and Engineering teams to create reliable data products that support critical operational and strategic decisions.