About the job
Data Architect - Senior (Hybrid) JP921
This position will play a key role in enhancing EPA's Digital Regulatory Assurance System. The position is in a high-performing team, working in a fast-paced, agile environment.
A. Reducing duplication in EPEA and WA Monitoring Report Templates
Identify overlapping parameters, metrics, and calculations in monitoring and reporting templates across programs and regulatory regimes.
Define and maintain canonical data elements (e.g., facility, authorization, activity, parameter, time, location) to support consistent interpretation and reuse.
Rationalize template designs to reduce redundancy and prepare them for ingestion into standardized data structures.
B. Designing Streamlined Data Structures
Translate manual, upload-oriented reporting templates into analytics-ready data structures, including:
Object / fact tables (e.g., sample observations, measurements, monitoring events)
Dimension tables (e.g., facility, source, program, parameter, geography, time)
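As a rough illustration of the kind of translation described above, the sketch below unpivots rows from a hypothetical wide, upload-style template into long-format fact rows keyed to small dimension tables. It is not a specification for this role: every column name, facility name, and parameter name is invented, and the surrogate-key scheme is only one of several reasonable choices.

```python
from dataclasses import dataclass

# Hypothetical wide template rows: one column per parameter (all names invented).
template_rows = [
    {"facility": "Plant A", "sample_date": "2024-01-15", "pH": 7.2, "TSS_mg_L": 12.0},
    {"facility": "Plant B", "sample_date": "2024-01-15", "pH": 6.9, "TSS_mg_L": 30.5},
]


@dataclass(frozen=True)
class Observation:
    """Fact row at the grain: one facility, one parameter, one sample date."""
    facility_key: int
    parameter_key: int
    sample_date: str
    value: float


def build_dimension(values):
    """Assign a surrogate key to each distinct value, in first-seen order."""
    dim = {}
    for v in values:
        dim.setdefault(v, len(dim) + 1)
    return dim


def to_star(rows, id_columns=("facility", "sample_date")):
    """Unpivot wide rows into fact rows plus facility and parameter dimensions."""
    facility_dim = build_dimension(r["facility"] for r in rows)
    parameter_dim = build_dimension(
        col for r in rows for col in r if col not in id_columns
    )
    facts = [
        Observation(
            facility_dim[r["facility"]],
            parameter_dim[col],
            r["sample_date"],
            float(val),
        )
        for r in rows
        for col, val in r.items()
        if col not in id_columns
    ]
    return facts, facility_dim, parameter_dim


facts, facility_dim, parameter_dim = to_star(template_rows)
```

Here each measured value becomes its own fact row, so adding a new parameter to a template adds rows and one dimension entry rather than a new column.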
Lead and steward the Unified Business Object model for EPEA Approvals and related regulatory domains by defining:
Core business objects
Attributes and relationships
Authoritative definitions shared across DRAS and the DMP
Define structural standards, including:
Row-level grain
Column and attribute consistency
Versioning, corrections, and historical traceability
Document and maintain a traceable record of architectural decisions and their rationale.
Translate finalized data model designs into engineering specifications (e.g., schemas, ingestion contracts, transformation expectations, and data quality rules) for the data engineering team. Provide design guidance during pipeline build, review platform implementations for structural conformance, and surface any deviations from canonical model intent before data reaches analytical layers.
C. Preparing Data for Advanced Analytics (Databricks)
Design data structures that support advanced analytical use cases, including:
Aggregation and roll-ups
Trend analysis
Anomaly and outlier detection
Future machine learning and predictive use cases
Ensure data structures are:
Normalized where semantic clarity and governance are required
Denormalized where performance and analytical usability matter
Establish and maintain a structured model feedback loop with the data analysts, including regular review of validation findings, edge-case reports, and prototyping results to inform schema evolution decisions.
Plan for schema evolution, enabling future environmental monitoring and reporting needs to be accommodated with minimal disruption while preserving analytical continuity.