Data Engineering

Apache Spark

Apache Spark is a distributed data processing engine for large-scale computation.

Audience Intent Data Model Development

Designed a unified data model to integrate any data source into a big data geospatial analytics program, handling over 100TB of data in GCS and BigQuery with...

Data Engineering and Big Data

Projects building data pipelines, warehouses, lakes, and large-scale analytics infrastructure.

Data lineage systems

Tracking systems record the origin, transformations, and dependencies of data across pipelines and reports.
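
As a rough illustration of the idea, the sketch below records which inputs and transformation produced each dataset, then walks the dependencies back to their origins. The class name `LineageGraph`, its methods, and the dataset names are hypothetical and chosen only for this example; real lineage tools store far richer metadata.

```python
from dataclasses import dataclass, field


@dataclass
class LineageGraph:
    # Maps each dataset name to its direct inputs and the transformation that produced it.
    edges: dict = field(default_factory=dict)

    def record(self, output, inputs, transformation):
        """Register that `output` was derived from `inputs` via `transformation`."""
        self.edges[output] = {"inputs": list(inputs), "transformation": transformation}

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or indirectly."""
        seen = []
        stack = [dataset]
        while stack:
            current = stack.pop()
            for parent in self.edges.get(current, {}).get("inputs", []):
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return sorted(seen)


# Hypothetical pipeline: raw events are cleaned, then joined into a report.
graph = LineageGraph()
graph.record("clean_events", ["raw_events"], "drop malformed rows")
graph.record("daily_report", ["clean_events", "store_dim"], "join and aggregate by day")
origins = graph.upstream("daily_report")
```

Here `origins` lists every dataset the report depends on, which is the kind of question a lineage system is meant to answer when a source changes or a report looks wrong.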

Data modeling systems

Data models represent entities and their relationships in abstract form, structured for efficient storage, querying, and interpretation.

Data monetization systems

Marketplaces and exchange mechanisms sell data as a product, aligning suppliers and consumers through pricing, packaging, and access controls.

Data privacy/compliance systems

Frameworks govern how data is stored, shared, and used to meet legal, contractual, and ethical expectations.

Data product systems

Pipelines and structures turn raw data into packaged, sellable, and repeatable products with defined schemas and use cases.

Data quality systems

Processes and tools ensure data accuracy, completeness, and consistency through validation, monitoring, and correction.
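
A minimal sketch of the validation step, assuming records arrive as dictionaries and that "complete" means every required field is present and non-empty. The function name `validate_rows`, the field names, and the sample rows are hypothetical.

```python
def validate_rows(rows, required_fields):
    """Split rows into valid ones and rejected ones, noting which fields were missing."""
    valid, rejected = [], []
    for row in rows:
        # A field counts as missing when it is absent, None, or an empty string.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            rejected.append((row, missing))
        else:
            valid.append(row)
    return valid, rejected


# Hypothetical sample data with two incomplete records.
rows = [
    {"id": 1, "city": "Lisbon"},
    {"id": 2, "city": ""},
    {"id": None, "city": "Oslo"},
]
valid, rejected = validate_rows(rows, ["id", "city"])

# Share of rows that passed validation, usable as a simple monitoring metric.
completeness = len(valid) / len(rows)
```

Keeping the rejected rows together with the reason for rejection supports the correction step: they can be routed to a quarantine table or a review queue rather than silently dropped.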

Data Reporting Transformation for Hotel Chain

Solutions architect role redesigning a fundamentally flawed Pentaho ETL pipeline into a scalable AWS Redshift data warehouse for a hospitality leader. Identified root...

What It Is

Apache Spark

Audience Intent Data Model Development