Own Pemo’s core data infrastructure across transactional databases and the data warehouse, ensuring it is scalable, reliable, and ready for rapid product growth.
Design and optimize data models in PostgreSQL for our core platform, working closely with backend engineers to support high-throughput, low-latency product features.
Define and implement data warehouse models (fact/dimension tables, star/snowflake schemas) in solutions such as BigQuery, ClickHouse, Snowflake, or Redshift, optimized for analytics and AI use cases.
Design and maintain ingestion flows from operational systems (e.g. PostgreSQL) and external tools (e.g. Segment, Mixpanel, Elastic) into the warehouse, balancing freshness, cost, and complexity.
Tune transactional and analytical performance through indexing strategies, partitioning, clustering, materialized views, and schema refactoring where needed (illustrated in the first sketch after this list).
Implement and maintain observability for data infrastructure, including query performance, data freshness, job failures, and schema changes across both transactional and analytical systems (see the second sketch after this list).
Work closely with data analysts to provide clean, well-modeled datasets that power BI dashboards and self-serve analytics for teams across Pemo.
Collaborate with business stakeholders (finance, operations, compliance, leadership) to understand reporting, control, and audit requirements and reflect them in data models and tooling.
Partner with AI engineers to design schemas, tables, and views that are optimized for AI agents, retrieval, and downstream ML/LLM workloads.
Work with backend engineers to align on domain boundaries, event models, and data contracts so that product features and data infrastructure evolve together.
Implement practical data compliance controls such as masking, column-level permissions, row-level filters, and PII separation in coordination with our security and compliance efforts.
Act as the go-to person for data infrastructure questions, helping teams diagnose data issues, performance bottlenecks, and modeling gaps.
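
To make the modeling and tuning responsibilities above concrete, here is a minimal illustrative sketch in Python against PostgreSQL. Every name in it (fact_transactions, dim_merchant, dim_card, mv_daily_spend, the "analytics" DSN) is hypothetical, and a real warehouse model would be designed against whichever engine is chosen.

    import psycopg2

    # Hypothetical star-schema DDL: a range-partitioned fact table plus a
    # materialized view that pre-aggregates daily spend for dashboards.
    DDL = """
    CREATE TABLE IF NOT EXISTS fact_transactions (
        transaction_id bigint      NOT NULL,
        merchant_key   int         NOT NULL,  -- points at dim_merchant
        card_key       int         NOT NULL,  -- points at dim_card
        amount_minor   bigint      NOT NULL,  -- money in minor units
        occurred_at    timestamptz NOT NULL
    ) PARTITION BY RANGE (occurred_at);

    CREATE TABLE IF NOT EXISTS fact_transactions_2025_01
        PARTITION OF fact_transactions
        FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

    CREATE INDEX IF NOT EXISTS fact_txn_merchant_day_idx
        ON fact_transactions (merchant_key, occurred_at);

    CREATE MATERIALIZED VIEW IF NOT EXISTS mv_daily_spend AS
    SELECT merchant_key,
           date_trunc('day', occurred_at) AS day,
           sum(amount_minor)              AS total_minor,
           count(*)                       AS txn_count
    FROM fact_transactions
    GROUP BY 1, 2;
    """

    with psycopg2.connect("dbname=analytics") as conn:
        with conn.cursor() as cur:
            cur.execute(DDL)

Pre-aggregating into a materialized view like this is one common way to keep dashboard queries fast without scanning the raw fact table on every load.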
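On the observability side, a similarly hedged sketch: a freshness check that fails loudly when ingestion falls behind. The table, DSN, and threshold are again made up for illustration.

    import psycopg2

    # Hypothetical freshness check: alert if the newest row in an ingested
    # table is older than an agreed threshold.
    FRESHNESS_SQL = """
    SELECT extract(epoch FROM now() - max(occurred_at))
    FROM fact_transactions;
    """

    MAX_LAG_SECONDS = 15 * 60  # tolerate 15 minutes of ingestion lag

    with psycopg2.connect("dbname=analytics") as conn:
        with conn.cursor() as cur:
            cur.execute(FRESHNESS_SQL)
            lag = cur.fetchone()[0]

    # None means the table is empty; both cases should page someone.
    if lag is None or lag > MAX_LAG_SECONDS:
        raise RuntimeError(f"fact_transactions is stale: lag={lag}s")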
Who we are looking for
A strong software engineer with 3-5 years of experience working on data-intensive backend systems, databases, or data platforms.
Deep experience with relational databases, especially PostgreSQL, including schema design, indexing, query optimization, and a working understanding of the query planner.
Proficient in data warehouse solutions such as BigQuery, ClickHouse, Snowflake, or Redshift, including partitioning, clustering, and designing analytics-friendly schemas.
Solid understanding of data ingestion patterns (CDC, batch loads, streaming), and experience building or maintaining reliable ingestion pipelines from operational databases into a warehouse.
Comfortable working with SQL at an advanced level, including complex joins, window functions, aggregations, and performance tuning on large datasets (see the first sketch after this list).
Experience supporting BI and analytics teams, providing well-modeled data, documenting datasets, and collaborating on metrics definitions and dashboards.
Familiar with data observability concepts (freshness, completeness, schema changes, data drift) and monitoring/alerting for data pipelines and warehouses.
Able to implement data compliance requirements in practice, such as masking sensitive columns, enforcing column- and row-level access, and separating PII from non-PII datasets using the tools available in the data stack (see the second sketch after this list).
Familiar with data governance concepts and tooling (data catalogs, lineage, ownership, documentation) and able to apply them pragmatically in a growing startup.
Comfortable collaborating with backend, AI, analytics, and business stakeholders, translating their requirements into concrete data models, tables, and flows.
Strong communicator who can explain trade-offs in schema design, performance, and data architecture in simple language to non-technical stakeholders.
Self-motivated, hands-on, and comfortable taking ownership in a fast-moving, high-responsibility environment.
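
As a flavor of the advanced SQL the role calls for, here is a hypothetical window-function query, reusing the made-up mv_daily_spend view from the first sketch, that keeps each merchant's top three spending days:

    import psycopg2

    # Rank each merchant's days by total spend and keep the top three.
    TOP_DAYS_SQL = """
    SELECT merchant_key, day, total_minor
    FROM (
        SELECT merchant_key,
               day,
               total_minor,
               row_number() OVER (
                   PARTITION BY merchant_key
                   ORDER BY total_minor DESC
               ) AS day_rank
        FROM mv_daily_spend
    ) ranked
    WHERE day_rank <= 3;
    """

    with psycopg2.connect("dbname=analytics") as conn:
        with conn.cursor() as cur:
            cur.execute(TOP_DAYS_SQL)
            for merchant_key, day, total_minor in cur.fetchall():
                print(merchant_key, day, total_minor)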
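And a hypothetical sketch of practical compliance controls in PostgreSQL: a masked view plus row-level security. All table, column, and role names (cards, card_number, tenant_id, analyst) are invented for illustration; the real controls would depend on the tools in Pemo's actual stack.

    import psycopg2

    # Hypothetical controls: analysts get a masked view instead of the raw
    # table, and row-level security scopes each session to its tenant.
    COMPLIANCE_DDL = """
    CREATE OR REPLACE VIEW cards_masked AS
    SELECT card_key,
           'XXXX-XXXX-XXXX-' || right(card_number, 4) AS card_number_masked
    FROM cards;

    GRANT SELECT ON cards_masked TO analyst;
    REVOKE ALL ON cards FROM analyst;

    ALTER TABLE cards ENABLE ROW LEVEL SECURITY;
    CREATE POLICY tenant_isolation ON cards
        USING (tenant_id = current_setting('app.tenant_id')::int);
    """

    with psycopg2.connect("dbname=analytics") as conn:
        with conn.cursor() as cur:
            cur.execute(COMPLIANCE_DDL)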