Data Architect
- 5–8 years of experience
- Bengaluru, Chennai
- Full Time
We are expanding our Data Intelligence practice and are seeking a hands-on Data Architect with 5–8 years of experience designing and delivering modern cloud data platforms. The Data Architect will define the reference architecture, standards, and patterns that underpin our analytics, reporting, and AI/ML solutions, and will work shoulder-to-shoulder with data engineers, analysts, and platform teams to translate architecture into working, production-grade systems.
The ideal candidate has production experience on Databricks or Snowflake (Databricks preferred), strong opinions on data modeling and data governance, and a pragmatic, hands-on mindset — comfortable whiteboarding a target-state architecture in the morning and prototyping a solution in SQL or Python in the afternoon. The Data Architect will partner closely with stakeholders across engineering, analytics, data science, security, and business domains to ensure the platform evolves in line with business needs. Employees may perform other related duties as required to meet the ongoing needs of the organization.
Essential Responsibilities
- Own the reference architecture for the modern data platform — ingestion, storage, transformation, semantic/serving, and consumption layers — on Databricks or Snowflake (Databricks preferred).
- Define and evolve platform standards, patterns, and guardrails — data modeling conventions, medallion architecture (bronze/silver/gold), naming, folder/schema layout, and environment strategy (dev/test/prod).
- Design enterprise-grade data models — Kimball dimensional, Data Vault 2.0, and/or domain-oriented data product designs — and guide their implementation across the practice.
- Stay hands-on: build prototypes, proofs-of-concept, and reference implementations in SQL and Python/PySpark; review pipeline code and pull requests for architectural conformance.
- Lead architecture for data governance and stewardship — data contracts, catalogs, lineage, data quality frameworks, master data management (MDM), and classification — leveraging Unity Catalog, Snowflake Horizon, Purview, Collibra, or equivalent.
- Champion data mesh / data-as-a-product principles where appropriate — domain ownership, data product contracts, discoverability, and federated governance — balanced with a pragmatic central platform.
- Architect AI/ML and GenAI enablement on the platform — feature stores, vector stores, retrieval-augmented generation (RAG), LLM integration patterns, and MLOps handoffs.
- Partner with security and compliance to design controls — identity (Azure AD / Entra ID), access (RBAC, row/column security, masking), encryption, PII handling, and audit logging.
- Own platform cost and capacity architecture — cluster/warehouse sizing patterns, workload isolation, storage tiering, FinOps tagging, budgets, alerts, and chargeback models.
- Engage stakeholders at every level of the organization — elicit requirements, translate them into target-state architecture, and communicate trade-offs clearly to technical and non-technical audiences.
- Lead architecture reviews, design sessions, and technical deep-dives; provide final architectural sign-off on major releases and production deployments.
- Mentor data engineers and analysts on architectural principles and modern data platform best practices; grow the architectural maturity of the practice.
- Stay current with the data ecosystem; evaluate new features and tools, and recommend adoption where they drive measurable business or platform value.
Required Qualifications
- 5–8 years of progressive experience in data engineering, data platform, or data architecture roles, including demonstrable time as an architect or technical lead.
- Production experience designing and delivering solutions on a modern cloud data platform — Databricks or Snowflake (Databricks strongly preferred).
- Demonstrated experience defining reference architectures, platform standards, and patterns adopted by multiple teams.
- Strong, hands-on proficiency in SQL and Python (PySpark a plus) — able to prototype, review code, and debug production issues.
- Deep expertise in data modeling — Kimball dimensional modeling, Data Vault 2.0, and medallion architecture — with a clear point of view on when each applies.
- Hands-on experience with ETL/ELT design patterns, streaming / real-time architectures, and ingestion from heterogeneous sources (ERPs, SaaS APIs, databases, files, events).
- Solid experience designing for data governance — access controls, PII handling, row/column-level security, masking, tokenization, data quality, lineage, and catalogs.
- Production experience with at least one major cloud provider (Azure, AWS, or GCP); Azure preferred.
- Working knowledge of orchestration and transformation tooling (Airflow, Azure Data Factory, Databricks Workflows, dbt) and CI/CD for data workloads.
- Experience designing BI and serving layers on top of the platform for modern BI tools (Power BI, Tableau, Looker, or similar).
- Strong stakeholder management, technical writing, and presentation skills — able to produce architecture decision records (ADRs), diagrams, and standards documents.
- Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
Preferred Qualifications
- Production experience with Databricks Unity Catalog, Delta Lake, Delta Live Tables, and Databricks SQL; or Snowflake equivalents (Horizon, Dynamic Tables, Streams & Tasks, Snowpark).
- Experience designing and rolling out a data mesh or data-product operating model — domain teams, data contracts, discoverability, and federated governance.
- Experience architecting AI/ML and GenAI workloads — feature stores, vector databases, RAG, LLM integration, and MLOps handoffs.
- Experience leading data platform migrations (e.g., on-prem to cloud, legacy warehouse to Lakehouse, Snowflake ↔ Databricks).
- Experience with FinOps for data — cost modeling, chargeback, workload tuning, and capacity planning.
- Experience with enterprise data catalog and governance tools (Microsoft Purview, Collibra, Alation, Atlan, Informatica).
- Familiarity with real-time / streaming architectures (Kafka, Event Hubs, Kinesis, Structured Streaming, Snowpipe Streaming) and IoT data patterns.
- Exposure to compliance and regulatory frameworks relevant to data (SOC 2, HIPAA, GDPR, PCI).
Preferred Certifications
- Databricks Certified Data Engineer Professional or Databricks Certified Data Analyst Associate
- SnowPro Advanced: Architect or SnowPro Advanced: Data Engineer
- Microsoft Certified: Azure Solutions Architect Expert (AZ-305)
- Microsoft Certified: Azure Data Engineer Associate
- TOGAF, DAMA CDMP, or equivalent data-management certifications
Soft Skills & Ways of Working
- Excellent written and verbal communication — able to produce clear architecture documents, ADRs, and diagrams, and to present to engineers, leaders, and business stakeholders.
- Strong stakeholder management — gathers requirements across domains, negotiates scope, and keeps diverse audiences aligned.
- Pragmatic and outcome-focused — balances architectural purity with the need to deliver incremental business value; comfortable operating in ambiguity.
- Mentorship mindset — invests in growing the skills of data engineers and analysts and raises the architectural maturity of the practice.
- Collaborative and curious — works effectively across engineering, analytics, data science, security, and business teams; learns continuously.
- Ownership and accountability — independently drives architecture workstreams end-to-end, flags risks early, and follows through on commitments.
- Systems thinker — considers end-to-end impact of design decisions on reliability, cost, security, and user experience.