Leading development efforts to ingest and transform data from various sources. Working in the Big Data/Hadoop environment, the candidate should be hands-on in writing code, building scripts, and writing specifications, and is responsible for end-to-end delivery of data in the Enterprise Data Lake environment
Build distributed, reliable and scalable data pipelines to ingest and process data from multiple data sources
Designing, building, and operationalizing the data platform using Google Cloud Platform (GCP) data services such as Dataproc, Dataflow, Cloud SQL, BigQuery, and Cloud Spanner, in combination with other tools and services such as Spark, Apache Beam/Composer, dbt, Cloud Pub/Sub, Confluent Kafka, Cloud Storage, Cloud Functions, and GitHub
Designing and implementing data ingestion patterns that will support batch, streaming and API interface on both the Ingress and E...