[Remote] GCP Data Engineer
Note: This is a remote position open to candidates in the USA. Rivago Infotech Inc is seeking a highly skilled Senior GCP Data Engineer to design, build, and optimize scalable, cloud-native data pipelines on Google Cloud Platform (GCP). The person in this role will act as the primary technical owner for implementing data pipelines and will drive engineering best practices across the data ecosystem.
Responsibilities
- Design, develop, and maintain scalable data pipelines on GCP for batch and real-time processing
- Implement data ingestion frameworks (push/pull) using services like Dataflow, Datastream, Pub/Sub, and Cloud Run
- Build and optimize data transformation pipelines using BigQuery, Dataform, and Spark-based frameworks
- Act as the technical lead/owner ensuring alignment with EA-approved data architecture
- Drive best practices in data engineering, including CI/CD, testing, monitoring, and cost optimization
- Collaborate with cross-functional teams to deliver end-to-end data solutions
- Ensure data quality, reliability, governance, and performance across pipelines
Skills
- 6+ years of experience in data engineering, with a strong focus on Google Cloud Platform (GCP)
- GCP Professional Data Engineer certification (must have)
- Strong hands-on experience with Medallion architecture and the Centralized Hub/Spoke model
- Strong hands-on experience with BigQuery (data warehousing & optimization)
- Strong hands-on experience with Dataform (SQL-based transformations)
- Strong hands-on experience with Dataflow (batch & streaming pipelines)
- Strong hands-on experience with Datastream (CDC ingestion)
- Strong hands-on experience with Cloud Composer/Airflow (orchestration)
- Strong hands-on experience with Dataproc (Spark/PySpark)
- Strong hands-on experience with Cloud Run
- Strong hands-on experience with Cloud Storage
- Strong hands-on experience with Pub/Sub
- Advanced proficiency in Python
- Advanced proficiency in PySpark / Apache Spark
- Advanced proficiency in SQL (complex transformations, performance tuning)
- Deep understanding of data modeling techniques
- Deep understanding of Medallion architecture (Bronze/Silver/Gold layers)
- Deep understanding of Centralized Hub/Spoke data platform design
- Strong ownership and accountability
- Ability to work across distributed teams and vendors
- Excellent communication with technical and non-technical stakeholders
- Experience with real-time streaming architectures
- Familiarity with CI/CD pipelines (Terraform, Cloud Build, GitOps)
- Strong understanding of data governance, security, and compliance
- Experience working in large enterprise or utility/regulated environments
Company Overview