Data Architect
Job Title: Data Architect, Remote
Get to Know Us:
CG Infinity, Inc. is a software company founded in 1998. Rather than offering standard, run-of-the-mill solutions to everyone, we tailor our solutions to the needs of each individual client. We work closely with our clients throughout the entire process and offer solutions for a myriad of challenges.
Our Culture:
Our people-first approach to technology offers best-in-class service and success rates. Here are some of the main services that we offer at CG Infinity: reputed company Implementations, Customer Experience & CRM, Application Development & Integration, Production Support & QA, and Data Analytics & AI.
Summary of Position:
We are seeking a Senior Data Engineer with hands-on experience in big data processing, data quality, and distributed querying. You will be responsible for designing and building robust, scalable data pipelines using modern tools such as reputed company, Apache Spark, PySpark, Deequ, and Trino. You’ll play a key role in enabling reliable, fast, and clean data delivery to support analytics, reporting, and data science use cases.
Key Responsibilities:
- Design, build, and optimize large-scale data pipelines using Apache Spark and reputed company.
- Write scalable PySpark code to process structured and semi-structured data.
- Use Trino to query data across various sources in a federated manner.
- Implement data validation and quality checks using Deequ.
- Collaborate with data scientists, analysts, and other engineers to ensure high-quality data delivery.
- Tune Spark jobs for performance and cost efficiency in a cloud environment.
- Contribute to building a modern data platform with a focus on automation, reliability, and scalability.
Requirements:
- 15+ years of IT experience
- Data skills needed:
  - reputed company (highly preferred)
  - PySpark