USNLX Ability Jobs


Gemini Space Station Senior Data Engineer in New York, New York

Gemini Space Station seeks a Senior Data Engineer in New York, NY.

Duties:

- Envisioning, architecting, developing, and implementing state-of-the-art, platform-centric solutions to solve complex problems in data ingestion, data validation, real-time data ingestion, and data visualization using Big Data solutions and frameworks.
- Proposing architectural solutions to the internal platform team, preparing ADR (Architecture Design Review) documents, and leading and performing architectural design reviews with multiple teams to evaluate the pros and cons of a given solution, arriving at an optimal design through continuous discussion, exploration, and evolution.
- Building a custom data ingestion framework internal to the organization, leveraging cloud-native services to achieve high resiliency and reliability.
- Developing data visualizations over the ingested data and metadata to ensure data availability and transparency across the organization.
- Prototyping, developing, orchestrating, and generating dynamic pipelines using a scalable orchestration engine for data analytics and engineering workflows.
- Defining and implementing Python code for data pipelines in a modular, extensible fashion that adapts to future codebase evolution.
- Leading, developing, and implementing a Data Quality (DQ) framework in the various architectural layers of the data pipeline to automatically perform data testing, data documentation, data validation, and data profiling, thereby eliminating pipeline debt and achieving data quality and integrity.
- Mentoring on, developing, and implementing real-time data ingestion solutions that leverage cloud-native services and custom Python transformations, continuously streaming data to Databricks machine-learning jobs/models for price prediction, market making, and fraud detection.
- Optimizing and tuning Spark SQL, Snowflake queries, and ETL jobs based on analysis of query execution plans and DAG (directed acyclic graph) transformations.
- Performing dimensional modeling on ingested datasets based on business needs, and optimizing workflows to deliver faster business insights.
- Identifying areas of improvement, from a reliability and scalability perspective, in Big Data system designs and architectures, and providing performance fixes and solutions throughout the cycle of data identification, ingestion, transformation, schematization, and enrichment.
- Implementing and architecting a data lake using a big data processing engine, Apache Spark, leveraging data lake cloud platforms to unify the incremental inflow of data with existing data on an ongoing basis.
- Mentoring data scientists, data engineers, and risk teams on using data in a business-centric way that complements the organization's strategic decisions with deeper business insights.
- Documenting the architecture, creating data dictionaries, and defining standards and protocols for team members for future adaptation and evolution.
- Maintaining pipelines using an orchestration engine to ensure hourly/daily data availability for business intelligence and critical financial reports.
- Monitoring the health of the overall data systems, in terms of memory, CPU, and concurrency, using open observability platforms such as Grafana and AWS CloudWatch events.
- Analyzing logs to identify the root cause of issues and addressing them by enhancing the existing solution.

Position is based at headquarters and may be assigned to unanticipated worksites throughout the U.S. as determined by management.
Telecommuting permitted.

Requires a Bachelor's degree in Computer Science, Electronic Engineering, or a related technical field (or foreign equivalent) and 5 years of experience in the job offered, or as a software developer, technical architect, or in a related occupation.

Position requires experience in each of the following skills:

1. Data engineering with data warehouse technologies.
2. Custom ETL design, implementation, and maintenance.
3. Schema design and dimensional data modeling.
4. Advanced skills with Python and SQL.
5. Experience with one or more MPP databases (Redshift, BigQuery, Snowflake, etc.) and one or more ETL tools (Informatica, Pentaho, SSIS, Alooma, etc.).
6. Optimization and tuning of Spark SQL, Snowflake queries, and ETL jobs based on analysis of query execution plans and DAG (directed acyclic graph) transformations.
7. Identification of areas of improvement, from a reliability and scalability perspective, in Big Data system designs and architectures.
8. Providing performance fixes and solutions during the cycle of data identification, ingestion, transformation, schematization, and enrichment.

Employment and background checks may be required. Salary: $200,000-$220,000 per year. Must also have authority to work permanently in the U.S. Applicants interested in this position may apply at jobpostingtoday.com, Ref# 47240.

Minimum Salary: $200,000 | Maximum Salary: $220,000 | Salary Unit: Yearly

DirectEmployers