Description :
|
- 12-15 years of experience working in large BI, DWH and EDL implementations
- Mandatory 2-3 enterprise implementations using Cloudera Data Lake and all of its services, for example: YARN, Kudu, Oozie, Impala, Hive, HBase, Spark, Navigator
- Should have strong experience migrating Oracle-style PL/SQL, data transformations, data quality rules, and complex SQL queries into a data lake
- Experience with cloud platforms (AWS, Azure), including readiness, provisioning, security, and governance
- Experience with data integration and streaming tools used for both EDWs and Hadoop (Informatica CDC, Spark, Kafka, etc.)
- Experience or understanding of Data Science and related technologies (Python, R, SAS, etc.)
- Experience or understanding of Artificial Intelligence (AI), Machine Learning (ML), and Applied Statistics
- Conduct facilitated working sessions with technical executives and experts
- Should have strong knowledge of replicating data in real time from enterprise applications (ERP, CRM) and data warehouses such as Teradata to a Cloudera data lake
- Provide a solution design that meets the business requirements and follows best technical practices for standing up a Cloudera-based data lake
- Work with the Hadoop Admin to prescribe data ingestion standards that implement solutions similar to Flume-Kafka integration
- Responsible for creating all data processing tenants, utilizing Spark, Hive, SQL, and other Cloudera-based frameworks
- Work in concert with the Security Administrator to outline data security using core Cloudera platform components such as Apache Sentry, which manages fine-grained, role-based authorization for authenticated users
- Define the user access template that combines multiple access rules, integrating the Active Directory access model with the Cloudera security model
- Provide expert data modeling consultation on decisions between NoSQL, relational database, and in-memory data storage
- Travel to client locations as required
- Very good communication skills