Alternate Job Titles

Big Data Engineer, Business Intelligence Engineer, Computer Vision Engineer, Data Architect, Machine Learning Engineer

Job Level

Entrant
Specialist

Functional Group

Software and Systems

Job Family

Applications Configuration and Installation, and IT Security

Job Description

The Data Engineer is responsible for the following: 

  • Support the design, implementation and maintenance of data flow channels and data processing systems 
  • Assist in developing and implementing data pipelines and data stores  
  • Support the collection, storage, batch and real-time processing, and analysis of information in a scalable, repeatable and secure manner 
  • Perform administrative tasks to provide accessibility, retrievability, security and protection of data 
  • Design and implement data pipelines and data stores to acquire and prepare data 
  • Apply data engineering standards and tools to create and maintain data pipelines and extract, transform and load data 
  • Carry out routine data quality checks and remediation 
  • Design, implement, and maintain complex data engineering solutions to acquire and prepare data 
  • Create and maintain data pipelines to connect data within and between data stores, applications and organisations 
  • Carry out complex data quality checking and remediation 
  • Plan and drive the development of data engineering solutions ensuring that solutions balance functional and non-functional requirements 
  • Monitor application of data standards and architectures including security and compliance 
  • Contribute to organisational policies, standards, and guidelines for data engineering 

Critical Work Function

Identification of Business Needs

  • Lead the selection and development of data engineering methods, tools and techniques.
  • Develop organisational policies, standards, and guidelines for the development and secure operation of data services and products.
  • Ensure adherence to technical strategies and architectures.
  • Plan and lead data engineering activities for strategic, large and complex programmes.
  • Identify suitable data structures based on business needs to ensure availability and accessibility of data.
  • Determine technical system requirements based on data needs.
  • Keep abreast of the latest database and data processing software, products and technologies.

Data Pipeline Building and Maintenance

  • Assist in building scalable data pipelines to extract, transform, load and integrate data.
  • Develop code and scripts to process structured and unstructured data in real time from a variety of data sources.
  • Test data pipelines for scalability and reliability to process high data volume, variety and velocity. 
  • Consolidate and create data storage solutions for storage and retrieval of information.
  • Develop prototypes and proof-of-concepts for data solutions.
  • Monitor data system performance.
  • Support the handling and logging of errors.
  • Develop backup data archiving systems to ensure system continuity.
  • Implement and monitor data security and privacy measures on existing data solutions.
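For illustration only, the extract, transform and load duties listed above can be sketched as a minimal pipeline. Every name here (the functions, the record fields, and the use of sqlite as a stand-in for a warehouse) is an assumption for the sketch, not part of the role specification:

```python
import json
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def extract(raw_lines):
    """Extract: parse raw JSON lines, logging (rather than crashing on) bad records."""
    for line in raw_lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            log.error("skipping malformed record: %r", line)

def transform(records):
    """Transform: keep only complete records, normalised to (id, value) rows."""
    for rec in records:
        if "id" in rec and "value" in rec:
            yield (int(rec["id"]), float(rec["value"]))

def load(rows, conn):
    """Load: upsert rows into a store (sqlite stands in for a real warehouse)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics (id INTEGER PRIMARY KEY, value REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO metrics VALUES (?, ?)", rows)
    conn.commit()

def run_pipeline(raw_lines, conn):
    """Chain the three stages; generators keep memory use flat for large inputs."""
    load(transform(extract(raw_lines)), conn)

conn = sqlite3.connect(":memory:")
raw = ['{"id": 1, "value": 3.5}', "not json", '{"id": 2, "value": 7.0}']
run_pipeline(raw, conn)
print(conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0])  # 2
```

The malformed middle record is logged and skipped rather than aborting the run, reflecting the error-handling and logging duties above.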

Architecture Design

  • Design the architecture of a data platform.
  • Develop data-related instruments/instances.
  • Develop, customize and manage integration tools, databases, warehouses, and analytical systems.
  • Manage data pipeline maintenance/testing.
  • Test the reliability and performance of each part of a system.

Machine Learning Algorithm Deployment

  • Work with machine learning models designed by data scientists.
  • Deploy machine learning models into production environments.
  • Provide the model with data stored in a warehouse or coming directly from sources. 
  • Configure data attributes.
  • Manage computing resources.
  • Set up monitoring tools.
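As a sketch of the serving and monitoring duties above, a deployed model can be wrapped so that every prediction is timed and counted. The `ModelServer` class and the stand-in linear model are illustrative assumptions; in practice the model would come from the data science team and monitoring from dedicated tooling:

```python
import time

class ModelServer:
    """Wrap a trained model for serving: feed it feature rows and
    track request count and latency for basic monitoring."""

    def __init__(self, model):
        self.model = model
        self.requests = 0
        self.total_latency = 0.0

    def predict(self, features):
        # Time each call so average latency can be reported.
        start = time.perf_counter()
        result = self.model(features)
        self.total_latency += time.perf_counter() - start
        self.requests += 1
        return result

    def stats(self):
        # Monitoring hook: expose simple serving metrics.
        avg = self.total_latency / self.requests if self.requests else 0.0
        return {"requests": self.requests, "avg_latency_s": avg}

# Stand-in for a model supplied by data scientists: a fixed linear scorer.
def trained_model(feats):
    return sum(w * x for w, x in zip([0.5, 2.0], feats))

server = ModelServer(trained_model)
print(server.predict([1.0, 2.0]))       # 4.5
print(server.stats()["requests"])       # 1
```

The same wrapper pattern extends naturally to pulling features from a warehouse or stream before calling `predict`.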

Management of Data and Metadata

  • Manage stored data and structure it properly via database management systems. 
  • Provide data-access tools. 
  • Set up tools to view data, generate reports, and create visuals.
  • Track data pipeline stability.
  • Monitor the overall performance and stability of the system.
  • Monitor automated processes and modify them as data, models or requirements change. 

Data Solution Optimisation

Entry Requirements

#1

Data Engineer

BDQF Level 6 in Computer Science, Information Systems, Software Engineering or a related field; or


BDQF Level 5 in Computer Science, Information Systems, Software Engineering or a related field with a minimum of 5 years' relevant experience and certification.

Skills & Competencies

Technical Skills

Soft Skills

Recommended Technical Training Courses

Amazon Web Services (AWS) Certified Data Analytics - Specialty

Cloudera Certified Associate (CCA) Data Analyst

Cloudera Certified Professional (CCP) Data Engineer

Cloudera Data Platform Generalist Certification

Data Science Council of America (DASCA) Associate Big Data Engineer

Data Science Council of America (DASCA) Senior Big Data Engineer

Dell EMC Data Science Track (EMCDS)

Google Professional Data Engineer

IBM Certified Data Engineer

IBM Certified Solution Architect – Cloud Pak for Data v4.x

IBM Certified Solution Architect – Data Warehouse V1

IBM Data Science Professional Certificate

Microsoft Certified: Azure AI Fundamentals

Microsoft Certified: Azure Data Engineer Associate

Oracle Business Intelligence Certification

SAS Certified Big Data Professional

SAS Certified Data Scientist

TensorFlow Developer Certificate

Agile Methodologies

Information Technology Infrastructure Library (ITIL)

Power BI

Project Management