Hands-On Data Engineering
A comprehensive data engineering program for career switchers and data specialists
Program Overview
Today, every business is data-driven. The demand for data specialists continues to grow. That is why we created this program — a micro-master’s track that provides fundamental knowledge of data storage, processing, and retrieval.
Over the course of three months, you will master all key areas of data work — from SQL querying to orchestration and monitoring. For each topic, we selected the most universal and in-demand tools. You will work with open-source tools and services such as Cassandra, Spark, Kafka, and more. These are core technologies used in data engineering and are optimal for hands-on learning. By mastering them, you will easily be able to apply your skills when working with equivalent managed services in Azure, GCP, and AWS. We will cover those in more detail in the final module of the program.
This knowledge and skill set will help you enter the field, strengthen your position in it, or systematize your existing expertise in a trending technology domain.
WHAT YOU WILL LEARN
PARTICIPANT REQUIREMENTS
Basic understanding of Python
Basic understanding of SQL
Basic understanding of Docker
English proficiency at B1 level or higher
EDUCATIONAL MODULES
Module 0: Prerequisites and Introduction to Data Engineering
This module reviews the foundational skills essential for success in Data Engineering. You will refresh Python for data handling, relational databases (RDBMS) for query writing and optimization, and Docker for containerization and environment setup. You will also revisit code management for effective collaboration using version control tools. The module concludes with an overview of Data Engineering and its importance in modern data systems.
Module Content
- Python Refresher
- RDBMS Refresher
- Docker Refresher
- Code Management Refresher
- What is Data Engineering
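To gauge the expected entry level, here is a minimal sketch of the kind of Python data handling assumed at the start of the program; the file name and column names ("orders.csv", "region", "amount") are illustrative, not course material:

```python
# Aggregate a CSV file using only the Python standard library.
import csv
from collections import defaultdict

totals = defaultdict(float)
with open("orders.csv", newline="") as f:          # illustrative file
    for row in csv.DictReader(f):
        totals[row["region"]] += float(row["amount"])

for region, amount in sorted(totals.items()):
    print(f"{region}: {amount:.2f}")
```

If this snippet reads comfortably, the Python prerequisite is covered.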
Module 1: Data Storage
This module introduces the basics of data storage. You'll start with relational databases and learn SQL and data modeling techniques for structured data. The module also covers non-relational databases, such as document-based, column-family, key-value, and analytical types. You'll explore data formats and storage strategies in object storage systems. Additionally, you'll master data modeling principles, including normalization for organization and denormalization for improved performance.
Module Content
- Introduction to Database Types
- Relational Databases and SQL
- Data Modeling: Normalization and Denormalization
- Non-Relational Database Types: Documents, Column-Family, Key-Value, and Analytical Insights
- Data Formats and Storage Strategies in Object Storage Systems
- Workshop 1. What is Data Engineering and Why Do We Need It? Storing Data: Overview of Database Types
- Workshop 2. Relational Databases (RDBMS) and SQL: Data Modeling and Querying
- Workshop 3. NoSQL Databases: Cassandra and MongoDB. Data Modeling and Querying
- Workshop 4. NoSQL Databases and Data Warehouses (DWH). RDBMS Data Modeling
- Workshop 5. Massive and Distributed Storage (Hadoop, ADX). Object Storage and Data Organization: Formats (Text, Binary, Columnar), Partitioning, Queueing (Redis, Kafka)
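As a taste of the relational side of this module, here is a minimal normalization sketch using Python's built-in sqlite3; the customers/orders schema is illustrative, not taken from the course:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized design: each fact lives in exactly one place.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 1, 9.99), (2, 1, 24.50)])

# A join reassembles the data at query time; denormalization would
# trade this join away for faster reads and more complex writes.
for row in conn.execute("""
        SELECT c.name, COUNT(*) AS order_count, SUM(o.amount) AS total
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.name"""):
    print(row)  # ('Acme', 2, 34.49)
```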
Module 2: Data Processing
This module explores data processing principles, focusing on batch and stream processing methods. You will understand how to handle data at scale, working with tools like PySpark and Flink to process datasets efficiently. By the end of this module, you will be equipped to implement robust and scalable data pipelines for real-world applications.
Module Content
- Batch and Stream Processing
- Using the Tools: PySpark and Flink
- Workshop 6. Batch Processing with PySpark
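For a flavor of what batch processing with PySpark looks like in practice (in the spirit of Workshop 6, though not its actual materials), here is a minimal sketch; it assumes a local PySpark installation, and the input path and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-demo").getOrCreate()

# Read a batch of raw events, aggregate per day, write columnar output.
events = spark.read.json("data/events.json")           # hypothetical path
daily = (events
         .withColumn("day", F.to_date("timestamp"))    # hypothetical column
         .groupBy("day", "event_type")
         .agg(F.count("*").alias("event_count")))
daily.write.mode("overwrite").parquet("out/daily_events")

spark.stop()
```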
Module 3: Data Retrieval
This module explores methods for obtaining data from diverse sources. You will begin with files in file systems and object storage, learning to extract and manage data effectively. Next, you will dive into REST APIs to understand how to interact with external services to retrieve and integrate data. Lastly, you will study event streams and message queues.
Module Content
- Files on File Systems and in Object Storage
- REST API
- Event Streams and Message Queues
- Workshop 7. Event Streams and Apache Kafka
- Workshop 8. Stream Processing with PySpark Streaming
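These retrieval patterns compose naturally: a pipeline often pulls records from a REST API and hands them to a message queue. Here is a hedged sketch of that pattern; it assumes the requests and kafka-python packages, a Kafka broker on localhost:9092, and an illustrative API URL and topic name:

```python
import json

import requests
from kafka import KafkaProducer

# Serialize each record as JSON before handing it to the broker.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Illustrative endpoint; any paginated or streaming source works the same way.
resp = requests.get("https://api.example.com/v1/events", timeout=10)
resp.raise_for_status()

for event in resp.json():               # one message per retrieved record
    producer.send("events", value=event)

producer.flush()  # block until all queued messages are delivered
```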
Module 4: Coordination and Monitoring
This module develops your ability to coordinate and monitor data workflows and systems effectively. You will explore Airflow, a powerful tool for orchestrating data pipelines, and gain insights into designing efficient workflows. Additionally, you will delve into Prometheus and Grafana, mastering the art of monitoring system performance, visualizing metrics, and ensuring reliability in data operations.
Module Content
- Airflow
- Prometheus and Grafana
- Workshop 9. Pipeline Orchestration using Airflow
- Workshop 10. Monitoring in Data Engineering: Prometheus and Grafana
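To illustrate what pipeline orchestration looks like, here is a minimal DAG sketch, assuming Apache Airflow 2.4 or newer; the task logic and schedule are illustrative, not workshop materials:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pulling data from the source")       # placeholder task logic


def load():
    print("writing data to the warehouse")      # placeholder task logic


with DAG(
    dag_id="demo_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    # The >> operator declares ordering: extract must finish before load.
    PythonOperator(task_id="extract", python_callable=extract) \
        >> PythonOperator(task_id="load", python_callable=load)
```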
Module 5: Data Engineering on Cloud Platforms
This module introduces the principles of data engineering in the cloud, comparing cloud-based and on-premise solutions and their respective advantages and limitations. You will gain a high-level understanding of major cloud platforms — AWS, Azure, and Google Cloud — and explore their functional equivalents of popular data engineering tools.
Module Content
- Pros and Cons of Using Cloud Platforms Compared to On-Premise Solutions
- High-level Overview of Cloud Platforms: AWS, Azure, GCP
- Functional Analogs of Common Data Engineering Tools
- PostgreSQL: AWS RDS, Azure Database for PostgreSQL, Google Cloud SQL
- Cassandra: AWS DynamoDB, Azure Cosmos DB, Google Cloud Bigtable
- MongoDB: AWS DocumentDB, Azure Cosmos DB, Google Cloud Firestore, MongoDB Atlas
- Spark: AWS EMR, Azure Databricks, Google Cloud Dataproc
- Spark Streaming: AWS Kinesis, Azure Event Hubs, Google Cloud Dataflow
- Kafka: AWS Kinesis, Azure Event Hubs, Google Cloud Pub/Sub
- Cloud Analytics: AWS Redshift, Azure Synapse Analytics, Google Cloud BigQuery
- Workshop 11. Overview and Comparison of the Cloud-Based Data Engineering Tools
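To make the on-premise-to-cloud mapping concrete: the Parquet output a pipeline writes locally would land in object storage on a cloud platform, with only the storage client changing. Below is a hedged sketch for AWS, assuming the boto3 package, configured credentials, and an illustrative bucket name; Azure and GCP offer direct equivalents via azure-storage-blob and google-cloud-storage:

```python
import boto3

# Upload a locally produced artifact to S3; the local path, bucket
# name, and object key are all hypothetical.
s3 = boto3.client("s3")
s3.upload_file(
    Filename="out/daily_events.parquet",
    Bucket="my-data-lake",
    Key="daily/daily_events.parquet",
)
```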
The module “Data Engineering on Cloud Platforms” is designed within the project “Knowledge Rise: Advancing Sustainable Blue-Green Economies via Deep Tech — Innovation Capacity Building in Higher Education” (grant agreement No. 24473). The project is part of the broader CloudEARTHi initiative and is funded by the European Union through the EIT HEI Initiative, coordinated by the European Institute of Innovation and Technology (EIT), Cohort 4.
Curator and Instructor
Dmytro Pryimak
Engineer with over 10 years of professional experience in designing and building systems for distributed data processing. Throughout his career, Dmytro has worked on numerous projects covering insurance, healthcare, medical data processing, online media, and entertainment. In recent years, he has shifted his focus from purely engineering to team leadership and mentoring/coaching.
Dmytro is also a guest lecturer at SET University, where he teaches Big Data in the Master's degree programs.
Maksym Ivashura
Experienced database/data warehouse and business intelligence engineer with over 30 years of professional background in the manufacturing and outsourcing sectors. Currently works at Trinetix, where he also serves as a mentor and technical interviewer. Maksym is the author of the StoreOff accounting system and has extensive experience with various database management systems, including MS SQL Server (as well as SSAS and SSIS), Azure DB, PostgreSQL, Redshift, Snowflake, MySQL, Oracle, SQLite, MongoDB, Redis, Cassandra, and less conventional systems such as Firebird/InterBase, MS Access, dBase, DataEase, and DuckDB. Based in Kharkiv, Ukraine, and Málaga, Spain.
Khrystyna Kokolius
A seasoned Data Engineer at SoftServe with over 4 years of hands-on experience in designing and developing data pipelines. Her core expertise lies in optimizing SQL queries, streamlining data processing workflows, and elevating overall system performance. By leveraging best practices in data analytics, Khrystyna helps teams scale efficiently and adopt modern technologies for enhanced data-driven solutions.
Sirojiddin Dushaev
Data Engineering & BI expert with extensive experience in building scalable data solutions. Specializes in database architecture, data warehousing, and business intelligence development. Proficient in cloud platforms such as AWS, GCP, and Azure, as well as big data technologies like Apache Spark and Kafka. Skilled in SQL, Python, and deploying machine learning models. Passionate about data-driven decision-making and optimizing analytical infrastructures. Loves collaborating with teams to enhance data efficiency and business insights.
ADVANTAGES
The program provides the necessary skills and knowledge to start a career in one of the most in-demand IT fields
Flexible learning format, allowing you to combine it with full-time work
Training by expert practitioners who provide relevant feedback and quality support during the course
WHO IT’S FOR
Developers looking to grow in the field of data engineering
Data Scientists and Data Analysts who want to transition into a Data Engineer role
Junior Data Engineers looking to organize their knowledge and use data tools effectively
Experienced (Senior+) technical specialists who need data engineering expertise for project management, architecture design, and broadening their overall skills in this technology area
Reviews
Yevhenii Pylypchuk
Wrapped up the Data Engineering certificate with SET University – and honestly, this was the toughest course I’ve taken there so far. What a journey!
For me, this path wasn’t about becoming a “data engineer,” but about expanding my mindset into the data world. Along the way, I got my hands dirty with:
- Building real-time pipelines (Kafka + Spark + Cassandra);
- Debugging containers until they screamed (well, honestly, that was me who screamed most of the time);
- Seeing how fragile data flows can be – and how dangerous that is from a security perspective.
Key reflection: data rules the world. Realizing that every broken data pipeline isn’t just a tech headache – in a security context, it can turn into an open door for attackers or a blind spot that hides malicious activity.
I may never call myself a data engineer, but this course helped me realize how important and interconnected this path is with cybersecurity. It showed me how much more there is to explore – and I’m more driven than ever to go deeper down the rabbit hole.
Thank you, Dmytro Pryimak, Maksym Ivashura, and the whole SET University community. Your guidance, challenges, and energy turned this course from "just hard" into one of the toughest – and most rewarding – journeys I've ever taken.
Iryna Yershova
It was a fascinating yet difficult experience. I've learned so many new things, and now I'm able to apply this knowledge in my professional career.
Even though I’m currently working as a frontend engineer, I believe it’s crucial to understand how to work with data. Data is the new oil, as they say.
Thanks to Dmytro Pryimak and Maksym Ivashura for the challenges you created for us during this course!
FAQ
I already work as a data engineer. Is it worth taking your course?
If you have been working in this position for a year or less, then yes, this course will help you structure your knowledge and fill in gaps in your mastery of specific tools.
Learn more about the SET University program