- Qurix Technology GmbH, Data Engineer, Digital Agencies & IT Consulting, May 2023 - Present (2 years and 1 month), Hamburg, Germany
September 2023 - Present, Data Catalog, Food Industry (Remote)
- Developed an engine in Python that reads data sources and derives key information such as row count, the 10 most frequent values per column, column types, and pandas summary statistics.
- Developed a Streamlit UI application that provides a consolidated overview of the metadata.
- Integrated Confluent Kafka as a messaging broker between the engine and the Streamlit UI App.
- Automated the manual process (originally in Excel) of maintaining and querying metadata across multiple systems including Postgres, DB2, Kafka topics, and Excel tables.
- Accelerated migration activities by providing consolidated metadata and key information about table contents and data types across all data sources.
- Containerized both applications for Kubernetes deployment in the customer environment.
May 2023 - August 2023, HubSpot Integration, Machinery Industry (Remote)
- Implemented the integration of HubSpot CRM with Alphaplan, a machinery company's ERP system, to enhance CRM data collection and ingestion processes.
- Integrated Apache Kafka as a messaging broker between HubSpot and the target system.
- Developed a Python Kafka producer application in Azure Cloud (Azure Functions) that performs API calls and sends entries to Kafka.
- Developed an on-premises Python Kafka consumer application that listens to the Kafka topic and writes entries to the target system's database.
- Streamlined CRM data acquisition, enabling automatic data transfer to the on-premises ERP system.
- Significantly improved user experience and operational efficiency by eliminating the need for sales team users to maintain CRM data through direct access to the on-premises ERP system.
- Capgemini Deutschland GmbH, Data Engineer, Digital Agencies & IT Consulting, April 2021 - April 2023 (2 years and 1 month), Düsseldorf, Germany
March 2023 - April 2023, Event-driven ETL in AWS Cloud, Financial Sector (Remote)
- Developed an event-driven ETL pipeline in AWS Cloud using AWS services (Glue, Step Functions, Lambda and SQS) to catch S3 events and trigger processing and validation of XML files.
March 2022 - March 2023, Support project in Azure Cloud, Consumer Goods Industry (Remote)
- Developed PySpark applications that process terabytes of data.
- Set up monitoring and alerting for Python and PySpark applications.
- Migrated processing logic from Dremio to Databricks, which increased performance.
- Maintained 5 applications in the Azure Cloud.
July 2021 - February 2022, Migration project in Azure Cloud, Consumer Goods Industry (Remote)
- Migrated data from Data Lake Gen1 to Data Lake Gen2 using Data Factory.
- Adapted Python and PySpark applications to use the Data Lake Gen2 API.
- Developed and maintained ETL pipelines using PySpark, Data Factory, and Azure DevOps.
- Responsible for migration activities for 10 applications in the Azure Cloud.
- Huf Group, Test Engineer, Automotive Sector, May 2018 - April 2020 (1 year and 12 months), Velbert, Germany
- Derived test cases from customer requirements.
- Specified the test environment in coordination with the stakeholders.
- Evaluated test data.
- Managed test activities in international development projects.
- Data Science Bootcamp, Spiced Academy Köln, 2020
- M.Sc. Electrical Engineering, Mittweida University of Applied Sciences, 2016
- B.Sc. Applied Physics, National Institute of Applied Sciences and Technology Tunis, 2013
- Certified Professional for Requirements Engineering, Certible, 2018
- Deep Learning Specialization, DeepLearning.AI, 2022
- AWS Machine Learning Specialty, AWS, 2022
- Azure Data Engineer Associate, Microsoft Azure
- DataExpert.io Free Data Engineering Bootcamp, DataExpert.io