Providing remote Data Engineer and Data Scientist services to various clients, helping them set up data-driven businesses and deliver insights about their customers.
My passion is building data infrastructures and huge data warehouses. I like to write beautiful program code and create useful open-source packages. Furthermore, I'm a big fan of making computer games.
June 2018 - Present, various locations
- bfaludi/dbsa - Table definition and schema description classes to use with Airflow
- Data Architecture for your Business - Presentation for Data Council Singapore conference
Mar 2019 - Present, London (GBR)
Building a full 360-degree, near real-time data architecture for a new product on Amazon Web Services: Amazon S3 as data storage, Amazon EMR with Presto as the query engine, and Apache Airflow for pipeline management. Ingesting events from clients and backend services with Amazon Kinesis Data Firehose with less than 60 seconds of delay. My job is to put together the infrastructure to run everything under Terraform and Helm; specify and document events to measure all business KPIs; create core datasets in Presto; materialise and aggregate data into cubes; and build models for sentiment analysis and review clustering.
Aug 2019 - Present, Berlin (GER)
Myo is a communication platform where caregivers share content with relatives who can witness important moments. Setting up and designing a data architecture to provide valuable insights for the business and for facilities. The work includes specifying backend, web, Android, and iOS events and centralising all events into core tables to make them easy to use and access. It also includes setting up servers and architecture elements on Amazon Web Services, tracking, data quality checks, developing an administrative interface as an Airflow plugin, and creating visualisations in Tableau. To build the required system, I interacted directly with many stakeholders, founders, and developers to deliver the full data system in a short period of time.
June 2018 - July 2019, London (GBR); Oct 2019 - Present, London (GBR)
Providing services such as consultation on and optimisation of the current data storage and functionality. I built a centralised data warehouse that combines data across various sources, including more than 20 sources from publishers and 3rd-party companies. It helps the company improve purchase conversions, supports better business decisions, and helps understand customer behaviour while enabling Marketing and other teams to drill down and answer specific questions. Using Amazon Web Services, Redshift, and Airflow as the scheduler and ETL tool. During the job I designed and built core datasets that other people can use to collect insights easily while replicating the company's already existing processes.
Data Engineer @Facebook
June 2017 - Nov 2018, London (GBR)
Working in the Account Integrity team, whose mission is to reduce negative experiences caused by abusive accounts. With over 2B monthly active users, this work is extremely impactful and directly influences all of Facebook's integrity efforts, from fighting fake news and misinformation to preventing harmful behaviours such as bullying and harassment.
Building stable and efficient data pipelines, self-explanatory reusable datasets, and dashboards, and productionising previous research techniques. I also help measure team efforts and business goals by defining, refining, and creating new stable metrics for the short and long term. Working closely with cross-functional teams to support their data needs and enable product goals. Using different data-exploration techniques to discover previously unasked questions and future opportunities.
Data & Applied Scientist @Microsoft
June 2015 - May 2017, Berlin (GER)
Continuing the work on Wunderlist's data infrastructure after Microsoft's acquisition of 6Wunderkinder. Shipping improvements and working constantly on scaling and stability. Benchmarking Microsoft's recent data solutions and doing the migration from Amazon Web Services to Azure. Furthermore, building a brand-new open-source client tracking solution in Go and an MSSQL database CLI in NodeJS.
Designing and building a new data architecture for Microsoft To-Do using technologies such as Azure Cosmos DB, Azure SQL Data Warehouse, and Azure Kusto. Contributing to the client tracking implementation for the iOS and Web clients, and designing the required events. Moreover, creating scorecard metrics to cluster our users with machine learning algorithms based on their usage patterns to improve user retention and satisfaction.
- Amazon Redshift's UDF - Article
- Using Python for data? - Presentation for Python Ireland 2015 conference
- bfaludi/aqs-sweeper - Azure Queue Storage dumper that copies data into Azure Blob Storage
- bfaludi/azrcmd - Azure Blob Storage command line tool to download and upload files
- wunderlist/logsanitizer - Log processing and sanitizer package
- wunderlist/hamustro - Collector of events
- wunderlist/cheetah - Command line interface for MSSQL that works in OSX and Linux
- Building Open Source libraries - Short presentation for budapest.py meetup
- Cloud Hopping with a Stack of Data - Presentation for Big Data Universe conference
- Azure SQL Data Warehouse Customer Stories from Early Adopters - Presentation for PASS Summit 2016 conference
(contractor) Data Consultant @eyeo
Mar 2017 - Jun 2017, Berlin (GER)
Working as a part-time data consultant, guiding Eyeo's data team on how to build a maintainable, self-healing, and secure data architecture for Adblock Plus. My main responsibility was providing a second opinion on the current solutions and forming ideas to improve the system and fix existing bottlenecks.
Data Engineer @6Wunderkinder
May 2015 - June 2015, Berlin (GER)
Working on Wunderlist's data infrastructure on Amazon Web Services. Building data pipelines, data architecture components, and log processing, and improving data quality and reliability significantly by setting up anomaly detection and failsafe protocols. Optimising the execution time and resource usage of the infrastructure, while preparing all components for scaling. Furthermore, collecting valuable insights about user behaviour to influence product improvements.
Using mostly Python, Ruby, Go, Scala, Bash, Redshift, SQL, Chart.io, Makefile, Flask, NodeJS, etc.
Senior Database Manager @hellowearemito
Feb 2013 - June 2016, Budapest (HUN)
Designing, implementing and maintaining complex data warehouses, and optimising the performance of large databases. Making data reports for our clients to answer key business questions, and analysing their data to support various business goals.
Developing programs and packages such as
- a data cleansing API to correct names, addresses, and other contact information for the CEE region to improve data quality,
- an extract, transform, and load tool to move data between different data sources easily,
- deduplication algorithms to remove similar information coming from different data sources,
- customer relationship management software to track marketing communications and their efficiency.
Furthermore, I was leading and coordinating the data-related developments: scheduling and controlling tasks, allocating resources, and doing code reviews from time to time.
- Extract, Transform, Load using mETL - Presentation for PyCon Sei, Florence conference
- Python for Data Science - Presentation for Budapest BI Fórum conference
- Extract, Transform, Load using mETL - Workshop for PyData Berlin 2014 conference
- Data Cleansing: data related to individuals - Presentation for @budapestpy meetup
- mETL - Open-source ETL package to load any kind of data with easy configuration
- Augmented Reality - Presentation for Budapest Data Science Meetup
(contractor) Database Developer @hellowearemito
May 2012 - Jan 2013, Budapest (HUN)
Managing a PostgreSQL database, and reviewing and checking queries created by developers. My primary role was creating complex stored procedures, triggers, migration scripts, and SQL queries. My tasks included database query optimisation and reducing execution time, CPU usage, and memory usage.
IT Project Manager @Isobar
Dec 2011 - Jan 2013, Budapest (HUN)
Bringing clients and contractors together with the company's internal creative, development, and planning departments to satisfy clients' needs with the highest quality. Beyond assessing clients' needs, my tasks included writing offers, contracts, and functional specifications, and proposing requirements and production steps, taking into consideration the newest technologies and innovative ideas. My primary roles were scheduling and controlling projects (written in C#, Java, PHP), estimating the resources needed for different tasks, and doing functional checks and quality control on tested tasks. Every project required continuous client communication, as well as presentations and training in the final stage.
Web Application Developer @LensaHR
Nov 2011 - Dec 2011, Budapest (HUN)
Continuing the work on the Lensa recruitment system at the headquarters with a sales team. Scheduling and controlling tasks, allocating resources, writing functional specifications, and writing documentation for the most complex software architecture in the project. I cooperated with professors at different universities and collected materials on data mining, lexical analysers, and artificial neural networks to improve test scores in CV extraction.
Lead Software Engineer @hellowearemito
June 2009 - Nov 2011, Budapest (HUN)
My primary role was leading the development of the Lensa recruitment system, scheduling and controlling tasks. I also participated in software design, customer communication, UX monitoring, and resource allocation. My other tasks included solving more complex programming problems, such as extracting and interpreting text from CVs, detecting portraits, developing a Python framework, and optimising SQL queries.
Used technologies: Python, ExtJS, PostgreSQL, Solr, OpenCV, etc.
PHP for beginners & Web applications teacher @eotvos_uni
Feb 2008 - Aug 2010, Budapest (HUN)
At Eötvös Loránd University, students learn how a dynamic website is constructed and create web pages with the help of PHP. The course covers the MVC and OOP approaches, and students get familiar with different databases and how to write queries in an optimised way.
Dec 2014 - Present, Budapest (HUN)
The purpose of budapest.py is to bring together Hungarian Python programmers and give them a friendly environment where they can learn about new packages and tools.
May 2013 - May 2015, Budapest (HUN)
Budapest Database Meetup's purpose is to help with your daily work. It presents new ideas and techniques for everybody who is interested in databases.
Nov 2014 - Present
The mission of the Python Software Foundation is to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers.
Community Member @NumFOCUS
Nov 2014 - Present
NumFOCUS promotes and supports the ongoing research and development of open-source computing tools through educational, community, and public channels.
(Play) A game to experience the working poor @ Mito Hackathon 2015
The project's purpose is to draw attention to deep poverty all around the world. This is a serious problem, and the people affected need more selfless heroes to help them survive.
(Play) Automatons Game @ Global Game Jam 2015
We used Unity3D and 3ds Max to create this game in a 48-hour hackathon. I worked on level design and level editing.
Automated drones working in a factory in peace and quiet -- but when one of them realizes that there is more to life than moving packages -- what does it do now? Escape from the factory and avoid all security using isometric controls in this short, 3-level game created at the Budapest Game Jam site!
(Play) No Hope For Us @ Mito Hackathon 2014
No Hope For Us was made at the mighty Mito Hackathon, a 24-hour jam. The game was built on top of the phaser.io framework; multiplayer is powered by a custom Node.js server. I worked on wave balancing, zombie generation, stats, and backend programming.
It is 2125. Earth is done. The virus took over and only a few thousand survivors managed to escape to four orbiting space stations. While they are getting ready to move on, dropships are scanning the surface looking for survivors.
Education and Training
(not finished) Computer Software Engineering, BSc @ Eötvös Loránd University
Sep 2006 - Feb 2012, Budapest (HUN)
IT Class @ Ipari Secondary Technical School
Sep 2002 - Jul 2006, Veszprém (HUN)
Programming & scripting languages
|Python (8+ years)|
|SQL & PLSQL & TSQL (10+ years)|
|PHP (8+ years)|
|C & C++|
Databases and SQL query engines
|PostgreSQL (5+ years)|
|Azure SQL Data Warehouse|
|Oracle (5+ years)|
|MySQL (5+ years)|
|Flask (3+ years)|
|Pyramid (3+ years)|
|ExtJS (3+ years)|
|jQuery (3+ years)|
|PhaserJS + Box2D|