Omar ElMaria's Projects
This repo contains a Selenium script that automatically checks for consultation appointments on the Volkshochschule Berlin Mitte website (https://vhsmitte.flexappoint.de/#/). This website is used to book appointments for the "Leben in Deutschland" test, which is a prerequisite for obtaining permanent residence or citizenship in Germany.
This repo contains a full-fledged Python-based script that scrapes a JavaScript-rendered website, cleans the data, and pushes the results to a cloud-based database. The workflow is orchestrated on Airflow to run automatically
A Python script that analyzes whether the variant allocation algorithm introduces bias in switchback tests
This repo contains a multi-stage R script that scrapes a JavaScript-rendered e-commerce website using RSelenium and rvest. It also formats and cleans the data and stores it in a table for analysis.
This repo contains a scraping script that crawls a JavaScript-rendered website using the Scrapy framework and the scrapy-playwright package in Python
This repo contains source code showing how to integrate a proxy service (ScraperAPI) with Scrapy Playwright. The repo has two spiders, one for quotes.toscrape.com and the other for httpbin.org/ip
This repo contains the scripts used to analyze the "shops" experiment in SG that targeted both the groceries and health and wellness verticals
This repo contains a Python script that crawls 5120 flight routes from the popular flight aggregator Skyscanner
This repo contains a GBQ script that clusters vendors according to their elasticity of demand and conversion rate trends. The goal is to identify vendors with lower price sensitivity than their peers so that a differentiated pricing strategy can be applied to them. An R script analyzes the performance of an A/B/n test that was set up to validate the quality of the clusters by measuring how much incremental gross profit the differentiated pricing strategy yields
This repo contains a GBQ script that pulls, cleans, and aggregates data from a hybrid experiment (A/B and diff-in-diff). The R script analyzes the performance and statistical significance of the results against key success metrics
This repo contains a data pipeline composed of Python and BigQuery scripts that extract, clean, and aggregate data, as well as perform statistical significance tests. The pipeline is fully orchestrated on Airflow and feeds a Tableau dashboard that displays the success metrics of surge pricing switchback experiments.
This repo contains queries used to extract data about vendor performance in APAC markets of Delivery Hero. The data is used to simulate the impact on gross profit and GMV if free delivery beneficiaries were charged delivery fees. The simulation is done in a Tableau dashboard.
This repo contains a GBQ script that pulls operational performance metrics for an e-commerce platform's suppliers and ranks them against one another for benchmarking purposes
This repo contains a Python script that tracks the availability of medical appointments on https://wafid.com/medical-status-search/ in the UAE
This repo contains two Rmd files. The first scrapes wine listings under the brand name "mövenpick" using the rvest package. The second scrapes JavaScript-rendered apartment listings on the Swiss real estate website homegate.ch using RSelenium
This repo contains a Python Selenium script that scrapes the restaurant name, subtitle, delivery fee, and promised order time from the restaurant listing page of Wolt (https://wolt.com/en/discovery/restaurants)