Data Crawler & Automation Engineer

Youssef Nagy.

I reverse-engineer APIs you're not supposed to use.


// About

I specialize in reverse-engineering closed APIs — protobuf, GraphQL, internal endpoints — and turning them into clean, scalable data pipelines.

While others spin up browsers and fight CAPTCHAs, I decode the underlying protocols for direct HTTP extraction. Every tool I build ships as an open-source pip package, production-ready from day one.

My systems extract 100,000+ records per week across Google Maps, Meta Ads, government databases, and more — all without a browser in sight.

100K+ Records / Week
5+ Open-Source Tools
28 State Gov Sites
0 Browsers Needed

// What I Do

API Reverse Engineering

Decode private protocols — protobuf, GraphQL, internal REST — for direct HTTP extraction without browser overhead.
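
As a rough illustration of what reading a protobuf payload without its .proto schema involves, the sketch below walks the publicly documented protobuf wire format and lists the raw fields it finds. It is not the code from any package listed on this page; the function names `read_varint` and `iter_fields` are hypothetical, and deprecated group wire types are deliberately rejected.

```python
from typing import Iterator, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint longer than 10 bytes")


def iter_fields(buf: bytes) -> Iterator[Tuple[int, int, Union[int, bytes]]]:
    """Yield (field_number, wire_type, raw_value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: bytes, string, or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        yield field_number, wire_type, value


if __name__ == "__main__":
    # Field 1 as a varint, field 2 as a length-delimited byte string.
    sample = bytes.fromhex("089601120774657374696e67")
    for field in iter_fields(sample):
        print(field)
```

A length-delimited value may itself be a nested message, so in practice the same function is applied recursively to candidate sub-messages until the field layout makes sense.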

Anti-Bot Evasion

Bypass Cloudflare, Shape Security, Incapsula, DataDome, and other protection systems with TLS fingerprinting and stealth techniques.

Scalable Data Pipelines

Async scraping architectures that extract 100K+ records per week with automatic retry, proxy rotation, and deduplication.
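
The three ingredients named above (automatic retry, proxy rotation, deduplication) can be sketched with nothing but the standard library. This is a minimal sketch, not the architecture of any tool listed here: `fetch_with_retry`, `crawl`, the `fetch(url, proxy)` callable, and the `id` record key are all assumptions made for illustration, and the proxy pool is assumed to be non-empty.

```python
import asyncio
import itertools
import random
from typing import Awaitable, Callable, Dict, Iterable, Iterator, List

# Hypothetical transport: takes a URL and a proxy, returns parsed records.
Fetcher = Callable[[str, str], Awaitable[List[dict]]]


async def fetch_with_retry(fetch: Fetcher, url: str, proxies: Iterator[str],
                           max_attempts: int = 4,
                           base_delay: float = 0.5) -> List[dict]:
    """Try a URL up to max_attempts times, moving to the next proxy each time."""
    last_error = None
    for attempt in range(max_attempts):
        proxy = next(proxies)  # rotate: every attempt draws the next proxy
        try:
            return await fetch(url, proxy)
        except Exception as exc:  # a real pipeline would catch narrower errors
            last_error = exc
            if attempt < max_attempts - 1:
                # Exponential backoff with jitter before the next attempt.
                delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
                await asyncio.sleep(delay)
    raise RuntimeError(f"giving up on {url}") from last_error


async def crawl(urls: Iterable[str], fetch: Fetcher, proxy_pool: List[str],
                concurrency: int = 10, key: str = "id",
                base_delay: float = 0.5) -> List[dict]:
    """Fetch all URLs concurrently and return records deduplicated by key."""
    proxies = itertools.cycle(proxy_pool)
    limit = asyncio.Semaphore(concurrency)  # cap the number of in-flight requests
    seen: Dict[object, dict] = {}

    async def worker(url: str) -> None:
        async with limit:
            records = await fetch_with_retry(fetch, url, proxies,
                                             base_delay=base_delay)
        for record in records:
            seen.setdefault(record[key], record)  # keep the first copy only

    await asyncio.gather(*(worker(url) for url in urls))
    return list(seen.values())
```

In use, `fetch` would wrap whatever HTTP client the pipeline relies on; keeping it as a parameter also makes the retry and deduplication logic testable without any network access.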

Open-Source Tooling

Production-ready pip packages anyone can install and run. Clean APIs, thorough docs, battle-tested in production.

// Featured Projects

GoogleMapsCollector

Reverse-engineers Google Maps' internal protobuf API. Extract business data at scale — 100K+ records/week, no browser needed.

pip install gmaps-extractor
Protobuf, Python, aiohttp

MetaAdsCollector

Reverse-engineers Meta's private GraphQL API. Full Ad Library extraction with zero API keys required.

pip install meta-ads-collector
GraphQL, Python, Meta API

gov-websites-collector

Collects business registrations & professional licenses from 28 US state government websites. Anti-detect browser + ISP proxy support.

pip install gov-websites-collector
Playwright, Anti-Detect, ISP Proxies

generic-scraper-1

LLM-powered structured data extraction from any website. Define fields in plain English, get clean data — no CSS selectors needed.

LLM, Python, AI Extraction

linkedin-profile-extractor

LinkedIn profile data extraction with anti-detection measures. Stealth browser automation for reliable, undetected scraping.

Selenium, Anti-Detection, Stealth

// Tech Stack

Languages

Python, JavaScript, TypeScript

Scraping

Playwright, Selenium, Scrapy, aiohttp, Camoufox

Reverse Engineering

Protobuf, GraphQL, TLS Fingerprinting, HTTP/2

Backend

FastAPI, Node.js, React

Databases

PostgreSQL, MongoDB, Redis

Infrastructure

Docker, Proxy Rotation, ISP Proxies

// Get In Touch

Have a data extraction challenge? Need to reverse-engineer an API? Let's talk.