PROFESSIONAL SUMMARY
Technical professional specializing in Apple Silicon Architecture, Data Engineering, and Machine Learning. Proven expertise in building scalable ML pipelines, processing terabytes of data, and optimizing systems for real-time performance. Highly productive in remote, asynchronous environments with exceptional written communication skills (bilingual French/English). Seeking a remote position to leverage technical expertise in AI/ML, data engineering, and system architecture.
Key Differentiators:
- ⭐⭐⭐⭐⭐ Apple Silicon Architecture (M4 Max optimization, CoreML, MLX - very rare expertise)
- ⭐⭐⭐⭐ Data Engineering (Polars, Parquet, Dollar Bars, massive data processing - terabytes)
- ⭐⭐⭐⭐ Machine Learning (Meta-Labeling, scientific validation, feature engineering)
- ⭐⭐⭐⭐ Computer Vision (OCR, YOLO, SAM2 - real-time optimization)
- ⭐⭐⭐⭐ Feature Engineering (FracDiff, TDA, Microstructure - advanced techniques)
CORE TECHNICAL COMPETENCIES
AI/ML & Data Engineering
- Machine Learning: Meta-Labeling, XGBoost, CatBoost, Feature Engineering (FracDiff, TDA, Microstructure)
- Data Engineering: Polars (Lazy API), Parquet, Dollar Bars, massive data processing (terabytes)
- Scientific Validation: CPCV (Combinatorial Purged Cross-Validation), DSR (Deflated Sharpe Ratio)
- Feature Engineering: FracDiff, Topological Data Analysis (TDA), Microstructure (VPIN, OBI), Pattern Detection
Architecture & Systems
- Apple Silicon: M4 Max optimization, CoreML, MLX, Metal Performance Shaders (MPS)
- Local-First Architecture: external SSD storage, memory management (36GB RAM), scalable pipelines
- Blockchain: Solana RPC, Jito bundles, Borsh decoding, flash-loan simulation
- Computer Vision: OCR, YOLO, SAM2, real-time image processing
Languages & Tools
- Languages: Python (advanced), SQL (basics), Rust (basics)
- Data Processing: Polars (Lazy API), Pandas, NumPy, Parquet
- ML Frameworks: XGBoost, CatBoost, Scikit-learn
- Data Engineering: ETL Pipelines, Data Processing, Dollar Bars, Feature Engineering
- Infrastructure: Local-First Architecture, Docker (basics)
- Version Control: Git, GitHub
- Communication: Slack, Jira, Notion, Email
- Documentation: Markdown, Technical Writing
Work Style
- ✅ Remote-First: 100% remote, asynchronous communication
- ✅ Text-Based: Excellent written communication (English/French)
- ✅ Self-Directed: Autonomous, proactive, results-oriented
- ✅ Documentation: Strong technical documentation skills
PROFESSIONAL EXPERIENCE
ML Engineer / Data Engineer (Personal Projects)
Remote | Independent Projects | 2023 - Present
Built and orchestrated complex ML/data engineering systems from scratch:
Renaissance V3+ - Trading ML Pipeline
- Built scalable ETL pipeline processing 2.5+ TB of crypto market data (Binance, CCXT)
- Implemented 52 engineered features: 12 FracDiff features, 8 TDA features, 15 Microstructure features (VPIN, OBI), 17 Pattern Detection features
- Developed Meta-Labeling architecture (M1: XGBoost + M2: CatBoost) with scientific validation (CPCV, DSR)
- Optimized for Apple Silicon (M4 Max), achieving 87ms average latency (P95: 120ms) for real-time inference
- Architecture: Polars Lazy API, Parquet (zstd compression), Dollar Bars resampling, external SSD storage, 36GB RAM management
Technologies: Python, Polars, Parquet, XGBoost, CatBoost, Asyncio, CCXT, ETL Pipeline
Poker GTO-RT - Real-Time Poker Analysis
- Developed real-time poker analysis system with OCR, YOLO, SAM2 for table/card detection
- Implemented GTO solver (CFR++, Monte Carlo) for decision support
- Optimized pipeline to 380ms average latency (P95: 450ms) on Apple Silicon (CoreML)
- Built calibrated OCR system achieving 94% accuracy on poker table parsing
Technologies: Python, CoreML, YOLO, SAM2, Computer Vision, OCR, Rust
Titan DeFi - Blockchain Scanner & Simulator
- Built Solana blockchain scanner processing 302k accounts with Borsh decoding
- Developed flash-loan simulator with Jito bundle integration (atomic transactions)
- Created risk engine for liquidation detection and management
- Architecture: Hybrid code+vault, secure transaction handling
Technologies: Python, Solana RPC, Jito, Borsh, Flash-loan stack
GlassBox - SEC EDGAR Data Ingestion
- Built data ingestion system for SEC EDGAR (580+ companies)
- Implemented parsing pipeline (HTML → structured JSON)
- Architecture: Scalable, automated, local-first
Technologies: Python, Web Scraping, Data Parsing, JSON
SQL ETL Pipeline - Production ETL System
- Built production-ready ETL pipeline with separate extract, transform, and load stages
- Implemented SQL transformations (CTEs, window functions, aggregations)
- Developed data validation framework with schema and quality checks
- Architecture: Modular design, error handling, logging, retry logic
Technologies: Python, SQL, SQLAlchemy, Pandas, ETL Pipeline
Customer Service & Administrative Coordinator
Public Housing Services, Toulouse, France | 2005 - 2015
- Managed customer inquiries and service requests for 800+ residents across multiple properties
- Coordinated administrative processes including applications, renewals, and documentation management
- Resolved complex tenant issues through professional written and verbal communication (French/English)
- Maintained accurate databases and generated regular reports for management review
- Processed high-volume data entry with 99%+ accuracy rate
- Trained new staff on customer service protocols and administrative procedures
Key Skills: Written communication, data management, problem-solving, cross-functional collaboration
EDUCATION
Self-Taught | 2020 - Present
Technical expertise developed through hands-on projects, online resources, and continuous learning. Focused on Data Engineering, Machine Learning, and Apple Silicon Architecture.
LANGUAGES
- French: Native proficiency
- English: Professional working proficiency (written and spoken) - Excellent for text-based communication
ADDITIONAL INFORMATION
- US Permanent Resident (Green Card) - authorized to work in the USA (no visa sponsorship required)
- Available for immediate start
- Flexible schedule - can work across time zones
- Experienced remote worker with reliable home office setup (Mac Studio M4 Max, 36GB RAM)
KEY PROJECTS (GitHub Portfolio)
Renaissance V3+
Complete ML trading pipeline with advanced data engineering, feature engineering (FracDiff, TDA), Meta-Labeling, and scientific validation (CPCV, DSR). Processes terabytes of crypto data.
GitHub: github.com/fabienpierret/renaissance-v3 | Portfolio: fabienpierret.github.io
Poker GTO-RT
Real-time poker analysis system with computer vision and GTO solver.
GitHub: github.com/fabienpierret/poker-gto-rt | Portfolio: fabienpierret.github.io/projects/poker
Titan DeFi
Blockchain scanner and flash-loan simulator for Solana.
GitHub: github.com/fabienpierret/titan-defi | Portfolio: fabienpierret.github.io/projects/titan
SQL ETL Pipeline
Production-ready SQL ETL pipeline with data validation, transformation, and loading. Demonstrates SQL proficiency and ETL best practices.
GitHub: github.com/fabienpierret/sql-etl-pipeline | Portfolio: fabienpierret.github.io