karim.semaan (open to work)
AI / ML Engineer

I build machine learning models and the products that ship them.

AI Solutions Lead at BDO (remote, part-time) · MS in AI at Northeastern. I take models from notebook to production.

  • Automated payroll · FMCG/pharma
  • Real-time reporting · 350+ employees
  • 7-model ensemble + 10k-run Monte Carlo
  • AWS ML – Specialty

Open to AI/ML roles + projects

01 / Selected Work

Things I've built.

Production client systems, applied-ML projects, and live in-browser demos. Public work runs in-page or links out; protected client work is previewable on synthetic data, and source and sensitive details stay private.

  • 01
    Graduate project · Live

    KickCast

    Turns a 21,371-match feature matrix into 10,000-run Monte-Carlo odds for every one of the 104 games of the 2026 World Cup: full probability distributions, not a single guess.

    An end-to-end ML system that predicts international football matches as a 3-class (home/draw/away) problem over a 21,371-match feature matrix, then runs a 10,000-iteration Monte Carlo simulation of the entire 48-team, 104-game 2026 World Cup from those probabilities. It combines seven classifiers with a custom PyTorch net, SHAP explainability, and a live dashboard.

    • Python
    • scikit-learn
    • XGBoost
    • LightGBM
    • PyTorch
    • Optuna
    • Feature matrix: 21,371 matches × 38 features
    • Models: 7 classifiers + PyTorch net
    • Monte Carlo: 10,000 iterations
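The step from per-match probabilities to simulated odds can be illustrated with a minimal sketch. This is not the KickCast code: the team names, the probabilities, and the helpers `simulate_group` and `group_winner_odds` are hypothetical, and it covers a single four-team round-robin group with a simplified tie-break rather than the full 104-game tournament.

```python
import random
from collections import Counter
from itertools import combinations

# Hypothetical 3-class probabilities (home win, draw, away win) per fixture.
# In a real system these would come from the trained classifiers.
MATCH_PROBS = {
    ("A", "B"): (0.50, 0.27, 0.23),
    ("A", "C"): (0.62, 0.22, 0.16),
    ("A", "D"): (0.70, 0.18, 0.12),
    ("B", "C"): (0.45, 0.28, 0.27),
    ("B", "D"): (0.55, 0.25, 0.20),
    ("C", "D"): (0.48, 0.27, 0.25),
}
TEAMS = ["A", "B", "C", "D"]


def simulate_group(rng):
    """Play every fixture once and return the team with the most points."""
    points = {team: 0 for team in TEAMS}
    for home, away in combinations(TEAMS, 2):
        outcome = rng.choices(
            ["home", "draw", "away"], weights=MATCH_PROBS[(home, away)]
        )[0]
        if outcome == "home":
            points[home] += 3
        elif outcome == "away":
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    # Assumption: ties on points are broken at random, which is a
    # simplification of real tournament tie-break rules.
    best = max(points.values())
    return rng.choice([team for team in TEAMS if points[team] == best])


def group_winner_odds(n_runs=10_000, seed=0):
    """Estimate each team's probability of winning the group."""
    rng = random.Random(seed)
    wins = Counter(simulate_group(rng) for _ in range(n_runs))
    return {team: wins[team] / n_runs for team in TEAMS}


odds = group_winner_odds()
for team, p in sorted(odds.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {p:.3f}")
```

Repeating the simulation many times yields a distribution over outcomes rather than one predicted winner, which is the property the project description emphasizes.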
  • 02
    Protected

    Bastion

    Compresses a 10–12-week security engagement into one workflow: a multi-stage Claude pipeline (~80% token reduction) where every finding cites the evidence it came from and a human signs off.

    A production multi-tenant SaaS that runs the full cybersecurity-maturity assessment lifecycle (evidence vault, a multi-stage Claude gap-analysis pipeline against frameworks like NIST CSF 2.0, scoring, and client-ready reports) with a strict consultant/client split enforced by row-level security (RLS). Live as an invite-only deployment, with a walkthrough available on request.

    • Next.js 15
    • React 19
    • TypeScript
    • Supabase
    • Claude (Sonnet + Haiku)
    • Pinecone (RAG)
    • Frameworks: 5 (NIST / CIS / ISO / SOC 2 / CMMC)
    • Every finding: cites evidence + human sign-off
    • Tenant isolation: Supabase RLS (client/internal)
    Private · request access
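The rule that every finding cites its evidence and needs a human sign-off can be expressed as a small data-model constraint. The sketch below is illustrative only and does not reflect Bastion's private source: the `Finding` class, its fields, the `publishable` helper, and the IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Finding:
    """A gap-analysis finding (hypothetical shape, for illustration only)."""

    control_id: str                    # placeholder ID, not a real framework control
    summary: str
    evidence_ids: List[str] = field(default_factory=list)
    approved_by: Optional[str] = None  # set once a human reviewer signs off

    def __post_init__(self):
        # Reject any finding that does not point back to at least one evidence item.
        if not self.evidence_ids:
            raise ValueError(f"finding for {self.control_id} cites no evidence")

    def sign_off(self, reviewer: str) -> None:
        """Record the human reviewer who approved this finding."""
        self.approved_by = reviewer


def publishable(findings):
    """Keep only the findings that a human has signed off."""
    return [f for f in findings if f.approved_by is not None]


findings = [
    Finding("CTRL-1", "MFA not enforced for admin accounts", ["EV-12"]),
    Finding("CTRL-2", "No offsite backups documented", ["EV-30", "EV-31"]),
]
findings[0].sign_off("reviewer@example.com")
report = publishable(findings)
print([f.control_id for f in report])
```

Putting the evidence check in the constructor means an uncited finding cannot exist at all, while sign-off stays a separate, explicit step.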
  • 03
    Graduate project · Completed

    INTELIPS

    Auto-labels 25,640 Enron emails with an LLM, then benchmarks five architectures. A weighted ensemble tops out at 72.85% macro-F1 on the real annotated data.

    A graduate NLP project that auto-labels 25,640 Enron emails into Low/Normal/Critical priority with an LLM served via the Groq API, then trains and benchmarks five architectures: TF-IDF + metadata XGBoost, a context-aware MLP, a BERT + attention model, fine-tuned BERT, and a personalized user-embedding net.

    • Python
    • PyTorch
    • scikit-learn
    • XGBoost
    • BERT / Transformers
    • Groq API (LLM)
    • Best real-data F1: 72.85% (ensemble)
    • XGBoost baseline: 72.18% macro-F1
    • LLM-annotated emails: 25,640 (Enron)
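For readers unfamiliar with the two ideas behind the headline number, here is a minimal pure-Python sketch of a weighted probability ensemble and the macro-F1 metric. The toy probabilities, weights, and function names are assumptions for illustration. They are not the INTELIPS models or data.

```python
LABELS = ["Low", "Normal", "Critical"]


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def weighted_ensemble(prob_sets, weights, labels):
    """Blend per-model class probabilities by weight, then take the argmax."""
    total = sum(weights)
    preds = []
    for rows in zip(*prob_sets):  # one probability row per model, per sample
        blended = [
            sum(w * row[k] for w, row in zip(weights, rows)) / total
            for k in range(len(labels))
        ]
        preds.append(labels[blended.index(max(blended))])
    return preds


# Toy class probabilities from two hypothetical models over four emails.
model_a = [[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6], [0.4, 0.5, 0.1]]
model_b = [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6], [0.1, 0.7, 0.2]]
y_true = ["Low", "Normal", "Critical", "Normal"]

preds = weighted_ensemble([model_a, model_b], weights=[0.6, 0.4], labels=LABELS)
score = macro_f1(y_true, preds, LABELS)
print(preds, round(score, 4))
```

Macro-F1 averages the per-class scores without weighting by class size, so a rare class such as Critical counts as much as a common one.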
  • 04
    Protected

    PayStream

    A full-stack payroll-automation platform built for multinational FMCG/pharma clients: a FastAPI + pandas/openpyxl backend that turns vendor payroll exports into bank payment files, GL/variance reports, and accruals, with a React config UI, JWT auth, and Docker deployment.

    • React
    • TypeScript
    • Vite
    • FastAPI
    • Python
    • pandas
    • Outputs: bank files · GL/variance · accruals
    • Platform: multi-currency · multi-DB
    • Security: JWT/bcrypt · RBAC · audit log
    Private · request access
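As an illustration of what a period-over-period variance report involves, here is a minimal sketch using only the standard library. It is not PayStream code: the account codes, amounts, the 10% threshold, and the `variance_report` function are hypothetical.

```python
from decimal import Decimal


def variance_report(previous, current, threshold_pct=Decimal("10")):
    """Compare two periods of per-account totals and flag large movements."""
    rows = []
    for account in sorted(set(previous) | set(current)):
        before = previous.get(account, Decimal("0"))
        after = current.get(account, Decimal("0"))
        delta = after - before
        # Percentage change is undefined when the prior balance is zero.
        pct = (delta / before * 100) if before else None
        if pct is None:
            flagged = delta != 0  # a new, non-zero account always needs review
        else:
            flagged = abs(pct) > threshold_pct
        rows.append(
            {"account": account, "delta": delta, "pct": pct, "flagged": flagged}
        )
    return rows


# Hypothetical GL totals for two consecutive payroll periods.
previous = {"6000": Decimal("1000.00"), "6100": Decimal("200.00")}
current = {
    "6000": Decimal("1050.00"),
    "6100": Decimal("260.00"),
    "6200": Decimal("75.00"),
}

rows = variance_report(previous, current)
for row in rows:
    print(row["account"], row["delta"], row["pct"], row["flagged"])
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding error.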

More work

Showing 8 of 13 projects.

  • Sol-Sniper

    AI Solana trading bot (MCP)

    • Python
    • LightGBM
    • scikit-learn
    Protected
  • KickCast Calibration Study

    isotonic recalibration · 2022 WC holdout

    • Python
    • scikit-learn
    • XGBoost
    Study
  • SEM BM

    Byzantine-chant learning games

    • Next.js 15
    • React 19
    • TypeScript
    Live
  • Eval Gauntlet

    LLM regression testing · in-browser

    • TypeScript
    • React
    • Next.js
    Live
  • Agrovio

    AgTech produce marketplace

    • Next.js
    • React
    • TypeScript
    Protected
  • Gainz Trackerz

    GPT-4 nutrition + fitness tracker

    • Next.js 14
    • TypeScript
    • FastAPI
    Completed
  • Instagram AI Automation

    Multi-model GenAI content pipeline

    • TypeScript
    • Node.js
    • OpenAI GPT-4o
    Completed
  • Birthday Buddy

    AI WhatsApp birthday concierge

    • TypeScript
    • Deno
    • Supabase Edge Functions
    Protected
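The isotonic recalibration named in the KickCast Calibration Study entry above can be sketched with the pool-adjacent-violators algorithm. This is a simplified binary (one class versus the rest) illustration on toy data, assuming distinct scores. It is not the study's code. In practice scikit-learn's `IsotonicRegression` does the same job.

```python
def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators: fit a non-decreasing map from score to outcome rate."""
    pairs = sorted(zip(scores, outcomes))
    # Each block holds [sum of outcomes, count, largest score in the block].
    blocks = []
    for score, outcome in pairs:
        blocks.append([float(outcome), 1, score])
        # Merge backwards while the block means violate monotonicity.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def recalibrate(score, steps):
    """Look up the calibrated probability for a raw score (step function)."""
    for upper, value in steps:
        if score <= upper:
            return value
    return steps[-1][1]  # scores above the training range use the last step


# Toy holdout set: raw model scores and whether the event happened.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
happened = [0, 0, 1, 0, 0, 1, 1, 0, 1]

steps = fit_isotonic(raw_scores, happened)
print(steps)
print(recalibrate(0.45, steps))
```

The fitted map is a step function that never decreases, so a higher raw score never maps to a lower calibrated probability.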
open source & beyond

Every repository.

The full inventory behind the curated work above: 50 repositories, 8 public.

Showing 12 of 50 repositories.

  • kickcast-worldcup

    World Cup match prediction: leakage-safe 21,371-match pipeline, 7-model ensemble + PyTorch net, calibration-first evaluation, 10k-run Monte Carlo simulation

    Jun 2026 · Completed ↗
  • PAEPS

    Jun 2026 · WIP ↗
  • Achievements

    Jun 2026 · Private · WIP
  • Birthday_Buddy

    Jun 2026 · Private · WIP
  • sol-sniper

    Jun 2026 · Private · WIP
  • TimeSheet_Report

    Real-time time-reporting and analytics platform for 350+ employees at a multinational consulting firm, cutting reporting time ~80%.

    May 2026 · Private · WIP
  • Terrace_Redesign

    May 2026 · Private · WIP
  • Prayer_App

    May 2026 · Private · WIP
  • KickCaster

    Original incremental development repo for KickCast (EECE 5644, Spring 2026): notebooks, src, cached outputs. Curated public release: karimsemaan/kickcast-worldcup

    ★ 1 star · Apr 2026 · Completed ↗
  • Bastion

    Cybersecurity assessment platform built with Next.js, Supabase, and Claude API. Evidence vault with version tracking, LLM-powered gap analysis against NIST CSF 2.0 controls, configurable maturity scoring, conflict detection, and RBAC-enforced client/internal views.

    Apr 2026 · Private · WIP
  • EECE5644-ML-Foundations-

    Mar 2026 · Private · WIP
  • CS7150-Deep-Learning

    Feb 2026 · Private · WIP
02 / Experience

Where I've worked.

A cloud / DevOps foundation (Harvard, Orna Therapeutics, Vocadian) that led to my current AI Solutions Lead role, shipping production AI for enterprise clients.

  1. Jun 2025 – Present

    (remote, part-time)

    BDO

    AI Solutions Lead

    Achrafieh, Lebanon

    • Built an automated payroll platform now used daily by Fortune 500 FMCG/pharma payroll teams for reporting (the PayStream work shown above).
    • Rolled out a real-time time-reporting & analytics system for 350+ employees, cutting reporting time ~80% (the TimeSheet platform).
  2. Sep 2024 – Jun 2025

    Vocadian

    DevOps Engineer

    Cambridge, MA

    • Migrated secrets to AWS Secrets Manager for SOC 2 compliance, closing off credential-leak risk.
    • Built CI/CD with GitHub Actions (~50% faster deploys) and migrated 10+ databases to RDS with zero data loss.
  3. Jan – Jul 2023

    Orna Therapeutics

    AWS Cloud / DevOps Engineer · Co-op

    Watertown, MA

    • Built AWS Lambda data-validation pipelines that cut data discrepancies ~15% for cleaner reporting.
    • Optimized GitLab CI/CD for Kubernetes and automated infrastructure with Terraform.
  4. Jan – Jul 2022

    Harvard University

    Cloud & IT Analyst · Co-op

    Cambridge, MA

    • Resolved 3,000+ ServiceNow tickets and 600+ Active Directory incidents.
    • Wrote PowerShell automation to deploy 1,000+ machines.
Live demos

Machine learning you can poke at.

Two interactive toys I wrote from scratch (no ML library), running in your browser. Train a small neural network and watch its decision boundary form, or race four optimizers down a loss surface. Real code, not screenshots.

Blue = class 0 · magenta = class 1 · the pale band is the decision boundary the network is still unsure about.

[Interactive demo: neural-network trainer. Readouts: epoch, loss, accuracy. Controls: dataset, hidden units (8 × 2 layers), learning rate (0.30), activation.]
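To give a flavor of the optimizer race in plain code, here is a minimal sketch that runs four update rules down the same elongated bowl with no ML library. The demo does not name its four optimizers, so the choice of SGD, momentum, RMSProp, and Adam is an assumption, as are the loss surface, starting point, and hyperparameters.

```python
import math

OPTIMIZERS = ["sgd", "momentum", "rmsprop", "adam"]


def loss(x, y):
    """An elongated bowl: shallow along x, steep along y."""
    return 0.5 * x * x + 5.0 * y * y


def grad(x, y):
    """Analytic gradient of the bowl above."""
    return x, 10.0 * y


def run(optimizer, steps=100, lr=0.05):
    """Run one optimizer from a fixed start and return the final loss."""
    pos = [4.0, 2.0]
    m = [0.0, 0.0]  # velocity (momentum) or first-moment estimate (Adam)
    v = [0.0, 0.0]  # running average of squared gradients
    for t in range(1, steps + 1):
        g = grad(*pos)
        for i in range(2):
            if optimizer == "sgd":
                step = lr * g[i]
            elif optimizer == "momentum":
                m[i] = 0.9 * m[i] + g[i]
                step = lr * m[i]
            elif optimizer == "rmsprop":
                v[i] = 0.9 * v[i] + 0.1 * g[i] ** 2
                step = lr * g[i] / (math.sqrt(v[i]) + 1e-8)
            else:  # adam, with bias-corrected moment estimates
                m[i] = 0.9 * m[i] + 0.1 * g[i]
                v[i] = 0.999 * v[i] + 0.001 * g[i] ** 2
                m_hat = m[i] / (1 - 0.9 ** t)
                v_hat = v[i] / (1 - 0.999 ** t)
                step = lr * m_hat / (math.sqrt(v_hat) + 1e-8)
            pos[i] -= step
    return loss(*pos)


results = {name: run(name) for name in OPTIMIZERS}
for name, final_loss in results.items():
    print(f"{name:>8}: {final_loss:.6f}")
```

All four start from the same point with the same learning rate, so differences in final loss come only from the update rule.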
03 / About

Four years, one direction.


$ whoami

karim_semaan

$ cat role.txt

AI / ML Engineer

$ cat focus.txt

applied ML · LLM systems · ship to production

$ cat status.txt

open to ai/ml roles + projects

$ cat location.txt

Boston, MA · Beirut, LB

I'm Karim, an AI/ML engineer. I started with a Flask app in 2022 and a stubborn curiosity about how things actually work. That turned into a CS foundation (algorithms, object-oriented design, the Berkeley Pac-Man AI projects), then AWS cloud certifications, then graduate-level Deep Learning, NLP, and Reinforcement Learning.

Today I do the part I love most: taking models out of the notebook. Whether it's an ensemble predicting a World Cup or an LLM reasoning over security controls, I care about the whole path: the math, the data, and the product people end up touching. I've also shipped production systems for clients (security, payroll, and a produce marketplace), which appear here as anonymized previews on synthetic data.

Next, I'm pushing further into applied ML, building on the AWS Machine Learning – Specialty certification and moving toward graduate AI/ML research at Northeastern University, while shipping more of this work as runnable, public demos.

“A model is only as good as the product that delivers it.”
04 / Timeline

How I got here.

  1. 2026 · Shipped flagship AI products: KickCast (ML prediction) and Bastion (LLM-powered security), plus consumer apps.
  2. 2025 · AWS Machine Learning – Specialty certified. Graduate AI/ML: Deep Learning, NLP, ML Foundations, Algorithms; first shipped automation tools.
  3. 2024 · AWS Solutions Architect – Associate certified.
  4. 2023 · CS fundamentals: Java OOD, web (React/Node), Berkeley Pac-Man AI (search, multiagents, RL). First GPT experiments.
  5. 2022 · First builds: Flask + databases.
05 / Capabilities

What I work with.

ML / DL
PyTorch, Ensemble methods, Monte Carlo simulation, Deep learning, Reinforcement learning, Model evaluation
LLM / GenAI
Claude API, OpenAI GPT-4o, Agentic / multi-agent (MCP), RAG (Pinecone), Prompt engineering, Groq
Data
pandas, NumPy, Jupyter, Data pipelines, Feature engineering, openpyxl
Ship it
Next.js, React, TypeScript, Supabase, Vercel, Flask, AWS
Languages
Python, TypeScript, JavaScript, Java, SQL
Credentials

Certified by AWS.

  • AWS Certified Machine Learning – Specialty

    Amazon Web Services Training and Certification

    Issued: January 17, 2025 · Expires: January 17, 2028
    Level: Advanced
  • AWS Certified Solutions Architect – Associate

    Amazon Web Services Training and Certification

    Issued: July 30, 2024 · Expires: July 30, 2027
    Level: Intermediate
Education

Academic foundation.

  • Northeastern University

    M.S. Artificial Intelligence, Machine Learning concentration

    Focus · Deep Learning · Natural Language Processing · Reinforcement Learning

    Prior · B.S. Computer Science & Mathematics · B.S. GPA 3.70 · Dean's List · NU Partners Scholarship

    Boston, MA · Sep 2025 – May 2027 (expected)

    Graduate
  • Graduate & upper-division coursework

    Deep Learning, Natural Language Processing, Machine Learning, Artificial Intelligence, Reinforcement Learning, Algorithms, Theory of Computation, Object-Oriented Design

06 / Contact

Let's put a model in production.

Open to full-time AI/ML roles and freelance projects. The fastest way to reach me is email, or just send a note below.

semaankarim02@gmail.com

Your note goes straight to my inbox. No newsletter, no spam.

© 2026 Karim Semaan · Built with Next.js, Tailwind & Supabase.