
Third Space Group AI
friend-affinity neural net · Keras + Postgres
An end-to-end affinity pipeline: Postgres feature store, PL/pgSQL feature extraction, and a trained Keras MLP scoring user pairs 0 to 100 for group matchmaking.
No live deployment; full source and the trained model are on GitHub.
Built by a two-person team from November to December 2023 for Third Space, a platform whose core idea is facilitating social interaction between people it predicts would be compatible friends. The data layer is a seven-table Postgres schema (users, hobbies, user_hobbies, friends, affinities, groups, group_members) seeded with 1,000 synthetic users, 16 hobbies, 7,238 friendship rows, and 3 groups. A PL/pgSQL function, get_user_features, computes the six pairwise features for any user pair in the database.
Training labels come from an explicit relationship heuristic (buddies score 100, friends 50, otherwise 0), with a random-label generator kept for comparison. The model is a Keras MLP: Dense layers of 128 and 64 units with 0.3 dropout and L2 regularization, and a sigmoid output rescaled to 0 to 100. It is trained with Adam on an MSE loss, with MAE tracked as a metric, under early stopping, learning-rate reduction, and checkpointing. The trained weights are committed, so a CLI loop can serve interactive affinity predictions without retraining.
The honest read: with deterministic heuristic labels, the network learns to reproduce that rule from the remaining features, so the substance of the project is the end-to-end loop from feature store to served prediction.
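The labeling rule and the 0 to 100 rescaling described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the function names, the choice to let "buddy" take precedence over "friend", and the idea of dividing labels by 100 to match a sigmoid output are all assumptions made for the example.

```python
# Sketch of the relationship heuristic (buddies 100, friends 50, otherwise 0)
# and the sigmoid <-> affinity rescaling. All names here are hypothetical.

BUDDY_SCORE = 100.0
FRIEND_SCORE = 50.0
NO_RELATION_SCORE = 0.0


def heuristic_label(is_buddy: bool, is_friend: bool) -> float:
    """Return the affinity label for a user pair.

    Assumption: a pair flagged as both buddy and friend gets the buddy score.
    """
    if is_buddy:
        return BUDDY_SCORE
    if is_friend:
        return FRIEND_SCORE
    return NO_RELATION_SCORE


def to_unit_target(score: float) -> float:
    """Map a 0-100 label into [0, 1] so it is comparable to a sigmoid output."""
    return score / 100.0


def to_affinity(sigmoid_output: float) -> float:
    """Map a sigmoid output in [0, 1] back to the 0-100 affinity scale."""
    clipped = min(max(sigmoid_output, 0.0), 1.0)
    return clipped * 100.0


if __name__ == "__main__":
    for flags in [(True, False), (False, True), (False, False)]:
        label = heuristic_label(*flags)
        print(flags, label, to_unit_target(label))
```

Because the label is a pure function of the relationship flags, any model given those flags (or close proxies for them) can fit the rule almost exactly, which is the limitation the write-up calls out.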
- Python
- Keras
- NumPy
- scikit-learn
- PostgreSQL
- PL/pgSQL
- psycopg2
- Matplotlib
- Team: 2 devs · 56 commits
- Pairwise features: 6 (PL/pgSQL extractor)
- Network: Keras MLP 128 → 64 → 1
- Synthetic graph: 1,000 users · 7,238 friend rows
What I'd improve
The affinity labels come from a hand-written relationship heuristic (buddies 100, friends 50, otherwise 0), so the network can only learn to reproduce that rule from the remaining features. The real upgrade is label quality: scores derived from actual interaction outcomes, a proper held-out evaluation beyond MAE, and treating the 0/50/100 label space as ordinal classification rather than regression.
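One way to treat the 0/50/100 label space as ordinal rather than continuous is a cumulative (threshold) encoding, where each label becomes a set of binary "is the score at least this level?" targets. The sketch below is one common encoding and is an assumption for illustration, not something the project implements; the function names and the 0.5 decision threshold are hypothetical.

```python
from typing import List, Sequence

# Ordered label levels taken from the heuristic (otherwise, friends, buddies).
LEVELS = (0, 50, 100)


def encode_ordinal(score: int) -> List[int]:
    """Encode a label as cumulative binary targets.

    With three levels there are two targets: [score >= 50, score >= 100].
    """
    if score not in LEVELS:
        raise ValueError(f"unexpected label: {score}")
    rank = LEVELS.index(score)
    return [1 if rank > k else 0 for k in range(len(LEVELS) - 1)]


def decode_ordinal(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Decode per-threshold probabilities back to a label.

    Counts consecutive thresholds passed, stopping at the first one that fails,
    so an inconsistent prediction such as [0.3, 0.8] decodes to the lowest level.
    """
    rank = 0
    for p in probs:
        if p <= threshold:
            break
        rank += 1
    return LEVELS[rank]


if __name__ == "__main__":
    for level in LEVELS:
        print(level, encode_ordinal(level))
    print(decode_ordinal([0.9, 0.2]))
```

With this encoding the network would have two sigmoid outputs trained with binary cross-entropy instead of one output trained on MSE, and mistakes between adjacent levels cost less than mistakes across the whole range.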