ML / Data Science · Completed · 2023

Third Space Group AI

friend-affinity neural net · Keras + Postgres

An end-to-end affinity pipeline: Postgres feature store, PL/pgSQL feature extraction, and a trained Keras MLP scoring user pairs 0 to 100 for group matchmaking.


No live deployment; full source and the trained model are on GitHub.

Built by a two-person team in November and December 2023 for Third Space, a platform built around facilitating social interaction between people it predicts would be compatible friends. The data layer is a seven-table Postgres schema (users, hobbies, user_hobbies, friends, affinities, groups, group_members) seeded with 1,000 synthetic users, 16 hobbies, 7,238 friendship rows, and 3 groups. A PL/pgSQL function, get_user_features, computes the six pairwise features for any user pair in the database.

Training labels come from an explicit relationship heuristic (buddies score 100, friends 50, otherwise 0), with a random-label generator kept for comparison. The model is a Keras MLP (Dense 128 and 64 with 0.3 dropout and L2 regularization, sigmoid output rescaled to 0 to 100) trained with Adam on MSE/MAE under early stopping, learning-rate reduction, and checkpointing. The trained weights are committed, so a CLI loop can serve interactive affinity predictions without retraining.

The honest read: with deterministic heuristic labels the network learns to reproduce that rule from the remaining features, so the substance of the project is the end-to-end loop from feature store to served prediction.
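For readers who want to picture the labeling rule and the network shape, here is a minimal sketch and not the repository's actual code. Several details are assumptions: the relationship strings, ReLU activations, dropout after each hidden layer, the L2 strength, MSE as the loss with MAE as a tracked metric, the callback patience values, and the checkpoint filename. The rescaling is done here with a Rescaling layer, which is one of several ways to map a sigmoid output to 0 to 100.

```python
"""Sketch of the label heuristic and MLP described above (assumptions marked)."""

# Label heuristic from the write-up: buddies 100, friends 50, otherwise 0.
# The relationship strings are hypothetical; the real schema may encode them differently.
LABELS = {"buddy": 100.0, "friend": 50.0}


def heuristic_label(relationship):
    """Map the relationship between two users to a 0-100 affinity target."""
    return LABELS.get(relationship, 0.0)


def build_affinity_mlp(n_features=6, l2_strength=1e-4):
    """Build a 128 -> 64 -> 1 MLP whose sigmoid output is rescaled to 0-100."""
    # Imported lazily so the labeling helper works without TensorFlow installed.
    from tensorflow import keras

    reg = keras.regularizers.l2(l2_strength)  # strength is an assumed value
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),       # six pairwise features per user pair
        keras.layers.Dense(128, activation="relu", kernel_regularizer=reg),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(64, activation="relu", kernel_regularizer=reg),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(1, activation="sigmoid"),
        keras.layers.Rescaling(100.0),          # (0, 1) -> (0, 100)
    ])
    model.compile(optimizer=keras.optimizers.Adam(), loss="mse", metrics=["mae"])
    return model


def training_callbacks(checkpoint_path="affinity_best.keras"):
    """Early stopping, learning-rate reduction, and checkpointing on val_loss."""
    from tensorflow import keras

    return [
        keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=10, restore_best_weights=True),
        keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss", factor=0.5, patience=5),
        keras.callbacks.ModelCheckpoint(
            checkpoint_path, monitor="val_loss", save_best_only=True),
    ]
```

With a feature matrix X of shape (n_pairs, 6) and targets y built from heuristic_label, training under these assumptions would look like model.fit(X, y, validation_split=0.2, epochs=100, callbacks=training_callbacks()).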

Python
Keras
NumPy
scikit-learn
PostgreSQL
PL/pgSQL
psycopg2
Matplotlib

Team: 2 devs · 56 commits
Pairwise features: 6 (PL/pgSQL extractor)
Network: Keras MLP 128 → 64 → 1
Synthetic graph: 1,000 users · 7,238 friend rows

What I'd improve

The affinity labels come from a hand-written relationship heuristic (buddies 100, friends 50, otherwise 0), so the network can only learn to reproduce that rule from the remaining features. The real upgrade is label quality: scores derived from actual interaction outcomes, a proper held-out evaluation beyond MAE, and treating the 0/50/100 label space as ordinal classification rather than regression.
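One common way to treat a small ordered label space as ordinal classification is cumulative (threshold) encoding: each label becomes a vector of binary answers to "is the score above this level?". The sketch below illustrates that idea for the 0/50/100 labels. It is not something the project implements, and the function names are hypothetical.

```python
"""Cumulative (threshold) encoding for the ordered labels 0 < 50 < 100."""

SCORES = (0, 50, 100)  # ordered label space produced by the heuristic


def encode_ordinal(score):
    """Return one binary target per threshold: [score > 0, score > 50]."""
    rank = SCORES.index(score)  # raises ValueError for labels outside SCORES
    return [1 if rank > k else 0 for k in range(len(SCORES) - 1)]


def decode_ordinal(threshold_probs, cutoff=0.5):
    """Map per-threshold probabilities back to a score by counting passed thresholds."""
    rank = sum(1 for p in threshold_probs if p > cutoff)
    return SCORES[rank]


print(encode_ordinal(0))           # [0, 0]
print(encode_ordinal(50))          # [1, 0]
print(encode_ordinal(100))         # [1, 1]
print(decode_ordinal([0.9, 0.2]))  # 50
```

Under this encoding the network would end in two sigmoid units trained with binary cross-entropy instead of a single regression output, and metrics such as per-class accuracy or a confusion matrix become available alongside MAE.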

View source · Project report (PDF)

