Tilelli / Tilelli Med
Compressed 15.8× from the float baseline while scoring above the published ComplEx and TransE baselines on the OGBL-biokg leaderboard, packed into a 24 MB binary that runs through a 17 KB C99 runtime, small enough for a $2 microcontroller. The model independently surfaced Rosiglitazone, Sitagliptin, Gliclazide, Tolbutamide, and Miglitol for type-2 diabetes, with those exact drug-disease pairs filtered out of training. Replicated on PrimeKG (2023): the ternary student again outperforms its own float teacher.
Open Graph Benchmark · ComplEx-N3 · Ternary {−1, 0, +1} · Not medical advice
Rosiglitazone (brand name Avandia) is an FDA-approved oral antidiabetic. The triple (Rosiglitazone, drug-disease, type-2 diabetes) was filtered out of the train, validation, and test sets before the candidate sweep. The model recovered it from the surrounding graph structure: shared drug–protein targets, shared side-effect profiles, shared mechanism families. This is the kind of pattern-finding a compressed KGE is supposed to do, and the model did it.
A knowledge-graph embedding model trained on the public OGBL-biokg benchmark — Stanford's release of ~94,000 biomedical entities (drugs, proteins, diseases, side-effects, biological functions) and 4.8 million relations from public literature. The architecture is ComplEx with N3 regularization and reciprocal relations (Lacroix et al. 2018), trained from scratch.
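ComplEx scores a triple as the real part of a trilinear product over complex-valued embeddings. A minimal NumPy sketch of that score, with each embedding stored as concatenated real and imaginary halves (function and variable names are illustrative, not Tilelli's API):

```python
import numpy as np

def complex_score(e_s, rel, e_o):
    """ComplEx triple score: Re(<e_s, rel, conj(e_o)>).

    Each argument is a real vector of even length 2d holding a
    d-dimensional complex embedding as [real half | imaginary half].
    """
    sr, si = np.split(e_s, 2)   # subject: real, imaginary parts
    rr, ri = np.split(rel, 2)   # relation
    orr, oi = np.split(e_o, 2)  # object
    # Re((sr + i·si)(rr + i·ri) · conj(orr + i·oi)), expanded term by term:
    return np.sum(sr * rr * orr + si * rr * oi + sr * ri * oi - si * ri * orr)
```

The reciprocal-relations trick from Lacroix et al. adds a second relation embedding per relation so that tail and head prediction are both answered as tail queries; the N3 regularizer penalizes the cubed absolute values of the embeddings during training. Neither changes the score function above.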
Our contribution is the ternary compression: each entity embedding is reduced from 32-bit floating point to a three-valued {−1, 0, +1} representation with a small per-block scale. At block size 128, that's 5.3× compression of the entity tables. The compressed model still scores 0.752 filtered MRR — above the published TransE leaderboard baseline (0.745). To our knowledge it's the first three-valued knowledge-graph embedding to do so on this benchmark.
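Per-block ternarization can be sketched as below. The 0.7·mean(|w|) threshold and the least-squares scale (mean absolute value of the kept entries) are the standard ternary-weight-network heuristic; the note does not state Tilelli's exact thresholding rule, so treat both as assumptions.

```python
import numpy as np

def ternarize_row(w, block=128):
    """Quantize one float row to {-1, 0, +1} with one scale per block.

    Threshold: 0.7 * mean(|w|) per block (TWN-style heuristic, an
    assumption here). Scale: mean |w| over the surviving entries,
    which is the least-squares fit for a fixed ternary pattern.
    Returns (int8 codes, float32 per-block scales).
    """
    w = np.asarray(w, dtype=np.float32)
    q = np.zeros(len(w), dtype=np.int8)
    scales = []
    for start in range(0, len(w), block):
        b = w[start:start + block]
        thresh = 0.7 * np.mean(np.abs(b))
        mask = np.abs(b) > thresh            # entries kept as ±1
        q[start:start + block] = (np.sign(b) * mask).astype(np.int8)
        scales.append(np.abs(b[mask]).mean() if mask.any() else 0.0)
    return q, np.array(scales, dtype=np.float32)
```

Dequantization is just `q[i] * scale[i // block]` per entry, so the runtime never needs the float table.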
We ran the candidate-prediction pipeline against 10 curated diseases spanning four categories. For each, the model ranks every drug in the graph as a candidate completion of (drug, drug-disease, this disease) — after filtering out drugs already linked to that disease in training. We then cross-check the top 20 against ChEMBL and Open Targets for independent evidence.
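The sweep-and-filter step reduces to a short loop. A hedged sketch, with `score_fn` standing in for the real model's tail-prediction score (the actual pipeline's interfaces are not shown in this note):

```python
def rank_candidates(score_fn, drug_ids, disease_id, known_links, top_k=20):
    """Score every drug as a completion of (drug, drug-disease, disease),
    drop drugs already linked to the disease in training, return top-k.

    score_fn(drug, disease) -> float is a placeholder for the model;
    known_links is the set of (drug, disease) pairs seen in training.
    """
    scored = [(d, score_fn(d, disease_id)) for d in drug_ids
              if (d, disease_id) not in known_links]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The filtering is what makes the Rosiglitazone result meaningful: the known pair never enters the candidate list as a trivial recall, so a high rank has to come from the surrounding graph structure.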
The model works well on dense cardio-metabolic sub-graphs — exactly where OGBL-biokg has rich coverage from decades of cardiovascular and diabetes research. It falls apart on sparser sub-graphs — oncology, psychiatric, autoimmune, respiratory. This isn't a flaw to hide. It's a property of the input graph and a useful map of where the method is and isn't trustworthy.
With the demo expanded to 56 diseases, the model surfaces the right drugs in several oncology categories as well, not just metabolic ones.
These are existing drugs whose actual indications the model recovered without being directly trained on the (drug, treats, this-cancer) triple. The corroboration matching for the score-only diseases is looser than for the 10 curated ones — read the percentages as "the literature already discusses this drug in related indications," not strict-match.
The per-row ternary model (15.8× compression on the entity tables) packs to a 24 MB .tmed binary that runs through a 17 KB statically-compiled C99 runtime — no Python, no PyTorch, no malloc. Linear scan over 93,773 entities for one query: ~870 ms on x86_64, projected 30–60 seconds on a $2 Cortex-M4F MCU with the model in a $0.50 external serial flash chip. Bit-equivalent to the Python reference. Top-6 picks for T2D include Saxagliptin, Gliclazide, Sitagliptin, Miglitol, and Tolbutamide — five FDA-approved oral antidiabetics surfaced from the graph alone.
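The linear scan amounts to an integer {−1, 0, +1} dot product per block, rescaled by that block's factor, so no dequantized float table is ever materialized. A Python sketch of the arithmetic (the real .tmed layout is bit-packed and the C99 runtime works very differently in the details; this only mirrors the math):

```python
import numpy as np

def ternary_scan(q_entities, scales, query, block=128):
    """Score every ternary entity row against one float query vector.

    q_entities: (n, d) int8 codes in {-1, 0, +1}
    scales:     (n, d // block) float32 per-block scales
    query:      (d,) float32; d assumed to be a multiple of block
    Returns (n,) scores equal to dequantize(q) @ query, computed
    block by block without building the dequantized table.
    """
    n, d = q_entities.shape
    out = np.zeros(n, dtype=np.float32)
    for start in range(0, d, block):
        qb = q_entities[:, start:start + block].astype(np.float32)
        out += scales[:, start // block] * (qb @ query[start:start + block])
    return out
```

On the MCU the inner product is a popcount-style integer accumulation with one multiply per block, which is why the scan fits in a 17 KB runtime with no heap.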
This is benchmark performance plus an external corroboration check. It is not a discovery of new medicines. OGBL-biokg is built from public literature — a high MRR means the model captures associations already implicit in the published record. Real drug discovery requires wet-lab assays, ADMET screening, selectivity studies, and clinical trials. None of that has happened here.
The candidates fall into three buckets; the point of the demo is to let a clinician see the distribution and form their own view.
Working name: Tilelli Med. A formal trademark search is pending before public launch. The code is intended for release under Apache-2.0. We are looking for clinical collaborators: read the technical note in Methods, or the French version of this page at /med/index.fr.html. A replication on PrimeKG (2023) is written up in the PrimeKG follow-up.