Spoiler-Free Race Quality Scoring for Formula 1

A Data-Driven Approach to Viewer Recommendation

Vishal Soni

shalliwatchtherace.com

Technical Report · v1.0 · May 2026

Abstract

We present a lightweight scoring system that classifies completed Formula 1 race sessions as either Watch (worth viewing in full) or Highlights (adequately summarised by a short package), without revealing finishing positions, driver names, or any other spoiler-bearing information. Four telemetry-derived signals—race incident severity, ambient weather conditions, on-track position dynamics, and pit strategy variance—are combined into a normalised composite score from which a binary verdict is derived. An optional personalisation layer upgrades a Highlights verdict to Watch for users who follow a driver or constructor that was notably active in a session, but never downgrades a Watch verdict. All data are sourced from the public OpenF1 telemetry API.

1. Introduction

Formula 1 race broadcasts run for approximately two hours. Many viewers are unable or unwilling to commit this time without prior knowledge of whether a race was competitive. However, consulting conventional reviews or social media invariably reveals results, creating a binary choice between spoilers and blind viewing.

This report describes the scoring algorithm underpinning Shall I Watch The Race?, a tool that resolves this tension by producing a verdict derived exclusively from process signals: how many times the safety car was deployed, whether it rained, how much movement occurred in the running order, and how varied the pit stop strategies were. None of these signals encodes a result. Positions are treated as anonymous integers: the official finishing order is never queried, and each driver's last recorded position is used only internally to count movers (Section 3.3), never displayed.

The system is intentionally simple. Complex machine-learning approaches would require labelled training data (subjective viewer ratings) and would be less interpretable. The rule-based approach described here is fully auditable: every point awarded maps directly to a specific, observable race event.

2. Data Sources

All race telemetry is retrieved from the OpenF1 public API (openf1.org). Four endpoints are consumed per session:

Endpoint	Data provided
/race_control	Safety car, virtual safety car, and red flag events with timestamps
/weather	Rainfall indicator, track temperature, and humidity at roughly 1-minute intervals
/position	Per-driver position values sampled every few seconds throughout the session
/pit	Individual pit stop records with driver number and lap number

Session metadata (calendar round, official race name) is sourced from the Jolpica/Ergast compatibility API. This is used solely for display purposes and does not influence scoring.
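As a minimal sketch of how the four scoring endpoints might be addressed for one session, the snippet below builds the request URLs without sending them. The base URL, the `session_key` query parameter, and the helper name `scoring_urls` are assumptions made for illustration and should be checked against the OpenF1 documentation.

```python
from urllib.parse import urlencode

# Assumed base URL for the OpenF1 REST API (verify against the project docs).
OPENF1_BASE = "https://api.openf1.org/v1"

# The four endpoints listed in the table above.
SCORING_ENDPOINTS = ("race_control", "weather", "position", "pit")


def scoring_urls(session_key: int) -> dict[str, str]:
    """Build one request URL per scoring endpoint for a given session."""
    query = urlencode({"session_key": session_key})
    return {name: f"{OPENF1_BASE}/{name}?{query}" for name in SCORING_ENDPOINTS}


# 9999 is a made-up session key used purely for illustration.
urls = scoring_urls(9999)
for name, url in urls.items():
    print(name, url)
```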

3. Scoring Methodology

The total raw score $R$ is the sum of four independent sub-scores, each capped at a fixed maximum:

R \;=\; S_{\text{incidents}} + S_{\text{weather}} + S_{\text{position}} + S_{\text{pit}}

Signal	Cap (pts)	% of calibration ceiling (C = 65)
Race incidents	25	38%
Weather	15	23%
Position dynamics	40	62%
Pit strategy	10	15%
Theoretical maximum	90	—
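The percentage column appears to be computed against the calibration ceiling of 65 introduced in Section 4 rather than against the theoretical maximum of 90, which is why the four shares add up to more than 100%. The short check below reproduces the column under that assumption; the dictionary and variable names are illustrative.

```python
# Sub-score caps from the table above.
CAPS = {
    "Race incidents": 25,
    "Weather": 15,
    "Position dynamics": 40,
    "Pit strategy": 10,
}
CALIBRATION_CEILING = 65  # C in Section 4 (assumed basis of the % column)

theoretical_max = sum(CAPS.values())
share_of_ceiling = {
    name: round(100 * cap / CALIBRATION_CEILING) for name, cap in CAPS.items()
}

print(theoretical_max)
print(share_of_ceiling)
```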

3.1 Race Control Events

Interruptions to normal racing, namely safety car (SC) deployments, virtual safety car (VSC) periods, and red flags, reliably indicate incidents that reshape the field. Let $n_{SC}$, $n_{VSC}$, and $n_{RF}$ denote the count of each event type observed in the race control feed:

S_{\text{incidents}} \;=\; \min\!\Bigl[\,\min(10\,n_{SC},\;20) \;+\; \min(4\,n_{VSC},\;8) \;+\; \min(12\,n_{RF},\;12),\;\;25\Bigr]

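Safety cars receive the largest sub-cap (20 points, against 12 for red flags and 8 for VSC periods) because they frequently cause pit stop divergence and position reshuffling; a red flag carries the highest single-event weight (12 points) but is counted only once. The overall cap of 25 prevents a race with many interruptions from scoring highly on incidents alone: on-track action must also be present.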
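The incident formula can be sketched directly in Python. The function below takes the three event counts as inputs; how those counts are extracted from the race control feed is not specified here, so the function name and signature are illustrative only.

```python
def incident_score(n_sc: int, n_vsc: int, n_rf: int) -> int:
    """Sub-score for race control events, following the formula in Section 3.1."""
    sc_points = min(10 * n_sc, 20)    # safety cars: 10 each, at most 20
    vsc_points = min(4 * n_vsc, 8)    # virtual safety cars: 4 each, at most 8
    rf_points = min(12 * n_rf, 12)    # red flags: 12, counted once
    return min(sc_points + vsc_points + rf_points, 25)


print(incident_score(1, 0, 0))  # one safety car
print(incident_score(2, 1, 0))  # two safety cars and one VSC
print(incident_score(2, 2, 1))  # enough events to hit the overall cap
```

The three calls give 10, 24, and 25 respectively: in the last case the uncapped sum is 40, so the overall cap is what bounds the result.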

3.2 Weather Conditions

Let $N$ denote the total number of weather readings for a session and define the wet-readings fraction:

r_w \;=\; \frac{\bigl|\{i : \mathrm{rainfall}_i > 0\}\bigr|}{N}

A piecewise function maps $r_w$ to a rainfall score $f(r_w)$:

f(r_w) \;=\; \begin{cases} 15 & r_w \;\geq\; 0.40 \\ 7 & 0.10 \;<\; r_w \;<\; 0.40 \\ 0 & r_w \;\leq\; 0.10 \end{cases}

The 10% lower bound prevents a brief shower, common at circuits such as Interlagos, from receiving the same score as a genuinely wet race. An additional 5-point bonus $g(\Delta T, r_w)$ is awarded when the track temperature swing $\Delta T$ (in °C) over the session exceeds 8°C under fully dry conditions ($r_w = 0$), capturing races where a cooling track changes tyre behaviour mid-race:

g(\Delta T,\, r_w) \;=\; \begin{cases} 5 & \Delta T > 8 \;\wedge\; r_w = 0 \\ 0 & \text{otherwise} \end{cases}

S_{\text{weather}} \;=\; \min\!\bigl[f(r_w) + g(\Delta T, r_w),\;\;15\bigr]
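A minimal sketch of the weather sub-score follows. It assumes that $\Delta T$ is the difference between the highest and lowest track temperature readings in the session, which the text does not state explicitly; the function name and argument layout are likewise illustrative.

```python
def weather_score(rainfall: list[float], track_temps: list[float]) -> int:
    """Weather sub-score from per-reading rainfall values and track temperatures."""
    if not rainfall:
        return 0  # no readings: nothing to score

    # Fraction of readings with any rainfall (r_w).
    r_w = sum(1 for value in rainfall if value > 0) / len(rainfall)

    if r_w >= 0.40:
        f = 15
    elif r_w > 0.10:
        f = 7
    else:
        f = 0

    # Assumption: delta T is the max-min swing of track temperature.
    delta_t = max(track_temps) - min(track_temps) if track_temps else 0.0
    g = 5 if (delta_t > 8 and r_w == 0) else 0

    return min(f + g, 15)


print(weather_score([0] * 10, [30.0, 40.0]))          # dry, 10 degree swing
print(weather_score([1] * 5 + [0] * 5, [25.0, 26.0]))  # half the readings wet
print(weather_score([1] * 2 + [0] * 8, [25.0, 26.0]))  # 20% of readings wet
```

Because the bonus requires $r_w = 0$, it can never be combined with a non-zero rainfall score, so a dry race contributes at most 5 points through this signal.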

3.3 On-Track Position Dynamics

The position telemetry stream is a time-ordered sequence of $(d, t, p)$ triples where $d$ is a driver number, $t$ a timestamp, and $p$ a position integer. For each driver $d \in D$, define:

$p_d^{(0)}$ — first recorded position (grid position proxy)
$p_d^{(T)}$ — last recorded position (finishing position proxy)
$p_d^{*}$ — maximum position value reached (worst point in race)

A driver is a forward mover if they finished at least 5 places higher than they started (net gain only; retirements, which drop drivers to the back, do not qualify):

\phi(d) \;=\; \mathbf{1}\!\left[p_d^{(0)} - p_d^{(T)} \;\geq\; 5\right]

A driver is credited with a recovery drive if they fell 10 or more positions from their grid slot yet finished within 2 places of where they started. This indicates a dramatic mid-race comeback that the forward-mover signal misses because the net change is close to zero:

\rho(d) \;=\; \mathbf{1}\!\left[p_d^{*} - p_d^{(0)} \;\geq\; 10 \;\wedge\; \bigl|p_d^{(T)} - p_d^{(0)}\bigr| \;\leq\; 2\right]

Let $M$ be the total mover count and $L$ the number of unique drivers who held first position at any point during the session:

M \;=\; \sum_{d \in D}\!\left(\phi(d) \;\vee\; \rho(d)\right), \qquad L \;=\; \bigl|\!\bigl\{d \in D : \exists\, t \text{ s.t. } p_d^{(t)} = 1\bigr\}\bigr|

S_{\text{position}} \;=\; \min\!\bigl(6L \;+\; 2M,\;\;40\bigr)

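Each distinct leader ($L$) is weighted three times as heavily as each mover ($M$) because a contested lead is the strongest indicator of a race worth watching in full. Note that $L$ counts distinct leaders rather than individual lead changes, so two drivers repeatedly swapping the lead still contribute $L = 2$.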
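The definitions in this section can be sketched as a single pass over the position samples. The tuple layout `(driver, timestamp, position)` mirrors the $(d, t, p)$ triples above; the function name and the returned sets are illustrative choices, and the sample data is invented.

```python
from collections import defaultdict


def position_score(samples):
    """Return (score, leaders, movers) from (driver, timestamp, position) triples."""
    by_driver = defaultdict(list)
    for driver, _, position in sorted(samples, key=lambda s: s[1]):
        by_driver[driver].append(position)

    leaders, movers = set(), set()
    for driver, positions in by_driver.items():
        first, last, worst = positions[0], positions[-1], max(positions)
        if 1 in positions:
            leaders.add(driver)
        forward = first - last >= 5                                  # phi(d)
        recovery = worst - first >= 10 and abs(last - first) <= 2    # rho(d)
        if forward or recovery:
            movers.add(driver)

    score = min(6 * len(leaders) + 2 * len(movers), 40)
    return score, leaders, movers


# Invented samples: drivers 1 and 2 both lead, 3 gains six places, 4 recovers.
samples = [
    (1, 0, 1), (2, 0, 2), (3, 0, 12), (4, 0, 3),
    (2, 1, 1), (1, 1, 2), (4, 1, 15), (3, 1, 9),
    (3, 2, 6), (4, 2, 4),
]
score, leaders, movers = position_score(samples)
print(score, sorted(leaders), sorted(movers))
```

With two distinct leaders and two movers the sketch yields 6 × 2 + 2 × 2 = 16 points. Since some driver always holds first position, $L \geq 1$ and this sub-score is at least 6 whenever position data exists.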

3.4 Pit Strategy Variance

Let $s_d$ be the number of pit stops recorded for driver $d$ and, in this subsection only, let $D$ denote the set of drivers with at least one recorded stop:

\bar{s} \;=\; \frac{1}{|D|}\sum_{d \in D} s_d, \qquad s_{\max} \;=\; \max_{d \in D}\, s_d

S_{\text{pit}} \;=\; \begin{cases} 10 & s_{\max} \;\geq\; 3 \;\vee\; \bar{s} \;>\; 2 \\ 0 & \text{otherwise} \end{cases}

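This signal is a coarse proxy for strategic variety: a driver making three or more stops, or a field averaging more than two, usually points to diverging storylines, such as some drivers taking two stops while others extend on one. Because it is triggered by stop counts rather than by their spread, it does not distinguish genuine variety from uniformly high stop counts: a race in which every driver pits three times on a standard degradation schedule also receives the 10 points, although it is not more exciting than a conventional two-stop race (see Section 7).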
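The pit sub-score can be sketched as follows, assuming the input is simply one driver number per pit stop record; the function name is hypothetical.

```python
from collections import Counter


def pit_score(pit_stop_drivers: list[int]) -> int:
    """Pit sub-score from a list holding one driver number per recorded stop."""
    stops = Counter(pit_stop_drivers)  # s_d for every driver with >= 1 stop
    if not stops:
        return 0
    s_max = max(stops.values())
    s_bar = sum(stops.values()) / len(stops)
    return 10 if (s_max >= 3 or s_bar > 2) else 0


print(pit_score([1, 1, 2, 3]))        # nobody stops more than twice
print(pit_score([1, 1, 1, 2]))        # one driver stops three times
print(pit_score([1, 1, 1, 2, 2, 2]))  # every driver stops three times
```

Two observations follow from the formula as written. First, because stop counts are integers, a mean above 2 implies that at least one driver made three or more stops, so the $\bar{s} > 2$ clause never fires on its own. Second, a field in which every driver stops three times also scores 10.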

4. Score Normalisation and Verdict Assignment

The raw score $R$ is normalised against a calibration ceiling $C = 65$ (rather than the theoretical maximum of 90), truncated to one decimal place, and capped at 10:

\sigma \;=\; \min\!\left(10,\;\;\left\lfloor\frac{R}{C} \times 100\right\rfloor \big/ 10\right)

The ceiling of 65 was calibrated so that a genuinely exciting race—containing a safety car deployment, multiple lead changes, and solid overtaking—scores approximately 6–8 out of 10. Setting $C = 90$ (the theoretical maximum) compressed scores into a narrow range and made separation between races difficult. The binary verdict is then:

V \;=\; \begin{cases} \textit{Watch} & \sigma \;\geq\; 4.0 \\ \textit{Highlights} & \sigma \;<\; 4.0 \end{cases}

The threshold of 4.0, equivalent to a raw score of $R \geq 26$, was chosen empirically. At this level, a dry race typically needs at least one safety car or a modest amount of on-track movement to be recommended. A processional race with a brief shower (weather score of at most 7) stays below the threshold.
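The normalisation and verdict rules can be sketched as below. Integer floor division is used so that the truncation matches the floor in the formula without floating-point surprises; this relies on the raw score being an integer, which holds because every sub-score is an integer. Function and constant names are illustrative.

```python
CEILING = 65      # calibration ceiling C
THRESHOLD = 4.0   # Watch / Highlights boundary


def normalise(raw: int, ceiling: int = CEILING) -> float:
    """Map an integer raw score to a 0-10 scale, truncated to one decimal place."""
    return min(10.0, (100 * raw) // ceiling / 10)


def verdict(raw: int) -> str:
    """Binary verdict derived from the normalised score."""
    return "Watch" if normalise(raw) >= THRESHOLD else "Highlights"


for raw in (25, 26, 46, 90):
    print(raw, normalise(raw), verdict(raw))
```

A raw score of 25 normalises to 3.8 (Highlights) and 26 to exactly 4.0 (Watch), so 26 is the smallest raw score that earns a recommendation. A raw score of 46, for example one safety car, three distinct leaders, four movers, and the pit bonus, normalises to 7.0.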

5. Personalisation Layer

During scoring, the algorithm maintains a notable driver set $N \subseteq D$ (not to be confused with the weather-reading count $N$ of Section 3.2):

N \;=\; \bigl\{d \in D : p_d^{(t)} = 1 \text{ for some } t\bigr\} \;\cup\; \bigl\{d \in D : \phi(d) \;\vee\; \rho(d)\bigr\}

For a user following driver set $D_u$ (which may contain a single driver number or all driver numbers belonging to a followed constructor), the personalised verdict is:

\hat{V}(u) \;=\; \begin{cases} \textit{Watch} & V = \textit{Highlights} \;\wedge\; D_u \cap N \;\neq\; \emptyset \\ V & \text{otherwise} \end{cases}

Personalisation is strictly one-directional: a Highlights verdict may be upgraded to Watch, but a Watch verdict is never downgraded. The rationale is asymmetric: the cost of missing a race your favourite driver featured in is higher than the cost of watching an unremarkable race. An upgraded verdict is labelled "for you" in the interface to communicate that it reflects personal preference rather than general race quality.
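The upgrade rule reduces to a set intersection. The sketch below returns the personalised verdict together with a flag that an interface could use to show the personal-preference label; the function name and the returned tuple are assumptions for illustration.

```python
def personalised_verdict(
    base: str, followed: set[int], notable: set[int]
) -> tuple[str, bool]:
    """Upgrade Highlights to Watch when a followed driver is in the notable set."""
    upgraded = base == "Highlights" and bool(followed & notable)
    return ("Watch" if upgraded else base, upgraded)


notable = {4, 16, 81}  # invented driver numbers

print(personalised_verdict("Highlights", {16}, notable))     # upgraded
print(personalised_verdict("Highlights", {44, 63}, notable))  # unchanged
print(personalised_verdict("Watch", {16}, notable))           # never downgraded
```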

6. Data Quality and Cancelled Race Detection

The OpenF1 session feed occasionally contains phantom sessions—test entries or sessions that were created administratively but never ran. Two independent filters are applied before a session is eligible for scoring:

Jolpica calendar gate. Each session must match a round in the official Jolpica/Ergast calendar for the requested year. Matching uses a closest-date approach with an 8-day window; sessions outside this window are dropped.
Lap-data check (recent sessions only). For sessions completed within the last 60 days, the scoring system verifies that at least 10 drivers have a recorded lap-1 entry. A cancelled session that happens to fall within the calendar window (e.g., a wet-weather cancellation at a circuit running back-to-back weekends) has zero lap records and is therefore excluded. This check is limited to recent sessions to avoid issuing 20+ concurrent requests when loading historical seasons.
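The two filters above can be sketched as follows. The calendar is modelled as a mapping from round number to race date, the 8-day window is treated as inclusive, and lap records are assumed to be dictionaries with `driver_number` and `lap_number` keys; all of these, along with the function names and the example dates, are illustrative assumptions.

```python
from datetime import date


def match_round(session_date: date, calendar: dict[int, date], window_days: int = 8):
    """Return the closest calendar round within the window, or None."""
    if not calendar:
        return None
    rnd, race_date = min(
        calendar.items(), key=lambda item: abs((item[1] - session_date).days)
    )
    return rnd if abs((race_date - session_date).days) <= window_days else None


def has_run(lap_records: list[dict], min_drivers: int = 10) -> bool:
    """True when enough distinct drivers have a recorded lap-1 entry."""
    drivers = {r["driver_number"] for r in lap_records if r.get("lap_number") == 1}
    return len(drivers) >= min_drivers


# Made-up calendar and sessions for illustration.
calendar = {1: date(2026, 3, 8), 2: date(2026, 3, 22)}
print(match_round(date(2026, 3, 21), calendar))  # close to round 2
print(match_round(date(2026, 5, 1), calendar))   # outside every window

laps = [{"driver_number": n, "lap_number": 1} for n in range(1, 11)]
print(has_run(laps), has_run(laps[:9]), has_run([]))
```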

All API responses are cached at the infrastructure layer: session lists for 2 minutes, telemetry for 1 hour, and driver grids for 1 hour with a 24-hour stale-while-revalidate window.

7. Limitations and Future Work

Several aspects of race excitement are not captured by the current scoring system:

Championship context. A last-lap overtake between title contenders carries more significance than the same move between mid-field runners. The algorithm has no awareness of standings.
Close battles without overtakes. Wheel-to-wheel racing in the final laps that does not result in a position change goes undetected.
Grid-position signal quality. First-recorded position is used as a grid-position proxy. Drivers who start from the pit lane or are penalised before lap 1 may have an inaccurate starting reference, slightly over-reporting forward movement.
Strategy interpretation. The pit-strategy signal rewards high stop counts but does not model undercut/overcut sequences or tyre-compound divergence, which require additional telemetry not currently in scope.
Sprint sessions. The same algorithm is applied to sprint races, which are shorter and structurally different. No scoring adjustments are made for session length.

Future iterations may incorporate a continuous-overtake count (if made available through the OpenF1 API) and a nearest-rival proximity signal derived from GPS coordinates to address the close-battle limitation.

References

OpenF1 Project. OpenF1 — Free and Open-Source F1 Data. openf1.org, 2024.
Jolpica. Jolpica-F1 API — Ergast Compatibility Layer. api.jolpi.ca/ergast, 2024.
Fédération Internationale de l'Automobile. 2025 Formula One Sporting Regulations. FIA, Geneva, 2025.