Network Experiments & Interference
When you give a user a new feature, they tell their friends. When you change a seller's pricing, it ripples through the buyers they serve. When you treat a node in a social network, the untreated nodes in its neighborhood are no longer a clean control. Network interference — the violation of SUTVA through social or marketplace connections — is one of the most common and most underappreciated sources of bias in tech industry experiments.
Theory
Figure: a bipartite marketplace, where seller treatment spills over to shared buyers.
Network interference occurs when unit $i$'s outcome depends not just on its own assignment but on the assignments of its neighbors. This invalidates the standard potential outcomes framework, which assumes $Y_i(\mathbf{z}) = Y_i(z_i)$ — that unit $i$'s outcome depends only on its own assignment $z_i$. To reason carefully about interference, we need the exposure mapping framework.
Exposure mappings. An exposure mapping $d_i(\mathbf{z})$ compresses the full assignment vector $\mathbf{z} \in \{0,1\}^n$ into a summary of the relevant exposure for unit $i$. For example:
- In a social network: $d_i(\mathbf{z}) = \left(z_i,\ \frac{1}{|N(i)|}\sum_{j \in N(i)} z_j\right)$ — my assignment and the fraction of neighbors who are treated
- In a bipartite marketplace: $d_b(\mathbf{z}) = \frac{1}{|S(b)|}\sum_{s \in S(b)} z_s$ — the fraction of buyer $b$'s sellers who are treated
The key insight is that we replace $Y_i(\mathbf{z})$ with $Y_i(d_i(\mathbf{z}))$, which is a manageable object to reason about and estimate.
Why it had to be this way. Without an exposure mapping, we can only estimate ITT (intent-to-treat) effects — the average over all the messy ways interference contaminates the control group. With an exposure mapping, we can define and estimate: direct effects (unit treated, neighbors not), indirect effects (unit not treated, neighbors treated), and total effects. These decompositions are essential for understanding mechanism and for correcting bias.
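The social-network exposure mapping above can be sketched in a few lines. A minimal illustration (the function name and the plain-dict graph representation are mine, not from any particular library):

```python
def social_exposure(
    adjacency: dict[int, list[int]],
    assignment: dict[int, int],
) -> dict[int, tuple[int, float]]:
    """Map each node to (own assignment, fraction of treated neighbors)."""
    exposure = {}
    for node, neighbors in adjacency.items():
        treated_frac = (
            sum(assignment[j] for j in neighbors) / len(neighbors)
            if neighbors else 0.0  # isolated nodes see no neighbor exposure
        )
        exposure[node] = (assignment[node], treated_frac)
    return exposure
```

With this in hand, units can be bucketed by exposure type exactly as in the bipartite walkthrough below.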
Randomization strategies for networked data:
- Cluster randomization (ego-network): Randomly assign entire friendship clusters — ego and all alters — to treatment or control. Eliminates within-cluster interference but ignores between-cluster spillovers.
- Graph clustering: Partition the social graph into dense communities (Louvain, spectral clustering) and randomize at the community level. Reduces interference because few edges cross community boundaries.
- Bipartite randomization: In two-sided markets, randomize on one side (sellers) and measure the other (buyers). Buyers in the same exposure zone (all sellers treated, all untreated, or mixed) are analyzed separately.
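As a concrete sketch of the first strategy, here is one greedy way to form ego-clusters and flip a single coin per cluster. The greedy claiming rule is a simplification I am using for illustration; production systems typically balance cluster sizes and handle overlap more carefully:

```python
import numpy as np

def ego_cluster_assignment(
    adjacency: dict[int, list[int]], seed: int = 0
) -> dict[int, int]:
    """Greedy ego-clustering: each still-unassigned node becomes an ego and
    claims its still-unassigned neighbors; the whole cluster then shares
    one treatment coin flip, so ego and alters land in the same arm."""
    rng = np.random.default_rng(seed)
    arm: dict[int, int] = {}
    for ego in sorted(adjacency):       # deterministic scan order
        if ego in arm:
            continue
        flip = int(rng.integers(0, 2))  # one draw per cluster
        arm[ego] = flip
        for alter in adjacency[ego]:
            if alter not in arm:
                arm[alter] = flip
    return arm
```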
Estimands under interference:
- Direct effect (DE): $\tau_{DE} = E[Y_i(z_i = 1, d_i = 0)] - E[Y_i(z_i = 0, d_i = 0)]$ — treated vs. control with untreated neighbors
- Indirect/spillover effect (IE): $\tau_{IE} = E[Y_i(z_i = 0, d_i = 1)] - E[Y_i(z_i = 0, d_i = 0)]$ — neighbor treatment on an untreated unit
- Total effect (TE): $\tau_{TE} = E[Y_i(z_i = 1, d_i = 1)] - E[Y_i(z_i = 0, d_i = 0)]$ — from full control to full treatment
Walkthrough
Scenario: A marketplace has 10,000 sellers and 100,000 buyers. We test a new seller pricing tool. Sellers are randomized to treatment/control; we measure buyer outcomes.
Step 1: Compute buyer exposure mapping.
import pandas as pd
import numpy as np

def compute_buyer_exposure(
    buyer_seller_edges: pd.DataFrame,  # columns: buyer_id, seller_id
    seller_assignments: pd.Series,     # index: seller_id, values: 0/1
) -> pd.DataFrame:
    """Classify each buyer's exposure based on their sellers' assignments."""
    merged = buyer_seller_edges.merge(
        seller_assignments.rename('treatment'),
        left_on='seller_id', right_index=True,
    )
    stats = merged.groupby('buyer_id').agg(
        n_sellers=('treatment', 'count'),
        n_treated=('treatment', 'sum'),
    ).reset_index()
    stats['treat_frac'] = stats['n_treated'] / stats['n_sellers']

    def exposure_type(row):
        if row['treat_frac'] == 1.0:
            return 'direct'
        elif row['treat_frac'] == 0.0:
            return 'control'
        return 'indirect'

    stats['exposure_type'] = stats.apply(exposure_type, axis=1)
    return stats

Step 2: Estimate direct and spillover effects.
def estimate_network_effects(
    buyer_outcomes: pd.DataFrame,  # columns: buyer_id, outcome
    buyer_exposure: pd.DataFrame,
) -> dict:
    """Estimate direct, indirect, and total effects."""
    df = buyer_outcomes.merge(
        buyer_exposure[['buyer_id', 'exposure_type']], on='buyer_id'
    )
    means = df.groupby('exposure_type')['outcome'].agg(['mean', 'std', 'count'])
    direct_effect = means.loc['direct', 'mean'] - means.loc['control', 'mean']
    indirect_effect = means.loc['indirect', 'mean'] - means.loc['control', 'mean']
    se_direct = np.sqrt(
        means.loc['direct', 'std']**2 / means.loc['direct', 'count'] +
        means.loc['control', 'std']**2 / means.loc['control', 'count']
    )
    return {
        'direct_effect': round(direct_effect, 6),
        'indirect_effect': round(indirect_effect, 6),
        'total_effect': round(direct_effect + indirect_effect, 6),
        'se_direct': round(se_direct, 6),
    }

Step 3: Cluster-robust standard errors. Because buyers sharing sellers are correlated, cluster SEs at the seller level.
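One simple way to implement Step 3, shown under the simplifying assumption that each buyer is attributed to a single seller via a hypothetical `primary_seller_id` column (buyers connected to many sellers need multiway clustering or randomization inference instead):

```python
import numpy as np
import pandas as pd

def seller_clustered_se(df: pd.DataFrame) -> float:
    """Cluster-robust SE of the mean buyer outcome, clustering on a
    (hypothetical) primary_seller_id column: collapse to seller-level
    means first, then take the SEM across sellers."""
    seller_means = df.groupby('primary_seller_id')['outcome'].mean()
    return float(seller_means.std(ddof=1) / np.sqrt(len(seller_means)))
```

Collapsing to seller-level means before computing the SEM is the two-stage shortcut; it is conservative and easy to audit compared with sandwich-style cluster-robust variance formulas.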
Analysis & Evaluation
Where your intuition breaks. The natural reaction to interference is to build bigger clusters — assign entire friend groups together. Bigger clusters do reduce interference, but they also reduce the number of independent observations: a design with 10 large clusters has the statistical power of roughly 10 observations, not 10,000. The optimal cluster size balances interference reduction against effective-sample-size reduction, and the optimum often lies at surprisingly small clusters (5–20 users), accepting modest interference in exchange for far greater power.
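This tradeoff can be quantified with the standard design-effect formula, $\text{deff} = 1 + (m - 1)\rho$, where $m$ is the cluster size and $\rho$ the intraclass correlation. A rough sketch:

```python
def effective_sample_size(n_units: int, cluster_size: int, icc: float) -> float:
    """Effective n under cluster randomization with equal-size clusters:
    n / (1 + (m - 1) * rho), the standard design-effect correction."""
    deff = 1.0 + (cluster_size - 1) * icc
    return n_units / deff
```

With 10,000 users, clusters of 20, and an ICC of 0.1, deff = 2.9 and the effective sample is about 3,450 users, which is why modest cluster sizes often win.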
| Design | Interference | Effective $n$ | Use when |
|---|---|---|---|
| Individual randomization | High | # users | No network, SUTVA valid |
| Ego-cluster | Low | # clusters | Dense friendship graphs |
| Graph clustering (Louvain) | Very low | # communities | Sparse graphs, few cross-edges |
| Bipartite (seller-side) | Structured | # sellers | Two-sided marketplaces |
| Full holdout | Zero | # markets | Only option for full-network effects |
When to give up and use a holdout. If the network is so dense that no reasonable clustering achieves cross-cluster edge fraction below 5%, a bipartite experiment or full geo holdout is more honest than reporting an "experiment result" with heavy interference contamination.
Interference typically biases toward zero. When treatment spills into the control group and lifts control outcomes, the treated-vs-control gap compresses, so the experiment underestimates the true effect; in that common case network experiments are conservative, not anti-conservative. The sign is not guaranteed, though: when treated units compete with control units for a fixed pool of demand (marketplace cannibalization), control outcomes are pushed down and the apparent effect is inflated.
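A back-of-envelope check of the attenuation logic, with baselines normalized to zero and a hypothetical positive spillover reaching part of the control group:

```python
def naive_effect_with_spillover(
    tau: float, spill: float, contaminated_frac: float
) -> float:
    """Naive treated-minus-control difference in means when a fraction of
    the control group picks up a positive spillover `spill`.
    Baseline outcomes are normalized to zero for both groups."""
    treated_mean = tau                         # direct effect only
    control_mean = contaminated_frac * spill   # spillover lifts controls
    return treated_mean - control_mean
```

With a true effect of 1.0, a spillover of 0.4 reaching half the control group shrinks the naive estimate to 0.8; full contamination shrinks it further.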
Production-Ready Code
"""
Network experiment pipeline.
Graph clustering, exposure mapping, spillover-adjusted
estimation, and cluster-robust SEs.
"""
from __future__ import annotations
from dataclasses import dataclass
import warnings
import numpy as np
import pandas as pd
import networkx as nx
from scipy.stats import norm
@dataclass
class NetworkExperimentConfig:
n_clusters_target: int = 500
max_cross_edge_frac: float = 0.10
randomization_seed: int = 42
def cluster_graph_for_experiment(
edges: list[tuple[int, int]],
config: NetworkExperimentConfig,
) -> dict[int, int]:
"""Partition social graph into clusters for randomization.
Returns node -> cluster_id mapping. Warns if cross-cluster
edge fraction exceeds threshold.
"""
G = nx.Graph()
G.add_edges_from(edges)
communities = nx.community.greedy_modularity_communities(G)
node_to_cluster: dict[int, int] = {}
for cid, community in enumerate(communities):
for node in community:
node_to_cluster[node] = cid
cross = sum(
1 for u, v in G.edges()
if node_to_cluster.get(u) != node_to_cluster.get(v)
)
cross_frac = cross / max(G.number_of_edges(), 1)
if cross_frac > config.max_cross_edge_frac:
warnings.warn(
f"Cross-cluster edge fraction {cross_frac:.1%} > "
f"{config.max_cross_edge_frac:.0%}. "
"Interference contamination may be substantial."
)
return node_to_cluster
def assign_clusters(
cluster_ids: list[int],
config: NetworkExperimentConfig,
) -> dict[int, int]:
"""Randomly assign clusters to treatment (1) or control (0)."""
rng = np.random.default_rng(config.randomization_seed)
unique_clusters = sorted(set(cluster_ids))
n = len(unique_clusters)
perm = rng.permutation(n)
return {cid: int(perm[i] < n // 2) for i, cid in enumerate(unique_clusters)}
def spillover_adjusted_estimator(
outcomes: pd.DataFrame, # columns: unit_id, outcome, exposure_type, cluster_id
cluster_col: str = 'cluster_id',
) -> dict:
"""Estimate direct and spillover effects with cluster-robust SEs."""
def cluster_mean_se(mask: pd.Series) -> tuple[float, float]:
sub = outcomes[mask]
cluster_means = sub.groupby(cluster_col)['outcome'].mean()
return float(cluster_means.mean()), float(cluster_means.sem())
direct_m, direct_se = cluster_mean_se(outcomes['exposure_type'] == 'direct')
control_m, control_se = cluster_mean_se(outcomes['exposure_type'] == 'control')
indirect_m, indirect_se = cluster_mean_se(outcomes['exposure_type'] == 'indirect')
de = direct_m - control_m
de_se = np.sqrt(direct_se**2 + control_se**2)
ie = indirect_m - control_m
ie_se = np.sqrt(indirect_se**2 + control_se**2)
def z_pval(effect: float, se: float) -> float:
return float(2 * norm.sf(abs(effect / se))) if se > 0 else float('nan')
return {
'direct_effect': round(de, 6),
'direct_se': round(de_se, 6),
'direct_pvalue': round(z_pval(de, de_se), 4),
'indirect_effect': round(ie, 6),
'indirect_se': round(ie_se, 6),
'indirect_pvalue': round(z_pval(ie, ie_se), 4),
'total_effect': round(de + ie, 6),
    }