BLP Demand Estimation with PyBLP

Notes on the Berry, Levinsohn & Pakes (1995) random coefficients logit model for demand estimation, implemented via the PyBLP Python package.

BLP solves a fundamental problem in empirical IO: estimating demand for differentiated products using only market-level data (aggregate shares and prices), while allowing for heterogeneous consumer preferences and endogenous prices. The key insight is that unobserved product quality ($\xi_j$) is correlated with price, so OLS is biased — BLP uses a GMM framework with instrumental variables to recover consistent estimates.

We walk through the model using the Nevo (2000) fake cereal dataset — a panel of 24 cereal brands across 94 city-quarter markets.

The Model

Consumer Utility

Consumer $i$ in market $t$ choosing product $j$ gets indirect utility:

\[ u_{ijt} = \underbrace{x_j' \beta_i - \alpha_i \, p_{jt} + \xi_{jt}}_{\text{mean + individual deviation}} + \epsilon_{ijt} \]

where: - $x_j$ = observed product characteristics (sugar, mushiness, brand dummies) - $p_{jt}$ = price - $\xi_{jt}$ = unobserved product quality (the structural error) - $\epsilon_{ijt} \sim$ Type I Extreme Value (logit error) - $\beta_i, \alpha_i$ = consumer-specific taste parameters

The outside option (not buying cereal) has utility $u_{i0t} = \epsilon_{i0t}$, normalized to zero mean.

Random Coefficients

Taste heterogeneity is modeled as:

\[ \beta_i = \underbrace{\bar{\beta}}_{\text{mean taste}} + \underbrace{\Pi \, d_i}_{\text{demographic interaction}} + \underbrace{\Sigma \, \nu_i}_{\text{unobserved heterogeneity}} \]

where $d_i$ are observed demographics (income, age, etc.) and $\nu_i \sim N(0, I)$ captures unobserved taste variation. The parameters to estimate are:

$\bar{\beta}$ — mean tastes (linear parameters)
$\Pi$ — how demographics shift tastes (K $\times$ D matrix)
$\Sigma$ — standard deviations of unobserved heterogeneity (K $\times$ K, often diagonal)

Market Shares

Integrating over the consumer distribution, the predicted market share of product $j$ is:

\[ s_{jt}(\delta, \theta_2) = \int \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt} + \mu_{ikt})} \, dF(\nu_i, d_i) \]

where the mean utility is $\delta_{jt} = x_j' \bar{\beta} - \bar{\alpha} \, p_{jt} + \xi_{jt}$ and the individual deviation is $\mu_{ijt} = x_j' (\Pi d_i + \Sigma \nu_i)$.

This integral has no closed form — it’s approximated via simulation (Monte Carlo draws) or quadrature (product rules).

Berry Inversion

Given nonlinear parameters $\theta_2 = (\Sigma, \Pi)$ and observed shares $S_{jt}$, the contraction mapping of Berry (1994) recovers the mean utility vector $\delta$ that rationalizes the data:

\[ \delta_{jt}^{(r+1)} = \delta_{jt}^{(r)} + \ln S_{jt} - \ln s_{jt}(\delta^{(r)}, \theta_2) \]

This iterates until $s(\delta, \theta_2) = S$ (predicted = observed shares). The structural error is then $\xi_{jt} = \delta_{jt} - x_j' \bar{\beta} + \bar{\alpha} \, p_{jt}$.

GMM Estimation

The endogeneity problem: $\text{Corr}(p_{jt}, \xi_{jt}) \neq 0$ — firms set higher prices for products with higher unobserved quality.

Instruments $Z$ must satisfy $\mathbb{E}[Z' \xi] = 0$. Common choices: - BLP instruments: sums of characteristics of other products by the same firm (cost shifters) - Differentiation instruments: functions of distance to rivals in characteristic space

The GMM objective:

\[ \hat{\theta} = \arg\min_\theta \; \xi(\theta)' Z \, W \, Z' \xi(\theta) \]

where $W$ is a weighting matrix (identity for 1-step, $\widehat{\text{Var}}(Z'\xi)^{-1}$ for 2-step efficient GMM).

Setup & Data Exploration

import pyblp
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

pyblp.options.digits = 2
pyblp.options.verbose = False

print(f'PyBLP version: {pyblp.__version__}')

c:\Users\danny\anaconda3\Lib\site-packages\pandas\core\computation\expressions.py:22: UserWarning: Pandas requires version '2.10.2' or newer of 'numexpr' (version '2.8.7' currently installed).
  from pandas.core.computation.check import NUMEXPR_INSTALLED
c:\Users\danny\anaconda3\Lib\site-packages\pandas\core\arrays\masked.py:56: UserWarning: Pandas requires version '1.4.2' or newer of 'bottleneck' (version '1.3.7' currently installed).
  from pandas.core import (

PyBLP version: 1.1.2

Nevo (2000) Cereal Data

The dataset contains 24 cereal brands observed across 94 city-quarter markets. Each observation is a product-market pair with: - shares: market share of the product - prices: price per serving - sugar: sugar content - mushy: indicator for mushy cereals - demand_instruments0–19: pre-computed BLP instruments - firm_ids: 5 firms (think Kellogg’s, General Mills, etc.) - product_ids: 24 brand identifiers

# Load product-level data
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)

print(f'Shape: {product_data.shape}')
print(f'Markets: {product_data["market_ids"].nunique()}')
print(f'Products: {product_data["product_ids"].nunique()}')
print(f'Firms: {product_data["firm_ids"].nunique()}')
print()

# Key columns
product_data[['market_ids', 'product_ids', 'firm_ids', 'shares', 'prices', 'sugar', 'mushy']].head(10)

Shape: (2256, 30)
Markets: 94
Products: 24
Firms: 5

	market_ids	product_ids	firm_ids	shares	prices	sugar	mushy
0	C01Q1	F1B04	1	0.012417	0.072088	2	1
1	C01Q1	F1B06	1	0.007809	0.114178	18	1
2	C01Q1	F1B07	1	0.012995	0.132391	4	1
3	C01Q1	F1B09	1	0.005770	0.130344	3	0
4	C01Q1	F1B11	1	0.017934	0.154823	12	0
5	C01Q1	F1B13	1	0.026602	0.137049	14	0
6	C01Q1	F1B17	1	0.025015	0.144209	3	1
7	C01Q1	F1B30	1	0.005058	0.128191	4	0
8	C01Q1	F1B45	1	0.005332	0.149611	14	0
9	C01Q1	F2B05	2	0.038068	0.108514	1	0

product_data[['prices', 'shares', 'sugar', 'mushy']].describe().round(4)

	prices	shares	sugar	mushy
count	2256.0000	2256.0000	2256.0000	2256.0000
mean	0.1257	0.0198	8.6250	0.3333
std	0.0290	0.0256	5.7879	0.4715
min	0.0455	0.0002	0.0000	0.0000
25%	0.1055	0.0052	3.0000	0.0000
50%	0.1238	0.0111	8.5000	0.0000
75%	0.1433	0.0246	13.2500	1.0000
max	0.2257	0.4469	20.0000	1.0000

fig, axes = plt.subplots(1, 3, figsize=(14, 4))

# Price distribution
axes[0].hist(product_data['prices'], bins=30, color='#440154', alpha=0.8, edgecolor='white')
axes[0].set_xlabel('Price', fontsize=12)
axes[0].set_ylabel('Frequency', fontsize=12)
axes[0].set_title('Price Distribution', fontsize=13)
axes[0].grid(True, alpha=0.2)

# Share distribution
axes[1].hist(product_data['shares'], bins=30, color='#31688e', alpha=0.8, edgecolor='white')
axes[1].set_xlabel('Market Share', fontsize=12)
axes[1].set_title('Share Distribution', fontsize=13)
axes[1].grid(True, alpha=0.2)

# Products per firm
firm_counts = product_data.groupby('firm_ids')['product_ids'].nunique()
axes[2].bar(firm_counts.index.astype(str), firm_counts.values, 
            color=['#440154', '#31688e', '#35b779', '#fde725', '#f97316'], 
            edgecolor='white')
axes[2].set_xlabel('Firm ID', fontsize=12)
axes[2].set_ylabel('Number of Products', fontsize=12)
axes[2].set_title('Products per Firm', fontsize=13)
axes[2].grid(True, alpha=0.2)

plt.tight_layout()
plt.show()

# Load agent (consumer) demographic data
agent_data = pd.read_csv(pyblp.data.NEVO_AGENTS_LOCATION)

print(f'Agent data shape: {agent_data.shape}')
print(f'Agents per market: {agent_data.groupby("market_ids").size().iloc[0]}')
print()
agent_data[['market_ids', 'weights', 'income', 'income_squared', 'age', 'child']].describe().round(4)

Agent data shape: (1880, 12)
Agents per market: 20

	weights	income	income_squared	age	child
count	1880.00	1880.0000	1880.0000	1880.0000	1880.0000
mean	0.05	-0.0000	0.0000	-0.0000	0.0000
std	0.00	0.9396	16.0435	0.9446	0.4215
min	0.05	-4.9382	-65.9464	-3.2258	-0.2309
25%	0.05	-0.4282	-8.4548	-0.3926	-0.2309
50%	0.05	0.1608	2.0565	0.2399	-0.2309
75%	0.05	0.5887	10.1278	0.6454	-0.2309
max	0.05	2.3346	46.8563	1.2740	0.7691

Step 1: Plain Logit

The simplest demand model: all consumers have identical tastes ($\Sigma = 0$, $\Pi = 0$). The market share equation collapses to the familiar logit:

\[ \ln s_{jt} - \ln s_{0t} = x_j' \beta - \alpha \, p_{jt} + \xi_{jt} \]

This is a linear IV regression of log share ratios on product characteristics and price. We absorb brand fixed effects (product dummies) and use the 20 pre-computed demand instruments.

# Formulation: price with product fixed effects absorbed
logit_formulation = pyblp.Formulation('prices', absorb='C(product_ids)')

logit_problem = pyblp.Problem(logit_formulation, product_data)
logit_problem

Dimensions:
================================
 T    N     F    K1    MD    ED 
---  ----  ---  ----  ----  ----
94   2256   5    1     20    1  
================================

Formulations:
==================================
     Column Indices:          0   
--------------------------  ------
X1: Linear Characteristics  prices
==================================

logit_results = logit_problem.solve()
logit_results

Problem Results Summary:
==========================================
GMM   Objective  Clipped  Weighting Matrix
Step    Value    Shares   Condition Number
----  ---------  -------  ----------------
 2    +1.9E+02      0         +5.7E+07    
==========================================

Cumulative Statistics:
========================
Computation   Objective 
   Time      Evaluations
-----------  -----------
 00:00:00         2     
========================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices  
----------
 -3.0E+01 
(+1.0E+00)
==========

The price coefficient $\hat{\alpha} \approx -30$ is large and negative: higher prices reduce market share. But the plain logit imposes the IIA property — substitution patterns are proportional to market shares, regardless of product similarity. This is unrealistic: if Cheerios’ price rises, consumers are more likely to switch to similar cereals than to very different ones.

# Compute own-price elasticities
logit_elasticities = logit_results.compute_elasticities()

# Extract own-price elasticities (diagonal)
logit_own = logit_results.extract_diagonal_means(logit_elasticities)

print(f'Logit own-price elasticities:')
print(f'  Mean: {logit_own.mean():.3f}')
print(f'  Range: [{logit_own.min():.3f}, {logit_own.max():.3f}]')

Logit own-price elasticities:
  Mean: -3.706
  Range: [-4.052, -3.289]

Step 2: Random Coefficients Logit (BLP)

Now we add heterogeneous preferences via random coefficients on the constant (inside vs. outside good), price, sugar, and mushy. This breaks IIA: products with similar characteristics become closer substitutes.

We need two formulations: - X1 (linear, $\bar{\beta}$): variables entering mean utility — price + product fixed effects - X2 (nonlinear, $\Sigma$): variables receiving random coefficients — constant, price, sugar, mushy

Plus an agent formulation for demographics: income, income$^2$, age, child.

# X1: linear parameters (price + brand FE)
X1 = pyblp.Formulation('0 + prices', absorb='C(product_ids)')

# X2: random coefficient variables
X2 = pyblp.Formulation('1 + prices + sugar + mushy')

# Agent demographics formulation
agent_formulation = pyblp.Formulation('0 + income + income_squared + age + child')

# Build the problem
blp_problem = pyblp.Problem(
    (X1, X2),
    product_data,
    agent_formulation,
    agent_data,
)
blp_problem

Dimensions:
=================================================
 T    N     F    I     K1    K2    D    MD    ED 
---  ----  ---  ----  ----  ----  ---  ----  ----
94   2256   5   1880   1     4     4    20    1  
=================================================

Formulations:
===================================================================
       Column Indices:           0           1           2      3  
-----------------------------  ------  --------------  -----  -----
 X1: Linear Characteristics    prices                              
X2: Nonlinear Characteristics    1         prices      sugar  mushy
       d: Demographics         income  income_squared   age   child
===================================================================

Initial Values & Solving

The nonlinear GMM objective is non-convex, so starting values matter. We use values close to Nevo (2000)’s published estimates:

$\Sigma_0$ = diagonal matrix of standard deviations for (constant, price, sugar, mushy)
$\Pi_0$ = matrix of demographic interactions

Zeros in $\Sigma_0$ or $\Pi_0$ are constrained to remain zero during optimization — this is how you restrict the model.

# Initial sigma (diagonal: std devs for constant, price, sugar, mushy)
initial_sigma = np.diag([0.3302, 2.4526, 0.0163, 0.2441])

# Initial pi (K2 x D demographics interaction matrix)
# Rows: constant, price, sugar, mushy
# Cols: income, income_squared, age, child
initial_pi = np.array([
    [ 5.4819,  0,       0.2037,  0     ],
    [15.8935, -1.2000,  0,       2.6342],
    [-0.2506,  0,       0.0511,  0     ],
    [ 1.2650,  0,      -0.8091,  0     ],
])

# Solve with BFGS optimizer, 1-step GMM
blp_results = blp_problem.solve(
    initial_sigma,
    initial_pi,
    optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
    method='1s',
)
blp_results

Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number 
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +4.6E+00   +6.9E-06     +2.3E-05        +1.6E+04        0         +6.9E+07          +8.4E+08     
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:00:18       Yes          51           57          46371       143944   
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income    income_squared     age        child   
------  ----------  ----------  ----------  ----------  |  ------  ----------  --------------  ----------  ----------
  1      +5.6E-01                                       |    1      +2.3E+00      +0.0E+00      +1.3E+00    +0.0E+00 
        (+1.6E-01)                                      |          (+1.2E+00)                  (+6.3E-01)            
                                                        |                                                            
prices   +0.0E+00    +3.3E+00                           |  prices   +5.9E+02      -3.0E+01      +0.0E+00    +1.1E+01 
                    (+1.3E+00)                          |          (+2.7E+02)    (+1.4E+01)                (+4.1E+00)
                                                        |                                                            
sugar    +0.0E+00    +0.0E+00    -5.8E-03               |  sugar    -3.8E-01      +0.0E+00      +5.2E-02    +0.0E+00 
                                (+1.4E-02)              |          (+1.2E-01)                  (+2.6E-02)            
                                                        |                                                            
mushy    +0.0E+00    +0.0E+00    +0.0E+00    +9.3E-02   |  mushy    +7.5E-01      +0.0E+00      -1.4E+00    +0.0E+00 
                                            (+1.9E-01)  |          (+8.0E-01)                  (+6.7E-01)            
=====================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices  
----------
 -6.3E+01 
(+1.5E+01)
==========

Interpreting the results:

$\hat{\Sigma}$ (Sigma): the Cholesky root of the unobserved heterogeneity covariance. Diagonal elements give the std dev of random coefficients.
$\hat{\Pi}$ (Pi): how demographics shift tastes. E.g., $\Pi_{\text{price, income}} > 0$ means higher-income consumers are less price-sensitive.
$\hat{\beta}$ (Beta): the mean price coefficient (from X1).
The GMM objective value measures fit; the weighting matrix condition number flags numerical issues.

Post-Estimation Analysis

Own-Price Elasticities

The own-price elasticity measures the percentage change in market share for a 1% price increase:

\[ \varepsilon_{jj} = \frac{\partial s_j}{\partial p_j} \cdot \frac{p_j}{s_j} \]

With random coefficients, these vary across products and markets (unlike logit, where $\varepsilon_{jj} \approx \alpha \, p_j (1 - s_j)$ for all products).

# Compute full elasticity matrices (J x J for each market)
elasticities = blp_results.compute_elasticities()

# Extract own-price elasticities (diagonal of each market's matrix)
own_elasticities = blp_results.extract_diagonal_means(elasticities)

print(f'BLP own-price elasticities:')
print(f'  Mean:   {own_elasticities.mean():.3f}')
print(f'  Median: {np.median(own_elasticities):.3f}')
print(f'  Std:    {own_elasticities.std():.3f}')
print(f'  Range:  [{own_elasticities.min():.3f}, {own_elasticities.max():.3f}]')

BLP own-price elasticities:
  Mean:   -3.618
  Median: -3.591
  Std:    0.300
  Range:  [-4.354, -3.073]

fig, ax = plt.subplots(figsize=(10, 5))

# Logit own-price elasticities (all products across markets)
logit_diag = []
market_ids = product_data['market_ids'].unique()
idx = 0
for m in market_ids:
    n_j = (product_data['market_ids'] == m).sum()
    mat = logit_elasticities[idx:idx+n_j]
    logit_diag.extend(np.diag(mat).tolist())
    idx += n_j

# BLP own-price elasticities (same extraction)
blp_diag = []
idx = 0
for m in market_ids:
    n_j = (product_data['market_ids'] == m).sum()
    mat = elasticities[idx:idx+n_j]
    blp_diag.extend(np.diag(mat).tolist())
    idx += n_j

ax.hist(logit_diag, bins=40, alpha=0.5, color='#440154', label='Logit', density=True, edgecolor='white')
ax.hist(blp_diag, bins=40, alpha=0.5, color='#35b779', label='BLP (Random Coefficients)', density=True, edgecolor='white')
ax.axvline(np.mean(logit_diag), color='#440154', linestyle='--', linewidth=2, alpha=0.8)
ax.axvline(np.mean(blp_diag), color='#35b779', linestyle='--', linewidth=2, alpha=0.8)

ax.set_xlabel('Own-Price Elasticity', fontsize=12)
ax.set_ylabel('Density', fontsize=12)
ax.set_title('Distribution of Own-Price Elasticities: Logit vs BLP', fontsize=14)
ax.legend(fontsize=11)
ax.grid(True, alpha=0.2)

plt.tight_layout()
plt.show()

Cross-Price Elasticity Heatmap

The cross-price elasticity $\varepsilon_{jk} = \frac{\partial s_j}{\partial p_k} \cdot \frac{p_k}{s_j}$ measures how product $j$’s share responds to a change in product $k$’s price.

Under logit, all cross-elasticities with respect to product $k$ are proportional to $s_k$ — substitution is driven purely by market share, not similarity. Under BLP, cross-elasticities reflect product similarity through the random coefficients, producing richer substitution patterns.

# Pick a single market
market = 'C01Q1'
market_mask = product_data['market_ids'] == market
n_products = market_mask.sum()
product_labels = product_data.loc[market_mask, 'product_ids'].values

# Extract elasticity matrix for this market
market_idx = np.where(market_mask)[0]
start = market_idx[0]
elast_matrix_logit = logit_elasticities[start:start+n_products, :n_products]
elast_matrix_blp = elasticities[start:start+n_products, :n_products]

fig, axes = plt.subplots(1, 2, figsize=(16, 7))

# Logit
im1 = axes[0].imshow(elast_matrix_logit, cmap='RdBu_r', aspect='auto', 
                      vmin=-6, vmax=1)
axes[0].set_xticks(range(n_products))
axes[0].set_yticks(range(n_products))
axes[0].set_xticklabels(product_labels, rotation=90, fontsize=7)
axes[0].set_yticklabels(product_labels, fontsize=7)
axes[0].set_title('Logit Elasticities', fontsize=13)
axes[0].set_xlabel('Price of product $k$', fontsize=11)
axes[0].set_ylabel('Share of product $j$', fontsize=11)
plt.colorbar(im1, ax=axes[0], shrink=0.8)

# BLP
im2 = axes[1].imshow(elast_matrix_blp, cmap='RdBu_r', aspect='auto', 
                      vmin=-6, vmax=1)
axes[1].set_xticks(range(n_products))
axes[1].set_yticks(range(n_products))
axes[1].set_xticklabels(product_labels, rotation=90, fontsize=7)
axes[1].set_yticklabels(product_labels, fontsize=7)
axes[1].set_title('BLP Elasticities (Random Coefficients)', fontsize=13)
axes[1].set_xlabel('Price of product $k$', fontsize=11)
plt.colorbar(im2, ax=axes[1], shrink=0.8)

plt.suptitle(f'Cross-Price Elasticity Matrix — Market {market}', fontsize=15, y=1.02)
plt.tight_layout()
plt.show()

# Seaborn heatmap with annotations for BLP cross-elasticities
elast_df = pd.DataFrame(
    elast_matrix_blp, 
    index=product_labels, 
    columns=product_labels
)

fig, ax = plt.subplots(figsize=(14, 10))
sns.heatmap(
    elast_df, 
    cmap='RdBu_r', 
    center=0, 
    annot=True, 
    fmt='.2f', 
    linewidths=0.5,
    linecolor='white',
    annot_kws={'size': 7},
    cbar_kws={'label': 'Elasticity', 'shrink': 0.8},
    ax=ax,
)
ax.set_title(f'BLP Cross-Price Elasticity Matrix — Market {market}', fontsize=15)
ax.set_xlabel('Price of product $k$', fontsize=12)
ax.set_ylabel('Share of product $j$', fontsize=12)

plt.tight_layout()
plt.show()

Diversion Ratios

The diversion ratio from product $j$ to product $k$ measures what fraction of $j$’s lost sales go to $k$ when $j$’s price increases:

\[ D_{jk} = -\frac{\partial s_k / \partial p_j}{\partial s_j / \partial p_j} \]

Diversion ratios are central to merger analysis: if two merging firms’ products have high diversion ratios to each other, the merger raises greater competitive concerns.

# Compute diversion ratios
diversions = blp_results.compute_diversion_ratios()

# Extract for our market
div_matrix = diversions[start:start+n_products, :n_products]

div_df = pd.DataFrame(div_matrix, index=product_labels, columns=product_labels)

fig, ax = plt.subplots(figsize=(14, 10))
sns.heatmap(
    div_df, 
    cmap='YlOrRd', 
    annot=True, 
    fmt='.2f', 
    linewidths=0.5,
    linecolor='white',
    annot_kws={'size': 7},
    cbar_kws={'label': 'Diversion Ratio', 'shrink': 0.8},
    vmin=0,
    ax=ax,
)
ax.set_title(f'BLP Diversion Ratios — Market {market}', fontsize=15)
ax.set_xlabel('Destination product $k$', fontsize=12)
ax.set_ylabel('Source product $j$ (price increase)', fontsize=12)

plt.tight_layout()
plt.show()

Marginal Costs & Markups

Given estimated demand and the assumption of multi-product Bertrand-Nash pricing, we can back out implied marginal costs from the first-order conditions:

\[ p = c + \underbrace{(\Omega \odot \Delta)^{-1} s}_{\text{markup}} \]

where $\Omega$ is the ownership matrix and $\Delta_{jk} = -\partial s_k / \partial p_j$ is the demand Jacobian.

# Compute marginal costs and markups
costs = blp_results.compute_costs()
markups = blp_results.compute_markups(costs=costs)

fig, axes = plt.subplots(1, 2, figsize=(13, 5))

# Marginal costs
axes[0].hist(costs.flatten(), bins=40, color='#31688e', alpha=0.8, edgecolor='white')
axes[0].axvline(costs.mean(), color='black', linestyle='--', linewidth=1.5, 
                label=f'Mean: {costs.mean():.4f}')
axes[0].set_xlabel('Marginal Cost', fontsize=12)
axes[0].set_ylabel('Frequency', fontsize=12)
axes[0].set_title('Distribution of Marginal Costs', fontsize=13)
axes[0].legend(fontsize=10)
axes[0].grid(True, alpha=0.2)

# Markups
axes[1].hist(markups.flatten(), bins=40, color='#35b779', alpha=0.8, edgecolor='white')
axes[1].axvline(markups.mean(), color='black', linestyle='--', linewidth=1.5,
                label=f'Mean: {markups.mean():.4f}')
axes[1].set_xlabel('Markup ($p - c$) / $p$', fontsize=12)
axes[1].set_ylabel('Frequency', fontsize=12)
axes[1].set_title('Distribution of Markups (Lerner Index)', fontsize=13)
axes[1].legend(fontsize=10)
axes[1].grid(True, alpha=0.2)

plt.tight_layout()
plt.show()

print(f'Marginal cost — Mean: {costs.mean():.4f}, Median: {np.median(costs):.4f}')
print(f'Markup (Lerner) — Mean: {markups.mean():.4f}, Median: {np.median(markups):.4f}')

Marginal cost — Mean: 0.0824, Median: 0.0812
Markup (Lerner) — Mean: 0.3639, Median: 0.3371

Merger Simulation

A core application of BLP: simulate the price effects of a hypothetical merger. The idea is simple:

Estimate demand and recover pre-merger marginal costs $c_j$
Change the ownership matrix $\Omega$ to reflect the merger (merging firms now internalize pricing externalities)
Solve for new Nash equilibrium prices given the new $\Omega$ but the same $c_j$
Compare pre-merger vs. post-merger outcomes

We simulate a merger between Firm 1 and Firm 2 (the two largest firms, with 9 products each).

# Create post-merger firm IDs: Firm 2 becomes part of Firm 1
product_data['merger_ids'] = product_data['firm_ids'].replace(2, 1)

print('Pre-merger firm structure:')
print(product_data.groupby('firm_ids')['product_ids'].nunique().to_string())
print()
print('Post-merger firm structure:')
print(product_data.groupby('merger_ids')['product_ids'].nunique().to_string())

Pre-merger firm structure:
firm_ids
1    9
2    9
3    2
4    3
6    1

Post-merger firm structure:
merger_ids
1    18
3     2
4     3
6     1

# Pre-merger benchmarks
pre_hhi = blp_results.compute_hhi()
pre_profits = blp_results.compute_profits(costs=costs)
pre_cs = blp_results.compute_consumer_surpluses()

# Post-merger equilibrium prices
post_prices = blp_results.compute_prices(
    firm_ids=product_data['merger_ids'],
    costs=costs,
)

# Post-merger outcomes
post_shares = blp_results.compute_shares(post_prices)
post_hhi = blp_results.compute_hhi(
    firm_ids=product_data['merger_ids'],
    shares=post_shares,
)
post_markups = blp_results.compute_markups(post_prices, costs=costs)
post_profits = blp_results.compute_profits(post_prices, post_shares, costs)
post_cs = blp_results.compute_consumer_surpluses(post_prices)

# Price changes
price_changes = (post_prices - product_data['prices'].values.reshape(-1, 1)) / product_data['prices'].values.reshape(-1, 1) * 100

print(f'=== Merger Simulation: Firm 1 + Firm 2 ===')
print(f'Price change (%):     Mean = {price_changes.mean():.2f}%, Median = {np.median(price_changes):.2f}%')
print(f'HHI change:           Mean = {(post_hhi - pre_hhi).mean():.1f}')
print(f'Consumer surplus ($/person):')
print(f'  Pre-merger:  {pre_cs.mean():.4f}')
print(f'  Post-merger: {post_cs.mean():.4f}')
print(f'  Change:      {(post_cs - pre_cs).mean():.4f}')

=== Merger Simulation: Firm 1 + Firm 2 ===
Price change (%):     Mean = 10.16%, Median = 9.41%
HHI change:           Mean = 1790.8
Consumer surplus ($/person):
  Pre-merger:  0.0342
  Post-merger: 0.0296
  Change:      -0.0047

fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# Price changes
axes[0].hist(price_changes.flatten(), bins=40, color='#440154', alpha=0.8, edgecolor='white')
axes[0].axvline(price_changes.mean(), color='black', linestyle='--', linewidth=1.5,
                label=f'Mean: {price_changes.mean():.2f}%')
axes[0].set_xlabel('Price Change (%)', fontsize=12)
axes[0].set_ylabel('Frequency', fontsize=12)
axes[0].set_title('Post-Merger Price Changes', fontsize=13)
axes[0].legend(fontsize=10)
axes[0].grid(True, alpha=0.2)

# HHI changes
hhi_change = (post_hhi - pre_hhi).flatten()
axes[1].hist(hhi_change, bins=40, color='#f97316', alpha=0.8, edgecolor='white')
axes[1].axvline(hhi_change.mean(), color='black', linestyle='--', linewidth=1.5,
                label=f'Mean: {hhi_change.mean():.0f}')
axes[1].set_xlabel('$\Delta$ HHI', fontsize=12)
axes[1].set_title('Change in Market Concentration', fontsize=13)
axes[1].legend(fontsize=10)
axes[1].grid(True, alpha=0.2)

# Consumer surplus changes
cs_change = (post_cs - pre_cs).flatten()
axes[2].hist(cs_change, bins=40, color='#31688e', alpha=0.8, edgecolor='white')
axes[2].axvline(cs_change.mean(), color='black', linestyle='--', linewidth=1.5,
                label=f'Mean: {cs_change.mean():.4f}')
axes[2].set_xlabel('$\Delta$ Consumer Surplus', fontsize=12)
axes[2].set_title('Change in Consumer Surplus', fontsize=13)
axes[2].legend(fontsize=10)
axes[2].grid(True, alpha=0.2)

plt.suptitle('Merger Impact: Firm 1 + Firm 2', fontsize=15, y=1.02)
plt.tight_layout()
plt.show()

# Price changes by firm (who benefits from the merger?)
price_change_df = pd.DataFrame({
    'firm_id': product_data['firm_ids'],
    'product_id': product_data['product_ids'],
    'pre_price': product_data['prices'],
    'post_price': post_prices.flatten(),
    'price_change_pct': price_changes.flatten(),
    'merging': product_data['firm_ids'].isin([1, 2]),
})

fig, ax = plt.subplots(figsize=(10, 5))

merging = price_change_df[price_change_df['merging']]
non_merging = price_change_df[~price_change_df['merging']]

ax.hist(merging['price_change_pct'], bins=30, alpha=0.6, color='#440154', 
        label=f'Merging firms (mean: {merging["price_change_pct"].mean():.2f}%)', 
        density=True, edgecolor='white')
ax.hist(non_merging['price_change_pct'], bins=30, alpha=0.6, color='#35b779', 
        label=f'Non-merging firms (mean: {non_merging["price_change_pct"].mean():.2f}%)', 
        density=True, edgecolor='white')

ax.set_xlabel('Price Change (%)', fontsize=12)
ax.set_ylabel('Density', fontsize=12)
ax.set_title('Post-Merger Price Changes: Merging vs Non-Merging Firms', fontsize=14)
ax.legend(fontsize=11)
ax.grid(True, alpha=0.2)

plt.tight_layout()
plt.show()

Micro Moments: Incorporating Individual-Level Data

Demographics vs. Micro Moments

A common point of confusion: the agent demographics we used above ($d_i$ = income, age, child) are not micro moments. They play very different roles:

	Agent Demographics ($\Pi d_i$)	Micro Moments
What they are	Draws from the population distribution of consumer types	Empirical statistics computed from individual-level survey data
Role in estimation	Define the integration measure for computing predicted shares	Provide additional moment conditions in the GMM objective
What they match	Nothing directly — they specify how to integrate over heterogeneity	Match model-predicted statistics to observed survey statistics
Example	“20 draws of (income, age) per market, used to approximate the share integral”	“The average income of consumers who bought brand X is $45,000”
Identification	Help compute $s_j(\delta, \theta_2)$ but don’t add moment conditions	Add new moments $\bar{g}_M$ that help pin down $\Sigma$ and $\Pi$

Key insight: In the standard BLP setup, the only moments matched are the demand-side conditions $\mathbb{E}[Z'\xi] = 0$ (instruments orthogonal to the structural error). Demographics help compute predicted shares, but they don’t provide additional data to match. Micro moments add extra targets: they say “the model must also predict that the average age of mushy cereal buyers is 35 and that high-income consumers are 20% more likely to buy premium brands.”

The Math

The standard BLP GMM stacks demand (and possibly supply) moments:

\[ \bar{g} = \begin{bmatrix} \bar{g}_D \end{bmatrix} = \begin{bmatrix} Z'\xi(\theta) / N \end{bmatrix} \]

With micro moments, we augment the moment vector:

\[ \bar{g} = \begin{bmatrix} \bar{g}_D \\ \bar{g}_M \end{bmatrix}, \quad \bar{g}_{M,m} = f_m(\bar{v}) - f_m(v(\theta)) \]

where $f_m(\bar{v})$ is the observed statistic from survey data (e.g., $\mathbb{E}[\text{income} \mid \text{bought mushy cereal}] = 0.35$) and $f_m(v(\theta))$ is the model-predicted equivalent.

Each micro moment part $v_p$ is a weighted average over agents, products, and markets:

\[ v_p(\theta) = \frac{\sum_{t} \sum_{i \in I_t} \sum_{j \in J_t \cup \{0\}} w_{it} \, s_{ijt}(\theta) \, w^d_{pijt} \, v_{pijt}}{\sum_{t} \sum_{i \in I_t} \sum_{j \in J_t \cup \{0\}} w_{it} \, s_{ijt}(\theta) \, w^d_{pijt}} \]

where $s_{ijt}(\theta)$ is the model-predicted probability that agent $i$ buys product $j$ in market $t$, and $w^d_{pijt}$ and $v_{pijt}$ are dataset-specific weights and values.

Types of Micro Data

Common sources of micro moments include:

Consumer surveys (e.g., Consumer Expenditure Survey): “What is the average income of buyers of product category X?”
Scanner data with demographics: “What fraction of high-income households buy premium brands?”
Second-choice data: “Among consumers who bought brand A, what fraction would have bought brand B as their second choice?” (very powerful for identifying substitution patterns)
Purchase probabilities by segment: “What is the probability a household with children buys a sugary cereal?”

Why Micro Moments Help

Market-level shares alone may not strongly identify the nonlinear parameters $(\Sigma, \Pi)$. Many combinations of $(\Sigma, \Pi)$ can generate similar aggregate shares. Micro moments break this degeneracy by providing additional targets:

$\Sigma$ identification: Conditional purchase probabilities by product type pin down how much unobserved heterogeneity there is.
$\Pi$ identification: Conditional demographic means (e.g., average age of mushy cereal buyers) directly inform how demographics shift tastes.
Substitution patterns: Second-choice data directly constrains cross-elasticities, which are otherwise weakly identified from shares alone.

Example: Micro Moments with Nevo Data

Since we don’t have real survey data for these cereals, we’ll use a practical workflow: simulate micro data from our estimated model, then use those simulated statistics as targets for re-estimation. This demonstrates the full PyBLP micro moments API.

The idea: use simulate_micro_data() to generate individual-level purchase records consistent with our estimates, compute statistics from them, then pass those as micro moment targets.

We’ll construct two types of micro moments: 1. Conditional demographic means: $\mathbb{E}[\text{income} \mid \text{bought mushy cereal}]$ 2. Purchase probability by segment: $\mathbb{E}[\mathbf{1}\{\text{bought any cereal}\} \mid \text{has children}]$

# Step 1: Define a micro dataset
# This represents a hypothetical consumer survey of 5,000 observations
# compute_weights returns a (I_t x 1+J_t) matrix of sampling weights
# Here: all agent-product pairs are equally likely to be sampled
micro_dataset = pyblp.MicroDataset(
    name="Consumer Survey",
    observations=5000,
    compute_weights=lambda t, p, a: np.ones((a.size, 1 + p.size)),
)

# Step 2: Simulate micro data from our BLP estimates
# This generates individual purchase records consistent with the model
micro_data = blp_results.simulate_micro_data(dataset=micro_dataset, seed=42)
micro_df = pd.DataFrame(pyblp.data_to_dict(micro_data))

print(f'Simulated micro data: {micro_df.shape[0]} observations')
print(f'Columns: {list(micro_df.columns)}')
print(f'Outside option (choice=0): {(micro_df["choice_indices"] == 0).sum()} / {len(micro_df)}')
micro_df.head()

Simulated micro data: 5000 observations
Columns: ['micro_ids', 'market_ids', 'agent_indices', 'choice_indices']
Outside option (choice=0): 2594 / 5000

	micro_ids	market_ids	agent_indices	choice_indices
0	0	C26Q2	4	2
1	1	C61Q2	7	0
2	2	C45Q1	16	0
3	3	C38Q1	5	11
4	4	C12Q1	13	0

# Step 3: Compute target statistics from simulated micro data
# In practice, these would come from a real consumer survey (e.g., CEX, Nielsen).
# Here we merge the simulated purchase records with agent demographics and
# product characteristics to compute conditional means.

# Add agent demographics
agent_indexed = agent_data.copy()
agent_indexed['agent_idx'] = agent_indexed.groupby('market_ids').cumcount()
micro_enriched = micro_df.merge(
    agent_indexed[['market_ids', 'agent_idx', 'income', 'age', 'child']],
    left_on=['market_ids', 'agent_indices'],
    right_on=['market_ids', 'agent_idx'],
    how='left',
)

# Add product characteristics (choice_indices: 0 = outside, 1-24 = inside products)
prod_indexed = product_data.copy()
prod_indexed['prod_idx'] = prod_indexed.groupby('market_ids').cumcount() + 1
micro_enriched = micro_enriched.merge(
    prod_indexed[['market_ids', 'prod_idx', 'sugar', 'mushy']],
    left_on=['market_ids', 'choice_indices'],
    right_on=['market_ids', 'prod_idx'],
    how='left',
)
micro_enriched['sugar'] = micro_enriched['sugar'].fillna(0)
micro_enriched['mushy'] = micro_enriched['mushy'].fillna(0)

# Compute conditional means — these are our "survey statistics"
buyers_mushy = micro_enriched[micro_enriched['mushy'] == 1]
buyers_sugary = micro_enriched[micro_enriched['sugar'] > 10]

target_income_mushy = buyers_mushy['income'].mean()
target_income_sugary = buyers_sugary['income'].mean()
target_age_mushy = buyers_mushy['age'].mean()

print("Target statistics from simulated micro data:")
print(f"  E[income | mushy cereal]:  {target_income_mushy:.4f}  (n={len(buyers_mushy)})")
print(f"  E[income | sugary cereal]: {target_income_sugary:.4f}  (n={len(buyers_sugary)})")
print(f"  E[age | mushy cereal]:     {target_age_mushy:.4f}  (n={len(buyers_mushy)})")

Target statistics from simulated micro data:
  E[income | mushy cereal]:  0.4387  (n=874)
  E[income | sugary cereal]: 0.1991  (n=1344)
  E[age | mushy cereal]:     -0.0636  (n=874)

# Step 4: Define MicroParts and assemble MicroMoments
#
# Each conditional moment E[X|Y] = E[X*Y] / E[Y] requires:
#   - A numerator MicroPart computing E[X_i * Y_j] across (agent, product) pairs
#   - A denominator MicroPart computing E[Y_j]
#
# compute_values returns an (I_t x 1+J_t) matrix:
#   - rows = agents (I_t per market)
#   - cols = outside option (col 0) + J_t inside products
#   - np.r_[0, ...] prepends a 0 for the outside option

# --- MicroParts ---
# Numerator: income_i * mushy_j
income_mushy_part = pyblp.MicroPart(
    name="E[income * mushy]", dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(
        a.demographics[:, 0],     # income
        np.r_[0, p.X2[:, 3]],     # mushy indicators (0 for outside option)
    ),
)
# Denominator: mushy_j
mushy_part = pyblp.MicroPart(
    name="E[mushy]", dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(
        np.ones(a.size),
        np.r_[0, p.X2[:, 3]],
    ),
)
# Numerator: income_i * high_sugar_j
income_sugary_part = pyblp.MicroPart(
    name="E[income * high_sugar]", dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(
        a.demographics[:, 0],
        np.r_[0, (p.X2[:, 2] > 10).astype(float)],
    ),
)
# Denominator: high_sugar_j
sugary_part = pyblp.MicroPart(
    name="E[high_sugar]", dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(
        np.ones(a.size),
        np.r_[0, (p.X2[:, 2] > 10).astype(float)],
    ),
)
# Numerator: age_i * mushy_j
age_mushy_part = pyblp.MicroPart(
    name="E[age * mushy]", dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(
        a.demographics[:, 2],     # age
        np.r_[0, p.X2[:, 3]],
    ),
)

# --- Assemble MicroMoments ---
# Helper for conditional expectation: f(v) = v[0] / v[1]
compute_ratio = lambda v: v[0] / v[1]
compute_ratio_gradient = lambda v: [1 / v[1], -v[0] / v[1]**2]

micro_moments = [
    pyblp.MicroMoment(
        name="E[income | mushy cereal]",
        value=target_income_mushy,
        parts=[income_mushy_part, mushy_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[income | sugary cereal]",
        value=target_income_sugary,
        parts=[income_sugary_part, sugary_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[age | mushy cereal]",
        value=target_age_mushy,
        parts=[age_mushy_part, mushy_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
]

print(f"{len(micro_moments)} micro moments configured:")
for mm in micro_moments:
    print(f"  {mm.name}: target = {mm.value:.4f}")

3 micro moments configured:
  E[income | mushy cereal]: target = 0.4387
  E[income | sugary cereal]: target = 0.1991
  E[age | mushy cereal]: target = -0.0636

# Step 5: Re-estimate with micro moments
# The micro moments are passed to solve() as additional moment conditions
blp_micro_results = blp_problem.solve(
    initial_sigma,
    initial_pi,
    optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
    method='1s',
    micro_moments=micro_moments,
)
blp_micro_results

Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number 
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +7.9E+00   +2.2E-06     +1.8E-04        +4.9E+04        0         +5.9E+07          +4.3E+08     
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:00:28       Yes          57           67          49350       153614   
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income    income_squared     age        child   
------  ----------  ----------  ----------  ----------  |  ------  ----------  --------------  ----------  ----------
  1      +5.6E-01                                       |    1      +2.2E+00      +0.0E+00      +1.4E+00    +0.0E+00 
        (+1.4E-01)                                      |          (+9.5E-01)                  (+3.7E-01)            
                                                        |                                                            
prices   +0.0E+00    +3.2E+00                           |  prices   +6.0E+02      -3.1E+01      +0.0E+00    +1.2E+01 
                    (+1.0E+00)                          |          (+1.4E+02)    (+7.2E+00)                (+3.6E+00)
                                                        |                                                            
sugar    +0.0E+00    +0.0E+00    -6.4E-03               |  sugar    -3.8E-01      +0.0E+00      +5.2E-02    +0.0E+00 
                                (+1.2E-02)              |          (+5.8E-02)                  (+2.3E-02)            
                                                        |                                                            
mushy    +0.0E+00    +0.0E+00    +0.0E+00    +7.4E-02   |  mushy    +6.0E-01      +0.0E+00      -1.4E+00    +0.0E+00 
                                            (+1.9E-01)  |          (+2.8E-01)                  (+3.0E-01)            
=====================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices  
----------
 -6.3E+01 
(+8.1E+00)
==========

Estimated Micro Moments:
==========================================================================================================================
Observed  Estimated  Difference           Moment                     Part               Dataset      Observations  Markets
--------  ---------  ----------  -------------------------  ----------------------  ---------------  ------------  -------
+4.4E-01  +4.4E-01    -1.3E-03   E[income | mushy cereal]     E[income * mushy]     Consumer Survey      5000        All  
                                                                   E[mushy]         Consumer Survey      5000        All  
+2.0E-01  +2.0E-01    +1.0E-03   E[income | sugary cereal]  E[income * high_sugar]  Consumer Survey      5000        All  
                                                                E[high_sugar]       Consumer Survey      5000        All  
-6.4E-02  -6.3E-02    -2.9E-04     E[age | mushy cereal]        E[age * mushy]      Consumer Survey      5000        All  
                                                                   E[mushy]         Consumer Survey      5000        All  
==========================================================================================================================

# Step 6: Compare BLP vs BLP + Micro Moments
print("=" * 60)
print("Parameter Comparison: BLP vs BLP + Micro Moments")
print("=" * 60)

# Sigma diagonal
print("\nSigma (diagonal — std devs of random coefficients):")
labels = ['constant', 'price', 'sugar', 'mushy']
for i, label in enumerate(labels):
    print(f"  {label:10s}: BLP = {blp_results.sigma[i,i]:+.4f}  |  +Micro = {blp_micro_results.sigma[i,i]:+.4f}")

# Beta
print(f"\nBeta (price): BLP = {float(blp_results.beta[0]):+.4f}  |  +Micro = {float(blp_micro_results.beta[0]):+.4f}")

# Objective
print(f"\nGMM Objective: BLP = {float(blp_results.objective):.4f}  |  +Micro = {float(blp_micro_results.objective):.4f}")
print("  (Higher objective expected — more moments to match)")

# Own-price elasticities comparison
micro_elasticities = blp_micro_results.compute_elasticities()
micro_own = blp_micro_results.extract_diagonal_means(micro_elasticities)
print(f"\nMean own-price elasticity: BLP = {own_elasticities.mean():.3f}  |  +Micro = {micro_own.mean():.3f}")

============================================================
Parameter Comparison: BLP vs BLP + Micro Moments
============================================================

Sigma (diagonal — std devs of random coefficients):
  constant  : BLP = +0.5581  |  +Micro = +0.5644
  price     : BLP = +3.3125  |  +Micro = +3.2430
  sugar     : BLP = -0.0058  |  +Micro = -0.0064
  mushy     : BLP = +0.0934  |  +Micro = +0.0739

Beta (price): BLP = -62.7299  |  +Micro = -63.2512

GMM Objective: BLP = 4.5615  |  +Micro = 7.8621
  (Higher objective expected — more moments to match)

Mean own-price elasticity: BLP = -3.618  |  +Micro = -3.612

Summary

	Logit	BLP (Random Coefficients)	BLP + Micro Moments
Consumer heterogeneity	None (representative consumer)	Random coefficients + demographics	Same, but better identified
Substitution pattern	IIA: proportional to market shares	Flexible: driven by product similarity	Same, more precise
Data used	Market shares only	Market shares + demographic draws	+ Individual-level survey data
Moment conditions	$\mathbb{E}[Z'\xi] = 0$	$\mathbb{E}[Z'\xi] = 0$	$\mathbb{E}[Z'\xi] = 0$ + $\bar{g}_M$
Estimation	Linear IV	Nonlinear GMM with contraction mapping	Same + micro moment matching

Key takeaways:

Logit is fast and easy but imposes unrealistic substitution patterns (IIA). All products are equally substitutable conditional on shares.
BLP breaks IIA via random coefficients: products similar in characteristics are closer substitutes. This is critical for realistic merger simulation and pricing counterfactuals.
Agent demographics ($\Pi d_i$) are part of the model specification — they define the integration measure for computing predicted shares, but they don’t add moment conditions.
Micro moments are additional data from individual-level surveys that provide extra moment conditions in the GMM objective. They improve identification of $\Sigma$ and $\Pi$ by matching conditional statistics like “average income of mushy cereal buyers.”
Post-estimation is where BLP earns its keep: elasticity matrices, diversion ratios, markups, and full merger simulations with equilibrium price recomputation.

References

Berry, S. T. (1994). Estimating Discrete-Choice Models of Product Differentiation. RAND Journal of Economics, 25(2), 242–262.
Berry, S., Levinsohn, J., & Pakes, A. (1995). Automobile Prices in Market Equilibrium. Econometrica, 63(4), 841–890.
Berry, S., Levinsohn, J., & Pakes, A. (2004). Differentiated Products Demand Systems from a Combination of Micro and Macro Data: The New Car Market. Journal of Political Economy, 112(1), 68–105.
Petrin, A. (2002). Quantifying the Benefits of New Products: The Case of the Minivan. Journal of Political Economy, 110(4), 705–729.
Nevo, A. (2000). A Practitioner’s Guide to Estimation of Random-Coefficients Logit Models of Demand. Journal of Economics & Management Strategy, 9(4), 513–548.
Conlon, C., & Gortmaker, J. (2020). Best Practices for Differentiated Products Demand Estimation with PyBLP. RAND Journal of Economics, 51(4), 1108–1161.
Conlon, C., & Gortmaker, J. (2023). Incorporating Micro Data into Differentiated Products Demand Estimation with PyBLP. Working Paper.
PyBLP Documentation