GHL Demand Data

What is GHL Demand?

GHL (Generalized Hedonic-Linear) is a tractable, scalable hedonic demand system, developed in Pellegrino (2021), that can be estimated for the universe of US publicly-traded corporations. On this page, I describe the GHL data and how other researchers can access it.

Why do we need GHL?

To understand product market power, we must understand product demand and product substitution. In everyday parlance, as well as in macroeconomic models, we assume that firms populate "industries" or "sectors": two firms compete if they belong to the same industry.

While this is a useful simplification, we know things aren't that simple in reality. Take an automobile producer and a motorcycle producer, for example: do they belong to the same industry or to different industries?

Firms do not exist in "buckets" called industries. They exist in a fuzzy product space, where products can be perfect substitutes, perfect complements, or anything in between. This fuzziness is critically important in antitrust litigation. If the FTC or the DOJ sues some company X for attempted monopolization of a market, company X (to defend itself in the litigation) will usually define its market very broadly. The FTC/DOJ, on the other hand, will define the market much more narrowly.

To measure the substitutability of different products in an objective way, Industrial Organization (IO) economists use hedonic demand systems. "Hedonic" means that products are modeled as bundles of characteristics. Two products are substitutable if they provide the consumer with similar characteristics. 

The problem that GHL was developed to address is that standard hedonic demand models used in IO (such as BLP) are not implementable in a multi-industry setting; ergo, they are of little use to macro/finance economists. This is because: 1) the data needed to estimate them do not exist except for a few narrow industries (autos, ready-to-eat cereals, etc.); 2) even if the data for multiple industries did exist, these models are computationally expensive and subject to the curse of dimensionality, which makes them impossible to estimate when the number of firms grows into the hundreds or the thousands.

GHL makes it possible for researchers to study the role of product market competition in a range of settings (particularly, macroeconomics and finance) outside of IO.

What assumptions underlie GHL?

GHL assumes that consumer utility is a quadratic function of the characteristics embedded in the products. In addition, an assumption about firm conduct must be made to identify the demand system. Pellegrino (2021) assumes that firms compete in a Cournot oligopoly, while Ederer and Pellegrino (2021) assume that firms compete à la Cournot with common ownership. If you assume a different form of conduct (say, Bertrand), you get different demand elasticities.
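To see why quadratic utility is convenient, here is a minimal Python sketch of a generic textbook case with two firms. It is not the exact specification in Pellegrino (2021). It assumes utility U(q) = b·q − ½ q'Mq, which yields the linear inverse demand pᵢ = bᵢ − Σⱼ Mᵢⱼ qⱼ, and then solves the Cournot first-order conditions. The names b, c, M and all numbers are hypothetical.

```python
def inverse_demand(b, m, q):
    """Prices implied by quadratic utility: p_i = b_i - sum_j m[i][j] * q_j."""
    return [b[i] - sum(m[i][j] * q[j] for j in range(len(q))) for i in range(len(q))]


def cournot_duopoly(b, c, m):
    """Solve the two-firm Cournot first-order conditions.

    Firm i chooses q_i to maximise (p_i - c_i) * q_i, taking q_j as given:
        b_i - c_i - sum_j m[i][j] * q_j - m[i][i] * q_i = 0
    which is the 2x2 linear system (m + diag(m)) q = b - c, solved by Cramer's rule.
    """
    a11, a12 = 2.0 * m[0][0], m[0][1]
    a21, a22 = m[1][0], 2.0 * m[1][1]
    r1, r2 = b[0] - c[0], b[1] - c[1]
    det = a11 * a22 - a12 * a21
    q1 = (r1 * a22 - a12 * r2) / det
    q2 = (a11 * r2 - a21 * r1) / det
    return [q1, q2]


# Hypothetical parameters: two symmetric firms whose products are partial substitutes.
b = [10.0, 10.0]                  # demand intercepts
c = [2.0, 2.0]                    # constant marginal costs
M = [[1.0, 0.5], [0.5, 1.0]]      # off-diagonal term governs substitutability

q = cournot_duopoly(b, c, M)
p = inverse_demand(b, M, q)
print("quantities:", q)
print("prices:", p)
```

In this sketch, the off-diagonal entries of M play the role that product similarity plays in a hedonic system: the larger they are, the more one firm's output depresses the other firm's price.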

What data is needed to estimate GHL?

The first thing to know is that you don't need to: I am happy to share the data. However, if you want to replicate my estimation of GHL from my 2019, you will need:

I plan to share a replication package at some point in the future.

How is GHL related to TNIC?

The TNIC cosine similarity measures developed by Hoberg and Phillips are one of the two data inputs required to estimate GHL. They are a-theoretical: they are not based on any particular demand framework. GHL is a theory of demand, and the GHL data is the resulting system of markups and cross-price demand elasticities.
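For readers unfamiliar with cosine similarity, the following Python sketch shows the basic calculation. It assumes each firm is represented by a non-negative vector of product characteristics. The three vectors are made up for illustration and are not TNIC inputs, and the actual Hoberg and Phillips construction involves processing steps not shown here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: (u . v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical characteristic vectors (purely illustrative numbers).
car_maker = [3.0, 1.0, 0.0]
motorcycle_maker = [2.0, 2.0, 0.0]
software_firm = [0.0, 0.0, 5.0]

print(cosine_similarity(car_maker, motorcycle_maker))
print(cosine_similarity(car_maker, software_firm))
```

With non-negative vectors the measure lies between 0 (no characteristics in common) and 1 (identical mix of characteristics), which is how the car and motorcycle producers can be "partly" in the same market.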

In GHL, cosine similarities are proportional to price-quantity derivatives (∂pᵢ/∂qⱼ). However, these derivatives depend on a certain volumetric normalization. What that means in practice is that, in order to interpret cosine similarities as price-quantity derivatives, you need to measure the output of each product in a specific unit (pounds, kilograms).

This dataset contains demand elasticities. The advantage of using elasticities is that elasticities do not depend on volumetric units - i.e. the cross-price elasticity of demand of bread and milk is the same regardless of whether you measure bread in pounds or kilograms, and regardless of whether you measure milk in milliliters or gallons.
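The unit-invariance point can be checked numerically. The Python sketch below uses a made-up linear demand curve for bread (the coefficients and prices are hypothetical) and compares the cross-price derivative and the cross-price elasticity under two unit systems: pounds and gallons versus kilograms and milliliters.

```python
LB_TO_KG = 0.45359237      # kilograms per pound
GAL_TO_ML = 3785.411784    # milliliters per US gallon


def bread_demand_lb(p_bread_per_lb, p_milk_per_gal):
    """Hypothetical linear demand for bread, in pounds."""
    return 100.0 - 4.0 * p_bread_per_lb + 2.0 * p_milk_per_gal


def bread_demand_kg(p_bread_per_kg, p_milk_per_ml):
    """The same demand curve, with bread in kilograms and milk priced per milliliter."""
    p_lb = p_bread_per_kg * LB_TO_KG     # price per pound
    p_gal = p_milk_per_ml * GAL_TO_ML    # price per gallon
    return bread_demand_lb(p_lb, p_gal) * LB_TO_KG


def cross_derivative(demand, p_own, p_other):
    """Central finite difference of quantity with respect to the other good's price."""
    h = p_other * 1e-4
    return (demand(p_own, p_other + h) - demand(p_own, p_other - h)) / (2.0 * h)


def cross_elasticity(demand, p_own, p_other):
    """Cross-price elasticity: (dq / dp_other) * (p_other / q)."""
    return cross_derivative(demand, p_own, p_other) * p_other / demand(p_own, p_other)


# Same economic situation, expressed in two unit systems.
p_lb, p_gal = 3.0, 4.0
p_kg, p_ml = p_lb / LB_TO_KG, p_gal / GAL_TO_ML

print("derivative (lb, gal):", cross_derivative(bread_demand_lb, p_lb, p_gal))
print("derivative (kg, ml): ", cross_derivative(bread_demand_kg, p_kg, p_ml))
print("elasticity (lb, gal):", cross_elasticity(bread_demand_lb, p_lb, p_gal))
print("elasticity (kg, ml): ", cross_elasticity(bread_demand_kg, p_kg, p_ml))
```

The two derivatives differ by the product of the conversion factors, while the two elasticities coincide, because the unit changes in price and quantity cancel in the ratio.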

Where do I find the full methodology?

The full methodology is described in Pellegrino (2021).

Do you share the data? What data is available? Can I match it to Compustat?

As of July 2022, I am sharing with other researchers the following GHL demand data:

for the Compustat universe in 1996-2019. This data can be matched to Compustat using the gvkey firm identifier. 
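As an illustration of the matching step, here is a minimal Python sketch of a firm-year join on gvkey. Everything except the gvkey identifier is an assumption: the field names (year, fyear, markup, sale), the idea that the GHL year lines up with the Compustat fiscal year, and all values are made up. One practical point the sketch handles is that gvkey is typically stored as a six-character, zero-padded string in Compustat, while other files may hold it as an integer.

```python
def normalize_gvkey(raw):
    """Return gvkey as a six-character, zero-padded string (e.g. 123 -> '000123')."""
    return str(int(raw)).zfill(6)


def merge_on_gvkey_year(ghl_rows, compustat_rows):
    """Inner join of firm-year rows on (gvkey, year); field names are hypothetical."""
    index = {
        (normalize_gvkey(row["gvkey"]), row["fyear"]): row
        for row in compustat_rows
    }
    merged = []
    for row in ghl_rows:
        key = (normalize_gvkey(row["gvkey"]), row["year"])
        match = index.get(key)
        if match is not None:
            # Combine both records and keep the normalized identifier.
            merged.append({**match, **row, "gvkey": key[0]})
    return merged


# Made-up rows: gvkey appears as an integer in one file and as a string in the other.
ghl_rows = [
    {"gvkey": 123, "year": 2019, "markup": 1.25},
    {"gvkey": 456, "year": 2019, "markup": 1.10},
]
compustat_rows = [
    {"gvkey": "000123", "fyear": 2019, "sale": 500.0},
    {"gvkey": "000789", "fyear": 2019, "sale": 900.0},
]

merged = merge_on_gvkey_year(ghl_rows, compustat_rows)
print(merged)
```

Only firm-years present in both inputs survive the join, so comparing the number of merged rows with the number of input rows is a quick check on match quality.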

How do I access the data?

This data is LARGE (even larger than TNIC). If you wish to access this data, or to collaborate with me on a research project, please write to me at bpellegr [at] (I am usually very responsive). Based on your data needs, I'll share the data in the most efficient way.

All errors are mine. Be aware that the methodology might be updated over time; I provide the most recent implementation of the data.

What paper should I cite if I use the data?

Pellegrino, Bruno: "Product Differentiation and Oligopoly: a Network Approach." (Working paper, 2021).