We have recently completed a paper that proposes a novel measure of complexity. The paper is available on SSRN here and the abstract is below.
In business research, firm size is both ubiquitous and readily measured. In contrast, complexity, another firm-related construct, is frequently relevant, but difficult to measure and not well defined. As a result, complexity is seldom incorporated in empirical designs. Measures such as the number of firm segments or the readability of a firm’s financial filings are often used as proxies for some aspect of complexity. We argue that most extant measures of complexity are misspecified, one-dimensional, and/or not widely available. We propose a text-based solution as a widely available, omnibus measure of this multidimensional concept and use audit fees—which are well established as being largely driven by size and complexity—as the primary empirical framework for evaluation. Because this is a new measure, we also consider alternative contexts, including returns around 10‑K filings, initial public offerings, unexpected earnings, and the COVID‑19 crisis.
LoughranMcDonald_ComplexityWords_2020_v1.7.xlsx - The list of complexity words and their lemmas.
LoughranMcDonald_10KComplexity_2001-2018.csv - A comma-delimited file containing the complexity measure for every 10-K filing from 2001 through 2018. The variables included are: CIK, filing_date, complexity, total_words, ff_ind, filing_year, company_name.
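As a minimal sketch of how the CSV file might be used, the Python snippet below averages the complexity measure by filing year. It assumes the file has a header row with the variable names exactly as listed above, and that complexity and filing_year parse as a number and an integer; neither assumption is confirmed here. The function name mean_complexity_by_year and the three sample rows are hypothetical and purely illustrative, not values from the actual data.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_complexity_by_year(handle):
    """Return {filing_year: mean complexity} from an open CSV handle.

    Assumes a header row naming the columns 'filing_year' and 'complexity'.
    """
    by_year = defaultdict(list)
    for row in csv.DictReader(handle):
        by_year[int(row["filing_year"])].append(float(row["complexity"]))
    # Sort by year so the result reads chronologically.
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Synthetic rows for illustration only; they are NOT taken from the real file,
# and the filing_date format shown here is an assumption.
SAMPLE = """CIK,filing_date,complexity,total_words,ff_ind,filing_year,company_name
1,20010315,0.010,25000,12,2001,Example Co A
2,20010320,0.020,40000,35,2001,Example Co B
1,20020314,0.030,27000,12,2002,Example Co A
"""

result = mean_complexity_by_year(io.StringIO(SAMPLE))
for year, value in result.items():
    print(year, round(value, 4))
```

To run the same calculation on the downloaded file, pass a handle opened with open("LoughranMcDonald_10KComplexity_2001-2018.csv", newline="") in place of the in-memory sample.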