SEC/EDGAR Data

  1. 10X Summaries - file containing all summary data for all 10-X filings, including header information, sentiment word counts, and file statistics.

  2. 10X Document Dictionaries - file containing header information and word counts for all 10-X filings (14.4 GB).

  3. All SEC/EDGAR Filings Tabulation - an Excel spreadsheet with a tabulation of all SEC/EDGAR filings from 1993-2021.

  4. Master Index Data - the SEC/EDGAR master index files used to create all of the downloads.

  5. Cleaned and Raw 10-X Files - all 10-X filings for all years. The cleaned files have extraneous characters removed, which allows for substantial compression. The raw files are those downloaded directly from the SEC/EDGAR website.
     

Note: We use the label "10X" to refer to all 10-K/Q filings. Specifically, this includes the following forms:

f_10K = {'10-K', '10-K405', '10KSB', '10-KSB', '10KSB40'}

f_10KA = {'10-K/A', '10-K405/A', '10KSB/A', '10-KSB/A', '10KSB40/A'}

f_10KT = {'10-KT', '10KT405', '10-KT/A', '10KT405/A'}

f_10Q = {'10-Q', '10QSB', '10-QSB'}

f_10QA = {'10-Q/A', '10QSB/A', '10-QSB/A'}

f_10QT = {'10-QT', '10-QT/A'}
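
The form groups above are valid Python set literals, so one way to test whether a filing counts as "10X" is to take their union. The following is a minimal sketch: the names f_10X and is_10x are hypothetical and are not part of the distributed files, and the whitespace/case normalization is an assumption about how form types might appear in an index.

```python
# Form-type groups copied from the note above.
f_10K = {'10-K', '10-K405', '10KSB', '10-KSB', '10KSB40'}
f_10KA = {'10-K/A', '10-K405/A', '10KSB/A', '10-KSB/A', '10KSB40/A'}
f_10KT = {'10-KT', '10KT405', '10-KT/A', '10KT405/A'}
f_10Q = {'10-Q', '10QSB', '10-QSB'}
f_10QA = {'10-Q/A', '10QSB/A', '10-QSB/A'}
f_10QT = {'10-QT', '10-QT/A'}

# Hypothetical name: the union of every group, i.e. all "10X" form types.
f_10X = f_10K | f_10KA | f_10KT | f_10Q | f_10QA | f_10QT


def is_10x(form_type):
    """Return True if form_type is one of the 10-X forms listed above.

    Stripping whitespace and upper-casing are assumptions made for
    convenience; drop them if exact matching is required.
    """
    return form_type.strip().upper() in f_10X
```

For example, is_10x('10-K405') is True and is_10x('8-K') is False. Keeping the groups separate and building the union from them makes it easy to restrict a sample to, say, annual reports only by testing membership in f_10K instead.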