This program parses SC 13D filings from the SEC EDGAR database and extracts information from them for further study of the trading activities of blockholders.
Download all the filings from SEC EDGAR, from 1994 through the most recent available.
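As a rough illustration of the download step, the sketch below lists the quarterly index files that EDGAR publishes under its full-index directory and fetches one of them. The function names, the choice of master.idx, and the contact string in the User-Agent header are assumptions made for this example, not the names used by the actual program.

```python
from datetime import date
from urllib.request import Request, urlopen

BASE = "https://www.sec.gov/Archives/edgar/full-index"


def quarterly_index_urls(start_year=1994, end=None, name="master.idx"):
    """List quarterly index URLs from start_year up to the quarter containing `end`."""
    end = end or date.today()
    last_quarter = (end.month - 1) // 3 + 1
    urls = []
    for year in range(start_year, end.year + 1):
        for quarter in range(1, 5):
            if year == end.year and quarter > last_quarter:
                break  # do not request quarters that have not started yet
            urls.append(f"{BASE}/{year}/QTR{quarter}/{name}")
    return urls


def fetch(url, user_agent="Example Research contact@example.com"):
    """Download one file. The SEC asks clients to identify themselves;
    the default contact string here is a placeholder (assumption)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        return response.read().decode("latin-1")
```

Calling quarterly_index_urls() with no arguments covers 1994 up to the current quarter; each URL can then be passed to fetch() at a polite request rate.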
Retrieve the .txt file of each filing through the index file downloaded by the first program, then parse each file and extract the following information: CIK, Name, Type, Date, Link, File, Subject Company, Subject Company CIK, Filed Company, Filed Company CIK, Name of Issuer, Title of Class of Securities, CUSIP Number, Date of Event, Names of Reporting Persons, Sole Voting Power, Shared Voting Power, Sole Dispositive Power, Shared Dispositive Power, Aggregate Amount, Percent, and Purposes of Transaction (which comprise 42 categories).
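The parsing step might look roughly like the sketch below. It assumes the index is the pipe-delimited master.idx file (CIK|Company Name|Form Type|Date Filed|Filename), that each .txt filing begins with the usual EDGAR header containing SUBJECT COMPANY and FILED BY sections, and that the cover page prints the CUSIP on the line above the "(CUSIP Number)" caption. All function names are hypothetical, and only a few of the listed fields are covered.

```python
import re


def sc13d_rows(index_text, forms=("SC 13D",)):
    """Yield one dict per index row whose form type is in `forms`.
    Add "SC 13D/A" to `forms` to include amendments (assumption)."""
    for line in index_text.splitlines():
        parts = line.split("|")
        if len(parts) != 5 or not parts[0].strip().isdigit():
            continue  # skip the preamble, column header, and separator lines
        cik, name, form, filed, path = (p.strip() for p in parts)
        if form in forms:
            yield {"CIK": cik, "Name": name, "Type": form, "Date": filed,
                   "Link": "https://www.sec.gov/Archives/" + path}


def header_party(text, section):
    """Return (name, cik) from a header section such as
    'SUBJECT COMPANY' or 'FILED BY'; (None, None) if absent."""
    pattern = (re.escape(section)
               + r":.*?COMPANY CONFORMED NAME:\s*(.+?)\s*\n"
               + r".*?CENTRAL INDEX KEY:\s*(\d+)")
    match = re.search(pattern, text, re.S)
    return (match.group(1), match.group(2)) if match else (None, None)


def cusip_number(text):
    """Return the last non-blank, non-rule line printed above the
    '(CUSIP Number)' caption, or None when the caption is missing."""
    match = re.search(r"\(CUSIP Number", text, re.I)
    if not match:
        return None
    for line in reversed(text[:match.start()].splitlines()):
        candidate = line.strip(" \t-_")
        if candidate:
            return candidate
    return None
```

Cover pages vary a great deal across filers and years, so patterns like these will miss some filings; rows with missing fields would need to be corrected in a later step.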
Merge the files generated by the previous Python programs.
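One way to picture the merge is as a left join of the index rows with the parsed fields. The sketch below assumes both outputs are CSV files that share a File column identifying each filing; the key name and the function names are assumptions for illustration, and the actual programs may combine their outputs differently.

```python
import csv


def read_rows(path):
    """Load a CSV file as a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def merge_on_key(left_rows, right_rows, key="File"):
    """Left-join two lists of dicts on `key`.
    Left rows without a match are kept unchanged."""
    lookup = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        extra = lookup.get(row[key], {})
        # Copy every column from the matching right row except the key itself.
        merged.append({**row, **{k: v for k, v in extra.items() if k != key}})
    return merged
```

Keeping unmatched index rows makes filings whose parsing failed visible in the merged output, which helps when fixing missing values afterwards.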
Fix the remaining missing values and incorrect information in the outputs.
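The kinds of fixes needed depend on the data, so the following is only one hypothetical example of a cleaning rule: it normalizes the Percent field to a number and returns None when no plausible value can be recovered. The function name and the 0 to 100 plausibility range are assumptions, not part of the original programs.

```python
import re


def clean_percent(raw):
    """Return the percentage as a float, or None when it cannot be recovered."""
    if raw is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", raw.replace(",", ""))
    if not match:
        return None
    value = float(match.group())
    # Values outside 0..100 are treated as extraction errors (assumption).
    return value if 0 <= value <= 100 else None
```

Rows for which such a rule returns None can be collected and reviewed by hand instead of being silently filled in.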