The first database of extracted and structured text from global company filings, optimised for investors.
Text is extracted from underlying PDF filings (annual and interim reports) and unwanted elements are identified and removed.
Text is manually labelled using a custom taxonomy to enable nuanced analysis and like-for-like text comparison.
A hybrid human-machine process allows for highly accurate identification of elements, including:
● Management Discussion & Analysis
● Risk Disclosures
● Guidance
● Speaker Information
The data has been designed with the quantitative investor in mind. Key features include:
● Long & Bias-free History
● Point-in-time entity identifiers
● Full versioning of all changes
● Easy access via AWS S3
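For readers unfamiliar with point-in-time identifiers, the idea can be sketched as a lookup that returns the identifier an entity carried on a given date, so a backtest only sees what was known at the time. This is a minimal illustration, not the product's actual schema: the entity key, identifier values, dates, and the `identifier_as_of` helper are all hypothetical.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical identifier history for one entity: (effective_from, identifier),
# sorted by effective date. Real field names and values will differ.
HISTORY = {
    "entity_001": [
        (date(2005, 1, 1), "OLDCO"),
        (date(2015, 6, 1), "NEWCO"),
    ],
}


def identifier_as_of(entity, as_of):
    """Return the identifier in effect for `entity` on `as_of`, or None."""
    records = HISTORY[entity]
    effective_dates = [effective for effective, _ in records]
    # Count the records whose effective date is on or before `as_of`.
    idx = bisect_right(effective_dates, as_of)
    if idx == 0:
        return None  # no identifier existed yet on that date
    return records[idx - 1][1]


print(identifier_as_of("entity_001", date(2010, 3, 31)))
print(identifier_as_of("entity_001", date(2020, 3, 31)))
```

Keying a backtest on the as-of identifier, rather than the latest one, avoids joining historical filings to an identifier that did not yet exist.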
To set up a trial or backtest the data, please reach out.