Description
Synnax is proud to present the evolution of our financial indicators modeling task with the introduction of Datathon 3! Datathon 3 is a regression problem with multiple targets - a traditional ML problem. All the datasets are conveniently structured - Kaggle style.
In this competition, Synnax challenges participants to predict the financial metrics for the most recent quarter reported for each company. The dataset describes each quarter with 141 financial components, such as TOTAL ASSETS, TOTAL LIABILITIES, and EBITDA, along with company metadata and a rich set of macroeconomic indicators. Feature names are encrypted for security. Each quarter is prefixed (Q_0, Q_1, …, Q_4), where _0 denotes the latest quarter in the dataset. The target variables, which were extracted from the Q_0 statistics, are found in the targets_train.csv file.
The task type is identical to that of the previous competition (Datathon 2), which included supplementary macroeconomic data as additional potential training features, but there are a few meaningful changes this time:
We have revealed the actual dates of the financial indicators reported across quarters, which are part of the training features and training targets. There are two new columns in X_train:
lastUpdatedQuarterEndDate, which indicates the date when the Q_0 indicators (the ones in targets_train.csv) were reported. The dates of the earlier quarters (Q_1, Q_2, Q_3, Q_4) can be inferred by subtracting three months from lastUpdatedQuarterEndDate to get the date of the Q_1 features, and so on.
lastUpdatedAnnumEndDate, which refers to the date of all the features with the Y_0 prefix. Accordingly, one may infer that the Y_1 indicators were reported exactly one year earlier.
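The date inference described above can be sketched as follows. This is a minimal illustration, not an official utility: the helper names (shift_months, quarter_dates, annual_dates) are hypothetical, the snapping of month-end dates to the end of the target month is an assumption about how quarter-end dates behave, and the default of four annual periods is taken from the dataset overview.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Move d by a number of calendar months (negative means earlier).

    Assumption: if d is the last day of its month, the result is snapped
    to the last day of the target month (2024-06-30 -> 2024-03-31).
    """
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    was_month_end = d.day == calendar.monthrange(d.year, d.month)[1]
    day = last_day if was_month_end else min(d.day, last_day)
    return date(year, month, day)


def quarter_dates(last_updated_quarter_end: date, n_quarters: int = 5) -> dict:
    """Map the prefixes Q_0..Q_4 to inferred dates, three months apart."""
    return {
        f"Q_{k}": shift_months(last_updated_quarter_end, -3 * k)
        for k in range(n_quarters)
    }


def annual_dates(last_updated_annum_end: date, n_years: int = 4) -> dict:
    """Map the prefixes Y_0..Y_3 to inferred dates, one year apart."""
    return {
        f"Y_{k}": shift_months(last_updated_annum_end, -12 * k)
        for k in range(n_years)
    }


if __name__ == "__main__":
    # Hypothetical value of lastUpdatedQuarterEndDate for one company.
    for prefix, inferred in quarter_dates(date(2024, 6, 30)).items():
        print(prefix, inferred.isoformat())
```

The inferred dates can then be used, for example, to join each quarter's features with the macroeconomic indicators for the matching period.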
Test dataset: X_forward_looking contains the same companies that are part of X_train, shifted one quarter forward. It does not require any special data wrangling: just train on X_train/targets_train and predict on X_forward_looking as you normally would in any other traditional ML problem. Some additional explanation may still be useful:
Contestants use X_train as features, which represent companies' financial indicators reported in Q_1-Q_4, and targets_train as targets, which represent financial indicators reported in Q_0.
X_forward_looking has the same structure as X_train, only the values in Q_1-Q_4 have been shifted one quarter forward. This way the X_train/Q_1 values are placed in X_forward_looking/Q_2.
The task is to predict the next quarter's financial indicators in the X_forward_looking dataset.
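To make the one-quarter shift concrete, here is a small sketch of how a single X_train row relates to the corresponding X_forward_looking row. It only illustrates the stated mapping (X_train/Q_1 lands in X_forward_looking/Q_2, and so on) and can serve as a sanity check on alignment. The column naming pattern Q_<k>_<feature> and the function name are assumptions for illustration, since the real feature names are encrypted; the sketch also does not fill in Q_1 of the shifted row, which holds the newer quarter's values in the actual test file.

```python
import re

# Assumed column naming: "Q_<quarter>_<feature>". Real names are encrypted.
QUARTER_COLUMN = re.compile(r"^Q_(\d+)_(.+)$")


def shift_one_quarter_forward(train_row: dict, max_quarter: int = 4) -> dict:
    """Relabel quarterly columns so that Q_k becomes Q_(k+1).

    Non-quarterly columns (metadata and the like) pass through unchanged.
    Values that would move past max_quarter fall out of the window.
    """
    shifted = {}
    for name, value in train_row.items():
        match = QUARTER_COLUMN.match(name)
        if match is None:
            shifted[name] = value
            continue
        quarter = int(match.group(1))
        feature = match.group(2)
        if quarter + 1 <= max_quarter:
            shifted[f"Q_{quarter + 1}_{feature}"] = value
    return shifted


if __name__ == "__main__":
    # Hypothetical row with one made-up feature "f1" and a metadata column.
    row = {"Q_1_f1": 10.0, "Q_2_f1": 9.0, "Q_3_f1": 8.5, "Q_4_f1": 7.0, "sector": "tech"}
    print(shift_one_quarter_forward(row))
```

Because the column layout of X_forward_looking matches X_train, a model fitted on X_train/targets_train can be applied to X_forward_looking without any such relabeling; the sketch is only meant to show where the values come from.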
Public and Private leaderboards are introduced for the first time. From the start and throughout the whole competition, contestants will be scored using the Public subset of the testing data. At the end of the competition, the highest-scoring submissions (on the Public portion) will be scored against the Private part of the testing data.
The Synnax Datathon 3 task is a highly creative endeavor that can be tackled as a conventional machine learning problem. Participants will train models using a variety of features to predict specific targets. Our dataset offers comprehensive financial performance statistics for each company, derived from components of income statements and balance sheets over the last five quarterly and four annual reports. Additional metadata, such as industry, sector, country, and city, provide further insight into the companies' characteristics.
Contestants may also opt for a time series approach, utilizing values from previous quarters to predict future financial outcomes.
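As one possible reading of the time series approach, the sketch below fits a straight line through a single indicator's previous quarters and extrapolates one step ahead. It is a naive baseline under stated assumptions, not a recommended or official method: the function name is hypothetical, the input is assumed to be ordered from oldest to newest (Q_4, Q_3, Q_2, Q_1), and missing values are assumed to have been handled beforehand.

```python
def extrapolate_next_quarter(history):
    """Predict the next value from a least-squares linear trend.

    history: numeric values ordered oldest to newest, e.g. [Q_4, Q_3, Q_2, Q_1].
    With a single value this falls back to persistence (repeat the last value).
    """
    n = len(history)
    if n == 0:
        raise ValueError("history must contain at least one value")
    if n == 1:
        return float(history[0])

    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The next quarter sits at position n on the time axis.
    return intercept + slope * n


if __name__ == "__main__":
    # Hypothetical indicator values for Q_4, Q_3, Q_2, Q_1.
    print(extrapolate_next_quarter([100.0, 104.0, 109.0, 113.0]))
```

A baseline like this can be computed per company and per indicator, and its output can also be fed to a conventional model as an extra feature.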
Competition Timeframe
Start date: 2nd August 2024
End date: 12.00pm (noon) UTC 30th August 2024
Important note: Synnax has developed a novel method to assess the probability of default among private companies in the web3 domain, leveraging distributed machine learning efforts from numerous individual contributors. This competition serves as a precursor to the production phase, acquainting participants with the type of data they will encounter and helping to expand our community. Synnax will invite all top-performing participants to join our production pipeline, which includes opportunities for periodic profit sharing.