Forecasting stock returns with liquid neural networks using the LNN SageMaker algorithm

Stock return forecasting has been extensively studied by both academic researchers and industry practitioners. Numerous machine learning models have been proposed for this purpose, ranging from simple linear regressions to complex deep learning models [1]. In this post, we examine the performance of liquid neural networks (LNNs) [5] [6], a new neural network architecture for sequential data.

LNNs belong to the class of continuous-time recurrent neural networks (CT-RNNs) [3] [4], which assume that the evolution of the hidden state over time follows an ordinary differential equation (ODE). LNNs are based on the liquid time-constant (LTC) ODE [5], where both the derivative and the time constant of the hidden state are determined by a neural network. This allows LNNs to capture more complex temporal patterns than standard RNNs and, as a result, they have been found to match or outperform several modern deep learning models on time series prediction tasks [5] [6].
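
Concretely, the LTC ODE of [5] has the following form:

\[\frac{dx(t)}{dt} = -\left[\frac{1}{\tau} + f(x(t), I(t), t, \theta)\right] x(t) + f(x(t), I(t), t, \theta) \, A\]

where \(x(t)\) is the hidden state, \(I(t)\) is the input, \(f\) is a neural network with parameters \(\theta\), \(\tau\) is a time constant and \(A\) is a bias vector. The effective time constant \(\tau_{sys} = \tau / \left(1 + \tau f(x(t), I(t), t, \theta)\right)\) therefore varies with the state and the input, which is what makes the network "liquid".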

LNNs were initially implemented as LTC networks or LTCs [5]. Similar to other CT-RNNs, LTCs use a numerical solver for finding the ODE solution, which results in slow training and inference performance. In this post, we use the closed-form continuous-depth (CfC) implementation of LNNs [6]. CfCs use an approximate closed-form solution of the LTC ODE and, therefore, are significantly faster than LTCs and other CT-RNNs.
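
In particular, CfCs replace the numerical solver with a single forward pass through an approximate closed-form expression for the hidden state, which in [6] takes the form:

\[x(t) = \sigma\left(-f(x, I; \theta_f) \, t\right) \odot g(x, I; \theta_g) + \left[1 - \sigma\left(-f(x, I; \theta_f) \, t\right)\right] \odot h(x, I; \theta_h)\]

where \(\sigma\) is the sigmoid function, \(\odot\) denotes element-wise multiplication, and \(f\), \(g\) and \(h\) are neural network heads.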

We will use the LNN SageMaker algorithm, our Amazon SageMaker implementation of CfCs for probabilistic time series forecasting. We will forecast the conditional mean and the conditional standard deviation of the 30-day returns of the S&P 500, using as inputs the S&P 500 realized volatility as well as different implied volatility indices, as in [2].

We will use the data from the 30th of June 2022 to the 28th of June 2024, which we will download from Yahoo! Finance. We will train the model on the data up to the 8th of September 2023, and use the trained model to predict the subsequent data up to the 28th of June 2024. We will find that the LNN SageMaker algorithm achieves a mean absolute error of 1.4% and a mean directional accuracy of 95.8% over the considered time period.

Data

Outputs

The model outputs are the 30-day returns of the S&P 500, which are calculated as follows

\[y(t) = \ln{P(t)} - \ln{P(t - 30)}\]

for each day \(t\), where \(P(t)\) is the close price of the S&P 500 on day \(t\). We will use a prediction length of 30 days, meaning that the model will output the 30-day returns over the subsequent 30 days. Given that we use overlapping (or rolling) returns, the predicted 30-day return from day \(t\) to day \(t + 30\) is the last return in the output sequence.
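
As a quick sanity check of this definition (with made-up prices), pandas' diff over 30 periods matches the formula above:

import numpy as np
import pandas as pd

# made-up daily close prices, for illustration only
prices = pd.Series(np.linspace(4000.0, 4400.0, num=60))

# y(t) = ln P(t) - ln P(t - 30)
y = np.log(prices).diff(periods=30)

# the 30-day return on day 45 is the log price change since day 15
assert np.isclose(y.iloc[45], np.log(prices.iloc[45]) - np.log(prices.iloc[15]))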

Inputs

The model uses as input the previous 30-day returns of the S&P 500, as well as the past values of the following volatility indicators:

  • The realized volatility of the S&P 500 (RVOL), which is calculated as the 30-day rolling sample standard deviation of the S&P 500 daily returns.

  • The VIX index, which measures the 30-day implied volatility of S&P 500 options.

  • The VVIX index, which reflects the 30-day expected volatility of the VIX.

  • The VXN index, which measures the 30-day implied volatility of NASDAQ 100 options.

  • The GVZ index, which measures the 30-day implied volatility of SPDR Gold Shares ETF (GLD) options.

  • The OVX index, which measures the 30-day implied volatility of United States Oil Fund (USO) options.

RVOL is a backward-looking indicator, as it estimates the volatility over the past 30 days, while VIX, VVIX, VXN, GVZ, and OVX are forward-looking indicators, as they reflect the market’s expectation of what the volatility will be over the next 30 days.
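
In terms of the definition above, RVOL is the (non-annualized) sample standard deviation of the daily log returns \(r(t) = \ln{P(t)} - \ln{P(t - 1)}\) over a 30-day rolling window:

\[RVOL(t) = \sqrt{\frac{1}{29} \sum_{i=0}^{29} \left(r(t - i) - \bar{r}(t)\right)^2}\]

where \(\bar{r}(t)\) is the sample mean of the daily returns over the same window.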

Note that we use the same inputs as in [2], with the exception of the PUTCALL index, which we had to exclude as its historical time series is not publicly available. Note also that, as discussed in [2], we exclude the (short-term) term structure of the VIX index (VIX9D, VIX3M, VIX6M) as the different tenor points are highly correlated with each other and with the VIX index, resulting in high multicollinearity and low predictive power.

We will use a context length of 30 days, meaning that the model will use as input the 30-day returns and the volatility indicators over the previous 30 days in order to predict the 30-day returns over the subsequent 30 days.
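
For illustration, the sketch below shows how such input-output windows could be constructed (the make_windows helper is hypothetical; the actual window construction is performed internally by the algorithm):

import numpy as np

def make_windows(data, context_length=30, prediction_length=30, stride=1):
    # data: array of shape (T, 7) with the 30-day returns in the first column
    X, Y = [], []
    for s in range(0, len(data) - context_length - prediction_length + 1, stride):
        # inputs: all columns (returns and volatility indicators) over the context window
        X.append(data[s: s + context_length])
        # outputs: the 30-day returns over the subsequent prediction window
        Y.append(data[s + context_length: s + context_length + prediction_length, 0])
    return np.stack(X), np.stack(Y)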

Code

Warning

To be able to run the code below, you need to have an active subscription to the LNN SageMaker algorithm. You can subscribe to a free trial from the AWS Marketplace in order to get your Amazon Resource Name (ARN). In this post we use version 1.6 of the LNN SageMaker algorithm, which runs in the PyTorch 2.1.0 Python 3.10 deep learning container.

Environment Set-Up

We start by importing all the dependencies and setting up the SageMaker environment.

import io
import sagemaker
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
from sklearn.metrics import root_mean_squared_error, mean_absolute_error, accuracy_score, f1_score

# SageMaker algorithm ARN, replace the placeholder below with your AWS Marketplace ARN
algo_arn = "arn:aws:sagemaker:<...>"

# SageMaker session
sagemaker_session = sagemaker.Session()

# SageMaker role
role = sagemaker.get_execution_role()

# S3 bucket
bucket = sagemaker_session.default_bucket()

# EC2 instance
instance_type = "ml.m5.4xlarge"

After that, we define the neural network's context length and prediction length. The context length is the number of past time steps used as input, while the prediction length is the number of future time steps to be predicted. We set both of them equal to 30 days, that is, we use the previous 30 values of the inputs and output to predict the subsequent 30 values of the output.

# number of time steps used as input
context_length = 30

# number of time steps to output
prediction_length = 30

We also define all the remaining hyperparameters. Note that we use a relatively small model with less than 5k parameters. A detailed description of the model architecture and of its hyperparameters is provided in our GitHub repository.

# neural network hyperparameters
hyperparameters = {
    "context-length": context_length,
    "prediction-length": prediction_length,
    "sequence-stride": 1,
    "hidden-size": 20,
    "backbone-layers": 1,
    "backbone-units": 40,
    "backbone-activation": "lecun",
    "backbone-dropout": 0,
    "minimal": True,
    "no-gate": True,
    "use-mixed": False,
    "use-ltc": False,
    "batch-size": 32,
    "lr": 0.0001,
    "lr-decay": 0.9999,
    "epochs": 800,
}

Data Preparation

Next, we download the daily close prices of the S&P 500 and of the volatility indices from the 30th of June 2022 to the 28th of June 2024 using the Yahoo! Finance Python API. The dataset contains 502 daily observations.

# download the data
tickers = ["^SPX", "^VIX", "^VVIX", "^VXN", "^GVZ", "^OVX"]
dataset = yf.download(" ".join(tickers), start="2022-06-30", end="2024-06-29")

# extract the close prices
dataset = dataset.loc[:, dataset.columns.get_level_values(0) == "Close"]
dataset.columns = dataset.columns.get_level_values(1)

# forward fill any missing values
dataset.ffill(inplace=True)

We then calculate the 30-day returns and 30-day realized volatility.

# calculate the returns
dataset["Return30"] = np.log(dataset["^SPX"]).diff(periods=30)

# calculate the realized volatility
dataset["RVOL"] = np.log(dataset["^SPX"]).diff(periods=1).rolling(window=30).std(ddof=1)

# drop the prices
dataset.drop(labels=["^SPX"], axis=1, inplace=True)

# drop the missing values
dataset.dropna(inplace=True)

# move the returns to the first column
dataset = dataset[["Return30"] + dataset.columns.drop("Return30").tolist()]

After dropping the missing values resulting from the calculation of the returns and of the realized volatility (both require 30 prior observations, so the first 30 rows contain missing values), the number of daily observations is reduced from 502 to 472.

dataset.shape
(472, 7)
dataset.head()
First 3 rows of dataset
dataset.tail()
Last 3 rows of dataset

30-day returns, 30-day realized volatility and volatility indices from 2022-08-12 to 2024-06-28.

We now rename the columns to the format expected by the LNN SageMaker algorithm, where the output names should start with "y" and the input names should start with "x".

dataset.columns = ["y"] + [f"x{i}" for i in range(dataset.shape[1] - 1)]
dataset.head()
First 3 rows of renamed dataset
dataset.tail()
Last 3 rows of renamed dataset

Note

The algorithm's code always includes the past values of the outputs among the inputs and, therefore, there is no need to add the lagged values of the outputs when preparing the data for the model.

Testing

To validate the model, we split the data into a training set and a test set. The test set covers the last 30% of the data, while the training set covers the preceding observations, after setting aside a buffer of context_length + prediction_length + 1 days that is only used as input context for the first test predictions.

# define the size of the test set
test_size = int(0.3 * len(dataset))

# extract the training data
training_dataset = dataset.iloc[:- test_size - context_length - prediction_length - 1]

# extract the test data
test_dataset = dataset.iloc[- test_size - context_length - prediction_length - 1:]

Note

The data is scaled internally by the algorithm, so there is no need to scale it beforehand.

We now save the training data in S3, build the SageMaker estimator and run the training job.

# upload the training data to S3
training_data = sagemaker_session.upload_string_as_file_body(
    body=training_dataset.to_csv(index=False),
    bucket=bucket,
    key="training_data.csv"
)

# create the estimator
estimator = sagemaker.algorithm.AlgorithmEstimator(
    algorithm_arn=algo_arn,
    role=role,
    instance_count=1,
    instance_type=instance_type,
    input_mode="File",
    sagemaker_session=sagemaker_session,
    hyperparameters=hyperparameters
)

# run the training job
estimator.fit({"training": training_data})

After the training job has been completed, we deploy the model to a real-time endpoint that we can use for inference.

# define the endpoint inputs serializer
serializer = sagemaker.serializers.CSVSerializer(content_type="text/csv")

# define the endpoint outputs deserializer
deserializer = sagemaker.base_deserializers.PandasDeserializer(accept="text/csv")

# create the endpoint
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    serializer=serializer,
    deserializer=deserializer,
)

Once the endpoint has been created, we can generate the test set predictions. Given that the returns are overlapping, we are only interested in the last element of each predicted sequence (recall that the prediction length is 30 days, the same as the horizon of the returns).

# create a list for storing the predictions
predictions = []

# loop across the dates
for t in range(context_length, len(test_dataset) - prediction_length + 1):

    # extract the inputs
    payload = test_dataset.iloc[t - context_length: t]

    # invoke the endpoint
    response = sagemaker_session.sagemaker_runtime_client.invoke_endpoint(
        EndpointName=predictor.endpoint_name,
        ContentType="text/csv",
        Body=payload.to_csv(index=False)
    )

    # deserialize the endpoint response
    response = deserializer.deserialize(response["Body"], content_type="text/csv")

    # extract the predicted 30-day return
    prediction = response.iloc[-1:]

    # extract the date corresponding to the predicted 30-day return
    prediction.index = [test_dataset.index[t + prediction_length - 1]]

    # save the prediction
    predictions.append(prediction)

# cast the predictions to data frame
predictions = pd.concat(predictions)

# add the actual values
predictions["y"] = test_dataset["y"]

Actual and predicted 30-day returns over the test set (from 2023-12-04 to 2024-06-28).

We evaluate the test set predictions using the following metrics:

  • The root mean squared error (RMSE) of the predicted values of the returns.

  • The mean absolute error (MAE) of the predicted values of the returns.

  • The accuracy of the predicted signs of the returns.

  • The F1 score of the predicted signs of the returns.

# calculate the model performance metrics
metrics = pd.DataFrame(
    columns=["Metric", "Value"],
    data=[
        {"Metric": "RMSE", "Value": root_mean_squared_error(y_true=predictions["y"], y_pred=predictions["y_mean"])},
        {"Metric": "MAE", "Value": mean_absolute_error(y_true=predictions["y"], y_pred=predictions["y_mean"])},
        {"Metric": "Accuracy", "Value": accuracy_score(y_true=predictions["y"] > 0, y_pred=predictions["y_mean"] > 0)},
        {"Metric": "F1", "Value": f1_score(y_true=predictions["y"] > 0, y_pred=predictions["y_mean"] > 0)},
    ]
)

We find that the model achieves a mean absolute error of 1.4% and a mean directional accuracy of 95.8% on the test set.


Performance metrics of predicted 30-day returns over the test set (from 2023-12-04 to 2024-06-28).

We can now delete the model and the endpoint.

# delete the model
predictor.delete_model()

# delete the endpoint
predictor.delete_endpoint(delete_endpoint_config=True)

Forecasting

We now retrain the model using all the available data and generate the out-of-sample forecasts, that is, we predict the 30-day returns over the 30 business days beyond the last available date (2024-06-28).

# upload the training data to S3
data = sagemaker_session.upload_string_as_file_body(
    body=dataset.to_csv(index=False),
    bucket=bucket,
    key="dataset.csv"
)

# create the estimator
estimator = sagemaker.algorithm.AlgorithmEstimator(
    algorithm_arn=algo_arn,
    role=role,
    instance_count=1,
    instance_type=instance_type,
    input_mode="File",
    sagemaker_session=sagemaker_session,
    hyperparameters=hyperparameters
)

# run the training job
estimator.fit({"training": data})

Given that we only need a single predicted 30-day sequence, we use batch transform for generating the forecasts. The forecasts are saved to a CSV file in S3 with the same name as the input CSV file but with the ".out" file extension.

# upload the input data to S3
inputs = sagemaker_session.upload_string_as_file_body(
    body=dataset.iloc[- context_length:].to_csv(index=False),
    bucket=bucket,
    key="inputs.csv"
)

# create the transformer
transformer = estimator.transformer(
    instance_count=1,
    instance_type=instance_type,
)

# run the transform job
transformer.transform(
    data=inputs,
    content_type="text/csv",
)

After the batch transform job has been completed, we can load the forecasts from S3.

# download the forecasts from S3
forecasts = sagemaker_session.read_s3_file(
    bucket=bucket,
    key_prefix=f"{transformer.latest_transform_job.name}/inputs.csv.out"
)

# cast the forecasts to data frame
forecasts = pd.read_csv(io.StringIO(forecasts), dtype=float).dropna()

# add the forecast dates
forecasts.index = pd.date_range(
    start=dataset.index[-1] + pd.Timedelta(days=1),
    periods=prediction_length,
    freq=CustomBusinessDay(calendar=USFederalHolidayCalendar())
)
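
The forecasts in the figure below can be visualized in the same way. The sketch below also shades one predicted standard deviation around the mean; it assumes that the output file contains a "y_mean" column, as in the endpoint response above, and a "y_std" column with the predicted standard deviation (the "y_std" name is our assumption):

# plot the forecasted mean of the 30-day returns
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(forecasts.index, forecasts["y_mean"], label="Forecast")

# shade one predicted standard deviation around the mean ("y_std" is assumed)
ax.fill_between(
    forecasts.index,
    forecasts["y_mean"] - forecasts["y_std"],
    forecasts["y_mean"] + forecasts["y_std"],
    alpha=0.3,
)
ax.set_ylabel("30-day return")
ax.legend()
plt.show()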

30-day returns out-of-sample forecasts (from 2024-07-01 to 2024-08-12).

We can now delete the model.

# delete the model
transformer.delete_model()

Tip

You can download the notebook with the full code from our GitHub repository.

References

[1] Kumbure, M.M., Lohrmann, C., Luukka, P. and Porras, J., (2022). Machine learning techniques and data for stock market forecasting: A literature review. Expert Systems with Applications, 197, p. 116659. doi: 10.1016/j.eswa.2022.116659.

[2] Campisi, G., Muzzioli, S. and De Baets, B., (2024). A comparison of machine learning methods for predicting the direction of the US stock market on the basis of volatility indices. International Journal of Forecasting, 40(3), pp. 869-880. doi: 10.1016/j.ijforecast.2023.07.002.

[3] Chow, T.W. and Li, X.D., (2000). Modeling of continuous time dynamical systems with input by recurrent neural networks. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 47(4), pp. 575-578. doi: 10.1109/81.841860.

[4] Funahashi, K.I. and Nakamura, Y., (1993). Approximation of dynamical systems by continuous time recurrent neural networks. Neural Networks, 6(6), pp. 801-806. doi: 10.1016/S0893-6080(05)80125-X.

[5] Hasani, R., Lechner, M., Amini, A., Rus, D. and Grosu, R., (2021). Liquid time-constant networks. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), pp. 7657-7666. doi: 10.1609/aaai.v35i9.16936.

[6] Hasani, R., Lechner, M., Amini, A., Liebenwein, L., Ray, A., Tschaikowski, M., Teschl, G. and Rus, D., (2022). Closed-form continuous-time neural networks. Nature Machine Intelligence, 4(11), pp. 992-1003. doi: 10.1038/s42256-022-00556-7.