ABIDES-Gym: Gym Environments for Multi-Agent Discrete Event Simulation and Application to Financial Markets
Amrouni, Selim; Moulin, Aymeric; Vann, Jared; Vyetrenko, Svitlana; Balch, Tucker; Veloso, Manuela
doi: N/A
pmid: N/A
Abstract: Model-free Reinforcement Learning (RL) requires the ability to sample trajectories by taking actions in the original problem environment or a simulated version of it. Breakthroughs in the field of RL have been largely facilitated by the development of dedicated open-source simulators with easy-to-use frameworks such as OpenAI Gym and its Atari environments. In this paper we propose to use the OpenAI Gym framework for discrete-event, time-based Discrete Event Multi-Agent Simulation (DEMAS). We introduce a general technique to wrap a DEMAS simulator into the Gym framework. We describe the technique in detail and implement it using the simulator ABIDES as a base. We apply this work using the markets extension of ABIDES, ABIDES-Markets, and develop two benchmark financial-market OpenAI Gym environments for training daily investor and execution agents. These two environments describe classic financial problems with a complex, interactive market response to the experimental agent's actions.
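The wrapping idea in this abstract can be illustrated with a minimal sketch. It is not the ABIDES-Gym implementation: the class `ToyDEMASEnv`, its event kinds, the deterministic price drift, and the mark-to-market reward are all hypothetical, and the code only imitates the classic Gym `reset`/`step` interface using the standard library. The point it shows is that the event loop runs until the experimental agent's wakeup, pauses, and hands control back to the caller.

```python
import heapq
from typing import Dict, List, Tuple

Obs = Tuple[int, int, int]  # (time, price, position)


class ToyDEMASEnv:
    """Hypothetical Gym-style wrapper around a toy discrete event simulation."""

    def __init__(self, horizon: int = 6, wake_every: int = 2) -> None:
        self.horizon = horizon        # last simulated time step
        self.wake_every = wake_every  # spacing of the experimental agent's wakeups

    def _schedule(self, time: int, kind: str) -> None:
        # The sequence number breaks ties so simultaneous events keep insertion order.
        heapq.heappush(self._events, (time, self._seq, kind))
        self._seq += 1

    def _run_until_wakeup(self) -> bool:
        """Process events in time order; return True when paused at a wakeup."""
        while self._events:
            self.time, _, kind = heapq.heappop(self._events)
            if kind == "background":
                self.price += 1       # assumed deterministic background activity
            else:                     # "wakeup": hand control back to the caller
                return True
        return False                  # event queue exhausted: episode is over

    def _obs(self) -> Obs:
        return (self.time, self.price, self.position)

    def _value(self) -> int:
        return self.cash + self.position * self.price

    def reset(self) -> Obs:
        self._events: List[Tuple[int, int, str]] = []
        self._seq = 0
        self.time, self.price, self.position, self.cash = 0, 100, 0, 0
        for t in range(1, self.horizon + 1):
            self._schedule(t, "background")
        self._schedule(self.wake_every, "wakeup")
        self._run_until_wakeup()
        return self._obs()

    def step(self, action: int) -> Tuple[Obs, int, bool, Dict]:
        """action: +1 buys one unit, -1 sells one unit, 0 holds."""
        self.position += action
        self.cash -= action * self.price
        value_before = self._value()
        next_wake = self.time + self.wake_every
        if next_wake <= self.horizon:
            self._schedule(next_wake, "wakeup")
        paused = self._run_until_wakeup()
        reward = self._value() - value_before  # mark-to-market change
        return self._obs(), reward, not paused, {}


env = ToyDEMASEnv()
obs = env.reset()
done = False
while not done:
    obs, reward, done, _ = env.step(1 if obs[2] == 0 else 0)  # buy once, then hold
    print(obs, reward, done)
```

In this sketch the simulator stays the owner of time: `step` does not advance a fixed clock but resumes the event queue, so any number of background events can occur between two actions of the experimental agent.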
Towards a fully RL-based Market Simulator
Ardon, Leo; Vadori, Nelson; Spooner, Thomas; Xu, Mengda; Vann, Jared; Ganesh, Sumitra
doi: N/A
pmid: N/A
Abstract: We present a new financial framework where two families of RL-based agents, representing the Liquidity Providers and Liquidity Takers, learn simultaneously to satisfy their objectives. Thanks to a parametrized reward formulation and the use of Deep RL, each group learns a shared policy able to generalize and interpolate over a wide range of behaviors. This is a step towards a fully RL-based market simulator that replicates complex market conditions and is particularly suited to studying the dynamics of the financial market under various scenarios.
A sentiment-based modeling and analysis of stock price during the COVID-19: U- and Swoosh-shaped recovery
Rai, Anish; Mahata, Ajit; Nurujjaman, Md.; Majhi, Sushovan; Debnath, Kanish
doi: N/A
pmid: N/A
Abstract: Recently, a stock price model was proposed by A. Mahata et al. [Physica A, 574, 126008 (2021)] to understand the effect of COVID-19 on the stock market. It describes the V- and L-shaped recovery of stocks and indices, but fails to simulate the U- and Swoosh-shaped recovery that arises from a sharp crisis and prolonged drop followed by a quick recovery (U-shaped) or a slow recovery over a longer period (Swoosh-shaped). We propose a modified model by introducing a new variable $\theta$ that quantifies the sentiment of the investors: $\theta=+1,~0,~-1$ for positive, neutral and negative sentiment, respectively. The model explains the movement of sectoral indices with positive $\phi$ showing U- and Swoosh-shaped recovery. Simulations using synthetic fund-flow ($\Psi_{st}$) with different shock lengths ($T_S$), $\phi$, negative sentiment periods ($T_N$) and portions of fund-flow ($\lambda$) during the recovery period show U- and Swoosh-shaped recovery. The results show that the recovery of the indices with positive $\phi$ becomes very weak as $T_S$ and $T_N$ are extended. Stocks with higher $\phi$ and $\lambda$ recover quickly. The simulations of the Nifty Bank, Nifty Financial and Nifty Realty show U-shaped recovery, and that of the Nifty IT shows Swoosh-shaped recovery. The simulation results are consistent with the real stock price movement. The time-scales ($\tau$) of the shock and recovery of these indices during COVID-19 are consistent with the duration of the change in negative sentiment from the onset of COVID-19. This study may help investors plan their investments during different crises.
Deep Learning for Principal-Agent Mean Field Games
Campbell, Steven; Chen, Yichao; Shrivats, Arvind; Jaimungal, Sebastian
doi: N/A
pmid: N/A
Abstract: Here, we develop a deep learning algorithm for solving Principal-Agent (PA) mean field games with market-clearing conditions, a class of problems that has thus far not been studied and that poses difficulties for standard numerical methods. We use an actor-critic approach to optimization, where the agents form a Nash equilibrium according to the principal's penalty function, and the principal evaluates the resulting equilibrium. The inner problem's Nash equilibrium is obtained using a variant of the deep backward stochastic differential equation (BSDE) method modified for McKean-Vlasov forward-backward SDEs that includes dependence on the distribution over both the forward and backward processes. The outer problem's loss is further approximated by a neural net by sampling over the space of penalty functions. We apply our approach to a stylized PA problem arising in Renewable Energy Certificate (REC) markets, where agents may rent clean energy production capacity, trade RECs, and expand their long-term capacity to navigate the market at maximum profit. Our numerical results illustrate the efficacy of the algorithm and lead to interesting insights into the nature of optimal PA interactions in the mean-field limit of these markets.
Bank transactions embeddings help to uncover current macroeconomics
Begicheva, Maria; Zaytsev, Alexey
doi: N/A
pmid: N/A
Abstract: Macroeconomic indexes are of high importance for banks: many risk-control decisions utilize these indexes. The typical workflow for evaluating these indexes is costly and protracted, with a lag of a couple of months between the actual date and the available index. Banks currently predict such indexes using autoregressive models to make decisions in a rapidly changing environment. However, autoregressive models fail in complex scenarios such as the onset of crises. We propose to use clients' financial transaction data from a large Russian bank to obtain such indexes. Transaction histories are long, and the number of clients is huge, so we develop an efficient approach that allows fast and accurate estimation of macroeconomic indexes from a stream of millions of transactions. The approach uses a neural network paradigm and a smart sampling scheme. The results show that our neural network approach outperforms the baseline method built on hand-crafted features derived from transactions. The calculated embeddings show the correlation between clients' transaction activity and bank macroeconomic indexes over time.
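For readers unfamiliar with the autoregressive baseline mentioned in this abstract, the following is a minimal sketch of a first-order autoregressive model, x_t = c + a * x_{t-1}, fitted by ordinary least squares. The function names `fit_ar1` and `forecast_ar1` and the synthetic series are assumptions for illustration; the abstract does not specify which autoregressive models banks use, and nothing here reproduces the paper's method or data.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """Least-squares fit of x_t = c + a * x_{t-1}; returns (c, a)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    a = cov / var_prev            # slope: autoregressive coefficient
    c = mean_curr - a * mean_prev  # intercept
    return c, a


def forecast_ar1(last: float, c: float, a: float, steps: int) -> List[float]:
    """Iterate the fitted recursion forward from the last observed value."""
    out = []
    for _ in range(steps):
        last = c + a * last
        out.append(last)
    return out


# Synthetic index generated by x_t = 1 + 0.5 * x_{t-1} (illustrative only).
series = [0.0]
for _ in range(10):
    series.append(1.0 + 0.5 * series[-1])

c, a = fit_ar1(series)
print(c, a, forecast_ar1(series[-1], c, a, steps=2))
```

Because such a model extrapolates only from the index's own past values, it has no way to react to a regime change until the lagged index data reflect it, which is the limitation the abstract points to.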