I’ve just released v0.1.0 of a Rust-based backtesting engine on:
It’s a rewrite and enhancement of the previous version written in Python, so in terms of application-level architecture you can follow the schematic at the top of that article as a starting reference.
Why bother rewriting it in Rust from Python?#
“If it ain’t broke don’t fix it”, right? It wasn’t broken but could be better.
Python and its ecosystem of libraries are king when it comes to rapid prototyping, especially in domains where numbers matter and decisions need to be made based on quantitative data, e.g.:
- analytics
- quantitative finance
- sciences
- engineering
but when building with production in mind, with performance a concern from the beginning, it cannot compete with a compiled language like Rust. Some of the enhancements mentioned above involve fiddling with threads, references, memory management, networking, and other bits that Python is not designed to handle – it’s an interpreted language after all.
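To make that concrete, here’s a minimal sketch (not code from the engine itself, and the data is made up) of the kind of thing Rust makes cheap and safe: worker threads sharing a read-only price series without copies or a global interpreter lock, with the borrow checker verifying the references at compile time.

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // Shared, read-only price series; Arc gives each thread a cheap
    // reference-counted handle instead of a copy of the data.
    let prices: Arc<Vec<f64>> = Arc::new((0..1_000_000).map(|i| i as f64).collect());

    // Split the summing work across four threads.
    let handles: Vec<_> = (0..4)
        .map(|worker| {
            let prices = Arc::clone(&prices);
            thread::spawn(move || {
                let chunk = prices.len() / 4;
                prices[worker * chunk..(worker + 1) * chunk]
                    .iter()
                    .sum::<f64>()
            })
        })
        .collect();

    let total: f64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    println!("total = {total}");
}
```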
Why Rust and not C++ for a finance-focused application?#
Rust performs like low-level languages such as C and C++, while also providing the higher-level abstractions of interpreted languages like Ruby, Python, and JavaScript. Its ecosystem is not as mature as Python’s or C++’s, but the language itself gives you the tools to fill the gaps.
I also considered other candidates like Go, C#, and Java, but I needed a language that allows for fine-tuned performance optimisations down the line if needed.
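As a small illustration of “high-level abstractions at low-level cost” (a generic sketch, not code from this project): an iterator pipeline like the moving average below reads almost like Python, yet the compiler boils it down to a tight loop with no interpreter and no intermediate allocations.

```rust
/// Simple moving average over a price series, written with iterator
/// combinators; the abstraction costs nothing at runtime.
fn simple_moving_average(prices: &[f64], window: usize) -> Vec<f64> {
    prices
        .windows(window)
        .map(|w| w.iter().sum::<f64>() / window as f64)
        .collect()
}

fn main() {
    let prices = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5];
    println!("{:?}", simple_moving_average(&prices, 3));
}
```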
Before diving into the application architecture, an overview of the system architecture is fitting.
System Architecture#
%%{init: {"flowchart": {"useMaxWidth": true}, "themeVariables": {"fontSize": "32px"}}}%%
flowchart LR
    U[Users / Clients]
    LB[Proxy Server & Load Balancer]
    subgraph CI[CI & CD]
        C1["CI Server (Build & Test)"]
        C2["CD Server (Deploy)"]
    end
    subgraph APP[Application Servers]
        A1[App Server #1]
        A2[App Server #2]
        A3[App Server #N]
    end
    API[API]
    DB[(Database)]
    U --> LB
    LB --> A1
    LB --> A2
    LB --> A3
    A1 --> API
    A2 --> API
    A3 --> API
    API --> DB
    C1 --> C2
    C2 -.->|Deploy| A1
    C2 -.->|Deploy| A2
    C2 -.->|Deploy| A3
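To make the proxy/load-balancer box more concrete, below is a deliberately simplified, hypothetical Rust sketch of the round-robin idea: connections arriving at the front are handed to the application servers in turn. It is not the actual proxy used here, just the core mechanism.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

fn main() -> std::io::Result<()> {
    // Hypothetical upstream addresses for App Server #1..#N in the diagram.
    let upstreams = ["127.0.0.1:8081", "127.0.0.1:8082", "127.0.0.1:8083"];
    let mut next = 0usize;

    let listener = TcpListener::bind("0.0.0.0:8080")?;
    for client in listener.incoming() {
        let mut client = client?;

        // Pick the next upstream in round-robin order.
        let target = upstreams[next % upstreams.len()];
        next += 1;

        // Forward the request bytes and relay the response back.
        // A real proxy would stream both directions concurrently and handle
        // keep-alive, timeouts, and partial reads; this shows only the idea.
        let mut buf = [0u8; 4096];
        let n = client.read(&mut buf)?;
        let mut upstream = TcpStream::connect(target)?;
        upstream.write_all(&buf[..n])?;

        let mut response = Vec::new();
        upstream.read_to_end(&mut response)?;
        client.write_all(&response)?;
    }
    Ok(())
}
```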
Pros#
I’m fond of this structure because it encourages system-wide separation of concerns:
- proxy server doing proxy server things and load balancing;
- “dumb” application servers only focused on serving the apps to users;
- “smarter” API only focused on serving back data to the app and interacting with the database;
- ease of development and debugging;
- scalability – can add more application servers if the load demands it;
- resilience – if one of the application servers is down another can pick up the slack (*);
- high availability – follows from the same consideration as the resilience point;
- security – the API and database sit behind multiple layers.
(*) This helps with fault tolerance and redundancy only partially, as everything runs on the same machine. For proper redundancy another machine would be required.
Cons#
- can be hard to reason about all the moving parts;
- harder to set up;
- single point of failure in the load balancer.
The last point can be mitigated by making the load balancer itself redundant, but the resilience caveat applies here too: if everything runs on the same machine and that machine goes down for whatever reason, e.g. a power outage, the whole thing stops working.
DevOps#
The Continuous Integration (CI) & Continuous Delivery (CD) flows run parallel to the traffic/production flow and are only relevant during development and testing. It comes together like this: new code is pushed to the CI server, which runs the test suite against both the new and the pre-existing code; if the tests pass, the new code is accepted and merged into the rest of the codebase.
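In a Rust project the “test suite” step is essentially `cargo build` plus `cargo test`. As a toy example of the kind of unit test that would gate a merge (the `net_pnl` helper is made up for illustration, not part of the engine’s API):

```rust
/// Hypothetical helper: net profit and loss over a list of trades,
/// where each trade is (entry price, exit price, quantity).
pub fn net_pnl(trades: &[(f64, f64, f64)]) -> f64 {
    trades
        .iter()
        .map(|&(entry, exit, qty)| (exit - entry) * qty)
        .sum()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn net_pnl_sums_wins_and_losses() {
        // One winning trade (+10) and one losing trade (-5).
        let trades = [(100.0, 110.0, 1.0), (50.0, 45.0, 1.0)];
        assert_eq!(net_pnl(&trades), 5.0);
    }
}
```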
Once the CI server green-lights the changes, the CD flow can take over and deploy them to the live environment. Note: you can have Continuous Delivery without the deployment being triggered automatically. To mitigate risk I prefer to manually click the deploy button, even if the actual deployment is automated.
To container or not to container; that is the question#
I wanted to be trendy and containerise everything, since Docker seems to be what everyone is using today, but I ran a cost-benefit analysis and noped out: it adds too much system complexity and brings no benefit in terms of performance, ease of development, or security – in fact it negatively impacts those.
And that’s it for the system architecture.
In the next article we will cover the application-level architecture.
Feel free to share the article on socials.