Are full stack web frameworks worth it?

Full stack web frameworks, like Reflex (Python) or Vaadin (Java), enable backend-focused developers to build web applications, including the frontend UI, using only a backend language such as Python or Java. These frameworks include a (large) UI component library and promise that you need no knowledge of frontend technologies. However, they come with major caveats that should give you pause before you adopt one. In this article, I present my journey using Reflex and discuss both Reflex-specific and general caveats of full stack web frameworks.

Table Of Contents

Introduction
Case study: Reflex
General caveats of full-stack web application frameworks
Conclusion

Introduction

If you have a team to create a web application (with a frontend and backend), you typically have specialists such as frontend and backend engineers, who know the corresponding programming languages/frameworks/build tools in detail. But if you only have backend expertise (e.g. when working alone), you might still want to develop a web application, but you likely don’t have the time (or the desire) to learn the languages, frameworks, and tools of the frontend world. Instead, you want to do everything in the backend language you already know (e.g. Python, Java, C#, Go, or PHP).

After researching, you’ll quickly find frameworks that solve exactly this need, e.g. Streamlit, Anvil or Reflex (for Python), Vaadin Flow (for Java), Fyne (for Golang), Uno or Blazor (for C#/.NET) or Laravel Livewire (for PHP). These frameworks promise that you can define both the frontend and backend in your favorite (backend-oriented) programming language, offering a (sometimes large) UI component library, and making it easy to call the backend from the frontend (and vice versa).

But is it a good idea to use such frameworks? If these frameworks were a panacea, wouldn’t everyone use them?

As this article explains, there are serious concerns when using these frameworks, from a senior developer’s perspective. As a case study, I share my hands-on experience of developing a web app, Docker Tag Monitor, in Python using Reflex. But I also offer a more general perspective regarding the caveats you need to look out for.

Case study: Reflex

Having worked with Reflex for several months now, I am generally happy with it, but I have also encountered several drawbacks.

Technical internals of Reflex

Before I discuss these drawbacks, I will briefly present how Reflex, a full-stack web framework, works.

Reflex is written in Python, targeting Python (backend) developers who want to build a backend and frontend, using UI components from Reflex’s large UI component catalog.

This is a minimal example of a web app created using Reflex:

import reflex as rx

# main.py - here we define the ASGI application (Reflex's backend is based on FastAPI)
app = rx.App()


# pages.py
# Defines one or more views, which need to return a single "root" rx.Component
@rx.page(route="/", title="Demo")
def index() -> rx.Component:
    # hstack = horizontal stack --> the first N (non-keyword) args are the immediate children
    # that all appear in one (horizontal) row
    return rx.hstack(
        rx.button("Decrement", on_click=CountState.decrement),
        rx.heading(CountState.count),  # updates the HTML whenever the value of the CountState.count attribute changes
        rx.button("Increment", on_click=CountState.increment),
        spacing="3",  # many other configuration kwargs exist, e.g. related to responsive design
    )


# state.py
# Server maintains one individual State instance per user (session)
class CountState(rx.State):
    count: int = 0  # state attribute which is observed by the frontend (rx.heading)

    # Event handler methods which manipulate state and may perform HTTP calls / DB queries.
    # This is the "backend" logic!
    @rx.event
    def increment(self):
        self.count += 1

    @rx.event
    def decrement(self):
        self.count -= 1

Under the hood:

Reflex transpiles those parts of your Python code that implement the frontend into React/TypeScript components, producing a Single-Page-Application. If you run the Reflex dev-server (“reflex run”), this transpilation happens on-demand, but for a production deployment, you can precompile the frontend to an HTML/JS bundle and serve that bundle with a web server such as Nginx.
The backend is based on FastAPI. While you could create (and call) “normal” FastAPI endpoints/routes from your frontend, the “Reflex way” of calling the backend is to declare one (or more) State classes, with attributes (“variables”) and methods. Whenever a browser client opens the web app, it establishes a persistent WebSocket connection to the backend, which starts a stateful session on the backend (instantiating a session-specific instance of that State class). When the user clicks a frontend button and you want this action to call the backend, you register one of the State’s methods as the on_click handler. That State method can do whatever you want (e.g. perform HTTP requests or query a database) and update the State’s attributes. You configure your frontend’s UI components to reactively observe these State attributes, causing your DOM to change accordingly. There are functions such as rx.cond() (conditional rendering) or rx.foreach() to dynamically build the tree of UI components, based on the attribute values. Each UI component supports different kinds of events (e.g. on_click for a button, on_change for an <input> field, etc.). Whenever one of these events fires, the React component sends a JSON object (representing that event) over the WebSocket connection to the server, which routes it to your registered State handler method. If the content of any State attribute changes as a result of running a handler method, these changes are streamed back to the client via the same WebSocket connection.
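To make this round trip more tangible, here is a minimal plain-Python sketch of the idea (no Reflex, no real WebSocket). It is a conceptual model only, not Reflex's actual implementation: the class CounterState, the function handle_event, the session keys and the JSON message shape are all made up for illustration.

```python
import json


class CounterState:
    """Hypothetical stand-in for a per-session state object (not rx.State)."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self) -> None:
        self.count += 1

    def decrement(self) -> None:
        self.count -= 1


def handle_event(state: CounterState, raw_event: str) -> str:
    """Route one JSON event to a handler method and return the changed attributes as JSON."""
    event = json.loads(raw_event)
    before = dict(vars(state))  # snapshot of the attribute values before the handler runs
    handler = getattr(state, event["handler"])
    handler(**event.get("payload", {}))
    # Only attributes whose value changed are sent back to the browser.
    delta = {key: value for key, value in vars(state).items() if before.get(key) != value}
    return json.dumps({"delta": delta})


# One state instance per browser session, keyed by an assumed session token.
sessions = {"session-a": CounterState(), "session-b": CounterState()}

# Two clicks on "Increment" in session A; session B is not affected.
reply_1 = handle_event(sessions["session-a"], json.dumps({"handler": "increment"}))
reply_2 = handle_event(sessions["session-a"], json.dumps({"handler": "increment"}))
print(reply_1)
print(reply_2)
print(sessions["session-b"].count)
```

The point of the sketch is the shape of the communication: the browser never calls your Python function directly. It sends an event description, and it receives a description of what changed in the session's state.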

In the following sections, I’ll discuss issues I’ve encountered while working with Reflex.

Steep learning curve

I encountered several challenges both in the frontend and backend:

In the frontend, it took me a while to realize that State attributes behave differently, depending on whether I access them from the backend or frontend code. For instance, suppose you declare a State attribute such as dates: list[datetime.datetime] = [] and fill it with a list of datetime objects in the backend (via a State method). Inside (backend) State methods, you can call functions on the datetime objects, e.g. datetime.strftime(), because you are working with actual Python objects of that class. But Reflex will only stream/serialize primitive data types (e.g. int, str, …), or compositions of them (e.g. lists, dicts or dataclasses), to the frontend view code. Reflex would implicitly convert non-serializable data types (like datetime) to a string and show weird error messages if I tried to call a function (such as datetime.strftime()) on the (supposed) “date object” from the frontend code.
- Another frontend-related issue I had was this: the syntax you need to use for rx.cond() is not intuitive! In the frontend code, you cannot use typical “Pythonic” expressions such as “len(State.dates) > 3 or State.some_bool”, because Reflex transpiles such expressions on State attributes to some Observable JavaScript code. The Reflex documentation’s search (“Ask AI”) will happily lie to you when you ask it 😂 and claim that such statements (e.g. len()) would work (lying documentation is always bad…). While Reflex supports transpiling some simple boolean logic operators in rx.cond(), e.g. AND/OR/negation or comparing numbers, you must extract more complicated conditions into a dedicated (boolean) State attribute and set its value in the backend method.
In the backend, I found that accessing a relational database was more complicated than I thought. I was used to Django’s ORM, which is well documented on a single documentation site. With Reflex, however, once you need features beyond the absolute basics, you need to understand the entire stack of frameworks that Reflex uses under the hood (and find the correct documentation pages). This increases the mental load. Here are concrete examples:
- Reflex has a thin layer on top of the SQLModel library (see Reflex docs). This layer establishes a database connection (session object) with a few basic methods (exec(), add(), delete(), commit()) and allows you to define data models (by defining classes that inherit from rx.Model).
- To write more advanced queries, you’ll need to learn SQLModel more deeply. See e.g. here for how to retrieve the count of objects.
- For even more advanced features, you need to learn SQLAlchemy (which SQLModel is based on), see e.g. here for how to define a unique constraint covering multiple columns.
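Coming back to the rx.cond() issue described above: the following plain-Python sketch illustrates why expressions like len(...) or the or keyword cannot be transpiled, while overloadable operators can. It is a toy model, not Reflex's real implementation. The class Var, its length() method and the generated JavaScript strings are hypothetical and only chosen for illustration.

```python
class Var:
    """Toy placeholder for a frontend-side value; it only records a JavaScript expression."""

    def __init__(self, js: str) -> None:
        self.js = js

    def __gt__(self, other: object) -> "Var":
        return Var(f"({self.js} > {other!r})")

    def __or__(self, other: "Var") -> "Var":
        return Var(f"({self.js} || {other.js})")

    def length(self) -> "Var":
        return Var(f"{self.js}.length")

    def __bool__(self) -> bool:
        # Python's `and` / `or` / `if` call this, but the real value only exists in the browser.
        raise TypeError(f"cannot evaluate {self.js} in Python; it only exists in the browser")


dates = Var("state.dates")
some_bool = Var("state.some_bool")

# Overloadable operators (>, |) can be recorded and turned into a JavaScript expression ...
condition = (dates.length() > 3) | some_bool
print(condition.js)

# ... but `len()` and `or` cannot: Python insists on a real int / bool at definition time.
errors = []
for expression in (lambda: len(dates) > 3, lambda: (dates.length() > 3) or some_bool):
    try:
        expression()
    except TypeError as exc:
        errors.append(str(exc))
print(errors)
```

When a condition cannot be expressed with such operators, the workaround remains the one mentioned above: compute the result in a backend State method and expose it to the frontend as a plain boolean attribute.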

Limited frontend customizability

While developing the Docker Tag Monitor, which uses vertical bar charts (see Reflex docs), I discovered that it is not possible to set the height property of any Reflex chart to a dynamic value. You must hard-code the height. This makes no sense because the number of rows or bars I want to show is dynamic (depending on the number of rows in a database table). I asked for help on Discord and never got a response. I have since filed an issue; let’s see how it goes. Until then, enjoy huge ugly bars 😆

Difficult debugging

I remember two “simple” things that didn’t work, even though they should have been straightforward.

Example 1: A few months ago, I was unable to upgrade to a newer Reflex version (I think it was 0.6.4), because there were always errors regarding the installation of Node. Under the hood, Reflex downloads and installs several tools (e.g. fnm and Bun) to manage a local Node.js installation. Somewhere deep down the stack, one of these tools had problems downloading the pinned Node.js version. I had to wait until a newer Reflex version came out, which bumped the pinned Node.js version, effectively resolving the problem.

Example 2: I wanted to implement a form where the backend generates a list of items (from a database query) that should be shown as checkboxes in the frontend. Because there could be many entries, I also wanted to add a “select/unselect all” checkbox, so that the user can click it and then check only a few options before submitting the form. Implementing this “select/unselect all” checkbox turned out to be very time-consuming (~4 hours!) for two reasons:

Reason 1: Reflex frontend components cannot manipulate other frontend components directly, because they cannot reference them. Instead, it took me a while to realize that any interaction between frontend components happens indirectly via (backend) State attributes. Consequently, I had to change my State attribute to not just be a simple list of strings, but a list of objects where each object stores the checkbox name and whether it is checked. The code looked as follows:

from dataclasses import dataclass

import reflex as rx


@dataclass
class MyCheckbox:
    value: str
    checked: bool


class State(rx.State):
    checkboxes: list[MyCheckbox] = []

    @rx.event
    async def handle_submit(self, form_data: dict):
        print(str(form_data))

    @rx.event
    async def on_checkall_change(self, checked: bool):
        for cb in self.checkboxes:
            cb.checked = checked

    def populate_data(self):
        self.checkboxes.clear()
        self.checkboxes.append(MyCheckbox(value="1", checked=True))
        self.checkboxes.append(MyCheckbox(value="2", checked=False))
        self.checkboxes.append(MyCheckbox(value="3", checked=True))

    @rx.event
    async def set_checkbox(self, index: int, checked: bool):
        print(f"set_checkbox: {index} / {checked}")
        self.checkboxes[index].checked = checked


def index() -> rx.Component:
    return rx.container(
        rx.form.root(
            rx.vstack(
                rx.checkbox("Select/unselect all", name="checkall", on_change=State.on_checkall_change),
                rx.foreach(State.checkboxes,
                           lambda field, idx: rx.hstack(
                               rx.checkbox(text=field.value, name=field.value, checked=State.checkboxes[idx].checked,
                                           on_change=lambda checked: State.set_checkbox(idx, checked)
                                           )
                           )
                           ),
                rx.button("Submit", type="submit"),

            ),
            on_submit=State.handle_submit,
            reset_on_submit=False,
        ),
    )


app = rx.App()
app.add_page(index, on_load=State.populate_data)

Reason 2: The above code, while correct according to the docs, did not work. No checkbox changed its state when clicked, and the “select / unselect all” checkbox did nothing either. Fortunately, I got quick help on Discord. I only needed to replace lines 6-7 of the listing (the @dataclass decorator and the class MyCheckbox: declaration): instead of using a dataclass, MyCheckbox needed to inherit from rx.Base. According to the docs, it should not matter whether I use dataclass or rx.Base to define my custom data structures for the State. But in practice, dataclasses don’t properly implement “inner-model change tracking” yet!
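To illustrate what “inner-model change tracking” means in general, here is a small plain-Python sketch. It is not how Reflex implements it. The classes TrackingState and TrackedCheckbox and the dirty set are hypothetical names. The sketch only shows why a state object that watches assignments on itself does not notice a mutation inside a nested plain object.

```python
from dataclasses import dataclass


class TrackingState:
    """Toy state that only notices assignments made directly on itself."""

    def __init__(self) -> None:
        object.__setattr__(self, "dirty", set())

    def __setattr__(self, name: str, value: object) -> None:
        self.dirty.add(name)  # remember which attribute must be re-sent to the browser
        object.__setattr__(self, name, value)


@dataclass
class PlainCheckbox:
    value: str
    checked: bool


class TrackedCheckbox:
    """Toy model that reports its own mutations to the owning state."""

    def __init__(self, owner: TrackingState, field: str, value: str, checked: bool) -> None:
        object.__setattr__(self, "_owner", owner)
        object.__setattr__(self, "_field", field)
        object.__setattr__(self, "value", value)
        object.__setattr__(self, "checked", checked)

    def __setattr__(self, name: str, value: object) -> None:
        object.__setattr__(self, name, value)
        self._owner.dirty.add(self._field)  # tell the state that its attribute changed


state = TrackingState()

# Case 1: plain dataclass. The mutation happens inside the nested object, unnoticed.
state.checkboxes = [PlainCheckbox("1", False)]
state.dirty.clear()
state.checkboxes[0].checked = True
dirty_plain = set(state.dirty)

# Case 2: tracked model. The nested object notifies the state about the change.
state.checkboxes = [TrackedCheckbox(state, "checkboxes", "1", False)]
state.dirty.clear()
state.checkboxes[0].checked = True
dirty_tracked = set(state.dirty)

print(dirty_plain, dirty_tracked)
```

In the first case nothing is marked as changed, so a framework built this way would have nothing to send to the browser, which matches the symptom I saw: the handler ran, but the checkboxes did not update.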

Broken hot-reloading

The development server of Reflex (started via “reflex run”) supports hot-reloading on every OS. Whenever you change Python code, the dev-server quickly re-transpiles only the changed files. Hot reloading is vital to get a smooth and reactive development workflow (since a “cold start” can take 10 seconds or longer).

I developed Docker Tag Monitor on my Windows 11 workstation. The hot reloading would work after a fresh boot of Windows for a few minutes, but then quickly deteriorate. Hot reloading stopped working, and the interrupt signal (Ctrl+C, which sends SIGINT rather than SIGTERM) was no longer processed correctly either (the process would just hang, such that I had to manually kill all related processes).

A workaround is to use WSL, where these problems do not seem to occur.

Unclear scalability

To me, it is unclear how well Reflex scales to a large number of concurrent users.

For one, the Reflex docs don’t explain how much load is generated by the client side. For instance, how long does the generated JavaScript code keep a user’s WebSocket connection open? Maybe it is closed once the browser tab is no longer active, or after a certain timeout? Who knows.

As for scaling the backend:

Python (in general) still scales poorly on multi-core hardware due to Python’s Global Interpreter Lock. To work around this limitation, Reflex uses the same approach as many other Python web frameworks: start a master process with a web server like Gunicorn (using Uvicorn under the hood), which then starts and maintains a (configurable) number of child processes to which incoming requests are distributed.
In Reflex, each Gunicorn worker is configured to automatically restart after exceeding 100 connections by default (that number is configurable), to avoid memory leaks (apparently, most Python web workloads are coded so poorly, leaking so much memory, that this is a “reasonable default” behavior 😥). This would interrupt a WebSocket connection. Uvicorn does not seem to limit the number of concurrent connections (docs). Still, the scaling implications are not clear to me yet, nor are they documented in the Reflex self-hosting docs.
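For readers who have not seen this kind of setup before, the sketch below shows what such a worker configuration typically looks like as a Gunicorn configuration file (which is plain Python). This is not the file Reflex generates. The file name, port, jitter and timeout values are assumptions. Only the restart threshold of 100 comes from the Reflex default mentioned above.

```python
# gunicorn.conf.py (hypothetical file for illustration, not generated by Reflex)
import multiprocessing

# Address and port the master process listens on (assumed values).
bind = "0.0.0.0:8000"

# Number of child processes; (2 x cores) + 1 is the rule of thumb from the Gunicorn docs.
workers = multiprocessing.cpu_count() * 2 + 1

# Run each child process as an ASGI worker based on Uvicorn.
worker_class = "uvicorn.workers.UvicornWorker"

# Restart a worker after it has handled this many requests (guards against memory leaks).
max_requests = 100

# Randomize the restart threshold a little so that not all workers restart at once (assumed value).
max_requests_jitter = 10

# Seconds after which a silent worker is killed and restarted (assumed value).
timeout = 120

print(f"{workers} workers, restart after about {max_requests} requests each")
```

How exactly such a restart threshold interacts with long-lived WebSocket connections is the open question raised above; the sketch does not answer it.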

In any case, Reflex’s architecture is not ideal for scaling to a large number of users. It keeps a WebSocket connection open the entire time a user browses the site, and requires a Redis-compatible cache to share the user session data between server processes so that it can recover from broken connections. Keeping open connections and sessions is costly.

General caveats of full-stack web application frameworks

If you think about whether to use such a full stack web framework (or while you are choosing a specific one), consider these potential caveats (in no particular order):

Limited flexibility/customization of the frontend: does the framework’s UI component library satisfy all your requirements? If not, does it offer customization of the components? And would you even have the skills to develop them? If not, you’ll need to ask the framework vendor to implement the customizations for you, which may take a long time, or they might not do it at all, if they have other priorities.
Performance bottlenecks: expect that you have little control over how well the backend or frontend code scales (to many users or complex layouts).
Vendor lock-in: suppose the framework becomes abandoned, or you must switch to another one for whatever reason: you’ll likely have to start from scratch again! Also, the odds are against you. What is more likely: that Reflex development stops, or that React development stops? After all, these full-stack web frameworks are a niche market!
Smaller community and ecosystem: in contrast to backend-only frameworks (e.g. Python’s Django) or frontend-only frameworks (e.g. React), full-stack web frameworks will have a significantly smaller community, which may result in issues like these:
- Long(er) response times in chats or forums (or no response at all) if you need help
- Missing or unclear documentation
- Less sample code, or sample code that is outdated (broken)
- Frequent introduction of breaking changes (even in patch versions, e.g. here for Reflex)
- Fewer (or no) IDE plug-ins to simplify your development
- Higher chance for the project to become unmaintained
Difficult debugging: if there is a bug somewhere deep down in the framework’s generated frontend code, its backend middleware, or the frontend-to-backend communication layer (basically: anywhere in the “magic” that abstracts the hard things away), you’ll have a very hard time pinpointing the problem, especially if it affects frontend code where you don’t have expertise anyway. For instance, I wouldn’t know how to debug the React code that Reflex generates, and I don’t think it’s possible. The consequence is that whenever you run into problems, you won’t immediately know whether you’re just using the framework incorrectly, or if it’s an actual bug in the framework.
Lack of Frontend Best Practices: do your nonfunctional requirements include aspects like accessibility, responsive design, or SEO? If yes, you had better check whether your framework has you covered. For instance, Reflex’s web accessibility features are abysmal, and if you wanted to change this, you’d probably have to get into the weeds and become a contributor.
Steep learning curve: these frameworks do not really enable “rapid prototyping” for backend developers! You still need to learn how to do frontend development, just this time, using proprietary abstractions of your chosen framework. This may be less work than having to learn a new frontend language and framework (e.g. TypeScript + React), but still, it will take time (which will highly correlate with the quality of the framework’s documentation). There is no free lunch.

Conclusion

So, should you use a full-stack web framework for your project or product? As always, it will depend on your evaluation of the caveats I presented above.

My takeaway is that such frameworks are fine for smaller projects that are not time- or mission-critical. But for “real” professional web applications, where you need to count on the reliability of your software dependencies, I would prefer assembling a diverse team with expertise in frontend and backend technologies.