You asked, we built. Notes from the first release cycle that took user feedback seriously. | The Observing Ego

Building in Public


Most apps treat user feedback like marketing data. The Observing Ego is run more like a clinical practice.

Casey Simon, Psy.D., M.S., LMFT | May 27, 2026

Most consumer apps treat user feedback like marketing data. They collect it in batches, sort it by sentiment score, and use it to inform a quarterly roadmap presentation. The user almost never finds out what happened to their specific message.

The Observing Ego is run differently. The model I have tried to imitate is the one I know best, which is the clinical relationship. A patient tells me something is wrong. I take it seriously. I do the work. I check back. The fact that I am running this loop with hundreds of users at once instead of one patient at a time does not change the basic posture. If you tell me the app is broken in a specific way, I would like the broken thing to be different the next time you open the app.

This is a retrospective on the app's most recent release cycle, written while the work is still settling and before I forget how it actually happened. It is also an open invitation to participate in the next one.

Why the standard model is broken for mental health apps

The conventional product-feedback loop in consumer software depends on a stack of assumptions that mostly do not hold for mental health work.

The first assumption is that user behavior is the most reliable signal of what users want. The behavior of a person logging their depression is not a clean signal. It is partial, contradicted by the same person's verbal reports, and shaped by all the same factors that shape any kind of avoidant behavior. The most-used feature in the app is not necessarily the most useful feature. The least-used feature is not necessarily the least useful feature. Behavior is one signal among several, not the answer.

The second assumption is that aggregate patterns generalize to individuals. A mood-tracking population is not a homogeneous user base. A patient with active suicidality and a college student trying to understand their seasonal affect have basically nothing in common from a product perspective except that they are both opening the same app. Designing for the mean produces an app that serves nobody very well. My categorical refusal to use aggregate behavioral data, a privacy posture I have already committed to, makes this point somewhat moot for this app, but the principle remains.

The third assumption is that more features are better. In a clinical tool, more features mean more cognitive load, more decisions, and more places for someone in a vulnerable state to get lost. The right number of features on any given screen is usually fewer than your product instincts suggest. Most of the useful design changes in the last few months have been about removing, simplifying, or restructuring things that already existed, not about adding.

The fourth assumption is that feedback should be collected anonymously, in bulk, through surveys and analytics. The most useful feedback I have ever received about this app has come in the form of named individual messages from people who used the app and found something wrong. A single careful email is worth more than a thousand event-tracking pings.

The actual loop

What I have settled into is not a methodology in any formal sense. It is closer to a clinical case-management posture, ported to software. Four steps.

Receive

A message comes in through the in-app Request a Feature link, TestFlight feedback, direct email, Reddit, or a public review. I read all of them.

Translate

Figure out what the message actually points to. This sometimes requires going back to the user for clarification. It is the same move a clinician makes with "the week was bad."

Ship

Small fixes go in the next release. Structural changes get sat with first. Release cadence has been one to three weeks.

Close the loop

Reply to the original message with what shipped. What gets noticed is not the fix. It is that the message landed somewhere it mattered.
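As an illustration only, the four steps above can be read as a small state machine in which a message cannot be marked shipped before it has been translated, and the loop cannot be closed before something has shipped. This is a minimal Python sketch, not the app's code; the class, field, and method names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stage(Enum):
    RECEIVED = 1
    TRANSLATED = 2
    SHIPPED = 3
    LOOP_CLOSED = 4


@dataclass
class FeedbackItem:
    """One incoming message, tracked through the four steps (hypothetical model)."""
    source: str                      # e.g. "email", "testflight", "reddit"
    raw_message: str
    stage: Stage = Stage.RECEIVED
    underlying_problem: Optional[str] = None
    reply: Optional[str] = None

    def _require(self, expected: Stage) -> None:
        # Refuse to skip a step: each transition is only valid from one stage.
        if self.stage is not expected:
            raise ValueError(f"expected {expected.name}, but item is {self.stage.name}")

    def translate(self, underlying_problem: str) -> None:
        self._require(Stage.RECEIVED)
        self.underlying_problem = underlying_problem
        self.stage = Stage.TRANSLATED

    def ship(self) -> None:
        self._require(Stage.TRANSLATED)
        self.stage = Stage.SHIPPED

    def close_loop(self, reply: str) -> None:
        # The reply to the original sender is what ends the cycle, not the fix.
        self._require(Stage.SHIPPED)
        self.reply = reply
        self.stage = Stage.LOOP_CLOSED
```

The point of the guard in `_require` is that "shipped" is not the terminal state; an item is only done once a reply has been recorded.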

A one-to-three-week release cadence is slower than the "deploy continuously" culture of most consumer software and faster than what most clinically oriented projects manage. I am currently the only person reading the incoming messages, which is sustainable at the current scale and is something I will need to think about as the user base grows.

Three things that recently shipped because users asked for them

To make this concrete, three changes from the recent release cycle. Each was driven by a specific user telling me something was wrong, and each shipped within a few weeks of the original message.

Example 1 / Smallest category

The journal prompt that would not stay put

The journal has a feature where you can pick a prompt to write to, or pass on a prompt and just freewrite. A user reported that when they re-opened a saved journal entry to edit it, the original prompt they had written to was no longer visible. The entry showed a new prompt, or sometimes none at all, depending on internal logic that I had not realized was producing the inconsistency.

The fix took an evening. The relevant code now treats the saved entry's prompt as an immutable property of that entry, displayed when you re-open it for reading or editing, regardless of what the current prompt rotation is doing. There is a clean unit-test suite around the new behavior. The user got a reply explaining what changed and confirming that the next build would have the fix.
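The idea of an immutable prompt can be sketched roughly as follows. This is a hedged Python illustration rather than the app's actual code, which the post does not show; `JournalEntry`, `save_entry`, `edit_body`, and `prompt_to_display` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class JournalEntry:
    """A saved entry. The prompt is captured once, when the entry is saved."""
    entry_id: int
    body: str
    prompt: Optional[str]  # None means the user passed on a prompt and freewrote


def save_entry(entry_id: int, body: str, current_prompt: Optional[str]) -> JournalEntry:
    # Snapshot whichever prompt the user was writing to at save time.
    return JournalEntry(entry_id=entry_id, body=body, prompt=current_prompt)


def edit_body(entry: JournalEntry, new_body: str) -> JournalEntry:
    # Editing yields a new value that carries the original prompt forward.
    return replace(entry, body=new_body)


def prompt_to_display(entry: JournalEntry, rotating_prompt: Optional[str]) -> Optional[str]:
    # The rotating prompt is accepted only to show that it is deliberately ignored.
    return entry.prompt
```

Because the dataclass is frozen, an accidental assignment such as `entry.prompt = "..."` raises `FrozenInstanceError` instead of silently changing what the user sees on re-open.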

This is the smallest possible category of "you asked, we built" change. It was a real bug, identified by a user who took the time to write, and it is fixed now.

Example 2 / Structural restructuring

The insights dashboard that felt too noisy

A small group of users had been telling me, over weeks, that the Insights tab felt overwhelming. The complaints were not all worded the same way. Some said the page felt cluttered. Some said they did not know which insights to focus on. Some said the page seemed to be showing them the same kind of insight repeatedly without organizing it.

What they were converging on, under different vocabularies, was a structural problem. The dashboard was rendering insights as a single mixed list, sorted by relevance, with no visual categorization. From an information-architecture perspective, this is the digital equivalent of handing someone a stack of unfiled notes.

The shipped change split the dashboard into seven typed sections, each with its own header and its own visual treatment. Sleep patterns are in one section. Mood patterns are in another. Habit correlations are in a third. The insights themselves did not change; the way they are presented did. The result is a page that feels half as busy with the same amount of underlying content.
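A minimal sketch of that restructuring, assuming each insight carries a type tag and a relevance score: the same items are grouped into ordered sections, keeping relevance order within each section. The names below are hypothetical, and only the three section types named above are listed; the post does not enumerate the other four.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Display order for sections. Only the three types named in the post appear here.
SECTION_ORDER = ["sleep", "mood", "habit_correlation"]


@dataclass(frozen=True)
class Insight:
    kind: str         # which typed section the insight belongs to
    text: str
    relevance: float  # higher means more relevant


def group_into_sections(insights: Iterable[Insight]) -> Dict[str, List[Insight]]:
    """Turn one mixed, relevance-sorted list into typed sections."""
    sections: Dict[str, List[Insight]] = {kind: [] for kind in SECTION_ORDER}
    for insight in sorted(insights, key=lambda i: i.relevance, reverse=True):
        # Unknown kinds get their own section after the known ones.
        sections.setdefault(insight.kind, []).append(insight)
    # Drop empty sections so the page shows no bare headers.
    return {kind: items for kind, items in sections.items() if items}
```

Nothing about the insights themselves changes in this sketch; only the container they are rendered in does, which mirrors the presentation-only nature of the change.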

This is a larger category of "you asked, we built" change. The fix required design judgment, not just a code change. It would not have shipped without users telling me, in a half-dozen different ways, that something was wrong.

Example 3 / The meta case

The Request a Feature link itself

The most meta example. For most of the app's life, there was no in-app way to request a feature. Users who wanted to send feedback had to find my email address on the website or post to Reddit. Several users mentioned, separately, that this felt like a barrier and that they assumed I did not want feature requests.

The shipped change is in the Settings screen. A Feedback section with two rows: "Request a Feature" deep-links to the public GitHub issue tracker with the feature-request template pre-loaded. "TestFlight Feedback" goes to the standard Apple flow for testers.
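GitHub lets a new-issue link pre-select a template through a `template` query parameter on the repository's `issues/new` URL, so a deep link of this kind can be built with nothing more than string assembly. The sketch below is illustrative only: the owner, repository, and template filename are placeholders, not the app's real values.

```python
from urllib.parse import urlencode
from typing import Sequence


def feature_request_url(
    owner: str,
    repo: str,
    template: str = "feature_request.md",  # hypothetical template filename
    labels: Sequence[str] = (),
) -> str:
    """Build a GitHub new-issue link with a template pre-selected."""
    params = {"template": template}
    if labels:
        # GitHub accepts a comma-separated list of labels in the query string.
        params["labels"] = ",".join(labels)
    return f"https://github.com/{owner}/{repo}/issues/new?{urlencode(params)}"
```

A Settings row would then only need to open the returned URL in the system browser.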

The strategic case for using GitHub as the feature board is privacy and transparency. Feature requests are public; users can upvote each other's requests; I respond in the open. The alternative posture, which is the more common one in consumer software, would be a closed customer feedback portal that I administer. The open posture is more work and produces a more durable artifact. Every request lives somewhere a future user can find, link to, or build on.

This is the largest category of "you asked, we built" change. The original asks were not for a specific feature. They were for a clearer signal that the channel was open. The right response was not a feature in the conventional sense, but a piece of infrastructure that made the existing channel visible.

Where this model has limits

I want to be honest about the parts of the app where user voting is not the right mechanism for prioritization, because the open methodology can otherwise be misread.

Safety design is not user-driven. The crisis flow, the suicide-screening items, the threshold logic that decides when to surface emergency resources, and the data-minimization rules around safety events are all designed against clinical literature and validated instrument logic. They are not features users vote on. If a user told me they wished the safety flow fired less often, I would consider what they meant by it, and I would also retain the clinical judgment about when it should fire. Some things are not a democracy.

Clinical accuracy is not user-driven. The PHQ-9 has the items it has. The thresholds it scores at are the thresholds the literature has established. The disclaimers it carries are the disclaimers the field requires. If a user told me they wished the scoring were different, I would explain why it is not. The screeners are inherited from clinical practice and are not adjustable by feature request.
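To make "not adjustable" concrete, here is a hedged Python sketch of PHQ-9 scoring using the standard published convention: nine items, each answered 0 to 3, summed to a total of 0 to 27 and mapped to the usual severity bands. It is not the app's implementation, and the function and constant names are hypothetical; the point is only that the cut-offs live in a fixed table rather than in a user setting.

```python
from typing import Sequence, Tuple

# Standard PHQ-9 severity bands on the 0-27 total score.
PHQ9_BANDS = (
    (0, 4, "minimal"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
)


def score_phq9(responses: Sequence[int]) -> Tuple[int, str]:
    """Sum nine item responses (each 0-3) and look up the severity band."""
    if len(responses) != 9:
        raise ValueError("the PHQ-9 has exactly nine items")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each item is scored 0, 1, 2, or 3")
    total = sum(responses)
    for low, high, label in PHQ9_BANDS:
        if low <= total <= high:
            return total, label
    raise AssertionError("unreachable: totals are always within 0-27")
```

A total score is a screening result, not a diagnosis, and this sketch deliberately leaves out any separate handling of individual items.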

Privacy posture is not user-driven. Users do not get to vote for cross-device sync without an Apple ID, because that would require us to operate a server, which would mean holding server-side data, which we have made an architectural commitment not to do. The posture is the posture. Specific implementations within it are open to discussion.

Things that are intrinsically about clinical judgment, evidence-based design, or privacy architecture are decided by me. Things that are about ergonomics, interface choices, feature scope, content additions, and how the app fits into a particular user's life are decided in conversation with users.

Most things are in the second category. A few important things are in the first.

How to send feedback that will actually land

A practical note, for users who would like to contribute to the next cycle.

The Request a Feature link in the app's Settings screen routes to the public feature-request board on GitHub. Public, durable, easy to upvote, easy to link to. Use this for anything you would like the app to do, anything that feels broken, or anything you would like to discuss.
Email at the address on the contact page. Direct, private, and good for anything that does not belong in public.
The TestFlight feedback flow, for TestFlight users specifically. It is good for pre-release builds and lets you attach screenshots automatically.

What makes a feedback message land:

Be specific. "The journal feels weird" is a start. "When I re-open a saved entry, the prompt I wrote to is no longer visible" is a fix in the queue.
Tell me what you were trying to do. The same observed behavior can be a bug or a feature depending on what the user expected. Knowing the expectation is more useful than knowing the outcome.
Tell me what device you are on. iPhone, iPad, Mac, iOS version. Many bugs are device-specific.
Tell me if it is urgent. Crashes and safety-flow issues jump the queue. Cosmetic things wait their turn. I read everything; the order of fixes is set by impact.
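The ordering rule in the last point, where everything is read but fixes are ordered by impact, behaves like a priority queue with arrival order as the tie-breaker. The following is a minimal Python sketch under that assumption. The category names and rank numbers are hypothetical; the post says only that crashes and safety-flow issues jump the queue and cosmetic issues wait.

```python
import heapq
from itertools import count
from typing import List, Tuple

# Lower rank is fixed sooner. Anything unlisted falls in the middle tier (an assumption).
IMPACT_RANK = {"crash": 0, "safety_flow": 0, "cosmetic": 2}
DEFAULT_RANK = 1


class TriageQueue:
    """Orders reports by impact first, then by the order they arrived."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = count()

    def add(self, kind: str, summary: str) -> None:
        rank = IMPACT_RANK.get(kind, DEFAULT_RANK)
        # The arrival counter keeps equal-rank reports first-in, first-out.
        heapq.heappush(self._heap, (rank, next(self._arrival), summary))

    def next_up(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

Nothing is dropped from the queue in this model; a cosmetic report simply comes out after higher-impact ones that arrived later.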

I read every message. I do not respond to every message immediately, but I respond to every message that is asking a question or describing a problem.

A closing note

The general claim I want to make is that consumer software does not have to be run the way most consumer software is run. There is a version of this work where the developer is in a real relationship with the people using the app, where messages land, where changes ship, and where the loop closes. It is slower than the alternative. It produces better software.

This is the version I have committed to. I do not know how far it will scale. I do know that for the size of the user base the app currently has, it is working, and it is the most clinically coherent way I have found to build a tool that takes mental health seriously.

If you have something to tell me, the channel is open.