Defining Database Developer Experience
Table of Contents
- What's not discussed here?
- The 8 Pillars of Database DX (in no particular order)
- Parting Thoughts
In and out of work, I have a deep interest in all things “Developer Experience” (DX)! It’s been a great passion of mine since the start of my career. Fortunately, there has been a huge trend in recent years for developer tool makers to invest a lot in making their products easier to use. This has happened hand in hand with sales cycles becoming shorter and more low-touch, with engineers choosing and sometimes even committing and paying for dev tools without having to talk to anyone in Sales.
At work (SingleStoreDB), we are heavily investing in the DX for our database product. However, it’s hard to measure the work we’re doing in a very direct way. Of course, we have product telemetry that shows how much different features are being used over time, and we track customer retention and other metrics such as NPS. But none of these measures DX directly, so I’ll leave the topic of measuring all this to another blog post. Today, I want to try to divide database DX into various pillars. This has been tried by others, but I figured I’d give it my own take.
What’s not discussed here?
Managed Service vs Self-Managed
I will largely consider this factor as orthogonal to the definition of DX for a given database product. The reason for this is that in theory, a self-managed database product could provide better DX than a managed service offering. It all comes down to execution. But if executed properly, managed services can make a lot of things much easier for developers.
Open Source vs Closed Source
Open source database products might attract a larger developer following, and that could lead to better “findability” of content on StackOverflow and other websites. However, I also think this should be kept separate from the evaluation of DX for a given database.
Analytical vs Transactional Workloads
Most of this blog post applies both to databases focused on supporting applications (transactional/operational workloads) and to those focused on analytics. (Of course, databases that can service both of these at once will win in DX because they require less data engineering, but that’s kept out of this post.)
Let’s jump in, shall we?
The 8 Pillars of Database DX (in no particular order)
#1 Observability
On the one hand, for queries running against the database, it should be very easy to tell how they’re being executed and whether any improvements can be made. This is typically done through query plans, which help us answer questions such as:
- What’s the largest bottleneck to the performance of this query?
- Are any indexes being used? Would adding an index somewhere help with performance?
- Are any hints being used? Would adding a hint somewhere help with performance?
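To make the index question concrete, here is a minimal sketch that uses SQLite (via Python's built-in `sqlite3` module) purely as a stand-in; other databases expose plans through their own `EXPLAIN` variants with different output. The `orders` table, the `idx_orders_customer` index, and the helper names are made up for illustration.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); the last column is the text.
    return [row[3] for row in rows]


def compare_plans():
    """Show the plan for the same query before and after adding an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i % 100, i * 1.5) for i in range(1000)],
    )
    query = "SELECT total FROM orders WHERE customer_id = ?"

    # No index on customer_id yet, so the planner has to scan the table.
    before = plan_details(conn, query, (42,))
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    # With the index in place, the planner can search it instead.
    after = plan_details(conn, query, (42,))
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = compare_plans()
    print("before:", before)
    print("after: ", after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should describe a scan of `orders` and the second should mention `idx_orders_customer`. The DX question is how quickly a developer can get from the first answer to the second.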
On the other hand, one should look at how easy it is to analyze the overall state of the database installation. Of course, this will be very different (almost unnecessary) for serverless database solutions. However, for most database products, it’s important for developers to analyze the usage of the database’s system resources over time.
On top of this, it should be possible to set up alerts for different events and to easily answer questions such as:
- Which queries have run during a certain period of time? And how long did each one take? And how much CPU/RAM/etc. did they use?
- How many connections were there to my database during a certain time range?
- Do I need to scale my deployment to handle the load it is taking on?
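As a rough illustration of the first question, the sketch below records when each query started and how long it took on the client side. It is a toy: a real database would typically expose this through its own system views or a monitoring UI, and it would also cover CPU/RAM usage, which this wrapper cannot see. `TimedConnection` and `QueryRecord` are hypothetical names, and SQLite is again only a stand-in.

```python
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class QueryRecord:
    sql: str
    started_at: float  # wall-clock time, seconds since the epoch
    duration_s: float  # elapsed time measured with a monotonic clock


class TimedConnection:
    """Wraps a sqlite3 connection and records when each query ran and for how long."""

    def __init__(self, conn):
        self.conn = conn
        self.history = []

    def execute(self, sql, params=()):
        started_at = time.time()
        t0 = time.perf_counter()
        try:
            return self.conn.execute(sql, params).fetchall()
        finally:
            # Record the query even if it raised, so failures show up too.
            self.history.append(
                QueryRecord(sql, started_at, time.perf_counter() - t0)
            )

    def queries_between(self, start, end):
        """Which queries started within [start, end]?"""
        return [q for q in self.history if start <= q.started_at <= end]

    def slowest(self, n=1):
        """The n longest-running queries recorded so far."""
        return sorted(self.history, key=lambda q: q.duration_s, reverse=True)[:n]


if __name__ == "__main__":
    db = TimedConnection(sqlite3.connect(":memory:"))
    window_start = time.time()
    db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
    db.execute("INSERT INTO events (kind) VALUES (?)", ("signup",))
    db.execute("SELECT kind FROM events")
    window_end = time.time()
    for record in db.queries_between(window_start, window_end):
        print(f"{record.duration_s * 1000:.3f} ms  {record.sql}")
```

If developers have to build something like this themselves, that is a sign the database's observability story has a gap.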
Again, serverless promises to solve a lot of these issues for developers. So, for those types of databases, the real DX is that developers don’t even have to ask a lot of these questions in the first place.
Finally, it’s important that the observability can be integrated with broader monitoring tools such as Datadog or Honeycomb. This allows developers to analyze the performance of the entirety of their applications in one system.
#2 Deployment
This pillar is another one where managed solutions have an extreme advantage over self-managed solutions. The main question to ask here is “How easy is it to deploy the database?” However, there are a lot of other interesting angles to look at.
To start with, even for managed database services, it’s important that deployments can be easily automated with APIs and ideally even through things such as Terraform templates. This allows for more reproducible deployments, which is important for maintaining staging/production/preview instances.
Another important deployment factor is performance. How quickly does a new instance spin up? But also, are upgrades noticeable at all? And how slow does the database instance get during these?
Finally, features like Branching (PlanetScale and Neon both offer this if you haven’t heard of it) make it easier to apply migrations and “deploy” changes to the database’s schema.
#3 Configurability of Tradeoffs
(This pillar could possibly be folded into the Deployment pillar above)
All databases come with their own set of tradeoffs. Some of these will be baked into the core of the engine. However, some of them might be configurable.
How easily can one configure availability vs performance? Or consistency vs availability? Can the data be made available in two different cloud provider regions? Can developers choose between active-active and active-passive for different parts of the schema? These are some of the questions I would ask about a database in order to learn more about its configurability of tradeoffs.
Of course, it’s important that databases provide good defaults here (and that they’re super transparent about them!). But configuration for specific use cases matters too.
#4 Predictable Pricing
Databases can be very expensive. So, it’s important that customers (developers, engineers, whoever it may be!) can keep track of how much they’re spending. For usage-based products, predictability is especially important. Imagine a scenario where your costs spike overnight and you’re not aware of it until the next monthly invoice! That’s not good. The best products allow developers to estimate their costs into the future but also to configure certain budget thresholds and alerts around their costs.
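To illustrate what “estimate costs into the future” and “budget thresholds” can mean in practice, here is a minimal sketch. It assumes the simplest possible projection (the rest of the month costs the same as the average day so far); real billing systems will use something more sophisticated. `check_budget`, `BudgetStatus`, and the default thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    spent_so_far: float
    projected_month_total: float
    alerts: list


def check_budget(daily_costs, days_in_month, monthly_budget,
                 alert_fractions=(0.5, 0.8, 1.0)):
    """Project month-end spend from the days seen so far and flag crossed thresholds."""
    if not daily_costs:
        raise ValueError("need at least one day of usage data")
    spent = sum(daily_costs)
    # Naive projection: every remaining day costs the average of the days so far.
    projected = spent / len(daily_costs) * days_in_month
    alerts = [
        f"projected spend reaches {fraction:.0%} of budget"
        for fraction in alert_fractions
        if projected >= fraction * monthly_budget
    ]
    return BudgetStatus(spent, projected, alerts)


if __name__ == "__main__":
    # Three quiet days followed by an overnight spike.
    status = check_budget([10, 12, 11, 45], days_in_month=30, monthly_budget=500)
    print(status.spent_so_far, status.projected_month_total)
    for alert in status.alerts:
        print("ALERT:", alert)
```

With this kind of check running daily, the overnight spike on day four would trigger an alert that same day rather than surfacing on the next monthly invoice.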
One may also argue that products with flexible billing are simply a better choice than those without it. I agree, but I personally think that falls outside the realm of DX.
Of course, paying for the thing has to be easy (this mostly applies to managed services). Are multiple payment methods supported? Do I need to talk to someone in Sales or can I self-serve?
To sum up, billing has to be simple and predictable.
#5 Programming Language SDKs/Drivers/Clients
For this category, there are two sub-categories: availability and quality. For availability, we want to look at which programming languages have well-supported clients for talking to the database. As for the libraries’ quality, that’s much harder to measure. But database drivers have a number of traits/features that are important to get right:
- Security (SSL, Client-Side Encryption, Encoding, etc.)
- Multiple Statements and Prepared Statements
- Error Handling
- … and so much more, like Documentation! Just look at the docs for pymongo or node-postgres to go over the vast number of features that good database clients need to have.
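Two of the traits above, parameterized statements and error handling, can be sketched with Python's built-in `sqlite3` driver. This is only an illustration of what “getting it right” looks like from the caller's side; the `users` table and the helper functions are hypothetical, and other drivers use different placeholder styles and exception classes.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a unique constraint on email."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
    )
    return conn


def add_user(conn, email):
    """Insert a user with a bound parameter; report duplicates instead of crashing."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The driver maps the UNIQUE violation to a specific exception class,
        # so callers can react to it without parsing error strings.
        return False


def find_user(conn, email):
    """Look up a user; the value is bound as data, never spliced into the SQL text."""
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchone()


if __name__ == "__main__":
    conn = make_db()
    print(add_user(conn, "a@example.com"))   # first insert succeeds
    print(add_user(conn, "a@example.com"))   # duplicate is reported, not raised
    print(find_user(conn, "' OR '1'='1"))    # injection attempt matches nothing
```

A driver with good DX makes the safe path (bound parameters, typed exceptions) the easy path.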
Finally, of course, it’s important that ORMs “just work”. There are different levels of ORMs, ranging from the simpler ones such as Squirrel or TypeORM to the really powerful ones like Prisma. For many app developers, choosing the ORM comes before choosing the database. So, if a database product doesn’t work with, say, Prisma, that’s a gap worth closing to get more adoption.
#6 Integrations With Well-Known Data Tools
There’s a long list of well-known tools that allow us to interact with databases. The two main kinds are BI tools and IDEs. It’s important that database products work with a lot of these. There’s not a lot for me to say here, but I’ll list out some tools that it’s important to have integrations with.
#7 Language Ergonomics
What language does the database “speak”? Is it SQL, or some sort of NoSQL? Or is it multi-model and so it speaks more than one language (e.g., CosmosDB)? Whatever the case may be, the ergonomics of the language are very important. This includes things like the ease of use of the syntax as well as the quality of the error messages.
This is all extremely subjective! Even between all the SQL-speaking databases, the ergonomics of their specific SQL dialects differ (especially when it comes to procedural SQL). And the quality of error messages is very different between products too.
#8 Community & Documentation
I left this pillar to the end but it matters a whole lot. The fact is that when using any dev tool, Googling for things has to work really well. That is simply how we software engineers have been accustomed to doing our job for years now. We resort to Google for the simplest of questions! (Perhaps in the future this could be replaced by how well ChatGPT knows the product...)
Database products whose documentation has poor SEO (Search Engine Optimization) will suffer a lot here. Moreover, the size of the community and the body of StackOverflow questions (particularly the percentage of these that have been properly answered) matter tremendously.
It goes without saying that the official documentation for the database, as well as for its various clients/SDKs, has to be top notch too. And the documentation website itself has to be very pleasant to use, since engineers might have to spend a lot of time in it (fonts have to be very readable, pages should load fast, dark mode should “just work”, etc.).
Side note: at SingleStore, we recently shipped a feature to make certain code snippets on our Docs website executable! This is how you take it to the next level!
As with most pillars, this one is quite hard to measure objectively. But you can typically just ask an engineer who’s used a certain database product what their experience was! Word of mouth is indeed key.
Parting Thoughts
I expect this blog post will not stand the test of time. (Perhaps I should convert it into a living document on GitHub instead...)
For one, I expect to get some feedback on these ideas over the next few weeks. Besides, as databases evolve, these pillars will undoubtedly have to be changed. Databases have become easier to use over the last 40 years, and the trend is clearly for them to become more "serverless" and "management-less". There is a utopian fantasy where the database is almost invisible, and while that is already possible for simple apps, the same won't be true for more complex use cases for a very long time.
Feel free to reach out on Twitter! Or discuss on Hacker News.