I hate Mondays. My head hurts, it’s the start of the week, and I only have a few hours of free time before I go to sleep. I start my PC and everything I usually use: browser, Signal and… the Element client. After all, I’ve been a Matrix user for almost 5 years now, I’ve written a blog post on what Matrix is and why you would even use it, and I’ve been closely following every This Week in Matrix blog post for the past 5 years, just observing the development of this decentralized protocol and network as it grows and gets better. I’ve contributed very small things here and there, mostly for my personal use, hacking some changes into matrix-discord-bridge. I convinced some of my close friends to join Matrix on my very own self-hosted, federating Synapse server so I have someone to chat with. And on desktop… I use the Element client. It’s the most feature-rich client, one that lets me use stickers (which is a must-have for a furry) and every crypto feature Matrix has (cross-signing, verification with QR codes etc.).
It sucks. I love the Element developers, I love everyone who contributes to the Matrix ecosystem, I really do, but my experience with the Matrix client ecosystem is a really big letdown. But I don’t want this post to be just a rant. I want to explain my perspective as a user, to help developers understand things from the perspective of a slightly techy but lazy user of their application. Not only because I hope it makes Element better for me in the future, but because I hope it makes a better Element for everyone.
So, back to Monday and how it’s currently going. I started Element and, uhhh, it stores its keys in a KeePass database, retrieving them through Secret Service Integration. Honestly, it isn’t my favorite way for a client to store keys, but somehow it just happened: it asked to access the alternative database I’d set up for Nheko testing and, well, like, whatever man, go ahead. I think it prioritizes checking for an active secret service integration before trying to set up keys somewhere else. Fair. But I forgot to open the database with the keys. The password to it is in my other KeePass database, so… I need to get it from there, but I can’t, because the dialog blocks me as it now needs to open the alternative database. Okay, fine. I close Element: right click on the task bar, Quit. Great, it disappeared. I open my alternative database, start Element again and… Oh no… Oh no no no no. Not again.
When I see this I know my day is ruined. Why does it happen? I’m not 100% sure, but my guess is that either the database/storage gets corrupted in some way, or Element didn’t fully close and another Element process was started at the same time. Because Element on my system doesn’t account for that, two processes end up using the same database and break the application. I don’t know if it’s just a me problem or if Element doesn’t implement any kind of concurrency check, but in the past I was able to reliably reproduce it by opening Element while it was already running. Since it can be done accidentally, it’s an amazingly annoying thing because…
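To make it concrete what I mean by a "concurrency check", here is a minimal sketch of a single-instance guard. It is not how Element works, and I am not claiming Element lacks something like it. `SingleInstanceLock` and the `instance.lock` file name are hypothetical names I made up for illustration. The idea is simply that a second process refuses to touch the profile while a first one still holds the lock.

```python
import os


class SingleInstanceLock:
    """Hypothetical helper: one exclusive lock file per profile directory."""

    def __init__(self, profile_dir: str) -> None:
        # "instance.lock" is an assumed name, not a real Element file.
        self.path = os.path.join(profile_dir, "instance.lock")
        self.fd = None

    def acquire(self) -> bool:
        """Return True if this process now owns the profile, False otherwise."""
        try:
            # O_CREAT | O_EXCL makes creation fail if the file already exists,
            # so only one process can win the race.
            self.fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        # Record the owner's PID so a human can check who holds the lock.
        os.write(self.fd, str(os.getpid()).encode())
        return True

    def release(self) -> None:
        """Close and delete the lock file if this process owns it."""
        if self.fd is not None:
            os.close(self.fd)
            os.remove(self.path)
            self.fd = None
```

A second instance that gets `False` from `acquire()` would exit (or hand over to the running window) instead of opening the database. The obvious weakness of this sketch is a stale lock file left behind by a crash, which a real implementation would need to detect.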
You have to reverify yourself with another device. That’s good. I don’t want to give that security feature up for convenience. I understand the benefits of cross-signing, and this process doesn’t take very long anyway, but it does create another session, and since the previous one was lost to the void you have to clean that up. Well, that would have been fine on its own, but Element is now in light theme. Are all of the settings stored in the same “corrupted” database? Do I seriously have to re-enable every labs feature? Do I have to use the settings explorer again to get even more features that are only available behind developer flags? I want to cry. And it’s not even the end because…
My client has to import all of the keys from the backup. New keys for a single room get generated in a few scenarios, such as when 100 messages have been exchanged, someone joined the room, someone reset them manually, and probably many more. I’m not an expert, that’s just what I’ve gathered over time and I might be wrong, but the point is: there are a lot of keys. Especially if your account has been around for almost 5 years and you mostly chat in encrypted rooms. That’s a looot of keys. And you know how fast they get imported into the IndexedDB database? Maybe 7 keys per second on my machine, which gives us hours of wait time. Funniest part? It’s a foreground operation that blocks all crypto operations, which means you cannot read or write messages during this process AT ALL. Oh, did I say that’s the funniest? I think the funniest part is that a regular user doesn’t even know what is happening after they verify their device. Because I’m a tech person, I know it’s just an Electron app and I can hit Alt and open the console to see what is happening. But a regular user seeing the developer console will think that someone is hacking the Matrix. All they will see are rooms with a few messages, all of them still encrypted, so they will see this.
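To show how "about 7 keys per second" turns into "hours", here is a back-of-the-envelope sketch. The 7 keys/s is what I observed on my machine. The key count in the example is a made-up round number, not my actual backup size, and `estimate_import_hours` is just a hypothetical helper.

```python
def estimate_import_hours(key_count: int, keys_per_second: float) -> float:
    """Estimate how long a blocking key import takes, in hours."""
    if keys_per_second <= 0:
        raise ValueError("keys_per_second must be positive")
    seconds = key_count / keys_per_second
    return seconds / 3600


# Hypothetical backup of 100,800 keys at the observed 7 keys per second.
hours = estimate_import_hours(100_800, 7)
print(f"{hours:.1f} hours")  # prints "4.0 hours"
```

Even with a much smaller backup the wait is long enough that a blocked client looks simply broken.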
And that’s it. You CAN restart the client, which will stop the key import. So apparently the import isn’t so essential that it has to block everything instead of running in the background (and the Element folks do realize that), but this has been an issue for a while now.
Where is Element?
For this section I’ll ask any Element developer reading this to look away. It’s not for you, and it’s probably not a direct result of how Element is written. But Element on Arch seems to have some kind of packaging issue that only amplifies the grief caused by the monthly train wreck described above. Because what Element also does is index encrypted room messages, so you can use search to find past encrypted messages. Obviously you cannot do that on the server side with end-to-end encrypted messages, so you have to search client side. However, each time that database gets corrupted, the index needs to be rebuilt. That’s more than 250,000 messages that need to be downloaded from the server, decrypted and indexed in the database. No easy task, but unlike the key import it does happen in the background, and you can pause it from the interface. Yet while indexing is running on Arch Linux you are going to have a bad time, with Element crashing (segfaulting) at random moments, which can lead to creative solutions to that problem.
But again, this is unlikely to be an Element issue, and the very specific environment in which it can be reproduced likely points to some kind of architecture/packaging problem. Nonetheless it impacts me a lot, especially when the issues above cause lost sessions so often.
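For anyone wondering what "search client side" means in practice, here is a toy sketch of the idea: once a message is decrypted locally, its words go into an inverted index that lives only on your machine. This is not Element’s actual indexer. `LocalSearchIndex` and the event IDs are hypothetical, and a real index also has to persist to disk, which is exactly the part that has to be rebuilt when the database is lost.

```python
import re
from collections import defaultdict


class LocalSearchIndex:
    """Toy inverted index: token -> set of event IDs containing it."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)

    @staticmethod
    def _tokenize(text: str) -> list:
        # Lowercase and split on non-word characters.
        return re.findall(r"\w+", text.lower())

    def add_message(self, event_id: str, plaintext: str) -> None:
        """Index an already-decrypted message body."""
        for token in self._tokenize(plaintext):
            self._postings[token].add(event_id)

    def search(self, query: str) -> set:
        """Return IDs of messages containing every word of the query."""
        tokens = self._tokenize(query)
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


index = LocalSearchIndex()
index.add_message("$a", "Stickers are a must have")
index.add_message("$b", "I must reverify again")
print(sorted(index.search("must")))  # ['$a', '$b']
```

The server never sees the plaintext or the index, which is why every one of those 250,000+ messages has to be fetched and decrypted again after a reset.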
And that’s just a drop in the ocean
The main issue I’ve mentioned in the But… section is actually likely a bundle of multiple issues/regressions. Nonetheless, they repeat on a monthly basis for me and make me waste hours of time. And those aren’t the only issues that I think deserve a lot more love from Element developers than they get. After all, the element-web project has no small number of open issues: 4,439. Many of those are probably small papercut defects, but among them are some like audio messages taking seconds to render a waveform, which is a potential opportunity to DoS Element clients, affects quite a few people and only has a priority of Minor. I dare say that issue has been improperly prioritized, as I’d classify it in part as a security issue. All you have to do is send a bunch of large music files to a room to effectively render Element clients inoperable for a significant amount of time. Then there are papercut issues such as this one, which I think affect fewer people but are probably worth looking into, since the real impact is unknown: only a tiny fraction of users will actually come to the repo and complain/contribute.
Shake it, shake it
A common response to issues from developers on the Element repos is “please send a rageshake”. Rageshake is a debug feature in Element that lets the user automatically gather debug information and send it to Element’s servers so developers can understand the issue better. And I really want to help. When I write my replies I try to investigate as much as I can, so I can help pinpoint the issue or show what it looks like. However, I want to stress that a big part of why I use Matrix is privacy, and the debug information collected by rageshake is more than I’m comfortable sharing. Metadata kills, and even if rageshake doesn’t collect the content of the messages themselves, metadata is something I’m also concerned about. I’m fine if you cannot take my issue because I didn’t send a rageshake. You have almost 5k open issues, so you have limited time to investigate. It just sometimes hurts that I cannot contribute because I don’t want to share this additional metadata. So if you can, please allow alternative ways for me to help gather debug information without sending information I may not feel comfortable sharing.
I wrote all of this not out of anger. No, I’m already over that. I wrote it out of love. Love for software, and for the people who create it and use it. I don’t know if it will help, or even reach the people I want it to reach. But I figured it’s worth a try. For the past year there has been one thing I’ve wanted to say to Element developers, and that is: could you do more bug fixing? Features are great, I love new features, I want Element to have even more of them. However, I don’t want features at the cost of stability and at the expense of user experience. If it’s possible to pause major feature work and focus on fixing bugs for 6 months, I’m sure that Element would become a much better client for everyone. It’s easy to lose focus as a developer of such a big and complex application. I know, I’m a software developer as well and I barely understand my own mess. But as a user of Element, this is my suggestion to the Element team, based on my personal experience. Everyone will have different ones, but I think that many other users have their own issues on the repo that they feel are being sort of neglected.