I will often use the term “Signer” to mean the same thing as “Wallet”, because these programs can sign more than just transactions.
The UX around Ethereum signing is especially cumbersome. This is mostly because dapps and wallets are used in many different environments (e.g. desktop, mobile, CLI, plugin, mobile browsers, desktop browsers, and hardware). Attempts to improve this have generally involved rigid user workflows or specific application combinations. I assert the problem is best solved at the operating system level using a widely standardized protocol like the one proposed below.
The following describes:
- A way for dapp developers to handle any action requiring a signature (e.g. signing a tx or logging in). The spec defines how the wallet is chosen and invoked by the user’s operating system.
- A way for wallet applications to handle incoming requests for signature.
Users should be able to use any wallet of their choosing when interacting with a dapp. Having users download a specific wallet per environment reduces security, inhibits adoption, and is terrible UX: the user has to top up, and keep track of, each application-specific wallet. This sucks for the user because they have money all over the place and a laundry list of wallet installations across devices.
The current architecture leads to statements like “Our dapp currently works on the Chrome browser with MetaMask enabled. We plan to add support for MyEtherWallet soon.”
As the number of wallets and dapps grows, every wallet-dapp pairing needs its own integration, so the number of combinations grows as wallets × dapps (roughly n^2). This will inhibit small wallets from entering the market and require users to download all major wallets.
Desktop Browser Dapps
Mobile Browser Dapps
Native Dapps (the worst UX of all)
Currently, if a mobile app developer wants to integrate an Ethereum feature in their app, they generally build a wallet into the app itself (from scratch). The user ends up with loose change in siloed apps, and these wallets will have widely varying (and therefore pathetic) security standards.
Desired Dapp Experience
Before defining the proposed spec, I’d like to outline the ideal user experience, and work backwards to achieve it. After all, that’s how this spec came about.
- If a dapp is browser-based, the user can browse to it using ANY available browser on mobile OR desktop.
- If the dapp is a native mobile or native desktop app, the user simply downloads and opens the application.
- When the dapp requires a signature it should:
  - Automatically open the user’s preferred mobile wallet when on mobile, or
  - Automatically open the user’s preferred desktop/hardware wallet when on desktop.
- The wallet should then display details of the signing request.
- The user can then tap or click “sign” and be sent back to the dapp.
Ultimately, this problem must be addressed at the operating system level, because that is the only way to flexibly hand off control flow to another app.
Deep linking is a method available from any application or webpage that provides the user’s operating system with instructions to open a specific application (and can carry an arbitrary data string with it). Deep links work in basically all environments: Android, iOS, macOS, Windows, and Linux (possibly others).
Most of us have seen deep linking used when clicking on a Zoom or Spotify link. It usually brings a specific application into focus. In our case, we don’t want to open a specific (branded) application (e.g. Jaxx, Toshi, or Gnosis-Safe). Rather, we tell the operating system to open the user’s “default signer” (wallet).
iOS and macOS
Implementation of signer apps varies by OS. On iOS it can be done with NSURLProtocol, which is like a link, but instead of opening a specific app, you specify a namespace (e.g. “invoke-signer”) to which any app can register itself as a handler.
Every app on the device that is registered as a handler calls a boolean function to decide whether it should handle the incoming request (perhaps returning true if, and only if, the user has chosen this signer as the default). The first app to return true from this function is launched with access to the data from the URL (the RPC data and/or more).
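The selection rule described above can be sketched roughly as follows. This is only an illustration in Python, not the iOS API: the Signer class, its should_handle method and the pick_signer helper are hypothetical names, and the sketch assumes each registered app answers true only when the user has marked it as the default.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signer:
    """Hypothetical stand-in for an app registered for the invoke-signer namespace."""
    name: str
    is_default: bool = False

    def should_handle(self, url: str) -> bool:
        # The boolean check from the text: true only for the user's default signer.
        return self.is_default and url.startswith("invoke-signer:")


def pick_signer(registered: List[Signer], url: str) -> Optional[Signer]:
    """Return the first registered signer whose check returns True, or None."""
    for signer in registered:
        if signer.should_handle(url):
            return signer
    return None


signers = [Signer("wallet-a"), Signer("wallet-b", is_default=True), Signer("wallet-c")]
chosen = pick_signer(signers, "invoke-signer://ethereum")
print(chosen.name if chosen else "no signer available")  # prints "wallet-b"
```

If no app claims the request, the sketch returns None; what the operating system should do in that case (for example, prompt the user to install a signer) is left open here.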
The namespace should fundamentally indicate that it is doing the signing, so ethereum-signer was my first thought. However, nearly any cryptocurrency can benefit from this, so simply signer would be better. Lastly, this spec is about more than just referring to the signer app; it is about specifically invoking that app. Therefore I think invoke-signer is the most relevant namespace to use.
From there, the path could communicate the specific cryptocurrency (e.g. invoke-signer://ethereum-classic…). The path can be used by the signer in its boolean function to determine whether it is capable of handling this particular signing request.
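As an illustration of the capability check just described, a signer might inspect the link like this. The SUPPORTED set and both function names are hypothetical, and the sketch assumes the currency is the segment that directly follows invoke-signer://, which Python's urlsplit reports as the network location.

```python
from urllib.parse import urlsplit

# Hypothetical set of currencies this particular signer can sign for.
SUPPORTED = {"ethereum", "ethereum-classic"}


def requested_currency(url: str) -> str:
    """Return the segment right after 'invoke-signer://' (the text calls this the path)."""
    parts = urlsplit(url)
    if parts.scheme != "invoke-signer":
        raise ValueError("not an invoke-signer link")
    # urlsplit exposes the segment after '//' as the network location.
    return parts.netloc.lower()


def can_handle(url: str) -> bool:
    """Boolean check a signer might run before accepting a request."""
    try:
        return requested_currency(url) in SUPPORTED
    except ValueError:
        return False


print(can_handle("invoke-signer://ethereum-classic"))  # prints True
print(can_handle("invoke-signer://bitcoin"))           # prints False
```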
From there we need to communicate exactly what the signer should do. Rather than reinventing the wheel, we can just uri_encode the existing JSON RPC calls. For instance, all signers would want to support the existing RPC methods.
In standard query format this would be:
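One way this could look, sketched in Python under stated assumptions: the query layout (jsonrpc, method and params as separate fields, with params carried as URI-encoded JSON) and the helper names build_signer_url and parse_signer_url are illustrative and not part of the proposal, and the transaction values are placeholders. eth_sendTransaction is used only because it is one of the existing Ethereum JSON RPC methods.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_signer_url(currency: str, method: str, params: list) -> str:
    """Pack a JSON RPC call into an invoke-signer link (the layout is an assumption)."""
    query = urlencode({
        "jsonrpc": "2.0",
        "method": method,
        # params is serialized as compact JSON, then percent-encoded by urlencode.
        "params": json.dumps(params, separators=(",", ":")),
    })
    return f"invoke-signer://{currency}?{query}"


def parse_signer_url(url: str) -> dict:
    """Recover the JSON RPC call on the signer side."""
    parts = urlsplit(url)
    fields = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "currency": parts.netloc,
        "jsonrpc": fields["jsonrpc"],
        "method": fields["method"],
        "params": json.loads(fields["params"]),
    }


# Placeholder transaction: the addresses and value are made up for illustration.
tx = {
    "from": "0x" + "11" * 20,
    "to": "0x" + "22" * 20,
    "value": "0xde0b6b3a7640000",
}
url = build_signer_url("ethereum", "eth_sendTransaction", [tx])
print(url)
print(parse_signer_url(url)["method"])  # prints "eth_sendTransaction"
```

Keeping the method name as its own field means a signer can decide whether it understands a request before it decodes the params.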
Over time, signers should become the secure and predictable place to view blockchain information, especially data about to be signed. We cannot really trust data viewed on browser-based explorers or new dapp websites. More advanced RPC methods will be added over time to accommodate this. For instance, methods that pass compilerVersion will let the signer app verify the code and ABI. But none of that needs to be defined right now, because the format accepts arbitrary future methods!
New User Experience
Most users will have a favorite mobile signer and enjoy a consistent experience with every dapp. Some may still use a plugin like MetaMask when on desktop, and power users will use a hardware signer for ultra-high security as needed.
It becomes much easier to add a single “ethereum feature” into any software application. The app dev does not need to worry about any of the details of building a wallet/signer. They just create a string to be signed, and invoke the user’s signer with it. The user expectation is only that they have a signer, as opposed to some specific signer.
This approach currently has a flaw: the name of the caller application is not sent to the invoked wallet, so handing control back to the caller app is very difficult. This can be worked around in an app-to-app setting (by passing it as part of the payload), but not in a browser-to-app setting (without complicated “browser sniffing” techniques).
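For the app-to-app case, the workaround mentioned above might look roughly like this. The callback parameter name, the my-dapp://signed return link and the signature parameter are all hypothetical; the text only says that the caller can be passed as part of the payload.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def with_callback(signer_url: str, callback_url: str) -> str:
    """Caller side: append the return link as an extra query parameter."""
    separator = "&" if "?" in signer_url else "?"
    return signer_url + separator + urlencode({"callback": callback_url})


def return_link(signer_url: str, signature: str) -> str:
    """Signer side: build the link that hands control back to the caller."""
    query = parse_qs(urlsplit(signer_url).query)
    if "callback" not in query:
        raise ValueError("caller did not say where to return to")
    callback = query["callback"][0]
    separator = "&" if "?" in callback else "?"
    return callback + separator + urlencode({"signature": signature})


request = with_callback("invoke-signer://ethereum?method=eth_sign", "my-dapp://signed")
print(request)
# prints "invoke-signer://ethereum?method=eth_sign&callback=my-dapp%3A%2F%2Fsigned"
print(return_link(request, "0xabc123"))
# prints "my-dapp://signed?signature=0xabc123"
```

This does nothing for the browser-to-app case, where the page has no deep link of its own to pass along.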