This post describes the main lessons I learned while developing and supporting a successful native consumer app for iOS, Android and Windows Phone: Poollie WK 2014. I will focus on client-side cross-platform architecture and development; the lessons learned while developing the API for the app in Windows Azure really deserve a separate blog post.
Note that while this post is in English, the app is currently for the Dutch market only, so the app itself and some linked resources are in Dutch.
App Track Record
A couple of months ago, I was asked to join the app development team at Macaw that was building the Poollie WK 2014 app. Poollie is an app that lets you create and join football pools with friends, family, colleagues and others. The app lets you predict matches and tournament statistics, follow live match scores and more. At that time, Poollie had already been announced as a native app for iOS, Android and Windows Phone that would be available in time for the 2014 football World Cup in Brazil. So the pressure was on.
After a hectic and fun few months we can look back on a challenge that was met successfully. Poollie was released on time in all three app stores and became much more popular than we anticipated.
This is what Poollie looks like on iPhone, Android and Windows Phone:
And here are some app ‘achievements’:
You can read more of the Poollie story, from a non-technical perspective, here.
These are the key technologies we used to build Poollie:
- Xamarin for productive cross-platform native app development
- QuickCross for productive cross-platform MVVM pattern code generation and data binding
- Json.NET for cross-platform local data persistence and IEnumerable for LINQ queries (although we started out with sqlite-net and IQueryable)
- SignalR for in-app cross-platform realtime communication
- Microsoft Azure Notification Hubs for scalable cross-platform push notifications
- Splat for cross-platform image loading and display
- Raygun for cross-platform error tracking
- Google Analytics for cross-platform usage statistics
Challenges and Solutions
The main challenges that drove our learning were cross-platform responsiveness and stability.
Users care a great deal about the app startup time. More specifically, the time it takes from launching the app to when you can start interacting with it. If this takes too long, people will simply stop using your app. The first versions of Poollie had a startup time varying from 6 to 16+ seconds, depending on platform and phone model. In a number of incremental performance improvements, we reduced this to a startup time of 3 to 5 seconds. The main changes through which we achieved this were:
- Eliminating all web requests from the app startup, and loading only the data that is necessary for initial display from the device’s local storage. Once the app is interactive, background loading of more local data and an attempt to update with online data are started.
- Optimizing async performance of non-UI code with ConfigureAwait(false)
- Replacing sqlite-net and IQueryable with Json.NET and IEnumerable
- Replacing loading of resource bitmaps on Windows Phone through Splat with native WP resource URIs
For more details on these changes, see Technology Tips below.
Another aspect of app responsiveness is how quickly you can navigate between the app screens and how quickly changes made in one screen of the app are reflected in other screens. We achieved this by using coding patterns (detailed below) that allow for view navigation and viewmodel updating to be asynchronous and independent of each other, while also optimizing against the view lifecycle on each platform.
If there is one thing that is even more important to users than app performance, it is app stability. An app that crashes is quickly abandoned. The first requirement for improving stability is that you need to know when something went wrong on a user’s device. Every time. We used Raygun for error logging, diagnosing and tracking and it proved invaluable.
We learned that the main causes for instability were web requests failing in unexpected or platform-specific ways, code that depends on the order of asynchronous task completion, and not properly handling lifetime events for low memory conditions (or not fast enough – in which case performance issues will cause stability issues).
Technology Tips
Even when a technology is high-quality overall, sometimes the devil is in the details. Here are some tips that would have saved us time and trouble.
Azure Notification Hubs are great, but beware of the hidden 10 device limit
Azure Notification Hubs make implementing cross-platform push notifications a breeze.
In the app you can simply register platform-specific, parameterized notification templates that leverage platform-specific (or device-specific) capabilities (e.g. Tile notifications in Windows Phone or different formatting for different screen sizes).
You also do not need to maintain platform-specific device IDs on your servers and relate them to your user IDs so you can address specific users; you can simply specify your user ID as a tag when you register a template on a device. E.g. here is the Poollie Windows Phone notification template registration code:
The templates specify named parameters: Title, Message and Navigation. With each template registration, you specify the tags for which this device should listen. One of the tags we used is the Poollie User ID:
On the server, sending messages is child’s play:
This sends a parameterized message to all devices on all platforms that have registered templates for the tag “NotificationMessage”. We could just as easily have sent a message to all devices of a specific user by using the user ID tag, e.g. “#12345”.
As described in Debugging Notification Hubs, you can debug your app by sending notifications from the Debug tab in the Azure Notification Hub management portal, or by downloading a Service Bus Explorer tool (as source code). This lets you test your app even before you have implemented sending, and if you have issues it lets you quickly determine whether they originate in the server or in the app code.
E.g. this is how you specify the above template parameters and send a debug message to all devices that have templates registered for the (user ID) tag “#12345”:
But… beware! If you use the Azure tutorial code to send messages, and you test the code, all may appear to work well. Until you send a message to multiple devices… then you will find (if you are lucky, before your app is published) that only the first 10 registered devices will receive the notification. And exactly the same will happen if you use the Azure portal Debug tab or the Microsoft Service Bus explorer tool to send the same message to multiple devices. So you are inclined to think that whatever is causing this issue, it is not your send code – surely the Microsoft portal and Microsoft debug tools will not both have the same error as your own send code, right? Wrong!
What is causing this is a poorly designed API detail, combined with a lack of documentation and guidance. The 10-device limit is hidden in the code that the Azure tutorials give you for sending notifications:
What? You don’t see it? Well, the CreateClientFromConnectionString method also has an overload with a 3rd parameter; a boolean enableTestSend:
Still not obvious? Although neither the above IntelliSense nor the MSDN documentation for this overload gives you any hint of this, enableTestSend is (contrary to convention) true when you omit it. This harmless-sounding flag then causes a major side effect, which is documented only as a remark on the most detailed description page of the read-only EnableTestSend property on the NotificationHubClient:
It is very easy to miss this limitation during development, because the first 10 devices you register will always get the notifications. On top of this, the 10-device limit is not mentioned anywhere in the Debugging Notification Hubs documentation or in the Azure tutorials. Perhaps this limitation is so well hidden and counterintuitive in the API that the builders of the Azure portal Debug tab, the tool and the tutorials missed it themselves?
The solution is simple: just set enableTestSend to false:
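As a minimal sketch of what such a send call could look like with the flag passed explicitly: the PushSender class, its method names and the parameter values are hypothetical, and the namespace assumes the current Microsoft.Azure.NotificationHubs NuGet package (the 2014 SDK shipped the same client in the Service Bus assembly). Only the three-argument CreateClientFromConnectionString overload and SendTemplateNotificationAsync are taken from the SDK itself.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.NotificationHubs; // assumption: current package name

// Hypothetical server-side helper, not the Poollie send code.
public static class PushSender
{
    // Builds the template parameters named in this post.
    public static Dictionary<string, string> BuildProperties(
        string title, string message, string navigation)
    {
        return new Dictionary<string, string>
        {
            { "Title", title },
            { "Message", message },
            { "Navigation", navigation }
        };
    }

    public static Task SendAsync(
        string connectionString, string hubName, string tag,
        IDictionary<string, string> properties)
    {
        // Third argument is enableTestSend; passing false explicitly
        // removes any doubt about which default applies.
        NotificationHubClient hub =
            NotificationHubClient.CreateClientFromConnectionString(
                connectionString, hubName, false);

        // Sends to every device that registered a template for this tag.
        return hub.SendTemplateNotificationAsync(properties, tag);
    }
}
```

A call such as `PushSender.SendAsync(connectionString, hubName, "#12345", PushSender.BuildProperties("Poollie", "Goal!", "match/1"))` would then address all devices of one user; the values shown are made up.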
I already asked Microsoft to update the Azure Notification Hubs tutorial sample code and documentation, as well as the MSDN documentation, with a clarifying comment about the 10-device limit. But until they do, I hope this post will prevent others from falling into the same trap – I actually only found this out after Poollie was published and distributed to 40000+ users.
Raygun tips for error tracking
Raygun.io is a great cross-platform library for error tracking. If you go for a paid account, you can create a separate application for each platform – something which I can recommend, especially if different developers are maintaining the app for different platforms.
Raygun collects a lot of information for you on each reported error: app version number, stack trace, device info and user info if you specify a user id or email. You can also pass additional information with an error report in the form of tags (a simple list of strings). In Poollie we used tags to specify the sequence of user actions leading up to the error (e.g. navigation events and edit actions). This provides us with a helpful (start of) error reproduction path:
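A minimal sketch of how such a trail of user actions could be collected, assuming a small bounded buffer: the ActionTrail class and its members are hypothetical names, and the code deliberately stops short of the error-tracking call itself. The list it produces is what you would hand to the tracker as tags.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers the most recent user actions so they
// can be attached to an error report as a list of string tags.
public sealed class ActionTrail
{
    private readonly int capacity;
    private readonly Queue<string> actions = new Queue<string>();
    private readonly object gate = new object();

    public ActionTrail(int capacity = 20)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    // Call this from navigation and edit code, e.g. Record("Navigate:Matches").
    public void Record(string action)
    {
        lock (gate)
        {
            if (actions.Count == capacity) actions.Dequeue(); // drop the oldest
            actions.Enqueue(action);
        }
    }

    // Snapshot in chronological order, oldest first.
    public IList<string> ToTags()
    {
        lock (gate) { return new List<string>(actions); }
    }
}
```

Keeping the buffer bounded means the report stays small even after a long session, at the cost of losing the oldest steps.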
In a Raygun error report, you then get this type of information:
The most important type of exception to report is the unhandled exception. These are the exceptions that you did not anticipate in your code and that have the highest potential to crash your app or make it nonfunctional. To easily spot this type of exception in Raygun, simply wrap it with a custom message before you report it in Raygun. E.g. in the Windows Phone Application_UnhandledException event:
You will not lose any detail info like this; the full stack trace of the inner exception is still reported in Raygun.
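The wrapping step itself needs nothing beyond the base class library. Here is a minimal sketch, with a hypothetical helper name and message prefix; on Windows Phone the exception to wrap is available as e.ExceptionObject in the Application_UnhandledException handler, and the wrapped result is what you would pass to your error tracker.

```csharp
using System;

// Hypothetical helper: marks an exception as unhandled while keeping
// the original (and its stack trace) as the inner exception.
public static class UnhandledExceptionWrapper
{
    public const string Prefix = "Unhandled exception: ";

    public static Exception Wrap(Exception original)
    {
        if (original == null) throw new ArgumentNullException(nameof(original));

        // The custom message makes these reports easy to spot and filter;
        // the original exception travels along as InnerException.
        return new Exception(Prefix + original.Message, original);
    }
}
```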
Raygun also lets you filter errors on user id, tags, version number etc. This proved a great help. Recommended!
ObservableCollection modifications on Windows Phone must be done on the UI thread
QuickCross ensures that the PropertyChanged data binding event is always raised on the UI thread, and that the code that handles the CollectionChanged data binding event for ObservableCollection on iOS and Android also uses the UI thread for UI related tasks. So in your viewmodels you normally do not need to care about which thread you are running on.
However on Windows Phone the built-in .NET data binding code that raises and handles the CollectionChanged event for ObservableCollections does not handle this for you. This means that you need to put #if WINDOWS_PHONE and RunOnUIThread() around any viewmodel code that adds, removes or replaces elements in an ObservableCollection.
This implementation of a RemoveNotification command demonstrates this:
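A minimal sketch of the idea, not the Poollie code: the class name is hypothetical, and RunOnUIThread is written here as a local stand-in built on SynchronizationContext, which is an assumption about how such a helper can be implemented.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Threading;

// Hypothetical viewmodel; create it on the UI thread so that the
// captured SynchronizationContext is the UI context.
public class NotificationListSketch
{
    private readonly SynchronizationContext uiContext;

    public ObservableCollection<string> Notifications { get; } =
        new ObservableCollection<string>();

    public NotificationListSketch(SynchronizationContext uiContext = null)
    {
        this.uiContext = uiContext ?? SynchronizationContext.Current;
    }

    // Stand-in for the RunOnUIThread() helper mentioned above.
    private void RunOnUIThread(Action action)
    {
        if (uiContext == null || SynchronizationContext.Current == uiContext)
            action();                              // no UI context, or already on it
        else
            uiContext.Post(_ => action(), null);   // marshal to the UI thread
    }

    public void RemoveNotification(string notification)
    {
#if WINDOWS_PHONE
        // CollectionChanged is handled by the built-in binding code,
        // so the modification itself must happen on the UI thread.
        RunOnUIThread(() => Notifications.Remove(notification));
#else
        Notifications.Remove(notification);
#endif
    }
}
```

Note that when the call is posted, the removal happens asynchronously, so code that runs right after RemoveNotification should not assume the collection has already changed.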
Cross-platform performance of sqlite-net versus JSON.NET
We started out using sqlite-net for local data storage; basically we just followed Xamarin’s recommendation of it as a cross-platform ORM. We figured that for the simple data model and limited amount of data that we needed for Poollie, such a widely used library should easily be able to handle the job on all three platforms… right? Wrong!
On Android and iOS we started out with acceptable performance; probably because on these platforms, sqlite-net is just an interface to the built-in OS version of SQLite, which is well optimized. However, when we added Windows Phone we got abysmal startup and persistence performance. The app took two or three times as long to start as on other platforms, and persisting the database could take more than 10 seconds; it was so slow that when exiting or switching away from the app, the OS sometimes terminated the app before it was done updating the database. This in turn led to corrupted data. Quite a mess.
Fortunately, we could easily keep all our local data in memory, so we could refactor from SQLite tables and queries using IQueryable to standard .NET in-memory Lists, which were persisted in local storage with JSON.NET and queried in-memory with IEnumerable. Here is how:
The SQLite database with its tables becomes a JSON-serializable class with a List for each table:
The entire JsonData object is then serialized/deserialized to/from a string with one very efficient JsonConvert.SerializeObject / JsonConvert.DeserializeObject call, and the string value is stored as a file in local storage.
Converting read queries from IQueryable to IEnumerable is straightforward:
The same goes for write queries:
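To make the shape of this refactoring concrete, here is a minimal sketch under stated assumptions: the Match and Prediction entities, their properties and the LocalStore helper are hypothetical, and only the JsonData name and the JsonConvert calls come from the description above. Writing the resulting string to a file is left out.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical entities standing in for the former SQLite tables.
public class Match
{
    public int Id { get; set; }
    public string HomeTeam { get; set; }
    public string AwayTeam { get; set; }
}

public class Prediction
{
    public int MatchId { get; set; }
    public int HomeScore { get; set; }
    public int AwayScore { get; set; }
}

// One list per former table; the whole object is persisted at once.
public class JsonData
{
    public List<Match> Matches { get; set; } = new List<Match>();
    public List<Prediction> Predictions { get; set; } = new List<Prediction>();
}

public static class LocalStore
{
    public static string Serialize(JsonData data)
    {
        return JsonConvert.SerializeObject(data);
    }

    public static JsonData Deserialize(string json)
    {
        return JsonConvert.DeserializeObject<JsonData>(json) ?? new JsonData();
    }

    // Read query: plain LINQ to Objects over the in-memory list.
    public static Prediction FindPrediction(JsonData data, int matchId)
    {
        return data.Predictions.FirstOrDefault(p => p.MatchId == matchId);
    }

    // Write query: replace any existing prediction for the same match.
    public static void SavePrediction(JsonData data, Prediction prediction)
    {
        data.Predictions.RemoveAll(p => p.MatchId == prediction.MatchId);
        data.Predictions.Add(prediction);
    }
}
```

Because the queries are synchronous and in-memory, the only I/O left is reading and writing one string, which you can do at moments of your own choosing.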
As a small bonus, this refactoring removes async/await overhead and the need to use ConfigureAwait(false).
Some Isolated Storage exceptions occur only AFTER you publish an app in the Windows Phone Store
Before you upload your app package to the Windows Phone store for publishing, you will of course test that exact package extensively in the emulator and on real devices. However some errors will only occur after you publish the app, as I found out after publishing the first version of Poollie. The store-processed package crashed at startup – on every device, every time – while the uploaded package did not.
This is a really painful situation and very hard to resolve as well. Your app is published in the store, visible for all the world, and it won’t even start! You cannot debug the store-produced package in Visual Studio, and if the crash occurs at startup, chances are that no error logging is persisted or published. So you have no clue as to the cause.
You also cannot test the Windows Phone Store produced package without actually publishing the app; even though you have the option to manually publish the app after the submitted package has been approved, there is no way to download or deploy the store-produced app package without actually publishing it in the store. So each potential fix has to be published through the store before you can test if it solves the problem. Ouch.
So what caused the app to crash? It was the restrictions on the accessible file paths in isolated storage, which only apply to store-produced packages. In my case, the file that contained our SQLite database was accessed at application startup like this:
Working with a file in the root of the application user store works fine on devices and emulators as long as you deploy an app package produced with Visual Studio, but it is not allowed in an app installed from the store. Instead a correct location for files on Windows Phone is in Windows.Storage.ApplicationData.Current.LocalFolder.Path:
Windows Phone performance of loading resource bitmaps with Splat versus data binding to native resource URIs
As was the case with SQLite, we also found that the performance of the Splat cross-platform library for loading images was much worse on Windows Phone than on iOS and Android. In Poollie we loaded some 50 small country flag PNGs of 70 x 50 pixels from the application resources into memory, like this:
This performed OK in iOS and Android, but in Windows Phone it added 6+ seconds to the application startup time!
Since we display these flags in the data-bound Source property of Image controls in Windows Phone XAML, what we need is a way to asynchronously load a bitmap if and when it needs to be displayed in a data-bound property. Even though you could code (cross-platform) solutions for this, this is not a simple task (e.g. see Patterns for Asynchronous MVVM Applications: Data Binding).
Fortunately, Windows Phone already has a built-in mechanism for this: you can bind to resource URIs directly and the OS will take care of the asynchronous loading on demand. Since Splat performs much better on iOS and Android we really only need a Windows Phone solution. So instead of creating full-blown cross-platform asynchronous image data binding, we use a fake Splat IBitmap class for Windows Phone that does not load a bitmap file but instead just stores a resource URI:
The class also has an implementation of the Splat IBitmap interface that does not do anything – it just enables instances of the class to be passed as an IBitmap in the cross-platform viewmodel code.
Then in the Windows Phone UI project a simple XAML converter can be used to retrieve the resource URI from the fake IBitmap:
Note that the optional converter parameter lets you specify a default bitmap URI, which is also rendered in the XAML visual designer.
We started out using the QuickCross MVVM + Application-Navigator pattern:
We developed some viewmodels that needed to show changes made in other viewmodels – e.g. match score predictions could be adjusted in several views. The QuickCross pattern specifies (for good reasons and with only a few specific exceptions) that viewmodels should know nothing about each other; initializing viewmodels and passing data such as navigation parameters between viewmodels is the responsibility of the Application class. So initially we tried to coordinate viewmodel updates by calling update methods on the Application.
However, because the update methods often had to be asynchronous, and view navigation is asynchronous as well, keeping the viewmodels updated led to timing-dependent, re-entrant spaghetti code. So we replaced the application update methods with a simple additional pattern: service-viewmodel events.
Service-Viewmodel Event Pattern
In short, the service-viewmodel event pattern entails that services implement events and raise these events in service methods. Viewmodels subscribe to these events to update themselves. Then when a viewmodel calls a service method to effect a change, all other viewmodels interested in that change are informed that they need to update themselves. Note that this requires a viewmodel to distinguish between events raised by itself and by other viewmodels.
E.g., this is how the NotificationChanged event is implemented in the WebAPIServiceAgent class:
This is just the standard .NET event pattern. To minimize the number of viewmodel (and therefore UI) updates, the event arguments contain fields that indicate the type and the origin of the changes: Changes and OriginId. Even though an event has a sender parameter, the OriginId lets you specify a wider context, such as a viewmodel.
E.g. here is how the NotificationsViewModel specifies its type name as the OriginId and checks against that to ignore events that were (indirectly) raised within its own context (and so prevents an unnecessary update):
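A minimal sketch of the pattern, reduced to the base class library: the event name and the Changes and OriginId fields follow the description above, while the service and viewmodel classes, the NotificationChanges enum and the UpdateCount counter are hypothetical stand-ins.

```csharp
using System;

[Flags]
public enum NotificationChanges { None = 0, Added = 1, Removed = 2 }

public class NotificationChangedEventArgs : EventArgs
{
    public NotificationChangedEventArgs(NotificationChanges changes, string originId)
    {
        Changes = changes;
        OriginId = originId;
    }

    public NotificationChanges Changes { get; }  // what changed
    public string OriginId { get; }              // context that caused it
}

// Hypothetical service; stands in for a service agent class.
public class NotificationServiceSketch
{
    public event EventHandler<NotificationChangedEventArgs> NotificationChanged;

    public void RemoveNotification(int id, string originId)
    {
        // ... perform the actual change here (web request, local data) ...
        NotificationChanged?.Invoke(
            this, new NotificationChangedEventArgs(NotificationChanges.Removed, originId));
    }
}

public class NotificationsListViewModel
{
    private static readonly string OriginId = typeof(NotificationsListViewModel).Name;
    private readonly NotificationServiceSketch service;

    public int UpdateCount { get; private set; } // stand-in for a real refresh

    public NotificationsListViewModel(NotificationServiceSketch service)
    {
        this.service = service;
        service.NotificationChanged += OnNotificationChanged;
    }

    public void Remove(int id)
    {
        service.RemoveNotification(id, OriginId);
    }

    private void OnNotificationChanged(object sender, NotificationChangedEventArgs e)
    {
        if (e.OriginId == OriginId) return; // raised in our own context: skip
        UpdateCount++;
    }
}
```

In a real app the viewmodel should also unsubscribe when it is discarded, since the service otherwise keeps it alive through the event subscription.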
Scheduled viewmodel update pattern
So now viewmodels know that they need to update themselves. But when is the best time to do that? You could of course update the viewmodel right in the above event handler. However, this may cause problems:
- If the viewmodel update method is asynchronous and may take some time (e.g. it does a web request), a sequence of quick change events (such as a user repeatedly tapping a button) could trigger the viewmodel update code to run multiple times simultaneously. That would most likely result in runtime errors and/or corrupted data.
- If the viewmodel update (or the associated UI update through data binding) requires significant CPU resources, updating all interested viewmodels for each change may cause the app to become less responsive. However viewmodels only need to be updated when their view is visible.
The scheduled viewmodel update pattern prevents these problems by scheduling updates to run only when the viewmodel has a visible view, and scheduling the updates sequentially when a change occurs while a previous change is still being processed.
E.g. this is how the same NotificationsViewModel event handler implements this:
ScheduleUpdate, IsShowing and OnShow are members of the viewmodel base class. This is the ScheduleUpdate implementation:
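One way such scheduling can be built is sketched below. The member names ScheduleUpdate, IsShowing and OnShow follow the text; the class name, OnHide and the bodies are assumptions for illustration, not the QuickCross or Poollie implementation. Updates requested while the view is hidden are deferred until OnShow, and updates requested while another is running are chained behind it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical viewmodel base class.
public class ViewModelBaseSketch
{
    private readonly object gate = new object();
    private Task tail = Task.CompletedTask; // last scheduled update
    private Func<Task> pending;             // update deferred while hidden

    public bool IsShowing { get; private set; }

    // Call from the view lifecycle when the view becomes visible.
    public Task OnShow()
    {
        Func<Task> update;
        lock (gate)
        {
            IsShowing = true;
            update = pending;
            pending = null;
        }
        return update == null ? Task.CompletedTask : ScheduleUpdate(update);
    }

    // Call from the view lifecycle when the view is hidden.
    public void OnHide()
    {
        lock (gate) { IsShowing = false; }
    }

    public Task ScheduleUpdate(Func<Task> update)
    {
        lock (gate)
        {
            if (!IsShowing)
            {
                pending = update; // later changes overwrite earlier ones
                return Task.CompletedTask;
            }
            tail = RunAfter(tail, update); // run after the previous update
            return tail;
        }
    }

    private static async Task RunAfter(Task previous, Func<Task> update)
    {
        await Task.Yield(); // return to the caller before running update code
        try { await previous; }
        catch { /* a failed earlier update must not block later ones */ }
        await update();
    }
}
```

The design choice here is that a hidden viewmodel keeps only the most recent update request, which fits updates that reload the full state; incremental updates would need a queue instead.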
IsShowing and OnShow are part of the view lifecycle support – they are called and set from the platform-specific view lifecycle event handlers in the view base classes. These are the events for each platform:
- iOS: ViewWillAppear, ViewDidDisappear
- Android: OnResume, OnPause
- Windows Phone: OnNavigatedTo, OnNavigatingFrom
Note that view lifecycle support will be included in the next QuickCross release.
Async performance and ConfigureAwait
Many of the APIs that you use in C# apps are async. Combined with the bubble-up nature of the C# async-await pattern (most methods that call an async method will need to await that call and therefore will become async themselves), this results in a LOT of await statements in your shared code; e.g. Poollie has 300 awaits in shared code.
However, as explained in the MSDN Magazine article Best Practices in Asynchronous Programming (under the headers Async All the Way and Configure Context), the default behavior for await causes performance loss in non-UI code and can potentially cause deadlocks.
When an incomplete Task is awaited, the current “context” is captured and used to resume the method when the Task completes. In native app development this is actually a good thing, since iOS, Android and Windows Phone all have a single UI thread that is the only thread allowed to access the UI. The default await behavior lets you move non-UI processing from the UI thread to a non-UI thread, therefore improving the responsiveness of the app, while ensuring that you are back on the UI thread when you continue with UI code.
But code that does not require being run on the UI thread is almost always indifferent to the thread(s) in which it runs. So, for awaits in this code you do not need to return to the original thread after each await; you can simply continue executing the method on whatever thread you happen to be on. You can specify this with the ConfigureAwait method of the Task class; simply append ConfigureAwait(false) to each statement that you await. E.g. when performing a web request on an HttpClient:
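A minimal sketch of what this looks like in shared, non-UI code: the class and method names are hypothetical, the HttpClient calls are the standard ones.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical service in the shared (non-UI) project.
public class MatchServiceSketch
{
    private readonly HttpClient client;

    public MatchServiceSketch(HttpClient client)
    {
        this.client = client;
    }

    public async Task<string> GetMatchesJsonAsync(string url)
    {
        // ConfigureAwait(false): do not capture the calling context, so the
        // rest of this method may continue on any thread pool thread.
        using (HttpResponseMessage response =
            await client.GetAsync(url).ConfigureAwait(false))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }
    }
}
```

A caller in a UI project can still write `var json = await service.GetMatchesJsonAsync(url);` without ConfigureAwait and will resume on the UI thread; only the code inside the shared method stops hopping back to it.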
Since we are using the MVVM pattern we already have the non-UI code (90% of all code) conveniently separated in a shared code project. The approach we took with Poollie was to simply add ConfigureAwait(false) to all await statements in the shared project – effectively changing the default await behavior for all non-UI code.
Here is a handy regular expression that you can search on in Visual Studio to find any lines with await but without ConfigureAwait:
E.g. like this:
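One pattern that does this job, offered as an assumption rather than the exact expression used for Poollie, is a negative lookahead: `\bawait\b(?!.*ConfigureAwait)`. The sketch below applies it line by line with the .NET regex engine, which is handy if you want the same check in a script or a unit test; the AwaitAudit class is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical audit helper for shared-project source files.
public static class AwaitAudit
{
    // Matches the keyword await when no ConfigureAwait follows it
    // on the same line.
    public static readonly Regex MissingConfigureAwait =
        new Regex(@"\bawait\b(?!.*ConfigureAwait)");

    // Returns the 1-based numbers of the lines that match.
    public static List<int> FindLines(IEnumerable<string> lines)
    {
        var hits = new List<int>();
        int number = 0;
        foreach (string line in lines)
        {
            number++;
            if (MissingConfigureAwait.IsMatch(line)) hits.Add(number);
        }
        return hits;
    }
}
```

The check is line-based, so it has known blind spots: an await whose ConfigureAwait sits on the next line is reported even though it is fine, and a line with two awaits is missed when only the first one lacks ConfigureAwait. Treat the results as a review list, not as proof.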
Note that ConfigureAwait(false) is only a sensible default; it is not a 100% rule for all non-UI code. You could program valid non-UI code that has thread affinity; for those rare awaits that do need to return to the original context, you can make this explicit by appending ConfigureAwait(true) plus some comment on why this is needed. In the UI projects, using await ‘normally’ without ConfigureAwait is still a sensible default; it improves code readability.
In Poollie, changing the 300 await statements in shared code resulted in shaving off several seconds from the app startup time. It also made the app noticeably more responsive.
What would I do differently next time?
In addition to the lessons learned while implementing Poollie, what would I consider doing differently for a new native cross-platform app? Here is my shortlist:
- Use Xamarin Forms to also share a large part of the UI across platforms (Forms was not yet available when we built Poollie).
- Consider using SignalR for the entire API – instead of polling for changes from the client, let changes be pushed to viewmodels no matter whether they originate on the same client device or on the server.
- Consider designing the API to deliver sparse JSON viewmodel updates instead of a more granular data model API, and use the method JsonConvert.PopulateObject to update existing viewmodels with changes. This will eliminate a lot of client-side code to compose viewmodels from data models, require fewer and smaller requests and responses to update a view, and result in better client performance.
- Consider using the Reactive Extensions for .NET, as advocated in this Xamarin blog post, possibly in similar fashion as Reactive UI.
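To illustrate the sparse-update idea from the list above, here is a minimal sketch using JsonConvert.PopulateObject from Json.NET. The viewmodel class, its properties and the SparseUpdate helper are hypothetical; a real viewmodel would also raise change notifications from its property setters.

```csharp
using Newtonsoft.Json;

// Hypothetical viewmodel with plain properties.
public class MatchViewModelSketch
{
    public string HomeTeam { get; set; }
    public string AwayTeam { get; set; }
    public int HomeScore { get; set; }
    public int AwayScore { get; set; }
}

public static class SparseUpdate
{
    // Applies only the properties present in the JSON; everything
    // else on the existing object keeps its current value.
    public static void Apply(object viewModel, string sparseJson)
    {
        JsonConvert.PopulateObject(sparseJson, viewModel);
    }
}
```

For example, applying `{"HomeScore":5,"AwayScore":1}` to an existing match viewmodel changes the two scores and leaves the team names untouched, so the server only needs to send what changed.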
Building and supporting Poollie proved once again that Xamarin is a solid technology choice for building mobile apps. The development speed, level of reuse and grip on the user experience that Xamarin offers across platforms are unparalleled. With the introduction of Xamarin Forms the entire proposition becomes even more interesting.
The technology support for Xamarin is abundant; many high quality libraries and services support it out of the box and native libraries are easily leveraged as well. Even though some of these supporting technologies require knowledge of dodgy details, they enable you to quickly build powerful native apps – for consumers and for enterprises.
Delivering a consumer app to 47000 users across Android, iOS and Windows Phone has been very rewarding and a great learning experience. On to the next App!