Wikitude SDK 6 – A technical insight

Phil

Update (August 2017): Object recognition, multi-target tracking and SLAM: Track the world with SDK 7

In this blog post, Wikitude CTO Philipp Nagele shares some insights into the technical background of SDK 6 and the changes that go along with the new version of Wikitude’s augmented reality SDK.

The planning and work for SDK 6 date back to 2015. While working on the previous release, 5.3, to add support for Android 7.0 and iOS 10, we already had a clear plan for what the next major release of our SDK should include. It is very gratifying that we can now lift the curtain on the scope and details of what we think is the most comprehensive release in the history of Wikitude’s augmented reality SDK.

While Instant Tracking is without doubt the highlight feature of this release, there are many more changes included that are worth mentioning. In this blog post, I’ll summarize the noteworthy changes and add some background information and insights.

Pushing the boundaries of image recognition

Most of our customers choose the Wikitude SDK for its ability to recognize and track images for various purposes. The engine that powers it has been refined over the years and, already with SDK 5, reached a level of performance (in both speed and reliability) that puts it at the top of augmented reality SDKs today.
With Wikitude SDK 6, our computer vision engine takes another major step forward. In more detail, the .wtc file format now uses a different approach to generating search indexes, which improves the recognition rate. Measured on the Stanford MVS data set, SDK 5 delivered a recognition rate of around 86%, while SDK 6 now recognizes 94 out of 100 images correctly. Moreover, the recognition rate stays above 90% independent of the size of the .wtc file. So, no matter whether your .wtc file includes 50 or 1,000 images, users will successfully recognize your target images.

Another development we have been working on lately is embracing the power of genetic algorithms to optimize our computer vision algorithms. Several thousand experiments and many hours on our servers led to an optimized configuration of our algorithms. The result is a 2D image tracking engine that tracks targets in many more scenes and lighting conditions. You can get a first impression of the increased robustness in this direct comparison between the 2D engine in SDK 5 and SDK 6. The footage is unedited and shows a setup with several challenging factors:

  • Low-light condition (single spot light source)
  • Several occluding objects
  • Strong shadows further occluding the images
  • Busy scenery in general
  • Reflections and refractions

The improvements in SDK 6 make the 2D engine more robust while keeping the same performance in terms of battery consumption and speed.

Instant Tracking – SLAM in practice

You might have seen our previous announcements and advancements in not only recognizing and tracking two-dimensional images, but also working with entire maps of three-dimensional scenes. It is no secret that Wikitude has been working on several initiatives in the area of 3D tracking.

For the first time, Wikitude SDK 6 includes a generally available feature based on a 3D computer vision engine that has been developed in-house over the past two years. Instant Tracking is based on a SLAM (simultaneous localization and mapping) approach that tracks the surroundings of the device and localizes the device within them. In contrast to image recognition, Instant Tracking does not recognize previously recorded items, but instantaneously tracks the user’s surroundings. While the user keeps moving, the engine extends the recorded map of the scenery. If tracking is lost, the engine immediately tries to re-localize and start tracking again, with no need for user input.
Instant Tracking is true markerless tracking: no reference image or marker is needed. The initialization phase is instant and does not require a special initialization movement or pattern (e.g. the translation movement required by PTAM).
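
To make the workflow more concrete, here is a minimal sketch using the JavaScript API. The class and property names (AR.InstantTracker, AR.InstantTrackable, deviceHeight, AR.InstantTrackerState) reflect our reading of the SDK 6 reference and sample app and should be treated as assumptions to verify against the documentation; the asset path is a placeholder.

```js
// A minimal Instant Tracking sketch using the SDK 6 JavaScript API.
// Names follow our reading of the reference; verify against the docs
// and the bundled sample app.
var World = {
    init: function () {
        // The tracker starts in an initialization state in which the user
        // defines the ground plane; deviceHeight (in meters) helps the
        // engine estimate real-world scale.
        this.tracker = new AR.InstantTracker({
            deviceHeight: 1.0
        });

        // A 3D model placed on the instantly tracked surface (placeholder asset).
        var model = new AR.Model("assets/furniture.wt3", {
            scale: { x: 0.05, y: 0.05, z: 0.05 }
        });

        this.trackable = new AR.InstantTrackable(this.tracker, {
            drawables: { cam: [model] }
        });
    },

    // Called e.g. from a button once the user has aligned the ground plane:
    // switch the tracker from initialization to tracking.
    startTracking: function () {
        this.tracker.state = AR.InstantTrackerState.TRACKING;
    }
};

World.init();
```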

The same engine used for Instant Tracking also powers Extended Tracking in the background. Extended Tracking uses an image from the 2D computer vision engine as its initialization starting point instead of an arbitrary surface, as in the case of Instant Tracking. After a 2D image has been recognized, the 3D computer vision engine starts recording the environment and stays in tracking mode even when the user is no longer viewing the image.
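
As a hedged sketch of how this looks in the JavaScript API: the enableExtendedTracking flag and the trackable signature below follow our understanding of the SDK 6 reference, and the target name and asset paths are placeholders.

```js
// Extended Tracking sketch: keep augmenting the scene after the target image
// leaves the camera view. Names per our reading of the SDK 6 JavaScript API.
var targets = new AR.TargetCollectionResource("assets/targets.wtc");
var imageTracker = new AR.ImageTracker(targets);

var overlay = new AR.ImageDrawable(new AR.ImageResource("assets/overlay.png"), 1.0);

var trackable = new AR.ImageTrackable(imageTracker, "myTarget", {
    // Hands the recognized image over to the 3D engine so tracking continues
    // once the image itself is no longer visible.
    enableExtendedTracking: true,
    drawables: { cam: [overlay] }
});
```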

For details on how to get started with Instant Tracking, visit our documentation, see our sample app (included in the download package), and watch the instruction video to learn how to track the environment. SDK 6’s markerless augmented reality feature is available for Unity, JavaScript, Native, the extensions, and smart glasses (Epson and ODG).

Putting the Pieces Together – Wikitude SDK and API

What is the strength of the Wikitude SDK? It is much more than a collection of computer vision algorithms. When planning a new release, the product and engineering teams aim to create a cross-platform SDK that is highly usable. We try to think of use cases for our technology and then identify missing features. So it should come as no surprise that Wikitude SDK 6 is packed with changes and new features beyond computer vision.

The most obvious and noticeable change, especially for your users, is FullHD rendering of the camera image. Previously, Wikitude rendered the camera stream in standard definition (SD) quality, which was perfectly fine back in 2012 when the Wikitude SDK hit the market. Since then, device manufacturers have introduced Retina displays and pixel densities beyond what the eye can distinguish. An image rendered in VGA resolution on this kind of display just doesn’t look right anymore. In Wikitude SDK 6, developers can now choose between SD, HD, or FullHD rendering of the camera stream.

Additionally, on some devices, users can now enjoy a smoother rendering experience, as the rendering frequency can be increased to 60 fps. For Android, these improvements are based on new support for the Camera2 API, which has been the successor to the original camera API since Android 5.0 (technically, more than 60% of Android devices as of January 1, 2017 should run the Camera2 API). It allows fine-grained control of and access to the camera and its capabilities. While the API and the idea behind it are a welcome improvement, the implementations of the Camera2 API across the various Android vendors are diverse. Different implementations of an API are never a good thing, so support for the new camera features is limited to participating Android devices.
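
For orientation, here is a hedged sketch of the camera control that SDK 6 exposes to the JavaScript API. The property and constant names below are assumptions based on our reading of the AR.hardware.camera reference; to the best of our understanding, the SD/HD/FullHD resolution and the 60 fps mode themselves are chosen in the platform startup configuration rather than from JavaScript, and those setting names vary per platform.

```js
// Hedged sketch of SDK 6 camera control from JavaScript; verify the exact
// property and constant names in the AR.hardware.camera reference.
AR.hardware.camera.position = AR.CONST.CAMERA_POSITION.BACK;          // back-facing camera
AR.hardware.camera.focusMode = AR.CONST.CAMERA_FOCUS_MODE.CONTINUOUS; // continuous autofocus
AR.hardware.camera.zoom = 2.0;                                        // if supported by the device

// The camera rendering resolution (SD / HD / FullHD) and the optional 60 fps
// mode are selected when the SDK is launched via the platform's startup
// configuration; those settings are not shown here.
```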

“Positioning” was another feature needed to allow users to interact with augmentations, and it is ideal for placing virtual objects in unknown environments. With Wikitude SDK 6, developers now have a consistent way to react to and work with multi-touch gestures. Dragging, panning, rotating – the most commonly used gestures on touch devices are now captured by the SDK and exposed in easy-to-understand callbacks. This feature has been implemented in a way that lets you use it in combination with any drawable in any of the different modes of the Wikitude SDK – be it Geo AR, Image Recognition, or Instant Tracking.
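
A hedged sketch of what these callbacks look like on a drawable: the callback names and, in particular, the parameter semantics below are assumptions based on our reading of the SDK 6 JavaScript API; check the gesture sample in the download package for the exact signatures.

```js
// Multi-touch gesture callbacks on a 2D drawable (sketch; parameter semantics
// are assumptions and may differ from the actual SDK behavior).
var sticker = new AR.ImageDrawable(new AR.ImageResource("assets/sticker.png"), 1.0, {
    onDragChanged: function (x, y) {
        // follow the finger while dragging
        this.translate = { x: this.translate.x + x, y: this.translate.y + y, z: this.translate.z };
    },
    onScaleChanged: function (scale) {
        // pinch to resize
        this.scale = { x: scale, y: scale, z: scale };
    },
    onRotationChanged: function (angle) {
        // two-finger rotation around the z-axis
        this.rotate = { x: 0, y: 0, z: angle };
    }
});
```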

The new tracking technology (Instant Tracking) led us to another change, which developers will encounter quite quickly when using SDK 6. Our previous scheme of ClientTracker and CloudTracker no longer fit an SDK with a growing number of tracker types. SDK 6 therefore introduces a different tracker scheme with more intuitive naming. For now, you will encounter ImageTracker with various resources (local or cloud-based) and InstantTracker, with more tracker types coming soon. We are introducing this change in SDK 6 while keeping it fully backward compatible with the SDK 5 API, parts of which are now deprecated. The SDK comes with an extensive migration guide for all platforms, detailing the changes.
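
As a hedged illustration of the renaming in the JavaScript API (class and callback names follow our understanding of the SDK 6 reference; the migration guide is the authoritative mapping):

```js
// SDK 5 (deprecated): the tracker name described where the targets came from.
var legacyTracker = new AR.ClientTracker("assets/targets.wtc");

// SDK 6: the tracker type describes what is tracked; the resource says where
// the target collection comes from.
var resource = new AR.TargetCollectionResource("assets/targets.wtc");
var imageTracker = new AR.ImageTracker(resource, {
    onTargetsLoaded: function () {
        // ready to recognize targets
    }
});

// Markerless experiences get their own tracker type.
var instantTracker = new AR.InstantTracker();
```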

Last, I don’t want to miss the opportunity to talk about two minor changes that I think can have a great impact on several augmented reality experiences. Both are related to visualizing drawables. The first change affects the way 2D drawables are rendered when they are attached to geo-locations. So far, 2D drawables attached to a geo-location have always been aligned to the user. Now, developers can align drawables as they wish (e.g. facing north), and the drawables will keep that orientation. The second change also affects 2D drawables: the new SDK 6 API unifies how 2D and 3D drawables are positioned, which adds the ability to position 2D drawables along the z-axis.
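
A hedged sketch of the unified positioning for 2D drawables follows; the property names reflect our reading of the SDK 6 reference, and the geo-coordinates are example values only.

```js
// Unified positioning sketch: a 2D drawable offset along all three axes,
// including z, and attached to an example geo-location (Salzburg).
var marker = new AR.ImageDrawable(new AR.ImageResource("assets/marker.png"), 2.0, {
    translate: { x: 0.0, y: 1.0, z: 0.5 }
});

var poi = new AR.GeoObject(new AR.GeoLocation(47.8077, 13.0333), {
    drawables: { cam: [marker] }
});

// A fixed orientation (e.g. keeping the drawable facing north instead of the
// user) is configured on the drawable as well; see the reference for the
// exact property, which is not shown here.
```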

Naturally, all of our official extensions are compatible with the newest features. The Titanium (Titanium 6.0) and Unity (Unity3D 5.5) extensions now support the latest releases of their development environments, and x86 builds are now available in Unity.

The release comes with cross-platform samples (e.g. gestures are demonstrated in a Snapchat-like photo-booth experience) and documentation for each of the new features, so you can start working with the new release immediately.

Start developing with Wikitude SDK 6

Getting started with Wikitude’s new SLAM-based SDK is super easy! Here’s how:

  1. Download SDK 6 and sample app
  2. Check out our documentation
  3. Select your license plan
  4. Got questions? Our developers are here for you! 

We are excited to see what you will build with our new SDK. Let us know what your favorite feature is via Twitter and Facebook using the hashtags #SDK6 and #Wikitude!
