Wikitude SDK 7: A developer insight

Phil

Wikitude SDK 7 includes a long list of changes and additions to our augmented reality SDK. In this blog post, we will go through the modifications in more detail and what they offer for developers and users.

As you will see, SDK 7 has 3 key areas of improvement: Object Recognition and Tracking based on SLAM, multiple image recognition and enhancements for iOS developers.

Bring your objects into your augmented reality scene

Let’s get started with the biggest addition in this release: Object Recognition and Tracking for augmented reality. With this, we introduce a new tracker type alongside our existing Image and Instant Tracking. The Object Tracker in the SDK lets you recognize and track arbitrarily shaped objects. The idea behind it is very similar to our Image Tracker, but instead of recognizing images and planar surfaces, the Object Tracker works with three-dimensional structures and objects (tools, toys, machinery…). As you may have noticed, we don’t claim that the Object Tracker works on any kind of object. There are some restrictions you should be aware of, and some types of objects work a lot better than others. The SDK 7 documentation has a separate chapter on that.

In short: objects should be well structured and their surface should be well textured to play nicely with object recognition. API-wise, the Object Tracker is set up the same way as the Image Tracker.

The Object Tracker works according to the same principle as the Image Tracker: it detects a pre-recorded reference of the object (the reference is actually a pre-recorded SLAM map). Once detected in the camera, the object is continuously tracked. While providing references for Image Targets is straightforward (an image upload), creating a reference for an object is a little more complex.
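To give you an idea of how this looks in practice, here is a minimal sketch of an Object Tracker set up with the JavaScript API, mirroring the familiar Image Tracker pattern. The file names and callback names are illustrative assumptions; please verify the exact signatures against the SDK 7 JavaScript API reference.

```js
// Minimal Object Tracker sketch (JavaScript API). File and callback names
// are illustrative -- check the SDK 7 documentation for exact signatures.
var targetCollectionResource = new AR.TargetCollectionResource("assets/firetruck.wto");

var objectTracker = new AR.ObjectTracker(targetCollectionResource, {
    onTargetsLoaded: function() {
        // the Object Target (pre-recorded SLAM map) is ready to be recognized
    }
});

var hintModel = new AR.Model("assets/hints.wt3");

var objectTrackable = new AR.ObjectTrackable(objectTracker, "*", {
    drawables: { cam: [hintModel] },
    onObjectRecognized: function() { /* object found in the camera frame */ },
    onObjectLost: function() { /* tracking of the object was lost */ }
});
```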

Scalable generation of object references

We decided to go for an approach that is scalable and works for a large number of users. This ruled out a dedicated recording application for capturing your object, which would also require each object to be physically present. Considering this, we went for server-side generation of Object Targets (sometimes also referred to as maps). Studio Manager, our web tool for converting Image Targets, has been adapted to convert regular video files into Object Targets. You will find a new project type in Studio Manager that produces Object Targets for you. Here’s a tutorial on how to successfully record objects.

https://www.youtube.com/watch?v=eY8B2A_OYF8

After you have uploaded your video, the backend will try to find the best possible Object Target over several computation runs. We can utilize the power of the server to run computationally intensive algorithms and reach a more accurate result than a purely on-device solution that has to operate in real time. Have a look at the chapter “How to create an Object Target” in the SDK 7 documentation for a deeper understanding of the process. Server-side generation also gives us the ability to roll out improvements to the recording process without the need for a new SDK version.

Rendering Upgrade: Working with occlusion models

When moving from Image Targets to Object Targets, requirements for rendering change as well. When the object has a solid body with several distinct sides, it is particularly important to reflect that when rendering the augmentations. SDK 7 introduces a new type in the JavaScript API called AR.Occluder, which can take any shape. It acts as an occlusion model in the 3D rendering engine, so you can hide augmentations or make them a lot more realistic. For your convenience, the occluder can either use standard pre-defined geometric shapes or take the form of any 3D model/shape (in wt3 format). And it is not only Object Tracking that benefits from this: occluders can, of course, be used in combination with Image Targets as well. Think of an Image Target placed on your wrist for trying on watches; for a proper result, parts of the watch need to be hidden behind your actual arm.
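As a rough illustration of the wrist-watch scenario, here is a sketch of how an AR.Occluder could be combined with a regular model on an Image Target. The constructor arguments, file names and target name are assumptions made for the example; the exact options are described in the SDK 7 JavaScript API reference.

```js
// Occluder sketch: an arm-shaped occluder (wt3) hides the parts of the
// watch that should disappear behind the user's arm. All names and values
// are illustrative.
var imageTracker = new AR.ImageTracker(new AR.TargetCollectionResource("assets/wrist.wtc"));

var armOccluder = new AR.Occluder("assets/arm.wt3", {
    scale: { x: 1.0, y: 1.0, z: 1.0 }
});

var watchModel = new AR.Model("assets/watch.wt3");

var wristTrackable = new AR.ImageTrackable(imageTracker, "wrist-target", {
    // the occluder only affects the depth buffer, so the watch is clipped
    // wherever the arm geometry sits in front of it
    drawables: { cam: [armOccluder, watchModel] }
});
```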

Updated SLAM engine enhancing Instant Tracking

Object Recognition and Tracking is based on the same SLAM engine that powers Instant Tracking and Extended Tracking. To make Object Recognition work, we upgraded the SLAM engine with several improvements, changes to the algorithm and bug fixes to the engine itself. This means SDK 7 carries an entirely revamped SLAM engine. You as a developer and your users will notice that in several ways:

1. Higher degree of accuracy in Instant Tracking and Extended Tracking
2. More stable tracking when it comes to rotation
3. Less memory consumption
4. Less power consumption

All in all, that means that devices running 32-bit CPUs (ARMv7 architecture) in particular will see a considerable performance boost.

Instant Tracking also comes with two new API additions. Setting trackingPlaneOrientation for the InstantTracker lets you freely define on which kind of plane Instant Tracking should start (wall, floor, ramp…). The other addition is the hit testing API, which lets you query the depth value of any given screen point (x, y). It returns the 3D coordinates of the corresponding point in the currently tracked scene, which is useful for placing augmentations at the correct depth. The SDK returns an estimate based on the surrounding tracked points. The video below gives you an idea of how the hit testing API can be used.
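Here is a short sketch of both additions. The meaning of the trackingPlaneOrientation value and the name of the hit-test method are assumptions based on how the rest of the InstantTracker API is structured; treat them as placeholders and consult the SDK 7 JavaScript API reference for the exact calls.

```js
// Instant Tracker starting on an inclined plane; the orientation value is
// assumed here to be an angle in degrees (0 = floor, 90 = wall).
var instantTracker = new AR.InstantTracker({
    trackingPlaneOrientation: 45.0
});

// Hit testing sketch: query the 3D point behind a tapped screen coordinate
// so an augmentation can be placed at the correct depth. The method name is
// an assumption -- check the SDK 7 API reference for the exact signature.
function onScreenTouched(screenX, screenY) {
    instantTracker.convertScreenCoordinateToPointCloudCoordinate(screenX, screenY,
        function(x, y, z) {
            // success: (x, y, z) is the estimated point in the tracked scene
        },
        function() {
            // no depth estimate could be derived for this screen point
        }
    );
}
```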

1, 2, 3… Multiple targets now available

Additionally, our computer vision experts worked hard to make our Image CV engine even better. The most noticeable change is the ability to recognize and track multiple images in the camera frame at the same time. The engine can detect multiple different images, as well as multiple copies of the same image in the camera (e.g. for counting purposes). Images can overlap or even superimpose each other. The SDK does not have a hard-coded limit on the number of images it can track – only the processing power of the phone puts a restriction on it. With modern smartphones, it is easily possible to track eight or more images.

Furthermore, the SDK offers developers the ability to get more information about the targets in relation to each other. APIs tell you how far apart targets are and how they are oriented towards each other. Callbacks let developers react to changes in the relationship between targets. Developers can also define the maximum number of targets, so the application does not waste power searching for further targets. The image below gives you an idea of what this feature can look like for a simple interactive card game.
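The following sketch shows how such a card-game setup might look with the JavaScript API: one trackable matching any target in the collection, a cap on the number of simultaneously tracked targets, and a callback reacting to distance changes between targets. The property and callback names are our assumptions for illustration, so please verify them against the SDK 7 API reference.

```js
// Multiple image targets sketch -- names and values are illustrative.
var targetCollectionResource = new AR.TargetCollectionResource("assets/cards.wtc");

var cardTracker = new AR.ImageTracker(targetCollectionResource, {
    // don't waste processing power searching for more targets than needed
    maximumNumberOfConcurrentlyTrackableTargets: 5
});

var cardAugmentation = new AR.ImageDrawable(new AR.ImageResource("assets/overlay.png"), 1.0);

// "*" matches any target in the collection, so one trackable handles
// every card that appears in the camera frame.
var cardTrackable = new AR.ImageTrackable(cardTracker, "*", {
    drawables: { cam: [cardAugmentation] },
    onImageRecognized: function(target) {
        // target identifies which card was recognized
    },
    // react when recognized cards move closer to or further from each other
    distanceToTarget: {
        changedThreshold: 50, // illustrative threshold
        onDistanceChanged: function(distance, destinationTarget) {
            // e.g. trigger an interaction when two cards are placed side by side
        }
    }
});
```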

Boosting recognition to unparalleled distances

All developers and users who require images to be recognized from far away in their augmented reality scene should take a look at the extended range recognition feature included in SDK 7. By using more information from the camera frame, SDK 7 triples the recognition distance compared to previous SDK versions. This means that an A4/US-letter sized target can be detected from 2.4 meters/8 feet away. Put differently, images that cover only 1% of the visible screen area can still be accurately recognized and a valid pose successfully calculated. The SDK enables this mode automatically for devices capable of this feature (auto-mode); alternatively, developers can manually enable or disable the function. When testing the feature and comparing it to competing SDKs, we did not find any other implementation delivering this kind of recognition distance. All in all, this means easier handling for your users and more successfully recognized images.
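If you want to control the mode explicitly instead of relying on auto-mode, the setting lives on the Image Tracker. The property and constant names below are assumptions made for illustration; the SDK 7 documentation lists the exact values.

```js
// Extended range recognition sketch -- property and constant names assumed.
var longRangeTracker = new AR.ImageTracker(targetCollectionResource, {
    // AUTO lets the SDK decide based on device capability; ON/OFF force the mode
    extendedRangeRecognition: AR.CONST.EXTENDED_RANGE_RECOGNITION.ON
});
```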

Bitcode, Swift, Metal – iOS developers rejoice

This brings us to a chapter dedicated to iOS developers, as SDK 7 brings several changes and additions for this group. First of all, the Wikitude SDK now requires iOS 9 or later, which shouldn’t be a big hurdle for the majority of apps (currently nearly 95% of devices meet this requirement). With SDK 7, iOS developers can now build apps including the Wikitude SDK using the bitcode option. Apps built with bitcode have the benefit of being smaller, as only the version necessary for the actual device architecture (armv7, armv7s, armv8) is delivered to the user, rather than a fat binary containing all architectures.
As a more than welcome side effect of restructuring our build dependencies to be compatible with bitcode, the Wikitude SDK can now also run in the iOS simulator. You still won’t see a camera image from your webcam in the simulator, but you can work with pre-recorded movie files as input for the simulator.

In SDK 6.1 we introduced support for OpenGL ES 3 as a graphics API. SDK 7 now also lets you use Metal as your rendering API in Native and Unity projects. Speaking of new additions, Wikitude SDK 7 also includes an extensive Swift sample explaining how to integrate the Wikitude SDK into a Swift application. Note that the API itself is still an Objective-C API, but the sample makes it a lot clearer how to use it within a Swift environment.

We haven’t forgotten Android

Android developers will be happy to hear that the Android version of Wikitude SDK 7 uses a different sensor implementation for Geo AR experiences. The result is smoother and more accurate tracking when displaying geo-related content. For Android, we are also following the trend and raising the minimum required version slightly to Android 4.4 or later, which still covers at least 90% of Android devices.

We hope you can put SDK 7 and its additions to good use in your AR projects. We’d love to hear from you and are keen to receive suggestions on how to make the Wikitude SDK even more useful to you!

Start developing with Wikitude SDK 7

Getting started with SDK 7 has never been easier! Here’s how:

Help us spread the news on Twitter, Facebook and LinkedIn using the hashtags #SDK7 and #Wikitude.
