October 24, 2013

The Wikitude Tracking Engine and Wikitude Studio – The CTO's Perspective

In recent months, Wikitude has released several cutting-edge tools and technologies, most notably a fully offline image recognition engine capable of tracking more than 1,000 target images at a time. We’ve also introduced Wikitude Studio, a web-based creation and content management tool for augmented reality content. This article takes a deeper look at these two solutions and provides a bit more information about the underlying engine and technologies.

Natural Feature Tracking at Wikitude

The image recognition engine developed by Wikitude is based on the Natural Feature Tracking (NFT) principle. As the name implies, this field of computer vision (CV) uses naturally occurring features to track images. The alternative approach is marker tracking, which tracks a scene using artificial shapes such as barcodes. We focused on NFT, as our primary goal is to incorporate and interact with as much of the “real world” as possible. Artificial markers are a good fit for certain scenarios, but we cannot assume that these markers will be in place each and every time end users want to scan their environment, especially across the broad use case categories Wikitude operates in. To make a device “see”, several steps need to be executed:
  1. Preprocessing: The target image that should be tracked is analyzed. Significant areas in the image, so-called Feature Points, are extracted and stored. How the Feature Points are detected and stored is the core of the algorithm used. The preprocessing step is executed only once per tracked image, and can run offline and asynchronously.
  2. Feature Point Detection: On the device, similar to the preprocessing step, the current camera image is analyzed for keypoints.
  3. Tracking: These recognized keypoints are then compared with the keypoints generated from the target images in step 1. If a pre-defined similarity is reached, the image is considered a positive match and then tracked. Several algorithms exist to determine a threshold of similarity; which one to choose is up to the implementation.
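The three steps above can be sketched in a few lines of code. The toy example below assumes binary descriptors (in the spirit of BRIEF/ORB-style methods) compared by Hamming distance; the function names and thresholds are illustrative, not Wikitude's actual implementation.

```python
# Toy sketch of the three NFT steps: preprocess a target image offline,
# detect descriptors in the live camera frame, then match with a threshold.

def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def preprocess(target_descriptors):
    """Step 1: done once, offline - store the target's feature descriptors."""
    return list(target_descriptors)

def detect(camera_descriptors):
    """Step 2: on-device - extract descriptors from the current camera frame."""
    return list(camera_descriptors)

def match(frame, target, max_dist=2, min_matches=2):
    """Step 3: count descriptor pairs closer than max_dist; the frame is a
    positive match once enough pairs fall under the similarity threshold."""
    hits = sum(
        1 for f in frame
        if any(hamming(f, t) <= max_dist for t in target)
    )
    return hits >= min_matches

target = preprocess([0b10110010, 0b01101100, 0b11110000])
frame = detect([0b10110011, 0b01101100, 0b00001111])  # two near-duplicates
print(match(frame, target))  # two descriptors within distance 2 -> True
```

A production engine would of course use real keypoint detectors and robust geometric verification on top of the raw descriptor matches, but the structure of the pipeline is the same.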

Tracking vs. Cloud Recognition

NFT comes in two flavors: offline tracking and online cloud recognition. With cloud recognition, a single image is sent from the device to a server; the server scans for matches in its database and returns the best fit, typically including the position of the best fit in the original image sent to the server. The main advantage of cloud recognition is that servers are capable of searching through a huge database (think “thousands of target images”, potentially via distributed systems). The obvious disadvantage is that the process is asynchronous and not real time. It can take several seconds to complete the round-trip to the server, depending on the network connection of the device. Offline tracking solutions, by contrast, store the information about the target images on the device itself and constantly compare the camera image with that database. This allows for a smooth tracking experience, but also puts heavy requirements on the computational power of the device.
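The cloud-recognition round trip described above can be sketched as a simple request/response: the device uploads one frame's signature, the server scans its whole database and answers with the best-matching target plus its position in the query image. The field names and the in-memory "server" below are illustrative assumptions, not Wikitude's actual API.

```python
# Minimal sketch of a cloud-recognition lookup: one query, one best match.
from dataclasses import dataclass

@dataclass
class RecognitionResult:
    target_id: str     # which database image matched best
    confidence: float  # match score in [0, 1]
    corners: list      # position of the match in the uploaded image (pixels)

def cloud_recognize(frame_signature, database):
    """Server side: scan the whole database for the closest signature."""
    best_id, best_score, best_corners = None, 0.0, None
    for target_id, (signature, corners) in database.items():
        # toy similarity: fraction of overlapping signature entries
        score = len(frame_signature & signature) / max(len(signature), 1)
        if score > best_score:
            best_id, best_score, best_corners = target_id, score, corners
    if best_id is None:
        return None  # nothing in the database resembles the query
    return RecognitionResult(best_id, best_score, best_corners)

database = {
    "magazine_p12": ({101, 202, 303, 404}, [(0, 0), (640, 0), (640, 480), (0, 480)]),
    "poster_a":     ({111, 222, 333, 444}, [(0, 0), (320, 0), (320, 240), (0, 240)]),
}
result = cloud_recognize({101, 202, 303, 999}, database)
print(result.target_id, result.confidence)  # magazine_p12 0.75
```

The network round-trip dominates the latency here: the lookup itself is cheap on a server, but the device has to wait for the response before it can augment anything.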

The Wikitude Tracking Engine

Cloud recognition was not an ideal solution for us, at least not as a first step. We wanted to offer real-time tracking, so we built an engine capable of constantly scanning your surroundings. Cloud recognition has a network delay and requires end users to take pictures of their surroundings. When we scanned the market for other offline tracking solutions to see what was already out there, we found that existing solutions were always pushing for the best tracking quality with a very limited set of target images. But in most real-world scenarios, a limited set of target images is not enough. Think about a magazine with more than a hundred pages – you will need one target image per page you want to augment, and the application will need to constantly search for all pages. We dove into the challenge of developing a solution for the offline tracking of several hundred target images without compromising tracking robustness and speed – a solution the market could not offer to date. This challenge involved managing computational complexity, power consumption on mobile devices and the memory required to store tracking information about these target images. The team put a tremendous amount of effort into implementing and optimizing computer vision algorithms for mobile devices that can deal with many target images. Red Bull became our good friend, as many long nights were spent thinking about and implementing the best data structure for storing this information on a mobile device. In the end, we found the optimal tradeoff between the computational complexity of handling hundreds of target images in real time and the speed and robustness of the tracking. It was important for us to have better-than-market tracking quality in real-world scenarios (including bad lighting conditions and steep angles on the target image) even with up to a thousand images … and we succeeded.
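One hedged illustration of why the data structure matters at this scale: instead of comparing every camera descriptor against every descriptor of every target image, descriptors can be bucketed by a short prefix so a query only inspects the few entries that share its bucket. This is a deliberately simplified approximation of such an index, not Wikitude's actual on-device data structure.

```python
# Sketch: prefix-bucketed index so hundreds of targets stay searchable
# in real time on a mobile device.
from collections import defaultdict

PREFIX_BITS = 4  # bucket key: top 4 bits of each 8-bit toy descriptor

def bucket_key(descriptor: int) -> int:
    return descriptor >> (8 - PREFIX_BITS)

def build_index(targets):
    """Offline: map prefix buckets to (target_id, descriptor) entries."""
    index = defaultdict(list)
    for target_id, descriptors in targets.items():
        for d in descriptors:
            index[bucket_key(d)].append((target_id, d))
    return index

def candidates(index, camera_descriptor):
    """On-device query: only the colliding bucket is inspected."""
    return index.get(bucket_key(camera_descriptor), [])

# 200 hypothetical magazine pages, two toy descriptors each.
targets = {f"page_{i:03d}": [i % 256, (i * 7) % 256] for i in range(200)}
index = build_index(targets)
hits = candidates(index, 0b00010010)  # only entries sharing prefix 0b0001
print(len(hits), "candidates instead of", sum(len(v) for v in targets.values()))
```

Real engines use far more sophisticated structures (and full 256-bit descriptors), but the principle is the same: pay memory and preprocessing time once so each camera frame touches only a small fraction of the database.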

Management of Your Content

When we launched our own in-house image recognition solution a couple of months ago, it was already clear to us that since the Wikitude SDK is targeted towards developers, we needed to provide tools for non-developers. This led to the creation of Wikitude Studio, our fully web-based content management system for augmented reality content. After we had analyzed what we wanted Wikitude Studio to allow non-developers to achieve, we had an important technical decision to make: would Wikitude Studio be an application users can download and install, or would it go the “web way” and run in a browser? After several meetings and workshops, it was obvious that a web solution was the way for us to proceed. First and foremost, users don’t have to download and install any applications, they simply have to log on to a website. Easy! That’s what we wanted. Second, a web solution is automatically cross-platform (I’ll cover browser incompatibilities below), and users work in the well-known environment of their favorite web browser. Easy! And third, our entire platform stack is already oriented towards web technologies, so integration with a web-based CMS is straightforward. Of course, there is no upside without a downside. I have already mentioned browser incompatibilities, and anyone familiar with HTML5 and CSS3 support (or non-support) across various browsers, on various platforms, knows that browsers behaving differently can sometimes be a pain. Even with the near-application-level APIs web browsers expose with HTML5, the APIs offered by native applications are still broader. But for us, the upsides outweighed the downsides by far, and thus, Wikitude Studio was born. The portal where Studio is hosted runs on Liferay, a powerful web solution where lots of our backend processes happen. The front end of Studio is written entirely in pure HTML, JavaScript and CSS, without the need for any browser plugins, applets or Flash players.
Since Studio is an interactive, front-end-heavy application, we looked at various JavaScript front-end libraries and decided to go with AngularJS, a JavaScript MVC library developed at Google. AngularJS, like other MVC libraries, relies heavily on the separation of the data (Model), the presentation (View) and the business logic (Controller) of the application. If you apply its principles, maintaining your application is a piece of cake. Integration was easy, and our developers absolutely fell in love with the power and flexibility that AngularJS provides. Using state-of-the-art web technologies, including HTML5, CSS3, AngularJS and several others, we’re ready for the future when it comes to managing your AR content. With our own in-house image recognition solution, we are able to twist and turn our code to our needs and – more importantly – to the needs of our customers and developers. We created the code, can wrangle every bit of it, and are ready for the future!

Questions, comments? Let us know!

If you have questions, suggestions or feedback on this, please don’t hesitate to reach out to us via Facebook or Twitter, or send us an email at hello@wikitude.com!

