Documentation

Object and Scene Recognition

Introduction to Object and Scene Recognition

Object Recognition and Tracking extend the capabilities of the Wikitude SDK to recognize and track arbitrary objects for augmented reality experiences. The feature is based on Wikitude's SLAM engine, which is used throughout the SDK for every kind of environment tracking. Object Recognition and Tracking let you detect objects and entire scenes that you pre-defined. Suitable objects include

  • Toys
  • Monuments and statues
  • Industrial objects
  • Tools
  • Household supplies

Recognition works best for objects that have only a limited number of changing/dynamic parts.

Scene Recognition

With SDK 8 the object recognition engine can also be used to recognize larger structures that go beyond table-sized objects; the name Scene Recognition reflects this. The new image-based conversion method allows for Object Targets that are considerably larger in size and can still be successfully recognized and tracked. Suitable scenes include

  • Rooms
  • Facades of buildings
  • Squares and courtyards

Important: Make sure to read the chapter on how to create Object Targets before using Object Recognition on your own.

This example shows how to track an object, add an occluder to it and make it interactive by adding buttons and animations.

Basic Object Tracking

The basic object tracking sample should just give you a rough idea of how object tracking works with the Wikitude SDK. We will track a toy fire truck using a .wto-file which contains the tracking information.

The first thing we want to do in our newly created ARchitect world is to load the tracking information and add it to a new object tracker.

If you have already tried the Wikitude SDK, you will know that .wtc-files (Wikitude Target Collection) are used to load tracking information for image targets. For object tracking, however, we use .wto-files (Wikitude Object Target Collection), which are loaded in just the same way. After this.targetCollectionResource is created, use it to initialize a new AR.ObjectTracker.

this.targetCollectionResource = new AR.TargetCollectionResource("assets/firetruck.wto", {
});

this.tracker = new AR.ObjectTracker(this.targetCollectionResource, {
    onError: function(errorMessage) {
        alert(errorMessage);
    }
});

In the next step, we will add some 3D-models to be displayed once we track the fire truck. All drawable elements have to be added to the AR.ObjectTrackable later, so we create an array (World.drawables) that will contain them.

In the assets folder, there is the .wt3-file of a traffic cone. We will create four instances of the 3D-model and position them around the fire truck. The function getCone returns a model of a traffic cone at the desired position with the correct scale and rotation.

getCone: function getConeFn(positionX, positionY, positionZ) {
    var coneScale = 0.05;

    return new AR.Model("assets/traffic_cone.wt3", {
        scale: {
            x: coneScale,
            y: coneScale,
            z: coneScale
        },
        translate: {
            x: positionX,
            y: positionY,
            z: positionZ
        },
        rotate: {
            x: -90
        }
    });
},

We call getCone once for every one of the four positions and add the retrieved cones to the World.drawables array so they can be displayed when the object is tracked.

var coneDistance = 1.0;

var frontLeftCone = World.getCone(-coneDistance, 0.0, World.occluderCenterZ + coneDistance);
World.drawables.push(frontLeftCone);

var backLeftCone = World.getCone( coneDistance, 0.0, World.occluderCenterZ + coneDistance);
World.drawables.push(backLeftCone);

var backRightCone = World.getCone( coneDistance, 0.0, World.occluderCenterZ - coneDistance);
World.drawables.push(backRightCone);

var frontRightCone = World.getCone(-coneDistance, 0.0, World.occluderCenterZ - coneDistance);
World.drawables.push(frontRightCone);

Now we finally create an AR.ObjectTrackable and initialize it with the World.drawables array. Furthermore, we implement two of the AR.ObjectTrackable's callback functions so we can react if the object is recognized or lost.

this.objectTrackable = new AR.ObjectTrackable(this.tracker, "*", {
    drawables: {
        cam: World.drawables
    },
    onObjectRecognized: this.objectRecognized,
    onObjectLost: this.objectLost,
    onError: function(errorMessage) {
        alert(errorMessage);
    }
});

If you run your application now and look at the fire truck, the four traffic cones should appear around it. You will notice, however, that the cones are always visible, even when the fire truck should be covering them. That is where the occluder comes in.

An occluder is a 3D-Model that is not rendered but rather keeps the elements behind it from being rendered. We have created an occluder 3D-Model in the shape of our fire truck, which you can find in the assets folder. The way you add it to the ARchitect world is simple.

At some point before you create your AR.ObjectTracker, use the firetruck_occluder.wt3 to create an AR.Occluder with the correct scaling and rotation. Then simply add it to the World.drawables array.

var occluderScale = 0.0057;

this.firetruckOccluder = new AR.Occluder("assets/firetruck_occluder.wt3", {
    onLoaded: this.loadingStep,
    scale: {
        x: occluderScale,
        y: occluderScale,
        z: occluderScale
    },
    translate: {
        x: -0.25,
        z: -0.3
    },
    rotate: {
        x: 180
    }
});
World.drawables.push(this.firetruckOccluder);

Now when you run the program and recognize the fire truck, the traffic cones will disappear correctly behind it.

Basic Object Tracking

Image and Sound Augmentations

In this sample, we will add a button, emergency lights and a siren to the fire truck from the first Object Tracking Sample.

First, we add the lights to the sample. The function getLight returns an AR.ImageDrawable showing a blue emergency light at the correct position.

getLight: function getLightFn(positionX, positionY, positionZ) {
    var lightScale = 0.3;
    var lightResource = new AR.ImageResource("assets/emergency_light.png");

    return new AR.ImageDrawable(lightResource, lightScale, {
        translate: {
            x: positionX,
            y: positionY,
            z: positionZ
        },
        rotate: {
            x: 90
        },
        enabled: false
    });
},

We call this method twice to create two emergency lights on top of the driver's cabin.

var leftLight = World.getLight(-0.6, 0.9, World.occluderCenterZ + 0.2);
World.drawables.push(leftLight);

var rightLight = World.getLight(-0.6, 0.9, World.occluderCenterZ - 0.2);
World.drawables.push(rightLight);

After the lights have been added to the ARchitect world we will animate them. We want to have both lights flashing, so we will animate the opacity of the lights, making them fade in and out.

The function addLightAnimation takes one light as a parameter. It creates two animations for the light, one for fading in and one for fading out. For this purpose, we create an AR.PropertyAnimation and hand it the following five parameters:

  • the object to be animated
  • the parameter name to be animated
  • the start value of the animation
  • the end value of the animation
  • the overall duration of the animation

As an additional parameter, we set type to AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE, which makes for a smooth transition between two values.

We save both animations in variables and group them together in an AR.AnimationGroup with the setting AR.CONST.ANIMATION_GROUP_TYPE.SEQUENTIAL, meaning that the animations in the group will be played after one another (as opposed to PARALLEL). Finally, we start the AR.AnimationGroup with the parameter -1, creating an endless loop.

addLightAnimation: function addLightAnimationFn(light) {
    var animationDuration = 500;
    var lowerOpacity = 0.5;
    var upperOpacity = 1.0;

    var lightAnimationForward = new AR.PropertyAnimation(light, "opacity", lowerOpacity, upperOpacity, animationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });

    var lightAnimationBack = new AR.PropertyAnimation(light, "opacity", upperOpacity, lowerOpacity, animationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });

    var lightAnimation = new AR.AnimationGroup(AR.CONST.ANIMATION_GROUP_TYPE.SEQUENTIAL, [lightAnimationForward, lightAnimationBack]);
    lightAnimation.start(-1);
},
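For intuition, EASE_IN_OUT_SINE maps linear animation progress to a smoothed value that starts and ends slowly. The following plain-JavaScript sketch illustrates the interpolation between the two opacity values; it is an illustration of the curve, not the SDK's implementation:

```javascript
// Sinusoidal ease-in-out: linear progress t in [0, 1] is mapped to a smoothed
// progress in [0, 1]. Illustration only, not SDK code.
function easeInOutSine(t) {
    return -(Math.cos(Math.PI * t) - 1) / 2;
}

// Interpolate an animated property (here: opacity) at progress t.
function interpolate(from, to, t) {
    return from + (to - from) * easeInOutSine(t);
}
```

At t = 0 this yields the start value (0.5 above), at t = 1 the end value (1.0), with the steepest change in the middle of the animation.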

We call the function above once for each of the two lights to create the animations. Since we disabled the lights when we created them (enabled: false), they will not be visible at this point, but we will get to that in a moment.

Now we will add the siren, for which we use an AR.Sound with the siren.wav sound file that you will find in the assets folder. Initialize your AR.Sound as shown below and load it up front, so it does not have to load on the first call to play().

this.sirenSound = new AR.Sound("assets/siren.wav", {
    onError: function(errorMessage) {
        alert(errorMessage);
    }
});
this.sirenSound.load();

All the parts are ready now, but none of them are displayed or played. We will add an animated button to trigger the animation and the sound. The button is an AR.Model whose onClick callback calls World.setLightsEnabled(true), which enables the lights, disables the button and starts the siren.

We initialize it with the marker.wt3 model in the assets folder and give it a position above the driver's cabin. In order to add the morphing animation we call the function addButtonAnimation.

this.lightsButton = new AR.Model("assets/marker.wt3", {
    translate: {
        x: -0.6,
        y: 0.9,
        z: World.occluderCenterZ
    },
    rotate: {
        x: -90
    },
    onClick: function() {
        World.setLightsEnabled(true);
    }
});
World.addButtonAnimation(this.lightsButton);
World.drawables.push(this.lightsButton);
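The setLightsEnabled helper itself is not part of the snippets shown here. A minimal sketch of what it might look like, using stand-in objects so the logic is visible outside the SDK (the member names are assumptions based on the objects created above):

```javascript
// Stand-ins for the AR objects created above, so this sketch is self-contained.
var World = {
    leftLight: { enabled: false },
    rightLight: { enabled: false },
    lightsButton: { enabled: true },
    sirenSound: { play: function() { /* starts playback in the real sample */ } }
};

// Enable both emergency lights, hide the trigger button and start the siren.
World.setLightsEnabled = function setLightsEnabledFn(enabled) {
    World.leftLight.enabled = enabled;
    World.rightLight.enabled = enabled;
    World.lightsButton.enabled = !enabled;
    if (enabled) {
        World.sirenSound.play();
    }
};
```

In the actual sample these members are the AR.ImageDrawable, AR.Model and AR.Sound instances created in the snippets above.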

In addButtonAnimation we use AR.PropertyAnimation to morph our button to bigger and smaller sizes. We set up two animations for every dimension, one to make the button smaller and one to make it bigger again. Both animations take half of the total animation duration and both use the easing curve type EASE_IN_OUT_SINE for a smooth transition. After all six animations have been created and bundled together in separate AR.AnimationGroup instances, we just start all the animation groups at once and let them run indefinitely (-1).

addButtonAnimation: function addButtonAnimationFn(button) {
    var smallerScale = 0.03;
    var biggerScale = 0.04;
    var scaleAnimationDuration = 2000;

    // x
    var buttonScaleAnimationXOut = new AR.PropertyAnimation(button, "scale.x", smallerScale, biggerScale, scaleAnimationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });
    var buttonScaleAnimationXIn = new AR.PropertyAnimation(button, "scale.x", biggerScale, smallerScale, scaleAnimationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });
    var buttonScaleAnimationX = new AR.AnimationGroup(AR.CONST.ANIMATION_GROUP_TYPE.SEQUENTIAL, [buttonScaleAnimationXOut, buttonScaleAnimationXIn]);

    // y
    var buttonScaleAnimationYOut = new AR.PropertyAnimation(button, "scale.y", smallerScale, biggerScale, scaleAnimationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });
    var buttonScaleAnimationYIn = new AR.PropertyAnimation(button, "scale.y", biggerScale, smallerScale, scaleAnimationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });
    var buttonScaleAnimationY = new AR.AnimationGroup(AR.CONST.ANIMATION_GROUP_TYPE.SEQUENTIAL, [buttonScaleAnimationYOut, buttonScaleAnimationYIn]);

    // z
    var buttonScaleAnimationZOut = new AR.PropertyAnimation(button, "scale.z", smallerScale, biggerScale, scaleAnimationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });
    var buttonScaleAnimationZIn = new AR.PropertyAnimation(button, "scale.z", biggerScale, smallerScale, scaleAnimationDuration/2, {
        type: AR.CONST.EASING_CURVE_TYPE.EASE_IN_OUT_SINE
    });
    var buttonScaleAnimationZ = new AR.AnimationGroup(AR.CONST.ANIMATION_GROUP_TYPE.SEQUENTIAL, [buttonScaleAnimationZOut, buttonScaleAnimationZIn]);

    // start all animation groups
    buttonScaleAnimationX.start(-1);
    buttonScaleAnimationY.start(-1);
    buttonScaleAnimationZ.start(-1);
},

2D Augmentations

Animated 3D Augmentations

In this sample, we will add an animation of a screwdriver disassembling a wheel of the fire truck. It is based on the Image and Sound Augmentations sample.

First, we create the screwdriver and the screw that will be moving in our disassemble animation. We use the same x- and y-coordinates for both parts, because they will only be moving in the z-direction. Also, we make the size of the screw relative to that of the screwdriver, so the overall size can be changed in one place. We add both parts to the World.drawables array.

var screwdriverScale = 0.04;
var screwdriverPositionX = -0.52;
var screwdriverPositionY = 0.24;

this.screwdriver = new AR.Model("assets/screwdriver.wt3", {
    scale: {
        x: screwdriverScale,
        y: screwdriverScale,
        z: screwdriverScale
    },
    translate: {
        x: screwdriverPositionX,
        y: screwdriverPositionY
    },
    rotate: {
        y: 180
    },
    enabled: false
});
World.drawables.push(this.screwdriver);

var screwScale = screwdriverScale * 0.6;
this.screw = new AR.Model("assets/screw.wt3", {
    scale: {
        x: screwScale,
        y: screwScale,
        z: screwScale
    },
    translate: {
        x: screwdriverPositionX,
        y: screwdriverPositionY
    },
    enabled: false
});
World.drawables.push(this.screw);

Another model that we need is a turning arrow sign to better indicate the turning motion of the screw and the screwdriver during the animation.

var turningArrowScale = screwdriverScale * 0.2;
this.turningArrow = new AR.Model("assets/arrow.wt3", {
    scale: {
        x: turningArrowScale,
        y: turningArrowScale,
        z: turningArrowScale
    },
    translate: {
        x: screwdriverPositionX,
        y: screwdriverPositionY,
        z: World.occluderCenterZ + 0.7
    },
    rotate: {
        y: -90
    },
    enabled: false
});
World.drawables.push(this.turningArrow);

Just like in the last sample we initially disable all these elements and add a button to run the animation.

this.tireButton = new AR.Model("assets/marker.wt3", {
    translate: {
        x: -0.55,
        y: 0.25,
        z: World.occluderCenterZ + 0.4
    },
    onClick: function() {
        World.runScrewdriverAnimation();
    }
});
World.addButtonAnimation(this.tireButton);
World.drawables.push(this.tireButton);

All we have to do now is implement the function runScrewdriverAnimation(). First, we enable all the needed elements, then we specify an overall animationDuration and a translateDistance, which is the distance by which the screw and the screwdriver move during the animation. Then we create the translate animations for the screwdriver and the screw. We implement the onFinish callback of the first animation so that it disables all elements once it is done. We also add an animation for the turning arrow that makes one complete rotation, hinting at the direction the screwdriver should be turned in. After all animations have been created, we bundle them up in an AR.AnimationGroup of type AR.CONST.ANIMATION_GROUP_TYPE.PARALLEL and start them all at once.

runScrewdriverAnimation: function runScrewdriverAnimationFn() {
    World.setScrewdriverEnabled(true);

    var animationDuration = 2000;

    var translateDistance = 0.2;
    var screwdriverZOffset = World.occluderCenterZ + 1.0;

    var screwdriverTranslateAnimation = new AR.PropertyAnimation(World.screwdriver, "translate.z", screwdriverZOffset, screwdriverZOffset + translateDistance, animationDuration, {}, {
        onFinish: function() {
            World.setScrewdriverEnabled(false);
        }
    });

    var screwZOffset = screwdriverZOffset - 0.65;
    var screwTranslateAnimation = new AR.PropertyAnimation(World.screw, "translate.z", screwZOffset, screwZOffset + translateDistance, animationDuration);

    var arrowRotationAnimation = new AR.PropertyAnimation(World.turningArrow, "rotate.z", 0, 360, animationDuration);

    var animationGroup = new AR.AnimationGroup(AR.CONST.ANIMATION_GROUP_TYPE.PARALLEL, [screwdriverTranslateAnimation, screwTranslateAnimation, arrowRotationAnimation]);
    animationGroup.start();
},
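The setScrewdriverEnabled helper used above is not shown in this excerpt; it simply toggles the visibility of the involved elements. A sketch with stand-in objects so it runs outside the SDK (the member names are assumptions based on the snippets above):

```javascript
// Stand-ins for the models created above, so the sketch is self-contained.
var World = {
    screwdriver: { enabled: false },
    screw: { enabled: false },
    turningArrow: { enabled: false },
    tireButton: { enabled: true }
};

// Show or hide the animation models; the trigger button gets the inverse state.
World.setScrewdriverEnabled = function setScrewdriverEnabledFn(enabled) {
    World.screwdriver.enabled = enabled;
    World.screw.enabled = enabled;
    World.turningArrow.enabled = enabled;
    World.tireButton.enabled = !enabled;
};
```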

3D Augmentations

Extended Object Tracking

Warning: This feature is marked as deprecated since 9.12.0 and will be removed in future releases.

Extended tracking is an optional mode you can set for each target separately. In this mode the Wikitude SDK will try to continue to scan the environment of the user even if the original target object is not in view anymore. So the tracking extends beyond the limits of the original target object. The performance of this feature depends on various factors like computing power of the device, background texture and objects.

To enable extended object tracking, we simply change the creation of the AR.ObjectTrackable to include the enableExtendedTracking, extendedTarget and onExtendedTrackingQualityChanged parameters. We set enableExtendedTracking to true (the default is false) to allow extended object tracking to take place. This may only be done during object creation; setting the property afterwards will result in an error. We set extendedTarget to * so that extended tracking applies to all object targets; other valid values are ? and the identifier of a specific target of a WTO file. This property may be changed after object creation; tracking needs to be lost intermittently, however, for the change to take effect, which can be achieved by calling the stopExtendedTracking function. Lastly, we set onExtendedTrackingQualityChanged to a custom function, discussed in detail shortly, that is called whenever the quality of the extended tracking changes.

this.objectTrackable = new AR.ObjectTrackable(this.tracker, "*", {
    [...]
    enableExtendedTracking: true,
    extendedTarget: "*",
    onExtendedTrackingQualityChanged: World.extendedTrackingQualityChanged,
});

With these changes in place, tracking remains active for an object even after it has left the camera's field of view, by tracking the surrounding environment instead. For this to work reliably, the environment to be tracked must not be devoid of features. The current quality of the environment tracking is communicated through the onExtendedTrackingQualityChanged event, to which we have attached the following custom function. It simply changes the background color of an HTML element according to one of the three distinct qualities we report. For better tracking qualities, the algorithm is more likely to maintain active tracking and tolerate quicker movements; for bad tracking quality, the opposite is true. Moving the device more slowly and over areas richer in features is likely to favourably affect the tracking quality.

extendedTrackingQualityChanged: function(targetName, oldTrackingQuality, newTrackingQuality) {
    console.log('extendedTrackingQualityChanged ' + oldTrackingQuality + ' ' + newTrackingQuality);
    var newBackgroundClass;

    switch (newTrackingQuality) {
    case -1:
        newBackgroundClass = 'trackingBad';
        break;
    case 0:
        newBackgroundClass = 'trackingMedium';
        break;
    default:
        newBackgroundClass = 'trackingGood';
        break;
    }

    World.removeTrackingIndicator();

    var trackingIndicatorDiv = document.getElementById('trackingIndicator');
    World.trackingIndicatorBackgroundClass = newBackgroundClass;

    trackingIndicatorDiv.classList.add(World.trackingIndicatorBackgroundClass);
},
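World.removeTrackingIndicator, called above, is not shown in this excerpt; it removes the previously applied background class so that classes do not accumulate on the indicator element. A sketch whose names mirror the snippet above, though the implementation itself is an assumption:

```javascript
var World = {
    trackingIndicatorBackgroundClass: undefined,

    // Remove the background class applied by the previous quality change, if any.
    removeTrackingIndicator: function removeTrackingIndicatorFn() {
        var trackingIndicatorDiv = document.getElementById('trackingIndicator');
        if (World.trackingIndicatorBackgroundClass !== undefined) {
            trackingIndicatorDiv.classList.remove(World.trackingIndicatorBackgroundClass);
            World.trackingIndicatorBackgroundClass = undefined;
        }
    }
};
```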

We recommend having such a quality reporting procedure to guide users on where and how to point their device as tracking success is affected by it.