Input Plugins API

This guide provides an introduction to the input plugins API of the Wikitude Native SDK and aims to familiarize the reader with its concepts and constraints. Due to the length and complexity of the corresponding example application source code, it will not be presented in its entirety; relevant and descriptive source code examples will, however, be provided. Since the Input Plugins API is an extension to the Plugins API, we recommend familiarity with it prior to reading this guide.

  1. About Wikitude SDK Input Plugins
  2. Input Plugin Base Class
  3. Simple Input Plugin
  4. Custom Camera

About Wikitude SDK Input Plugins

The input plugins API provides a means to alter the inputs and outputs of the Wikitude Native SDK. For the input case specifically, custom frame data from arbitrary sources can be supplied to the Wikitude SDK Native API for processing. Complementarily, for the output case, the default rendering of the Wikitude SDK Native API can be substituted with more advanced implementations.

Input plugins can be registered before the SDK is started or while it is already running. If an input plugin is registered before the Wikitude SDK is started, the internal Wikitude camera implementation is not started at all and the SDK uses the input plugin from the very beginning. If an input plugin is registered at runtime, the internal Wikitude SDK camera is stopped first, followed by a transition to the newly registered input plugin.

Input Plugin Base Class

class InputPlugin: public Plugin {
public:
    using InputFrameAvailableNotifier = std::function<int(long frameId, std::shared_ptr<unsigned char> frameData)>;

public:
    InputPlugin(std::string identifier_);
    virtual ~InputPlugin();

    virtual bool requestsInputFrameRendering();
    virtual bool requestsInputFrameProcessing();

    void notifyNewInputFrame(long frameId_, std::shared_ptr<unsigned char> inputFrame_, bool managedFromOutside_ = false);

    InputFrameRenderSettings& getRenderSettings();
    InputFrameSettings& getFrameSettings();
    virtual void prepareRenderingOfInputFrame(long frameId_);

    virtual std::shared_ptr<unsigned char> getPresentableInputFrameData();

    virtual void internalError(const std::string& errorMessage);
    void setInputFrameAvailableNotifier(InputFrameAvailableNotifier newInputFrameAvailableNotifier);

private:
    InputFrameAvailableNotifier                     _newInputFrameAvailableNotifier;

    InputFrameRenderSettings                        _renderSettings;
    InputFrameSettings                              _frameSettings;
    std::unique_ptr<InputFrameBufferController>     _inputFrameBufferController;
};

The keen observer will be quick to notice that the InputPlugin class is derived from the Plugin class. This allows an InputPlugin to be handled in the same manner as a regular plugin; therefore Plugin instantiation and registration are identical and will not be discussed redundantly in this guide. Please refer to the Plugins API guide for a detailed explanation. Instead, we will focus on and demonstrate the use of the newly introduced functions.

Color Space

getFrameSettings().setInputFrameColorSpace(wikitude::sdk::FrameColorSpace::YUV_420_NV21);

The Wikitude Native SDK currently accepts RGB frame data corresponding to the FrameColorSpace::RGB value, as well as YUV data in 4:2:0 NV21 format corresponding to the FrameColorSpace::YUV_420_NV21 value. The former implies a frame data size of frameWidth * frameHeight * 3 bytes, while the latter implies a frame data size of frameWidth * frameHeight * 3 / 2 bytes.
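
A quick way to sanity check the buffer you are about to supply is to compute these sizes up front; a minimal sketch (the 640x480 dimensions are arbitrary example values):

#include <cstddef>

// Example dimensions only; use the actual values of your frame source.
const std::size_t frameWidth  = 640;
const std::size_t frameHeight = 480;

const std::size_t rgbFrameDataSize  = frameWidth * frameHeight * 3;     // FrameColorSpace::RGB
const std::size_t nv21FrameDataSize = frameWidth * frameHeight * 3 / 2; // FrameColorSpace::YUV_420_NV21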

Field of View

getFrameSettings().setFrameFieldOfView(frameFieldOfView);

The setFrameFieldOfView function parameter needs to be a float value representing the horizontal field of view angle, in degrees, of the camera the provided frame was captured with. This value is required for the Wikitude computer vision engine to accurately recognise and track targets within the provided frame. Note that the field of view may differ significantly from device to device; we therefore recommend querying this value from the frame source directly to ensure representative values. For input image files and input video files this value should be discernible from the corresponding meta data; for an input camera stream it should be accessible through the corresponding camera API.
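
Should the camera API only expose the focal length and sensor width, the horizontal field of view can be derived from them. A minimal sketch, assuming illustrative focal length and sensor width values:

#include <cmath>

// Illustrative values only; query them from your camera API or the file meta data.
const float focalLength = 4.2f; // focal length in millimeters, hypothetical
const float sensorWidth = 5.6f; // sensor width in millimeters, hypothetical
const float pi = 3.14159265f;

// horizontal field of view in degrees
const float frameFieldOfView = 2.0f * std::atan(sensorWidth / (2.0f * focalLength)) * 180.0f / pi;

getFrameSettings().setFrameFieldOfView(frameFieldOfView);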

Frame Size

getFrameSettings().setInputFrameSize(inputFrameSize);

The setInputFrameSize function parameter needs to be of type wikitude::sdk::Size<int> containing the input image width and input image height in pixels. Since this value will be constant for many use cases, you may consider hard-coding it to the appropriate values. Alternatively, as with the previous function, we recommend querying the values from either the input file or the input camera device.
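
For a source with a fixed resolution, hard-coding could look like this (the 1280x720 value is merely an example):

// Hypothetical fixed 720p input source.
getFrameSettings().setInputFrameSize({1280, 720});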

Default Frame Rendering

bool YUVFrameInputPlugin::requestsInputFrameRendering() {
    return false;
}

The requestsInputFrameRendering function can be overridden to provide a boolean value indicating whether the input frame data should be rendered by the Wikitude SDK Native API or not. The default implementation returns true, meaning that the frame will be presented using the internal rendering of the Wikitude SDK Native API. Should this function be overridden to return false, the responsibility to present the frame becomes that of the InputPlugin.

Default Frame Processing

bool YUVFrameInputPlugin::requestsInputFrameProcessing() {
    return true;
}

The requestsInputFrameProcessing function can be overridden to provide a boolean value indicating whether the input frame data should be processed by the Wikitude computer vision engine. The default implementation returns true, meaning that it will be processed. The plugin will be notified of the recognition results through the update function, as is the case for the regular Plugins API. Should this function be overridden to return false, the responsibility to perform the desired algorithms becomes that of the InputPlugin.

Supplying Image Data

void notifyNewInputFrame(long frameId_, std::shared_ptr<unsigned char> inputFrame_, bool managedFromOutside_ = false);

The notifyNewInputFrame function needs to be called to pass the actual input frame data to the Wikitude SDK Native API. It requires a unique frame identifier of long type to be supplied, as well as the frame data itself wrapped into a std::shared_ptr<unsigned char>. It additionally accepts a boolean value indicating whether the default frame caching should be employed or not. The parameter value defaults to false, meaning the Wikitude SDK Native API default caching will be used. Should you prefer to supply your own frame caching mechanism, set this value to true. The default caching mechanism keeps up to 5 recent frames in memory to ensure smooth processing performance. Note that you may need to invoke this method from native code due to the file resource or input stream device only being accessible therein. For this use case we recommend having a look at the custom camera sample application code of the Wikitude Native SDK example application.
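
A minimal sketch of forwarding a raw NV21 buffer, assuming the bytes arrive as a plain pointer and are copied into a std::shared_ptr with an array deleter (the MyInputPlugin class, the onNewCameraFrame callback and the _frameId member are illustrative assumptions):

#include <cstddef>
#include <cstring>
#include <memory>

void MyInputPlugin::onNewCameraFrame(const unsigned char* frameBytes_, std::size_t frameDataSize_) {
    // Copy the incoming bytes into a buffer the SDK may hold on to beyond this call;
    // the array deleter releases it once the last reference is gone.
    std::shared_ptr<unsigned char> frameData(new unsigned char[frameDataSize_],
                                             std::default_delete<unsigned char[]>());
    std::memcpy(frameData.get(), frameBytes_, frameDataSize_);

    // The default frame caching is used since managedFromOutside_ defaults to false.
    notifyNewInputFrame(++_frameId, frameData);
}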

Rendering Configuration

InputFrameRenderSettings& getRenderSettings();

The getRenderSettings function behaviour can be altered to provide a parameterised instance of type InputFrameRenderSettings. The default implementation returns the default constructed _renderSettings member. Should you wish to provide render settings to the internal Wikitude Native SDK that differ from the default constructed values, alter the _renderSettings member accordingly before it is returned.

Frame Caching

virtual void prepareRenderingOfInputFrame(long frameId_);

The prepareRenderingOfInputFrame function is called whenever a frame has been processed to report its identifier. It is, however, only called if requestsInputFrameRendering has been overridden to return false and requestsInputFrameProcessing has been overridden to return true. The default implementation of this function releases the frame indicated by the received identifier as well as any older frames from the frame cache. In case of a non-default frame caching mechanism, override this method accordingly. An input parameter of -1 identifies the most recent frame.
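
If you bypass the default cache by passing managedFromOutside_ as true, an override could release frames from your own cache roughly as follows (a sketch; the _frameCache map, its mutex and the _presentableFrame member are assumptions, and frame identifiers are assumed to be increasing):

#include <iterator>
#include <map>
#include <memory>
#include <mutex>

void MyInputPlugin::prepareRenderingOfInputFrame(long frameId_) {
    std::lock_guard<std::mutex> lock(_frameCacheMutex);
    if (_frameCache.empty()) {
        return;
    }

    // -1 identifies the most recent frame
    if (frameId_ == -1) {
        frameId_ = _frameCache.rbegin()->first;
    }

    auto it = _frameCache.find(frameId_);
    if (it != _frameCache.end()) {
        _presentableFrame = it->second; // keep the reported frame around for rendering
        _frameCache.erase(_frameCache.begin(), std::next(it)); // release it and all older frames from the cache
    }
}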

Receiving Processed Frame Data

virtual std::shared_ptr<unsigned char> getPresentableInputFrameData();

The getPresentableInputFrameData function can be called to receive the frame data of the most recently processed frame from the default frame cache. Use this method in case requestsInputFrameRendering has been overridden to return false in order to retrieve the current frame data to be rendered. When using a custom frame caching mechanism, this function is not needed.
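
In a custom render path this might be used roughly as follows (a sketch; the renderCurrentFrame method name is illustrative and the actual texture upload depends on the configured color space):

void MyInputPlugin::renderCurrentFrame() {
    std::shared_ptr<unsigned char> frameData = getPresentableInputFrameData();
    if (!frameData) {
        return; // no processed frame available yet
    }

    // upload frameData.get() into one or more OpenGL textures and draw a fullscreen quad
}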

Error handling

virtual void internalError(const std::string& errorMessage);

The internalError function gets called whenever an internal error occurs in the Wikitude Native SDK that is not directly related to input plugins. The input parameter provides a description of the error that occurred.

Internal Use Only

void setInputFrameAvailableNotifier(InputFrameAvailableNotifier newInputFrameAvailableNotifier);

The setInputFrameAvailableNotifier function is called internally and should not be called anywhere else.

Simple Input Plugin

This sample shows the implementation of a custom camera whose frames are rendered by the Wikitude SDK.

The Simple Input Plugin example consists of the C++ class SimpleInputPlugin and the Java classes SimpleInputPluginActivity, WikitudeCamera and WikitudeCamera2. Both WikitudeCamera classes contain a basic implementation of the Android camera API. The class SimpleInputPluginActivity registers the plugin with the SDK and handles the view. The class SimpleInputPlugin derives from InputPlugin.

In onPostCreate of the SimpleInputPluginActivity, the C++ plugin (SimpleInputPlugin) is registered with the SDK using architectView.registerNativePlugins.

With initNative we pass the SimpleInputPluginActivity to the SimpleInputPlugin, which is then used to call SimpleInputPluginActivity methods from C++.

The frame size is set in pixels using setFrameSize. It has to be set before the plugin is initialized (Plugin::initialize).

public void onPostCreate(final Bundle savedInstanceState) {

    // other onPostCreate code

    // register Plugin in the wikitude SDK and in the jniRegistration.cpp
    this.architectView.registerNativePlugins("wikitudePlugins", "simple_input_plugin", new PluginManager.PluginErrorCallback() {
        @Override
        public void onRegisterError(int errorCode, String errorMessage) {
            Log.v(TAG, "Plugin failed to load. Reason: " + errorMessage);
        }
    });

    // sets this activity in the plugin
    initNative();

    setFrameSize(FRAME_WIDTH, FRAME_HEIGHT);
}

A global reference to the SimpleInputPluginActivity is stored so that it can be accessed from the SimpleInputPlugin.

extern "C" JNIEXPORT void JNICALL
Java_com_wikitude_samples_SimpleInputPluginActivity_initNative(JNIEnv* env, jobject obj) {
    env->GetJavaVM(&pluginJavaVM);
    simpleInputPluginActivity = env->NewGlobalRef(obj);
}

During the first update call, the method IDs of the SimpleInputPluginActivity counterparts of the plugin lifecycle methods are looked up and stored. After that, the C++ -> Java connection is fully established.

void SimpleInputPlugin::update(const std::list<wikitude::sdk::RecognizedTarget>& recognizedTargets_) {
    if ( !_jniInitialized ) {
        JavaVMResource vm(pluginJavaVM);
        jclass simpleInputPluginActivityClass = vm.env->GetObjectClass(simpleInputPluginActivity);
        _pluginInitializedMethodId = vm.env->GetMethodID(simpleInputPluginActivityClass, "onInputPluginInitialized", "()V");
        _pluginPausedMethodId = vm.env->GetMethodID(simpleInputPluginActivityClass, "onInputPluginPaused", "()V");
        _pluginResumedMethodId = vm.env->GetMethodID(simpleInputPluginActivityClass, "onInputPluginResumed", "()V");
        _pluginDestroyedMethodId = vm.env->GetMethodID(simpleInputPluginActivityClass, "onInputPluginDestroyed", "()V");

        _jniInitialized = true;

        callInitializedJNIMethod(_pluginInitializedMethodId);
        callInitializedJNIMethod(_pluginResumedMethodId);
    }
}
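
The snippet relies on the callInitializedJNIMethod helper, which is not shown above; it might look roughly like this (a sketch, assuming the JavaVMResource RAII wrapper attaches the current thread and exposes the env member used above):

void SimpleInputPlugin::callInitializedJNIMethod(jmethodID methodId_) {
    JavaVMResource vm(pluginJavaVM);
    vm.env->CallVoidMethod(simpleInputPluginActivity, methodId_);
}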

The following overridden methods specify the behaviour of the InputPlugin.

// Defines that the SDK should render the camera frame.
bool SimpleInputPlugin::requestsInputFrameRendering() {
    return true;
}
// Defines that the SDK should process the frames it gets from the InputPlugin.
bool SimpleInputPlugin::requestsInputFrameProcessing() {
    return true;
}
// Defines that the SDK allows other Plugins to use the frames provided by the InputPlugin
bool SimpleInputPlugin::allowsUsageByOtherPlugins() {
    return true;
}

The Plugin lifecycle:

// When initialize of the Plugin is called the FrameColorSpace of the InputPlugin, which can be YUV_420_NV21(default), YUV_420_YV12 or RGB, is set.
void SimpleInputPlugin::initialize() {
    getFrameSettings().setInputFrameColorSpace(wikitude::sdk::FrameColorSpace::YUV_420_NV21);
}

void SimpleInputPlugin::pause() {
    _running = false;
    callInitializedJNIMethod(_pluginPausedMethodId);
}

void SimpleInputPlugin::resume(unsigned int pausedTime_) {
    _running = true;
    callInitializedJNIMethod(_pluginResumedMethodId);
}

void SimpleInputPlugin::destroy() {
    callInitializedJNIMethod(_pluginDestroyedMethodId);
}

Whenever the SimpleInputPlugin receives a camera frame it notifies the SDK of it.

void SimpleInputPlugin::notifyNewImageBufferData(std::shared_ptr<unsigned char> imageBufferData_) {
    if ( _running ) {
        notifyNewInputFrame(++_frameId, imageBufferData_);
    }
}

The following methods set required information about the input frame.

extern "C" JNIEXPORT void JNICALL
Java_com_wikitude_samples_SimpleInputPluginActivity_setFrameSize(JNIEnv* env, jobject obj, jint frameWidth, jint frameHeight) {
    SimpleInputPlugin::instance->getFrameSettings().setInputFrameSize({frameWidth, frameHeight});
}

extern "C" JNIEXPORT void JNICALL
Java_com_wikitude_samples_SimpleInputPluginActivity_setCameraFieldOfView(JNIEnv* env, jobject obj, jfloat fieldOfView) {
    SimpleInputPlugin::instance->getFrameSettings().setFrameFieldOfView(fieldOfView);
}
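
To complete the picture, the Java camera callback needs a JNI entry point that copies the preview bytes into native memory and hands them to notifyNewImageBufferData; a sketch of how such a function could look (the function name and signature are illustrative assumptions):

#include <memory>

extern "C" JNIEXPORT void JNICALL
Java_com_wikitude_samples_SimpleInputPluginActivity_notifyNewCameraFrameNV21(JNIEnv* env, jobject obj, jbyteArray frameData_) {
    const jsize frameDataSize = env->GetArrayLength(frameData_);

    // Copy the Java byte array into a native buffer owned by a shared_ptr.
    std::shared_ptr<unsigned char> frameData(new unsigned char[frameDataSize],
                                             std::default_delete<unsigned char[]>());
    env->GetByteArrayRegion(frameData_, 0, frameDataSize, reinterpret_cast<jbyte*>(frameData.get()));

    SimpleInputPlugin::instance->notifyNewImageBufferData(frameData);
}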

Custom Camera

The custom camera example illustrates both principles in a unified use case. A custom camera stream is supplied as an input and a custom rendering effect is used to augment the rendered output.

Concurrency

When implementing an InputPlugin, one needs to be aware that its callback functions are invoked concurrently by the Wikitude SDK Native API. It is therefore necessary to protect against race conditions accordingly. We will present two recommended measures to do so: atomic operations and mutual exclusion.

In order to fully utilize the capabilities of the Input Plugins API, one must gather data from several asynchronously called member functions, store it, potentially in member variables, and subsequently use it collectively. These operations may be vulnerable to race conditions.

An example snippet from the custom camera example application code:

void YUVFrameInputPlugin::surfaceChanged(wikitude::sdk::Size<int> renderSurfaceSize_, wikitude::sdk::Size<float> cameraSurfaceScaling_, wikitude::sdk::DeviceOrientation deviceOrientation_) {

    // some orientation handling code here

    _surfaceInitialized.store(true);
}
void YUVFrameInputPlugin::startRender() {
    // some early exit code here

    render();
}
void YUVFrameInputPlugin::render() {

    // some early exit code here

    if (!_surfaceInitialized.load()) {
        return;
    }

    // lots of OpenGL code here
}
#include <atomic>

std::atomic_bool _surfaceInitialized;

The surfaceChanged function and the startRender function are invoked concurrently. We rely on a boolean value inside the render function that is set from the surfaceChanged function, yielding a race condition should boolean reads and writes be non-atomic. In such cases, involving intrinsic data types for which atomic operations are provided by the C++ standard library, we recommend their use. These std::atomics can either be set and read intuitively through the corresponding operators or through the load and store functions.

An alternative snippet for which atomic operations are not available:

void YUVFrameInputPlugin::update(const std::list<wikitude::sdk::RecognizedTarget>& recognizedTargets_) {
    // platform specific initialization code here

    { // mutex auto release scope
        std::lock_guard<std::mutex> lock(_currentlyRecognizedTargetsMutex);
        _currentlyRecognizedTargets = std::list<wikitude::sdk::RecognizedTarget>(recognizedTargets_);
    }
}
void YUVFrameInputPlugin::startRender() {
    // some early exit code here

    render();
}
void YUVFrameInputPlugin::render() {
    // early returns and lots of OpenGL code here

    { // mutex auto release scope
        std::unique_lock<std::mutex> lock(_currentlyRecognizedTargetsMutex);

        if (!_currentlyRecognizedTargets.empty()) {
            const wikitude::sdk::RecognizedTarget targetToDraw = _currentlyRecognizedTargets.front();

            // early unlock to minimize locking duration
            lock.unlock();

            // lots of OpenGL code here
        }
    }
}

The update function and the startRender function are invoked concurrently. We, again, rely on data being set from an asynchronous function within our render function. Contrary to the previous case though, an object of type std::list cannot be set atomically using std::atomics. Therefore we employ a std::mutex as a locking mechanism to ensure atomicity. As depicted by the code snippet, we encourage the use of RAII style mutex locking using std::lock_guard and std::unique_lock to ensure proper mutex release.

OpenGL Context

Another important issue to be aware of is the availability of a valid OpenGL context during plugin run-time. We guarantee such a valid context to be available during the execution of the startRender, endRender, pause and resume functions. The former two functions should contain all of the rendering related function calls, while the latter two should be used to release and acquire OpenGL related resources, as the OpenGL context is likely to be destroyed upon pausing the application and recreated upon resuming it. All previously acquired OpenGL handles are therefore no longer valid and need to be reacquired.

A code snippet from the custom camera example:

void YUVFrameInputPlugin::pause() {

    releaseFramebufferObject();
    releaseFrameTextures();
    releaseVertexBuffers();
    releaseShaderProgram();

    _renderingInitialized.store(false);

    // some additional code here
}
void YUVFrameInputPlugin::startRender() {
    if (!_renderingInitialized.load()) {
        _renderingInitialized.store(setupRendering());
    }

    render();
}

We release all the OpenGL resources we previously created and atomically set the _renderingInitialized flag to false, causing the rendering environment to be reinitialised during the next execution of the render loop.

Device Orientation

Lastly, we will demonstrate the rendering of an input frame from within an InputPlugin using OpenGL, with device orientations taken into account. While there are alternative ways to achieve a correctly oriented frame rendering, we recommend applying the required transformations as matrices within a custom vertex shader.

We propose the following code in our custom camera example application to compose a matrix that is to be applied to a fullscreen quad:

void YUVFrameInputPlugin::surfaceChanged(wikitude::sdk::Size<int> renderSurfaceSize_, wikitude::sdk::Size<float> cameraSurfaceScaling_, wikitude::sdk::DeviceOrientation deviceOrientation_) {
    wikitude::sdk::Matrix4 scaleMatrix;
    scaleMatrix.scale(cameraSurfaceScaling_.width, cameraSurfaceScaling_.height, 1.0f);

    switch (deviceOrientation_)
    {
        case wikitude::sdk::DeviceOrientation::DeviceOrientationPortrait:
        {
            wikitude::sdk::Matrix4 rotationToPortrait;
            rotationToPortrait.rotateZ(270.0f);

            _orientationMatrix = rotationToPortrait;
            break;
        }
        case wikitude::sdk::DeviceOrientation::DeviceOrientationPortraitUpsideDown:
        {
            wikitude::sdk::Matrix4 rotationToUpsideDown;
            rotationToUpsideDown.rotateZ(90.0f);

            _orientationMatrix = rotationToUpsideDown;
            break;
        }
        case wikitude::sdk::DeviceOrientation::DeviceOrientationLandscapeLeft:
        {
            wikitude::sdk::Matrix4 rotationToLandscapeLeft;
            rotationToLandscapeLeft.rotateZ(180.0f);

            _orientationMatrix = rotationToLandscapeLeft;
            break;
        }
        case wikitude::sdk::DeviceOrientation::DeviceOrientationLandscapeRight:
        {
            _orientationMatrix.identity();
            break;
        }
    }

    _modelMatrix = scaleMatrix * _orientationMatrix;

    // some synchronization code here
}
attribute vec3 vPosition;
attribute vec2 vTexCoords;

varying mediump vec2 fTexCoords;

uniform mat4 uModelMatrix;

void main(void)
{
    gl_Position = uModelMatrix * vec4(vPosition, 1.0);
    fTexCoords = vTexCoords;
}";
struct Vertex
{
    GLfloat position[3];
    GLfloat texCoord[2];
};

Vertex _vertices[4];
_vertices[0] = (Vertex){{1.0f, -1.0f, 0}, {1.0f, 0.0f}};
_vertices[1] = (Vertex){{1.0f, 1.0f, 0}, {1.0f, 1.0f}};
_vertices[2] = (Vertex){{-1.0f, 1.0f, 0}, {0.0f, 1.0f}};
_vertices[3] = (Vertex){{-1.0f, -1.0f, 0}, {0.0f, 0.0f}};

The matrix composed within the surfaceChanged function is supplied to the vertex shader as a uniform parameter and subsequently used to transform the input vertices. Be aware, though, that an additional matrix may be required depending on whether you previously rendered to an FBO. If so, the following amendment to the surfaceChanged function should correct the flipped Y-axis resulting from this process:

wikitude::sdk::Matrix4 scaleMatrix; // scaled by cameraSurfaceScaling_ as shown above
_fboCorrectionMatrix.scale(1.0f, -1.0f, 1.0f);

// same device orientation code here as depicted above

_modelMatrix = scaleMatrix * _orientationMatrix * _fboCorrectionMatrix;
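
For completeness, the composed model matrix can then be uploaded to the uModelMatrix uniform along these lines (a sketch; the _shaderProgram handle and the accessor returning the 16 float values of wikitude::sdk::Matrix4 are assumptions):

const GLint modelMatrixLocation = glGetUniformLocation(_shaderProgram, "uModelMatrix");
glUniformMatrix4fv(modelMatrixLocation, 1, GL_FALSE, _modelMatrix.get()); // get() assumed to return a pointer to the 16 floats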

For a complete implementation of an input plugin for a specific and advanced use case, we strongly recommend looking into the custom camera example application source code. It is also an excellent starting point from which to build your own implementation.