Input Plugins API Wikitude SDK iOS Native API 9.13.0 Documentation

Input Plugins API

This guide provides an introduction to the input plugins API of the Wikitude Native SDK and aims to familiarize the reader with its concepts and constraints. Due to the length and complexity of the corresponding example application source code, it will not be presented in its entirety. Relevant and descriptive source code examples will, however, be provided. Since the Input Plugins API is an extension to the Plugins API, we recommend familiarity with it prior to reading this guide.

About Wikitude Native SDK Input Plugins

The Input Plugins API provides a means to alter the inputs and outputs of the Wikitude Native SDK. For the input case specifically, custom frame data from arbitrary sources can be supplied as an input to the Wikitude Native SDK API for processing. Conversely, for the output case, the default rendering of the Wikitude Native SDK can be substituted with more advanced implementations.

Input plugins can be registered before the SDK is started or while it is already running. If an input plugin is registered before the Wikitude Native SDK is started, the internal Wikitude camera implementation is not started at all and the Wikitude Native SDK uses the input plugin from the very beginning. If an input plugin is registered at runtime, the internal Wikitude Native SDK camera is stopped first, followed by a transition to the newly registered input plugin.

Camera Frame Input Plugin Module Class

class CameraFrameInputPluginModule : public PluginModule {
public:
    CameraFrameInputPluginModule() noexcept = default;
    virtual ~CameraFrameInputPluginModule() = default;

    /**
    * Override/implement this method to know when the default platform camera is fully released and this camera frame input plugin module can safely access all platform camera resources
    */
    virtual void onCameraReleased() = 0;
    virtual void onCameraReleaseFailed(const sdk::Error& error_) = 0;

    /**
    * Implement this method if this plugin module supports suspending camera frame updates while the surrounding SDK is still running.
    * This would be the case if PlatformCameraModule::setEnabled(false) is called while WikitudeUniversalSDK::isRunning is true.
    *
    * In case PlatformCameraModule::setEnabled(true) is called, resumeCameraFrameUpdates() is called.
    *
    * In case PlatformCameraModule::setEnabled(true) is called while WikitudeUniversalSDK::isRunning is false, it's up to subclasses of this to handle this state correctly
    */
    virtual void pauseCameraFrameUpdates();
    virtual void resumeCameraFrameUpdates();

    /**
    * Implement this method if this plugin module supports camera focus mode changes.
    */
    virtual sdk::CallStatus setFocusMode(CameraFocusMode focusMode_);

    /**
    * Default: false
    */
    bool requestsCameraFrameRendering();

    /* Called from the Wikitude SDK */
    void registerOnPluginCameraReleasedHandler(std::function<void()> onPluginCameraReleasedHandler_);
    void registerNotifyNewUnmanagedCameraFrameHandler(std::function<void(const sdk::CameraFrame& cameraFrame_)> notifyNewUnmanagedCameraFrameHandler_);
    void registerCameraToSurfaceAngleChangedHandler(std::function<void(float cameraToSurfaceAngle_)> cameraToSurfaceAngleChangedHandler_);
    void registerOnPluginCameraErrorHandler(std::function<void(const sdk::Error& error_)> onPluginCameraErrorHandler_);

protected:
    /**
    * Call this method to notify a new camera frame to the SDK
    */
    void notifyNewUnmanagedCameraFrameToSDK(const sdk::CameraFrame& cameraFrame_);

    /**
    * Call this method to notify the SDK that this camera frame input plugin module fully released all platform camera resources.
    *
    */
    void notifyPluginCameraReleased();

    void setCameraToSurfaceAngle(float cameraToSurfaceAngle_);

    void onPluginCameraError(const sdk::Error& error_);

protected:
    bool                                            _requestsCameraFrameRendering = false;
    bool                                            _userDisabledCameraFrameUpdates = false;

private:
    std::function<void()>                           _onPluginCameraReleasedHandler;
    std::function<void(const sdk::CameraFrame&)>    _notifyNewUnmanagedCameraFrameHandler;
    std::function<void(float)>                      _cameraToSurfaceAngleChangedHandler;
    std::function<void(const sdk::Error&)>          _onPluginCameraErrorHandler;
};

An input plugin is simply a plugin that has a CameraFrameInputPluginModule associated with it. The module is where the input-related features are implemented.

Camera Frame Input Plugin Module Implementation

The following code is a minimal example of how to create an input plugin and provide a camera frame to be rendered and processed. The code corresponds to the simple input plugin sample of the Wikitude Native SDK sample application.

class SimpleYUVCameraFrameInputPluginModule : public wikitude::sdk::CameraFrameInputPluginModule {
public:
    SimpleYUVCameraFrameInputPluginModule() {
        _requestsCameraFrameRendering = true;
    }

    void onCameraReleased() override {

    }

    void onCameraReleaseFailed(const wikitude::sdk::Error& error_) override {

    }

    void notifyNewCameraFrame(const wikitude::sdk::CameraFrame& cameraFrame_) {
        notifyNewUnmanagedCameraFrameToSDK(cameraFrame_);
    }

    void cameraToSurfaceAngleChanged(float cameraToSurfaceAngle_) {
        setCameraToSurfaceAngle(cameraToSurfaceAngle_);
    }
};

The _requestsCameraFrameRendering flag is used to communicate whether the frame should be rendered by the SDK or not.

The onCameraReleased and onCameraReleaseFailed functions should be used to wait for the internal camera of the SDK to shut down before opening a new camera. This only applies if the input plugin is registered after the SDK has been started.
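As a rough illustration of this waiting pattern, the sketch below opens a custom camera only once the release callback arrives, and keeps it closed if the release fails. It is a minimal sketch under stated assumptions: Error, CameraFrameInputPluginModuleStandIn, CustomCamera and DeferredOpenInputModule are hypothetical stand-ins written for this guide so that the code compiles without the SDK. They are not SDK types.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for wikitude::sdk::Error (assumption, not the SDK type).
struct Error {
    std::string message;
};

// Hypothetical stand-in exposing only the two release callbacks of the module base class.
class CameraFrameInputPluginModuleStandIn {
public:
    virtual ~CameraFrameInputPluginModuleStandIn() = default;
    virtual void onCameraReleased() = 0;
    virtual void onCameraReleaseFailed(const Error& error_) = 0;
};

// Hypothetical wrapper around a platform camera.
class CustomCamera {
public:
    void open() {
        assert(!_open); // opening twice would be a programming error in this sketch
        _open = true;
    }
    bool isOpen() const { return _open; }

private:
    bool _open = false;
};

class DeferredOpenInputModule : public CameraFrameInputPluginModuleStandIn {
public:
    void onCameraReleased() override {
        // The internal camera has let go of the device, so the custom camera may open now.
        _camera.open();
    }

    void onCameraReleaseFailed(const Error& error_) override {
        // Keep the custom camera closed and remember the reason for later reporting.
        _lastError = error_.message;
    }

    bool isCameraOpen() const { return _camera.isOpen(); }
    const std::string& lastError() const { return _lastError; }

private:
    CustomCamera _camera;
    std::string _lastError;
};
```

When the plugin is registered before the SDK starts, there is no internal camera to wait for, so a real module would also need a path that opens the camera without this callback.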

The notifyNewUnmanagedCameraFrameToSDK function is used to pass a camera frame to the SDK for rendering and processing. The CameraFrame class encapsulates the frame data and the frame metadata.

The setCameraToSurfaceAngle function is used to pass the rotation of the camera relative to the rotation of the render surface. This has to be set so that the SDK can render the camera frame correctly and provide correct matrices for tracking.

After a camera frame has been acquired, it can be converted and forwarded using code akin to the following snippet.

std::int64_t timevalue = presentationTimestamp_.value;

const float fov = static_cast<float>([_camera fieldOfView]);
const wikitude::sdk::Size<int> size{static_cast<int>(CVPixelBufferGetWidth(imageBuffer_)), static_cast<int>(CVPixelBufferGetHeight(imageBuffer_))};
const std::int32_t timescale = presentationTimestamp_.timescale;
wikitude::sdk::ColorCameraFrameMetadata metaData(fov,
                                                 size,
                                                 wikitude::sdk::CameraPosition::Back,
                                                 wikitude::sdk::ColorSpace::YUV_420_NV21,
                                                 timescale
                                                 );

std::vector<wikitude::sdk::CameraFramePlane> planes;

if ( CVPixelBufferIsPlanar(imageBuffer_) ) {
    std::size_t planeCount = CVPixelBufferGetPlaneCount(imageBuffer_);
    for (std::size_t i = 0; i < planeCount; ++i) {
        std::size_t planeSize = CVPixelBufferGetHeightOfPlane(imageBuffer_, i) * CVPixelBufferGetBytesPerRowOfPlane(imageBuffer_, i);
        void* planeData = CVPixelBufferGetBaseAddressOfPlane(imageBuffer_, i);
        planes.emplace_back(planeData, planeSize);
    }
} else {
    std::size_t frameSize = CVPixelBufferGetDataSize(imageBuffer_);
    void* frameData = CVPixelBufferGetBaseAddress(imageBuffer_);
    planes.emplace_back(frameData, frameSize);
}

wikitude::sdk::CameraFrame cameraFrame(_frameId++, timevalue, metaData, planes);
notifyNewCameraFrame(cameraFrame);
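The plane sizes in the snippet above are computed as plane height times bytes per row, so they include any row padding the pixel buffer carries. The following sketch shows that arithmetic for a bi-planar 4:2:0 frame. PlaneLayout, planeSizeInBytes and biPlanar420Layout are hypothetical helpers, and the row alignment is an assumed parameter. In real code the stride should be read from CVPixelBufferGetBytesPerRowOfPlane rather than computed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one plane of a pixel buffer.
struct PlaneLayout {
    std::size_t height;      // number of rows in this plane
    std::size_t bytesPerRow; // row stride in bytes, including padding
};

// Size of one plane in bytes: rows times stride, as in the snippet above.
std::size_t planeSizeInBytes(const PlaneLayout& plane_) {
    return plane_.height * plane_.bytesPerRow;
}

// Layout of a bi-planar 4:2:0 frame with even width and height:
// a full-resolution luma plane followed by an interleaved chroma plane
// with half the rows and the same number of bytes per row.
std::vector<PlaneLayout> biPlanar420Layout(std::size_t width_, std::size_t height_, std::size_t rowAlignment_) {
    assert(rowAlignment_ > 0);
    assert(width_ % 2 == 0 && height_ % 2 == 0);
    // Round the row width up to the next multiple of the assumed alignment.
    const std::size_t stride = (width_ + rowAlignment_ - 1) / rowAlignment_ * rowAlignment_;
    return { PlaneLayout{height_, stride}, PlaneLayout{height_ / 2, stride} };
}
```

With an assumed 64-byte alignment, a 1080 pixel wide frame gets a stride of 1088 bytes, which is why width times height alone would understate the plane size.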

For a complete implementation of an input plugin for a specific and advanced use case, we strongly recommend looking into the custom camera example application source code. Additionally, the custom camera sample source code is an excellent starting point to build your own implementation from.

Frame Metadata

As seen in the latter code snippet of the previous section, a camera frame is accompanied by a metadata object. This object is to be constructed as follows.

Field of View

The field of view parameter is a floating point value representing the horizontal view angle, in degrees, of the camera the provided frame was captured with. This value is required for the Wikitude computer vision engine to be able to accurately recognise and track targets within the provided frame. Note that the field of view value may differ significantly from device to device. We therefore recommend querying this value from the frame source directly to ensure representative values. For input image files and input video files this value should be discernible from the corresponding metadata; for an input camera stream this value should be accessible through the corresponding camera API.
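If a frame source reports only a focal length and a sensor width rather than a view angle, the horizontal field of view can be approximated with the pinhole camera relation fov = 2 * atan(sensorWidth / (2 * focalLength)). The following is a minimal sketch of that conversion. The function name and its parameters are assumptions made for this illustration, and a value queried directly from the camera API remains preferable where one is available.

```cpp
#include <cassert>
#include <cmath>

// Approximates the horizontal field of view in degrees from the sensor width
// and the focal length, both given in the same unit (for example millimetres).
// Uses the pinhole camera model and ignores lens distortion.
float horizontalFieldOfViewDegrees(float sensorWidth_, float focalLength_) {
    assert(sensorWidth_ > 0.0f && focalLength_ > 0.0f);
    const float pi = 3.14159265358979f;
    const float radians = 2.0f * std::atan(sensorWidth_ / (2.0f * focalLength_));
    return radians * 180.0f / pi;
}
```

For a 36 mm wide sensor behind a 50 mm lens this yields roughly 39.6 degrees.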

Frame Size

The frame size parameter is of type wikitude::sdk::Size<int> and contains the input image width and input image height in pixels. Since this value will be constant for many use cases, you may consider hard-coding it to the appropriate values. Alternatively, as with the field of view, we recommend querying the values from either the input file or the input camera device.

Camera Position

Should the input plugin represent an image stream captured from one of the host device's cameras, the camera position parameter indicates which camera provides said image stream. It is an enumeration value which is defined in the CameraPosition.hpp header file. Currently valid values are Back, Front and Unspecified. This enumeration may, however, be extended should smartphone or smartglass technology evolve to include additional cameras.

Color Space

The color space parameter is of an enumeration type defined in the ColorSpace.hpp header. It indicates what format the supplied frame is encoded in. The enumeration includes several values for both RGB and YUV data and needs to be set appropriately. The ColorSpace.hpp header includes extensive comments on each of the enumeration values, which is why additional details are omitted here. The interested reader is instead referred to the header file.

Timescale

The timescale parameter defines how the time values of the input frames are to be interpreted. It is an integer value that is used as the denominator in the following equation to determine the time in seconds: seconds = timevalue / timescale. The timescale will likely correspond to a reciprocal SI prefix, e.g. 1 for identity, 10^3 for milli, 10^6 for micro, 10^9 for nano and so forth.
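The equation above can be sketched as a small helper. The function name frameTimeInSeconds is an assumption made for this illustration; the parameter types mirror the std::int64_t time value and std::int32_t timescale used in the conversion snippet earlier in this guide.

```cpp
#include <cassert>
#include <cstdint>

// Converts a frame time value to seconds using the timescale as the denominator.
double frameTimeInSeconds(std::int64_t timeValue_, std::int32_t timescale_) {
    assert(timescale_ > 0); // a zero or negative timescale has no meaningful interpretation here
    return static_cast<double>(timeValue_) / static_cast<double>(timescale_);
}
```

A time value of 1500 with a timescale of 1000 (milliseconds) therefore corresponds to 1.5 seconds.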

Frame Rendering

To indicate that you would like the input frame rendered by the Wikitude Native SDK, simply set the _requestsCameraFrameRendering field of the CameraFrameInputPluginModule base class to true. Should you prefer to do your own rendering and simply submit the input frame for processing, set it to false instead.

For the latter case, you need to implement at least one rendering plugin module that performs the rendering of the camera frame. Each rendering API to be supported requires its own rendering module. In each of these modules the camera frame data is to be acquired and subsequently rendered. While custom camera frame rendering can be tricky to get correct, it allows effects to be added that the SDK does not support out of the box.

void OpenGLESScanningEffectRenderingPluginModule::startRender(wikitude::sdk::RenderableCameraFrameBucket& frameBucket_) {
    long processedFrameId_ = _trackingParameters.getProcessedFrameId();
    frameBucket_.getRenderableCameraFrameForId(processedFrameId_, [this](wikitude::sdk::RenderableCameraFrame& frame_) {
        render(frame_);
    }, [](wikitude::sdk::Error& error_) {
        std::cout << error_;
    });
}

Rendering Considerations

While rendering of a camera frame appears to be a trivial task, it can be quite intricate to get right. Since the aspect ratio of the camera frame does not necessarily match the aspect ratio of the display, especially so considering the four different device orientations, transformations need to be adjusted accordingly. The following snippets showcase how to acquire the necessary data and hint at how to apply them. For a complete implementation the reader is referred to the actual sample code which demonstrates the entirety of the process.

The plugin registers to be notified when changes in the data occur. The cameraToSurfaceAngle is the rotation, in degrees, between the camera frame and the rendering surface; the handler receives it as a float. The cameraToSurfaceScaling is a two component floating point scale value (in X and Y) between the camera frame and the rendering surface. These values change upon device orientation change.

wikitude::sdk::RuntimeParameters& runtimeParameters = pluginParameterCollection_.getRuntimeParameters();

runtimeParameters.addCameraToSurfaceAngleChangedHandler(reinterpret_cast<std::uintptr_t>(this), [&](float cameraToSurfaceAngle_) {
    static_cast<OpenGLESScanningEffectRenderingPluginModule*>(getOpenGLESRenderingPluginModule())->cameraToSurfaceAngleChanged(cameraToSurfaceAngle_);
});

runtimeParameters.addCameraToSurfaceScalingChangedHandler(reinterpret_cast<std::uintptr_t>(this), [&](wikitude::sdk::Scale2D<float> cameraToSurfaceScaling_) {
    static_cast<OpenGLESScanningEffectRenderingPluginModule*>(getOpenGLESRenderingPluginModule())->cameraToSurfaceScalingChanged(cameraToSurfaceScaling_);
});

The module has its changed handlers called and updates its internal transformations accordingly. Since these updates are asynchronous in nature, synchronization is required within the handlers as well as in the rendering loop that uses the updated values.

void OpenGLESScanningEffectRenderingPluginModule::cameraToSurfaceAngleChanged(float cameraToSurfaceAngle_) {
    _cameraToSurfaceAngle = cameraToSurfaceAngle_;

    std::lock_guard<std::mutex> l(_surfaceInitializedMutex);

    _cameraToSurfaceAngleInitialized = true;

    if (_cameraToSurfaceScalingInitialized) {
        _surfaceInitialized = true;

        updateMatrices();
    }
}

void OpenGLESScanningEffectRenderingPluginModule::cameraToSurfaceScalingChanged(wikitude::sdk::Scale2D<float> cameraToSurfaceScaling_) {
    _cameraToSurfaceScaling = cameraToSurfaceScaling_;

    std::lock_guard<std::mutex> l(_surfaceInitializedMutex);

    _cameraToSurfaceScalingInitialized = true;

    if (_cameraToSurfaceAngleInitialized) {
        _surfaceInitialized = true;

        updateMatrices();
    }
}
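The body of updateMatrices() is not shown in this guide. As a rough illustration of what a camera-to-surface scale can represent, the following sketch computes an aspect-fill scale from the frame size, the surface size and the rotation angle. Scale2D and aspectFillScale are hypothetical stand-ins written for this illustration. The SDK supplies its own scaling through the handler shown above, and its exact convention may differ from the one assumed here.

```cpp
#include <cassert>

// Hypothetical stand-in for wikitude::sdk::Scale2D<float> (assumption, not the SDK type).
struct Scale2D {
    float x;
    float y;
};

// Computes the scale to apply to a full-surface quad so that the rotated camera
// frame keeps its aspect ratio and covers the whole surface (aspect fill).
// The angle is expected to be a multiple of 90 degrees.
Scale2D aspectFillScale(int frameWidth_, int frameHeight_,
                        int surfaceWidth_, int surfaceHeight_,
                        int cameraToSurfaceAngle_) {
    assert(frameWidth_ > 0 && frameHeight_ > 0);
    assert(surfaceWidth_ > 0 && surfaceHeight_ > 0);
    assert(cameraToSurfaceAngle_ % 90 == 0);

    // A rotation by 90 or 270 degrees swaps the frame's width and height.
    const bool swapped = (cameraToSurfaceAngle_ / 90) % 2 != 0;
    const float rotatedWidth = static_cast<float>(swapped ? frameHeight_ : frameWidth_);
    const float rotatedHeight = static_cast<float>(swapped ? frameWidth_ : frameHeight_);

    const float frameAspect = rotatedWidth / rotatedHeight;
    const float surfaceAspect = static_cast<float>(surfaceWidth_) / static_cast<float>(surfaceHeight_);

    if (frameAspect > surfaceAspect) {
        // The frame is wider than the surface: enlarge horizontally and crop the sides.
        return Scale2D{frameAspect / surfaceAspect, 1.0f};
    }
    // The frame is taller than (or matches) the surface: enlarge vertically and crop top and bottom.
    return Scale2D{1.0f, surfaceAspect / frameAspect};
}
```

A 1920x1080 frame rotated by 90 degrees onto a 1080x1920 surface needs no extra scaling, while a 640x480 frame on a 1920x1080 surface is enlarged vertically by a factor of about 1.33.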

Simple Input Plugin

This sample shows the implementation of a custom camera whose frames are rendered by the Wikitude Native SDK. The Simple Input Plugin example consists of the C++ class SimpleYUVInputPlugin and the Objective-C class WTSimpleYUVDeviceCameraInputPlugin. The class WTSimpleDeviceCamera contains a simple implementation of the AVFoundation camera API. The class WTSimpleYUVDeviceCameraInputPlugin creates an instance of SimpleYUVInputPlugin, so that the iOS SDK camera can be started and stopped and the frames passed to the C++ input plugin implementation.

The code to register the plugin is written in the WTSimpleInputPluginViewController.mm file. The WTSimpleYUVInputCamera object is created right after the WTWikitudeNativeSDK object is created

self.simpleYUVInputCamera = [[WTSimpleYUVInputCamera alloc] init];

and the input plugin is registered before the Wikitude Native SDK is started.

NSError *error = nil;
BOOL pluginRegistered = [self.wikitudeSDK registerPlugin:[_simpleYUVInputCamera cameraInputPlugin] error:&error];
if ( !pluginRegistered )
{
    NSLog(@"Unable to register plugin '%@'. Error: %@", [NSString stringWithUTF8String:[_simpleYUVInputCamera cameraInputPlugin]->getIdentifier().c_str()], [error localizedDescription]);
}

[self.wikitudeSDK start:nil completion:^(BOOL isRunning, NSError * _Nonnull error) {
    // ...
}];

Registering an input plugin before the Wikitude Native SDK starts indicates to the Wikitude Native SDK that the internal camera implementation is not needed, so it is not started. This leads to a clean and fast startup of the SDK with no camera switch happening in the first frames.

The class SimpleYUVInputPlugin derives from Plugin and contains a minimal implementation of an input plugin for the Wikitude SDKs.

The camera lifecycle is coupled to the input plugin lifecycle methods. The following code snippet demonstrates this:

void SimpleYUVInputPlugin::initialize(const std::string& temporaryDirectory_, wikitude::sdk::PluginParameterCollection& pluginParameterCollection_) {

    /* The initialize method is used to initialize the iOS SDK camera */
    setCameraFrameInputPluginModule(std::make_unique<YUVDeviceCameraFrameInputPluginModule>());
}

void SimpleYUVInputPlugin::pause() {

    /* In case the SDK pauses (because the hosting application resigns active), the iOS SDK camera is paused as well */
    iterateEnabledPluginModules([](wikitude::sdk::PluginModule& pluginModule_) {
        pluginModule_.pause();
    });
}

void SimpleYUVInputPlugin::resume(unsigned int pausedTime_) {

    /* Like in `pause()`, `resume()` is used to resume the iOS SDK camera */
    iterateEnabledPluginModules([&](wikitude::sdk::PluginModule& pluginModule_) {
        pluginModule_.resume(pausedTime_);
    });
}

void SimpleYUVInputPlugin::destroy() {

    /* In case the plugin is destroyed (which happens as soon as the plugin is unregistered and the hosting application destroys the shared_ptr used to register the plugin), the iOS SDK camera is released as well */
    static_cast<YUVDeviceCameraFrameInputPluginModule*>(getCameraFrameInputPluginModule())->releaseCamera();
}

The implementation of notifyNewImageBufferData calls the notifyNewInputFrame method defined for input plugins. It generates a new frame id and passes the std::shared_ptr containing the frame data generated in Objective-C.

void SimpleYUVInputPlugin::notifyNewImageBufferData(std::shared_ptr<unsigned char> imageBufferData) {
    notifyNewInputFrame(++_frameId, imageBufferData);
}

Please note that the implementation used here is purely for demonstration purposes. Production applications are free to implement camera access and camera frame distribution completely differently.