Video & Audio Filters

Video Filters

Some apps allow filters to be applied to the current user’s video, such as blurred or virtual background, AR elements (glasses, moustaches etc) or color filters (such as sepia, bloom etc).

Our SDK offers background blurring and virtual backgrounds out of the box and also has support for injecting your custom filter into the calling experience.

AI Filters

We ship two AI video filters, background blur and virtual background, in a separate artifact. To use these filters, add the video filter library as a dependency:

dependencies {
    implementation("io.getstream:stream-video-android-filters-video:$stream_version")
}

Background Blur

Use the BlurredBackgroundVideoFilter class to add a background blur effect to the video feed. Pass an instance of it to the videoFilter call property.

call.videoFilter = BlurredBackgroundVideoFilter()

The BlurredBackgroundVideoFilter class has the following parameters:

Parameter	Description	Default Value
`blurIntensity`	Intensity of the blur effect. Three options available in the `BlurIntensity` enum: `LIGHT`, `MEDIUM`, `HEAVY`.	`BlurIntensity.MEDIUM`
`foregroundThreshold`	Confidence threshold for the foreground. Pixels with a confidence value greater than or equal to this threshold are considered to be in the foreground. Coerced between 0 and 1, inclusive.	`0.99999`

Virtual Backgrounds

Pass an instance of VirtualBackgroundFilter to the videoFilter call property to set a custom image as the participant’s background.

call.videoFilter = VirtualBackgroundVideoFilter(context, R.drawable.background_image)

The VirtualBackgroundFilter class has the following parameters:

Parameter	Description	Default Value
`context`	Context used to access resources.	None
`backgroundImage`	The drawable resource ID of the image to be used as a virtual background.	None
`foregroundThreshold`	Confidence threshold for the foreground. Pixels with a confidence value greater than or equal to this threshold are considered to be in the foreground. Coerced between 0 and 1, inclusive.	`0.99999`

Adding Your Custom Filter

You can inject your own filter through the Call.videoFilter property. You will receive each frame of the user’s local video, so you have complete freedom over the image processing that you perform.

There are two types of filters that you can create:

By extending the BitmapVideoFilter abstract class. This gives you a bitmap for each video frame, which you can manipulate directly.

abstract class BitmapVideoFilter : VideoFilter() {
    fun applyFilter(videoFrameBitmap: Bitmap)
}

You need to manipulate the original bitmap instance to apply the filters. You can of course create a new bitmap in the process, but you need to draw it on the videoFrameBitmap instance you get in the filter callback.

By extending the RawVideoFilter abstract class. This gives you direct access to the VideoFrame WebRTC object.

abstract class RawVideoFilter : VideoFilter() {
    abstract fun applyFilter(
        videoFrame: VideoFrame,
        surfaceTextureHelper: SurfaceTextureHelper,
    ): VideoFrame
}

BitmapVideoFilter is less performant than RawVideoFilter due to the overhead of certain operations, like YUV <-> ARGB conversions.

Example: Black & White Filter

We can create and set a simple black and white filter like this:

call.videoFilter = object : BitmapVideoFilter() {
    override fun applyFilter(videoFrameBitmap: Bitmap) {
        val c = Canvas(videoFrameBitmap)
        val paint = Paint()
        val cm = ColorMatrix()
        cm.setSaturation(0f)
        val f = ColorMatrixColorFilter(cm)
        paint.colorFilter = f
        c.drawBitmap(videoFrameBitmap, 0f, 0f, paint)
    }
}

Audio Filters

The Stream Video SDK also supports custom audio processing of the local track. This opens up possibilities for custom echo filtering, voice changing or other audio effects.

If you want to do custom audio processing, you need to provide your own implementation of the AudioFilter interface to Call.audioFilter.

The AudioFilter interface is defined like this:

interface AudioFilter {
    /**
     * Invoked after an audio sample is recorded. Can be used to manipulate
     * the ByteBuffer before it's fed into WebRTC. Currently the audio in the
     * ByteBuffer is always PCM 16bit and the buffer sample size is ~10ms.
     *
     * @param audioFormat format in android.media.AudioFormat
     */
    fun applyFilter(audioFormat: Int, channelCount: Int, sampleRate: Int, sampleData: ByteBuffer)
}

Example: Robot Voice

In the following example, we will build a simple audio filter that gives the user’s voice a robotic touch.

// We assume that you already have a call instance (call is started)
// Create a simple filter (pitch modification) and assign it to the call

call.audioFilter = object : AudioFilter {
    override fun applyFilter(audioFormat: Int, channelCount: Int, sampleRate: Int, sampleData: ByteBuffer) {
        // You can modify the pitch factor to achieve a bit different effect
        val pitchShiftFactor = 0.8f
        val inputBuffer = audioBuffer.duplicate()
        inputBuffer.order(ByteOrder.LITTLE_ENDIAN) // Set byte order for correct handling of PCM data

        val numSamples = inputBuffer.remaining() / 2 // Assuming 16-bit PCM audio

        val outputBuffer = ByteBuffer.allocate(inputBuffer.capacity())
        outputBuffer.order(ByteOrder.LITTLE_ENDIAN)

        for (channel in 0 until numChannels) {
            val channelBuffer = ShortArray(numSamples)
            inputBuffer.asShortBuffer().get(channelBuffer)

            for (i in 0 until numSamples) {
                val originalIndex = (i * pitchShiftFactor).toInt()

                if (originalIndex >= 0 && originalIndex < numSamples) {
                    outputBuffer.putShort(channelBuffer[originalIndex])
                } else {
                    // Fill with silence if the index is out of bounds
                    outputBuffer.putShort(0)
                }
            }
        }

        outputBuffer.flip()
        audioBuffer.clear()
        audioBuffer.put(outputBuffer)
        audioBuffer.flip()
    }
}

This is a simple algorithm that just does index shifting. For a more complex scenario you can use a voice processing library. The important part is that you update the channelBuffer with the filtered values.

Screen sharing

Screenshots