Video Composition

Today most real-time video mixing is done on dedicated hardware machines, which are expensive and inflexible. Unfortunately, the current general-purpose Central Processing Unit (CPU) is not fast enough to perform multi-channel video composition in real time.

The rapid development of Graphics Processing Unit (GPU) technology has made the fastest GPUs 10 to 50 times faster than the fastest CPUs. The power of the GPU, with its parallel streaming pipelines, makes real-time video mixing on the GPU possible.

In principle, the CPU is good at procedural control and the GPU is good at stream computing. Ladybug Mixer, powered by its multithreaded media engine and GPU shaders, can perform multi-channel video composition in real time.

The media engine of Ladybug Mixer benefits from modern CPU, GPU, and chipset technologies. Dual-core CPUs from Intel and AMD speed up the decoding of multiple video streams, and multi-core processors naturally accelerate Ladybug Mixer's multithreaded codecs.

Furthermore, real-time video effects can be computed on the GPU by the shader library of Ladybug Mixer.

Finally, multiple GPUs in an SLI or CrossFire configuration can greatly improve the performance of video mixing on HD media. Fast data channels, such as DRAM, the PCIe bus, and SATA hard disks, also speed up video mixing.

Video mixing, sometimes called cinematic composition, mixes multiple video streams, processes them, and generates an output. It can be represented as a function of multiple variables, R = S(V0, V1, ..., Vn-1), where Vi is the i-th video source, S is the shader, and R is the rendered result. In GPU processing, each video stream is bound as a texture.

(Diagram: video mixing.)
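
To make the function R = S(V0, ..., Vn-1) concrete, the sketch below shows the skeleton of a two-channel mixing shader in D3D9-style HLSL. The sampler names and the equal-weight combination are illustrative assumptions, not part of the Ladybug Mixer library; each video channel is bound as a texture, and the shader S combines the sampled colors into the rendered result R.

sampler V0 : register(s0);   // first video channel, bound as a texture
sampler V1 : register(s1);   // second video channel, bound as a texture

// S: combine the sampled video colors into the render result R
float4 Mix(float2 uv : TEXCOORD0) : COLOR
{
    float4 c0 = tex2D(V0, uv);   // sample channel 0
    float4 c1 = tex2D(V1, uv);   // sample channel 1
    return 0.5*c0 + 0.5*c1;      // example S: equal-weight mix
}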

Now let's look at video mixing by the number of channels.

Solo Channel

Similar to image processing, a single video stream can be transformed frame by frame. There are many such effects, including black and white, negative, and edge detection. Here is a sample list (a shader sketch follows the list):

Negative: reverses the input video colors.
Black and White: maps color to gray levels.
Sepia Tone: classic sepia coloring.
Edge: edge detection.
Bilinear Blur: blur with bilinear filtering.
Gaussian Blur: blur with a Gaussian kernel.
Median: median-value filtering.
Color Swap: swaps among the RGB channels.
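
As a sketch of the first two effects in the list, the HLSL fragment below implements Negative and Black and White for a single channel. The sampler name is an assumption, and the luminance weights are the common Rec. 601 values, not taken from the Ladybug shader library.

sampler V0 : register(s0);   // the solo video channel

// Negative: reverse the input video colors
float4 Negative(float2 uv : TEXCOORD0) : COLOR
{
    float4 c = tex2D(V0, uv);
    return float4(1.0 - c.rgb, c.a);
}

// Black and White: map color to a gray level
float4 BlackAndWhite(float2 uv : TEXCOORD0) : COLOR
{
    float4 c = tex2D(V0, uv);
    float gray = dot(c.rgb, float3(0.299, 0.587, 0.114));
    return float4(gray, gray, gray, c.a);
}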

Duo Channel

Video mixing from two or more channels is often called video composition. Usually, one video channel is treated as the foreground video and another as the background video; we call them the foreground channel and the background channel, respectively. As a special case, the background video can be a still picture.

The famous chroma-key processing of traditional video mixing can be easily implemented in a GPU shader. Suppose the foreground video, such as an actor, is shot against a green screen, and the background video is a beach scene; the value of the chroma key is then the green color. The two channels can be composited as follows: show the foreground channel where its pixel value is not equal to the chroma key, and show the background channel otherwise. This concept can be represented as a C-style conditional expression.

result = (color0 == key) ? color1 : color0;   // show the background where the foreground matches the key
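
In practice, an exact equality test is too strict for camera footage, so a chroma-key shader usually shows the background wherever the foreground pixel is within some distance of the key color. A minimal sketch, in which the sampler names, key, and threshold parameters are illustrative assumptions:

sampler Foreground : register(s0);   // e.g. an actor on a green screen
sampler Background : register(s1);   // e.g. a beach scene
float3 key;          // chroma key color, here green
float  threshold;    // tolerance, e.g. 0.3

float4 ChromaKey(float2 uv : TEXCOORD0) : COLOR
{
    float4 fg = tex2D(Foreground, uv);
    float4 bg = tex2D(Background, uv);
    // open the background channel where the foreground pixel matches the key
    return (distance(fg.rgb, key) < threshold) ? bg : fg;
}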

Another composition is called fade. In this case, a control variable alpha in [0.0, 1.0] generates a linear combination of two videos: the fade is alpha times the pixel value of the first video plus (1.0 - alpha) times the pixel value of the second.

The HLSL code of fade is shown below, where color0 and color1 are the pixel values of the two video streams and alpha is the transparency value.

result = alpha*color0 + (1.0-alpha)*color1;

Furthermore, the wipe composition is widely used in movie effects. It is based on the position of each pixel within the video frame: the wipe transform sweeps frame B over frame A from left to right.

(Figure: wipe transition.)

There are three rectangular regions: frame B only on the left, a gradient region in the middle, and frame A only on the right. The gradient region is the rectangle where frames A and B are mixed. The boundary between the frame A region and the gradient region is called the leading edge; similarly, the boundary between the frame B region and the gradient region is called the trailing edge.
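
A wipe can be written directly from the pixel position: pixels left of the trailing edge show frame B, pixels right of the leading edge show frame A, and the gradient region between the edges blends the two. A sketch, with the edge positions supplied by the application (the names are illustrative):

sampler FrameA : register(s0);
sampler FrameB : register(s1);
float trailingEdge;   // left boundary of the gradient region, in [0,1]
float leadingEdge;    // right boundary of the gradient region, in [0,1]

float4 Wipe(float2 uv : TEXCOORD0) : COLOR
{
    float4 a = tex2D(FrameA, uv);
    float4 b = tex2D(FrameB, uv);
    // 0 left of the trailing edge, 1 right of the leading edge, ramp between
    float t = smoothstep(trailingEdge, leadingEdge, uv.x);
    return lerp(b, a, t);   // frame B on the left sweeps over frame A
}

Sweeping both edges from left to right over time animates the wipe.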

Finally, the value of alpha can be a function of time. For example, a sinusoidal alpha value based on sin(time) displays the two videos alternately and smoothly, fading from one video to the other and back. The HLSL sample code is:

alpha = (sin(fTime) + 1.0)/2.0;
result = alpha*color0 + (1.0-alpha)*color1;
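
Assembled as a complete pixel shader, the time-driven crossfade might look like the sketch below, where the application updates fTime each frame. The sampler and constant names are assumptions, not Ladybug Mixer's actual interface.

sampler Video0 : register(s0);
sampler Video1 : register(s1);
float fTime;   // elapsed time in seconds, set by the application each frame

float4 CrossFade(float2 uv : TEXCOORD0) : COLOR
{
    float alpha = (sin(fTime) + 1.0)/2.0;   // oscillates smoothly in [0,1]
    float4 c0 = tex2D(Video0, uv);
    float4 c1 = tex2D(Video1, uv);
    return alpha*c0 + (1.0 - alpha)*c1;     // fade between the two videos
}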

Multiple Channel

A well-known example with three input streams is the mask transform. A mask is a black-and-white image. Two video streams and a mask picture are the inputs; the effect is the first video multiplied by the mask plus the second video multiplied by the negative of the mask. The two videos are therefore displayed in areas distinguished by the shape of the mask. The HLSL sample code is simple.

result = mask*color0 + (1.0-mask)*color1;
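
A three-input sketch of the mask transform, sampling two video channels and the mask texture (the sampler names are illustrative):

sampler Video0 : register(s0);
sampler Video1 : register(s1);
sampler Mask   : register(s2);   // black-and-white mask image

float4 MaskMix(float2 uv : TEXCOORD0) : COLOR
{
    float4 c0 = tex2D(Video0, uv);
    float4 c1 = tex2D(Video1, uv);
    float  m  = tex2D(Mask, uv).r;   // white shows video 0, black shows video 1
    return m*c0 + (1.0 - m)*c1;
}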

More elaborate video mixing can be treated as multi-layer composition. Lighting and motion effects can make a common video uncommon. For example, a video showing a light source moving from left to right and another showing a light source moving from right to left can be composited with a third video to generate dynamic lighting effects. This technique is widely used in advertising and movie trailers.

This picture shows duo-channel playback of DVD movies in Ladybug Mixer: one DVD/VOB movie was stored on the hard disk, the other was in a DVD drive.