CRT emulation for pixel art
Written by Stuart   
Sunday, 17 July 2011 19:41

[This article is cross-posted on #altdevblogaday].

A good way to get an authentic look for retro-pixel art is to simulate the distortion caused by encoding the image into an NTSC signal, decoding it again (as a TV would), and projecting it onto a virtual CRT. This gives you natural-looking artifacts, like fringing and color bleeding.

It also makes copyrighted hedgehogs look even more dashing.

Tube simulator example

Console emulators do this sometimes, and if you're old enough to have actually played games on a CRT TV, it really helps with the sense of immersion. This post gives a quick overview of the process, in case you'd like to try it for yourself. All of these steps are texture operations performed by pixel shaders.

  1. We start by encoding the low resolution input image as an NTSC signal. Each input line is converted into voltage over time, in the same format an NTSC signal would be sent across a wire (except for the sync and color pulse stuff).
  2. A "cable reflection" shader smears the signal out a little to the right. I'm not sure how much it looks like cable reflection, but it does kind of evoke the streaking artifacts you see on some old TVs.
  3. The luma is split out of the signal, and then used in the NTSC decoding process. This is also where the standard OSD parameters (brightness, contrast, sharpness, etc) are applied. Now our image is RGB again.
  4. The image is projected onto a curved tube. This step also takes care of tracing the scan lines and applying the phosphor pattern.
  5. The phosphors from the previous frame are decayed, and the new values are accumulated. This allows for ghosting of moving images.
  6. A standard post-processing stack is applied (bloom, glare, and tone mapping). This gives users a taste of the eye-burning glow produced by a real CRT. (Do you remember when staying up late to play games caused physical pain? Kids these days are soft.)
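
To make step 1 a bit more concrete, here is a rough CPU sketch in Python (not the actual shader). It converts each pixel to YIQ and modulates the two chroma components onto a subcarrier. The names `encode_scanline`, `SAMPLES_PER_PIXEL`, and `SAMPLES_PER_CYCLE` are made up for illustration, the sample rates are assumptions, and the sketch ignores sync, color burst, the standard phase offset, and band-limiting.

```python
import colorsys
import math

SAMPLES_PER_PIXEL = 4   # assumption: signal samples generated per input pixel
SAMPLES_PER_CYCLE = 4   # assumption: samples per color subcarrier cycle


def encode_scanline(pixels, phase=0.0):
    """Encode one row of (r, g, b) pixels (each 0..1) as composite samples."""
    signal = []
    for x, (r, g, b) in enumerate(pixels):
        # Luma (y) plus two chroma components (i, q).
        y, i, q = colorsys.rgb_to_yiq(r, g, b)
        for s in range(SAMPLES_PER_PIXEL):
            t = x * SAMPLES_PER_PIXEL + s
            angle = phase + 2.0 * math.pi * t / SAMPLES_PER_CYCLE
            # Chroma rides on the subcarrier, on top of the luma level.
            signal.append(y + i * math.cos(angle) + q * math.sin(angle))
    return signal
```

A gray pixel has no chroma, so it encodes to a flat voltage, while a saturated color produces a strong ripple at the subcarrier frequency. That ripple is what later leaks into the luma as fringing.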

Naturally, there are a few problems.



Moire artifacts

The combination of the scanlines and the phosphor texture makes a little bit of moire pretty much unavoidable. Tuning the brightness, contrast, bloom, scaling, or NTSC parameters can produce moire, or move it from one place to another.

The good news is that it looks far worse in screenshots than it does in a running game. (I cranked it up for the screenshot above... it's not that ghastly under normal circumstances.)


Crosstalk between chroma and luma is a key part of the effect, so it's a feature, not a bug. The problem is getting it to look bad in a "good" way. A band-stop FIR filter can be used to chop out the chroma signal, but it's tough to find the right balance between soft (filter too wide) and stripy (filter too narrow).

Tuning crosstalk
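
For reference, the simplest band-stop FIR I can think of is a two-tap notch: average each sample with the one half a subcarrier cycle earlier, so the subcarrier cancels against itself. This is a Python sketch, not the filter used in the game; `notch_luma` and `SAMPLES_PER_CYCLE` are hypothetical names, and the sample rate is an assumption.

```python
import math

SAMPLES_PER_CYCLE = 4   # assumption: samples per color subcarrier cycle


def notch_luma(signal):
    """Two-tap notch: average each sample with the one half a cycle back."""
    half = SAMPLES_PER_CYCLE // 2
    out = []
    for n, value in enumerate(signal):
        # At the left edge there is no earlier sample, so reuse the current one.
        delayed = signal[n - half] if n >= half else value
        out.append(0.5 * (value + delayed))
    return out


# A flat 0.5 luma level carrying a pure subcarrier ripple.
carrier = [0.5 + 0.3 * math.sin(2.0 * math.pi * n / SAMPLES_PER_CYCLE)
           for n in range(16)]
luma = notch_luma(carrier)
```

After the first half cycle, `luma` settles to the flat 0.5 level. A filter this short has a very wide stop band, so it sits at the "soft" end of the trade-off; a longer kernel can narrow the stop band and move toward "stripy".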

Phosphor resolution

My original goal was to have RGB phosphors visible in the image when you examined it up close. That's really hard, because if you make the phosphors small enough to look realistic, the RGB pattern blurs out and all you can see is a sort of vertical striping. If you don't mipmap or supersample the phosphors, you again get more moire.

Phosphor patterns at different scales

These examples have the phosphor texture enlarged and strengthened to exaggerate the problem. I wasn't able to get individual phosphors to look good at 720p, and they are only barely tolerable at 1080p.


Cranking all these techniques to the max gives you a delightfully bad video signal, which is also no fun to look at for more than 10 seconds. There is also a fair bit of interplay between the parameters: adjusting the scanline gap changes the brightness, and so on. Tuning all this can be a touchy process. Below are the parameters that control the effect. As you can see, there are a lot of ways I can screw things up.

Effect tuning parameters

Up next

In future posts I'll get into the technical details and show you what the shaders look like. If you're interested, subscribe to the RSS feed!

Game Concept: Amazing Furious Sky
Written by Stuart   
Wednesday, 02 June 2010 05:53

The first game is called Amazing Furious Sky. It's a retro arcade game, presented as a light-hearted port of a Japanese game from the 16-bit era, with subtitles and loosely translated text.

There are two target audiences: older gamers (30+) who have warm memories of the games they grew up with, and arcade fans of all ages, who will enjoy the fast, simple gameplay.

When people today go back and play actual vintage games, either by dusting off a console or running an emulator, they normally find them much more primitive and frustrating than they remember! With this project, I want to provide the highlights of a nostalgic experience, without the rough edges that detract from player enjoyment.
So the general plan is this: focal elements are authentically retro, but integrated into a clean, modern presentation, as described below. The game pacing is optimized for short, casual play sessions, unlike actual vintage titles. It's possible for a player to "do a level" and make tangible progress in five minutes or less.


The player, enemies, bosses, weapons, and some of the "foreground" environment are rendered using a big-pixel art style with a limited palette. However, they are allowed to rotate arbitrarily in a higher resolution frame buffer.

The game is set in space, around a series of stars. The background is never black; the sky is coloured and luminous, with a generally warm palette. Clouds of starlight are animated using a fluid simulation, and interact with the player ship as described in the gameplay section. As the player advances in a stage, and gets closer to the star, the starlight becomes thicker and moves more quickly. The environment animation is done at higher resolution than the focal elements.

The entire game experience, starting with the splash screen, is presented as if viewed through an old CRT television. This involves:

  • Spherical distortion to give the look of a curved picture tube
  • Authentic shadow mask pattern
  • HDR phosphor glow, gamma response, and persistence
  • 3-phase dot crawl (like the original NES)
  • NTSC fringing and bleeding artifacts
  • Local geometric distortion in bright areas of the screen
  • Global blooming to simulate poor power regulation
  • DC offset drift (causing subtle horizontal streaking)
  • Cheap-cable simulation (smearing, reflections)
  • Slight interlace flicker
  • Ground-loop distortion
  • Differential gain/phase errors in the video signal
  • TV OSD controls (brightness, contrast, color, tint, sharpness) exposed to the user

All of these effects, individually, are rather subtle, but combine to give an authentic impression of looking into an old glass television. 1080p native output will give the best effect, as individual phosphors are visible, but 720p will be adequate. Users running in standard-def will get the NTSC artifacts, but not the phosphor patterns, interlace simulation, or similar effects.

Keeping with the theme, menus are minimal and terse, and rendered with an old-style pixel font. Menu transitions and animations, however, are smoother and modern.


The feel of the game music is inspired by the FM-synth sounds and limited voicings of old games, and melodies are played with a heavy synth feel. The rest of the mix uses modern instruments and a fatter sound. The general musical theme is triumphant, frantic, and slightly over-dramatic. Each stage in the game has its own musical theme, which is a loop somewhere between 90 and 120 seconds long.

This is an audio test I did for a "boss battle" sequence. The first part is pure chip music, and the second part is the same tune rendered with a nicer backing.

Gameplay sound effects are extremely retro, more in the style of Atari-era bleeps and bloops, but rendered with a bit of reverb/etc to help them fit into the soundscape.

There is also some ambient sound related to the CRT emulation, which includes:

  • 60 Hz mains hum, tied to the brightness of the on-screen image
  • Horizontal retrace whine (very, very subtle)
  • Screen static crackle during the splash sequence, when the CRT is "turned on", and also (to a much lesser degree) later when the screen gets darker after long periods of brightness

Gameplay mechanics

Space is filled with swirling, fluid starlight. The player can fly through it, but starlight is slightly viscous, and slows the ship down. Navigating the starlight is a strategic element of the game.

Some weapons use starlight as fuel, however, and need to be charged by ploughing through the fluid. That's a tradeoff the player needs to manage.

The basic player ship has a couple of mount points, and picking up a weapon will attach it to one of those points. However, weapons and other upgrades themselves have nested mount points, which can host further items. The player can therefore stack up multiple items, and must choose his upgrades carefully for the best effect.

In retro fashion, the player has a point counter which is constantly increasing as enemies are destroyed and the player makes progress. The player is rewarded with extra lives and power-up drops.

Each stage of the game is a "star", which is guarded by a boss. Each star has a theme (for example, say, robotic cats), which is embodied by the boss, the enemies, and the level layout. There are five stars.

Each star is presented as a series of about a dozen short levels, each of which can be completed in 3 minutes or less. They increase in difficulty, and culminate with a boss battle for control of the star. The difficulty then drops off a bit to begin the next stage, giving the player a few minutes of breathing room while the action builds up again.

This is intended to engage the casual player, who may only want to spend a few minutes to clear a level or two, but still provide constant positive reinforcement for players in longer sessions.

Progress is saved automatically, and the user can continue his game with one button press from the main menu.


Various parts of this plan are in different stages of prototype. In the next few days, I'll be sharing what I'm doing for the CRT simulation. In the meantime, please let me know what you think!

Unoptimized GPU FFT
Written by Stuart   
Friday, 30 April 2010 14:28

This is yet another FFT implementation for GPU. It's a basic building block for procedural textures and special effects.

Right now it is horribly unoptimized. I'm forcing myself to leave it alone until I get GPU profiles on the consoles, because I've got too much other work to do, and it's useful for visualization in its current state. It's hard though, because there are a lot of fun improvements to make:

  • The bit reversal can be folded into the body shader
  • The transpose passes can be removed
  • The butterfly passes can be doubled up, so that each shader call does two passes
  • The scalar real and imaginary textures can be swizzled into RGBA, so that each pass can work on 4 independent transforms
  • The first two butterfly passes are trivial and can be moved into a simpler shader
  • Some ALU improvements are certainly possible

But for now, it's good enough. So working backwards, the implementation looks like this:

    --- Perform a 2D FFT on a texture.
    --
    --  UNOPTIMIZED! Many of these shader passes can be combined, and there's
    --  no need to transpose everything just to do the vertical pass.
    --
    --  @param dest     Output image, complex values will be placed in RG
    --  @param source   Input image, complex values will be taken from RG
    --  @param dir      Direction, either 1 for FFT or -1 for inverse FFT
    --
    function DoFourierTransform2D( dest, source, dir )

        local width  = source.Width
        local height = source.Height
        local prec   = source.Precision

        if prec < 16 then
            prec = 16
        end

        local real = TextureCache:Alloc( height, width, 1, 1, prec )
        local imag = TextureCache:Alloc( height, width, 1, 1, prec )

        DoSplit2( real, imag, source )

        DoFourierBitReverse( real, imag, real, imag )
        DoFourierTransformHoriz( real, imag, real, imag, dir )

        DoTranspose( real, real )
        DoTranspose( imag, imag )

        DoFourierBitReverse( real, imag, real, imag )
        DoFourierTransformHoriz( real, imag, real, imag, dir )

        DoTranspose( real, real )
        DoTranspose( imag, imag )

        DoCombine2( dest, real, imag )

        TextureCache:Free( imag )
        TextureCache:Free( real )

    end

That routine splits a complex image into real/imaginary layers, does an FFT along the rows, does it again down the columns, and then recombines them. The butterfly passes are done here:

    --- Perform a 1D FFT on each row of a texture.
    --
    --  The real and imaginary values are held in separate textures. There is
    --  currently no benefit to that, but it will make optimization easier later.
    --
    --  @param destReal     Output real values
    --  @param destImag     Output imaginary values
    --  @param sourceReal   Input real values
    --  @param sourceImag   Input imaginary values
    --  @param dir          Direction, either 1 for FFT or -1 for inverse FFT
    --
    function DoFourierTransformHoriz( destReal, destImag, sourceReal, sourceImag, dir )

        local width     = sourceReal.Width
        local height    = sourceReal.Height
        local prec      = sourceReal.Precision
        local tempReal  = TextureCache:Alloc( width, height, 1, 1, prec )
        local tempImag  = TextureCache:Alloc( width, height, 1, 1, prec )
        local inReal    = sourceReal
        local inImag    = sourceImag
        local outReal   = tempReal
        local outImag   = tempImag
        local size      = 2
        local scale     = 1.0

        -- One butterfly pass per power of two, up to the full row width.
        while size <= width do

            -- The last pass writes to the destination and applies the scale.
            if size == width then

                outReal = destReal
                outImag = destImag

                if dir == 1 then
                    scale = 1.0 / width
                end

            end

            DoFourierBody( outReal, outImag, inReal, inImag, dir, size, scale )

            inReal = tempReal
            inImag = tempImag

            size = size * 2

        end

        TextureCache:Free( tempImag )
        TextureCache:Free( tempReal )

    end

Permuting into bit-reversed order is done using a lookup texture. It turns out that one texture can be used for any size FFT, as long as you're careful about exactly where you sample. So this shader is hardcoded to use a 1024-pixel wide lookup texture.

    SAMPLER_DECLARE( Real ) {};
    SAMPLER_DECLARE( Imag ) {};
    SAMPLER_DECLARE( BitRev ) {};

    extern float Width;
    extern float InvWidth;
    extern float Height;
    extern float InvHeight;

    float BitReverse( float x, float size )
    {
        float   scale   = exp2( 10 - log2( size ) );
        float   redir   = TEX2D( BitRev, float2( (x  * scale + 0.5) / 1024.0, 0 ) ).x;

        return( redir );
    }

    {
        float   x       = floor( input.mUV.x * Width );
        float   y       = floor( input.mUV.y * Height );
        float   u       = BitReverse( x, Width );
        float   v       = BitReverse( y, Height );
        float2  sampPos = float2( (u + 0.5) * InvWidth, input.mUV.y );
        float4  real    = TEX2D( Real, sampPos );
        float4  imag    = TEX2D( Imag, sampPos );

        RETURN_PS_BLIT_OUTPUT_2( real, imag );
    }

The texture itself is generated by script. It's not pretty, but it only has to run once.

    function ReverseBits( n, bitCount )

        local result    = 0
        local bitToSet  = 1
        local bitToTest = 2 ^ (bitCount - 1)

        -- Walk from the top bit of n down, setting bits of result from the bottom up.
        for i = 1, bitCount do

            if n >= bitToTest then
                result = result + bitToSet
                n = n - bitToTest
            end

            bitToSet  = bitToSet * 2
            bitToTest = bitToTest / 2

        end

        return result

    end

    function CreateReverseBitsTexture( bitCount )

        local width     = 2 ^ bitCount
        local height    = 1
        local channels  = 1
        local t         = {}

        for y = 1, height do
            for x = 1, width do
                t[#t + 1] = ReverseBits( x - 1, bitCount )
            end
        end

        local tex = CTextureValue:New()
        tex:InitImmediate( width, height, channels, t )

        return tex

    end
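
The reason one 1024-entry table works for smaller transforms can be checked on the CPU. This Python sketch (the names `reverse_bits`, `TABLE`, and `lookup_reversed` are just for illustration) mirrors the shader's trick: shift the index up so its bits sit at the top of a 10-bit word, and the 10-bit reversal then lands them, reversed, at the bottom.

```python
TABLE_BITS = 10   # matches the 1024-pixel wide lookup texture


def reverse_bits(n, bit_count):
    """Reverse the lowest bit_count bits of n."""
    result = 0
    for _ in range(bit_count):
        result = (result << 1) | (n & 1)
        n >>= 1
    return result


# The single shared table, built once for the largest supported size.
TABLE = [reverse_bits(x, TABLE_BITS) for x in range(1 << TABLE_BITS)]


def lookup_reversed(x, size):
    """Bit-reverse x for a size-point FFT (size is a power of two <= 1024)."""
    shift = TABLE_BITS - (size.bit_length() - 1)
    # Same as the shader's x * exp2(10 - log2(size)).
    return TABLE[x << shift]
```

For example, with `size = 8` the index 1 (binary 001) is looked up at position 128 and comes back as 4 (binary 100).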

The meat of the work is done below in the body shader.

    SAMPLER_DECLARE( Real ) {};
    SAMPLER_DECLARE( Imag ) {};

    extern float Dir;
    extern float PassSize;
    extern float Scale;
    extern float Width;
    extern float InvWidth;

    {
        int  x        = floor( input.mUV.xxxx * Width );
        int  size     = PassSize;
        int  half     = size / 2;
        int  size_ofs = x % size;
        int  half_ofs = x % half;
        bool is_even  = (half_ofs == size_ofs);

        float2 even_uv;
        float2 odd_uv;

        if( is_even )
        {
            even_uv = input.mUV.xy;
            odd_uv  = float2( input.mUV.x + (half * InvWidth), input.mUV.y );
        }
        else
        {
            even_uv = float2( input.mUV.x - (half * InvWidth), input.mUV.y );
            odd_uv  = input.mUV.xy;
        }

        float4 even_r = TEX2D( Real, even_uv );
        float4 odd_r  = TEX2D( Real, odd_uv );
        float4 even_i = TEX2D( Imag, even_uv );
        float4 odd_i  = TEX2D( Imag, odd_uv );

        float4 angle     = 2 * M_PI * Dir * half_ofs / size;
        float4 sin_angle = sin( angle );
        float4 cos_angle = cos( angle );

        float4 delta_r = (odd_r * cos_angle) - (odd_i * sin_angle);
        float4 delta_i = (odd_r * sin_angle) + (odd_i * cos_angle);

        odd_r = even_r - delta_r;
        odd_i = even_i - delta_i;

        even_r = even_r + delta_r;
        even_i = even_i + delta_i;

        float4 result_r = is_even ? even_r : odd_r;
        float4 result_i = is_even ? even_i : odd_i;

        RETURN_PS_BLIT_OUTPUT_2( result_r * Scale, result_i * Scale );
    }

And that's it. It's not efficient yet, but at least it's not much code. And there are a lot of cool things you can do in the frequency domain.
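
If you want something to check GPU output against, the same pass structure can be written on the CPU. This is a Python sketch of one row, assuming my reading of the shaders above is right: bit-reverse first, then one butterfly pass per power of two, with the 1/width scale folded into the last pass when the direction is 1. The function names are made up for the example.

```python
import cmath


def reverse_bits(n, bit_count):
    """Reverse the lowest bit_count bits of n."""
    result = 0
    for _ in range(bit_count):
        result = (result << 1) | (n & 1)
        n >>= 1
    return result


def fft_row(row, direction):
    """Transform one row; len(row) must be a power of two."""
    width = len(row)
    bit_count = width.bit_length() - 1
    assert 1 << bit_count == width, "width must be a power of two"
    # Bit-reversal pass.
    data = [row[reverse_bits(x, bit_count)] for x in range(width)]
    size = 2
    while size <= width:
        half = size // 2
        last = (size == width)
        scale = 1.0 / width if (last and direction == 1) else 1.0
        out = [0j] * width
        for x in range(width):   # one "pixel" per output element
            half_ofs = x % half
            is_even = (x % size) == half_ofs
            even = data[x] if is_even else data[x - half]
            odd = data[x + half] if is_even else data[x]
            angle = 2.0 * cmath.pi * direction * half_ofs / size
            delta = odd * cmath.exp(1j * angle)
            out[x] = (even + delta if is_even else even - delta) * scale
        data = out
        size *= 2
    return data
```

Running `fft_row(fft_row(row, 1), -1)` should give back the original row to within rounding error, which makes a handy sanity test before comparing against the textures.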

FFT example

I'll be posting some design information for the first game within the next week or so, and you can see where all this stuff is going...

Random numbers
Written by Stuart   
Sunday, 25 April 2010 07:08

I'm not using pre-generated noise textures or volumes, because there are a lot of ways to tune noise and that's something else I'd rather keep live. But to make noise you need random numbers, which is what this post is about.

I wish there were a (portable) way to do this directly in a shader (under shader model 3), but as far as I know, there isn't. So I've gone the standard route and initialized a texture with random numbers, but instead of using a PRNG to get the values (which would make gaps in the distribution), I generate a smooth gradient between 0 and 1, then randomly permute it:

    function CreateRandomTexture( width, height )

        local channels  = 1
        local count     = width * height
        local t         = {}

        -- Smooth gradient from 0 to 1, one value per texel.
        for i = 1, count do
            t[#t + 1] = (i - 1) / (count - 1)
        end

        -- Fisher-Yates shuffle: swapping slot i with a random slot in 1..i
        -- makes every permutation equally likely.
        for i = count, 2, -1 do
            local swap = math.random( i )
            t[i], t[swap] = t[swap], t[i]
        end

        local tex = CTextureValue:New()
        tex:InitImmediate( width, height, channels, t )

        return tex

    end

(Appending to a Lua table using the t[#t + 1] idiom is the fastest way to populate it, because it keeps the array values in contiguous memory. If you poke values in a different order, say by initializing the first column of an image, they end up in the hash table. I learned that the hard way.)
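
To show what "gaps in the distribution" means, here is a small Python comparison (a sketch, with hypothetical helper names). It counts how many of `bins` equal-width buckets receive no value at all, first for a permuted gradient and then for the same number of independent PRNG draws.

```python
import random


def permuted_gradient(count, rng):
    """Evenly spaced values in 0..1, in random order."""
    values = [i / (count - 1) for i in range(count)]
    rng.shuffle(values)
    return values


def empty_bins(values, bins):
    """Number of equal-width bins over 0..1 that contain no value."""
    hit = [False] * bins
    for v in values:
        hit[min(int(v * bins), bins - 1)] = True
    return hit.count(False)


rng = random.Random(1)
gradient = permuted_gradient(256, rng)
independent = [rng.random() for _ in range(256)]
```

`empty_bins(gradient, 256)` is zero because every level appears exactly once, while the independent draws leave a large share of the bins empty and double up in others.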

Random pixels

This is fine so far, but I'll need a constant stream of random numbers, and generating them every frame on the CPU is too expensive. So I've also got a second texture to do a lossless shuffle of the random numbers:

    function CreateRandomShuffleTexture( width, height )

        local channels  = 2
        local count     = width * height
        local t         = {}

        -- Each texel starts out holding its own (u, v) center coordinate.
        for y = 1, height do
            local v = (y - 0.5) / height
            for x = 1, width do
                local u = (x - 0.5) / width
                t[#t + 1] = u
                t[#t + 1] = v
            end
        end

        -- Fisher-Yates shuffle of the (u, v) pairs, so every texel is
        -- referenced exactly once.
        for i = count, 2, -1 do
            local swap = math.random( i )
            t[i*2],   t[swap*2]   = t[swap*2],   t[i*2]
            t[i*2-1], t[swap*2-1] = t[swap*2-1], t[i*2-1]
        end

        local tex = CTextureValue:New()
        tex:InitImmediate( width, height, channels, t )

        return tex

    end

This can be used once per frame (or more) to mix things up.
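
The shuffle is "lossless" because the map is a permutation: every source texel is read exactly once, so no value is duplicated or dropped. A Python sketch of the idea, using flat indices instead of UV pairs (the names are made up for the example):

```python
import random


def make_shuffle_map(count, rng):
    """For each destination index, the source index to read from."""
    source_index = list(range(count))
    rng.shuffle(source_index)
    return source_index


def apply_shuffle(values, source_index):
    """One shuffle pass: each output texel fetches from its mapped source."""
    return [values[src] for src in source_index]


rng = random.Random(7)
values = [i / 15 for i in range(16)]
shuffle_map = make_shuffle_map(16, rng)
frame1 = apply_shuffle(values, shuffle_map)
frame2 = apply_shuffle(frame1, shuffle_map)
```

Both frames contain exactly the original set of values in a new order. One thing to keep in mind is that a fixed permutation eventually cycles back to the starting arrangement if it is applied enough times.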

Noise is much more interesting. I'll share what I'm doing for that soon.

Starting from scratch
Written by Stuart   
Thursday, 22 April 2010 13:18

I've got a pretty tight schedule, and I want to spend most of my time tuning. So the general plan is to keep everything "live", and editable while the game is running (shaders, scripts, textures, models, music, etc). All the "hard" work will be done in shaders, including as much of the simulation as possible. All the game code, and most of the engine code, will be written in script (Lua). I'm going to be GPU-bound, so the script overhead shouldn't be a problem.

So on the C++ side I've got a generic object/factory/lifetime system, and a reflection interface (hacked together with embarrassing macros) for all the system objects. There are only about a dozen of them though, and I've gotten to the point where I don't have to rebuild the code very often because all my changes are in scripts or shaders. The reflection interface makes it easy to interact with script, though it's slower than code-gen.

So here's an example of creating a mesh from the script side. This is the triangle used to do blits between textures (i.e. "full screen passes", even though they're all offscreen). This poor triangle works very hard.

    function CreateBlitMesh()

        local vb =
        {
        --  POS            NORM          UV
            -1,  3, 0,    0, 0, -1,    0, -1,
             3, -1, 0,    0, 0, -1,    2,  1,
            -1, -1, 0,    0, 0, -1,    0,  1,
        }

        local ib = { 0, 1, 2 }

        local vbobj = Factory:Create( "VertexBuffer" )
        vbobj:SetData( vb, 8 )

        local ibobj = Factory:Create( "IndexBuffer" )
        ibobj:SetData( ib )

        local mesh = Factory:Create( "Mesh" )
        mesh:BindVertexStream( vbobj, "POSITION",  0, 3 )
        mesh:BindVertexStream( vbobj, "NORMAL",    3, 3 )
        mesh:BindVertexStream( vbobj, "TEXCOORD0", 6, 2 )
        mesh:BindIndexBuffer(  ibobj )

        return mesh

    end

Textures are passed around as Lua tables, and allocated through a simple cache (which is also implemented in Lua).

So this is the fun part, because I've never been able to do this before. While the game is running, you can create a new shader, like this one to run a Sobel filter:



    extern float    InvWidth;
    extern float    InvHeight;
    extern float    Gain;
    extern float    Power;
    extern float4   Coeff;

    {
        float2 center = input.mUV;

        float4 s1 = TEX2D( Source, center + float2( -InvWidth, -InvHeight ) );
        float4 s2 = TEX2D( Source, center + float2(         0, -InvHeight ) );
        float4 s3 = TEX2D( Source, center + float2(  InvWidth, -InvHeight ) );
        float4 s4 = TEX2D( Source, center + float2( -InvWidth,          0 ) );
        float4 s6 = TEX2D( Source, center + float2(  InvWidth,          0 ) );
        float4 s7 = TEX2D( Source, center + float2( -InvWidth,  InvHeight ) );
        float4 s8 = TEX2D( Source, center + float2(         0,  InvHeight ) );
        float4 s9 = TEX2D( Source, center + float2(  InvWidth,  InvHeight ) );

        float4 gradVert = (s1 + s2 + s2 + s3 - s7 - s8 - s8 - s9);
        float4 gradHorz = (s1 + s4 + s4 + s7 - s3 - s6 - s6 - s9);
        float4 mag      = sqrt( gradVert * gradVert + gradHorz * gradHorz );
        float  val      = pow( dot( mag, Coeff ) * Gain, Power );
        float4 result   = float4( val, val, val, 0 );

        RETURN_PS_BLIT_OUTPUT_1( result );
    }
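
For comparison, here is the same filter as a plain Python sketch on a single-channel image (a list of rows). It is only meant for checking values by hand: the shader works on four channels and weights them with Coeff, and the clamp-to-edge sampling here is my assumption, not something the shader guarantees.

```python
import math


def sobel(image, gain=1.0, power=1.0):
    """Sobel edge magnitude of a 2D list of floats."""
    height, width = len(image), len(image[0])

    def tap(x, y):
        # Assumption: clamp-to-edge addressing outside the image.
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            s1, s2, s3 = tap(x - 1, y - 1), tap(x, y - 1), tap(x + 1, y - 1)
            s4, s6 = tap(x - 1, y), tap(x + 1, y)
            s7, s8, s9 = tap(x - 1, y + 1), tap(x, y + 1), tap(x + 1, y + 1)
            grad_vert = s1 + 2 * s2 + s3 - s7 - 2 * s8 - s9
            grad_horz = s1 + 2 * s4 + s7 - s3 - 2 * s6 - s9
            mag = math.hypot(grad_vert, grad_horz)
            row.append((mag * gain) ** power)
        result.append(row)
    return result
```

A flat image gives zero everywhere, and a hard vertical step from 0 to 1 gives a magnitude of 4 on the texels next to the step (with gain and power both 1).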

To call that shader from the script looks something like this:

    function DoSobel( dest, source, coeff, gain, power )

        local shaderSobel = ShaderCache:Get( "Sobel" )

        shaderSobel.Val.InvWidth  = 1.0 / source.Width
        shaderSobel.Val.InvHeight = 1.0 / source.Height
        shaderSobel.Val.Gain      = gain
        shaderSobel.Val.Power     = power
        shaderSobel.Vec.Coeff     = coeff
        shaderSobel.Tex.SourceTex = source

        Blit( { dest }, shaderSobel )

    end

You fill out the shader parameters by populating a table, and then call Blit(). The first parameter to Blit() is a list of the target textures. Blit() detects if any source textures are also targets, and fixes it by allocating/shuffling buffers as needed.
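
The source-is-also-target problem is easy to see on the CPU. This Python sketch is not how Blit() is implemented; it just illustrates the idea with lists standing in for textures. The `blit` and `soften` functions here are hypothetical: if a target is also a source, the pass renders into a scratch buffer and copies the result back afterwards.

```python
def blit(targets, sources, kernel):
    """Run kernel(sources, i) for every texel i and store one value per target."""
    outputs = []
    for target in targets:
        aliased = any(target is source for source in sources)
        # Writing into a texture that is still being read would corrupt
        # later texels, so aliased targets get a scratch buffer.
        outputs.append([0.0] * len(target) if aliased else target)
    for i in range(len(targets[0])):
        for output, value in zip(outputs, kernel(sources, i)):
            output[i] = value
    for target, output in zip(targets, outputs):
        if output is not target:
            target[:] = output   # copy the scratch result back


def soften(sources, i):
    """1-2-1 blur of the first source, clamped at the edges."""
    src = sources[0]
    left = src[max(i - 1, 0)]
    right = src[min(i + 1, len(src) - 1)]
    return ((left + 2.0 * src[i] + right) / 4.0,)


reference = [0.0, 0.0, 8.0, 0.0, 0.0]
separate = [0.0] * 5
blit([separate], [reference], soften)
in_place = list(reference)
blit([in_place], [in_place], soften)
```

Both calls produce the same blurred row, so the caller never has to care whether `dest` and `source` are the same texture.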

Now you can start putting things together to build more interesting effects:

    function DoWatercolor( dest, source )

        local edgeTex = TextureCache:AllocLike( source )

        DoSoften( dest, source )
        DoSobel( edgeTex, dest, { 0.30, 0.59, 0.11, 0 }, 1, 3 )
        DoSharpen( edgeTex, edgeTex )
        DoApplyInverseMask( dest, dest, edgeTex )

        TextureCache:Free( edgeTex )

    end

Watercolor example

Ok, that might not be the greatest watercolor, but you can create the effect and tune it while the game is running, and see changes in realtime. I've found it to be a really fun way to work, because you can do looping and arbitrary scripted logic, instead of just changing shader code. And for creating procedural content, this is easier (for a programmer) than using a tool like Allegorithmic Substance, because I can type a line of script faster than I could connect all those little boxes together.

So, that's what I'm working on. I'll share more in the coming days, but it has taken me a surprisingly long time to write this entry, so I've got to figure out how to blog better!

Awkward first post
Written by Stuart   
Monday, 19 April 2010 05:10

Right, so in the spirit of transparency I'll be posting details here of the things I've been working on. I don't think there's any advantage to being secretive about ideas or techniques. And if I'm doing something wrong or you've got a better idea, let me know!

(Apologies in advance if the RSS feed is flaky; I'm still getting set up.)




Copyright © 2011 Pure Energy Games, Inc. All rights reserved.