Der Schmale – David Lenaerts’s blog

Flash Platform Experiments

September! Speaking at Adobe User Group BE meeting


The summer holidays aren’t even over yet and September already looks like it’s going to be a great month! Apart from the fact that I’m going to Flash on the Beach, there’s also something for us Belgians that’s a bit closer to home: the annual Adobe User Group meeting & barbecue on September 12th! And as it happens, I’ll be speaking there alongside much more impressive names such as Seb Lee-Delisle and Niqui Merret :)

What to expect? Here’s my session description:

“In his session, David will walk through some of the experiments and projects he has done over the past year while elaborating on the concepts behind them. Without getting too technical, he will explain his motives, his love affair with Pixel Bender and the big looming question: “Mmkay, but what’s the use?”. Using experimentation as the main thread, he will try to show how going off the beaten path can take you to sweet little places you didn’t expect, and finally, offer a sneak peek at some of the more useful projects he ended up doing as a result.”

In short, since it’s my first public session, I figured I would give a bird’s-eye view of my philosophy concerning after-work development, and show some cool things I have in the pipeline at the moment (so even if you follow this blog, there’ll be something new to look at ;) ).

Don’t forget to register here if you want to come (and gloat) – and if you do, I’ll see you there!


Some Flash Pixel Bender performance tips + benchmarks


Since I started playing around with Pixel Bender in Flash, I’ve been trying out different approaches here and there and learned a thing or two about performance optimizations (and quirks). As many people use PB specifically for its performance, and not much has been written on the subject, I thought I’d share my experiences and back them up with some benchmarks. Some of the things here are pretty obvious, yet others can be surprising and even frustrating.

Remember that this concerns Shaders in Flash Player, not Photoshop or After Effects, and that results could change in future versions. All benchmarks were performed on my crummy PC (AMD Athlon 64 X2 Dual, 2.21 GHz, 2 GB RAM, Win XP), using 500×500 data with 4 channels, each performing 10 consecutive kernel executions. The kernel itself is just a read, a multiplication, a division, and a sqrt. ShaderJobs are performed synchronously.

Let’s get the obvious out of the way first (I won’t go into common-sense optimizations too much).

Well, duh!

  • Use 4 channels only if necessary. No transparency? Ditch it.
  • Precalculate recurring constant calculations in Flash and pass them as parameters (such as width*height). Sure, it makes the “interface” of your Kernel potentially harder to read, but since Flash doesn’t support dependents (I hope it will some day), this should be a no-brainer if performance is really important.
  • If only a part of a BitmapData needs to be processed, isolate it into a new BitmapData using copyPixels. Even when using applyFilter, sourceRect is buggy.

Told you it was obvious :p Now, some better ones.

Use ShaderJob, not ApplyFilter

  • ShaderJob (on BitmapData) benchmark: 92-99ms
  • ApplyFilter benchmark: 104-109ms
  • ShaderJob ~ 10% faster

BitmapData is faster than ByteArray is faster than Vector.<Number> !

I’ve seen (and been guilty of) a lot of copying BitmapData into a Vector to harness “the power of Vector”. But look at this:

  • ShaderJob on BitmapData: 92-99ms
  • ShaderJob on ByteArray: 147-172ms
  • ShaderJob on Vector.<Number>: 167-192ms
  • BitmapData is ~40% faster than ByteArray
  • BitmapData is ~47% faster than Vector.<Number>!!

Use BitmapData unless you have no other choice, or if complete floating point precision is important.
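To see where the percentages above come from, here’s a small Python sketch that recomputes them from the measured ranges. It assumes each percentage compares the midpoints of the two ranges; the function names are made up for this illustration.

```python
def midpoint(lo_ms, hi_ms):
    """Midpoint of a measured range in milliseconds."""
    return (lo_ms + hi_ms) / 2.0


def percent_faster(fast_range, slow_range):
    """How much less time the fast variant takes, as a percentage of the slow one."""
    fast = midpoint(*fast_range)
    slow = midpoint(*slow_range)
    return (slow - fast) / slow * 100.0


# Ranges taken from the benchmark results above.
BITMAP_DATA = (92, 99)
BYTE_ARRAY = (147, 172)
VECTOR_NUMBER = (167, 192)

vs_byte_array = percent_faster(BITMAP_DATA, BYTE_ARRAY)
vs_vector = percent_faster(BITMAP_DATA, VECTOR_NUMBER)
print(round(vs_byte_array), round(vs_vector))
```

Under that midpoint assumption, the two values round to the ~40% and ~47% listed above.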

Conditionals are expensive!

This one annoys me quite a bit. Imagine you’re doing some calculations that you don’t need to do when alpha == 0 (which, as it happens, is usually the case). It can be a good idea to do them anyway in favour of dropping the alpha == 0 check. For the benchmark, I used values that had alpha set to 0 for about half of the data! Compare results to the previous benchmark.

  • BitmapData: 134-192ms – ~47% speed loss!!!
  • ByteArray: 147-172ms – ~22% speed loss
  • Vector: 192-213ms – ~27% speed loss

In practice, test one version with the conditional and one without. The results vary heavily depending on how often the calculations are skipped, and how many calculations are performed otherwise. Still, with half of the (admittedly trivial) calculations skipped in this case, it’s stupefying that execution time increases this much.
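As an illustration of “doing the calculations anyway”, here’s a Python model of a per-pixel kernel with and without the alpha check. This is not Pixel Bender code and says nothing about timings in Flash Player; the maths and function names are invented for the example. It only shows that when the result gets multiplied by alpha anyway, dropping the branch leaves the output unchanged.

```python
import math


def shade_with_branch(r, a):
    """Early-out for fully transparent pixels."""
    if a == 0.0:
        return (0.0, 0.0)
    value = math.sqrt(r * 0.5 / 2.0)
    return (value * a, a)


def shade_without_branch(r, a):
    """Same maths for every pixel; a zero alpha zeroes the result on its own."""
    value = math.sqrt(r * 0.5 / 2.0)
    return (value * a, a)


# (red, alpha) pairs, half of them fully transparent.
pixels = [(0.8, 1.0), (0.3, 0.0), (0.0, 0.5), (1.0, 0.0)]
with_branch = [shade_with_branch(r, a) for r, a in pixels]
without_branch = [shade_without_branch(r, a) for r, a in pixels]
print(with_branch == without_branch)
```

Whether the branchless version is actually faster for a given kernel still has to be measured, as described above.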

Do not use the input as the output

When using a ShaderJob or ApplyFilter, don’t use the same BitmapData/ByteArray/Vector instance as both the source and the target. If you need iteration, you’re better off swapping two buffers. Otherwise, Flash Player needs to make a temporary copy of the source, which slows things down.

Edit: The results here were compared to the normal ShaderJob test, while they’re using the alpha test. Percentages have been updated

  • BitmapData: 207-218ms – ~30% speed loss
  • ByteArray: 256-271ms – ~65% speed loss
  • Vector: 276-293ms – ~40% speed loss
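The buffer swapping mentioned above can be sketched like this in Python. The blur pass is just a stand-in for whatever the kernel does, and all names are hypothetical; the point is that each pass reads from one buffer and writes to the other, after which the two swap roles without any copying.

```python
def blur_pass(source, target):
    """Writes a 3-tap box blur of `source` into `target` (edges are clamped)."""
    last = len(source) - 1
    for i in range(len(source)):
        left = source[max(i - 1, 0)]
        right = source[min(i + 1, last)]
        target[i] = (left + source[i] + right) / 3.0


def iterate(data, passes):
    """Runs `passes` iterations, ping-ponging between two preallocated buffers."""
    front = list(data)          # read from here
    back = [0.0] * len(data)    # write to here
    for _ in range(passes):
        blur_pass(front, back)
        front, back = back, front  # swap roles, no copying
    return front


result = iterate([0.0, 0.0, 9.0, 0.0, 0.0], 2)
print(result)
```

In Flash, the two buffers would be two BitmapData (or ByteArray/Vector) instances that alternate as the shader’s input and the ShaderJob’s target.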

Update: Asynchronous ShaderJob

I just tested it, and the results indicate that asynchronous calls (waitForCompletion=false) are slower than synchronous calls. I suppose that’s mainly because of the event handling flow. Another thing I tested was to run 2 asynchronous calls with data of half the size, but it seems only 1 asynchronous ShaderJob can be started at the same time.

That’s it, see for yourself!

In closing, I’ll mention something I usually do that doesn’t seem to have any effect (it’s actually a habit from ActionScript). When reading from the same coordinate multiple times, I often store outCoord() in a variable and use that in the sample function. Well, I tested it, and it doesn’t have any impact at all :)

That’s it, at least for now, I hope it’s helpful! Check the benchmark and its source (the source is in fact pretty ugly, but does the trick). I’d be happy to know what kind of results other hardware yields.


Slice-based volume rendering using Pixel Bender


After a futile exploration of sparse voxel octree ray casting using Alchemy (which was fun but hopeless), I turned towards another technique for volume rendering, using view-aligned slices. The approach is not much different from the rendering of this older experiment, in which the slices were aligned to the object itself. Again, we’re using the same technique to create and read from the 3D texture (which is static in this case), i.e. a set of cross sections placed next to each other. CT scans are wonderful for this:

[Image: cross-section]

Note that the image above is just a crop; we need a lot more to make it look decent (I used 32 cross sections).

Rendering the slices

When using view-aligned slices, they typically won’t be aligned to the texture’s slices, as illustrated in the image to the right (yes, my graphic skills are EPIC!). The point p is any point on any view-aligned slice. We need to know where it is in the texture’s 3D space. This is simply a change of basis transformation, where both bases are defined to have the same origin. In our specific case, eye space is world space, so all we have to do is multiply p by the inverse of the object’s delta transformation matrix. Since the result will usually lie between 2 slices of the 3D texture (as in the illustration), we sample both texture slices with constant x and y coordinates and interpolate the colour values. This approach is not 100% correct, since the interpolation should also be aligned to the view. However, for this purpose, it’s a good trade-off for some extra performance.
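To make the two steps concrete, here’s a rough Python sketch (not the actual Pixel Bender kernel). It assumes the object’s transformation is a pure rotation, so the inverse is simply the transpose, and it ignores the translation and scaling. The volume is modelled as a list of 2D slices, and all names are made up for this example.

```python
import math


def transpose(m):
    """Transpose of a 3x3 matrix; equals the inverse for a pure rotation."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def transform(m, p):
    """Multiplies the 3x3 matrix `m` by the column vector `p`."""
    return [sum(m[r][c] * p[c] for c in range(3)) for r in range(3)]


def sample_volume(slices, x, y, z):
    """Samples a stack of 2D slices at pixel (x, y) and fractional depth z in [0, 1]."""
    depth = min(max(z, 0.0), 1.0) * (len(slices) - 1)
    near = int(math.floor(depth))
    far = min(near + 1, len(slices) - 1)
    t = depth - near
    # Same x and y in both slices, linear blend along the depth axis.
    return slices[near][y][x] * (1.0 - t) + slices[far][y][x] * t


# Object rotated 90 degrees around the y axis.
rotation = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]
p_view = [0.75, 0.0, 0.0]                         # point on a view-aligned slice
p_local = transform(transpose(rotation), p_view)  # same point in texture space

slices = [[[0.0]], [[10.0]], [[20.0]]]            # three 1x1 slices
value = sample_volume(slices, 0, 0, p_local[2])
print(p_local, value)
```

Here the point ends up halfway between the second and third slice, so the sampled value is the average of the two.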

As this needs to be done for every pixel on every slice, we’re doing these calculations through Pixel Bender. And that’s how it works in broad strokes. There are some translations and scaling going on as well to ensure a uniform and properly centered transformation. If you’re still interested, you can check the source for that. Important to note is that the back half of the slices is actually culled for a worthwhile performance boost. They don’t really contribute all that much to the final image after all.

Demos

Click and drag to rotate the pitbull skull in all demos:


Render Bender v0.1: optimizing my Flash+Pixel Bender workflow


Often, when I’m writing Pixel Bender content targeted for Flash Player, the original toolkit leaves something to be desired. Luckily, Joa Ebert created his PBDT plugin for Eclipse, which was already a great step towards streamlining my workflow. Still, the lack of previewing would require switching back to the PB toolkit. And more often than not, that preview didn’t tell me much, as many of my PB projects consist of multiple filters that are dependent on each other’s output.

Render Bender

So yesterday, I created Render Bender v0.1 (yes, that is some juvenile wordplay) strictly to suit my own needs, but it might just be useful for people out there with the same problem. It is designed to tackle the following issues:

  • Eliminate the need to write a lot of AS3 code for PB code that might not even work, and instead provide an easy “proof of concept” method.
  • Real Flash-based rendering (instead of the simulated rendering of the toolkit, which does not expose runtime bugs)
  • Running a sequence of consecutive Shaders, which may be dependent on a previous Shader or a previous run of an iterative sequence
  • Easy integration with Eclipse and PBDT, without the need of switching applications.
  • Have an easy way of demoing pixel bender effects in development

Essentially, it’s just a Flex App/Component that you need to add to a project, which contains nothing else except for perhaps some interactivity handling necessary for your effect. This way, you can then add your kernels to this project and have them compiled with PBDT. All you need to do is to set up an xml file (once) defining the assets and the kernel sequence. After that, it’s simply a matter of running the project after editing a kernel.

The documentation is a bit sparse at this point, but along with the Demo.mxml source file, it should be pretty clear.

Get it!

In closing…

It’s in early stages, but it works for me :)  I’ll be glad to hear questions, suggestions or bugs. I’m not sure how much time I can invest at this point, but I do have some ideas for future development. At this point, it uses the default Flex skin (3 or 4, depends on your sdk) to minimize compile time. If someone wants to create a minimal fast skin, feel free! :)


More Stok3d: Parallax mapping & Water shading


Today’s update on Stok3d is perhaps not as useful as the previous post, but I certainly had fun working on it. Or as we say in Dutch with a word blatantly stolen from German: it’s “spielerei”.

Demos:

I’m going to post the demos first this time. Saves you some scrolling effort, because the explanations below are rather boring ;)

  • Parallax Mapping : Move and turn towards the edges of the screen to see the extrusion of the texture best. A PhongFilter is added as a second filter, making it slower but the effect becomes more obvious.
  • Water Shading 1 – Ocean : Reflects or refracts light depending on the view angle and the surface relief. It animates a perlin noise filter to generate a water heightmap.
  • Water Shading 2 – Ripples : Same thing, but with a simple drawn ripple effect. The difference between refracted and reflected light is more obvious here.

Edit: Even if you have Flash Player 10, you still might get an update request. That’s because these demos require version 10.0.22 due to recent Pixel Bender bugfixes.

The source code for these demos can also be found on Google Code.

Useful updates

This round of updates includes some bugfixes and performance improvements. I also added a NormalMapUtil class, which provides a basic API to generate and manipulate normal maps. The main features are the generateFromHeightmap and drawFromHeightmap methods. Since height maps (or bump maps) are generally easier to come by (and to make), these methods generate a normal map for you. generateFromHeightMap creates a new BitmapData, while drawFromHeightMap uses an existing BitmapData that you provide (useful if you need to generate one on every frame). The NormalMapUtil class furthermore allows you to invert the components of the normals, in case a normal map reflects the light in the wrong directions.
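For those wondering how a normal map can be derived from a height map in general, here’s a minimal Python sketch using central differences. It is not the NormalMapUtil implementation; the function name, the `strength` parameter and the sign convention are assumptions for illustration (a wrong sign is exactly the case where you would want to invert a component).

```python
import math


def normal_map_from_heightmap(height, strength=1.0):
    """Returns per-pixel (r, g, b) normals (0-255) for a 2D list of heights in [0, 1]."""
    rows, cols = len(height), len(height[0])
    result = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Central differences, clamped at the borders.
            dx = height[y][min(x + 1, cols - 1)] - height[y][max(x - 1, 0)]
            dy = height[min(y + 1, rows - 1)][x] - height[max(y - 1, 0)][x]
            nx, ny, nz = -dx * strength, -dy * strength, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            # Map each component from [-1, 1] to [0, 255].
            row.append(tuple(int(round((c / length * 0.5 + 0.5) * 255))
                             for c in (nx, ny, nz)))
        result.append(row)
    return result


flat = normal_map_from_heightmap([[0.5, 0.5], [0.5, 0.5]])
slope = normal_map_from_heightmap([[0.0, 1.0]])
print(flat[0][0], slope[0][0])
```

A perfectly flat height map gives the familiar light blue of an “empty” normal map, while a slope tilts the normal away from the rising side.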

Water Shading

A new shader filter that was added is the WaterShadingFilter. If you remember your high school physics, depending on the view angle, the surface either reflects the light or lets it pass through and refracts it (which is called the Fresnel effect). To put it simply, when looking at water at a shallow angle, it seems like a very reflective surface, but when looking straight down into it, you can see through it, although the view is distorted. The reflection uses a combination of environment mapping and Phong shading, while the refraction is a simple displacement mapping technique. The DisplayObject to which the filter is applied is used as what’s underneath the surface, i.e. the refracted light. The ripples seen in the demo are not made by the filter, but are custom written to manipulate the normal map.
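As a rough illustration of the Fresnel effect, here’s a Python sketch using Schlick’s approximation, a common stand-in for the full Fresnel equations. Whether the WaterShadingFilter uses this exact formula is not stated here, so treat it as an assumption; the function names and the base reflectance of 0.02 (a typical value for water) are also just for the example.

```python
def schlick_fresnel(cos_theta, f0=0.02):
    """Schlick's approximation of reflectance; f0 ~ 0.02 is typical for water."""
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5


def shade_water(reflected, refracted, cos_theta):
    """Blends reflected and refracted colours by the Fresnel reflectance."""
    k = schlick_fresnel(cos_theta)
    return tuple(r * k + t * (1.0 - k) for r, t in zip(reflected, refracted))


looking_down = schlick_fresnel(1.0)   # view direction along the surface normal
grazing = schlick_fresnel(0.0)        # view direction parallel to the surface
sky, riverbed = (0.6, 0.8, 1.0), (0.2, 0.3, 0.1)
print(looking_down, grazing, shade_water(sky, riverbed, 0.0))
```

Here `cos_theta` is the cosine of the angle between the view direction and the surface normal: looking straight down, almost everything is refracted, while at a grazing angle the surface is practically a mirror.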

Parallax Mapping

Another new filter is the ParallaxFilter, which performs (you guessed it) parallax mapping. It’s a technique like bump and normal mapping, in the sense that it tries to give more depth to a 2D texture. It does so by displacing texture coordinates, based on a height map and the view direction, to the coordinate that would be visible there if the surface actually had depth. This causes the texture to look extruded and more detailed. For more (and better) information, check the article on Wikipedia. Stok3d implements an iterative variation. It’s a bit slower but takes care of overlap issues and can handle sharp edges.
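The basic idea of displacing texture coordinates can be sketched in Python as follows. This is a simple fixed-point refinement and not necessarily the iterative variation Stok3d uses; the function name, the `scale` and `iterations` parameters and the `sample_height` callback are all hypothetical.

```python
def parallax_offset(uv, view_dir, sample_height, scale=0.05, iterations=4):
    """Iteratively shifts `uv` along the view direction according to the sampled height.

    `view_dir` is a normalised tangent-space vector (x, y, z) pointing towards the
    viewer; `sample_height(u, v)` returns a height in [0, 1].
    """
    u, v = uv
    vx, vy, vz = view_dir
    for _ in range(iterations):
        h = sample_height(u, v) * scale
        # Re-estimate from the original coordinate using the latest height.
        u = uv[0] + vx / vz * h
        v = uv[1] + vy / vz * h
    return (u, v)


def flat_height(u, v):
    """A height map with the same height everywhere."""
    return 0.5


head_on = parallax_offset((0.5, 0.5), (0.0, 0.0, 1.0), flat_height)
angled = parallax_offset((0.5, 0.5), (0.6, 0.0, 0.8), flat_height, scale=0.1)
print(head_on, angled)
```

Looking straight at the surface gives no displacement at all, while a slanted view shifts the lookup along the view direction, more so for higher points.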

That’s it! For now, it’s back to Farbe, and… some other things :)


Introducing Stok3d – More FP10 3D+Pixel Bender shading


Last week I posted an example of Environment Mapping using FP10’s native 3D and Pixel Bender. The reactions were quite positive, which motivated me to push the concept a bit further and create more shaders using Pixel Bender. These new additions all work in the same fashion, i.e. as Filters which need to be updated whenever the target object (or if provided: light position) changes.

I created a project on Google Code for this, dubbed Stok3d (as I was positively stoked at that specific point in time). It’s a mini-library at this point, but in the future there’s potentially more to it than shaders; Z-sorting for instance: although it has been done already, it wouldn’t be a bad idea to create something specifically for Stok3d and have more functionality in one place. But… zat’s for ze futuah! At least for now, it will be easier to commit bugfixes and updates.

Demos

In order from low CPU usage (and less visually interesting) to high CPU usage (and more interesting):

* Although Stok3d is distributed under the GNU GPLv3 license, the textures are NOT covered by this license. In particular, the blast door and hangar door textures, normal maps, and specular maps are made by Florian Zender (http://www.florianzender.com) and are used with his kind permission. Check out his work, it’s quite impressive! :)

Source

The source for the examples as well as the library can be found on Google Code. It’s available over svn or as a downloadable archive.


One year later: a short retrospective


The first post on this blog dates from May 4th, 2008, so I thought it fitting to look back at the year that has passed since then.

Over time I’ve done some experiments/projects that, when I look back at them, make me think “What was I thinking?!”. On the other hand, there are luckily a few of which I’m actually proud, or at least content with how they turned out, whether they were practically useful or not (which is not exactly always my goal). A lot of the things I did, I did as a way to learn. As such there was Wick3d, a (now defunct) basic 3D engine to rehash my algebra (actually, I worked on it some more after the last update without committing anything to svn). And of course Pixel Bender came along, giving me a whole new area to explore, as did Alchemy.

I do feel I learned quite a bit, but at the same time it seems with everything I learn, there’s twice as much I still need to study. Feels like a constant battle against ineptitude, especially when talking to those who actually seem to know what they’re talking about ;)

Lastly, I’d like to pass some more updates on Farbe. I recently made some updates implementing oil paint and airbrush. There’s still so much work to do before it can go public, but soon it will be time to look for a Flex UI-designer/skin artist. If you know or are anyone with experience, send me a hoot with some examples! AS3/MXML/CSS experience is a must of course :) Just keep in mind, the project is free and will likely be open-sourced eventually (which is also a cheap way of saying: no money involved ;) ). If you follow me on Twitter, you might have seen some demo pictures being tweeted. Those that haven’t: here’s some previews of Farbe simulating pencils (1, 2), oil paint and airbrush.

Off towards another year! Thanks to all you readers!


Image bleeding with water (Flash + PixelBender)


For the last 2 months, I’ve been investing 99% of my free time into the next iteration of Farbe, turning it into a real Flex-based image editing tool simulating natural media. Although there’s nothing of the application itself that I can show yet, today I created another small proof of concept for it that I can make public.

One of the things lacking in the watercolour POC was that once a brush stroke was made, nothing could be done with it. I thought it’d be nice to still be able to add water once the paint was rendered and have the colours bleed out. Using much of the same physics as for the watercolours, I figured out an algorithm that was both fast enough (real-time) and visually effective enough to water down the image. As usual, much Pixel Bender was used. The multi-threaded nature of ShaderJob really proved its worth in this case. You can keep adding water without the simulation slowing down the interaction, even if the simulation itself gets slow when there are a lot of wet areas to cover.

To get the most realistic results, settings such as “ink speed” and “water amount” should be kept low while slowly rubbing over the image. Higher levels are not natural and will look caricatural (reminding me of Kai’s Power Tools of old! ;) ).

So check it out! :)

Note that, even though Farbe is not an open source project (or not yet at least), I’m providing the source for this POC – consider it a late Easter present ;) But do remember, it IS POC-style code!

In closing, I’ll leave you with a few updates on Farbe. First of all, the watercolour paint is quite a bit faster (unless, of course, you’re working on much bigger canvas sizes than the cheaply upscaled old version) and so far it seems it’s pretty bug free! Secondly, I recently finished a pencil and eraser tool which are looking alright. The rest of the time has been spent mainly on the user interface and typical paint tool functionality. I’m starting to feel quite overworked at the moment, but the app is shaping up so it’s worth it! I hope I’ll be able to give out some more tangible updates on all that soon :)


Cloth simulation modifier in AS3Dmod


 

I was recently invited by Bartek Drozdz to create a cloth modifier for AS3Dmod, similar to the 2D version I did earlier. In the unlikely case you haven’t heard about it before, AS3Dmod is a cool modifier library compatible with the most popular 3D ActionScript engines (Papervision3D, Away3D, Sandy and Alternativa3D). To put it simply, it takes existing 3D meshes and changes their shape on a per-vertex basis, and also allows you to animate them without needing an animated model. Lucky for me, as I was interested in doing a 3D version, and this was the perfect setup to do it in :)
 
The cloth modifier provides some methods and functions to adjust its behaviour (rigidity, air friction) as well as to apply external forces such as gravity or wind. You can also set boundaries to act as fake walls or floors. This modifier works best with meshes that have a flat edge, such as planes, boxes/cubes, cylinders, and so on. That way they can be locked in place at an edge and actually give you something to look at instead of having a mesh that gets blown out of the view straight away :)
One last remark is that the cloth should be the first in the modifier stack. It needs its previous state, and any prior modifiers changing its state will not have any effect.
 
On to the demos!
  • Flag sim with parameters to play with (Papervision3D, 600 triangles): Demo | Source
  • Hanging cloth with fake floor and wind (Away3D, 450 triangles): Demo | Source
  • Strange cube being blown about (Away3D, 1200 triangles): Demo | Source
Get AS3Dmod on Google Code.

Verlet + Newton + FP10 = Cloth Simulation


A project I’m currently involved in inspired me to completely rewrite my old Curtain class into something more stable and versatile. Using a character physics method based on Verlet integration, and adding some properties for friction and gravity, it resulted in a 2D cloth simulation (at least after some updates I just did since I needed to get away from work for a bit). The curtain itself is drawn using the new drawTriangles API for Flash Player 10.
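For readers new to Verlet integration, here’s a minimal Python sketch of the general technique: points that remember their previous position, plus distance constraints that pull connected points back to their rest length. It is not the source of the cloth class; the names and the default values for gravity and friction are assumptions for illustration.

```python
import math


class Point:
    def __init__(self, x, y, pinned=False):
        self.x, self.y = x, y
        self.old_x, self.old_y = x, y   # previous position encodes velocity
        self.pinned = pinned


def integrate(points, gravity=0.5, friction=0.99):
    """Verlet step: new = current + (current - old) * friction + gravity."""
    for p in points:
        if p.pinned:
            continue
        vx = (p.x - p.old_x) * friction
        vy = (p.y - p.old_y) * friction
        p.old_x, p.old_y = p.x, p.y
        p.x += vx
        p.y += vy + gravity


def satisfy(a, b, rest_length):
    """Moves two points so that they are `rest_length` apart again."""
    dx, dy = b.x - a.x, b.y - a.y
    dist = math.hypot(dx, dy) or 1e-9
    diff = (dist - rest_length) / dist
    # A pinned point doesn't move, so the free point takes the full correction.
    share_a = 0.0 if a.pinned else (1.0 if b.pinned else 0.5)
    share_b = 0.0 if b.pinned else (1.0 if a.pinned else 0.5)
    a.x += dx * diff * share_a
    a.y += dy * diff * share_a
    b.x -= dx * diff * share_b
    b.y -= dy * diff * share_b


anchor = Point(0.0, 0.0, pinned=True)
weight = Point(0.0, 10.0)
for _ in range(50):
    integrate([anchor, weight])
    for _ in range(3):          # a few relaxation passes per frame
        satisfy(anchor, weight, 10.0)
print(weight.x, weight.y)
```

A cloth is the same thing on a grid: every point is connected to its neighbours by such constraints, and the top row is pinned.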

Anyone interested in Verlet integration (or scripted animation in general) should check out Keith Peters’ book AdvancED ActionScript 3.0 Animation.

On to the demo! Right-click to view source. Not commented due to lack of time, but it shouldn’t be too hard to figure out :)


© 2009 Der Schmale – David Lenaerts’s blog. All Rights Reserved.

This blog is powered by Wordpress and Magatheme by Bryan Helmig.