Introduction
(Warning, this post is LARGE)
With the rise of large audiences in social games and the trend toward microtransactions and in-game purchases, a lot of big and small companies have started to make Flash games. This results in heated competition to deliver the next game that stands out in some way. In AAA games (sadly, to be honest) one of the ways to stand out was, and still is, graphics, probably because of the ratio of effort spent to attention gained, and because once you have seen good graphics, older ones start to irritate the eye. Now, with this competition for a market of hundreds of millions of players, the same is starting to happen in the browser, with Flash being the most popular technology in use at the moment. This leads to Flash rendering being pushed to the limit and wandering into uncharted combinations of rendering techniques and the various pitfalls they bring.
Recently some discussion arose around using the Flash display list versus custom rendering based on BitmapData. One such conversation was started by a SocialCity developer discussing what kind of compromises they made while building SocialCity. Such a compromise is pretty common among desktop developers coming to Flash, as it is very similar to what they were doing there, I guess. Anyway, that article got the attention of a Flash developer who had been making Flash games for a while before the market heated up that much. So he ran a number of his own tests and comparisons of how good the Flash display list is versus BitmapData blitting.
Flash Display List vs BitmapData
The compromise here is of course obvious; let's summarize a little:
Flash display list:
Pros:
- Powerful decomposition of objects that many 3D developers would be familiar with: objects have children and an internal coordinate space, and an object's content is affected by its transformation matrices. In other words, a heavy but expressive hierarchical structure.
- There are various filters and effects you can apply easily and dynamically
- You get a powerful animation engine out of the box
- It is native to Flash, so it uses some low-level optimizations you can't really achieve yourself (more on redraw regions later)
- Designers can make complex objects which programmers can then manipulate; very good for complex, colorful and animated UIs (well, it is often not easy to separate design and programming that way, and Adobe tries to solve this with Catalyst, which I do not like, but that's for another article)
- Interactivity is built in
Cons:
- It is a black box; there is various stuff under the surface you cannot affect or control. There are bugs and inconsistencies too.
- All that dynamic expressive power costs you in performance. Sometimes a lot.
BitmapData:
Pros:
- It is fast. Sometimes very fast in comparison to the display list. Check Iain's comparisons. For me it is something like a 2x difference on average in favor of BitmapData, even when you do everything right for the display list
- You get your custom bitmap caching
- You build your own pipeline, which you know how to use right; no rocks under the surface unless you put some there yourself, and we all do sometimes, I guess
Cons:
- After using the display list it feels as if, after flying, your wings are cut off and you are given a shovel to dig the dirt… You lose animations and need to make them yourself; doing transforms becomes harder, adding effects too.
- If you make your own animations, the only efficient way to do them is to precache them, which costs memory, sometimes a lot (though they are faster than native ones as a result). Making a hybrid system may be hard and may make your engine hiccup when some animations are cached again and others are removed from memory.
- You lose the ability to control Flash redraw areas well (more on that later again)
- Making complex interfaces this way is horrible: they are hard to animate, the programmer does all composition in code, and it is a lot of work (on the other hand, using a designer's poorly made UI structure is sometimes a lot of work too)
- You lose all interactivity and need to build it yourself from scratch, which is sometimes hard or possibly even impossible
I guess there are more, but I have mentioned the main ones. For the sake of this article this is enough.
Conclusions
As you can see, the pros of one are often the cons of the other. There is no clear “one size fits all” here. Each is strong at different things, so it highly depends on the kind of game you make and what kind of compromises you like.
Facebook games and redraw area
There are two games I would like to investigate in terms of what they use and how. Of course I cannot know that for sure, but I can investigate and speculate. Those games are SocialCity, which I mentioned before, and CityVille.
So here are game screenshots made in the Debug Player with Show Redraw Regions turned on. That means the debug player draws red rectangles around the areas it redraws on each frame.
To the left we see CityVille while to the right we see SocialCity.
From the article I mentioned above we know that SocialCity, on the right, uses the BitmapData approach, and we can see it in the picture as well. Some parts of the bottom UI are redrawn occasionally, while the whole main top part is redrawn completely on each frame, including the UI above it, even though that is not necessary.
Now take a look at CityVille on the left. You can see various red areas all over the screen, especially around the moving cloud and a few characters, while other things are not redrawn. This is what I meant when I listed redraw regions as a pro of the display list and a con of BitmapData. As part of its optimizations, Flash does a very good job of figuring out on each frame what changed, which areas should be redrawn and which should not, without the programmer even bothering to think about it (well, not true, but almost). Anyway, all that leads me to the conclusion that CityVille is using the Flash display list as the “better” approach.
Let's see how they compare in terms of CPU and memory use by loading them in Chrome (restarting the browser between them to be fair):
- CityVille: 120-130 MB and 22-38% CPU use.
- SocialCity: 220-240 MB and 15-25% CPU use.
Of course it is kind of wrong to compare them like that: they have different engines and different counts of animated and non-animated objects on and off screen. Still, the results are, I guess, what you would expect: the display list game uses less memory but more CPU, while the BitmapData game uses more memory but less CPU.
Hybrid approach
As we see, both approaches are compromises based on what your staff can do best, how fast you need to do things, and how much graphics you want to push to the screen. But is picking only one or the other the only option? At work we are currently exploring that question by trying a hybrid approach that is a compromise in itself. We use a BitmapData-based rendering pipeline for our main game graphics, where we render a lot of stuff to the screen, and a vector, display-list-based UI above it, and this brings some new problems to the table.
This approach is closer to SocialCity and has the same problems. We redraw the whole game rendering area each frame, and this causes the complex UIs above it to be redrawn unnecessarily, which sometimes costs a lot. Returning to the Flash redraw area thing: the Flash display list handles this efficiently and automatically, but what about BitmapData? BitmapData has lock/unlock methods that allow you to control redraw areas a little. While a BitmapData is locked it is not redrawn unless something else above or below it causes that. Calling unlock without a change rectangle redraws the whole BitmapData, while providing a change rectangle causes only that rectangle to be redrawn. And that's it.
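Since unlock takes only a single change rectangle, several separate dirty areas have to be collapsed into one bounding box before they can be passed in. Here is a minimal sketch of that step, written in TypeScript rather than ActionScript so it can be run anywhere. The `Rect` type, the helper names and the 800x600 example frame are my own assumptions for illustration and not part of the Flash API.

```typescript
// Minimal rectangle type; mirrors the x/y/width/height fields of a Flash Rectangle.
interface Rect { x: number; y: number; width: number; height: number; }

// Smallest rectangle that contains both a and b.
function union(a: Rect, b: Rect): Rect {
  const left = Math.min(a.x, b.x);
  const top = Math.min(a.y, b.y);
  const right = Math.max(a.x + a.width, b.x + b.width);
  const bottom = Math.max(a.y + a.height, b.y + b.height);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Collapse all dirty areas of one frame into a single change rectangle.
// Returns null when nothing changed; the caller has to handle that case
// itself, because unlocking without a rectangle marks the whole bitmap as changed.
function changeRectFor(dirty: Rect[]): Rect | null {
  if (dirty.length === 0) return null;
  return dirty.reduce((acc, r) => union(acc, r));
}

const area = (r: Rect): number => r.width * r.height;

// Hypothetical frame: two 32x32 sprites moved in opposite corners of an 800x600 bitmap.
const dirty: Rect[] = [
  { x: 10, y: 10, width: 32, height: 32 },
  { x: 750, y: 550, width: 32, height: 32 },
];
const changeRect = changeRectFor(dirty);
const changedPixels = dirty.reduce((sum, r) => sum + area(r), 0);
```

In this made-up frame only 2,048 pixels actually changed, but the single bounding box covers 441,584 pixels, which is most of the bitmap. That is the limitation of a single change rectangle in a nutshell.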
To test all those compromises I made the following experiment (it shows redraw regions in the Debug Player automatically). The frame rate is set to 98 to completely overload Flash and the CPU and see how much difference things really make. Click on the image to see it, but read the stuff below first, I guess:
Let's first go over the things we have here:
- At the back we have a full-stage Bitmap that copies pixels from 4 prerendered BitmapData objects each frame.
- Then we have a vector MovieClip window that has Bevel and Drop Shadow filters applied.
- In it we have a TextField with text.
- And finally, probably the heaviest things are 12 copies of a vectorized Flash logo animated through frames with a classic tween.
Now the switches I added to the right-click menu (a bad idea, as it will not work on phones, but I am lazy already):
- The first menu item is cacheAsBitmap; it turns it off or on for the window with its content.
- Lock bmd locks or unlocks the BitmapData; while it is locked, pixel copying from the cached BitmapData objects stops
- Remove objects removes the window from the display list, or adds it back, along with all the stuff inside it
- Stop animations stops or starts the Flash logo animations
- And the last one is break bitmaps; it breaks the background bitmap data into 8 smaller bitmaps around the window, in the hope that Flash would no longer rerender the window
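To make the "break bitmaps" switch more concrete, here is a rough sketch of how the 8 rectangles around a window could be computed. This is not the code from the experiment, just an illustration in TypeScript; `tilesAround`, the 800x600 stage and the window position are all assumptions of mine, and the sketch assumes the window lies strictly inside the stage.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }

// Cut the stage into a 3x3 grid along the window's edges and keep the
// 8 outer cells; the centre cell is the area covered by the window.
function tilesAround(stage: Rect, win: Rect): Rect[] {
  const xs = [stage.x, win.x, win.x + win.width, stage.x + stage.width];
  const ys = [stage.y, win.y, win.y + win.height, stage.y + stage.height];
  const tiles: Rect[] = [];
  for (let row = 0; row < 3; row++) {
    for (let col = 0; col < 3; col++) {
      if (row === 1 && col === 1) continue; // skip the cell under the window
      tiles.push({
        x: xs[col],
        y: ys[row],
        width: xs[col + 1] - xs[col],
        height: ys[row + 1] - ys[row],
      });
    }
  }
  return tiles;
}

// Hypothetical layout: 800x600 stage with a 400x300 window in the middle.
const stage: Rect = { x: 0, y: 0, width: 800, height: 600 };
const win: Rect = { x: 200, y: 150, width: 400, height: 300 };
const tiles = tilesAround(stage, win);
const tiledArea = tiles.reduce((sum, t) => sum + t.width * t.height, 0);
```

Each tile would then get its own Bitmap and receive pixels copied from the matching region of the large source, so that nothing directly under the window is ever touched.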
All those switches and objects are there to test various elements and their effects on frame rate in such a hybrid model. I had high hopes for the fifth switch: that breaking the larger rendering area into smaller pieces and blitting into them from the original large area would let me trick Flash into not rerendering the window. Sadly it is not so. Anyway, let's explore the effect of various switch combinations on frame rate:
- With the window removed, the bitmap locked and animations turned off, I get 62 FPS in Chrome; I guess we could call it the best I can get right now (I guess the FPS counter itself slows things down from the target 98 frames per second; locally, though, it runs at 100 FPS in that case)
- Adding the window back does not change much; it is still something like 61-62 FPS, with the logo animations and the bitmap still off
- Turning on animations in the window drops it to ~45 FPS, with redraw regions showing around them
- Turning on cacheAsBitmap for the window at that point does not change much
- Turning off cacheAsBitmap and unlocking the bitmap drops things even more, to 29-30 FPS
- Breaking the bitmap into 8 smaller ones does not change much, though I think it gets slower, as it is now constantly at 29 FPS
- Stopping animations now brings us back to ~42 FPS
- Turning on cacheAsBitmap brings us to ~62 FPS
- Removing the window leaves us at ~62 FPS
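FPS numbers are easier to compare once they are converted to time per frame, because the same FPS drop costs more milliseconds the lower the frame rate already is. A small sketch of that arithmetic, using the readings from the list above (I took 30 as the upper end of the 29-30 FPS range; the variable names are just mine):

```typescript
// Convert a frame rate into the time spent on one frame, in milliseconds.
function frameMs(fps: number): number {
  return 1000 / fps;
}

const baselineFps = 62;        // window removed, bitmap locked, animations off
const animatedFps = 45;        // animations on, bitmap still locked
const animatedAndBlitFps = 30; // animations on, bitmap unlocked and blitting

// Extra time per frame added by each change.
const animationCostMs = frameMs(animatedFps) - frameMs(baselineFps);
const blitCostMs = frameMs(animatedAndBlitFps) - frameMs(animatedFps);

console.log(animationCostMs.toFixed(1), blitCostMs.toFixed(1));
```

So turning on the animations adds roughly 6 ms per frame, while unlocking the bitmap on top of that adds roughly another 11 ms, even though both steps look like a similar drop when read as FPS.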
Conclusions:
- It seems that for non-animated objects cacheAsBitmap helps a lot, but even this does not seem to be consistent. For animated objects it can help too, but that is rare and things can even get worse. In this experiment it does not seem to change much; it also does not seem to affect memory in any way, which is interesting. Of the two games we are currently making, it makes things better in one and worse in the other…
- Heavy animated objects are a disaster currently
- Having something that causes a redraw under a heavy vector object is a disaster
- Having complex vector animations costs a lot
It is interesting to notice that measurements 2, 8 and 9 in the list above (counting from the top) run at nearly the best FPS, while the difference between measurements 3 and 5 is 15 frames. So redrawing the whole screen with the bitmap is fine on its own. Having an animated window without a whole-screen redraw is slow, but adding the whole-screen redraw makes things another 15 frames slower. It seems to me there is a possible 15-frame optimization here: even though the stuff under the window is redrawn, we could force Flash not to redraw that part of the screen unless necessary. I suspect measurement 5 could run at almost the same frame rate as measurement 3 if we could do that; that is what I wanted to achieve by breaking up the bitmap, but it did not work. Flash seems to analyze redraw rectangles and merge nearby ones into a single one, which probably optimizes things on average, but in this case it causes the whole screen to be redrawn.
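I do not know how Flash Player actually merges redraw rectangles, so the following is only a guess at the kind of heuristic that would explain what I see: any two rectangles that touch or lie within some small gap of each other are replaced by their bounding box, and this repeats until nothing more can be merged. Everything here (`mergeClose`, `gap`, the `maxGap` threshold, the layout) is hypothetical, and for brevity the sketch uses 4 strips around the window instead of 8 tiles.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }

// Smallest rectangle that contains both a and b.
function union(a: Rect, b: Rect): Rect {
  const left = Math.min(a.x, b.x);
  const top = Math.min(a.y, b.y);
  const right = Math.max(a.x + a.width, b.x + b.width);
  const bottom = Math.max(a.y + a.height, b.y + b.height);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Largest per-axis distance between two rectangles; 0 if they touch or overlap.
function gap(a: Rect, b: Rect): number {
  const dx = Math.max(0, Math.max(a.x, b.x) - Math.min(a.x + a.width, b.x + b.width));
  const dy = Math.max(0, Math.max(a.y, b.y) - Math.min(a.y + a.height, b.y + b.height));
  return Math.max(dx, dy);
}

// Greedily merge rectangles that are at most maxGap apart, restarting the
// scan after every merge until no pair qualifies any more.
function mergeClose(rects: Rect[], maxGap: number): Rect[] {
  const out = rects.slice();
  let merged = true;
  while (merged) {
    merged = false;
    for (let i = 0; i < out.length && !merged; i++) {
      for (let j = i + 1; j < out.length && !merged; j++) {
        if (gap(out[i], out[j]) <= maxGap) {
          out[i] = union(out[i], out[j]);
          out.splice(j, 1);
          merged = true;
        }
      }
    }
  }
  return out;
}

// Hypothetical 800x600 stage with a 400x300 window in the middle.
const top: Rect = { x: 0, y: 0, width: 800, height: 150 };
const bottom: Rect = { x: 0, y: 450, width: 800, height: 150 };
const left: Rect = { x: 0, y: 150, width: 200, height: 300 };
const right: Rect = { x: 600, y: 150, width: 200, height: 300 };

const mergedStrips = mergeClose([top, bottom, left, right], 0);
const farApart = mergeClose([top, bottom], 0);
```

With this heuristic the four touching strips collapse into one 800x600 rectangle that also covers the window, while the top and bottom strips alone stay separate. If Flash does something similar, it would explain why tiling around the window did not stop the window from being rerendered.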
Anyway, a few days ago, before I completed this experiment, I filed a feature request in the Adobe JIRA about changing the BitmapData unlock parameter from a single change rect to a vector of change rects. This way we could optimize how we redraw stuff. Sadly, after this experiment it seems to me that it would not be enough. What we really need is not BitmapData change rects but the ability to manipulate Flash Player change rects, to optimize its performance manually in such extreme cases. So I am thinking of filing such a feature request too, but I was too quick with the first one and filed it without much thinking. I will take it slowly this time. Still, if you like the first request, go ahead and vote for or watch it while I am thinking about adding the second one.



