Ok, since the IRC session, I've tried to code what I was talking about...

First, I coded a 100% C routine that fills the whole Hires screen with
code like this:

for x=0 to 239
  for y=0 to 199
    plot_pixel(x,y)
  next y
next x

This code, using my own plot_pixel (the C version of the one I
described in the CEO mag demo pages), takes around 53 seconds to execute...
Arg!
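
To make the cost visible, here is a minimal sketch of what a pure C
plot_pixel can look like. It is not the routine from the CEO mag pages: the
names (screen, fill_screen, the constants) are made up for illustration, and
a plain array stands in for the Hires memory at $A000. It assumes the usual
Hires layout: 40 bytes per line, 6 pixels per byte in bits 5..0, bit 6 set
on bitmap bytes. Note the multiply by 40 and the divide/modulo by 6 on every
call, which the 6502 has no instructions for.

```c
#include <stdint.h>

#define HIRES_WIDTH   240
#define HIRES_HEIGHT  200
#define BYTES_PER_ROW 40   /* 240 pixels / 6 pixels per byte */

/* Stand-in for screen memory (assumption: on a real Oric this would be
   the Hires area starting at $A000). */
static uint8_t screen[BYTES_PER_ROW * HIRES_HEIGHT];

/* Set one pixel. Each byte holds 6 pixels in bits 5..0 (leftmost pixel
   in bit 5); bit 6 marks the byte as bitmap data. */
void plot_pixel(unsigned x, unsigned y)
{
    uint8_t *cell;

    if (x >= HIRES_WIDTH || y >= HIRES_HEIGHT)
        return;
    cell = &screen[y * BYTES_PER_ROW + x / 6];   /* multiply and divide */
    *cell |= (uint8_t)(0x40 | (0x20 >> (x % 6)));
}

/* The benchmark loop from above: the vertical loop is the inner one. */
void fill_screen(void)
{
    unsigned x, y;

    for (x = 0; x < HIRES_WIDTH; x++)
        for (y = 0; y < HIRES_HEIGHT; y++)
            plot_pixel(x, y);
}
```
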

Replacing the C code of plot_pixel with the optimised 6502 version (but
still using the "by stack" parameter passing) brings it down to around
11 seconds. A lot better, but still bad.

The same code 100% in 6502, without parameters, but still accessing the screen
in the wrong direction (I mean, the inner loop is the vertical one), takes
around 2 seconds.

When I swap the X and Y loops, it goes down to between 1 and 2 seconds
(because the addition of 40 at each pixel is replaced by offset
addressing).
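
The difference between the two directions can be sketched in C like this.
It only illustrates the idea and is not the 6502 code that was timed; the
function names are invented. In the column version the address moves by 40
on every pixel. In the row version the row pointer stays fixed and only a
small index changes, which is what indexed addressing with the Y register
gives you for free on the 6502.

```c
#include <stdint.h>

#define HIRES_HEIGHT  200
#define BYTES_PER_ROW 40

/* Vertical loop innermost: stepping to the next pixel means adding 40
   to a 16-bit address every time. */
void fill_by_columns(uint8_t *base)
{
    unsigned col, bit, y;

    for (col = 0; col < BYTES_PER_ROW; col++)
        for (bit = 0; bit < 6; bit++) {
            uint8_t *p = base + col;
            for (y = 0; y < HIRES_HEIGHT; y++) {
                *p |= (uint8_t)(0x40 | (0x20 >> bit));
                p += BYTES_PER_ROW;      /* one add per pixel */
            }
        }
}

/* Horizontal loop innermost: the row pointer is fixed while col plays
   the role of the index register. */
void fill_by_rows(uint8_t *base)
{
    unsigned y, col, bit;
    uint8_t *row = base;

    for (y = 0; y < HIRES_HEIGHT; y++) {
        for (col = 0; col < BYTES_PER_ROW; col++)
            for (bit = 0; bit < 6; bit++)
                row[col] |= (uint8_t)(0x40 | (0x20 >> bit));
        row += BYTES_PER_ROW;            /* one add per row */
    }
}
```

Both functions leave the same bytes on screen; only the number of address
additions differs (48000 against 200).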

For comparison, if you code that in BASIC (with CURSET x,y,1), it takes
6 minutes and 55 seconds!


After these little benchmarks, I started converting a picture made with my
favorite paint program to the format I described... I had to double the
allowed size to get the whole picture converted perfectly!

Why? Because of all the little holes I made in the picture.

Each isolated pixel in the graphics causes 2 color changes in the stream,
and so takes 2 cells in the list.
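
A small sketch of the counting, under the assumption that a line costs one
cell per color change (the real cell layout isn't repeated here, so
count_cells and its parameters are purely illustrative):

```c
#include <stddef.h>

/* Count the cells one line needs, assuming one cell per color change.
   pixels[] holds one color value per pixel; background is the color
   the line starts in. */
size_t count_cells(const unsigned char *pixels, size_t width,
                   unsigned char background)
{
    size_t cells = 0;
    unsigned char current = background;
    size_t x;

    for (x = 0; x < width; x++) {
        if (pixels[x] != current) {
            cells++;                 /* a color change takes one cell */
            current = pixels[x];
        }
    }
    return cells;
}
```

A line like 0,0,1,0,0 gives 2: one change going into the isolated pixel and
one coming out of it. With this counting, 8 isolated pixels on a line are
enough to use up 16 cells.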

The version I include is configured with 16 cells.
Coding is done by line, not by column, because of processing time. This
version can be optimised by writing 6-pixel packets directly.
Before writing to the screen, it's also possible to mask the graphics with
some patterns.
etc etc...
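
To give an idea of the 6-pixel packet write and the pattern mask, here is a
rough C sketch. It is not what the included version does, and fill_span and
its parameters are invented for the example. It fills pixels x0 to x1 of one
line, writing whole bytes in the middle and only masking the two ends; the
6-bit pattern is ANDed in before anything reaches the screen.

```c
#include <stdint.h>

#define HIRES_WIDTH 240

/* Fill pixels x0..x1 (inclusive) of one 40-byte line.
   pattern is a 6-bit mask: 0x3F is solid, 0x2A is every other pixel. */
void fill_span(uint8_t *row, unsigned x0, unsigned x1, uint8_t pattern)
{
    unsigned first, last, col;
    uint8_t left, right;

    if (x0 > x1 || x1 >= HIRES_WIDTH)
        return;

    first = x0 / 6;
    last  = x1 / 6;
    left  = (uint8_t)(0x3F >> (x0 % 6));            /* x0 to end of byte   */
    right = (uint8_t)(0x3F & ~(0x1F >> (x1 % 6)));  /* start of byte to x1 */

    if (first == last) {
        row[first] |= (uint8_t)(0x40 | (left & right & pattern));
        return;
    }
    row[first] |= (uint8_t)(0x40 | (left & pattern));
    for (col = first + 1; col < last; col++)
        row[col] |= (uint8_t)(0x40 | (0x3F & pattern));  /* 6 pixels at once */
    row[last] |= (uint8_t)(0x40 | (right & pattern));
}
```
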

There's a lack of comments in the code, but it was written quickly...

see ya,
Mike

