Media Player 128

Started by Hydrophilic, May 18, 2010, 05:26 PM


bacon

Quote from: saehn on June 18, 2010, 11:50 PM
Quote from: bacon on June 18, 2010, 04:00 PMThe EeePC's monitor (or any other PC monitor) doesn't know anything about the pixels being output from the VIC chip, and neither does the video capture card. The capture card takes the video signal from the C128 or any other source and outputs a 4:3 aspect ratio picture to the PC, which shows it on the screen. As long as the PC monitor has square pixels the image on the screen will retain the 4:3 aspect ratio, i.e., the CBM pixels will show up with the correct ratio since the C128's video output was meant to be shown on a 4:3 aspect monitor. Hence my pointing out that the EeePC screen shows square pixels.

I get what you're saying but I still think that it's slightly off since it's 4:3 aspect ratio whereas the average CBM monitor aspect ratio is closer to .82 (this is something of a scene standard---for both NTSC and PAL---not just my own personal opinion). And as we can see from our earlier posts, your output (on the eeePC) differs from mine (on a CBM monitor).

If Hydrophilic can make it adjustable, then all problems will be solved.
Now I see what you mean. I thought you meant that the pixel size ratio from a CBM computer would show up as .82 on a true 4:3 display, in which case my reasoning would have been correct.
Bacon
-------------------------------------------------------
Das rubbernecken Sichtseeren keepen das cotton-pickenen Hands in die Pockets muss; relaxen und watschen die Blinkenlichten.

Hydrophilic

I've been thinking about the aspect ratio, and came to this conclusion.  Since both NTSC and PAL TVs have a 4:3 aspect ratio (not considering high-def / wide-screen), it makes sense that PAL would appear wider than NTSC (or equivalently, PAL would be shorter than NTSC) because PAL has a greater vertical resolution: 312 rasters (as opposed to only 263 rasters for NTSC).

So if 0.82 pixel aspect ratio is correct for NTSC (looks right on my PC using VICE as compared to real C128 on my TV), then 0.82 * 312 / 263 = 0.97 pixel aspect ratio for PAL.  This explains why, to my confusion in the past, some people would refer to the hi-res pixels as square.  They are square on PAL TVs (or very close).
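A quick check of the arithmetic above (the scene-standard 0.82 NTSC pixel aspect ratio is assumed, as in the post): both systems fill the same 4:3 screen, so the pixel aspect ratio scales with the raster count.

```python
# Pixel-aspect-ratio estimate for PAL, derived from the NTSC figure.
NTSC_RASTERS = 263
PAL_RASTERS = 312
ntsc_par = 0.82                      # scene-standard NTSC pixel aspect ratio
pal_par = ntsc_par * PAL_RASTERS / NTSC_RASTERS
print(round(pal_par, 2))             # ~0.97, i.e. nearly square pixels on PAL
```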

I don't think our 8-bit CPU can handle resizing an entire bitmap in realtime to correct the pixel aspect ratio.  So now I'm thinking of something like a 152 x 192 multi-color image.  This should appear slightly stretched vertically on NTSC and slightly stretched horizontally on PAL.  Those numbers come from 38-column / 24-row mode, to allow smooth scrolling.

Thus I calculate an NTSC display aspect ratio of 1.300 (slightly stretched vertically) or 1.536 for PAL (stretched a bit more, but horizontally).

...

I wanted to report I've completed my first P-Frame encoded video and am very disappointed!  The file size got reduced from 320 to 310 blocks.  That's only about 2 kiByte of savings!  And the frame rate didn't improve much either: from approximately 0.85 to 0.95 frames / second (0.92 NTSC, 0.97 PAL).

This was quite confusing until I investigated the issue.  The main problem is inconsistent bitmap encoding.  For example, a block of solid grey (choose your own shade!) might be encoded as bit-pair %01 in one frame but as %10 in another frame.  So even though two frames appear the same on the screen, they are stored with different bytes, and thus benefit less from compression.

Another more complex example deals with dithering.  Because of the limited color resolution of the VIC-II, many areas are not a solid color but a pattern created by dithering.  Imagine a color half-way between two VIC colors that covers a character-sized area.  It could be coded as
0 1 0 1
1 0 1 0
0 1 0 1
1 0 1 0
0 1 0 1
1 0 1 0
0 1 0 1
1 0 1 0
or in the opposite manner
1 0 1 0
0 1 0 1
1 0 1 0
0 1 0 1
1 0 1 0
0 1 0 1
1 0 1 0
0 1 0 1
Either way will give the same overall appearance, but because each uses a different byte sequence, using both reduces the compression.
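One way to attack this (my own sketch, not necessarily what the encoder ended up doing): cache the byte sequence of each dithered cell, keyed by what the cell represents, and re-emit the cached bytes whenever a later frame needs a cell that looks the same.  The coarse average-luma bucketing and the 4*8 cell size are my assumptions.

```python
# Canonicalize dithered cells: identical-looking cells get identical bytes,
# so inter-frame compression sees matching data instead of phase-flipped noise.
from statistics import mean

pattern_cache = {}  # quantized average luma -> previously emitted cell bytes

def encode_cell(pixel_lumas, dither_cell):
    """pixel_lumas: source lumas for this 4x8 cell; dither_cell: freshly
    dithered byte sequence.  Returns the bytes to actually emit."""
    key = round(mean(pixel_lumas) / 4)     # coarse bucket; the tolerance is a knob
    if key in pattern_cache:
        return pattern_cache[key]          # reuse the old pattern verbatim
    pattern_cache[key] = bytes(dither_cell)
    return pattern_cache[key]
```

With this, the 50% checkerboard and its phase-flipped twin both come out as whichever byte sequence was emitted first.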

The dithering algorithm I'm using is fairly standard.  The problem is, it is quite chaotic!  Pixels in one part of an image can be affected by pixels on the complete opposite side of the screen!

So that's the current situation.  Need to improve frame-coding consistency.  And then on to motion vectors and smooth scrolling.  Oh what fun!
I'm kupo for kupo nuts!

saehn

Hydrophilic, you may want to talk to Algorithm. He's done some of the most impressive video rendering I've ever seen on the C64. Also, I bet he'd be a really good source of information regarding the PAL aspect ratio. Here's his contact info:

http://www.algotechproductions.com/contact/contact.htm

And check out this PAL release he did... very impressive in WinVICE:

http://noname.c64.org/csdb/release/?id=45850

Hydrophilic

Thanks for the links!  That demo has a nice frame rate, but there are only about 20 frames in the whole video, repeated ad nauseam (I am exaggerating a little).  While watching it, I noticed it seemed to be using "virtual bitmap mode", which a quick peek with the VICE monitor confirmed.  That was my plan 67 (for when true bitmap mode proves impossible).  Without doing anything else, virtual bitmap mode should triple my frame rate (from 9ki bitmap down to 3ki charset + matrix).  But like I said, I'm saving that for the future...

Although there was a noticeable difference between the PC's 1.000 pixel aspect ratio and NTSC's 0.82 pixel aspect ratio, after watching the demo in both modes, I can't say which one is better.  I guess it doesn't matter, as long as you're close!  I guess I should contact Algorithm to see what he thinks about the aspect ratio...
I'm kupo for kupo nuts!

Hydrophilic

I was right about the dithering!  I spent a few hours to re-code the encoder to simply copy (appropriate) dithered patterns.  This increased the compression from approximately 44% to 64%.  It also improved the average frame rate to about 1.29 fps (1.25 NTSC, 1.33 PAL).

In a few hours I made more improvement than in the past two weeks!  There is a lot of room for improvement.  For one thing, right now it only compares pixels on a per-card 'cell' basis (4*8 pixels, aligned with text-screen char positions).  There's no reason it couldn't do 4*6 or 4*4 'cells' (although 4*4 would probably provide little benefit).  Also, it doesn't need to be aligned with a text-character boundary (at least not vertically).  As if those possibilities weren't enough, the test video I'm using has a HUGE amount of motion in all the frames.  Since the code only tests a cell against the exact same position in other frames, only about 50% of each frame gets "the treatment."  Once I get motion vectors added, another 10% to 25% should also benefit from "the treatment" (dither optimization).

But it isn't all good news.  The main problem is the video now has dithered cells (areas 4*8 pixels) that often scream error next to other blocks (that were not copied from previous frames).

For example, imagine the original (without "treatment") was coded like this:
0 1 0 1   0 1 0 1
1 0 1 0   1 0 1 0
0 1 0 1   0 1 0 0
1 0 1 0   1 0 1 0
0 1 0 1   0 1 0 0
1 0 1 0   1 0 0 0
0 1 0 1   0 0 0 0
1 0 1 0   1 0 0 0
(the left cell is a 50% dithered image, the right is a dithered "triangle")
Now if there were no copy of "0 1 0 1 / 1 0 1 0" x 4 but there was the opposite, then the result would be
1 0 1 0   0 1 0 1
0 1 0 1   1 0 1 0
1 0 1 0   0 1 0 0
0 1 0 1   1 0 1 0
1 0 1 0   0 1 0 0
0 1 0 1   1 0 0 0
1 0 1 0   0 0 0 0
0 1 0 1   1 0 0 0
I guess my ASCII art doesn't convey the idea properly.  But anyway, in the first example, you should note there are never any pairs of "1 1" but there are four pairs in the second example.  Also in the first example there are no "0 0" pairs between the left and right cells, but there are four pairs in the second example.

Let me summarize for those confused (even me, a little).  If you take a dithered image and replace cells with those dithered from another frame, the result can produce distracting (annoying, trust me!) visual artifacts, even though each dithered cell (original and copy) by itself is a faithful reproduction of the original (undithered) cell.

So I significantly improved compression and decreased visual quality.  Well that is quite normal in the art/science of image/video compression!

So I thought, I should re-dither the areas around copied cells so that adjacent cells "mesh well" with the copied ones.  That should improve video quality... and since the adjacent cells didn't benefit from "the treatment" to begin with, there should be no loss in compression.  A win - win scenario!

While debugging my "improved" codec, I found a significant error in the dithering algorithm.  Well, not the algorithm, but my implementation.  Anyway, the basic principle is to propagate the "error" (the difference between the source video and the closest VIC-II color) to adjacent pixels.  I discovered my code was discarding about 50% of the error, resulting in reduced image quality.

On a scale of 20 (one-half the 8-bit luma difference between the two closest VIC-II grays), the original code had an error of about 14, which I think is pretty good considering the closest undithered color is on a scale of 40.

After fixing the error propagation, the error was reduced to approximately 9 out of 20.  That's a huge "improvement"!
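For the curious, the principle in one dimension looks something like this.  This is a hedged sketch in the Floyd-Steinberg spirit; the real implementation is 2-D and this is not Hydrophilic's code.  The point is simply that every unit of quantization error must be carried forward, or quality silently degrades.

```python
# 1-D error diffusion: quantize each luma to the palette, pushing the
# leftover error onto the next pixel.
def dither_row(lumas, palette):
    out, err = [], 0.0
    for v in lumas:
        target = v + err                              # source value plus carried error
        nearest = min(palette, key=lambda p: abs(p - target))
        out.append(nearest)
        err = target - nearest                        # propagate ALL of the error;
                                                      # the bug described amounted
                                                      # to roughly err /= 2 here
    return out
```

With full propagation, a row of mid-grays between two palette entries averages out exactly to the source value.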

Naturally I expected this to result in some file-size increase (or compression decrease).  The resulting size (and playback frame-rate) is what I had about 4 weeks ago!  ERRRRRRRRRRRR!

To add insult to injury, the resulting video actually appeared worse to me!  The residual error has decreased by at least 50%...  So how can that be?

I should probably post some screen shots, but sneaker net is out of order at the moment.  Anyway, the original (before "fixing") video had plenty of dithered areas, but also lots of "solid" (undithered) areas.  It was quite pleasing to the eye in my opinion.  Now (after "fixing"), almost every area of the screen is dithered... areas that were previously undithered are now "sparsely dithered" which is to say large areas are almost a solid color, but have seemingly random dots mixed in.

And so it goes... 1.5 steps forward and 1.0 steps back.  Any competent mathematician could tell you telescoping functions can take a long time to converge... much like this project!
I'm kupo for kupo nuts!

saehn

Keep it up Hydro, interesting project. Thought you might like to see this:

http://hitmen.c02.at/temp/palstuff/

Hydrophilic

Thanks for the link!  Coming from Hitmen, I was expecting a demo or something... I was pleasantly surprised to see an actual web page!  Not only do they demonstrate the different aspect ratios between PAL and NTSC, but they explain the notorious vertical lines of the VIC that are the bane of many C128 users.  I don't have a problem on my TV, and since I have a flat C128 I assume the VIC is using NMOS, not HMOS, technology... I'm guessing low contrast on my TV?

Thanks for the encouragement too!  I haven't worked on this for over a month (better things to do during the summer), but before I stopped, I did un-fix the "problem", so now the frame rate is back up to previously reported levels.  Even after implementing smooth scrolling and possibly smaller macroblocks, I worry that true bitmap mode will not meet my expectations with the audio IRQ running... which means no digi-audio, or no true bitmap mode (emulated char-bitmap is promising), or neither.  I'll keep everyone posted once I get back to this project.
I'm kupo for kupo nuts!

Hydrophilic

#32
Warning: another long post.

Now that my summer vacation is over (or winter vacation as they would say down under) I've taken another look at video on the C128 with Media Player 128 and the results are not promising...

Because of the continuous, significant, and yet variable overhead of the audio + JiffyDOS IRQ routine, I had never previously attempted to calculate (or even scientifically estimate) the maximum theoretical frame rate of my software.  I assumed that with better compression a reasonable frame rate could be achieved with (near) full screen bitmap video.

I've had time to mull over the idea this summer, and concluded I needed a solid estimate of the maximum frame rate (which is quite different from maximum data transfer speed or best video compression).

So I set up an experiment where the (IRQ) audio player + disk loader is constantly running at maximum capacity and the main thread is constantly "drawing" a new bitmap for each frame.

The "drawing" is a simple copy of data in memory from one of two off-screen bitmaps to the visible screen.  A simple copy is approximately (in CPU cycles) equal to the RLE/memory decompression method used for decoding real frames loaded from disk.  (I can explain this more for anyone who cares.)

And a simple copy is significantly faster than my planned "advanced" decompression, which would involve "macroblocks".

The sad result: with the 2-bit audio + JiffyDOS IRQ routine running at peak efficiency, the main code is only able to render at a frame rate of 2 frames per second.

  • P.S. I previously reported frame rates in excess of 2 fps, but this is due to luck: when an old bitmap in memory has the same (or close) data as the new image, no new data is written to RAM (P-frame encoding); this improves compression and frame rate in some cases, but is not possible for all new images; thus it cannot be used as a limiting factor.
This convinces me that full-screen bitmap video is not possible with both 2-bit audio and JiffyDOS running in the IRQ...  At least not with an acceptable frame rate, which I consider to be at least 10 frames/second.

The best option is, I think, to improve the overhead of the IRQ routine.  There are only two ways to do this: 1) eliminate the audio, or 2) use fast-serial instead of JiffyDOS

  • P.S. I don't think you can improve the performance of the 2-bit audio, as it is extremely simple (simpler than 4-bit audio, for those in the know).  And the JiffyDOS part is as fast as the JiffyDOS protocol will allow.
Option 1 is out for now because I have no intention of eliminating audio completely.  A "video" without audio cannot truly be called a video in the modern sense of the term.

Option 2, fast-serial hardware, has always been my preference on the C128.  Unfortunately, the C1571 lacks the capacity for a video of any reasonable length, and the uIEC / MMC / HD64 do not support fast-serial.  The C1581 does support fast-serial and has enough capacity for a limited amount of video, but I do not own one.  And even if I did, I hate disks!  Floppy disks, CDs, DVDs, you name it.  They always give me more grief than a memory card or HD ever has.  (Despite my meticulous efforts to not damage the disks.)

So it may seem this project has reached a dead end...

But options 1 and 2 were the best options, not the only options.

Another option, plan 67, is to forgo true bitmap mode in favor of emulated bitmap mode; this is simply text mode with a custom character set.  (Thanks saehn for the link to Sabrina - that demo validated my idea of emulated bitmap mode!)  This offers several advantages and problems.

The best advantage is that an emulated bitmap requires only 4 kiB of uncompressed data, a savings of over 50% (a real bitmap needs 9 to 10 kiB).  Another advantage is scrolling (or panning, as we say in video).

However, the problems are just as significant; probably greater.  First, 3 of the 4 multi-colors are fixed for the entire screen (in true bitmap mode only 1 color is fixed); this will reduce the quality of all but "lucky" images with a limited palette.  Another color problem is that the only color that can be manipulated "freely" (in char-cell terms) is limited to the first 8 of the 16 VIC colors.  Third is the limited pixel precision: a full text screen has 1000 cells but only 256 cell patterns; thus, unless the image has a highly redundant background (on the order of 3:1), the decoded image will have significant distortion.  Fourth, not relevant to the user but to me as the programmer, is writing a compressor to produce such a video file.
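A back-of-envelope check of the memory figures above, using standard VIC-II sizes (the exact split between matrix and Color RAM is my assumption):

```python
# Uncompressed per-frame memory: emulated bitmap vs. true multi-color bitmap.
charset  = 256 * 8             # 256 cell patterns x 8 bytes each = 2048 bytes
matrix   = 1000                # 40x25 screen codes
colors   = 1000                # per-cell Color RAM nybbles
emulated = charset + matrix + colors   # ~4 KiB
bitmap   = 8000 + 1000 + 1000          # bitmap + video matrix + Color RAM, ~10 KiB
print(emulated, bitmap)        # 4048 10000 -- the >50% savings cited above
```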

There are other options too.  Plan 74 is to wait for Craig Bruce or some other genius to release an affordable mass-storage device that supports fast-serial protocol.

Plan 83 is for me to finish and release HDserver128 which supports B-E, M-E, Custom ROMs, MFM, fast-serial, and a bunch of other stuff nobody has done to my knowledge... but at the moment I'm focused on Media Player 128.

So right now, I'm considering option 35.  This is to give up on video for the moment and concentrate on an audio-only player.  (If you're reading this, RobertB, I haven't forgotten about DigiMaster.)

I'll stop now before I bore everyone to death!

P.S. The good news.  Media Player 128 has run in the test mode now for over 2 hours without crashing!

Edit
5 hours later... still going... nothing outlasts the hydrogizer :)
/Edit
I'm kupo for kupo nuts!

RobertB

Quote from: Hydrophilic on October 03, 2010, 08:45 PMThis convinces me that full-screen bitmap video is not possible with both 2-bit audio and JiffyDOS running in the IRQ...  At least not with an acceptable frame rate, which I consider to be at least 10 frames/second.
I guess that you'd have to go with a less-than-fullscreen bitmap for extra frame rate speed.
Quote(If your reading this RobertB, I haven't forgotten about DigiMaster)
Hooray!

          Truly,
          Robert Bernardo
          Fresno Commodore User Group
          http://videocam.net.au/fcug
          The Other Group of Amigoids
          http://www.calweb.com/~rabel1/
          Southern California Commodore & Amiga Network
          http://www.sccaners.org

Payton Byrd

One option you did not mention is working with Ingo and Jim Brain on implementing a fast serial routine on the uIEC that would be sufficient for your needs.  This may require a bounty on commodorebounty.com but in the end such a routine could be a benefit to the whole community and if the computer side uses standard kernel calls then it could become a cart or internal ROM.
Payton Byrd
---------------------------------------------------
Flat 128 w/JD, uIEC/SD, uIEC-CF/IDE,
1541 Ultimate, 1571 w/JD, 1581 w/JD,
VDC-SVideo, Visio 19" HDTV

Hydrophilic

Well I did mention something similar to that; except like an idiot I referred to Craig Bruce instead of Jim Brain.   :-[   Thanks for bringing that to my attention.  I thought about adding fast-serial myself.  I downloaded the source code for the uIEC a while back and spent some time studying it.  There is plenty of unused Flash memory to code it.  But I think the SREQ line would need to be reconfigured for interrupts as I don't think the controller would be fast enough to catch the signals through polling... at least not until a BurstLoad command was issued.  I'll probably end up doing it if somebody doesn't beat me to it.

But I would rather work on Media Player for the moment.

Another idea that I didn't mention would be to reduce the sampling frequency.  With an 8 kHz rate (approx.) the IRQ is getting called every 2 rasters.  Reducing it to every 3 rasters would make a rate of approx. 5.24 kHz.  The issue is that the audio already sounds crummy, being only 2-bit samples.  I wouldn't want to make it any worse.  But since it sounds so bad to begin with, it might not make much difference.
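Those rates check out against the NTSC horizontal line rate (~15.73 kHz, a standard figure, assumed here):

```python
# Audio sample rate as a function of the raster-IRQ interval.
line_rate = 15734              # NTSC raster lines per second
print(line_rate / 2)           # ~7867 Hz: IRQ every 2 rasters, the "approx 8 kHz"
print(line_rate / 3)           # ~5245 Hz: every 3 rasters, the ~5.24 kHz figure
```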

Quote from: RobertBI guess that you'd have to go with a less-than-fullscreen bitmap for extra frame rate speed.

Now that sounds like a pretty good idea until I can get a fast-serial mass-storage device.  I was hoping to use 38x24 mode to efficiently implement smooth scrolling.  Using anything smaller would require software.  But I guess ya gotta do what ya gotta do...

So I re-ran my experiment with some different values.  Here are the results, including previous report:
Char Cells   Frames/Sec   MC bitmap   Hi-res bitmap
38 x 24      2.0          152x192     304x192
30 x 19      3.4          120x152     240x152
27 x 17      4.0          108x136     216x136

So it still doesn't look very promising.  But this is testing I-frames, the worst-case scenario.  With P-frames we can skip writing to RAM (and loading from disk) if the old data is the same as, or close to, what is needed.  Also, with a smaller vertical size, we can run the CPU at 2 MHz more often.  When I first implemented fast mode it added about 25% extra frame rate.  So I'm thinking with both benefits, we can have 6+ fps at the smaller size.  Then maybe with motion vectors / macroblocks it can get up to 8 or 9 fps.
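The P-frame skip described above can be sketched like this (the names and the tolerance knob are mine; the real encoder works on VIC memory directly):

```python
# Emit only the cells that differ enough from the previous frame; everything
# else is neither stored on disk nor rewritten to RAM at playback time.
def pframe_cells(prev_frame, new_frame, tolerance=0):
    """Yield (index, bytes) for each cell that must be rewritten."""
    for i, (old, new) in enumerate(zip(prev_frame, new_frame)):
        diff = sum(abs(a - b) for a, b in zip(old, new))
        if diff > tolerance:           # the "sloppy" comparison knob from the post
            yield i, new
```

Raising `tolerance` trades image quality for smaller files, which matches the 320-block vs 137-block experiment reported later in the thread.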

Of course, such a small video might not be worth watching.  Right now, despite the limited resolution of the VIC, at full screen you can still make out details on people's faces, like their lips and teeth and which way their eyes are looking.  With a reduced bitmap, I fear you'll only be able to see that it is a person who has eyes and maybe a mouth.  Well, I guess I won't know until I try!

I also tested this on my real C128 to make sure the fast mode code was really working.  I don't think I had tested it on real hardware since I implemented fast mode.  Well it works!  I also tested the Multicolor + ExtendedBG mode trick to make sure it hides all the "random" pixels that appear when the CPU is at 2MHz.  That works too!  So I'm off to code a smaller screen...
I'm kupo for kupo nuts!

RobertB

Quote from: Hydrophilic on October 05, 2010, 10:26 AMOf course such a small video might not be worth watching.  Right now, despite the limited resolution of the VIC, at full screen you can still make out details on peoples faces, like there lips and teeth and which way their eyes are looking.  With a reduced bitmap, I fear you'll only be able to see that it is person who has eyes and maybe a mouth.  Well guess I won't know until I try!
:)  Heh, when I played Gloom (a Doom clone for the Amiga 1200), I had to reduce the video playfield to a little square in the center of the screen to get any kind of scrolling speed.  With that small square, it was hard to recognize any of the objects.


Hydrophilic

#37
The results are still not promising.  With the smaller screen, the average frame rate is only about 1.5 fps.  I haven't implemented mid-screen 2MHz + MCM + EBM yet.  That might get the average rate up to 2.0 fps.  And you can still see that people have eyes and see when they talk, but I don't think there's enough detail to read lips.  (Not that you could do that at 2 Hz anyway.)

Based on this, I'm not sure what my previous experiment tells me... because if the average is 1.5 fps, then the minimum is much less; I don't have an exact value for the maximum frame rate, but it was pretty high, like 15+ fps.

I'm thinking the experiment tells me the maximum full-screen decompression rate, provided the compressed data is available fast enough... so it seems obvious to me the data rate is just too low...

I messed around with the encoding of cards / macroblocks in the P-frames and discovered some interesting things.  Depending on how "sloppy" the comparison function is made, the compressor can be made to copy huge portions of old data from RAM (instead of loading from disk).  For example, the best quality (virtually no loss within limits of the VIC2) produced a file size of about 320 blocks (5 seconds of video).  But by changing some constants, I got the file size down to 137 blocks.  Of course this noticeably reduced the quality, but it didn't result in a complete mess either.
Edit
You would think reducing the file size by 50% would increase the frame rate dramatically... but that small file only gave a tiny improvement... about 1.65 fps... arrrr!
/Edit

So I need to develop a way to change those constants into variables.  That way if a frame or two are really "hard" to encode, the quality of the image can be reduced to maintain a decent frame rate.

Similarly, the encoder is currently writing frames to the file at a constant rate.  If this could be made variable, say dropping a frame that's really "hard" to encode or dropping a frame to give more data to the previous frame that was "hard" to encode, it could improve the average frame rate... albeit at the cost of a more "jittery" frame rate.

So many variables, so little time!

Edit 2
Previously, before the new smaller screen size, the average frame rate was approximately 1.0 fps.  You would think that reducing the screen size (from 152x192=29.184kPix down to 108x136=14.688kPix) would significantly improve the frame rate... but it only improved by 50% (from 1.0 to 1.5 fps)
/Edit 2
I'm kupo for kupo nuts!

RobertB

Quote from: Hydrophilic on October 06, 2010, 12:34 PM...but it only improved by 50% (from 1.0 to 1.5 fps)
Hmm.
QuoteUnfortunately, the C1571 lacks the capacity for a video of any reasonable length, and the uIEC / MMC / HD64 do not support fast-serial.  The C1581 does support fast-serial and has enough capacity for a limited amount of video, but I do not own one.
FWIW, you can connect a CMD hard drive to a RAMLink via parallel cable, and the appropriate software can make use of the faster throughput (Wheels does this).  However, CMD hard drives and RAMLinks are not exactly widely-available.  FWIW #2, when using the IDE64, I've seen very fast video framerates with its movie player plug-in.  However, IDE64 is for the C64 (with the exception of being able to use C128 function keys).


RobertB


Quote from: me on October 06, 2010, 03:03 PM
However, IDE64 is for the C64 (with the exception of being able to use C128 function keys).
Today I received word from the IDE64 developer that he is working on a beta version of the firmware in which it can use the 2 MHz of the C128 while in C64 mode (the C64 screen is blanked using shift-lock).


gsteemso

I personally think that adding fast-serial support to the µIEC/SD2IEC project would be the most widely useful move. I’d do it myself if I thought I could get it working in less time than “a few years,” but as some of you may recall I am already way late with a variety of other projects, including a mostly unrelated µIEC enhancement. Adding more to my backlog queue would not be productive.


Does anyone think there would be much public support behind putting a bounty on adding the fast-serial feature? I am conflicted on the point; I’m in favour but have no money to put where my mouth is.


G.
The world's only gsteemso

Hydrophilic

I think a bounty for uIEC fast serial is a good idea, but I won't put up any money for something I plan on writing anyway... eventually... I have a large queue as well!

A CMD HD would be the way to go if they were more common and economical... and had a larger capacity...

The IDE64 sounds pretty good if it would run in 128 mode.  The problem with 64 mode (even if you have access to 2MHz, which the software uses) is the fixed Char ROM and no Color RAM banks.  Not to mention half the RAM, meaning less video buffering / P-frame encoding, thus larger files due to less compression (of course, a faster transfer rate may overcome the larger file size).

I've finally coded and debugged the within-visible-screen 2MHz audio + JiffyDOS IRQ... what a pain!  The IRQ routines are like a house of cards: make one small change and the whole thing collapses.  I thought I was hiding the 2MHz garbage with the MCM+EBM trick, but I was actually using the Bitmap+EBM trick.  They both do the same thing (force video output to black), except the Bitmap+EBM method is faster because both bits are in the same register.

Anyway, this boosted the average frame rate up to 2.0 fps NTSC, 2.1 fps PAL.  I also took the time to determine the best and worst frame rates.  The lowest was about 1.3 fps and the highest was 6.0 fps.  Previously I reported 15+ fps, but that was just my impression... I was wrong, there I said it.

It seems like the median frame rate was about 2.2fps (I didn't take the time to calculate the median).  More importantly, I noticed the encoder is not doing a very good job.  For example, whenever there was a scene change, that is when I would get the below average (1.3fps) frame rate, which is to be expected unless you seriously compromise the quality of the image.  And whenever there was little movement that is when I got those above average (6.0fps) frame rates, also to be expected.  But, often when there was only "modest" movement, the frame rate would still drop below average (1.5fps or so).

I'm going to work some more on the encoder.  Hopefully I can get the scenes with "modest" movement up above the median.  I'm thinking that would give an average of 3 to 4 fps.  Then I can work on the "sloppy" control factor for a little more speed.  Unfortunately it seems this will never get above 5 fps without a hardware solution.
I'm kupo for kupo nuts!

redrumloa

Quote from: Payton Byrd on October 04, 2010, 11:24 PM
One option you did not mention is working with Ingo and Jim Brain on implementing a fast serial routine on the uIEC that would be sufficient for your needs.  This may require a bounty on commodorebounty.com but in the end such a routine could be a benefit to the whole community and if the computer side uses standard kernel calls then it could become a cart or internal ROM.

I'd be very interested in that bounty, speaking for Commodore Bounty.

Hydrophilic

I've implemented a limited form of motion vectors and virtualized the encoding software to improve P-Frame encoding.  I was hoping this would give a nice improvement, but has only given a small one.  Both NTSC and PAL average 2.2 fps.  Previously PAL enjoyed a speed advantage, but with the increased 2MHz zone (reduced video size), NTSC is now almost as fast (difference less than 0.1 fps)

I've tried various things to improve the compression and hopefully the frame rate as a consequence.  Although I made little improvement, I learned a lot about how to make things worse!

Maybe somebody has an idea to explain this... I get better compression and frame rate with motion vectors limited to less than 8 rasters.  Here is an example table of file size versus search range (rng) and block size to copy...

Block Size   rng=8   rng=7   rng=6   rng=5   rng=4
2x4          150k    100k    110k    120k    130k
3x4          160k    110k    120k    130k    140k
4x4          170k    120k    130k    140k    150k
5x4          180k    130k    140k    150k    160k
This is just an example; the real data is much larger and not so pretty.  But you should see that the file size decreases with smaller blocks and a larger search range, until an 8-raster vertical search is reached, and then the size jumps considerably.  Ideas?

I did find it is important that if the block is to be copied from an area that crosses a cell row (8 full rasters, corresponding to 1 text character row), it should not be copied.  Crossing reduces compression because of the way video memory is laid out.  So the encoder checks for this and avoids it.  But if the block to copy does not cross a cell row, it should improve compression no matter "how far away" it is... yet according to the data, there is a huge "penalty" if the block is 8 rasters away... bizarre.

This is important because compression should improve if blocks can be copied (moved) from arbitrarily far away.  Right now the search range is limited to 7 rasters to avoid the "penalty".  This is vertical motion only; horizontal motion is not implemented.
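A sketch of that search and the cell-row constraint (my reconstruction; the block sizes and the cost function are placeholders, not the real encoder's):

```python
# Vertical-only motion search: reject candidate source positions whose block
# would straddle an 8-raster cell row, since VIC-II bitmap layout makes such
# copies expensive to decode.
def crosses_cell_row(y, block_h):
    return (y // 8) != ((y + block_h - 1) // 8)

def best_vertical_match(cost, y, block_h, search_range=7):
    """cost(dy) scores copying from dy rasters away; range stays under 8."""
    candidates = [dy for dy in range(-search_range, search_range + 1)
                  if dy and not crosses_cell_row(y + dy, block_h)]
    return min(candidates, key=cost, default=None)
```

Note that with `search_range=7` a candidate 8 rasters away is never even considered, which sidesteps the mysterious penalty at the cost of some compression.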

Related to that is updating the encoder's "memory" of the scene after moving blocks around.  That was a bit of programming fun.  I noticed through a programming error that if the block's "memory" of the "new" value is not correct, the compression actually improves...

So I tried various methods of lying about the copied value.  For example, saying that 50% comes from the real source and 50% comes from the old copy being used.  I tried various fractions and limits... but in the end it turns out that the best compression occurs when you lie 100%.  Always say the copied value is the real value.  I guess it goes to show it's okay to lie if you do it consistently.
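The blending idea can be sketched like this (again a hypothetical illustration, not the encoder's actual code): a `lie` fraction of 1.0 records the copied screen value as-is, which is what compressed best, while smaller fractions blend in the true source-video pixel.

```python
def update_memory_after_copy(memory, true_frame, x, dst_y, src_y, w, h,
                             lie=1.0):
    # memory:     the encoder's model of what is on the C128 screen
    # true_frame: the actual source-video pixels for this frame
    # lie=1.0 means "always say the copied value is the real value",
    # which empirically gave the best compression; lie < 1.0 blends
    # the truth back in (e.g. 0.5 is the 50/50 experiment above).
    for r in range(h):
        for c in range(w):
            copied = memory[src_y + r][x + c]
            truth = true_frame[dst_y + r][x + c]
            memory[dst_y + r][x + c] = lie * copied + (1 - lie) * truth
```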

Another thing I worked on was trying to optimize the color palette.  Again I made a marginal improvement and found a lot of ways to make things worse.

The one thing that actually worked was to simply enforce a consistent palette as much as possible.  Previously only multi-color 3 (in Color RAM, as opposed to system RAM) was fixed, and the encoder would choose the other 3 colors to reduce the overall error.  The entries were ordered by priority, so some frames might have "black" as color 0 and some frames would have "dark gray" as color 0... even when they all used both "black" and "dark gray".  So now if the encoder finds the same colors in consecutive frames, it makes sure the palette entries of the new frames match the previous frame.
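The slot-matching idea might look something like this sketch (hypothetical names; the real encoder's data structures will differ): any color that was in the previous frame's palette keeps its old slot, and only genuinely new colors get dropped into the remaining entries.

```python
def stabilize_palette(prev_pal, new_colors, fixed):
    # prev_pal:   last frame's 4-entry palette, e.g. [fixed, c1, c2, c3]
    # new_colors: the colors chosen for the new frame (any order)
    # fixed:      the multi-color 3 entry held in Color RAM
    pal = [fixed, None, None, None]
    leftovers = [c for c in new_colors if c != fixed]
    # First pass: keep colors in the slots they occupied last frame,
    # so identical color sets produce byte-identical palette data.
    for i in range(1, 4):
        if prev_pal[i] in leftovers:
            pal[i] = prev_pal[i]
            leftovers.remove(prev_pal[i])
    # Second pass: fill any remaining slots with the new colors.
    for i in range(1, 4):
        if pal[i] is None and leftovers:
            pal[i] = leftovers.pop(0)
    return pal
```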

Now that makes sense; because the data is more consistent, the compression and frame rate are better.  Unexpected was the improvement from choosing a bad palette.

Originally the program would test each color against the entire bitmap, assign an error to each color, and then use the 3 colors with the lowest error (the 4th color is fixed, remember).  This method is fast and simple, but the quality of the images (based on final VIC-II luma versus source video luma) is not very good... so I tried different algorithms to get a better palette.  Instead of a simple sum, I tried a sum of the squared errors.  I also tried calculating the error of color n+1 based on the fact that color n was also available.
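The original "fast and simple" scoring amounts to something like this sketch (the luma table and names are placeholders, not the encoder's actual values): each candidate color gets the summed absolute luma error over the whole bitmap, and the three lowest-error colors join the fixed fourth entry.

```python
def pick_palette(bitmap, vic_lumas, fixed):
    # Score each VIC-II color by its total absolute luma error
    # against every pixel of the source bitmap, then keep the
    # three lowest-error colors alongside the fixed entry.
    scores = []
    for color, luma in vic_lumas.items():
        if color == fixed:
            continue
        err = sum(abs(px - luma) for row in bitmap for px in row)
        scores.append((err, color))
    scores.sort()
    return [fixed] + [color for _, color in scores[:3]]
```

The squared-error variant mentioned above would just replace `abs(px - luma)` with `(px - luma) ** 2`; interestingly, that "better" metric produced heavier dithering and larger files.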

Well, these methods gave a "better" image in terms of overall luma error.  But they considerably increased the file size.  The images also appeared less pleasing because they had large amounts of dithering.  The large amount of dithering would explain the increased file size.

The original "bad" palette optimization gives the best compression, and in my opinion, better images.  Go figure.

Well, I'll try and get some screen shots up for you this weekend.  But you need to see it in action, if you can call 2.2Hz action :)  Some of the scenes look like a typical VIC-II dithered image, but some are really nice.  My favorite parts are the screen dissolves.  I always think for an instant my program has crashed, and then the next scene magically appears.

Strangely, you would think the screen dissolves/fades would suffer low compression or 'pixelation' due to lots of dithering between two different scenes.  Yet these are as fast, often faster, than normal frames in the video.  Go figure.

I'm gonna work on hi-res bitmaps instead of multi-color and try some other ideas to improve the frame rate.  Might have an actual working demo by Christmas.
I'm kupo for kupo nuts!

Hydrophilic

#44
I got hi-res working and totally re-coded the video encoder to be more structured... the (original) single-file source of over 100k was getting a bit unwieldy... now it is way over 100k (multiple files) but is easier to manage... took me quite a while to squash all the memory leaks I created in the process...

The main benefit, besides better organization, is that now I can "parameterize" all those weird/useful compression optimizations I spoke about in previous posts.  So hopefully the encoder will be able to produce an optimal (or nearly optimal) video through some iteration/constraint type algorithm.

Anyway, as promised, I've got some screen shots for you.  I've got so much data, I can't think of a good way to present it...  Edit Actually I can't put that much data in one post, so you get the short, short version/Edit Let's go in this order:

Size (H x V)
Mode (Hi-Res, Multi-Color)
Quality (P-Frame, I-Frame)

In regards to Size, the first is my first video at 40x23 cards (160x184 MC or 320x184 HR), which I thought would be close to 4:3 aspect ratio but was wrong according to responses from forum members (thanks to all who replied!)

WARNING: do not enlarge the C128 images, it only makes them look worse... trust me! :)

Oh, I said this in previous post(s), but the images are limited to VIC-II gray scale... color is on my to-do list  :)

Hydrophilic

Here is a set of images in the smaller format to allow faster frame rate due to smaller data size and more 2MHz time.  These were captured outside of the video player so the 2MHz mid-screen borders are not present.

The size is 27x17 cards (108x136 MC or 216x136 HR).  This gives nearly perfect 4:3 aspect ratio on a real C128.  When viewing them on a PC with perfectly square pixels, the image will appear a little wide.
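The arithmetic behind that claim can be checked with a quick sketch, using the roughly 0.82 pixel aspect ratio mentioned earlier in this thread (an approximation, not an exact figure):

```python
# CBM hi-res pixels are narrower than tall on a real 4:3 monitor;
# ~0.82 (width/height) is the scene-standard approximation.
PIXEL_ASPECT = 0.82

def display_aspect(width_px, height_px, pixel_aspect=PIXEL_ASPECT):
    # Displayed aspect = (pixels wide * pixel aspect) / pixels tall.
    return width_px * pixel_aspect / height_px

# 27x17 cards = 216x136 hi-res pixels:
print(round(display_aspect(216, 136), 2))       # ~1.30, close to 4:3 (1.33)
print(round(display_aspect(216, 136, 1.0), 2))  # 1.59 on square PC pixels
```

The 108x136 multi-color version comes out the same, since its fat pixels are twice as wide.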

WARNING: do not enlarge the C128 images, it only makes them look worse... trust me!  :)

Hydrophilic

The third Size is wide-screen format.  This gives a good 16:9 aspect ratio on a real C128.  Again, when viewed on a PC with square pixels, the images appear too wide.

WARNING: do not enlarge the C128 images, it only makes them look worse... trust me!  :)

I could spam the boards all day, but instead, if you want more, you can download a zip file (2.58 MiB) which contains over 70 images... multiple samples from each screen size and the original images.  The *pH files are mid-quality (P-Frame) hi-res; the *iH files are high-quality (I-Frame) hi-res; the *p files are P-Frames in multi-color, and the *i files are I-Frames in multi-color; those without i/p(H) are the originals.

Hydrophilic

#47
What... no comments???

I guess I have to take it to the 4th dimension (color).  Stay tuned...

RobertB

Quote from: Hydrophilic on October 18, 2010, 04:18 AMThe third Size is wide-screen format.  This gives a good 16:9 aspect ratio on a real C128.
If you go to the letterbox format, it looks as if you would have room at the bottom of the screen for other information.

          Nah, not scrolly greetz,  ;)
          Robert Bernardo
          Fresno Commodore User Group
          http://videocam.net.au/fcug
          The Other Group of Amigoids
          http://www.calweb.com/~rabel1/
          Southern California Commodore & Amiga Network
          http://www.sccaners.org

saehn

Quote from: Hydrophilic on October 19, 2010, 04:41 AM
What... no comments???

I guess I have to take it to the 4th dimension (color).  Stay tuned...

It looks great, even in B/W! I think I prefer the multicolor over hi-res. It's a little more blocky, but I bet that wouldn't matter much in motion.