summaryrefslogtreecommitdiffstats
path: root/gst/dvdspu/Notes.txt
diff options
context:
space:
mode:
Diffstat (limited to 'gst/dvdspu/Notes.txt')
-rw-r--r--gst/dvdspu/Notes.txt324
1 files changed, 324 insertions, 0 deletions
diff --git a/gst/dvdspu/Notes.txt b/gst/dvdspu/Notes.txt
new file mode 100644
index 00000000..a0e56e3c
--- /dev/null
+++ b/gst/dvdspu/Notes.txt
@@ -0,0 +1,324 @@
+
+ DVD subtitles
+ ---------------
+
+
+ 0. Introduction
+ 1. Basics
+ 2. The data structure
+ 3. Reading the control header
+ 4. Decoding the graphics
+ 5. What I do not know yet / What I need
+ 6. Thanks
+ 7. Changes
+
+
+
+
+
+The latest version of this document can be found here:
+http://www.via.ecp.fr/~sam/doc/dvd/
+
+
+
+
+
+0. Introduction
+
+ One of the last things we missed in DVD decoding under my system was the
+decoding of subtitles. I found no information on the web or Usenet about them,
+apart from a few words on them being run-length encoded in the DVD FAQ.
+
+ So we decided to reverse-engineer their format (it's completely legal in
+France, since we did it on interoperability purposes), and managed to get
+almost all of it.
+
+
+
+
+
+1. Basics
+
+ DVD subtitles are hidden in private PS packets (0x000001ba), just like AC3
+streams are.
+
+ Within the PS packet, there are PES packets, and like AC3, the header for the
+ones containing subtitles have a 0x000001bd header.
+ As for AC3, where there's an ID like (0x80 + x), there's a subtitle ID equal
+to (0x20 + x), where x is the subtitle ID. Thus there seems to be only
+16 possible different subtitles on a DVD (my Taxi Driver copy has 16).
+
+ I'll suppose you know how to extract AC3 from a DVD, and jump to the
+interesting part of this documentation. Anyway you're unlikely to have
+understood what I said without already being familiar with MPEG2.
+
+
+
+
+
+2. The data structure
+
+A subtitle packet, after its parts have been collected and appended, looks
+like this :
+
+ +----------------------------------------------------------+
+ | |
+ | 0 2 size |
+ | +----+------------------------+-----------------+ |
+ | |size| data packet | control | |
+ | +----+------------------------+-----------------+ |
+ | |
+ | a subtitle packet |
+ | |
+ +----------------------------------------------------------+
+
+size is a 2 bytes word, and data packet and control may have any size.
+
+
+Here is the structure of the data packet :
+
+ +----------------------------------------------------------+
+ | |
+ | 2 4 S0+2 |
+ | +----+------------------------------------------+ |
+ | | S0 | data | |
+ | +----+------------------------------------------+ |
+ | |
+ | the data packet |
+ | |
+ +----------------------------------------------------------+
+
+S0, the data packet size, is a 2 bytes word.
+
+
+Finally, here's the structure of the control packet :
+
+ +----------------------------------------------------------+
+ | |
+ | S0+2 S0+4 S1 size |
+ | +----+---------+---------+--+---------+--+---------+ |
+ | | S1 |ctrl seq |ctrl seq |..|ctrl seq |ff| end seq | |
+ | +----+---------+---------+--+---------+--+---------+ |
+ | |
+ | the control packet |
+ | |
+ +----------------------------------------------------------+
+
+To summarize :
+
+ - S1, at offset S0+2, the position of the end sequence
+ - several control sequences
+ - the 'ff' byte
+ - the end sequence
+
+
+
+
+
+3. Reading the control header
+
+The first thing to read is the control sequences. There are several
+types of them, and each type is determined by its first byte. As far
+as I know, each type has a fixed length.
+
+ * type 0x01 : '01' - 1 byte
+ it seems to be an empty control sequence.
+
+ * type 0x03 : '03wxyz' - 3 bytes
+ this one has the palette information ; it basically says 'encoded color 0
+ is the wth color of the palette, encoded color 1 is the xth color, aso.
+
+ * type 0x04 : '04wxyz' - 3 bytes
+ I *think* this is the alpha channel information ; I only saw values of 0 or f
+ for those nibbles, so I can't really be sure, but it seems plausable.
+
+ * type 0x05 : '05xxxXXXyyyYYY' - 7 bytes
+ the coordinates of the subtitle on the screen :
+ xxx is the first column of the subtitle
+ XXX is the last column of the subtitle
+ yyy is the first line of the subtitle
+ YYY is the last line of the subtitle
+ thus the subtitle's size is (XXX-xxx+1) x (YYY-yyy+1)
+
+ * type 0x06 : '06xxxxyyyy' - 5 bytes
+ xxxx is the position of the first graphic line, and yyyy is the position of
+ the second one (the graphics are interlaced, so it helps a lot :p)
+
+The end sequence has this structure:
+
+ xxxx yyyy 02 ff (ff)
+
+ it ends with 'ff' or 'ffff', to make the whole packet have an even length.
+
+FIXME: I absolutely don't know what xxxx is. I suppose it may be some date
+information since I found it nowhere else, but I can't be sure.
+
+ yyyy is equal to S1 (see picture).
+
+
+Example of a control header :
+----
+0A 0C 01 03 02 31 04 0F F0 05 00 02 CF 00 22 3E 06 00 06 04 E9 FF 00 93 0A 0C 02 FF
+----
+Let's decode it. First of all, S1 = 0x0a0c.
+
+The control sequences are :
+ 01
+ Nothing to say about this one
+ 03 02 31
+ Color 0 is 0, color 1 is 2, color 2 is 3, and color 3 is 1.
+ 04 0F F0
+ Colors 0 and 3 are transparent, and colors 2 and 3 are opaque (not sure of this one)
+ 05 00 02 CF 00 22 3E
+ The first column is 0x000, the last one is 0x2cf, the first line is 0x002, and
+ the last line is 0x23e. Thus the subtitle's size is 0x2d0 x 0x23d.
+ 06 00 06 04 E9
+ The first encoded image starts at offset 0x006, and the second one starts at 0x04e9.
+
+And the end sequence is :
+ 00 93 0A 0C 02 FF
+ Which means... well, not many things now. We can at least verify that S1 (0x0a0c) is
+ there.
+
+
+
+
+
+4. Decoding the graphics
+
+ The graphics are rather easy to decode (at least, when you know how to do it - it
+ took us one whole week to figure out what the encoding was :p).
+
+ The picture is interlaced, for instance for a 40 lines picture :
+
+ line 0 ---------------#----------
+ line 2 ------#-------------------
+ ...
+ line 38 ------------#-------------
+ line 1 ------------------#-------
+ line 3 --------#-----------------
+ ...
+ line 39 -------------#------------
+
+ When decoding you should get:
+
+ line 0 ---------------#----------
+ line 1 ------------------#-------
+ line 2 ------#-------------------
+ line 3 --------#-----------------
+ ...
+ line 38 ------------#-------------
+ line 39 -------------#------------
+
+ Computers with weak processors could choose only to decode even lines
+ in order to gain some time, for instance.
+
+
+ The encoding is run-length encoded, with the following alphabet:
+
+ 0xf
+ 0xe
+ 0xd
+ 0xc
+ 0xb
+ 0xa
+ 0x9
+ 0x8
+ 0x7
+ 0x6
+ 0x5
+ 0x4
+ 0x3-
+ 0x2-
+ 0x1-
+ 0x0f-
+ 0x0e-
+ 0x0d-
+ 0x0c-
+ 0x0b-
+ 0x0a-
+ 0x09-
+ 0x08-
+ 0x07-
+ 0x06-
+ 0x05-
+ 0x04-
+ 0x03--
+ 0x02--
+ 0x01--
+ 0x0000
+
+ '-' stands for any other nibble. Once a sequence X of this alphabet has
+ been read, the pixels can be displayed : (X >> 2) is the number of pixels
+ to display, and (X & 0x3) is the color of the pixel.
+
+ For instance, 0x23 means "8 pixels of color 3".
+
+ "0000" has a special meaning : it's a carriage return. The decoder should
+ do a carriage return when reaching the end of the line, or when encountering
+ this "0000" sequence. When doing a carriage return, the parser should be
+ reset to the next even position (it cannot be nibble-aligned at the start
+ of a line).
+
+ After a carriage return, the parser should read a line on the other
+ interlaced picture, and swap like this after each carriage return.
+
+ Perhaps I don't explain this very well, so you'd better have a look at
+ the enclosed source.
+
+
+
+
+
+5. What I do not know yet / What I need
+
+I don't know what's in the end sequence yet.
+
+Also, I don't know exactly when to display subtitles, and when to remove them.
+
+I don't know if there are other types of control sequences (in my programs I consider
+0xff as a control sequence type, as well as 0x02. I don't know if it's correct or not,
+so please comment on this).
+
+I don't know what the "official" color palette is.
+
+I don't know how to handle transparency information.
+
+I don't know if this document is generic enough.
+
+So what I need is you :
+
+ - if you can, patch this document or my programs to fix strange behaviour with your subtitles.
+
+ - send me your subtitles (there's a program to extract them enclosed) ; the first 10 KB
+ of subtitles in a VOB should be enough, but it would be cool if you sent me one subtitle
+ file per language.
+
+
+
+
+
+6. Thanks
+
+ Thanks to Michel Lespinasse <walken@via.ecp.fr> for his great help on understanding
+the RLE stuff, and for all the ideas he had.
+
+ Thanks to mass (David Waite) and taaz (David I. Lehn) from irc at
+openprojects.net for sending me their subtitles.
+
+
+
+
+
+7. Changes
+
+ 20000116: added the 'changes' section.
+ 20000116: added David Waite's and David I. Lehn's name.
+ 20000116: changed "x0" and "x1" to "S0" and "S1" to make it less confusing.
+
+
+
+
+--
+Paris, January 16th 2000
+Samuel Hocevar <sam@via.ecp.fr>