Subtitles
=========

1. Problem
GStreamer currently does not support subtitles.

2. Proposed solution
- Elements
- Text-overlay
- Autoplugging
- Scheduling
- Stream selection

The first thing we'll need is subtitle awareness. I'll focus on AVI/MKV/OGM
here, because I know how those work. The same methods apply to DVD subtitles
as well. The Matroska and Ogg demuxers will need subtitle awareness; for AVI,
this is not needed. In addition, we'll need subtitle stream parsers (for all
popular subtitle formats) that can deal with both pre-parsed streams (MKV,
OGM) and .sub file chunks (AVI). Sample code is available in
gst-sandbox/textoverlay/.
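
To make the demuxer-side hookup concrete, here is a minimal sketch using
present-day GStreamer API (which postdates this document); the matroskademux
"subtitle_%u" pad naming and the downstream parser/overlay branch are
assumptions, not something this proposal defines:

#include <gst/gst.h>

/* Connected with:
 *   g_signal_connect (demux, "pad-added", G_CALLBACK (pad_added_cb), branch);
 * where "branch" is whatever parses/renders the subtitle stream. */
static void
pad_added_cb (GstElement *demux, GstPad *pad, gpointer user_data)
{
  GstElement *subtitle_branch = GST_ELEMENT (user_data);
  gchar *name = gst_pad_get_name (pad);

  if (g_str_has_prefix (name, "subtitle_")) {
    GstPad *sink = gst_element_get_static_pad (subtitle_branch, "sink");

    if (!GST_PAD_IS_LINKED (sink))
      gst_pad_link (pad, sink);
    gst_object_unref (sink);
  }
  g_free (name);
}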

Secondly, we'll need a textoverlay filter that takes text and video inputs
and blends the text onto the video. We have several such elements (e.g. the
cairo-based element) in gst-plugins already; those might need some updates to
work exactly as expected.
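
For the overlay itself, here is a minimal sketch of what using such an
element looks like, assuming present-day element names (textoverlay,
videoconvert, autovideosink) purely for illustration:

#include <gst/gst.h>

int
main (int argc, char **argv)
{
  GstElement *pipeline;
  GError *error = NULL;

  gst_init (&argc, &argv);

  /* Blend a fixed string onto a test video stream. */
  pipeline = gst_parse_launch (
      "videotestsrc ! textoverlay text=\"Hello subtitles\" ! "
      "videoconvert ! autovideosink", &error);
  if (pipeline == NULL) {
    g_printerr ("parse error: %s\n", error->message);
    return 1;
  }

  gst_element_set_state (pipeline, GST_STATE_PLAYING);
  /* ... run a main loop here, then shut down ... */
  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (pipeline);
  return 0;
}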

Thirdly, playbin will need to handle all of that. We expect subtitle streams
to end up as sub-images or as plain text (or XHTML text). Note that playbin
should also give access to the raw, unrendered subtitle text (if available)
for accessibility purposes.
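
What that could look like from an application's point of view, assuming the
property names playbin later ended up with ("uri", "suburi", "current-text");
these names are illustrative, not part of this proposal:

#include <gst/gst.h>

int
main (int argc, char **argv)
{
  GstElement *playbin;

  gst_init (&argc, &argv);

  playbin = gst_element_factory_make ("playbin", NULL);
  g_object_set (playbin,
      "uri", "file:///path/to/movie.mkv",
      "suburi", "file:///path/to/movie.srt",  /* optional external subtitles */
      NULL);

  gst_element_set_state (playbin, GST_STATE_PLAYING);

  /* Later on, switch to the second embedded subtitle stream (index 1). */
  g_object_set (playbin, "current-text", 1, NULL);

  /* ... run a main loop here, then shut down ... */
  gst_element_set_state (playbin, GST_STATE_NULL);
  gst_object_unref (playbin);
  return 0;
}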

A problem that pops up is that subtitles are not continuous streams. This is
especially noticeable in the MKV/OGM case, because there the arrival of data
depends on the other streams, so an element will only notice a delay once it
has received the next data chunk. There are two possible solutions: using
timestamped filler events or using decoupled subtitle overlay elements (bins,
probably). The difficulty with the first is that it only works well in the
AVI/.sub case, where we will notice discontinuities before they become
problematic. The second is more difficult to implement, but works for both
cases.
A) fillers
Imagine that two subtitles follow each other with 10 seconds of no-data in
between. When parsing a .sub file, we would notice this immediately and could
send a filler event (or empty data) with a timestamp and duration covering
the gap.
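
As a concrete illustration: the GAP event that GStreamer later grew plays
exactly this filler role, and a demuxer or parser that notices such a hole
could do something along these lines (the pad and timestamp variables, and
the 10-second gap, are placeholders):

#include <gst/gst.h>

/* Sketch: announce that there is no subtitle data for the next 10 seconds,
 * so downstream elements don't sit waiting for a buffer that never comes. */
static gboolean
push_filler (GstPad *srcpad, GstClockTime last_stop)
{
  GstEvent *gap = gst_event_new_gap (last_stop, 10 * GST_SECOND);

  return gst_pad_push_event (srcpad, gap);
}
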
B) decoupled
Imagine this text element:
          --------------------
video ----|  actual element  |---- out
          |                  |
text - - -|                  |
          --------------------
where the text pad is decoupled, like a queue. When no text data is
available, the pad simply hasn't received anything, and the element renders
no subtitles. The actual element can be a bin here, containing another
subtitle rendering element. Disadvantage: it requires threading, and the
element itself is (in concept) kinda gross. The element can be embedded in
playbin to hide this fact (i.e. not be made available outside the scope of
playbin).
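
A rough sketch of what such a bin could look like with the API GStreamer
ended up with; the element and pad names (textoverlay, text_sink, video_sink)
are assumptions, and the queue is what provides the decoupling/threading:

#include <gst/gst.h>

static GstElement *
make_decoupled_overlay (void)
{
  GstElement *bin = gst_bin_new ("suboverlay");
  GstElement *queue = gst_element_factory_make ("queue", NULL);
  GstElement *overlay = gst_element_factory_make ("textoverlay", NULL);
  GstPad *pad;

  gst_bin_add_many (GST_BIN (bin), queue, overlay, NULL);
  /* The queue gives the text branch its own thread, so a starved text
   * stream never blocks the video path. */
  gst_element_link_pads (queue, "src", overlay, "text_sink");

  /* Expose video in, text in (through the queue) and video out. */
  pad = gst_element_get_static_pad (overlay, "video_sink");
  gst_element_add_pad (bin, gst_ghost_pad_new ("video", pad));
  gst_object_unref (pad);

  pad = gst_element_get_static_pad (queue, "sink");
  gst_element_add_pad (bin, gst_ghost_pad_new ("text", pad));
  gst_object_unref (pad);

  pad = gst_element_get_static_pad (overlay, "src");
  gst_element_add_pad (bin, gst_ghost_pad_new ("out", pad));
  gst_object_unref (pad);

  return bin;
}
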
Whichever solution we take, it'll require effort from the implementer.
Scheduling (process, not implementation) knowledge is assumed.

Stream selection is a problem that audio has, too. We'll need a solution for
this at the playback-bin level, e.g. in playbin. By muting all unused streams
and dynamically unmuting the selected stream, this is easily solved. Note
that synchronization needs to be checked in this case. The solution is not
hard, but someone has to do it.
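
For illustration, muting/unmuting maps nicely onto what later became the
input-selector element: the playback bin keeps one selector per stream type
and just flips its active pad. The names below are illustrative, not part of
this proposal:

#include <gst/gst.h>

/* Sketch: switch the active subtitle stream by changing which sink pad of
 * an input-selector is active; all other inputs are effectively muted. */
static void
select_subtitle_stream (GstElement *selector, GstPad *chosen_sink_pad)
{
  g_object_set (selector, "active-pad", chosen_sink_pad, NULL);
}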

3. Written by
Ronald S. Bultje <rbultje@ronald.bitfreak.net>, Dec. 25th, 2004


Appendix A: random IRC addition
<Company> interesting question: would it be a good idea to have a "max-buffer-length" property?
<Company> that way demuxers would know how often they'd need to generate filler events
<BBB> hm...
<BBB> I don't think it's good to make that variable
<Company> dunno
<Company> (i'm btw always looking at this from the midi perspective, too)
<Company> (because both subtitles and midi are basically the same in this regard)
<BBB> and do you mean 'after the stream has advanced <time> and we didn't read a new subtitle in this mkv stream, we should send a filler'?
<Company> yeah
<BBB> it goes for avi with large init_delay values, too
<Company> so you don't need to send fillers every frame
<BBB> right
<BBB> can't we just set that to, for example, 1s?
<BBB> it's fairly random, but still
<Company> that's another option, too
<Company> though you could write all file parsers with max-delay=MAXINT
<Company> would make them a lot easier
<BBB> it's true that queue size, for example, depends on this value
<BBB> e.g. if you make this 5s and set queue size to 1s, it'll hang
<Company> right
<BBB> whereas if you set it to 1s and queue size to 5s, you waste space
<BBB> :)
<BBB> you ought to set it to max-delay * (n_streams + 1)
<BBB> or so
<BBB> or -1
<BBB> I forgot
<BBB> ohwell
<Company> if you'd use filtercaps and queue sizes in your app, you could at least work around deadlocks
<BBB> yeah
<Company> though ideally it should just work of course...
<BBB> good point...
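
For what it's worth, the queue-size concern above maps onto the
"max-size-time" property that the queue element provides; a rough sketch of
sizing a queue against an assumed worst-case gap (max_delay is a placeholder,
not a property this document defines):

#include <gst/gst.h>

/* Sketch: make the queue able to absorb at least one worst-case gap between
 * subtitle buffers/fillers, and let time be the only limit. */
static void
size_queue_for_sparse_stream (GstElement *queue, GstClockTime max_delay)
{
  g_object_set (queue,
      "max-size-time", max_delay,
      "max-size-buffers", 0,
      "max-size-bytes", 0,
      NULL);
}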