 Subtitles
 =========

 1. Problem
 GStreamer currently does not support subtitles.

 2. Proposed solution
   - Elements
   - Text-overlay
   - Autoplugging
   - Scheduling
   - Stream selection

 The first thing we'll need is subtitle awareness. I'll focus on AVI/MKV/OGM
 here, because I know how those work. The same methods apply to DVD subtitles
 as well. The matroska demuxer (and Ogg) will need subtitle awareness. For
 AVI, this is not needed. We'll also need subtitle stream parsers (for all
 popular subtitle formats) that can deal with both parsed streams (MKV, OGM)
 and .sub file chunks (AVI). Sample code is available in
 gst-sandbox/textoverlay/.
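
 As an illustration of the ".sub file chunks" case, here is a minimal sketch
 of a parser that accepts input in arbitrary chunks and only emits a cue once
 a full line has arrived. It is not the gst-sandbox code. The line format
 ({start_frame}{end_frame}text, with "|" as a line break), the fixed frame
 rate and the names Cue and ChunkedSubParser are all assumptions made for
 this example; other .sub flavours look different.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed line format for this sketch: {start_frame}{end_frame}text
CUE_RE = re.compile(r"\{(\d+)\}\{(\d+)\}(.*)")


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


class ChunkedSubParser:
    """Hypothetical parser: feed it text chunks, get back complete cues."""

    def __init__(self, fps: float = 25.0) -> None:
        self.fps = fps          # assumed constant frame rate
        self._pending = ""      # partial line left over from earlier chunks

    def feed(self, chunk: str) -> List[Cue]:
        self._pending += chunk
        # Everything before the last newline is complete; keep the rest.
        *lines, self._pending = self._pending.split("\n")
        cues = []
        for line in lines:
            match = CUE_RE.fullmatch(line.strip())
            if match is None:
                continue  # skip blank or unrecognised lines
            start, end, text = match.groups()
            cues.append(Cue(int(start) / self.fps,
                            int(end) / self.fps,
                            text.replace("|", "\n")))
        return cues
```

 For already-parsed streams (MKV, OGM) each packet would map to one cue
 directly, so only the text-to-cue step would be shared.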

 Secondly, we'll need a textoverlay filter that takes text and video and
 blits the text onto the video. We have several such elements (e.g. the
 cairo-based element) in gst-plugins already. Those might need some updates
 to work exactly as expected.

 Thirdly, playbin will need to handle all that. We expect subtitle streams
 to end up as subimages or plain text (or xhtml text). Note that playbin
 should also allow access to the unblitted subtitle as text (if available)
 for accessibility purposes.

 A problem popping up is that subtitles are not continuous streams. This is
 especially noticeable in the MKV/OGM case, because there the input of data
 depends on the other streams, so we'll only notice delays inside an element
 once we've received the next data chunk. There are two possible solutions:
 using timestamped filler events or using decoupled subtitle overlay elements
 (bins, probably). The difficulty with the first is that it only works well
 in the AVI/.sub case, where we will notice discontinuities before they
 become problematic. The second is more difficult to implement, but works for
 both cases.
 A) fillers
 Imagine that two subtitles come after each other, with 10 seconds of no-data
 in between. By parsing a .sub file, we would notice immediately and we could
 send a filler event (or empty data) with a timestamp and duration in between.
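
 The filler idea can be sketched in a few lines. This assumes the whole cue
 list is known up front (the .sub case) and that every cue carries a start
 time and a duration in seconds. Subtitle, Filler and with_fillers are
 hypothetical names for this example, not GStreamer API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass
class Subtitle:
    start: float     # seconds
    duration: float  # seconds
    text: str


@dataclass
class Filler:
    start: float     # seconds
    duration: float  # seconds


def with_fillers(cues: Iterable[Subtitle],
                 position: float = 0.0) -> Iterator[Union[Subtitle, Filler]]:
    """Yield the cues in order, inserting a Filler for every gap between them."""
    for cue in cues:
        if cue.start > position:
            # Nothing to show between `position` and the next cue: say so.
            yield Filler(position, cue.start - position)
        yield cue
        position = max(position, cue.start + cue.duration)
```

 With two cues that are 10 seconds apart, the stream going downstream becomes
 subtitle, 10-second filler, subtitle, so the overlay never has to wait for
 text that is not coming.
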
 B) decoupled
 Imagine this text element:
           -----------------------------------
           |         ------------------      |
 video ----+-------->|                |      |
           |         | actual element |------+--- out
 text  - - +- - - - >|                |      |
           |         ------------------      |
           -----------------------------------
 where the text pad is decoupled, like a queue. When no text data is available,
 the pad will have received no data, and the element will render no subtitles.
 The actual element can be a bin here, containing another subtitle rendering
 element. Disadvantage: it requires threading, and the element itself is (in
 concept) kinda gross. The element can be embedded in playbin to hide this
 fact (i.e. not be available outside the scope of playbin).
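
 A rough sketch of the decoupled behaviour, using a plain thread-safe queue
 in place of a real pad. The video side polls the queue without blocking, so
 missing text simply means "render no subtitle". DecoupledOverlay and its
 methods are made up for this example, and render() returns the text to draw
 rather than blitting anything.

```python
import queue
from typing import Optional, Tuple


class DecoupledOverlay:
    """Hypothetical overlay whose text input is decoupled, like a queue."""

    def __init__(self, max_pending: int = 16) -> None:
        self._text: "queue.Queue[Tuple[float, float, str]]" = queue.Queue(max_pending)
        self._current: Optional[Tuple[float, float, str]] = None

    def push_text(self, start: float, end: float, text: str) -> None:
        """Called from the text thread; blocks only if the queue is full."""
        self._text.put((start, end, text))

    def render(self, video_ts: float) -> Optional[str]:
        """Called from the video thread; never waits for text to arrive."""
        # Discard cues that have expired and pull the next one, if any.
        while self._current is None or self._current[1] <= video_ts:
            try:
                self._current = self._text.get_nowait()
            except queue.Empty:
                self._current = None
                return None  # no text data available: render nothing
        start, _end, text = self._current
        # A cue that starts in the future is kept for a later frame.
        return text if start <= video_ts else None
```

 The threading cost mentioned above shows up here as the queue: text and
 video are produced by different threads and only meet inside render().
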
 Whichever solution we take, it'll require effort from the implementer.
 Scheduling (process, not implementation) knowledge is assumed.

 Stream selection is a problem that audio has, too. We'll need a solution for
 this at the playback bin level, e.g. playbin. By muting all unused streams
 and dynamically unmuting the selected stream, this is easily solved. Note
 that synchronization needs to be checked in this case. The solution is not
 hard, but someone has to do it.
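
 A minimal sketch of mute-based selection, assuming every stream has an id
 and every buffer a timestamp in seconds. StreamSelector is a hypothetical
 helper, not playbin API. The synchronization check is reduced here to one
 assumed rule: after a switch, data older than the switch position is
 dropped.

```python
from typing import Dict, Hashable, Iterable, Optional


class StreamSelector:
    """Hypothetical selector: all streams are muted except the selected one."""

    def __init__(self, stream_ids: Iterable[Hashable]) -> None:
        self._muted: Dict[Hashable, bool] = {sid: True for sid in stream_ids}
        self._resume_at = 0.0

    def select(self, stream_id: Hashable, position: float = 0.0) -> None:
        """Unmute `stream_id`, mute the rest, and remember the switch position."""
        if stream_id not in self._muted:
            raise KeyError(stream_id)
        for sid in self._muted:
            self._muted[sid] = sid != stream_id
        self._resume_at = position

    def push(self, stream_id: Hashable, timestamp: float,
             payload: str) -> Optional[str]:
        """Return the payload to forward downstream, or None to drop it."""
        if self._muted[stream_id] or timestamp < self._resume_at:
            return None
        return payload
```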

 3. Written by
 Ronald S. Bultje <rbultje@ronald.bitfreak.net>, Dec. 25th, 2004


 Appendix A: random IRC addition
 <Company> intersting question: would it be a good idea to have a "max-buffer-length" property?
 <Company> that way demuxewrs would now how often they'd need to generate filler events
 <Company> s/now/know/
 <BBB> hm...
 <BBB> I don't think it's good to make that variable
 <Company> dunno
 <Company> (i'm btw always looking at this from the midi perspective, too)
 <Company> (because both subtitles and midi are basically the same in this regard)
 <BBB> and do you mean 'after the stream has advanced <time> and we didn't read a new subtitle in this mkv stream, we should send a filler'?
 <Company> yeah
 <BBB> it goes for avi with large init_delay values, too
 <Company> so you don't need to send fillers every frame
 <BBB> right
 <BBB> cant' we just set that to, for example, 1s?
 <BBB> it's fairly random, but still
 <Company> that's another option, too
 <Company> though you could write all file parsers with max-delay=MAXINT
 <Company> would make them a lot easier
 <BBB> it's true that queue size, for example, depends on this value
 <BBB> e.g. if you make this 5s and set queue size to 1s, it'll hang
 <Company> right
 <BBB> whereas if you set it to 1s and queue size to 5s, you waste space
 <BBB> :)
 <BBB> you ought to set it to max-delay * (n_streams + 1)
 <BBB> or so
 <BBB> or -1
 <BBB> I forgot
 <BBB> ohwell
 <Company> if you'd use filtercaps and queue sizes in your app, you could at least work around deadlocks
 <BBB> yeah
 <Company> though ideally it should just work of course...
 <BBB> good point...
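
 The "send a filler once the stream has advanced <time> without a new
 subtitle" idea from the log above could look roughly like this. It is a
 sketch only: FillerClock, on_subtitle and on_advance are invented names,
 times are in seconds, and the 1s default is just the value floated in the
 discussion.

```python
from typing import Optional, Tuple


class FillerClock:
    """Hypothetical helper that decides when a demuxer owes a filler."""

    def __init__(self, max_delay: float = 1.0) -> None:
        self.max_delay = max_delay
        self.last_activity = 0.0  # time of the last subtitle or filler

    def on_subtitle(self, timestamp: float) -> None:
        """A real subtitle was sent; restart the countdown."""
        self.last_activity = timestamp

    def on_advance(self, position: float) -> Optional[Tuple[float, float]]:
        """The other streams reached `position`.

        Returns (start, duration) for a filler if one is due, else None.
        """
        gap = position - self.last_activity
        if gap < self.max_delay:
            return None  # not late yet, so no filler on every frame
        filler = (self.last_activity, gap)
        self.last_activity = position
        return filler
```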