docs/design/part-negotiation.txt - imx-gstreamer - Git at Google

 Negotiation
 -----------

 Capabilities negotiation is the process of deciding on an adequate
 format for dataflow within a GStreamer pipeline. Ideally, negotiation
 (also known as "capsnego") transfers information from those parts of the
 pipeline that have information to those parts of the pipeline that are
 flexible, constrained by those parts of the pipeline that are not
 flexible.

 GStreamer's two scheduling modes, push mode and pull mode, lend
 themselves to different mechanisms to achieve this goal. As it is more
 common we describe push mode negotiation first.

 Push-mode negotiation
 ~~~~~~~~~~~~~~~~~~~~~

 Push-mode negotiation happens when elements want to push buffers and
 need to decide on the format. This is called downstream negotiation
 because the upstream element decides the format for the downstream
 element. This is the most common case.

 Negotiation can also happen when a downstream element wants to receive
 another data format from an upstream element. This is called upstream
 negotiation.

 The basics of negotiation are as follows:

  - GstCaps (see part-caps.txt) are refcounted before they
    are attached to a buffer to describe the contents of the buffer.
    It is possible to add a NULL caps to a buffer, this means that the
    buffer type did not change relative to the previous buffer. If no
    previous buffer was received by a downstream element, it is free to
    discard the buffer.

  - Before receiving a buffer, an element must check if the datatype of
    the buffer has changed. The element should reconfigure itself to the
    new format before processing the buffer data. If the data type on
    the buffer is not acceptable, the element should refuse the buffer by
    returning an appropriate GST_FLOW_NOT_NEGOTIATED return value from the
    chain function.
    The core will automatically call the set_caps function for this purpose
    when it is installed on the sink or source pad.

  - When requesting a buffer from a bufferpool, the prefered type should
    be passed to the buffer allocation function. After receiving a buffer
    from a bufferpool, the datatype should be checked again.

  - A bufferpool allocation function should try to allocate a buffer of the
    prefered type. If there is a good reason to choose another type, the
    alloc function should see if that other type is accepted by the other
    element, then allocate a buffer of that type and attach the type to the
    buffer before returning it.


 The general flow for a source pad starting the negotiation.

              src              sink
               |                 |
               |  accepts?       |
   type A      |---------------->|
               |      yes        |
               |<----------------|
               |                 |
  get buffer   |  alloc_buf      |
  from pool    |---------------->|
  with type A  |                 | Create buffer of type A.
               |                 |
  check type   |<----------------|
  and use A    |                 |
               |  push           |
  push buffer  |---------------->| Receive type A, reconfigure to
  with new type|                 | process type A.
               |                 |

  One possible implementation in pseudo code:

  [element wants to create a buffer]
  if not format
    # see what the peer can do
    peercaps = gst_pad_peer_get_caps (srcpad)
    # see what we can do
    ourcaps = gst_pad_get_caps (srcpad)

    # get common formats
    candidates = gst_caps_intersect (peercaps, ourcaps)

    foreach candidate in candidates
      # make sure the caps is fixed
      fixedcaps = gst_pad_fixate_caps (srcpad, candidate)

      # see if the peer accepts it
      if gst_pad_peer_accept_caps (srcpad, fixedcaps)
        # store the caps as the negotiated caps, this will
        # call the setcaps function on the pad
        gst_pad_set_caps (srcpad, fixedcaps)
        break
      endif
    done
  endif

  # if the type is different, the buffer will have different caps from
  # the src pad -- setcaps will get called on the pad_push
  buffer = gst_pad_alloc_buffer (srcpad, 0, size, GST_PAD_CAPS (fixedcaps));
  if buffer
    [fill buffer and push]
  elseif
    [no buffer, either no peer or no acceptable format found]
  endif


 The general flow for a sink pad starting a renegotiation.

              src              sink
               |                 |
               |  accepts?       |
               |<----------------| type B
               |      yes        |
               |---------------->|
               |                 |
  get buffer   |  alloc_buf      |
  from pool    |---------------->|
  with type A  |                 | Create buffer of new type B.
               |                 |
  check type   |<----------------|
  and          |                 |
  reconfigure  |                 |
               |  push           |
  push buffer  |---------------->| Receive type B, reconfigure to
  with new type|                 | process type B.
               |                 |


 Use case:


 videotestsrc ! xvimagesink

   1) Who decides what format to use?
    - src pad always decides, by convention. sinkpad can suggest a format
      by putting it high in the getcaps function GstCaps.
    - since the src decides, it can always choose something that it can do,
      so this step can only fail if the sinkpad stated it could accept
      something while later on it couldn't.

   2) When does negotiation happen?
    - before srcpad does a push, it figures out a type as stated in 1), then
      it calls the pad alloc function with the type. The sinkpad has to
      create a buffer of that type, src fills the buffer and sends it to sink.
    - since the sink stated in 1) it could accept the type, it will be able to
      create a buffer of the type and handle it.
    - sink checks media type of buffer and configures itself for this type.

   3) How can sink request another format?
    - sink asks if new format is possible for the source.
    - sink returns buffer with new type in allocfunction.
    - src receives buffer with new type, reconfigures and pushes.
    - sink can always select something it can create and handle since it takes
      the initiative. src should be able to handle the new type since it said
      it could accept it.

 videotestsrc ! queue ! xvimagesink

   - queue implements an allocfunction, proxying all calls to its srcpad peer.
   - queue proxies all accept and getcaps to the other peer pad.
   - queue contains buffers with different types.


 Pull-mode negotiation
 ~~~~~~~~~~~~~~~~~~~~~

 Rationale
 ^^^^^^^^^

 A pipeline in pull mode has different negotiation needs than one
 activated in push mode. Push mode is optimized for two use cases:

  * Playback of media files, in which the demuxers and the decoders are
    the points from which format information should disseminate to the
    rest of the pipeline; and

  * Recording from live sources, in which users are accustomed to putting
    a capsfilter directly after the source element; thus the caps
    information flow proceeds from the user, through the potential caps
    of the source, to the sinks of the pipeline.

 In contrast, pull mode has other typical use cases:

  * Playback from a lossy source, such as RTP, in which more knowledge
    about the latency of the pipeline can increase quality; or

  * Audio synthesis, in which audio APIs are tuned to producing only the
    necessary number of samples, typically driven by a hardware interrupt
    to fill a DMA buffer or a Jack[0] port buffer.

  * Low-latency effects processing, whereby filters should be applied as
    data is transferred from a ring buffer to a sink instead of
    beforehand. For example, instead of using the internal alsasink
    ringbuffer thread in push-mode wavsrc ! volume ! alsasink, placing
    the volume inside the sound card writer thread via wavsrc !
    audioringbuffer ! volume ! alsasink.

 [0] http://jackit.sf.net

 The problem with pull mode is that the sink has to know the format in
 order to know how many bytes to pull via gst_pad_pull_range(). This
 means that before pulling, the sink must initiate negotation to decide
 on a format.

 Recalling the principles of capsnego, whereby information must flow from
 those that have it to those that do not, we see that the two named use
 cases have different negotiation requirements:

  * RTP and low-latency playback are both like the normal playback case,
    in which information flows downstream.

  * In audio synthesis, the part of the pipeline that has the most
    information is the sink, constrained by the capabilities of the graph
    that feeds it. However the caps are not completely specified; at some
    point the user has to intervene to choose the sample rate, at least.
    This can be done externally to gstreamer, as in the jack elements, or
    internally via a capsfilter, as is customary with live sources.

 Given that sinks potentially need the input of sources, as in the RTP
 case and at least as a filter in the synthesis case, there must be a
 negotiation phase before the pull thread is activated. Also, given the
 low latency offered by pull mode, we want to avoid capsnego from within
 the pulling thread, in case it causes us to miss our scheduling
 deadlines.

 The pull thread is usually started in the PAUSED->PLAYING state change. We must
 be able to complete the negotiation before this state change happens.

 The time to do capsnego, then, is after _check_pull_range() has succeeded,
 but before the sink has spawned the pulling thread.


 Mechanism
 ^^^^^^^^^

 The sink determines that the upstream elements support pull based scheduling by
 calling gst_pad_check_pull_range().

 The sink initiates the negotiation process by intersecting the results
 of gst_pad_get_caps() on its sink pad and its peer src pad. This is the
 operation performed by gst_pad_get_allowed_caps(). In the simple
 passthrough case, the peer pad's getcaps() function should return the
 intersection of calling get_allowed_caps() on all of its sink pads. In
 this way the sink element knows the capabilities of the entire pipeline.

 The sink element then fixates the resulting caps, if necessary,
 resulting in the flow caps. It notifies the pipeline of the caps by
 calling gst_pad_set_caps() on its sink pad. From now on, the getcaps function
 of the sinkpad will only return these fixed caps meaning that upstream elements
 will only be able to produce this format.

 If the sink element could not set caps on its sink pad, it should post
 an error message on the bus indicating that negotiation was not
 possible.

 When negotiation succeeded, the sinkpad and all upstream internally linked pads
 are activated in pull mode. Typically, this operation will trigger negotiation
 on the downstream elements, which will now be forced to negotiation to the
 final fixed desired caps of the sinkpad.

 After these steps, the sink element returns ASYNC from the state change
 function. The state will commit to PAUSED when the first buffer is received in
 the sink. This is needed to provide a consistent API to the applications that
 expect ASYNC return values from sinks but it also allows us to perform the
 remainder of the negotiation outside of the context of the pulling thread.

 During dataflow, gst_pad_pull_range() checks the caps on the pulled
 buffer. If they are different from the sink pad's caps, it will return
 GST_FLOW_NOT_NEGOTIATED. Because of the low-latency requirements,
 changing caps in an activate pull-mode pipeline is not supported, as it
 might require e.g. the sound card to reconfigure its hardware buffers,
 and start capsnego again.
	Negotiation
	-----------

	Capabilities negotiation is the process of deciding on an adequate
	format for dataflow within a GStreamer pipeline. Ideally, negotiation
	(also known as "capsnego") transfers information from those parts of the
	pipeline that have information to those parts of the pipeline that are
	flexible, constrained by those parts of the pipeline that are not
	flexible.

	GStreamer's two scheduling modes, push mode and pull mode, lend
	themselves to different mechanisms to achieve this goal. As it is more
	common we describe push mode negotiation first.

	Push-mode negotiation
	~~~~~~~~~~~~~~~~~~~~~

	Push-mode negotiation happens when elements want to push buffers and
	need to decide on the format. This is called downstream negotiation
	because the upstream element decides the format for the downstream
	element. This is the most common case.

	Negotiation can also happen when a downstream element wants to receive
	another data format from an upstream element. This is called upstream
	negotiation.

	The basics of negotiation are as follows:

	- GstCaps (see part-caps.txt) are refcounted before they
	are attached to a buffer to describe the contents of the buffer.
	It is possible to add a NULL caps to a buffer, this means that the
	buffer type did not change relative to the previous buffer. If no
	previous buffer was received by a downstream element, it is free to
	discard the buffer.

	- Before receiving a buffer, an element must check if the datatype of
	the buffer has changed. The element should reconfigure itself to the
	new format before processing the buffer data. If the data type on
	the buffer is not acceptable, the element should refuse the buffer by
	returning an appropriate GST_FLOW_NOT_NEGOTIATED return value from the
	chain function.
	The core will automatically call the set_caps function for this purpose
	when it is installed on the sink or source pad.

	- When requesting a buffer from a bufferpool, the prefered type should
	be passed to the buffer allocation function. After receiving a buffer
	from a bufferpool, the datatype should be checked again.

	- A bufferpool allocation function should try to allocate a buffer of the
	prefered type. If there is a good reason to choose another type, the
	alloc function should see if that other type is accepted by the other
	element, then allocate a buffer of that type and attach the type to the
	buffer before returning it.


	The general flow for a source pad starting the negotiation.

	src sink
	\| \|
	\| accepts? \|
	type A \|---------------->\|
	\| yes \|
	\|<----------------\|
	\| \|
	get buffer \| alloc_buf \|
	from pool \|---------------->\|
	with type A \| \| Create buffer of type A.
	\| \|
	check type \|<----------------\|
	and use A \| \|
	\| push \|
	push buffer \|---------------->\| Receive type A, reconfigure to
	with new type\| \| process type A.
	\| \|

	One possible implementation in pseudo code:

	[element wants to create a buffer]
	if not format
	# see what the peer can do
	peercaps = gst_pad_peer_get_caps (srcpad)
	# see what we can do
	ourcaps = gst_pad_get_caps (srcpad)

	# get common formats
	candidates = gst_caps_intersect (peercaps, ourcaps)

	foreach candidate in candidates
	# make sure the caps is fixed
	fixedcaps = gst_pad_fixate_caps (srcpad, candidate)

	# see if the peer accepts it
	if gst_pad_peer_accept_caps (srcpad, fixedcaps)
	# store the caps as the negotiated caps, this will
	# call the setcaps function on the pad
	gst_pad_set_caps (srcpad, fixedcaps)
	break
	endif
	done
	endif

	# if the type is different, the buffer will have different caps from
	# the src pad -- setcaps will get called on the pad_push
	buffer = gst_pad_alloc_buffer (srcpad, 0, size, GST_PAD_CAPS (fixedcaps));
	if buffer
	[fill buffer and push]
	elseif
	[no buffer, either no peer or no acceptable format found]
	endif


	The general flow for a sink pad starting a renegotiation.

	src sink
	\| \|
	\| accepts? \|
	\|<----------------\| type B
	\| yes \|
	\|---------------->\|
	\| \|
	get buffer \| alloc_buf \|
	from pool \|---------------->\|
	with type A \| \| Create buffer of new type B.
	\| \|
	check type \|<----------------\|
	and \| \|
	reconfigure \| \|
	\| push \|
	push buffer \|---------------->\| Receive type B, reconfigure to
	with new type\| \| process type B.
	\| \|



	Use case:


	videotestsrc ! xvimagesink

	1) Who decides what format to use?
	- src pad always decides, by convention. sinkpad can suggest a format
	by putting it high in the getcaps function GstCaps.
	- since the src decides, it can always choose something that it can do,
	so this step can only fail if the sinkpad stated it could accept
	something while later on it couldn't.

	2) When does negotiation happen?
	- before srcpad does a push, it figures out a type as stated in 1), then
	it calls the pad alloc function with the type. The sinkpad has to
	create a buffer of that type, src fills the buffer and sends it to sink.
	- since the sink stated in 1) it could accept the type, it will be able to
	create a buffer of the type and handle it.
	- sink checks media type of buffer and configures itself for this type.

	3) How can sink request another format?
	- sink asks if new format is possible for the source.
	- sink returns buffer with new type in allocfunction.
	- src receives buffer with new type, reconfigures and pushes.
	- sink can always select something it can create and handle since it takes
	the initiative. src should be able to handle the new type since it said
	it could accept it.

	videotestsrc ! queue ! xvimagesink

	- queue implements an allocfunction, proxying all calls to its srcpad peer.
	- queue proxies all accept and getcaps to the other peer pad.
	- queue contains buffers with different types.


	Pull-mode negotiation
	~~~~~~~~~~~~~~~~~~~~~

	Rationale
	^^^^^^^^^

	A pipeline in pull mode has different negotiation needs than one
	activated in push mode. Push mode is optimized for two use cases:

	* Playback of media files, in which the demuxers and the decoders are
	the points from which format information should disseminate to the
	rest of the pipeline; and

	* Recording from live sources, in which users are accustomed to putting
	a capsfilter directly after the source element; thus the caps
	information flow proceeds from the user, through the potential caps
	of the source, to the sinks of the pipeline.

	In contrast, pull mode has other typical use cases:

	* Playback from a lossy source, such as RTP, in which more knowledge
	about the latency of the pipeline can increase quality; or

	* Audio synthesis, in which audio APIs are tuned to producing only the
	necessary number of samples, typically driven by a hardware interrupt
	to fill a DMA buffer or a Jack[0] port buffer.

	* Low-latency effects processing, whereby filters should be applied as
	data is transferred from a ring buffer to a sink instead of
	beforehand. For example, instead of using the internal alsasink
	ringbuffer thread in push-mode wavsrc ! volume ! alsasink, placing
	the volume inside the sound card writer thread via wavsrc !
	audioringbuffer ! volume ! alsasink.

	[0] http://jackit.sf.net

	The problem with pull mode is that the sink has to know the format in
	order to know how many bytes to pull via gst_pad_pull_range(). This
	means that before pulling, the sink must initiate negotation to decide
	on a format.

	Recalling the principles of capsnego, whereby information must flow from
	those that have it to those that do not, we see that the two named use
	cases have different negotiation requirements:

	* RTP and low-latency playback are both like the normal playback case,
	in which information flows downstream.

	* In audio synthesis, the part of the pipeline that has the most
	information is the sink, constrained by the capabilities of the graph
	that feeds it. However the caps are not completely specified; at some
	point the user has to intervene to choose the sample rate, at least.
	This can be done externally to gstreamer, as in the jack elements, or
	internally via a capsfilter, as is customary with live sources.

	Given that sinks potentially need the input of sources, as in the RTP
	case and at least as a filter in the synthesis case, there must be a
	negotiation phase before the pull thread is activated. Also, given the
	low latency offered by pull mode, we want to avoid capsnego from within
	the pulling thread, in case it causes us to miss our scheduling
	deadlines.

	The pull thread is usually started in the PAUSED->PLAYING state change. We must
	be able to complete the negotiation before this state change happens.

	The time to do capsnego, then, is after _check_pull_range() has succeeded,
	but before the sink has spawned the pulling thread.


	Mechanism
	^^^^^^^^^

	The sink determines that the upstream elements support pull based scheduling by
	calling gst_pad_check_pull_range().

	The sink initiates the negotiation process by intersecting the results
	of gst_pad_get_caps() on its sink pad and its peer src pad. This is the
	operation performed by gst_pad_get_allowed_caps(). In the simple
	passthrough case, the peer pad's getcaps() function should return the
	intersection of calling get_allowed_caps() on all of its sink pads. In
	this way the sink element knows the capabilities of the entire pipeline.

	The sink element then fixates the resulting caps, if necessary,
	resulting in the flow caps. It notifies the pipeline of the caps by
	calling gst_pad_set_caps() on its sink pad. From now on, the getcaps function
	of the sinkpad will only return these fixed caps meaning that upstream elements
	will only be able to produce this format.

	If the sink element could not set caps on its sink pad, it should post
	an error message on the bus indicating that negotiation was not
	possible.

	When negotiation succeeded, the sinkpad and all upstream internally linked pads
	are activated in pull mode. Typically, this operation will trigger negotiation
	on the downstream elements, which will now be forced to negotiation to the
	final fixed desired caps of the sinkpad.

	After these steps, the sink element returns ASYNC from the state change
	function. The state will commit to PAUSED when the first buffer is received in
	the sink. This is needed to provide a consistent API to the applications that
	expect ASYNC return values from sinks but it also allows us to perform the
	remainder of the negotiation outside of the context of the pulling thread.

	During dataflow, gst_pad_pull_range() checks the caps on the pulled
	buffer. If they are different from the sink pad's caps, it will return
	GST_FLOW_NOT_NEGOTIATED. Because of the low-latency requirements,
	changing caps in an activate pull-mode pipeline is not supported, as it
	might require e.g. the sound card to reconfigure its hardware buffers,
	and start capsnego again.