| I'll use this doc to describe how I think media info should work from the |
| perspective of the application developer and end user, and from that |
| extrapolate what we need to provide that. |
| |
| RATIONALE |
| --------- |
| One of the strong points of GStreamer is that it abstracts library dependencies |
| away. A user is free to choose whatever plug-ins he has, and a developer |
| can code to the general API that GStreamer provides without having to deal |
| with the underlying codecs. |
| |
| It is important that GStreamer also handles media info well and efficiently, |
| since more often than not the same libraries are needed to do this. So |
| to avoid applications depending on these libs just to do the media info, |
| we should make sure GStreamer provides a reasonable and fast abstraction |
| for this as well. |
| |
| GOALS |
| ----- |
| - quickly read and write "tags" |
| - quickly read stream metadata (technical properties, length, audio props, ...) |
| - cache both kinds of data transparently |
| - (possibly) provide bins that do this |
| - provide a simple API to do this |
| |
| DEFINITION OF TERMS |
| ------------------- |
| The user or developer using GStreamer is interested in all information that |
| describes the stream. The library handles these two types differently |
| however, so I will use the following terms to describe this : |
| |
| - metadata : |
| every kind of information that is tied to the "concept" of the stream, |
| and not tied to the actual encoding or representation of the stream. |
| - it can be altered without transcoding the stream |
| - it would stay the same for different encodings of the file |
| - describes properties of the information encoded into the stream |
| - examples: |
| - artist, title, author |
| - year, track order, album |
| - comments |
| |
| - mediainfo |
| every kind of information that is tied to the "codec" used. |
| - cannot be altered without transcoding |
| - is the set of parameters the stream has been encoded with |
| - describes properties of the encoded stream itself |
| - examples: |
| - bitrate targets (e.g. nominal), encoding mode (e.g. joint stereo) |
| - to this we also add "bitrate", but we query this through the pad_query |
| interface |
| |
| - format |
| every kind of information that is tied to the "raw" bitstream |
| - cannot be altered without decoding and changing the raw bitstream |
| - examples: |
| - samplerate, bit depth/width, channels |
| - length in time |
| - video size, frames per second, colorspace used |
| - the format is queried by getting the GstCaps of the pad that sources |
| the buffers |
| |
| - length in time and tracks for the whole stream |
| - gotten through pad queries |
| - stored in variables in the struct |
| |
| - immediate info |
| - examples: |
| - position in time |
| - current bitrate |
| |
| - tracks : |
| a media file or stream can contain multiple consecutive streams, which |
| we will call "tracks". GStreamer has a format for track used in querying |
| and seeking as well. |
| A track should be thought of as the whole of one single piece of media |
| inside a physical stream. |
| A track can have at most one set of tags, and has fixed "raw" properties. |
| |
| EXAMPLE PIPELINES |
| ----------------- |
| reading metadata : filesrc ! id3v1 |
| - would read metadata from file |
| - id3v1 immediately causes filesrc to seek until it has found |
| - the (first) metadata |
| - that there is no metadata present |
| - id3v1 sends out a property notification with name "metadata" and |
| a GstCaps structure |
| |
| resetting and writing content metadata : |
| id3v1 reset=true artist="Arid" ! filesink |
| |
| - effect: clear the current tag and reset it to only have Arid as artist |
| - id3v1 seeks to the right location, clears the tag, and writes the new one |
| |
| COST |
| ---- |
| Querying media info can be expensive. |
| Any application querying for media info should take this into account and |
| make sure that it doesn't block the app unnecessarily while the querying |
| happens. |
| |
| The app should create an object, hand it a bunch of locations to query, |
| and connect to the signal the app is going to send out. |
| |
| In most cases, querying content data should be fast since it doesn't involve |
| decoding |
| |
| Technical data could be harder and thus might be better done only when needed. |
| |
| CACHE |
| ----- |
| Getting media info can be an expensive operation. It makes sense to cache |
| the dia info queried on-disk to provide rapid access to this data. |
| It is important however that this is done transparently - the system should |
| be able to keep working without it, or keep working when you delete this cache. |
| |
| The API would provide a function like |
| gst_media_info_read_cached (media_info, location, |
| GST_MEDIA_INFO_METADATA, |
| GST_MEDIA_INFO_READ_CACHED); |
| |
| to try and get the cached metadata using the media info object. |
| |
| - check if the file is cached in the media info cache |
| - if no, then read the media info and store it in the cache |
| - if yes, then check the file against it's timestamp (or (part of) md5sum ?) |
| - if it was changed, force a new read and store it in the cache |
| - if it wasn't changed, just return the cached media info |
| |
| |
| For optimizations, it might also make sense to do |
| GList * gst_metadata_read_many (media_info, GList *locations, ...) |
| |
| which would allow the back-end to implement this more efficiently. |
| Suppose an application loads a playlist, for example, then this playlist |
| could be handed to this function, and a GList of metadata types could |
| be returned. |
| |
| Possible implementations : |
| - one large XML file : would end up being too heavy |
| - one XML file per dir on system : good compromise; would still make sense |
| to keep this in memory instead of reading and writing it all the time |
| Also, odds are good that users mostly use files from same dir in one app |
| (but not necessarily) |
| |
| Possible extra niceties : |
| - matching of moved files, and a similar move of metadata (through user-space |
| tool ?) |
| |
| !!! For speed reasons, it might make sense to somehow keep the cache in memory |
| instead of reparsing the same cache file each time. |
| |
| !!! For disk space reasons, it might make sense to have a system cache. |
| Not sure if the complexity added is worth it though. |
| |
| !!! For disk space reasons, we might want to add an upper limit on the size of |
| the cache. For that we might need a timestamp on last retrieval of metadata, |
| so that we can drop the old ones. |
| |
| The cache should use standard glibc. |
| FIXME: is it worth it to use gnome-vfs for this ? |
| |
| STANDARDIZATION OF MEDIAINFO |
| ---------------------------- |
| Different file formats have different "tags". It is not always possible |
| to map metadata to tags. Some level of agreement on metadata names is also |
| required. |
| |
| For media info, the names or properties should be fairly standard. |
| We also use the same names as used for properties and capabilities in |
| GStreamer. |
| |
| This means we use |
| - encoded audio |
| - "bitrate" (which is bits per second - use the most correct one, |
| ie. average bitrate for VBR for example) |
| |
| - raw audio |
| - "samplerate" - sampling frequency |
| - "channels" |
| - "bitwidth" - how wide is the audio in bits |
| |
| - encoded video |
| - "bitrate" |
| |
| - raw video |
| (FIXME: I don't know enough about video, are these correct) |
| - "width" |
| - "height" |
| - "colordepth" |
| - "colorspace" |
| - "fps" |
| - "aspectratio" |
| |
| We must find a way to avoid collision. A system stream can contain both |
| audio and video (-> bitrate) or multiple audio or video streams. One way |
| to do this might be to make a metadata set for a stream a GList of metadata |
| for elementary streams. |
| |
| For metadata and tags, the standards are less clear. |
| Some nice ones to standardize on might be |
| - artist |
| - title |
| - author |
| - year |
| - genre (messy though) |
| - RMS, inpoint, outpoint (calculated through some formula, used for mixing) |
| |
| TESTING |
| ------- |
| It is important to write a decent testsuite for this and do speed comparisons |
| between the library used and the GStreamer implementation. |
| |
| |
| API |
| --- |
| struct GstMetadata |
| { |
| gchar *location; |
| GstMetadataType type; |
| |
| GList *streams; |
| GHashtable *values; |
| }; |
| |
| (streams would be a GList of (again) GstMetadata's. |
| "location" would then be reused to indicate an identifier in the stream. |
| FIXME: is that evil ?) |
| |
| GstMetadataType - technical, content |
| GstMetadataReadType - cached, raw |
| |
| GstMetadata * |
| gst_metadata_read (const char *location, |
| GstMetadataType type, |
| GstMetadataReadType read_type); |
| GstMetadata * |
| gst_metadata_read_props (const char *location, |
| GList *names, |
| GstMetadataType type, |
| GstMetadataReadType read_type); |
| GstMetadata * |
| gst_metadata_read_cached (const char *location, |
| GstMetadataType type, |
| GstMetadataReadType read_type); |
| |
| GstMetadata * |
| gst_metadata_read_props_cached (...) |
| |
| GList * |
| gst_metadata_read_cached_many (GList *locations, |
| GstMetadataType type, |
| GstMetadataReadType read_type); |
| |
| GList * |
| gst_metadata_read_props_cached_many (GList *locations, |
| GList *names, |
| GstMetadataType type, |
| GstMetadataReadType read_type); |
| |
| GList * |
| gst_metadata_content_write (const char *location, |
| GstMetadata *metadata); |
| |
| |
| SOME USEFUL RESOURCES |
| --------------------- |
| http://www.chin.gc.ca/English/Standards/metadata_multimedia.html |
| - describes multimedia data for images |
| distinction between content (descriptive), technical and |
| administrative metadata |