=====================
DRM Memory Management
=====================

Modern Linux systems require large amounts of graphics memory to store
frame buffers, textures, vertices and other graphics-related data. Given
the very dynamic nature of much of that data, managing graphics memory
efficiently is thus crucial for the graphics stack and plays a central
role in the DRM infrastructure.

The DRM core includes two memory managers, namely the Translation Table
Manager (TTM) and the Graphics Execution Manager (GEM). TTM was the first
DRM memory manager to be developed and tried to be a one-size-fits-all
solution. It provides a single userspace API to accommodate the needs of
all hardware, supporting both Unified Memory Architecture (UMA) devices
and devices with dedicated video RAM (i.e. most discrete video cards).
This resulted in a large, complex piece of code that turned out to be
hard to use for driver development.

GEM started as an Intel-sponsored project in reaction to TTM's
complexity. Its design philosophy is completely different: instead of
providing a solution to every graphics memory-related problem, GEM
identified common code between drivers and created a support library to
share it. GEM has simpler initialization and execution requirements than
TTM, but has no video RAM management capabilities and is thus limited to
UMA devices.

The Translation Table Manager (TTM)
-----------------------------------

TTM design background and information belongs here.

TTM initialization
~~~~~~~~~~~~~~~~~~

    **Warning**

    This section is outdated.

Drivers wishing to support TTM must fill out a drm_bo_driver
structure. The structure contains several fields with function pointers
for initializing the TTM, allocating and freeing memory, waiting for
command completion and fence synchronization, and memory migration. See
the radeon_ttm.c file for an example of usage.

The ttm_global_reference structure is made up of several fields:

::

    struct ttm_global_reference {
            enum ttm_global_types global_type;
            size_t size;
            void *object;
            int (*init) (struct ttm_global_reference *);
            void (*release) (struct ttm_global_reference *);
    };

There should be one global reference structure for your memory manager
as a whole, and there will be others for each object created by the
memory manager at runtime. Your global TTM should have a type of
TTM_GLOBAL_TTM_MEM. The size field for the global object should be
sizeof(struct ttm_mem_global), and the init and release hooks should
point at your driver-specific init and release routines, which probably
eventually call ttm_mem_global_init and ttm_mem_global_release,
respectively.

Once your global TTM accounting structure is set up and initialized by
calling ttm_global_item_ref() on it, you need to create a buffer
object TTM to provide a pool for buffer object allocation by clients and
the kernel itself. The type of this object should be
TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct
ttm_bo_global). Again, driver-specific init and release functions may
be provided, likely eventually calling ttm_bo_global_init() and
ttm_bo_global_release(), respectively. Also, like the previous
object, ttm_global_item_ref() is used to create an initial reference
count for the TTM, which will call your initialization function.

The Graphics Execution Manager (GEM)
------------------------------------

The GEM design approach has resulted in a memory manager that doesn't
provide full coverage of all (or even all common) use cases in its
userspace or kernel API. GEM exposes a set of standard memory-related
operations to userspace and a set of helper functions to drivers, and
lets drivers implement hardware-specific operations with their own
private API.

The GEM userspace API is described in the `GEM - the Graphics Execution
Manager <http://lwn.net/Articles/283798/>`__ article on LWN. While
slightly outdated, the document provides a good overview of the GEM API
principles. Buffer allocation and read and write operations, described
as part of the common GEM API, are currently implemented using
driver-specific ioctls.

GEM is data-agnostic. It manages abstract buffer objects without knowing
what individual buffers contain. APIs that require knowledge of buffer
contents or purpose, such as buffer allocation or synchronization
primitives, are thus outside of the scope of GEM and must be implemented
using driver-specific ioctls.

On a fundamental level, GEM involves several operations:

-  Memory allocation and freeing
-  Command execution
-  Aperture management at command execution time

Buffer object allocation is relatively straightforward and largely
provided by Linux's shmem layer, which provides memory to back each
object.

Device-specific operations, such as command execution, pinning, buffer
read and write, mapping, and domain ownership transfers are left to
driver-specific ioctls.

GEM Initialization
~~~~~~~~~~~~~~~~~~

Drivers that use GEM must set the DRIVER_GEM bit in the
:c:type:`struct drm_driver <drm_driver>` driver_features
field. The DRM core will then automatically initialize the GEM core
before calling the load operation. Behind the scenes, this will create a
DRM Memory Manager object which provides an address space pool for
object allocation.

In a KMS configuration, drivers need to allocate and initialize a
command ring buffer following core GEM initialization if required by the
hardware. UMA devices usually have what is called a "stolen" memory
region, which provides space for the initial framebuffer and large,
contiguous memory regions required by the device. This space is
typically not managed by GEM, and must be initialized separately into
its own DRM MM object.

GEM Objects Creation
~~~~~~~~~~~~~~~~~~~~

GEM splits creation of GEM objects and allocation of the memory that
backs them into two distinct operations.

GEM objects are represented by an instance of :c:type:`struct
drm_gem_object <drm_gem_object>`. Drivers usually need to
extend GEM objects with private information and thus create a
driver-specific GEM object structure type that embeds an instance of
:c:type:`struct drm_gem_object <drm_gem_object>`.

To create a GEM object, a driver allocates memory for an instance of its
specific GEM object type and initializes the embedded
:c:type:`struct drm_gem_object <drm_gem_object>` with a call
to :c:func:`drm_gem_object_init()`. The function takes a pointer
to the DRM device, a pointer to the GEM object and the buffer object
size in bytes.
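
The embedding pattern described above can be sketched in plain C. This is
a userspace model only, not kernel code: ``struct model_gem_object`` stands
in for :c:type:`struct drm_gem_object <drm_gem_object>`, and
``struct foo_gem_object``, ``to_foo_gem()``, ``model_gem_object_init()`` and
``foo_gem_create()`` are hypothetical names chosen for illustration.

.. code-block:: c

    #include <stddef.h>
    #include <stdlib.h>

    /* Stand-in for struct drm_gem_object; only the field used here. */
    struct model_gem_object {
            size_t size;
    };

    /* Hypothetical driver object that embeds the core object. */
    struct foo_gem_object {
            struct model_gem_object base;   /* embedded, not a pointer */
            unsigned int tiling_mode;       /* driver-private state */
    };

    /* Recover the driver object from a pointer to the embedded object. */
    static struct foo_gem_object *to_foo_gem(struct model_gem_object *obj)
    {
            return (struct foo_gem_object *)
                    ((char *)obj - offsetof(struct foo_gem_object, base));
    }

    /* Plays the role of drm_gem_object_init(): record the size in bytes. */
    static void model_gem_object_init(struct model_gem_object *obj,
                                      size_t size)
    {
            obj->size = size;
    }

    /* Allocate the driver object, then initialize the embedded object. */
    static struct foo_gem_object *foo_gem_create(size_t size)
    {
            struct foo_gem_object *bo = calloc(1, sizeof(*bo));

            if (!bo)
                    return NULL;
            model_gem_object_init(&bo->base, size);
            return bo;
    }

In a real driver the recovery step is normally written with the kernel's
``container_of()`` macro, and initialization goes through
:c:func:`drm_gem_object_init()` rather than the stand-in used here.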

GEM uses shmem to allocate anonymous pageable memory.
:c:func:`drm_gem_object_init()` will create an shmfs file of the
requested size and store it into the :c:type:`struct
drm_gem_object <drm_gem_object>` filp field. The memory is
used as either main storage for the object when the graphics hardware
uses system memory directly or as a backing store otherwise.

Drivers are responsible for the actual physical page allocation by
calling :c:func:`shmem_read_mapping_page_gfp()` for each page.
Note that they can decide to allocate pages when initializing the GEM
object, or to delay allocation until the memory is needed (for instance
when a page fault occurs as a result of a userspace memory access or
when the driver needs to start a DMA transfer involving the memory).

Anonymous pageable memory allocation is not always desired, for instance
when the hardware requires physically contiguous system memory, as is
often the case in embedded devices. Drivers can create GEM objects with
no shmfs backing (called private GEM objects) by initializing them with
a call to :c:func:`drm_gem_private_object_init()` instead of
:c:func:`drm_gem_object_init()`. Storage for private GEM objects
must be managed by drivers.

GEM Objects Lifetime
~~~~~~~~~~~~~~~~~~~~

All GEM objects are reference-counted by the GEM core. References can be
acquired and released by calling
:c:func:`drm_gem_object_reference()` and
:c:func:`drm_gem_object_unreference()` respectively. The caller
must hold the :c:type:`struct drm_device <drm_device>`
struct_mutex lock when calling
:c:func:`drm_gem_object_unreference()`. As a convenience, GEM
provides the :c:func:`drm_gem_object_unreference_unlocked()`
function, which can be called without holding the lock.

When the last reference to a GEM object is released, the GEM core calls
the :c:type:`struct drm_driver <drm_driver>` gem_free_object
operation. That operation is mandatory for GEM-enabled drivers and must
free the GEM object and all associated resources.

::

    void (*gem_free_object)(struct drm_gem_object *obj);

Drivers are responsible for freeing all GEM object resources. This
includes the resources created by the GEM core, which need to be released
with :c:func:`drm_gem_object_release()`.
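
The lifetime rule above (the free operation runs exactly once, when the
last reference is dropped) can be illustrated with a small userspace
model. Everything named ``model_*`` or ``foo_*`` below is a hypothetical
stand-in, the counter is a plain integer rather than the kernel's atomic
reference count, and no locking is modelled.

.. code-block:: c

    #include <stdlib.h>

    struct model_gem_object {
            unsigned int refcount;
            /* Stands in for the driver's gem_free_object operation. */
            void (*free_object)(struct model_gem_object *obj);
    };

    static int freed_objects;   /* counts free callbacks, for illustration */

    static void foo_free_object(struct model_gem_object *obj)
    {
            /* A real driver would release core resources, then its own. */
            freed_objects++;
            free(obj);
    }

    static struct model_gem_object *model_gem_create(void)
    {
            struct model_gem_object *obj = calloc(1, sizeof(*obj));

            if (!obj)
                    return NULL;
            obj->refcount = 1;  /* initial reference owned by the creator */
            obj->free_object = foo_free_object;
            return obj;
    }

    static void model_gem_reference(struct model_gem_object *obj)
    {
            obj->refcount++;
    }

    static void model_gem_unreference(struct model_gem_object *obj)
    {
            /* Dropping the last reference invokes the free operation. */
            if (--obj->refcount == 0)
                    obj->free_object(obj);
    }

After ``model_gem_create()``, one ``model_gem_reference()`` and two
``model_gem_unreference()`` calls, ``foo_free_object()`` has run once and
the object must no longer be touched.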

GEM Objects Naming
~~~~~~~~~~~~~~~~~~

Communication between userspace and the kernel refers to GEM objects
using local handles, global names or, more recently, file descriptors.
All of those are 32-bit integer values; the usual Linux kernel limits
apply to the file descriptors.

GEM handles are local to a DRM file. Applications get a handle to a GEM
object through a driver-specific ioctl, and can use that handle to refer
to the GEM object in other standard or driver-specific ioctls. Closing a
DRM file handle frees all its GEM handles and dereferences the
associated GEM objects.

To create a handle for a GEM object, drivers call
:c:func:`drm_gem_handle_create()`. The function takes a pointer
to the DRM file and the GEM object and returns a locally unique handle.
When the handle is no longer needed, drivers delete it with a call to
:c:func:`drm_gem_handle_delete()`. Finally, the GEM object
associated with a handle can be retrieved by a call to
:c:func:`drm_gem_object_lookup()`.

Handles don't take ownership of GEM objects; they only take a reference
to the object that will be dropped when the handle is destroyed. To
avoid leaking GEM objects, drivers must make sure they drop the
reference(s) they own (such as the initial reference taken at object
creation time) as appropriate, without any special consideration for the
handle. For example, in the particular case of combined GEM object and
handle creation in the implementation of the dumb_create operation,
drivers must drop the initial reference to the GEM object before
returning the handle.
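
The reference handling in a combined object and handle creation path can
be sketched with the following userspace model. It is not kernel code:
``struct model_file``, the fixed-size handle table and all ``model_*``
functions are hypothetical stand-ins for the DRM file, its handle
bookkeeping and the GEM helpers named above.

.. code-block:: c

    #include <stdint.h>
    #include <stdlib.h>

    #define MODEL_MAX_HANDLES 8

    struct model_gem_object {
            unsigned int refcount;
    };

    /* Per-file handle table; handle 0 is reserved as "no handle". */
    struct model_file {
            struct model_gem_object *handles[MODEL_MAX_HANDLES];
    };

    static int live_objects;    /* objects not yet freed, for illustration */

    static struct model_gem_object *model_gem_create(void)
    {
            struct model_gem_object *obj = calloc(1, sizeof(*obj));

            if (!obj)
                    return NULL;
            obj->refcount = 1;  /* initial reference owned by the creator */
            live_objects++;
            return obj;
    }

    static void model_gem_unreference(struct model_gem_object *obj)
    {
            if (--obj->refcount == 0) {
                    live_objects--;
                    free(obj);
            }
    }

    /* Models handle creation: the handle takes its own reference. */
    static int model_handle_create(struct model_file *file,
                                   struct model_gem_object *obj,
                                   uint32_t *handle)
    {
            for (uint32_t i = 1; i < MODEL_MAX_HANDLES; i++) {
                    if (!file->handles[i]) {
                            obj->refcount++;
                            file->handles[i] = obj;
                            *handle = i;
                            return 0;
                    }
            }
            return -1;          /* table full */
    }

    /* Models handle deletion: drops the reference held by the handle. */
    static int model_handle_delete(struct model_file *file, uint32_t handle)
    {
            if (handle == 0 || handle >= MODEL_MAX_HANDLES ||
                !file->handles[handle])
                    return -1;
            model_gem_unreference(file->handles[handle]);
            file->handles[handle] = NULL;
            return 0;
    }

    /* Models a dumb_create-style path: create, attach handle, drop ref. */
    static int model_dumb_create(struct model_file *file, uint32_t *handle)
    {
            struct model_gem_object *obj = model_gem_create();
            int ret;

            if (!obj)
                    return -1;
            ret = model_handle_create(file, obj, handle);
            /* Drop the creation reference on success and on failure. */
            model_gem_unreference(obj);
            return ret;
    }

On success the handle holds the only remaining reference, so deleting the
handle frees the object. On failure the same unreference call frees the
object immediately, which is why the drop is unconditional in this model.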

GEM names are similar in purpose to handles but are not local to DRM
files. They can be passed between processes to reference a GEM object
globally. Names can't be used directly to refer to objects in the DRM
API; applications must convert handles to names and names to handles
using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls
respectively. The conversion is handled by the DRM core without any
driver-specific support.
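
From the application side, the two conversions might look like the sketch
below. It assumes the kernel UAPI header is installed as ``<drm/drm.h>``
(with libdrm the include path can differ), and the wrapper names
``gem_flink()`` and ``gem_open()`` are hypothetical. Only the ioctl
numbers and the ``struct drm_gem_flink`` and ``struct drm_gem_open``
layouts come from the DRM UAPI.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <drm/drm.h>

    /* Convert a local handle to a global name. Returns 0 or -1 (errno). */
    static int gem_flink(int drm_fd, uint32_t handle, uint32_t *name)
    {
            struct drm_gem_flink req;

            memset(&req, 0, sizeof(req));
            req.handle = handle;
            if (ioctl(drm_fd, DRM_IOCTL_GEM_FLINK, &req) < 0)
                    return -1;
            *name = req.name;
            return 0;
    }

    /* Convert a global name to a handle local to drm_fd. */
    static int gem_open(int drm_fd, uint32_t name, uint32_t *handle,
                        uint64_t *size)
    {
            struct drm_gem_open req;

            memset(&req, 0, sizeof(req));
            req.name = name;
            if (ioctl(drm_fd, DRM_IOCTL_GEM_OPEN, &req) < 0)
                    return -1;
            *handle = req.handle;
            *size = req.size;   /* object size reported by the kernel */
            return 0;
    }

One process would call ``gem_flink()`` and pass the resulting name to
another process, which calls ``gem_open()`` on its own DRM file
descriptor to obtain a handle of its own.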

GEM also supports buffer sharing with dma-buf file descriptors through
PRIME. GEM-based drivers must use the provided helper functions to
implement the exporting and importing correctly. See `PRIME Buffer
Sharing`_. Since sharing file descriptors is inherently more secure than
the easily guessable and global GEM names, it is the preferred buffer
sharing mechanism. Sharing buffers through GEM names is only supported
for legacy userspace. Furthermore, PRIME also allows cross-device buffer
sharing since it is based on dma-bufs.

GEM Objects Mapping
~~~~~~~~~~~~~~~~~~~

Because mapping operations are fairly heavyweight, GEM favours
read/write-like access to buffers, implemented through driver-specific
ioctls, over mapping buffers to userspace. However, when random access
to the buffer is needed (to perform software rendering for instance),
direct access to the object can be more efficient.

The mmap system call can't be used directly to map GEM objects, as they
don't have their own file handle. Two alternative methods currently
co-exist to map GEM objects to userspace. The first method uses a
driver-specific ioctl to perform the mapping operation, calling
:c:func:`do_mmap()` under the hood. This is often considered
dubious, seems to be discouraged for new GEM-enabled drivers, and will
thus not be described here.

The second method uses the mmap system call on the DRM file handle.

::

    void *mmap(void *addr, size_t length, int prot, int flags, int fd,
               off_t offset);

DRM identifies the GEM object to be mapped by a fake offset
passed through the mmap offset argument. Prior to being mapped, a GEM
object must thus be associated with a fake offset. To do so, drivers
must call :c:func:`drm_gem_create_mmap_offset()` on the object.

Once allocated, the fake offset value must be passed to the application
in a driver-specific way and can then be used as the mmap offset
argument.
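
On the application side, using the fake offset could look like this
minimal sketch. The wrapper name ``gem_mmap()`` is hypothetical, and how
``fake_offset`` and ``size`` are obtained is driver-specific and not
shown.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <sys/types.h>

    /*
     * Map a GEM object through the DRM file descriptor. fake_offset is
     * the value handed out by the driver; it selects the object and is
     * not a real file position. Returns NULL on failure (errno is set).
     */
    static void *gem_mmap(int drm_fd, uint64_t fake_offset, size_t size)
    {
            void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                             MAP_SHARED, drm_fd, (off_t)fake_offset);

            return ptr == MAP_FAILED ? NULL : ptr;
    }

The mapping is released with ``munmap()`` as usual. On 32-bit systems the
program should be built with large file support so that ``off_t`` can
hold the full 64-bit offset.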

The GEM core provides a helper method :c:func:`drm_gem_mmap()` to
handle object mapping. The method can be set directly as the mmap file
operation handler. It will look up the GEM object based on the offset
value and set the VMA operations to the :c:type:`struct drm_driver
<drm_driver>` gem_vm_ops field. Note that
:c:func:`drm_gem_mmap()` doesn't map memory to userspace, but
relies on the driver-provided fault handler to map pages individually.

To use :c:func:`drm_gem_mmap()`, drivers must fill the
:c:type:`struct drm_driver <drm_driver>` gem_vm_ops field
with a pointer to VM operations.

::

    struct vm_operations_struct *gem_vm_ops

    struct vm_operations_struct {
            void (*open)(struct vm_area_struct *area);
            void (*close)(struct vm_area_struct *area);
            int (*fault)(struct vm_area_struct *vma, struct vm_fault *vmf);
    };

The open and close operations must update the GEM object reference
count. Drivers can use the :c:func:`drm_gem_vm_open()` and
:c:func:`drm_gem_vm_close()` helper functions directly as open
and close handlers.

The fault operation handler is responsible for mapping individual pages
to userspace when a page fault occurs. Depending on the memory
allocation scheme, drivers can allocate pages at fault time, or can
decide to allocate memory for the GEM object at the time the object is
created.

Drivers that want to map the GEM object upfront instead of handling page
faults can implement their own mmap file operation handler.

Memory Coherency
~~~~~~~~~~~~~~~~

When mapped to the device or used in a command buffer, backing pages for
an object are flushed to memory and marked write combined so as to be
coherent with the GPU. Likewise, if the CPU accesses an object after the
GPU has finished rendering to the object, then the object must be made
coherent with the CPU's view of memory, usually involving GPU cache
flushing of various kinds. This core CPU<->GPU coherency management is
provided by a device-specific ioctl, which evaluates an object's current
domain and performs any necessary flushing or synchronization to put the
object into the desired coherency domain. Note that the object may be
busy, i.e. an active render target. In that case, setting the domain
blocks the client and waits for rendering to complete before performing
any necessary flushing operations.

Command Execution
~~~~~~~~~~~~~~~~~

Perhaps the most important GEM function for GPU devices is providing a
command execution interface to clients. Client programs construct
command buffers containing references to previously allocated memory
objects, and then submit them to GEM. At that point, GEM takes care to
bind all the objects into the GTT, execute the buffer, and provide
necessary synchronization between clients accessing the same buffers.
This often involves evicting some objects from the GTT and re-binding
others (a fairly expensive operation), and providing relocation support
which hides fixed GTT offsets from clients. Clients must take care not
to submit command buffers that reference more objects than can fit in
the GTT; otherwise, GEM will reject them and no rendering will occur.
Similarly, if several objects in the buffer require fence registers to
be allocated for correct rendering (e.g. 2D blits on pre-965 chips),
care must be taken not to require more fence registers than are
available to the client. Such resource management should be abstracted
from the client in libdrm.
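
The two limits mentioned above (total GTT space and available fence
registers) can be modelled as a simple pre-submission check. This is a
hedged userspace sketch with hypothetical names (``struct
model_exec_object``, ``model_exec_validate()``). It ignores alignment,
fragmentation and eviction, all of which a real implementation has to
deal with.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    struct model_exec_object {
            uint64_t size;      /* bytes the object occupies once bound */
            int needs_fence;    /* non-zero if a fence register is needed */
    };

    /* Returns 0 if the submission could fit, -1 if it must be rejected. */
    static int model_exec_validate(const struct model_exec_object *objs,
                                   size_t count, uint64_t gtt_size,
                                   unsigned int fence_regs)
    {
            uint64_t total = 0;
            unsigned int fences = 0;

            for (size_t i = 0; i < count; i++) {
                    /* total <= gtt_size holds, so this cannot wrap. */
                    if (objs[i].size > gtt_size - total)
                            return -1;  /* does not fit in the GTT */
                    total += objs[i].size;
                    if (objs[i].needs_fence && ++fences > fence_regs)
                            return -1;  /* not enough fence registers */
            }
            return 0;
    }

A client library could run such a check before submitting a command
buffer and split the work into smaller submissions when it fails.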

GEM Function Reference
----------------------

.. kernel-doc:: drivers/gpu/drm/drm_gem.c
   :export:

.. kernel-doc:: include/drm/drm_gem.h
   :internal:

VMA Offset Manager
------------------

.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :doc: vma offset manager

.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :export:

.. kernel-doc:: include/drm/drm_vma_manager.h
   :internal:

PRIME Buffer Sharing
--------------------

PRIME is the cross-device buffer sharing framework in DRM, originally
created for the OPTIMUS range of multi-GPU platforms. To userspace, PRIME
buffers are dma-buf based file descriptors.

Overview and Driver Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Similar to GEM global names, PRIME file descriptors are also used to
share buffer objects across processes. They offer additional security:
as file descriptors must be explicitly sent over UNIX domain sockets to
be shared between applications, they can't be guessed like the globally
unique GEM names.

Drivers that support the PRIME API must set the DRIVER_PRIME bit in the
:c:type:`struct drm_driver <drm_driver>`
driver_features field, and implement the prime_handle_to_fd and
prime_fd_to_handle operations.

::

    int (*prime_handle_to_fd)(struct drm_device *dev,
                              struct drm_file *file_priv, uint32_t handle,
                              uint32_t flags, int *prime_fd);
    int (*prime_fd_to_handle)(struct drm_device *dev,
                              struct drm_file *file_priv, int prime_fd,
                              uint32_t *handle);

Those two operations convert a handle to a PRIME file descriptor and vice
versa. Drivers must use the kernel dma-buf buffer sharing framework to
manage the PRIME file descriptors. Similar to the mode setting API, PRIME
is agnostic to the underlying buffer object manager, as long as handles
are 32-bit unsigned integers.

While non-GEM drivers must implement the operations themselves, GEM
drivers must use the :c:func:`drm_gem_prime_handle_to_fd()` and
:c:func:`drm_gem_prime_fd_to_handle()` helper functions. Those
helpers rely on the driver gem_prime_export and gem_prime_import
operations to create a dma-buf instance from a GEM object (dma-buf
exporter role) and to create a GEM object from a dma-buf instance
(dma-buf importer role).

::

    struct dma_buf *(*gem_prime_export)(struct drm_device *dev,
                                        struct drm_gem_object *obj,
                                        int flags);
    struct drm_gem_object *(*gem_prime_import)(struct drm_device *dev,
                                               struct dma_buf *dma_buf);

These two operations are mandatory for GEM drivers that support PRIME.
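
For context, the userspace side of the handle and file descriptor
conversion might look like the sketch below. It assumes the kernel UAPI
header is installed as ``<drm/drm.h>``. The wrapper names
``user_prime_export()`` and ``user_prime_import()`` are hypothetical;
the ioctl numbers, ``struct drm_prime_handle`` and ``DRM_CLOEXEC`` come
from the DRM UAPI.

.. code-block:: c

    #include <fcntl.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <drm/drm.h>

    /* Export a handle as a dma-buf file descriptor. Returns 0 or -1. */
    static int user_prime_export(int drm_fd, uint32_t handle, int *prime_fd)
    {
            struct drm_prime_handle req;

            memset(&req, 0, sizeof(req));
            req.handle = handle;
            req.flags = DRM_CLOEXEC;    /* close the fd on exec */
            if (ioctl(drm_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &req) < 0)
                    return -1;
            *prime_fd = req.fd;
            return 0;
    }

    /* Import a dma-buf file descriptor as a handle local to drm_fd. */
    static int user_prime_import(int drm_fd, int prime_fd, uint32_t *handle)
    {
            struct drm_prime_handle req;

            memset(&req, 0, sizeof(req));
            req.fd = prime_fd;
            if (ioctl(drm_fd, DRM_IOCTL_PRIME_FD_TO_HANDLE, &req) < 0)
                    return -1;
            *handle = req.handle;
            return 0;
    }

The exported file descriptor can be passed to another process over a UNIX
domain socket, or imported on a different DRM device, which is what makes
cross-device sharing possible.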

PRIME Helper Functions
~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :doc: PRIME Helpers

PRIME Function References
-------------------------

.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :export:

DRM MM Range Allocator
----------------------

Overview
~~~~~~~~

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: Overview

LRU Scan/Eviction Support
~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: lru scan roaster

DRM MM Range Allocator Function References
------------------------------------------

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :export:

.. kernel-doc:: include/drm/drm_mm.h
   :internal:

CMA Helper Functions Reference
------------------------------

.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :doc: cma helpers

.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :export:

.. kernel-doc:: include/drm/drm_gem_cma_helper.h
   :internal: