Skip to content

Commit

Permalink
Merge tag 'drm-intel-gt-next-2021-05-28' of git://anongit.freedesktop…
Browse files Browse the repository at this point in the history
….org/drm/drm-intel into drm-next

UAPI Changes:
- Add reworked uAPI for DG1 behind CONFIG_BROKEN (Matt A, Abdiel)

Driver Changes:

- Fix for Gitlab issues #3293 and #3450:
  Avoid kernel crash on older L-shape memory machines

- Add Wa_14010733141 (VDBox SFC reset) for Gen11+ (Aditya)
- Fix crash in auto_retire active retire callback due to
  misalignment (Stephane)
- Fix overlay active retire callback alignment (Tvrtko)
- Eliminate need to align active retire callbacks (Matt A, Ville,
  Daniel)
- Program FF_MODE2 tuning value for all Gen12 platforms (Caz)
- Add Wa_14011060649 for TGL,RKL,DG1 and ADLS (Swathi)
- Create stolen memory region from local memory on DG1 (CQ)
- Place PD in LMEM on dGFX (Matt A)
- Use WC when default state object is allocated in LMEM (Venkata)
- Determine the coherent map type based on object location (Venkata)
- Use lmem physical addresses for fb_mmap() on discrete (Mohammed)
- Bypass aperture on fbdev when LMEM is available (Anusha)
- Return error value when displayable BO not in LMEM for dGFX (Mohammed)
- Do release kernel context if breadcrumb measure fails (Janusz)
- Hide modparams for compiled-out features (Tvrtko)
- Apply Wa_22010271021 for all Gen11 platforms (Caz)
- Fix unlikely ref count race in arming the watchdog timer (Tvrtko)
- Check actual RC6 enable status in PMU (Tvrtko)
- Fix a double free in gen8_preallocate_top_level_pdp (Lv)
- Use trylock in shrinker for GGTT on BSW VT-d and BXT (Maarten)
- Remove erroneous i915_is_ggtt check for
  I915_GEM_OBJECT_UNBIND_VM_TRYLOCK (Maarten)

- Convert uAPI headers to real kerneldoc (Matt A)
- Clean up kerneldoc warnings headers (Matt A, Maarten)
- Fail driver if LMEM training failed (Matt R)
- Avoid div-by-zero on Gen2 (Ville)
- Read C0DRB3/C1DRB3 as 16 bits again and add _BW suffix (Ville)
- Remove reference to struct drm_device.pdev (Thomas)
- Increase separation between GuC and execlists code (Chris, Matt B)

- Use might_alloc() (Bernard)
- Split DGFX_FEATURES from GEN12_FEATURES (Lucas)
- Deduplicate Wa_22010271021 programming on (Jose)
- Drop duplicate WaDisable4x2SubspanOptimization:hsw (Tvrtko)
- Selftest improvements (Chris, Hsin-Yi, Tvrtko)
- Shuffle around init_memory_region for stolen (Matt)
- Typo fixes (wengjianfeng)

[airlied: fix conflict with fixes in i915_active.c]
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YLCbBR22BsQ/dpJB@jlahtine-mobl.ger.corp.intel.com
  • Loading branch information
Dave Airlie committed Jun 2, 2021
2 parents 43ed3c6 + 5b26d57 commit ccd1950
Show file tree
Hide file tree
Showing 90 changed files with 2,007 additions and 568 deletions.
8 changes: 8 additions & 0 deletions Documentation/gpu/driver-uapi.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
===============
DRM Driver uAPI
===============

drm/i915 uAPI
=============

.. kernel-doc:: include/uapi/drm/i915_drm.h
1 change: 1 addition & 0 deletions Documentation/gpu/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ Linux GPU Driver Developer's Guide
drm-kms
drm-kms-helpers
drm-uapi
driver-uapi
drm-client
drivers
backlight
Expand Down
131 changes: 131 additions & 0 deletions Documentation/gpu/rfc/i915_gem_lmem.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
=========================
I915 DG1/LMEM RFC Section
=========================

Upstream plan
=============
For upstream the overall plan for landing all the DG1 stuff and turning it for
real, with all the uAPI bits is:

* Merge basic HW enabling of DG1(still without pciid)
* Merge the uAPI bits behind special CONFIG_BROKEN(or so) flag
* At this point we can still make changes, but importantly this lets us
start running IGTs which can utilize local-memory in CI
* Convert over to TTM, make sure it all keeps working. Some of the work items:
* TTM shrinker for discrete
* dma_resv_lockitem for full dma_resv_lock, i.e not just trylock
* Use TTM CPU pagefault handler
* Route shmem backend over to TTM SYSTEM for discrete
* TTM purgeable object support
* Move i915 buddy allocator over to TTM
* MMAP ioctl mode(see `I915 MMAP`_)
* SET/GET ioctl caching(see `I915 SET/GET CACHING`_)
* Send RFC(with mesa-dev on cc) for final sign off on the uAPI
* Add pciid for DG1 and turn on uAPI for real

New object placement and region query uAPI
==========================================
Starting from DG1 we need to give userspace the ability to allocate buffers from
device local-memory. Currently the driver supports gem_create, which can place
buffers in system memory via shmem, and the usual assortment of other
interfaces, like dumb buffers and userptr.

To support this new capability, while also providing a uAPI which will work
beyond just DG1, we propose to offer three new bits of uAPI:

DRM_I915_QUERY_MEMORY_REGIONS
-----------------------------
New query ID which allows userspace to discover the list of supported memory
regions(like system-memory and local-memory) for a given device. We identify
each region with a class and instance pair, which should be unique. The class
here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
like DG1.

Side note: The class/instance design is borrowed from our existing engine uAPI,
where we describe every physical engine in terms of its class, and the
particular instance, since we can have more than one per class.

In the future we also want to expose more information which can further
describe the capabilities of a region.

.. kernel-doc:: include/uapi/drm/i915_drm.h
:functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions

GEM_CREATE_EXT
--------------
New ioctl which is basically just gem_create but now allows userspace to provide
a chain of possible extensions. Note that if we don't provide any extensions and
set flags=0 then we get the exact same behaviour as gem_create.

Side note: We also need to support PXP[1] in the near future, which is also
applicable to integrated platforms, and adds its own gem_create_ext extension,
which basically lets userspace mark a buffer as "protected".

.. kernel-doc:: include/uapi/drm/i915_drm.h
:functions: drm_i915_gem_create_ext

I915_GEM_CREATE_EXT_MEMORY_REGIONS
----------------------------------
Implemented as an extension for gem_create_ext, we would now allow userspace to
optionally provide an immutable list of preferred placements at creation time,
in priority order, for a given buffer object. For the placements we expect
them each to use the class/instance encoding, as per the output of the regions
query. Having the list in priority order will be useful in the future when
placing an object, say during eviction.

.. kernel-doc:: include/uapi/drm/i915_drm.h
:functions: drm_i915_gem_create_ext_memory_regions

One fair criticism here is that this seems a little over-engineered[2]. If we
just consider DG1 then yes, a simple gem_create.flags or something is totally
all that's needed to tell the kernel to allocate the buffer in local-memory or
whatever. However looking to the future we need uAPI which can also support
upcoming Xe HP multi-tile architecture in a sane way, where there can be
multiple local-memory instances for a given device, and so using both class and
instance in our uAPI to describe regions is desirable, although specifically
for DG1 it's uninteresting, since we only have a single local-memory instance.

Existing uAPI issues
====================
Some potential issues we still need to resolve.

I915 MMAP
---------
In i915 there are multiple ways to MMAP GEM object, including mapping the same
object using different mapping types(WC vs WB), i.e multiple active mmaps per
object. TTM expects one MMAP at most for the lifetime of the object. If it
turns out that we have to backpedal here, there might be some potential
userspace fallout.

I915 SET/GET CACHING
--------------------
In i915 we have set/get_caching ioctl. TTM doesn't let us to change this, but
DG1 doesn't support non-snooped pcie transactions, so we can just always
allocate as WB for smem-only buffers. If/when our hw gains support for
non-snooped pcie transactions then we must fix this mode at allocation time as
a new GEM extension.

This is related to the mmap problem, because in general (meaning, when we're
not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
allocation mode.

Possible idea is to let the kernel picks the mmap mode for userspace from the
following table:

smem-only: WB. Userspace does not need to call clflush.

smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
memory, and always give userspace a WC mapping. GPU still does snooped access
here(assuming we can't turn it off like on DG1), which is a bit inefficient.

lmem only: always WC

This means on discrete you only get a single mmap mode, all others must be
rejected. That's probably going to be a new default mode or something like
that.

Links
=====
[1] https://patchwork.freedesktop.org/series/86798/

[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
4 changes: 4 additions & 0 deletions Documentation/gpu/rfc/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,3 +15,7 @@ host such documentation:

* Once the code has landed move all the documentation to the right places in
the main core, helper or driver sections.

.. toctree::

i915_gem_lmem.rst
9 changes: 9 additions & 0 deletions drivers/gpu/drm/i915/display/intel_display.c
Original file line number Diff line number Diff line change
Expand Up @@ -11660,11 +11660,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
struct drm_framebuffer *fb;
struct drm_i915_gem_object *obj;
struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
struct drm_i915_private *i915;

obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
if (!obj)
return ERR_PTR(-ENOENT);

/* object is backed with LMEM for discrete */
i915 = to_i915(obj->base.dev);
if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
/* object is "remote", not in local memory */
i915_gem_object_put(obj);
return ERR_PTR(-EREMOTE);
}

fb = intel_framebuffer_create(obj, &mode_cmd);
i915_gem_object_put(obj);

Expand Down
51 changes: 38 additions & 13 deletions drivers/gpu/drm/i915/display/intel_fbdev.c
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,8 @@
#include <drm/drm_fb_helper.h>
#include <drm/drm_fourcc.h>

#include "gem/i915_gem_lmem.h"

#include "i915_drv.h"
#include "intel_display_types.h"
#include "intel_fbdev.h"
Expand Down Expand Up @@ -137,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
size = mode_cmd.pitches[0] * mode_cmd.height;
size = PAGE_ALIGN(size);

/* If the FB is too big, just don't use it since fbdev is not very
* important and we should probably use that space with FBC or other
* features. */
obj = ERR_PTR(-ENODEV);
if (size * 2 < dev_priv->stolen_usable_size)
obj = i915_gem_object_create_stolen(dev_priv, size);
if (IS_ERR(obj))
obj = i915_gem_object_create_shmem(dev_priv, size);
if (HAS_LMEM(dev_priv)) {
obj = i915_gem_object_create_lmem(dev_priv, size,
I915_BO_ALLOC_CONTIGUOUS);
} else {
/*
* If the FB is too big, just don't use it since fbdev is not very
* important and we should probably use that space with FBC or other
* features.
*/
if (size * 2 < dev_priv->stolen_usable_size)
obj = i915_gem_object_create_stolen(dev_priv, size);
if (IS_ERR(obj))
obj = i915_gem_object_create_shmem(dev_priv, size);
}

if (IS_ERR(obj)) {
drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
return PTR_ERR(obj);
Expand Down Expand Up @@ -178,6 +188,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
unsigned long flags = 0;
bool prealloc = false;
void __iomem *vaddr;
struct drm_i915_gem_object *obj;
int ret;

if (intel_fb &&
Expand Down Expand Up @@ -232,13 +243,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
info->fbops = &intelfb_ops;

/* setup aperture base/size for vesafb takeover */
info->apertures->ranges[0].base = ggtt->gmadr.start;
info->apertures->ranges[0].size = ggtt->mappable_end;
obj = intel_fb_obj(&intel_fb->base);
if (i915_gem_object_is_lmem(obj)) {
struct intel_memory_region *mem = obj->mm.region;

info->apertures->ranges[0].base = mem->io_start;
info->apertures->ranges[0].size = mem->total;

/* Use fbdev's framebuffer from lmem for discrete */
info->fix.smem_start =
(unsigned long)(mem->io_start +
i915_gem_object_get_dma_address(obj, 0));
info->fix.smem_len = obj->base.size;
} else {
info->apertures->ranges[0].base = ggtt->gmadr.start;
info->apertures->ranges[0].size = ggtt->mappable_end;

/* Our framebuffer is the entirety of fbdev's system memory */
info->fix.smem_start =
(unsigned long)(ggtt->gmadr.start + vma->node.start);
info->fix.smem_len = vma->node.size;
/* Our framebuffer is the entirety of fbdev's system memory */
info->fix.smem_start =
(unsigned long)(ggtt->gmadr.start + vma->node.start);
info->fix.smem_len = vma->node.size;
}

vaddr = i915_vma_pin_iomap(vma);
if (IS_ERR(vaddr)) {
Expand Down
4 changes: 2 additions & 2 deletions drivers/gpu/drm/i915/display/intel_frontbuffer.c
Original file line number Diff line number Diff line change
Expand Up @@ -211,7 +211,6 @@ static int frontbuffer_active(struct i915_active *ref)
return 0;
}

__i915_active_call
static void frontbuffer_retire(struct i915_active *ref)
{
struct intel_frontbuffer *front =
Expand Down Expand Up @@ -266,7 +265,8 @@ intel_frontbuffer_get(struct drm_i915_gem_object *obj)
atomic_set(&front->bits, 0);
i915_active_init(&front->write,
frontbuffer_active,
i915_active_may_sleep(frontbuffer_retire));
frontbuffer_retire,
I915_ACTIVE_RETIRE_SLEEPS);

spin_lock(&i915->fb_tracking.lock);
if (rcu_access_pointer(obj->frontbuffer)) {
Expand Down
5 changes: 2 additions & 3 deletions drivers/gpu/drm/i915/display/intel_overlay.c
Original file line number Diff line number Diff line change
Expand Up @@ -384,8 +384,7 @@ static void intel_overlay_off_tail(struct intel_overlay *overlay)
i830_overlay_clock_gating(dev_priv, true);
}

__i915_active_call static void
intel_overlay_last_flip_retire(struct i915_active *active)
static void intel_overlay_last_flip_retire(struct i915_active *active)
{
struct intel_overlay *overlay =
container_of(active, typeof(*overlay), last_flip);
Expand Down Expand Up @@ -1402,7 +1401,7 @@ void intel_overlay_setup(struct drm_i915_private *dev_priv)
overlay->saturation = 146;

i915_active_init(&overlay->last_flip,
NULL, intel_overlay_last_flip_retire);
NULL, intel_overlay_last_flip_retire, 0);

ret = get_registers(overlay, OVERLAY_NEEDS_PHYSICAL(dev_priv));
if (ret)
Expand Down
3 changes: 1 addition & 2 deletions drivers/gpu/drm/i915/gem/i915_gem_context.c
Original file line number Diff line number Diff line change
Expand Up @@ -1046,7 +1046,6 @@ struct context_barrier_task {
void *data;
};

__i915_active_call
static void cb_retire(struct i915_active *base)
{
struct context_barrier_task *cb = container_of(base, typeof(*cb), base);
Expand Down Expand Up @@ -1080,7 +1079,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
if (!cb)
return -ENOMEM;

i915_active_init(&cb->base, NULL, cb_retire);
i915_active_init(&cb->base, NULL, cb_retire, 0);
err = i915_active_acquire(&cb->base);
if (err) {
kfree(cb);
Expand Down
Loading

0 comments on commit ccd1950

Please sign in to comment.