FAQ-6: How should lag be handled when the CPU runs far ahead of the GPU? (e.g.

Posted on 2006-12-14 02:51:00
How should lag be handled when the CPU runs far ahead of the GPU (for example, noticeable mouse and keyboard input lag)?

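The simplest approach is to lock the back buffer (similar to calling glFinish in OpenGL). This keeps the CPU and GPU synchronized, but it also prevents them from working in parallel, because the CPU must wait for the GPU to finish before it can start on the next frame. (Note: while the two are synchronizing, the GPU may sit idle waiting for the sync.)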

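A better approach is double-buffered texture locking, which generalizes locking the back buffer. At the end of each frame, render a triangle to a 2x2 texture, then lock the texture and read back its contents. So far this is equivalent to locking the back buffer and suffers the same kind of stall. It ensures that the CPU never gets more than one frame ahead of the GPU.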

In short: use two textures and alternate between them for rendering and locking:

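Render frame 1

Render a triangle to texture 0

Lock and read texture 1

Render frame 2

Render a triangle to texture 1

Lock and read texture 0

Render frame 3

Render a triangle to texture 0

Lock and read texture 1

Render frame 4

Render a triangle to texture 1

Lock and read texture 0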

...

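Now the GPU no longer stalls, and the CPU never gets more than two frames ahead of the GPU, so lag is limited to one frame. Overall efficiency is generally higher because the GPU stays busy (assuming the application is GPU bound). The same idea extends to triple-buffered texture locking, and you can insert several sync points per frame for finer control over lag.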

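Another solution is to use the asynchronous query functionality in DirectX 9. At the end of each frame, insert a D3DQUERYTYPE_EVENT query into the rendering stream. You can then call GetData to check whether the GPU has reached that point. With this query you can keep the CPU from getting more than two frames ahead of the GPU. As with the texture approach, you can also insert several sync points per frame for finer control.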

/*************************/
How should I handle input lag (that is, apparent mouse and keyboard response time) in situations where the CPU is getting too far ahead of the GPU?

The most obvious solution would be to lock the back buffer for each frame in Direct3D (analogous to calling glFinish() in OpenGL). This ensures that all pending graphics commands are completed by the GPU before the CPU moves on. However, this completely removes any potential for asynchronous processing, as the CPU is unable to process the next frame until the current frame has finished rendering.

A better solution is double-buffered texture locking. This is a generalization of locking the back buffer. At the end of your frame you render a single triangle to a tiny (2x2) texture, then lock the texture and read its contents. So far this solution is equivalent to locking the back buffer, and suffers the same kind of stalls. It ensures that the CPU never gets more than 1 frame ahead of the GPU.

Now generalize it: use two tiny textures and alternately render to them and alternately lock them:

Render frame 1

Render a triangle to texture 0

Lock and read texture 1

Render frame 2

Render a triangle to texture 1

Lock and read texture 0

Render frame 3

Render a triangle to texture 0

Lock and read texture 1

Render frame 4

Render a triangle to texture 1

Lock and read texture 0

...

Now the GPU does not get stalled, and the CPU never gets more than 2 frames ahead of the GPU. Lag is up to one frame, but overall efficiency is higher since the GPU is always busy (if you are GPU bound). You can further generalize this to triple-buffered textures, and you may even be able to insert multiple sync points per frame to get finer control over lag.
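To make the trade-off concrete, here is a rough timing model of the scheme, not real Direct3D code. It assumes constant per-frame CPU and GPU costs and a GPU that runs frames strictly in submission order; `simulatePacing`, `PacingResult` and the tick units are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct PacingResult {
    int maxQueuedFrames;      // most frames still unfinished on the GPU right after a sync
    std::int64_t totalTicks;  // time at which the GPU finishes the last frame
};

// Hypothetical model. markerCount = 1 stands for locking a single marker
// texture (or the back buffer); markerCount = 2 stands for the alternating
// two-texture scheme; 3 for the triple-buffered variant.
PacingResult simulatePacing(int markerCount, int frames,
                            std::int64_t cpuTicks, std::int64_t gpuTicks) {
    assert(markerCount >= 1 && frames >= 1);
    std::vector<std::int64_t> gpuDone(frames);  // completion time of each frame
    std::int64_t cpuNow = 0;   // current CPU time
    std::int64_t gpuFree = 0;  // time at which the GPU becomes idle
    int maxQueued = 0;

    for (int f = 0; f < frames; ++f) {
        cpuNow += cpuTicks;  // CPU builds and submits frame f (plus its marker)
        gpuDone[f] = std::max(cpuNow, gpuFree) + gpuTicks;
        gpuFree = gpuDone[f];

        // Lock the marker written (markerCount - 1) frames ago. The lock
        // blocks the CPU until the GPU has finished that frame.
        int locked = f - (markerCount - 1);
        if (locked >= 0) cpuNow = std::max(cpuNow, gpuDone[locked]);

        // Count frames submitted so far that the GPU has not finished yet.
        int queued = 0;
        for (int g = 0; g <= f; ++g) {
            if (gpuDone[g] > cpuNow) ++queued;
        }
        maxQueued = std::max(maxQueued, queued);
    }
    return {maxQueued, gpuFree};
}
```

With a CPU cost of 1 tick and a GPU cost of 10 ticks over 100 frames, this model gives 1100 ticks and no queued frames for one marker, versus 1001 ticks and at most one queued frame for two markers. Adding the frame the CPU is currently building to the queued count gives the "1 frame ahead" and "2 frames ahead" bounds described above.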

A second solution is to use DirectX 9's Asynchronous Query functionality (analogous to using fences in OpenGL). At the end of your frame, insert a D3DQUERYTYPE_EVENT query into your rendering stream. You can then poll whether the GPU has reached this event yet by using GetData. As with the texture-locking solution, you can thus ensure (by busy-waiting on the CPU) that the CPU never gets more than 2 frames ahead of the GPU, while the GPU is never idled. Similarly, it is conceivable to insert multiple queries per frame to get finer control over lag.
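The query-based throttle could be structured along these lines. This is only a sketch: `FrameThrottle` and the `Fence` interface (`issue()`, `isDone()`) are hypothetical names, and `CountdownFence` is a stand-in so the logic can be exercised without a device. In a real Direct3D 9 build you would presumably wrap an `IDirect3DQuery9` created with `D3DQUERYTYPE_EVENT`, mapping `issue()` to `Issue(D3DISSUE_END)` and `isDone()` to checking whether `GetData(NULL, 0, D3DGETDATA_FLUSH)` returns `S_OK`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical throttle: keeps `slots` fences in a ring. slots = 1 is a full
// sync every frame; slots = 2 lets the CPU run one frame further ahead.
template <typename Fence>
class FrameThrottle {
public:
    explicit FrameThrottle(std::size_t slots) : ring_(slots) {
        assert(slots > 0);
    }

    // Call once per frame, after the last draw call has been submitted.
    void endFrame() {
        ring_[next_].fence.issue();  // mark the end of this frame
        ring_[next_].issued = true;
        next_ = (next_ + 1) % ring_.size();

        // The slot to be reused next frame holds the oldest outstanding
        // fence. Busy-wait until the GPU has passed it.
        if (ring_[next_].issued) {
            while (!ring_[next_].fence.isDone()) {
                ++spins_;
            }
        }
    }

    long spinCount() const { return spins_; }

private:
    struct Slot {
        Fence fence;
        bool issued = false;
    };
    std::vector<Slot> ring_;
    std::size_t next_ = 0;
    long spins_ = 0;
};

// Stand-in fence for trying the logic without a GPU: it reports "not done"
// for the first three polls after issue(), then "done".
struct CountdownFence {
    int pollsLeft = 0;
    void issue() { pollsLeft = 3; }
    bool isDone() { return pollsLeft-- <= 0; }
};
```

A frame loop would call `endFrame()` right after presenting. Using more slots trades extra lag for fewer CPU stalls, and the busy-wait could be replaced by a yield or sleep if CPU usage matters.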
