[Compute Shader] Bitonic-sort

Graphics Techniques

[Compute Shader] Bitonic-sort

doyyy_0 2025. 2. 16. 14:58

바이토닉 정렬은 잘 쓰이는 알고리즘은 아니지만, 병렬처리로 구현할 수 있다는 부분에서 다른 정렬과는 차이가 있습니다. 퀵정렬같이 시간복잡도는 O(N*logN)입니다. 하지만 컴퓨터 쉐이더에서 병렬로 처리할 수 있다면 더 빠르겠죠?

코드는 다음과 같습니다.

    for (uint32_t k = 2; k <= numElements; k *= 2)
        for (uint32_t j = k / 2; j > 0; j /= 2) {

		#pragma omp parallel for

            for (int32_t i = 0; i < int32_t(numElements); i++) {
                int32_t l = i ^ j;
                if (l > i) {
                    if (((i & k) == 0) && (arr[i].key > arr[l].key) ||
                        ((i & k) != 0) && (arr[i].key < arr[l].key))
                        std::swap(arr[i], arr[l]);
                }
            }

        }

그런데 왜 병렬처리를 할 수 있냐면, for (int32_t i = 0; i < int32_t(numElements); i++) 이부분이 굳이 i가 순서대로 진행되지 않아도 되고 무작위 순서로 실행되더라도, 전체적으로 numElements 개의 원소가 모두 참여하면 상관없습니다. 비교교환만 잘 이루어지면 되는 것이죠.

그렇다면 for (int32_t i = 0; i < int32_t(numElements); i++)이부분을 컴퓨터쉐이더로 구현할 수 있겠죠?

1. CPP코드로 CPU로 구현할 부분을 정리해줍니다.

size_t constCount = 0;
for (uint32_t k = 2; k <= m_numElements; k *= 2)
    for (uint32_t j = k / 2; j > 0; j /= 2) { 
        context->CSSetConstantBuffers(
            0, 1, m_constsGpu[constCount++].GetAddressOf());
        context->CSSetShader(m_bitonicSortCS.Get(), 0, 0);
        context->CSSetUnorderedAccessViews(0, 1, m_array.GetAddressOfUAV(),
                                           NULL);
        context->Dispatch(UINT(ceil(m_numElements / 1024.0f)), 1, 1);
    }

2. 나머지는 컴퓨터쉐이더에서 처리해줍니다.

[numthreads(1024, 1, 1)]
void main(int3 gID : SV_GroupID, int3 gtID : SV_GroupThreadID,
          uint3 dtID : SV_DispatchThreadID)
{
    
    // 힌트: (value 말고) key로 정렬하시면 됩니다.
    
    uint i = dtID.x;
    uint l = i ^ j;
    if (l > i)
    {
        if (((i & k) == 0) && (arr[i].key > arr[l].key) ||
                        ((i & k) != 0) && (arr[i].key < arr[l].key))
            Swap(arr[i], arr[l]);
    }  
    
}

'Graphics Techniques' 카테고리의 다른 글

[유체 시뮬레이션] 3d연기 Down=> Up Sampling으로 성능향상 기법 (0)	2025.03.06
[Volume Rendering] 구름 볼륨 렌더링 + SDF (0)	2025.03.01
[ComputeShader] 컴퓨팅 쉐이더 픽셀 깨짐 (0)	2025.02.15
[shadowMapping] PCSS - 부드러운 그림자 (0)	2025.01.29
[DepthStencil] 거울 반사의 원리와 구현 (0)	2025.01.21

현재글[Compute Shader] Bitonic-sort

그래픽스 및 게임 엔진 블로그

eofud0930@naver.com

chorin's projection, 유체시뮬레이션, 렌더링파이프라인, RayTracing, rendering, 메모리, 그래픽스, 텍스춰링, sph, rasterization, C++, thread, ComputeShader, Ray Tracing, DirectX, perlin, tessellation, graphics, 렌더링, PBR,

Today :
Yesterday :

일	월	화	수	목	금	토
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

그래픽스 및 게임 엔진 블로그