我正在尝试使用std :: chrono时间计算并使用在[A,B]范围内的随机生成的整数数组来测量Merge Sort和Quick Sort函数的持续时间,该数组的大小从5000到100,000个整数不等。
我的代码的目的是证明,当改进快速排序中拾取(pivot)的方法时,快速排序功能最终花费的时间比合并排序所花费的时间更少,因此我选择枢轴的方式是使用随机索引方法可最大程度地降低复杂度为(n ^ 2)的可能性,但是在某些情况下(我将在下面进行描述),快速排序最终比合并排序要花费更多的时间,我想知道为什么会这样。
情况1:数组中数字的范围很小,这增加了在数组中重复数字的可能性。
情况2:当我使用像clion这样的本地IDE时,快速排序功能比合并排序花费更多的时间,但是,像IDEONE.com这样的在线编译器在两种排序算法中给出的结果相似(即使所生成整数的范围为小)
这是我在上述情况下得到的结果(第一行数字是合并排序结果,第二行是快速排序结果):
1-clion可缩小数字范围(-100、600)
2-clion结果的数字范围很广(INT_MIN,INT_MAX)
3-IDEONE结果的数字范围较小(-100、600)
4- IDEONE结果的范围很广(INT_MIN,INT_MAX)
#include <bits/stdc++.h>
#include <chrono>
#include <random>
using namespace std;
mt19937 gen(chrono::steady_clock::now().time_since_epoch().count());
int* generateArray(int size)
{
int* arr = new int[size];
uniform_int_distribution<> distribution(INT_MIN, INT_MAX);
for (int i=0; i < size; ++i)
{
arr[i] = distribution(gen);
}
return arr;
}
void merge(int* leftArr, int nL, int* rightArr, int nR, int* mainArr)
{
int i=0, j=0, k=0;
while (i < nL && j < nR)
{
if (leftArr[i] < rightArr[j]) { mainArr[k++] = leftArr[i++]; }
else { mainArr[k++] = rightArr[j++]; }
}
while (i < nL){ mainArr[k++] = leftArr[i++]; }
while (j < nR){ mainArr[k++] = rightArr[j++]; }
}
void mergeSort (int* mainArray, int arrayLength)
{
if (arrayLength < 2) { return; }
int mid = arrayLength/2;
int* leftArray = new int[mid];
int* rightArray = new int[arrayLength - mid];
for (int i=0; i<mid; ++i) {leftArray[i] = mainArray[i];}
for (int i = mid; i<arrayLength; ++i) {rightArray[i - mid] = mainArray[i];}
mergeSort(leftArray, mid);
mergeSort(rightArray, arrayLength-mid);
merge(leftArray, mid, rightArray, arrayLength-mid, mainArray);
delete[] leftArray;
delete[] rightArray;
}
int partition (int* arr, int left, int right)
{
uniform_int_distribution<> distribution(left, right);
int idx = distribution(gen);
swap(arr[right], arr[idx]);
int pivot = arr[right];
int partitionIndex = left;
for (int i = left; i < right; ++i)
{
if (arr[i] <= pivot)
{
swap(arr[i], arr[partitionIndex]);
partitionIndex++;
}
}
swap(arr[right], arr[partitionIndex]);
return partitionIndex;
}
void quickSort (int* arr, int left, int right)
{
if(left < right)
{
int partitionIndex = partition(arr, left, right);
quickSort(arr, left, partitionIndex-1);
quickSort(arr, partitionIndex+1, right);
}
}
int main()
{
vector <long long> mergeDuration;
vector <long long> quickDuration;
for (int i = 5000; i<= 100000; i += 5000)
{
int* arr = generateArray(i);
auto startTime = chrono::high_resolution_clock::now();
quickSort(arr, 0, i - 1);
auto endTime = chrono::high_resolution_clock::now();
long long duration = chrono::duration_cast<chrono::milliseconds>(endTime - startTime).count();
quickDuration.push_back(duration);
delete[] arr;
}
for (int i = 5000; i <= 100000; i += 5000 )
{
int* arr = generateArray(i);
auto startTime = chrono::high_resolution_clock::now();
mergeSort(arr, i);
auto endTime = chrono::high_resolution_clock::now();
long long duration = chrono::duration_cast<chrono::milliseconds>(endTime - startTime).count();
mergeDuration.push_back(duration);
delete[] arr;
}
for (int i = 0; i<mergeDuration.size(); ++i)
{
cout << mergeDuration[i] << " ";
}
cout << endl;
for (int i = 0; i<quickDuration.size(); ++i)
{
cout << quickDuration[i] << " ";
}
}
当输入集包含大量重复项时,Quicksort表现出较差的性能。解决方案是使用维基百科上所述的三向分区:
重复元素
With a partitioning algorithm such as the ones described above (even with one that chooses good pivot values), quicksort exhibits poor performance for inputs that contain many repeated elements. The problem is clearly apparent when all the input elements are equal: at each recursion, the left partition is empty (no input values are less than the pivot), and the right partition has only decreased by one element (the pivot is removed). Consequently, the algorithm takes quadratic time to sort an array of equal values.
To solve this problem (sometimes called the Dutch national flag problem), an alternative linear-time partition routine can be used that separates the values into three groups: values less than the pivot, values equal to the pivot, and values greater than the pivot. ... The values equal to the pivot are already sorted, so only the less-than and greater-than partitions need to be recursively sorted. In pseudocode, the quicksort algorithm becomes
algorithm quicksort(A, lo, hi) is if lo < hi then p := pivot(A, lo, hi) left, right := partition(A, p, lo, hi) // note: multiple return values quicksort(A, lo, left - 1) quicksort(A, right + 1, hi)
The partition algorithm returns indices to the first ('leftmost') and to the last ('rightmost') item of the middle partition. Every item of the partition is equal to p and is therefore sorted. Consequently, the items of the partition need not be included in the recursive calls to quicksort.
进行以下修改可以quickSort
得到更好的结果:
pair<int,int> partition(int* arr, int left, int right)
{
int idx = left + (right - left) / 2;
int pivot = arr[idx]; // to be improved to median-of-three
int i = left, j = left, b = right - 1;
while (j <= b) {
auto x = arr[j];
if (x < pivot) {
swap(arr[i], arr[j]);
i++;
j++;
} else if (x > pivot) {
swap(arr[j], arr[b]);
b--;
} else {
j++;
}
}
return { i, j };
}
void quickSort(int* arr, int left, int right)
{
if (left < right)
{
pair<int, int> part = partition(arr, left, right);
quickSort(arr, left, part.first);
quickSort(arr, part.second, right);
}
}
输出:
0 1 2 3 4 5 6 7 8 9 11 11 12 13 14 15 16 19 18 19
0 0 0 1 0 1 1 1 1 1 2 3 2 2 2 2 3 3 3 3
0 1 2 3 4 5 6 6 8 8 9 12 11 12 13 14 16 17 18 19
0 0 1 1 1 2 3 3 3 4 4 4 5 6 5 6 7 7 8 8
因此,现在有很多重复的运行要快得多。
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句