Summary of Common Algorithms in Data Structures
First, four data-structure visualization sites, in descending order of recommendation
Books
Sorting
- insertion
- binary insertion
- shell
- bubble
- quick
- selection
- heap
- merge
- radix
Insertion Sort
ALGORITHM
```
for i = 2:n,
    for k = i; k > 1 and a[k] < a[k-1]; k--,
        swap a[k], a[k-1]
    → invariant: a[1..i] is sorted
end
```
Note: the array a is 1-based here. k starts at i, so the first pass begins by comparing the second element a[2] with the first element a[1]; once a pass finishes, i++ moves on to the next pass. The outer index i marks the last element of the current pass. The condition k > 1 means the final comparison can only be a[1] vs a[2], never a[0] vs a[1], because with 1-based indexing there is no a[0]; the last pass starts by comparing a[n-1] and a[n].
Note that no extra array is needed just for shifting elements to the right.
The outer loop tracks the start and end positions of each pass, i.e. the pass count; the inner loop does the comparisons. Write the outer loop according to the visualized process: in selection sort, for example, the outer loop differs depending on whether the minimum goes to the front or the maximum goes to the back.
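The C block that originally stood here did not survive extraction. A minimal sketch of insertion sort in C (the name `insertion_sort` and the 0-based indexing are assumptions; it shifts elements right rather than swapping, as the note above describes):

```c
/* Insertion sort: grow a sorted prefix a[0..i-1]; insert a[i] by
 * shifting larger elements one slot to the right, then drop it in.
 * No extra array is needed for the shifting. */
void insertion_sort(int a[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int k = i;
        while (k > 0 && a[k - 1] > key) {
            a[k] = a[k - 1];   /* shift right instead of swapping */
            k--;
        }
        a[k] = key;            /* key lands in its sorted position */
    }
}
```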
DISCUSSION
Although the worst case is O(n²), namely when the input is reversed, insertion sort is still a very good choice when the data is almost sorted (this is called adaptive) and when the problem size is small (this is called low overhead).
It is also stable; for that reason, insertion sort is often used as the recursive base case (when the problem size is small) for higher-overhead divide-and-conquer sorting algorithms, such as merge sort or quick sort.
PROPERTIES
- Stable
- O(1) extra space
- O(n²) comparisons and swaps
- Adaptive: O(n) time when nearly sorted
- Very low overhead
Binary Insertion Sort
Replace the linear scan for a[i]'s insertion position in straight insertion sort with binary search, then shift the elements to the right in a single pass.
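A sketch of that idea in C (the name `binary_insertion_sort` is an assumption). Binary search saves comparisons but not moves, so the sort remains O(n²) overall:

```c
/* Binary insertion sort: binary-search the insertion point of a[i]
 * in the sorted prefix a[0..i-1], then shift the tail right in one
 * pass and place the key. */
void binary_insertion_sort(int a[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int lo = 0, hi = i;          /* insertion point lies in [lo, hi) */
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] <= key)       /* <= keeps equal keys in order (stable) */
                lo = mid + 1;
            else
                hi = mid;
        }
        for (int j = i; j > lo; j--) /* one pass of right shifts */
            a[j] = a[j - 1];
        a[lo] = key;
    }
}
```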
Shell Sort
ALGORITHM
```
h = 1
while h < n, h = 3*h + 1
while h > 0,
    h = h / 3
    for k = 1:h, insertion sort a[k:h:n]
    → invariant: each h-sub-array is sorted
end
```
Note: first compute the increment sequence from n (here 1, 4, 13, 40, ...), then start from the largest h. Note the step h = h / 3:
this yields the largest h smaller than n, because the preceding while loop only exits once h > n, so we have to step back down. Then compare as in insertion sort over a[1,4], a[2,5], and so on; a[k:h:n] denotes the elements from k to n with stride h.
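The C block here survived only as its signature, `void shell_sort(int a[], int n)`. A sketch consistent with that signature, using the 1, 4, 13, 40, ... increments (h = 3h + 1) described in the note above:

```c
/* Shell sort with Knuth's increments 1, 4, 13, 40, ...
 * Each pass runs an insertion sort over elements h apart. */
void shell_sort(int a[], int n)
{
    int h = 1;
    while (h < n)
        h = 3 * h + 1;                 /* overshoot past n ... */
    for (h /= 3; h > 0; h /= 3) {      /* ... then step back below n */
        for (int i = h; i < n; i++) {  /* insertion sort with gap h */
            int key = a[i];
            int k = i;
            while (k >= h && a[k - h] > key) {
                a[k] = a[k - h];
                k -= h;
            }
            a[k] = key;
        }
    }
}
```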
DISCUSSION
Shell sort's worst-case time complexity depends on the increment sequence. For the increments 1, 4, 13, 40, 121, ... used here, the time complexity is O(n^(3/2)). For other increments, time complexity is known to be O(n^(4/3)) and even O(n·lg²(n)). Neither a tight upper bound on the time complexity nor the best increment sequence is known.
Because shell sort is based on insertion sort, it inherits insertion sort's adaptive properties. The adaptation is not as dramatic, because shell sort requires one pass through the data for each increment, but it is significant. For the increment sequence shown above, there are log₃(n) increments, so the time complexity for nearly sorted data is O(n·log₃(n)).
Because of its low overhead, relatively simple implementation, adaptive properties, and sub-quadratic time complexity, shell sort may be a viable alternative to the O(n·lg(n)) sorting algorithms for some applications when the data to be sorted is not very large.
PROPERTIES
- Not stable
- O(1) extra space
- O(n^(3/2)) time with the increments shown (see discussion)
- Adaptive: O(n·lg(n)) time when nearly sorted
Bubble Sort
ALGORITHM
```
for i = n-1:1,
    swapped = false
    for j = 0:i-1,
        if a[j] > a[j+1],
            swap a[j], a[j+1]
            swapped = true
    → invariant: a[i..n-1] in final position
    break if not swapped
end
```
Note: after each pass of bubble sort, the last element of the range is in its final position.
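The C block here did not survive extraction. A minimal sketch (the name `bubble_sort` is an assumption) in which each pass bubbles the largest remaining element to the end, matching the note above:

```c
#include <stdbool.h>

/* Bubble sort with an early-exit flag: if a pass makes no swaps the
 * array is sorted, which is what makes it adaptive (O(n) on nearly
 * sorted input). */
void bubble_sort(int a[], int n)
{
    for (int i = n - 1; i > 0; i--) {
        bool swapped = false;
        for (int j = 0; j < i; j++) {
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
                swapped = true;
            }
        }
        if (!swapped)
            break;                 /* no swaps: already sorted */
    }
}
```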
DISCUSSION
Bubble sort has many of the same properties as insertion sort, but has slightly higher overhead. In the case of nearly sorted data, bubble sort takes O(n) time, but requires at least 2 passes through the data (whereas insertion sort requires something more like 1 pass).
PROPERTIES
- Stable
- O(1) extra space
- O(n²) comparisons and swaps
- Adaptive: O(n) when nearly sorted
Quick Sort
ALGORITHM
Note: there are several variants of quick sort.
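The C block here survived only as its signature, `void quick_sort(int arr[], int low, int high)`. A sketch consistent with that signature; the Hoare-style 2-way partition and the middle-element pivot are assumptions, and the smaller half is recursed first so the stack stays O(lg(n)), as the discussion below explains:

```c
static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* 2-way quick sort over arr[low..high] (inclusive bounds). */
void quick_sort(int arr[], int low, int high)
{
    while (low < high) {
        int pivot = arr[low + (high - low) / 2];
        int i = low, j = high;
        while (i <= j) {                  /* Hoare-style partition */
            while (arr[i] < pivot) i++;
            while (arr[j] > pivot) j--;
            if (i <= j)
                swap_int(&arr[i++], &arr[j--]);
        }
        /* Recurse into the smaller side, loop on the larger side:
         * the second sub-sort becomes iteration, bounding the stack. */
        if (j - low < high - i) {
            quick_sort(arr, low, j);
            low = i;
        } else {
            quick_sort(arr, i, high);
            high = j;
        }
    }
}
```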
DISCUSSION
When carefully implemented, quick sort is robust and has low overhead. When a stable sort is not needed, quick sort is an excellent general-purpose sort – although the 3-way partitioning version should always be used instead.
The 2-way partitioning code shown above is written for clarity rather than optimal performance; it exhibits poor locality, and, critically, exhibits O(n2) time when there are few unique keys. A more efficient and robust 2-way partitioning method is given in Quicksort is Optimal by Robert Sedgewick and Jon Bentley. The robust partitioning produces balanced recursion when there are many values equal to the pivot, yielding probabilistic guarantees of O(n·lg(n)) time and O(lg(n)) space for all inputs.
With both sub-sorts performed recursively, quick sort requires O(n) extra space for the recursion stack in the worst case when recursion is not balanced. This is exceedingly unlikely to occur, but it can be avoided by sorting the smaller sub-array recursively first; the second sub-array sort is a tail recursive call, which may be done with iteration instead. With this optimization, the algorithm uses O(lg(n)) extra space in the worst case.
PROPERTIES
- Not stable
- O(lg(n)) extra space (see discussion)
- O(n²) time, but typically O(n·lg(n)) time
- Not adaptive
Quick Sort 3 Way
ALGORITHM
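The pseudocode block here did not survive extraction. A minimal sketch of simple 3-way (Dutch national flag) partitioning, written for clarity rather than performance; the names `quick_sort_3way`/`swap3` and the middle-element pivot are assumptions:

```c
static void swap3(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* One pass splits a[lo..hi] into  < pivot | == pivot | > pivot,
 * then only the outer parts are sorted recursively.  With few unique
 * keys the middle band is large, which is what makes 3-way quick sort
 * adaptive in that case. */
void quick_sort_3way(int a[], int lo, int hi)
{
    if (lo >= hi)
        return;
    int pivot = a[lo + (hi - lo) / 2];
    int lt = lo, i = lo, gt = hi;
    while (i <= gt) {
        if (a[i] < pivot)      swap3(&a[lt++], &a[i++]);
        else if (a[i] > pivot) swap3(&a[i], &a[gt--]);
        else                   i++;       /* equal to pivot: leave it */
    }
    quick_sort_3way(a, lo, lt - 1);       /* strictly-less part */
    quick_sort_3way(a, gt + 1, hi);       /* strictly-greater part */
}
```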
DISCUSSION
The 3-way partition variation of quick sort has slightly higher overhead compared to the standard 2-way partition version. Both have the same best, typical, and worst case time bounds, but this version is highly adaptive in the very common case of sorting with few unique keys.
The 3-way partitioning code shown above is written for clarity rather than optimal performance; it exhibits poor locality, and performs more swaps than necessary. A more efficient but more elaborate 3-way partitioning method is given in Quicksort is Optimal by Robert Sedgewick and Jon Bentley.
When stability is not required, quick sort is the general purpose sorting algorithm of choice. Recently, a novel dual-pivot variant of 3-way partitioning has been discovered that beats the single-pivot 3-way partitioning method both in theory and in practice.
PROPERTIES
- Not stable
- O(lg(n)) extra space
- O(n²) time, but typically O(n·lg(n)) time
- Adaptive: O(n) time when O(1) unique keys
Selection Sort
ALGORITHM
```
for i = 0:n-1,
    k = i
    for j = i+1:n-1, if a[j] < a[k], k = j
    → invariant: a[k] is the smallest of a[i..n-1]
    swap a[i], a[k]
    → invariant: a[0..i] in final position
end
```
Note: each pass can either put the minimum at the front or the maximum at the back. k holds the index of the provisional minimum (or maximum), which is only finalized once the pass completes.
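The C block here did not survive extraction. A minimal sketch (the name `selection_sort` is an assumption); note that it performs exactly one swap per pass, which is where the Θ(n) swap bound below comes from:

```c
/* Selection sort: each pass finds the minimum of the unsorted suffix
 * (k tracks the provisional minimum, as in the note above) and swaps
 * it into position i. */
void selection_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int k = i;                       /* provisional minimum */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[k])
                k = j;                   /* finalized only after the pass */
        if (k != i) {
            int t = a[i]; a[i] = a[k]; a[k] = t;
        }
    }
}
```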
DISCUSSION
Judging from these comparisons, selection sort should essentially never be used. It does not adapt to the data in any way (notice that the four animations above run in lock step), so its running time is always quadratic.
It does have one advantage: it minimizes the number of swaps, so it can be useful when swapping items is very expensive.
PROPERTIES
- Not stable
- O(1) extra space
- Θ(n²) comparisons
- Θ(n) swaps
- Not adaptive
Heap Sort
ALGORITHM
Note: a heap is a complete binary tree stored in an array. First build the heap, then sort.
When building the heap (heapify), decide whether it is a max-heap or a min-heap. Start from the last non-leaf node (index n/2) and sift, working upward.
Heap sort is a tree-based selection sort: treat the records to be sorted as a complete binary tree. Step 1: build the initial heap; step 2: repeatedly adjust the heap.
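The C block here survived only as the signature `void HeapAdjust(int a[], int s, int n)` plus a comment noting that a max-heap is customary. A sketch along those lines; the 0-based indexing and the names `heap_adjust`/`heap_sort` are assumptions:

```c
/* Sift a[s] down through the max-heap a[0..n-1] (0-based: children of
 * s are 2s+1 and 2s+2).  Uses the "hole" technique: hold the key,
 * pull larger children up, drop the key at the end. */
static void heap_adjust(int a[], int s, int n)
{
    int key = a[s];
    for (int child = 2 * s + 1; child < n; child = 2 * child + 1) {
        if (child + 1 < n && a[child + 1] > a[child])
            child++;                     /* larger of the two children */
        if (a[child] <= key)
            break;                       /* heap order restored */
        a[s] = a[child];
        s = child;
    }
    a[s] = key;
}

/* Step 1: heapify from the last non-leaf (n/2 - 1) upward.
 * Step 2: repeatedly swap the max to the end and re-sift the root. */
void heap_sort(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)
        heap_adjust(a, i, n);            /* build the initial max-heap */
    for (int end = n - 1; end > 0; end--) {
        int t = a[0]; a[0] = a[end]; a[end] = t;
        heap_adjust(a, 0, end);          /* restore heap on a[0..end-1] */
    }
}
```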
DISCUSSION
Heap sort is simple to implement, performs an O(n·lg(n)) in-place sort, but is not stable.
The first loop, the Θ(n) “heapify” phase, puts the array into heap order. The second loop, the O(n·lg(n)) “sortdown” phase, repeatedly extracts the maximum and restores heap order.
The sink function is written recursively for clarity. Thus, as shown, the code requires Θ(lg(n)) space for the recursive call stack. However, the tail recursion in sink() is easily converted to iteration, which yields the O(1) space bound.
Both phases are slightly adaptive, though not in any particularly useful manner. In the nearly sorted case, the heapify phase destroys the original order. In the reversed case, the heapify phase is as fast as possible since the array starts in heap order, but then the sortdown phase is typical. In the few unique keys case, there is some speedup but not as much as in shell sort or 3-way quicksort.
PROPERTIES
- Not stable
- O(1) extra space (see discussion)
- O(n·lg(n)) time
- Not really adaptive
Merge Sort
ALGORITHM
Merge sort is an efficient sorting algorithm built on the merge operation; it is a textbook application of divide and conquer.
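A minimal top-down sketch in C (the names `merge_sort`, `msort_rec`, and `merge_halves` are assumptions), using the Θ(n) auxiliary buffer discussed below:

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted halves a[lo..mid) and a[mid..hi) via tmp;
 * `<=` keeps equal elements in original order, so the sort is stable. */
static void merge_halves(int a[], int tmp[], int lo, int mid, int hi)
{
    memcpy(tmp + lo, a + lo, (size_t)(hi - lo) * sizeof(int));
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        a[k++] = (tmp[i] <= tmp[j]) ? tmp[i++] : tmp[j++];
    while (i < mid) a[k++] = tmp[i++];
    while (j < hi)  a[k++] = tmp[j++];
}

static void msort_rec(int a[], int tmp[], int lo, int hi)
{
    if (hi - lo < 2)
        return;                          /* 0 or 1 element: sorted */
    int mid = lo + (hi - lo) / 2;
    msort_rec(a, tmp, lo, mid);          /* divide: sort left half */
    msort_rec(a, tmp, mid, hi);          /* divide: sort right half */
    merge_halves(a, tmp, lo, mid, hi);   /* conquer: merge them */
}

int merge_sort(int a[], int n)           /* 0 on success, -1 on alloc failure */
{
    if (n < 2)
        return 0;
    int *tmp = malloc((size_t)n * sizeof(int));
    if (tmp == NULL)
        return -1;
    msort_rec(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```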
DISCUSSION
Merge sort is very predictable. It makes between 0.5lg(n) and lg(n) comparisons per element, and between lg(n) and 1.5lg(n) swaps per element. The minima are achieved for already sorted data; the maxima are achieved, on average, for random data. If using Θ(n) extra space is of no concern, then merge sort is an excellent choice: It is simple to implement, and it is the only stable O(n·lg(n)) sorting algorithm. Note that when sorting linked lists, merge sort requires only Θ(lg(n)) extra space (for recursion).
Merge sort is the algorithm of choice for a variety of situations: when stability is required, when sorting linked lists, and when random access is much more expensive than sequential access (for example, external sorting on tape).
There do exist linear time in-place merge algorithms for the last step of the algorithm, but they are both expensive and complex. The complexity is justified for applications such as external sorting when Θ(n) extra space is not available.
PROPERTIES
- Stable
- Θ(n) extra space for arrays (as shown)
- Θ(lg(n)) extra space for linked lists
- Θ(n·lg(n)) time
- Not adaptive
- Does not require random access to data