摘要: 两阶段归并排序算法是数据库查询的一个基础技术,在数据库应用中,常常采用“两阶段多路归并排序算法”来解决对海量数据的排序问题(这里的海量数据是指数据大小远远超过了数据库可用的主存的大小,无法将所有数据一次性的载入主存进行排序)。

前言

基于斯坦福大学的《数据库系统实现》,实现两阶段多路归并排序算法,通过merge-sort算法的实现,理解外存算法所基于的I/O模型与内存算法基于的RAM模型的区别;理解不同的磁盘访问优化方法是如何提高数据访问性能的。

首先生成一个具有10,000,000个记录的文本文件,其中每个记录由100个字节组成。实验只考虑记录的一个属性A,假定A为整数类型。记录在block上封装时,采用non-spanned方式,即块上小于一个记录的空间不使用。Block的大小可在自己的操作系统上查看,xp一般为4096 bytes。在内存分配50M字节的空间用于外部merge-sort。要求设计和实现程序完成下列功能:

  1. 生成文本文件,其中属性A的值随机产生。
  2. 对文本文件中的记录,按照属性A进行排序,其中在第二阶段的排序中每个子列表使用一个block大小的缓冲区缓冲数据。
  3. 按照教材cylinder-based buffers(1M bytes)的方法,修改第二阶段的算法。
  4. 比较两种方法的时间性能,如果有更大的内存空间,算法性能还能提高多少?

算法描述

Two-Phase Multiway Merge-Sort算法的具体描述分为2个阶段,如下所示:

Phase 1

11)Fill main memory with records.
2
32)Sort with favorite main memory sorting algorithms.
4
53)Write sorted list to disk.
6
74)Repeat until all records have been put into one of the sorted lists.

Phase 2

11)Initially load input buffers with the first block of their respective sorted lists.
2
32)Repeated run a competition among the first unchosen records of each of the buffered blocks.
4
53)If an input block is exhausted, get the next block from the same file.
6
74)If the output block is full, write it to disk.

我的设计思路

从上述的算法描述中,我们知道,系统主要由2大模块组成:Phase1和Phase2。Phase1阶段主要将生成的记录文件按内存块大小(本实验中是50MB)分成多个(本实验中是20个)相应的子记录文件,把这些文件中的记录读进内存进行排序,再写回磁盘上。Phase2阶段利用多路归并排序算法,将Phase1阶段已经排好序的子记录文件重新归并为1个有序的记录文件,写回到磁盘上。由于我们在Phase1和Phase2阶段之前必须先生成1个含有10000000个100B记录的文件,所以系统必须再加上1个生成记录文件的Generate Record File模块。终上所述,系统由3大模块组成,分别为:Generate Record File、Phase1、Phase2。Phase1模块可以细分为内存块排序模块Main Memory Sort和写回磁盘模块Write To Disk。Phase2模块可以细分为多路归并排序模块Merge-Sort和写回磁盘模块Write To Disk。

详细的系统逻辑结构图如下图所示:
TPMMS系统逻辑结构图

TPMMS系统逻辑结构图

数据结构

我们讨论单个记录的数据结构。由于1个记录有100个字节,其中4字节是由随机整数组成的主键属性Primary Key,另外96个字节是随意填充的数据content,而且本系统又是由C语言进行实现的,所以我们可以采取结构体来作为记录的数据结构。其中整形字段key记录4字节的主键属性,以便进行排序工作。数组字段contents用来填充剩余的96个字节,内容可以随意(本实验中均为0)。具体的数据结构如下所示:

1/* the data structrue of a record */
2typedef struct record           
3{
4	int key;                    // primary key
5	char contents[ROW_NUMBER + 1];   // content
6}Record;
7//单个记录的数据结构

具体实现

Generate Record File阶段:

Generate Record File阶段比较简单,首先打开一个文件,然后生成随机数key并将其写入文件中,再填充96个任意内容的字节(本实验中均为0),即能生成1条完整的记录。重复10000000次,生成我们所需的记录文件。核心代码实现如下所示,其中MAX_RECORD_NUMBER大小为10000000,ROW_NUMBER大小为95。

 1// open file
 2FILE *fp = fopen( fileName.c_str(), "w" );    
 3if ( !fp )     // open failed
 4    {
 5        printf("File could not be created!\n");
 6        fprintf( stderr, "File could not be created!\n" );
 7        exit( 1 );
 8    }
 9
10// generate random integers and records
11srand( (unsigned)time( NULL ) );       // srand seed 
12for ( int i = 0; i < MAX_RECORD_NUMBER; i ++ )  // generate MAX_RECORD_NUMBER records
13{ 
14    if ( i > 0 )
15        fprintf( fp, "\n" );
16    int key = rand();     // primary key, random integer, 4 bytes       
17
18    // write record to disk, every record has 100 bytes
19    fprintf( fp, "%d ", key );               // write key as the first 4 bytes
20    for ( int j = 0; j < ROW_NUMBER; j ++ )  // write '0' for content as the other 96 bytes
21        fprintf( fp, "%c", '0' );
22}
23fclose( fp );       // close output file 
24//Generate Record File阶段的实现

Phase1阶段

Phase1阶段重点在于如何进行内存排序,并写回到磁盘上。这里我们采用了STL的sort函数帮助我们进行排序。首先读入50MB记录,利用sort函数进行排序后,写到磁盘上生成1个有序的子记录文件。重复进行20次,生成20个相应的子记录文件。核心代码实现如图4-3所示,其中BLOCK_SIZE大小为50M,SUB_LIST_NUMBER大小为20。

 1// read all records to main memory
 2for ( int k = 0; k < SUB_LIST_NUMBER; k ++ )
 3{
 4    // read records of a block size to main memory 
 5    for ( i = 0; i < BLOCK_SIZE; i ++ ) 
 6    {
 7        fgets( str, ROW_NUMBER + 10, infp );
 8        sscanf( str, "%d %s", &subRecord[i].key, subRecord[i].contents );
 9    }
10
11    // use STL algorithm sort to sort records
12    sort( subRecord, subRecord + BLOCK_SIZE, cmp );   
13
14    temp = generateFileName( k );  // sorted list name
15    FILE *outfp = fopen( temp.c_str(), "w" );   // open output file
16    if ( !outfp )                 // open failed
17    {
18        printf( "File %s could not be opened!\n", temp.c_str());
19        fprintf( stderr, "File %s could not be opened!\n", temp.c_str() );
20        exit( 1 );
21    }
22
23    // write the sorted records to sub list file
24    for ( i = 0; i < BLOCK_SIZE; i ++ )
25    {
26        if ( i > 0 )
27            fprintf( outfp, "\n" );
28
29        fprintf( outfp, "%d %s", subRecord[i].key, subRecord[i].contents );
30    }
31    printf( "The sorted list %s generated successfully!\n", temp.c_str() );
32    fclose( outfp );       // close output stream
33}
34//Phase1阶段的实现

Phase2阶段

Phase2阶段是本系统实现的难点所在。主要的实现大致可以分为以下几部分进行讨论:

1)输入缓冲的实现

将Phase1阶段中得到的20个子记录文件的首字符分别读入长度为20的输入缓冲数组inputBuffer,核心代码实现如下所示:

 1// read all of the sublist's first record to input buffer
 2for ( int k = 0; k < SUB_LIST_NUMBER; k ++ )
 3{
 4    temp = generateFileName( k );
 5    infp[k] = fopen( temp.c_str(), "r" );    // open sorted list file
 6    if ( !infp[k] )                         // open failed
 7    {
 8        printf( "File %s could not be created!\n", temp.c_str() );
 9        fprintf( stderr, "File %s could not be created!\n", temp.c_str() );
10        exit( 1 );
11    }          
12
13    // read record to input buffer
14    fgets( str, ROW_NUMBER + 10, infp[k] );
15    sscanf( str, "%d %s", &inputBuffer[k].key, inputBuffer[k].contents );
16}
17//输入缓冲的实现

2)输出缓冲的实现

选取输入缓冲数组inputBuffer中主键属性key最小的那个缓冲区,输入到输出缓冲数组outputBuffer中,然后循环执行,核心代码实现如下所示:

1//  select the smallest record
2int index = selectMinRecord( SUB_LIST_NUMBER );
3int newIndex = index;
4
5// write record to output buffer
6copyRecord( outputBuffer[count], inputBuffer[index] );
7count ++;
8total ++;
9//输出缓冲的实现

3)多路归并排序的实现

如果输出缓冲数组outputBuffer已经填满,此时可知输出缓冲是有序的,且之后的主键属性key的值都不会小于该输出缓冲区,这时我们即可将其输出到最后想要得到的结果文件上,核心代码实现如下所示:

1// output buffer is full, write to disk
2if ( count >= BLOCK_SIZE )
3{
4    count = 0;               // reset count
5    writeToDisk( outfp );
6}
7//多路归并排序的实现

算法结果

50MB内存TPMMS结果

采用50MB内存块大小进行TPMMS实验的结果如下图所示:
image
从上图可以看出,生成1GB大小10000000条记录的文件需要152秒,phase1阶段需要136秒,phase2阶段需要150秒。所以整个排序过程需要286秒,即4分46秒的时间才能完成。

10MB内存TPMMS结果

将50MB内存缩减5倍,进行10MB内存块大小的TPMMS计算。这将产生100个子记录文件。算法结果如下图所示:
image
生成1GB大小10000000条记录的文件所需时间不变,仍为152秒左右。我们注重于phase1阶段和phase2阶段的所需时间。从图中可以看出,phase1阶段需要147秒,phase2阶段需要152秒。整个排序过程需要300秒,即5分钟的时间才能完成。

100MB内存TPMMS算法结果

将50MB内存增加2倍,进行100MB内存块大小的TPMMS计算。这将产生10个子记录文件。运行结果如下图所示:
image

三者的比较

从上面的实验结果,我们可以很明显地看出,内存块大小越大,算法所需时间越少。这是因为内存块越小,生成的子记录文件个数就越多,这样phase1阶段生成子记录文件的时间就增加了。并且这还使得phase2阶段的输出缓冲区变小,导致多路归并时程序读写磁盘的次数增多,所以phase2阶段时间也增加了。这样整个排序过程时间当然增加。

终上所述,当在理想条件下,我们应使用内存块大小较大的方法来进行TPMMS算法的实现。在本章中TPMMS算法的性能为:100MB优于50MB优于10MB。所以在可能的情况下,应该考虑采纳100MB的TPMMS算法。

算法问题及解决

Phase2阶段遇到的问题和解决方法

前文已经详细描述了Phase2阶段的3个主要的实现阶段,但是仅仅依靠这3个阶段还不能完全实现Phase2阶段,必须解决以下几个关键问题才能完成Phase2阶段的所有任务。

读完某个子记录文件后,输入缓冲的填充方法

当某个输入缓冲数组inputBuffer[i]相对应的子记录文件infp[i]已经读完时,我们就必须重新查找其余可用的子记录文件,按数组下标i搜索到第一个可用的文件infp[k]后,将它的第一个字节继续填充到输入缓冲数组inputBuffer[i]中。

特别的,当数组下标i超过子记录文件总数SUB_LIST_NUMBER(本实验中为20)时,我们就认为所有子记录文件已经读取完毕,这时可以设置一个bool型变量flag = true,进行标识。核心代码实现如下所示:

 1if ( feof( infp[index] ) )     // the whole sub file hase been resd
 2{
 3    // select a file that has more record to be read
 4    for ( i = 0; i < SUB_LIST_NUMBER; i ++ )
 5    {
 6        if ( !feof( infp[i] ) )
 7        {
 8            newIndex = i;
 9            break;
10        }
11    }
12
13    // all sorted file have been read
14    if ( i >= SUB_LIST_NUMBER )
15        flag = true;
16}
17//读完某个子记录文件后,输入缓冲的填充方法

读完所有子记录文件后,处理最后一组输入缓冲数据的方法

利用在上一步中设置的bool型变量flag,当flag=true时,我们知道子记录文件已经全部读取完毕。这时在输入缓冲数组inputBuffer中只剩下最后一组数据,并且根据Phase2阶段的定义,它们肯定比之前输入缓冲中的数据要大。所以我们只需利用STL提供的sort函数对它们进行排序后,直接输出到最终结果文件即可。核心代码实现如下所示:

 1// handle the last number of size records 
 2sort( inputBuffer, inputBuffer + SUB_LIST_NUMBER, cmp );    
 3for ( i = 1; i < SUB_LIST_NUMBER; i ++ )
 4{
 5    // copy to output buffer
 6    copyRecord( outputBuffer[BLOCK_SIZE - SUB_LIST_NUMBER + i], inputBuffer[i] );
 7    count ++;
 8    total ++;
 9} 
10writeToDisk( outfp );    // write to disk
11			
12//读完所有子记录文件后,处理最后一组输入缓冲数据的方法

生成子记录文件名的方法

当我们生成子记录文件时,想要赋予文件类似于record_k.txt (k = i+1, 0 <= i <= 19) 的文件名。由于在C语言中,不支持字符串和整数的直接连接。在这里我们需要一个generateFileName函数,采用itoa函数将整数k = i+1转换成字符串,再连接到“record_”后面,从而得到想要的文件名。核心代码实现如下所示:

 1/* give an integer, to generate a file name */
 2string generateFileName( int i )
 3{
 4    char str[20];             // temporary charater array
 5    string temp = "";         // temporary string
 6    
 7    itoa( i+1, str, 10 );       // store integer k+1 to array str
 8    temp = str;               // convert array str to temporary string 
 9    temp = "D:/record_" + temp + ".txt";  // form the file name
10    
11    return temp;  // return the temporary string of file name 
12}
13//生成子记录文件名的方法

完整代码实现

  1#include <algorithm>    // for sort function
  2#include <string>       // for strcpy 
  3#include <cstdio>       // for fscanf, fprintf, fopen     
  4#include <ctime>        // for clock
  5using namespace std;
  6
  7/* define the constants used in this program */
  8const int MAX_RECORD_NUMBER = 10000000;     // max record number
  9const int BLOCK_SIZE =  500000;             // main memory block size
 10const int ROW_NUMBER = 95;                  // for record to fill the other 96 bytes
 11const int SUB_LIST_NUMBER = ( MAX_RECORD_NUMBER / BLOCK_SIZE );   // sub list number
 12const int MAX = 99999999;   // for function selectMinRecord to initialize the variable "min" 
 13
 14/* the data structrue of a record */
 15typedef struct record           
 16{
 17	int key;                    // primary key
 18	char contents[ROW_NUMBER + 1];   // content
 19}Record;
 20
 21Record subRecord[BLOCK_SIZE];         // main memory buffer
 22Record inputBuffer[BLOCK_SIZE + 1];   // input buffer      
 23Record outputBuffer[BLOCK_SIZE + 1];  // output buffer
 24
 25/* generate a file of MAX_RECORD_NUMBER (= 10000000) records, 
 26   every record is 100 bytes */
 27void generateFile( string fileName )
 28{
 29	// calculate time
 30	printf("The records is now under generating ...\n");
 31	clock_t start, finish;
 32	double duration;
 33    start = clock();    // start time
 34
 35	// open file
 36	FILE *fp = fopen( fileName.c_str(), "w" );    
 37	if ( !fp )     // open failed
 38	{
 39		printf("File could not be created!\n");
 40		fprintf( stderr, "File could not be created!\n" );
 41		exit( 1 );
 42	}
 43
 44	// generate random integers and records
 45	srand( (unsigned)time( NULL ) );       // srand seed 
 46	for ( int i = 0; i < MAX_RECORD_NUMBER; i ++ )  // generate MAX_RECORD_NUMBER records
 47	{ 
 48		if ( i > 0 )
 49			fprintf( fp, "\n" );
 50		int key = rand();     // primary key, random integer, 4 bytes		
 51
 52		// write record to disk, every record has 100 bytes
 53		fprintf( fp, "%d ", key );               // write key as the first 4 bytes
 54		for ( int j = 0; j < ROW_NUMBER; j ++ )  // write '0' for content as the other 96 bytes
 55			fprintf( fp, "%c", '0' );
 56	}
 57	fclose( fp );       // close output file 
 58
 59    // calculate time
 60	finish = clock();     // finish time 
 61	duration = (double)( finish - start ) / CLOCKS_PER_SEC;   // run time 
 62	printf ( "It takes %f seconds to genetate the whole records.\n", duration );
 63}
 64
 65/* use for phase 1 of two phase multiway merge sort
 66   compare two record by primary key, with ascending order */
 67bool cmp( const Record &r1, const Record &r2 )
 68{
 69	return r1.key < r2.key;
 70}
 71
 72/* give an integer, to generate a file name */
 73string generateFileName( int i )
 74{
 75	char str[20];             // temporary charater array
 76	string temp = "";         // temporary string
 77
 78	itoa( i+1, str, 10 );       // store integer k+1 to array str
 79	temp = str;               // convert array str to temporary string 
 80	temp = "D:/record_" + temp + ".txt";  // form the file name
 81
 82	return temp;  // return the temporary string of file name 
 83}
 84
 85/* phase 1 of two phase multiway merge sort
 86   read record with maximum block size to main memory 
 87   and sort them by primary key */
 88void phase1( string fileName )
 89{
 90	// open file
 91	FILE *infp = fopen( fileName.c_str(), "r" );   
 92	if ( !infp )            // open failed
 93	{
 94		printf( "File %s could not be opened!\n", fileName.c_str() );
 95		fprintf( stderr, "File %s could not be opened!\n", fileName.c_str() );  
 96		exit( 1 );
 97	}
 98
 99	string temp = "";         // temporary string
100	int i = 0, j = 0;
101
102	// calculate time
103	printf( "The sorted list of records is now under generating ...\n" );
104	clock_t start, finish;
105	double duration;
106    start = clock();    // start time
107
108	char str[ROW_NUMBER + 10];
109
110	// read all records to main memory
111	for ( int k = 0; k < SUB_LIST_NUMBER; k ++ )
112	{
113		// read records of a block size to main memory 
114		for ( i = 0; i < BLOCK_SIZE; i ++ ) 
115		{
116			fgets( str, ROW_NUMBER + 10, infp );
117			sscanf( str, "%d %s", &subRecord[i].key, subRecord[i].contents );
118		}
119
120		// use STL algorithm sort to sort records
121		sort( subRecord, subRecord + BLOCK_SIZE, cmp );   
122
123		temp = generateFileName( k );  // sorted list name
124		FILE *outfp = fopen( temp.c_str(), "w" );   // open output file
125		if ( !outfp )                 // open failed
126		{
127			printf( "File %s could not be opened!\n", temp.c_str());
128			fprintf( stderr, "File %s could not be opened!\n", temp.c_str() );
129			exit( 1 );
130		}
131
132		// write the sorted records to sub list file
133		for ( i = 0; i < BLOCK_SIZE; i ++ )
134		{
135			if ( i > 0 )
136				fprintf( outfp, "\n" );
137
138			fprintf( outfp, "%d %s", subRecord[i].key, subRecord[i].contents );
139		}
140		printf( "The sorted list %s generated successfully!\n", temp.c_str() );
141		fclose( outfp );       // close output stream
142	}
143	fclose( infp );         // close input file
144
145	// calculate time
146	finish = clock();     // finish time 
147	duration = (double)( finish - start ) / CLOCKS_PER_SEC;   // run time 
148	printf( "It takes %f seconds to genetate the sorted list of records.\n", duration );
149}
150
151/* copy record r2 to record r1 */
152void copyRecord( Record &r1, Record &r2 )
153{
154	r1.key = r2.key;
155	strcpy( r1.contents, r2.contents );
156}
157
158/* copy a record to input buffer */
159void copyToInputBuffer( FILE *fp, int index )
160{
161	char str[ROW_NUMBER + 10];
162	fgets( str, ROW_NUMBER + 10, fp );
163	sscanf( str, "%d %s", &inputBuffer[index].key, inputBuffer[index].contents );
164}
165
166/* write the records in output buffer to disk
167   when the output buffer is full */
168void writeToDisk( FILE *fp )
169{
170	// flush output buffer to disk
171	for ( int j = 0; j < BLOCK_SIZE; j ++ )
172	{
173		fprintf( fp, "%d %s\n", outputBuffer[j].key, outputBuffer[j].contents );
174	}
175}
176
177/* select the minimum record in input buffer 
178   by primary key */
179int selectMinRecord( int size )
180{
181	int min = MAX;     
182	int index = 0;
183	for ( int i = 0; i < size; i ++ )
184	{
185		if ( inputBuffer[i].key < min )
186		{
187			min = inputBuffer[i].key;
188			index = i;
189		}
190	}
191	return index;
192}
193
194/* phase 2 of two phase multiway merge sort
195   merge the sorted sub list to a sorted result file
196   of ascending order */
197void phase2()
198{
199	// open output file to store the sorted records
200	FILE *outfp = fopen( "D:/result.txt", "w" ); 
201	if ( !outfp )      // open failed
202	{
203		printf( "Output file could not be created!\n" );
204		fprintf( stderr, "Output file could not be created!\n" );
205		exit( 1 );
206	}
207
208	string temp = "";         // temporary string
209	int count = 0;            // the used output buffer size
210	int total = 0;            // the record number written to disk
211	int i = 0, j = 0;
212
213	// array of input stream object, to open sub list of sorted records
214	FILE * *infp = new FILE*[ SUB_LIST_NUMBER ]; 
215
216	// calculate time
217	printf( "Merge all of the sorted lists of records ...\n" );
218	clock_t start, finish;
219	double duration;
220    start = clock();     // start time
221
222	char str[ROW_NUMBER + 10];
223
224	// read all of the sublist's first record to input buffer
225	for ( int k = 0; k < SUB_LIST_NUMBER; k ++ )
226	{
227		temp = generateFileName( k );
228		infp[k] = fopen( temp.c_str(), "r" );    // open sorted list file
229		if ( !infp[k] )                         // open failed
230		{
231			printf( "File %s could not be created!\n", temp.c_str() );
232			fprintf( stderr, "File %s could not be created!\n", temp.c_str() );
233			exit( 1 );
234		}          
235
236		// read record to input buffer
237		fgets( str, ROW_NUMBER + 10, infp[k] );
238		sscanf( str, "%d %s", &inputBuffer[k].key, inputBuffer[k].contents );
239	}
240	
241	// mark whether all the sored list have been read
242	bool flag = false;       
243
244	// merge the sorted list
245	while ( total < MAX_RECORD_NUMBER )
246	{
247		//  select the smallest record
248		int index = selectMinRecord( SUB_LIST_NUMBER );
249		int newIndex = index;
250
251		// write record to output buffer
252		copyRecord( outputBuffer[count], inputBuffer[index] );
253		count ++;
254		total ++;
255
256		// output buffer is full, write to disk
257		if ( count >= BLOCK_SIZE )
258		{
259			count = 0;               // reset count
260			writeToDisk( outfp );
261		}
262
263		if ( feof( infp[index] ) )     // the whole sub file hase been resd
264		{
265			// select a file that has more record to be read
266			for ( i = 0; i < SUB_LIST_NUMBER; i ++ )
267			{
268				if ( !feof( infp[i] ) )
269				{
270					newIndex = i;
271					break;
272				}
273			}
274
275			// all sorted file have been read
276			if ( i >= SUB_LIST_NUMBER )
277				flag = true;
278		}
279
280		if ( !flag )     // not all sublist have been read 
281			copyToInputBuffer( infp[newIndex], index  );
282		else              // all sublist have been read 
283		{
284			// handle the last number of size records 
285			sort( inputBuffer, inputBuffer + SUB_LIST_NUMBER, cmp );    
286			for ( i = 1; i < SUB_LIST_NUMBER; i ++ )
287			{
288				// copy to output buffer
289				copyRecord( outputBuffer[BLOCK_SIZE - SUB_LIST_NUMBER + i], inputBuffer[i] );
290				count ++;
291				total ++;
292			} 
293			writeToDisk( outfp );    // write to disk
294			break;
295		}
296	}
297
298    // close all input file
299	for ( i = 0; i < SUB_LIST_NUMBER; i ++ )
300		fclose( infp[i] );
301	fclose( outfp );        // close output file
302
303	// calculate time
304	finish = clock();     // finish time 
305	duration = (double)( finish - start ) / CLOCKS_PER_SEC;   // run time 
306	printf( "It takes %f seconds to merge the sorted list of records.\n", duration );
307}
308
309/* the entrance of the program */
310int main()
311{
312	// generate record file
313	string fileName = "D:/record.txt";
314	generateFile( fileName );
315
316	// calculate time
317	clock_t start, finish;
318	double duration;
319    start = clock();
320
321	// phase1 and phase2
322	phase1( fileName );
323	phase2();
324
325	// calculate time
326	finish = clock();     // finish time 
327	duration = (double)( finish - start ) / CLOCKS_PER_SEC;   // run time 
328	printf( "It takes %f seconds to sort the original records.\n", duration );
329
330	return 0;
331}

该算法的优点是磁盘中的数据只被读入主存两次,就完成了海量数据的排序。减少了IO时间。基于以上的实现,就完成了数据库的海量数据的排序!


微信公众号

潘建锋

关于版权和转载

本文由 潘建锋 创作,采用 署名 4.0 国际 (CC BY 4.0) 国际许可协议进行授权。
本站文章除注明转载/出处外,均为本站原创或翻译,转载时请务必署名,否则,本人将保留一切追究责任的权利。
署名 4.0 国际 (CC BY 4.0)

转载规范

标题:数据库内部排序算法之两阶段多路归并排序算法实现
作者:潘建锋
原文:HTTPS://strikefreedom.top/database-sort-algorithm

关于留言和评论

如果您对本文《数据库内部排序算法之两阶段多路归并排序算法实现》的内容有任何疑问、补充或纠错,欢迎在下面的评论系统中留言,与作者一起交流进步,谢谢!(~ ̄▽ ̄)~