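June 14, 2009

"Data Structures and Algorithm Analysis in C++": a C++ implementation of the binary search tree

Written by 九天雁翎(JTianLing) -- www.jtianling.com

"Data Structures and Algorithm Analysis in C++" by Mark Allen Weiss
Posts & Telecom Press
Chinese edition, pages 93-100: binary search trees

Note that this binary search tree has no balancing algorithm, so a single operation can degrade from the O(log N) average to the O(N) worst case, for example when the keys are inserted in sorted order.

Almost all of the code is recursive, so it is not especially efficient, and when N is large enough many operations can overflow the stack. Still, tree operations are much easier to follow when written recursively than as loops, and a balancing algorithm can be added later to keep the depth small, so everything here uses recursion.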
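As an illustration of the two caveats above, here is a minimal sketch of what a non-recursive version could look like, and of how sorted input degrades an unbalanced tree. The `Node` type and the `InsertIterative`, `ContainsIterative`, `Height` and `DestroyIterative` names are hypothetical and are not part of the class in this post; the sketch also fixes the element type to `int` for brevity.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal node type, independent of CBinarySearchTree.
struct Node
{
    int   mValue;
    Node* mpLeft;
    Node* mpRight;
    explicit Node(int aValue) : mValue(aValue), mpLeft(NULL), mpRight(NULL) {}
};

// Iterative insert: walk down with a pointer-to-pointer instead of recursing.
// Returns false if the value is already present.
bool InsertIterative(Node** appRoot, int aValue)
{
    Node** lppLink = appRoot;
    while (*lppLink != NULL)
    {
        if (aValue < (*lppLink)->mValue)      lppLink = &(*lppLink)->mpLeft;
        else if (aValue > (*lppLink)->mValue) lppLink = &(*lppLink)->mpRight;
        else                                  return false;   // duplicate
    }
    *lppLink = new Node(aValue);
    return true;
}

// Iterative lookup: constant stack usage regardless of tree depth.
bool ContainsIterative(const Node* apNode, int aValue)
{
    while (apNode != NULL)
    {
        if (aValue < apNode->mValue)      apNode = apNode->mpLeft;
        else if (aValue > apNode->mValue) apNode = apNode->mpRight;
        else                              return true;
    }
    return false;
}

// Number of nodes on the longest root-to-leaf path (0 for an empty tree),
// computed with an explicit stack on the heap.
std::size_t Height(const Node* apRoot)
{
    std::size_t lMax = 0;
    std::vector<std::pair<const Node*, std::size_t> > lStack;
    if (apRoot != NULL) lStack.push_back(std::make_pair(apRoot, std::size_t(1)));
    while (!lStack.empty())
    {
        std::pair<const Node*, std::size_t> lTop = lStack.back();
        lStack.pop_back();
        if (lTop.second > lMax) lMax = lTop.second;
        if (lTop.first->mpLeft != NULL)
            lStack.push_back(std::make_pair(lTop.first->mpLeft, lTop.second + 1));
        if (lTop.first->mpRight != NULL)
            lStack.push_back(std::make_pair(lTop.first->mpRight, lTop.second + 1));
    }
    return lMax;
}

// Frees every node without recursion and resets the root to NULL.
void DestroyIterative(Node** appRoot)
{
    std::vector<Node*> lStack;
    if (*appRoot != NULL) lStack.push_back(*appRoot);
    while (!lStack.empty())
    {
        Node* lpNode = lStack.back();
        lStack.pop_back();
        if (lpNode->mpLeft != NULL)  lStack.push_back(lpNode->mpLeft);
        if (lpNode->mpRight != NULL) lStack.push_back(lpNode->mpRight);
        delete lpNode;
    }
    *appRoot = NULL;
}
```

Inserting 1, 2, ..., N in order with `InsertIterative` produces a tree whose `Height` is N, which is exactly the shape in which a recursive lookup needs N stack frames. Inserting the same keys in a balanced order (for example 4, 2, 6, 1, 3, 5, 7) keeps the height at about log2(N).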

Binary search tree implementation:

#ifndef __BINARY_SEARCH_TREE_H__
#define __BINARY_SEARCH_TREE_H__

#include <cstddef>   // NULL

template<typename T>
class CBinarySearchTree
{
public:
    CBinarySearchTree():mpRoot(NULL) { }
    CBinarySearchTree(const CBinarySearchTree& aOrig)
    {
        mpRoot = Clone(aOrig.mpRoot);
    }
    ~CBinarySearchTree()
    {
        MakeEmpty();
    }

    ////////////////////////////////////////////
    // const member functions
    ////////////////////////////////////////////
    // Both return NULL when the tree is empty
    const T* FindMin() const;
    const T* FindMax() const;

    bool Contains( const T& aElement) const;
    bool IsEmpty() const
    {
        return mpRoot == NULL;
    }

    // I don't know how to print it in a good format
    //void PrintTree() const;

    ////////////////////////////////////////////
    // non-const member functions
    ////////////////////////////////////////////
    void MakeEmpty();
    void Insert( const T& aElement);
    void Remove( const T& aElement);

    const CBinarySearchTree& operator=(const CBinarySearchTree& aOrig);

private:
    struct CBinaryNode
    {
        CBinaryNode(const T& aElement, CBinaryNode* apLeft, CBinaryNode* apRight)
            : mElement(aElement),mpLeft(apLeft),mpRight(apRight) {  }

        T mElement;
        CBinaryNode *mpLeft;
        CBinaryNode *mpRight;
    };

    // Root node
    CBinaryNode *mpRoot;

    ////////////////////////////////////////////
    // private member functions, called recursively
    ////////////////////////////////////////////

    // I don't like to use a reference to a pointer,
    // so I used a pointer to a pointer instead
    void Insert(const T& aElement, CBinaryNode** appNode);
    void Remove(const T& aElement, CBinaryNode** appNode);

    CBinaryNode* FindMin(CBinaryNode* apNode) const;
    CBinaryNode* FindMax(CBinaryNode* apNode) const;
    bool Contains(const T& aElement, CBinaryNode * apNode) const;
    void MakeEmpty(CBinaryNode** appNode);
    //void PrintTree(CBinaryNode* apNode) const;
    CBinaryNode* Clone(CBinaryNode* apNode) const;

};


template<typename T>
bool CBinarySearchTree<T>::Contains(const T& aElement) const
{
    return Contains(aElement, mpRoot);
}

template<typename T>
bool CBinarySearchTree<T>::Contains(const T& aElement, CBinaryNode *apNode) const
{
    if( NULL == apNode )
    {
        return false;
    }
    else if ( aElement < apNode->mElement )
    {
        return Contains(aElement, apNode->mpLeft);
    }
    else if ( aElement > apNode->mElement )
    {
        return Contains(aElement, apNode->mpRight);
    }
    else
    {
        return true;      // Found it
    }
}

template<typename T>
void CBinarySearchTree<T>::Insert(const T& aElement)
{
    Insert(aElement, &mpRoot);
}

template<typename T>
void CBinarySearchTree<T>::Insert(const T& aElement, CBinaryNode** appNode)
{
    CBinaryNode *lpNode = *appNode;
    if(NULL == lpNode)
    {
        *appNode = new CBinaryNode(aElement, NULL, NULL);
    }
    else if( aElement < lpNode->mElement )
    {
        Insert(aElement, &(lpNode->mpLeft) );
    }
    else if( aElement > lpNode->mElement)
    {
        Insert(aElement, &(lpNode->mpRight) );
    }

    // Duplicates are silently ignored
}

template<typename T>
void CBinarySearchTree<T>::Remove(const T& aElement)
{
    Remove(aElement, &mpRoot);
}

template<typename T>
void CBinarySearchTree<T>::Remove(const T& aElement, CBinaryNode** appNode)
{
    CBinaryNode* lpNode = *appNode;
    if(NULL == lpNode)
    {
        return;       // The item to remove does not exist
    }

    if( aElement < lpNode->mElement )
    {
        Remove(aElement, &(lpNode->mpLeft) );
    }
    else if( aElement > lpNode->mElement )
    {
        Remove(aElement, &(lpNode->mpRight) );
    }
    else if( NULL != lpNode->mpLeft && NULL != lpNode->mpRight) // Two children
    {
        // Replace with the smallest element of the right subtree, then remove that element
        lpNode->mElement = FindMin(lpNode->mpRight)->mElement;
        Remove( lpNode->mElement, &(lpNode->mpRight) );
    }
    else
    {
        CBinaryNode *lpOldNode = lpNode;
        // Zero or one child: link the parent to the only child.
        // With no children *appNode becomes NULL, which is exactly what we need.
        *appNode = (lpNode->mpLeft != NULL) ? lpNode->mpLeft : lpNode->mpRight;
        delete lpOldNode;
    }
}


template<typename T>
const T* CBinarySearchTree<T>::FindMin() const
{
    CBinaryNode* lpNode = FindMin(mpRoot);
    return (lpNode != NULL) ? &(lpNode->mElement) : NULL;
}


// Damn it! So many redundant words just to satisfy C++ syntax.
// The only way around it is to merge the definitions into the declarations.
// I even doubt whether many programmers can get this right.
template<typename T>
typename CBinarySearchTree<T>::CBinaryNode* CBinarySearchTree<T>::FindMin(CBinaryNode* apNode) const
{
    if( NULL == apNode)
    {
        return NULL;
    }
    else if( NULL == apNode->mpLeft)
    {
        // Found it
        return apNode;
    }
    else
    {
        return FindMin(apNode->mpLeft);
    }
}

template<typename T>
const T* CBinarySearchTree<T>::FindMax() const
{
    CBinaryNode* lpNode = FindMax(mpRoot);
    return (lpNode != NULL) ? &(lpNode->mElement) : NULL;
}

template<typename T>
typename CBinarySearchTree<T>::CBinaryNode* CBinarySearchTree<T>::FindMax(CBinaryNode* apNode) const
{
    if( NULL == apNode)
    {
        return NULL;
    }
    else if( NULL == apNode->mpRight)
    {
        // Found it
        return apNode;
    }
    else
    {
        return FindMax(apNode->mpRight);
    }
}

template<typename T>
void CBinarySearchTree<T>::MakeEmpty()
{
    MakeEmpty(&mpRoot);
}


template<typename T>
void CBinarySearchTree<T>::MakeEmpty(CBinaryNode** appNode)
{
    CBinaryNode* lpNode = *appNode;
    if( lpNode != NULL)
    {
        MakeEmpty( &(lpNode->mpLeft) );
        MakeEmpty( &(lpNode->mpRight) );
        delete lpNode;
    }

    *appNode = NULL;
}

// How long the syntax is...
template<typename T>
const CBinarySearchTree<T>& CBinarySearchTree<T>::operator=(const CBinarySearchTree& aOrig)
{
    if(&aOrig == this)
    {
        return *this;
    }

    MakeEmpty();
    mpRoot = Clone(aOrig.mpRoot);

    return *this;
}

// When you use a nested class and a template together, you find out how long C++ syntax can get.
// Every time I write it I ask again why there could not be a shorter form.
template<typename T>
typename CBinarySearchTree<T>::CBinaryNode* CBinarySearchTree<T>::Clone(CBinaryNode *apNode) const
{
    if(NULL == apNode)
    {
        return NULL;
    }

    // Abusing recursion
    return new CBinaryNode(apNode->mElement, Clone(apNode->mpLeft), Clone(apNode->mpRight));
}

#endif // __BINARY_SEARCH_TREE_H__

Test code:

#include <cstdlib>
#include <iostream>
#include "BinarySearchTree.h"
using namespace std;

int main()
{
    CBinarySearchTree<int> loTree;

    loTree.Insert(10);
    loTree.Insert(20);
    loTree.Insert(30);
    loTree.Insert(40);
    cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20)  " <<loTree.Contains(20) <<endl;
    loTree.Remove(40);
    cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20)  " <<loTree.Contains(20) <<endl;
    loTree.Remove(30);
    loTree.Remove(20);
    loTree.Remove(10);


    loTree.Insert(40);
    cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20)  " <<loTree.Contains(20) <<endl;
    loTree.Insert(30);
    loTree.Insert(20);
    loTree.Insert(10);
    cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20)  " <<loTree.Contains(20) <<endl;
    loTree.Remove(40);
    loTree.Remove(30);
    loTree.Remove(20);
    loTree.Remove(10);

    loTree.Insert(30);
    loTree.Insert(40);
    cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20)  " <<loTree.Contains(20) <<endl;
    loTree.Insert(10);
    loTree.Insert(20);
    cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20)  " <<loTree.Contains(20) <<endl;
    CBinarySearchTree<int> loTree2 = loTree;
    cout <<"Min: " <<*loTree2.FindMin() <<" Max: " <<*loTree2.FindMax() <<" IsContains(20)  " <<loTree2.Contains(20) <<endl;

    loTree.MakeEmpty();

    system("pause");   // Windows only
    return 0;
}


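June 14, 2009

"Data Structures and Algorithm Analysis in C++": a C++ implementation of a separate chaining hash table

Written by 九天雁翎(JTianLing) -- www.jtianling.com

"Data Structures and Algorithm Analysis in C++" by Mark Allen Weiss
Posts & Telecom Press
Chinese edition, pages 138-142: separate chaining hash tables (Hou Jie translates the term as 开链, "open chaining")

This is probably the easiest hash table scheme to implement; the next easiest is probably linear probing.

It reminds me of the second thing I did at my current company: implementing a file system whose core module was a pack-file format similar to MPQ. The core of that format, in turn, was a linear-probing hash table, except that it lived in a file rather than in memory. As an aside, the MPQ implementation is quite interesting; if you are curious, have a look at

http://shadowflare.samods.org/inside_mopaq/

Inside MoPaQ is the most detailed and most useful material I have seen. The notes I took when I first started that work are very messy, but I hope they are still of some use to anyone doing the same job.

http://www.jtianling.com/archive/2008/06/02/2504503.aspx

http://www.jtianling.com/archive/2008/06/02/2504515.aspx

Also, since I have already implemented a linear-probing hash table once, I am not going to implement one again this time.

Finally, Blizzard's hash algorithm really is quite good. Its requirements differ from those of ordinary hash schemes: the usual requirement is that the table size be prime, whereas it requires a power of two. In one test I found that 20,000 files produced roughly 2,000 collisions, i.e. 1/10, far below the book's expected figure of under 1.5.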
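To make the prime versus power-of-two distinction concrete, here is a small sketch of the two ways a hash value can be reduced to a bucket index. It is not Blizzard's hash function and not code from MPQ; the function names `IsPowerOfTwo`, `IndexByMask` and `IndexByModulo` are made up for illustration.

```cpp
#include <cstddef>

// True when aSize is a power of two (exactly one bit set).
inline bool IsPowerOfTwo(std::size_t aSize)
{
    return aSize != 0 && (aSize & (aSize - 1)) == 0;
}

// Bucket index for a table whose size is a power of two:
// a bit mask keeps only the low bits of the hash value.
// The result is only meaningful when IsPowerOfTwo(aTableSize) holds.
inline std::size_t IndexByMask(unsigned long aHash, std::size_t aTableSize)
{
    return static_cast<std::size_t>(aHash) & (aTableSize - 1);
}

// Bucket index for a table of arbitrary (typically prime) size.
inline std::size_t IndexByModulo(unsigned long aHash, std::size_t aTableSize)
{
    return static_cast<std::size_t>(aHash % aTableSize);
}
```

For a power-of-two table both functions give the same index, but the mask version discards every high bit of the hash value, so it relies on the hash function itself mixing the bits well. A prime table size is the usual defence when the hash function is weaker.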

This time I implemented it in VS and copied the code over directly, without using vim.

Separate chaining hash table implementation:

#ifndef __SL_HASH_TABLE_H__
#define __SL_HASH_TABLE_H__

#include <algorithm>
#include <list>
#include <string>
#include <vector>

using namespace std;

// Two hash functions. The first comes from the book's example; how well it spreads keys is unknown.
// They are named Hash rather than hash so that they cannot clash with std::hash,
// and they work on unsigned values so that overflow is well defined.
inline unsigned int Hash( const string& key)
{
    unsigned int liHashVal = 0;
    for( size_t i = 0; i < key.length(); ++i)
    {
        liHashVal = 37 * liHashVal + static_cast<unsigned char>(key[i]);
    }
    return liHashVal;
}

// The book does not provide this hash function, which was frustrating, so I just wrote something.
inline unsigned int Hash( int key)
{
    return static_cast<unsigned int>(key);
}

// Referenced <>
static const int gPrimesCount = 10;
static unsigned long gPrimesArray[gPrimesCount] =
{
    53, 97, 193, 389, 769,
    1543, 3079, 6151, 12289, 24593
};

// Smallest prime in the table that is >= n, or the largest prime when n is beyond the table
inline unsigned long NextPrime(unsigned long n)
{
    const unsigned long* first = gPrimesArray;
    const unsigned long* last = gPrimesArray + gPrimesCount;
    const unsigned long* pos = lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}

template <typename HashedObj>
class CSLHashTable
{
public:
    // The book has no implementation of this and no hint either; I only noticed on my first compile.
    explicit CSLHashTable(size_t aiSize = 101) : miCurrentSize(0)
    {
        moLists.resize(aiSize);
    }

    bool Contains( const HashedObj& x ) const
    {
        const list<HashedObj> & liListFinded = moLists[ MyHash(x)];
        return find( liListFinded.begin(), liListFinded.end(), x) != liListFinded.end();
    }

    void MakeEmpty()
    {
        for( size_t i=0; i<moLists.size(); ++i)
        {
            moLists[i].clear();
        }
        miCurrentSize = 0;
    }

    bool Insert( const HashedObj& x)
    {
        list<HashedObj> & liListFinded = moLists[ MyHash(x)];
        if( find( liListFinded.begin(), liListFinded.end(), x) != liListFinded.end() )
        {
            return false;
        }

        liListFinded.push_back(x);

        if(++miCurrentSize > moLists.size())
        {
            ReHash();
        }
        return true;
    }

    bool Remove( const HashedObj& x)
    {
        list<HashedObj>& liListFinded = moLists[ MyHash(x)];
        typename list<HashedObj>::iterator lit = find(liListFinded.begin(), liListFinded.end(), x);
        if(lit == liListFinded.end())
        {
            return false;
        }

        liListFinded.erase(lit);
        --miCurrentSize;
        return true;
    }

private:
    vector<list<HashedObj> > moLists;
    size_t miCurrentSize;       // number of stored elements

    void ReHash()
    {
        // Once again the book gives neither an implementation of this key function nor any hint.
        // NextPrime naturally means moving up to the next prime.
        size_t liNewSize = NextPrime( 2 * moLists.size());
        if( liNewSize <= moLists.size())
        {
            return;     // The prime table is exhausted; keep the current size
        }

        vector<list<HashedObj> > loOldLists = moLists;
        moLists.clear();
        moLists.resize(liNewSize);

        // Re-insert directly into the new buckets; the element count does not change
        for( size_t i=0; i<loOldLists.size(); ++i)
        {
            typename list<HashedObj>::iterator lit = loOldLists[i].begin();
            while(lit != loOldLists[i].end())
            {
                moLists[ MyHash(*lit)].push_back(*lit);
                ++lit;
            }
        }
    }

    size_t MyHash( const HashedObj& x) const
    {
        return Hash(x) % moLists.size();
    }
};

#endif // __SL_HASH_TABLE_H__

Test code:

#include "SLHashTable.h"
#include <iostream>
#include <string>
#include <typeinfo>

using namespace std;

// To curb my recent addiction to macros, I forced myself to use a template here.
// There is still one problem: the table's variable name cannot be printed, and
// passing the name in as a string on every call was never an option.
// Also, whenever I see the full name of a standard library type I sigh at how long it is.
// Once, with a complicated map involving string, I could not read the debugger's hover
// tooltip at all: the type name was longer than the screen, so the actual values were
// pushed off the edge and I could only see them by adding a watch -_-!!
template <typename HashedObj, typename Table >
void Test(HashedObj x, Table& table)
{
    if(table.Contains(x))
    {
        cout <<typeid(table).name() <<" contains " <<x <<endl;
    }
    else
    {
        cout <<typeid(table).name() <<" doesn't contain " <<x <<endl;
    }
}

int main()
{
    // test int
    CSLHashTable<int> loIntTable;
    loIntTable.Insert(10);
    loIntTable.Insert(20);
    loIntTable.Insert(30);
    loIntTable.Insert(40);
    loIntTable.Insert(50);
    Test(20, loIntTable);
    Test(30, loIntTable);
    Test(40, loIntTable);
    Test(60, loIntTable);
    Test(70, loIntTable);

    // test string
    CSLHashTable<string> loStrTable;
    loStrTable.Insert(string("10"));
    loStrTable.Insert(string("20"));
    loStrTable.Insert(string("30"));
    loStrTable.Insert(string("40"));
    loStrTable.Insert(string("50"));
    Test(string("20"), loStrTable);
    Test(string("30"), loStrTable);
    Test(string("40"), loStrTable);
    Test(string("60"), loStrTable);
    Test(string("70"), loStrTable);

    return 0;
}


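June 9, 2009

ASIO, a network library the next C++ standard may adopt (3): UDP network applications

Written by 九天雁翎(JTianLing) -- www.jtianling.com

Discussion newsgroup and files

I. Overview

This continues from

"ASIO, a network library the next C++ standard may adopt (1): simple applications"

"ASIO, a network library the next C++ standard may adopt (2): TCP network applications"

Having covered the simple applications and TCP, UDP naturally comes next. Personally I feel that, with suitable encapsulation, the UDP and TCP interfaces could be made fairly consistent, but unfortunately asio did not follow such a design.

II. Tutorial

1. Daytime.4 - A synchronous UDP daytime client

As before, let us first look at the plain socket API version: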

#include <stdio.h>
#include <string.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000

void str_cli(SOCKET sockfd, const struct sockaddr* pservaddr, int servlen)
{
    int n;
    char recvline[MAXLINE] = {0};
    char sendline[2] = {0};

    sendto(sockfd, sendline, 2, 0, pservaddr, servlen);

    // Leave room for the terminating zero and check for failure before using n
    n = recvfrom(sockfd, recvline, MAXLINE - 1, 0, NULL, NULL);
    if (n == SOCKET_ERROR)
    {
        printf("recvfrom failed: %d\n", WSAGetLastError());
        return;
    }

    recvline[n] = 0;
    printf("%s", recvline);
}

int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );
    // This initialization is mandatory on Windows; it actually initializes the Winsock DLL
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
        return -1;
    }

    SOCKET sockfd;
    struct sockaddr_in servaddr;

    if (argc != 2)
    {
        printf("usage: udpcli <IPaddress>");
        WSACleanup();
        exit(1);
    }

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(13);
    servaddr.sin_addr.s_addr = inet_addr(argv[1]);

    str_cli(sockfd, (const struct sockaddr*)&servaddr, sizeof(servaddr));

    system("pause");
    closesocket(sockfd);
    WSACleanup();
    exit(0);
}

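Relatively speaking, with the socket API a UDP program has simpler logic than a TCP one, since at least the connect step is gone. The hard part is handling UDP when network conditions are poor. A program this simple needs no further explanation.

Now the asio example: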

#include <iostream>
#include <boost/array.hpp>
#include <boost/asio.hpp>

using boost::asio::ip::udp;

int main(int argc, char* argv[])
{
    try
    {
        if (argc != 2)
        {
            std::cerr << "Usage: client <host>" << std::endl;
            return 1;
        }

        boost::asio::io_service io_service;

        udp::resolver resolver(io_service);
        udp::resolver::query query(udp::v4(), argv[1], "daytime");
        udp::endpoint receiver_endpoint = *resolver.resolve(query);

        udp::socket socket(io_service);
        socket.open(udp::v4());

        boost::array<char, 1> send_buf = { 0 };
        socket.send_to(boost::asio::buffer(send_buf), receiver_endpoint);

        boost::array<char, 128> recv_buf;
        udp::endpoint sender_endpoint;
        size_t len = socket.receive_from(
            boost::asio::buffer(recv_buf), sender_endpoint);

        std::cout.write(recv_buf.data(), len);
    }
    catch (std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }

    return 0;
}

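I do not even feel any simplification. A pile of resolver code (to accommodate IPv6) and a pile of arrays is not really elegant. In a very simple program, asio looks like nothing more than a very thin wrapper over sockets, and you still have to learn a lot of things you otherwise would not need to know, while asio's inherent efficiency, its asynchrony and the patterns it uses are nowhere to be seen. Perhaps the socket API is simply "make the simple things simple", while asio adopts a relatively complex design in order to cope with truly complex situations. An example like this does nothing to persuade anyone to give up the socket API for asio. I do not know why the asio documentation includes it; just for illustration? -_-!

2. A synchronous UDP daytime server

Again, the plain socket API version first: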

#include <stdio.h>
#include <string.h>
#include <time.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000

void str_svr(SOCKET sockfd, struct sockaddr* pcliaddr, int clilen)
{
    int n = 0;
    time_t ticks = 0;
    int len;
    char recvline[2] = {0};
    char sendline[MAXLINE] = {0};

    for(;;)
    {
        len = clilen;
        if( (n = recvfrom(sockfd, recvline, 2, 0, pcliaddr, &len)) == SOCKET_ERROR)
        {
            printf("recvfrom failed: %d\n", WSAGetLastError());
            return;
        }

        ticks = time(NULL);
        _snprintf(sendline, sizeof(sendline), "%.24s\r\n", ctime(&ticks));
        sendto(sockfd, sendline, (int)strlen(sendline), 0, pcliaddr, len);
    }
}

int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );
    // This initialization is mandatory on Windows; it actually initializes the Winsock DLL
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
        return -1;
    }

    SOCKET sockfd;
    struct sockaddr_in servaddr,cliaddr;

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(13); /* daytime server */

    if( bind(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr))
        == SOCKET_ERROR)
    {
        printf("bind failed: %d\n", WSAGetLastError());
        closesocket(sockfd);
        WSACleanup();
        return 1;
    }

    // Only returns when recvfrom fails
    str_svr(sockfd, (struct sockaddr*)&cliaddr, sizeof(cliaddr));

    closesocket(sockfd);
    WSACleanup();
    return 1;
}

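This is much like the TCP server example in the previous article. Generally speaking, writing a client with sockets is relatively simple, while writing a server is considerably more complex. This example does not even check every return value carefully (that of the send call, for instance), yet it is already fairly complex.

Then the asio version: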

#include <ctime>
#include <iostream>
#include <string>
#include <boost/array.hpp>
#include <boost/asio.hpp>

using boost::asio::ip::udp;

std::string make_daytime_string()
{
    using namespace std; // For time_t, time and ctime;
    time_t now = time(0);
    return ctime(&now);
}

int main()
{
    try
    {
        boost::asio::io_service io_service;

        udp::socket socket(io_service, udp::endpoint(udp::v4(), 13));

        for (;;)
        {
            boost::array<char, 1> recv_buf;
            udp::endpoint remote_endpoint;
            boost::system::error_code error;
            socket.receive_from(boost::asio::buffer(recv_buf),
                remote_endpoint, 0, error);

            if (error && error != boost::asio::error::message_size)
                throw boost::system::system_error(error);

            std::string message = make_daytime_string();

            boost::system::error_code ignored_error;
            socket.send_to(boost::asio::buffer(message),
                remote_endpoint, 0, ignored_error);
        }
    }
    catch (std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }

    return 0;
}

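I even feel that writing a server in asio is simpler than writing a client -_-! Perhaps asio was designed from the start for writing high-performance servers, which is why clients are relatively cumbersome while servers are much simpler. Still, knowing the socket API is useful when using asio: receive_from and send_to here are just the corresponding socket API functions in new clothing, and apart from how the arguments are passed the end result is the same.

3. An asynchronous UDP daytime server

This is again where the programs get interesting. The very name asio stands for asynchronous I/O, so it is the asynchronous programs that show asio's strength and its ability to simplify low-level operations. That was true for TCP and it is no exception here.

Because UDP applications are inherently simpler than TCP ones, this sample is much simpler than the corresponding TCP version. What I do not know is whether asio's UDP support implements any richer UDP features, such as timeout and retransmission.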


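June 7, 2009

ASIO, a network library the next C++ standard may adopt (2): TCP network applications

Written by 九天雁翎(JTianLing) -- www.jtianling.com

Discussion newsgroup and files

I. Overview

This article is only a simple commentary attached to the boost::asio documentation. Without that documentation you may not even know what I am talking about; it can of course be downloaded from www.boost.org.

Basically, the "Hello World" programs of network programming are server applications such as echo and daytime. Stevens's classic "Unix Network Programming" spends half the book on these two servers, and Comer's "Internetworking With TCP/IP vol III" is no exception. Neither is the boost::asio documentation: all of its networking examples are built around the daytime service. There is a good reason for this: for explaining the principles of network programming, echo and daytime are simple enough :)

II. Tutorial

Because a client program is simpler than a server program, one usually starts with the client, and boost::asio does the same. Its first section gives a sample implementation of a TCP daytime client. Here I will not copy its source; I will just list the same program implemented with the socket interface on Windows as a comparison.

1. A synchronous TCP daytime client

The plain socket implementation: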

#include <stdio.h>
#include <string.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000

void str_cli(SOCKET sockfd)
{
    int n;
    char recvline[MAXLINE] = {0};

    // recv returns 0 when the peer closes the connection and SOCKET_ERROR on failure
    while ( (n = recv(sockfd, recvline, MAXLINE - 1, 0)) > 0)
    {
        recvline[n] = 0;
        printf("%s", recvline);
    }

    closesocket(sockfd);
}

int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );
    // This initialization is mandatory on Windows; it actually initializes the Winsock DLL
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
        return -1;
    }

    SOCKET sockfd;
    struct sockaddr_in servaddr;

    if (argc != 2)
    {
        printf("usage: tcpcli <IPaddress>");
        WSACleanup();
        exit(1);
    }

    sockfd = socket(AF_INET, SOCK_STREAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(13);
    servaddr.sin_addr.s_addr = inet_addr(argv[1]);

    if( SOCKET_ERROR == connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)))
    {
        printf("connect failed, Error Code: %d", WSAGetLastError());
        closesocket(sockfd);
        WSACleanup();
        return -1;
    }

    str_cli(sockfd); /* do it all */

    system("pause");
    WSACleanup();
    exit(0);
}

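My original source for this is 61 lines and has to deal with tedious details such as creating and initializing the socket; decisions are basically made through type codes. It is actually not that hard, though. Apart from the socket API itself there is little beyond the C language to learn, and because BSD sockets are so well known that they are practically a de facto standard, a program like this can be understood by most people who have studied some network programming.

For ease of comparison, I will paste the code from the boost::asio example after all: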

#include <iostream>
#include <boost/array.hpp>
#include <boost/asio.hpp>

using boost::asio::ip::tcp;

int main(int argc, char* argv[])
{
    try
    {
        if (argc != 2)
        {
            std::cerr << "Usage: client <host>" << std::endl;
            return 1;
        }

        boost::asio::io_service io_service;

        tcp::resolver resolver(io_service);
        tcp::resolver::query query(argv[1], "daytime");
        tcp::resolver::iterator endpoint_iterator = resolver.resolve(query);
        tcp::resolver::iterator end;

        tcp::socket socket(io_service);
        boost::system::error_code error = boost::asio::error::host_not_found;
        while (error && endpoint_iterator != end)
        {
            socket.close();
            socket.connect(*endpoint_iterator++, error);
        }
        if (error)
            throw boost::system::system_error(error);

        for (;;)
        {
            boost::array<char, 128> buf;
            boost::system::error_code error;

            size_t len = socket.read_some(boost::asio::buffer(buf), error);

            if (error == boost::asio::error::eof)
                break; // Connection closed cleanly by peer.
            else if (error)
                throw boost::system::system_error(error); // Some other error.

            std::cout.write(buf.data(), len);
        }
    }
    catch (std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }

    return 0;
}

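The implementation in the boost::asio documentation is also 47 lines and uses try/catch to handle exceptions. Because of the way it is implemented it brings in quite a bit of extra complexity: besides boost::asio itself, even if you know C++ well you also have to learn about boost::array, boost::system and so on (simple as they are), and in use it does not feel any simpler than the plain socket API. Even so, the boost::asio example has its advantages, such as working with both IPv4 and IPv6 (the socket API version above supports only IPv4) and friendlier error reporting (made possible by C++ exceptions, whereas in C you often get nothing but an error code).

Of course this example is too simple. asio is designed for larger programs; for a program this small, plain sockets are enough. That needs to be said.

2. Daytime.2 - A synchronous TCP daytime server

What use is a client without a server? ^^ So boost::asio next gives a daytime server implementation. Here, again, is a plain socket version first: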

#include <stdio.h>
#include <string.h>
#include <time.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000

int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );
    // This initialization is mandatory on Windows; it actually initializes the Winsock DLL
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
        return -1;
    }

    SOCKET listenfd, connfd;
    struct sockaddr_in servaddr;
    char buff[MAXLINE];
    time_t ticks;

    listenfd = socket(AF_INET, SOCK_STREAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(13); /* daytime server */

    if( bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr))
        == SOCKET_ERROR)
    {
        printf("bind failed: %d\n", WSAGetLastError());
        closesocket(listenfd);
        WSACleanup();
        return 1;
    }

    if (listen( listenfd, SOMAXCONN ) == SOCKET_ERROR)
    {
        printf("Error listening on socket.\n");
        closesocket(listenfd);
        WSACleanup();
        return 1;
    }

    for ( ; ; )
    {
        connfd = accept(listenfd, (struct sockaddr*) NULL, NULL);
        if (connfd == INVALID_SOCKET)
        {
            printf("accept failed: %d\n", WSAGetLastError());
            closesocket(listenfd);
            WSACleanup();
            return 1;
        }

        ticks = time(NULL);
        _snprintf(buff, sizeof(buff), "%.24s\r\n", ctime(&ticks));
        if( SOCKET_ERROR == send(connfd, buff, (int)strlen(buff), 0))
        {
            printf("send failed: %d\n", WSAGetLastError());
            closesocket(connfd);
            closesocket(listenfd);
            WSACleanup();
            return 1;
        }

        closesocket(connfd);
    }

    WSACleanup();
    return 0;
}

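The whole program was 75 lines in my source, most of them spent initializing the socket and moving it through its states until it is actually listening and accepting connections. Every socket API call needs its return value checked, which is a common trait of C programs.

Now the boost::asio implementation: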

#include <ctime>
#include <iostream>
#include <string>
#include <boost/asio.hpp>

using boost::asio::ip::tcp;

std::string make_daytime_string()
{
    using namespace std; // For time_t, time and ctime;
    time_t now = time(0);
    return ctime(&now);
}

int main()
{
    try
    {
        boost::asio::io_service io_service;

        tcp::acceptor acceptor(io_service, tcp::endpoint(tcp::v4(), 13));

        for (;;)
        {
            tcp::socket socket(io_service);
            acceptor.accept(socket);

            std::string message = make_daytime_string();

            boost::system::error_code ignored_error;
            boost::asio::write(socket, boost::asio::buffer(message),
                boost::asio::transfer_all(), ignored_error);
        }
    }
    catch (std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }

    return 0;
}

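The whole program is 35 lines, about half the length of the plain socket version, and it stays portable (my example only runs on Windows).

Judging from its documentation and implementation, asio turns many functions into classes, and using it is not really any simpler. I reach this conclusion, though, while being far more familiar with the socket API than with boost::asio. For a pure beginner, learning asio may well be easier than learning the socket API, since a great many details, such as the error returns in various situations, the proper arguments for each interface, and even socket initialization and state transitions, are greatly simplified in boost::asio. The example here is the accept function, which boost::asio implements as the acceptor class.

It is also worth noting that although BSD sockets are a de facto standard, the same program cannot run on both Linux and Windows without some changes, because there are always subtle differences. I remember that when I had just started working I took "Unix Network Programming" and tried to learn from it on Windows, and could not get even a small program to work. It turned out that I knew nothing about the Windows-specific WSAStartup initialization -_-! boost::asio removes such differences completely, which should count as one of its advantages.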
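As a rough sketch of the kind of change a hand-written socket program needs in order to build on both systems, the Windows-only startup step can be hidden behind a pair of small helpers. The names `NetworkInit`, `NetworkCleanup` and `CNetworkSession` are hypothetical; this only covers initialization and says nothing about the other differences (closesocket versus close, error codes and so on), nor about how asio does it internally.

```cpp
#ifdef _WIN32
#include <winsock2.h>   // link with ws2_32.lib
#endif

// Hypothetical helper: prepares the socket layer.
// Returns true when sockets are ready to use.
inline bool NetworkInit()
{
#ifdef _WIN32
    WSADATA lData;
    return WSAStartup(MAKEWORD(2, 2), &lData) == 0;
#else
    return true;    // BSD sockets need no process-wide initialization
#endif
}

// Hypothetical helper: releases whatever NetworkInit acquired.
inline void NetworkCleanup()
{
#ifdef _WIN32
    WSACleanup();
#endif
}

// Small RAII wrapper so that cleanup also happens on early returns.
class CNetworkSession
{
public:
    CNetworkSession() : mbReady(NetworkInit()) {}
    ~CNetworkSession() { if (mbReady) NetworkCleanup(); }
    bool IsReady() const { return mbReady; }

private:
    bool mbReady;

    // Not copyable: cleanup must run exactly once
    CNetworkSession(const CNetworkSession&);
    CNetworkSession& operator=(const CNetworkSession&);
};
```

A program would create one `CNetworkSession` at the top of `main`, check `IsReady()`, and then use the socket calls as usual.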

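3. An asynchronous TCP daytime server

As with the simple asio applications earlier, the third example is where the programs start to get interesting and more complex. Asynchronous I/O generally makes better use of a thread than synchronous I/O, its non-blocking nature makes it more widely applicable, and it is what most high-performance servers actually use, for example I/O completion ports on Windows and epoll on Linux. asio is implemented on top of these mechanisms and wraps them to make them easier to use. An asynchronous example written against the raw APIs is not so simple -_-! so, being lazy, I will not provide one for now. Asynchronous behavior can also be emulated with select, and that is what asio does when the operating system has no good asynchronous API. As a select example, see my earlier study notes, "服务器Select模型的实现" (Implementing the select server model).

The example uses the tcp_server class to handle accept events, the tcp_connection class to handle the write after a connection is made, and shared_ptr to hold the tcp_connection objects.

Summary:

boost::asio does to some extent simplify writing network client/server programs and makes it easy to write reasonably efficient network applications (how efficient, I have not measured). But how bold can a C++ programmer, heir to the C programmer's mindset, be in using functions such as asio::async_write and asio::async_accept without understanding how they are implemented at all? That is a question. And if you set out to really understand the implementation, you sink into the swamp of Boost's intricate C++ techniques, because boost::asio is so tightly coupled with other Boost libraries, especially boost::bind. The current implementation of boost::bind is not exactly elegant, and variadic templates in the next C++ standard will simplify it a great deal. So: to use boost::asio or not, that is the question. Perhaps the moment people can really commit to boost::asio in a project will be when it becomes std::asio in the next C++ standard ^^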


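June 7, 2009

ASIO, a network library the next C++ standard may adopt (1): simple applications

Written by 九天雁翎(JTianLing) -- www.jtianling.com

Discussion newsgroup and files

I. Overview

The first time I read the boost::asio documentation I found it very detailed. But, as is customary for Boost libraries, asio is tightly coupled with other Boost libraries, and apart from a few very basic ones such as the smart pointers and the regular expression library boost::regex, I did not know much about Boost. To learn boost::asio properly I did a lot of preparatory work, and today I can finally begin studying it in earnest. Starting today, with usage, I am treating asio as a continuation of my recent study of network programming: unlike other Boost libraries, where knowing how to use them is enough, here I want to understand the principles and the essence in depth. (I already know that boost::asio uses I/O completion ports on Windows and epoll on Linux.)

There is a reason the asio documentation is so thorough (Overview, Tutorial, Examples and Reference come to 870 pages in total). As a well-encapsulated network (interface?) library it wraps the ordinary network interfaces very thoroughly, but because network programs are inherently complex, asio is still fairly complex to use. In B.S.'s words, absolutely complex problems need relatively complex tools.

II. Tutorial

First, let us get to know the ASIO library from the user's side. Since the documentation is so thorough and the Tutorial explains things in such detail, I will not do the Ctrl-C, Ctrl-V work; I will just explain each example briefly for those whose English is not so good, and you can follow along in the boost::asio documentation.

1. Timer.1 - Using a timer synchronously

Even asio starts with "Hello world", which shows how far K&R's influence reaches. This example shows the basics of using asio: including the boost/asio.hpp header, the fact that using asio requires a boost::asio::io_service object, and the use of asio::deadline_timer (which in this example is not much different from sleep).

2. Timer.2 - Using a timer asynchronously

Asynchronous timing means you write a function and wait for it to be called. The extra complexity over the synchronous version lies in writing your function or callable object (called the handler); everything else is basically the same, except that the asio::deadline_timer member function you call changes from wait to async_wait.

3. Timer.3 - Binding arguments to a handler

This is where the programs start to get interesting and more complex: an asynchronous call is issued from inside an asynchronous call, forming a nested-looking structure, with expires_at used to position the timeout and boost::bind used to bind arguments. bind is one of the more interesting Boost libraries and has all but been approved for the next C++ standard, so its documentation is worth reading :)

4. Timer.4 - Using a member function as a handler

This is basically example 3 rewritten as a class. What it really shows is boost::bind's ability to bind class member functions, and it does not add much over example 3 (if you know boost::bind, that is; luckily I had worked on my Boost fundamentals). Using a class for no particular reason adds a lot of complexity to the program, though it does give some guidance on organizing this kind of program inside a class.

5. Timer.5 - Synchronising handlers in multithreaded programs

This adds multithreading and shows how the boost::asio::strand class is used to synchronize callback handlers in a multithreaded program.

I see a few key points here. First, asio guarantees that callback handlers are only ever invoked from threads that have called io_service::run(). Second, if several threads need to be able to invoke callbacks at the same time (strictly speaking "callback function" is not accurate, since anything callable will do; "callback object" may be better), you can call io_service::run() from several threads, which gives an effect similar to a thread pool.

The synchronization shown here wraps multiple callback objects in one strand. Other thread synchronization mechanisms would obviously work too.

It does not show much that is new in asio (if you know boost::thread; for boost::thread see "boost::thread: a library with strange documentation and no Tutorial, yet still quite powerful"). Studying Boost was not wasted effort :) Once you know some of the related libraries, the asio examples at least are not hard. When I first read them I was completely lost.

The example as written in the documentation may still leave beginners a little unclear.

Change the two output statements in the original to the following (this is a Windows example):

std::cout <<"ThreadID:" <<GetCurrentThreadId() <<" Timer 1: " << count_ << "\n";

std::cout <<"ThreadID:" <<GetCurrentThreadId() <<" Timer 2: " << count_ << "\n";

With this change the multiple threads become clearly visible.

boost::thread t(boost::bind(&boost::asio::io_service::run, &io));

This line uses boost::thread to create a new thread. The thread's entry point is built with boost::bind's member function binding: &io is passed as the member function's first argument, this, so the new thread calls io_service::run(). The main thread then calls io.run() as well, so two threads are running the io_service at the same time.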
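The "pointer as the first argument becomes this" idea can be tried in isolation. The sketch below uses std::bind from the C++11 standard library, which was modeled on boost::bind, rather than Boost itself, and the `Worker` class and `MakeRunner` function are made-up names for illustration.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for an object whose member function we want to call later.
class Worker
{
public:
    Worker() : mRunCount(0) {}

    void Run() { ++mRunCount; }
    std::string Greet(const std::string& aName) const { return "hello " + aName; }
    int RunCount() const { return mRunCount; }

private:
    int mRunCount;
};

// Binds Worker::Run to one specific object. The pointer plays the role of
// `this`, so the result can be called with no arguments at all.
inline std::function<void()> MakeRunner(Worker* apWorker)
{
    return std::bind(&Worker::Run, apWorker);
}
```

Calling the object returned by `MakeRunner(&w)` is the same as calling `w.Run()`. Remaining parameters can be left open with placeholders, for example `std::bind(&Worker::Greet, &w, std::placeholders::_1)`, which yields a callable that takes just the name. Because only a pointer is stored, the bound object must outlive the callable.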

6. MyExample1: Synchronising handlers in multithreaded programs in the normal way

This shows how to synchronize the threads with conventional thread synchronization instead of boost::asio::strand.

#include <iostream>
#include <windows.h>    // GetCurrentThreadId
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/bind.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>

class printer
{
public:
    printer(boost::asio::io_service& io):
        timer1_(io, boost::posix_time::seconds(1)),
        timer2_(io, boost::posix_time::seconds(1)),
        count_(0)
    {
        timer1_.async_wait(boost::bind(&printer::print1, this));
        timer2_.async_wait(boost::bind(&printer::print2, this));
    }

    ~printer()
    {
        std::cout << "Final count is " << count_ << "\n";
    }

    void print1()
    {
        // The mutex protects count_ and keeps the two handlers from interleaving
        mutex_.lock();
        if (count_ < 10)
        {
            std::cout <<"ThreadID:" <<GetCurrentThreadId() <<" Timer 1: " << count_ << "\n";
            ++count_;

            timer1_.expires_at(timer1_.expires_at() + boost::posix_time::seconds(1));
            timer1_.async_wait(boost::bind(&printer::print1, this));
        }
        mutex_.unlock();
    }

    void print2()
    {
        mutex_.lock();
        if (count_ < 10)
        {
            std::cout <<"ThreadID:" <<GetCurrentThreadId() <<" Timer 2: " << count_ << "\n";
            ++count_;

            timer2_.expires_at(timer2_.expires_at() + boost::posix_time::seconds(1));
            timer2_.async_wait(boost::bind(&printer::print2, this));
        }
        mutex_.unlock();
    }

private:
    boost::asio::deadline_timer timer1_;
    boost::asio::deadline_timer timer2_;
    int count_;
    boost::mutex mutex_;
};

int main()
{
    boost::asio::io_service io;
    printer p(io);
    // Run the io_service on a second thread as well as on the main thread
    boost::thread t(boost::bind(&boost::asio::io_service::run, &io));
    io.run();
    t.join();

    return 0;
}

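The effect is about the same as example 5 in the boost::asio documentation. It does make sense that boost::asio adds asio::strand on top of supporting native thread synchronization, because the two approaches differ:

1. With a mutex, the thread blocks after it has already entered the print function; with a strand, it blocks before the function is called.

2. When a large number of similar callbacks need to be synchronized, asio::strand is relatively simpler to use.

3. A mutex is clearly more flexible, since a thread can be made to block not only at the start of a function but also in the middle or at the end.

4. As for what is being synchronized: asio::strand synchronizes the callback objects it supports, while a mutex synchronizes the threads themselves.

Basically the two complement each other and each has its uses. In practice, though, from the point of view of generality and of the extra learning required, I feel strand is almost dispensable; I do not know whether there are cases where strand is a must.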
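One practical footnote on the mutex approach: calling lock() and unlock() by hand means every exit path has to remember the unlock. A scoped lock avoids that. The sketch below uses std::mutex and std::lock_guard from the C++11 standard library as an analogue (Boost.Thread offers similar scoped lock types); the `GuardedCounter` class is hypothetical and only loosely modeled on the printer class above.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical counter protected by a mutex.
class GuardedCounter
{
public:
    GuardedCounter() : mCount(0) {}

    // Increments while the count is below aLimit and throws once the limit
    // is reached. std::lock_guard releases the mutex on every exit path,
    // including the exception.
    void Increment(int aLimit)
    {
        std::lock_guard<std::mutex> lLock(mMutex);
        if (mCount >= aLimit)
        {
            throw std::runtime_error("limit reached");
        }
        ++mCount;
    }

    // Reads the count under the same mutex.
    int Count()
    {
        std::lock_guard<std::mutex> lLock(mMutex);
        return mCount;
    }

private:
    int mCount;
    std::mutex mMutex;
};
```

If `Increment` had used a manual lock() with the unlock() placed after the throw, the mutex would stay locked and the next call to `Count` would hang. With the guard, `Count` can still be called after the exception has been caught.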

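That concludes the timer series in the tutorial of the asio documentation. What has been shown here is mostly asio's basic principles, without touching anything network-related yet, but these are the foundation for further study.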

write by 九天雁翎(JTianLing) -- www.jtianling.com


June 6, 2009

boost::thread: a library whose odd documentation has no tutorial, yet is still quite powerful


write by 九天雁翎(JTianLing) -- www.jtianling.com


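I have always felt that Boost, for an open-source project, has very thorough documentation. Most libraries' documents go from shallow to deep: an Overview first, then an Introduction, a simple Tutorial, more complex examples, and finally the rationale; everything is there. boost::thread is the exception. It has no Introduction or Tutorial at all. It goes straight to listing classes/types, member functions and header files, and stops there, without a single example. It is surely the strangest documentation in Boost. Even the book "Beyond the C++ Standard Library: An Introduction to Boost" does not mention the thread library at all. This does make boost::thread harder to learn, and harder still for beginners, since multithreading is not easy to begin with.

But do not assume this is the most obscure library in Boost. Calls to add a threading library to C++ have long been loud (although B.S. argued in D&E that such platform-dependent functionality should be left to third-party libraries), and boost::thread has been proposed for standardisation several times, yet its documentation is still unfinished -_-! It is at least a candidate for the C++ standard; how can it be like this?

The latest proposal can be found through its documentation and has reached Revision 1: "Multi-threading Library for Standard C++ (Revision 1)".

Personally, I think a threading library can be very simple: a plain critical section for synchronisation is enough to handle the vast majority of cases, so by comparison boost::thread is a little large. Something like Python's thread library is quite good (according to the author of "Programming Python", its prototype comes from Java): you use threads by inheritance (the template method pattern), which feels fairly natural. My company has also implemented a similar C++ thread library internally, and it is reasonably convenient to use. boost::thread takes a different road. Because its documentation has no Introduction or Tutorial, I am purely feeling my way by experiment; if anything here is wrong, I rely on readers to point it out.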

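I. Introduction:

boost::thread does not use the inheritance-based (template method) thread model. Instead, you pass it a function as an argument (in fact not only a function: anything Callable and Copyable will do; Copyable because it has to be copied into the thread's local storage). I cannot say offhand whether this model is better or worse, but the choice has its reasons. To me at least it fits the standard library's long-standing preference for generics over inheritance, and it does make combining with boost::function, boost::bind and similar libraries much more convenient.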
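To make the "pass a callable" model concrete, here is a minimal sketch that starts one thread from each of three kinds of callable: a free function with arguments, a function object, and a bound member function. It assumes a C++11 compiler and uses std::thread and std::bind as stand-ins for boost::thread and boost::bind, which follow the same model. StoreSquare, StoreCube, Accumulator and RunThreeCallables are hypothetical names made up for this illustration.

```cpp
#include <functional>
#include <thread>

// A free function: writes its result through a pointer.
void StoreSquare(int value, int* out) { *out = value * value; }

// A function object: any copyable type with operator() can be a thread body.
struct StoreCube
{
    int value;
    int* out;
    void operator()() const { *out = value * value * value; }
};

// A class with an ordinary member function.
struct Accumulator
{
    int total;
    Accumulator() : total(0) {}
    void Add(int amount) { total += amount; }
};

// Runs one thread per kind of callable and returns the sum of the results.
int RunThreeCallables()
{
    int square = 0;
    int cube = 0;
    Accumulator acc;

    std::thread t1(StoreSquare, 3, &square);                 // free function plus arguments
    StoreCube cubeTask = {2, &cube};
    std::thread t2(cubeTask);                                // function object, copied into the thread
    std::thread t3(std::bind(&Accumulator::Add, &acc, 5));   // member function bound to an object

    t1.join();
    t2.join();
    t3.join();
    return square + cube + acc.total;
}
```

Each thread writes to its own variable, so no locking is needed here; the results are only read after all three joins.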

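1. An aside:

Some familiarity with multithreading on win32/linux helps in understanding boost::thread. Without that experience, you need at least a conceptual understanding of multithreaded programs, and in particular of three concepts: context switching, race conditions and atomic operations. It also helps to know the common forms of inter-thread synchronisation. If none of these are familiar, I suggest filling in that background first; otherwise even the HelloWorld will be hard to follow. Also, this article is only a record of my own learning of boost::thread, so it does not explain operating systems or threads in depth; please bear with me.

2. A boost::thread HelloWorld: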

example1:

#include <windows.h>

#include <boost/thread.hpp>

#include <iostream>

using namespace std;

using namespace boost;

void HelloWorld()

{

const char* pc = "Hello World!";

// Print one character at a time so that interleaving between threads is visible.

do

{

cout <<*pc;

}while(*pc++);

cout <<endl;

}

void NormalFunThread()

{

thread loThread1(HelloWorld);

thread loThread2(HelloWorld);

HelloWorld();

Sleep(100);

}

int main()

{

NormalFunThread();

return 0;

}

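I am not sure whether this program qualifies as a thread HelloWorld. But you will find that boost::thread really does create the threads for us simply through its constructor, so we cannot see even one complete "Hello World!" printed normally. Readers familiar with threads will understand how fragmented the output is going to be. On my machine, a typical run prints: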

HHeellloHl eoWl olWrool rdWl!od

l d

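I deliberately avoided printing the whole string in one call to get exactly this effect -_-! At this point we need synchronisation. join is one form of synchronisation that boost::thread provides; it is similar to waiting for a thread to finish with the Windows API WaitForSingleObject. The next example uses it.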

example2:

#include <boost/thread.hpp>

#include <iostream>

using namespace std;

using namespace boost;

void HelloWorld()

{

const char* pc = "Hello World!";

do

{

cout <<*pc;

}while(*pc++);

cout <<endl;

}

void NormalFunThread()

{

thread loThread1(HelloWorld);

loThread1.join();

thread loThread2(HelloWorld);

loThread2.join();

HelloWorld();

}

int main()

{

NormalFunThread();

return 0;

}

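Now we see three complete "Hello World!" lines. But this approach is rarely useful, because the program effectively still has only one thread doing work at a time: the next thread starts only after the previous one has finished. In practice other synchronisation mechanisms are used more often. Critical sections, for example, are very common, but I did not find anything under that name in boost::thread; there is mutex (mutual exclusion), which is used in much the same way. Here is an example that synchronises the threads with a mutex: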

example3:

#include <windows.h>

#include <boost/thread.hpp>

#include <boost/thread/mutex.hpp>

#include <iostream>

using namespace std;

using namespace boost;

mutex mu;

void HelloWorld()

{

mu.lock();

const char* pc = "Hello World!";

do

{

cout <<*pc;

}while(*pc++);

cout <<endl;

mu.unlock();

}

void NormalFunThread()

{

thread loThread1(HelloWorld);

thread loThread2(HelloWorld);

HelloWorld();

loThread1.join();

loThread2.join();

}

int main()

{

NormalFunThread();

return 0;

}

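We still see three intact "Hello World!" lines, and this time it is meaningful in real use: if the first thread has not finished when the main thread enters HelloWorld, three threads may exist at once, with the first thread printing while the second thread and the main thread wait at mu.lock(); (that is, they block on that statement). Of course a threading library offers more than this one form of synchronisation; I will not cover the others here.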
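The effect of the mutex can also be checked without reading console output. The following minimal sketch writes into a shared std::string instead of cout, still one character at a time, and holds the lock for the whole line. It assumes a C++11 compiler and uses std::thread and std::mutex in place of the Boost types; AppendLine and ThreeHellos are hypothetical helper names.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <thread>

// Appends `text` to `out` one character at a time while holding `mu`,
// so a whole line is written before another thread can interleave.
void AppendLine(std::string& out, std::mutex& mu, const char* text)
{
    std::lock_guard<std::mutex> guard(mu);
    for (const char* pc = text; *pc; ++pc)
        out += *pc;
    out += '\n';
}

// Two worker threads plus the calling thread each write one line.
std::string ThreeHellos()
{
    std::string out;
    std::mutex mu;
    std::thread t1(AppendLine, std::ref(out), std::ref(mu), "Hello World!");
    std::thread t2(AppendLine, std::ref(out), std::ref(mu), "Hello World!");
    AppendLine(out, mu, "Hello World!");
    t1.join();
    t2.join();
    return out;
}
```

The order in which the three threads get the lock is not fixed, but since every line is identical the result is always three intact lines.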

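One big advantage of Boost is how well its libraries fit together. This sometimes steepens the learning curve: when you want to use one library, you end up following the vine from one to the next and learning them all. Take my own case. I have been studying Boost for a while now (and have written four or five articles on it), going from boost::for_each, boost::bind, boost::lambda, boost::function and boost::string_algo to boost::thread today, when all I originally wanted was to learn boost::asio properly. If you do follow the trail, you gain a deeper understanding of the C++ language, of the STL, of generic programming and more. While learning Python at the same time, I even felt that Boost changes many of C++'s language features, albeit by emulation. Enough digression; the point is simply that boost::thread is also tightly integrated with other Boost libraries, which makes it very convenient to use. The next example demonstrates several of them together.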

example4:

#include <boost/thread.hpp>

#include <boost/thread/mutex.hpp>

#include <iostream>

#include <boost/function.hpp>

#include <boost/bind.hpp>

#include <boost/lambda/lambda.hpp>

#include <boost/lambda/bind.hpp>

using namespace std;

using namespace boost;

void HelloWorld()

{

const char* pc = "Hello World!";

do

{

cout <<*pc;

}while(*pc++);

cout <<endl;

}

void NormalFunThread()

{

thread loThread1(HelloWorld);

thread loThread2(HelloWorld);

HelloWorld();

loThread1.join();

loThread2.join();

}

void BoostFunThread()

{

thread loThread1(HelloWorld);

function< void(void) > lfun = bind(HelloWorld);

thread loThread2(bind(HelloWorld));

thread loThread3(lfun);

loThread1.join();

loThread2.join();

loThread3.join();

}

int main()

{

// NormalFunThread();

BoostFunThread();

return 0;

}

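As messy as ever:

HHHeeelllllolo o W WoWoorrrlldld!d!

But it is exactly this mess that tells us we have entered the real, messy world of multithreading -_-!

Remember how the poor Win32 API passes arguments to a thread?

Look at the prototype of its thread function: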

DWORD WINAPI ThreadProc(

  LPVOID lpParameter

);

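The key point is that the thread function must have exactly this form. The rules are fixed: the return type is DWORD and the parameter is LPVOID (void*). You have no choice over the prototype or over how arguments are passed. When you need to pass a lot of data, the only way is to pack it into a struct, pass a pointer to the struct, and then cast it back. This is poor programming style, but it is an unavoidable compromise.

Notice that our HelloWorld does not meet this requirement at all, yet we used it all the same. That is one big advantage of boost::thread. The biggest advantage lies in argument passing: it breaks completely out of the old rigid framework and lets you use threads however you like.

An example: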

example5:

#include <boost/thread.hpp>

#include <boost/thread/mutex.hpp>

#include <iostream>

#include <boost/function.hpp>

#include <boost/bind.hpp>

#include <boost/lambda/lambda.hpp>

#include <boost/lambda/bind.hpp>

using namespace std;

using namespace boost;

mutex mu;

void HelloTwoString(const char *pc1, const char *pc2)

{

mu.lock();

if(pc1)

{

do

{

cout <<*pc1;

}while(*pc1++);

}

if(pc2)

{

do

{

cout <<*pc2;

}while(*pc2++);

cout <<endl;

}

mu.unlock();

}

void BoostFunThread()

{

const char* lpc1 = "Hello ";

const char* lpc2 = "World!";

thread loThread1(HelloTwoString, lpc1, lpc2);

function< void(void) > lfun = bind(HelloTwoString, lpc1, lpc2);

thread loThread2(bind(HelloTwoString, lpc1, lpc2));

thread loThread3(lfun);

loThread1.join();

loThread2.join();

loThread3.join();

}

int main()

{

BoostFunThread();

return 0;

}

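Thread creation is no longer in question here, so a mutex is used to make the output easy to read. Look at how the arguments are passed: has it not truly reached the do-as-you-please level? :)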
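The contrast between the two styles of argument passing can be sketched side by side. The first entry point below imitates the Win32 shape (a single void* that must be cast back to a struct); the second keeps a natural, typed signature. This is only an illustration under assumptions: it uses C++11 std::thread rather than CreateThread or boost::thread, and GreetingArgs, GreetUntyped, GreetTyped, RunUntyped and RunTyped are hypothetical names.

```cpp
#include <string>
#include <thread>

// Win32 style: everything is packed into one struct and passed as void*.
struct GreetingArgs
{
    const char* first;
    const char* second;
    std::string* result;
};

void GreetUntyped(void* param)
{
    // The compiler cannot check that `param` really points to a GreetingArgs.
    GreetingArgs* args = static_cast<GreetingArgs*>(param);
    *args->result = std::string(args->first) + args->second;
}

// Thread-library style: the entry point keeps its typed parameters.
void GreetTyped(const char* first, const char* second, std::string* result)
{
    *result = std::string(first) + second;
}

std::string RunUntyped()
{
    std::string result;
    GreetingArgs args = {"Hello ", "World!", &result};
    std::thread t(GreetUntyped, &args);
    t.join();
    return result;
}

std::string RunTyped()
{
    std::string result;
    std::thread t(GreetTyped, "Hello ", "World!", &result);
    t.join();
    return result;
}
```

Both versions produce the same string. The difference is that a mistake in the typed version (a wrong argument type or count) is rejected at compile time, while the untyped version would compile and fail at run time.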

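Most important of all, this is still built on the solid rock of C++'s strong typing, with no hack-style casts. The importance of this cannot be overstated. Once again I sigh: whenever I blame the experts for making C++ ever more complex, I should first go and use the simple, type-safe, efficient libraries that this complexity makes possible. That is as far as I will go with boost::thread for now. It feels a bit like only scratching the surface, but it is better to get the outline first and learn the details when I actually need them; otherwise learning would not be very efficient.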


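June 1, 2009

Humour among Python enthusiasts: how SIP got its name

To make it easier to write Python extensions in C/C++, there is a well-known tool called SWIG (itself an English word, meaning to drink in large gulps). Its full name is the Simplified Wrapper and Interface Generator. When Python enthusiasts set out to build a lightweight tool of the same kind specifically for Python, they named it SIP (to drink in small mouthfuls). The naming is brilliant. I find that C/C++ programmers generally have a strong sense of humour on programming topics. Do not underestimate SIP: PyQt is built on it.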


May 30, 2009

Continuing with Boost: boost::string_algo makes a hopeless class like std::string a bit more useful


write by 九天雁翎(JTianLing) -- www.jtianling.com


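I. How my attitude to string slowly changed

As a beginner I had no standing to criticise a language and standard library created by experts. The textbooks I read early on, such as C++ Primer and Effective C++, always recommended using string rather than char* in almost all cases. Reading further, I found books that criticise parts of the C++ library, for example C++ STL Library (many books criticise string, though I cannot name one offhand -_-!). Even after reading many such criticisms I still felt it was better to use std::string wherever possible, until the more I used it, the more inconvenient it felt. Apart from working conveniently with the standard algorithms, std::string is not even as pleasant to use as CString, and its performance problems are far worse than I imagined. When optimising code I once found that, in the usual for-loop idiom, nearly all standard containers have their begin()/end() iterator calls inlined, but std::string alone does not, and dutifully makes the call every time (tested on VS2005; the result is the same with full optimisation).

Even if you could put that down to a flaw in Microsoft's string implementation (in fact almost all existing STL implementations derive from HP's library), a string type that has no case-conversion function in std, and not even a case-insensitive comparison, is really indefensible. Think how powerful strings are in Python, Lua and similar languages! Is it because the designers of C++ all work on operating systems and rarely process strings? -_-! I will not go into the historical reasons for the lack of regular expressions.

II. If a wheel is missing, someone will build it

So when such functionality is needed, you end up using replace, stricmp and a long string of arguments, which is not very intuitive to read. I can accept that for ordinary functionality, but doing it over and over for common operations is irritating. Then the next thought arises: write my own library and reuse it next time. And then I realise that when a language makes people build a wheel that should already exist, the language itself has a problem.

And so the wheel-builder boost::string_algo was born, which shows how bad std::string is. Naturally a library like this has no chance of entering the standard, so use it quietly on your own.

III. boost::string_algo: a personal overview

I have skimmed string_algo and it is fairly powerful, implementing a great deal. The functions I consider essential are there, such as case conversion: to_upper and to_lower. The string functions I enjoy in Python are there too, such as trim, trim_left and trim_right (strip, lstrip and rstrip in Python), as is starts_with, which is used all the time in Python, along with many find functions. What pleases me most is that the split function is there as well. It is very useful for processing CSV data; one function can often save a lot of work. The function parameters also reveal the culture of each language. Many C++ algorithms are offered in an _if form that requires the caller to supply a predicate. Even trim_left takes no character argument saying what to strip; instead there is trim_left_if, where you decide. Personally I feel that, although the _if form can sometimes greatly widen an algorithm's applicability, for simple uses it strays far from the "make simple things easy" principle. Besides, in a C++ with no lambda syntax and no convenient bind, but with strongly typed, complex function declarations, complicated function-pointer syntax and extremely complicated member-function-pointer syntax, using such an interface correctly and well is not that simple.

Lately I keep comparing C++ with Python, and I feel rather like a Python programmer new to C++ who will not stop complaining. In fact I make my living entirely from C++ and only study Python in my spare time (I use it at work too, but rarely). My view has always been that comparing the syntax of Python, a very slow dynamically typed language that does not care about efficiency, with that of C++, a strongly typed language built around efficiency, is very unfair to C++. Still, experience from another language is very helpful for reflecting on what a language needs and how to use the existing language better. It is this mutual stimulation that led me to learn many Boost libraries from the standpoint of need, as in "Longing for the ultimate in simplicity, but reluctantly learning to use for_each in C++", "C++ actually needs lambda syntax more than Python does; a pity it has none" and "boost::function: letting C++ functions be used as first-class values too". Otherwise it would be learning by purely following the documentation, which is far less effective.

IV. Examples:

Here are the examples from my trials of the functions that interest me. Once you have a rough idea and keep the reference documentation at hand, you will find, when the need arises, that your toolbox has gained an important tool.

1. Case conversion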

void CaseChange()

{

string lstr("abcDEF");

string lstr2(lstr);

// Standard form (::tolower selects the <cctype> function, avoiding the <locale> overload)

transform(lstr.begin(), lstr.end(), lstr.begin(), ::tolower);

cout <<lstr <<endl;

// string_algo

to_lower(lstr2);

cout <<lstr2 <<endl;

}

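What is a little odd is that a function like to_lower does not return the converted string, which would allow convenient chaining. Is this again the principle of not paying for an operation you may not need? Quite possibly. Think of the standard map.erase function, and see the comments and explanation in C++ STL Library. This is the usual C++ style.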
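For comparison, a value-returning variant is easy to write with the standard library alone. The sketch below is not how string_algo implements it; ToLowerCopy is a hypothetical helper, and the lambda assumes a C++11 compiler. Taking the parameter by value means the caller's string is left untouched, and the cast to unsigned char avoids undefined behaviour in std::tolower for negative char values.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns a lower-cased copy and leaves the caller's string untouched.
std::string ToLowerCopy(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}
```

Because the function returns its result, calls can be combined in one expression, for example ToLowerCopy(a) == ToLowerCopy(b).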

2. Case-insensitive comparison

void CaseIComp()

{

string lstrToComp("ABCdef");

string lstr("abcDEF");

string lstr2(lstr);

string lstrTemp;

string lstrTemp2;

// Standard form, the worst way

transform(lstr.begin(), lstr.end(), back_inserter(lstrTemp), ::tolower);

transform(lstrToComp.begin(), lstrToComp.end(), back_inserter(lstrTemp2), ::tolower);

cout <<(lstrTemp == lstrTemp2) <<endl;

// Usual form (stricmp is a common compiler extension, not ISO C++)

cout << !stricmp(lstr.c_str(), lstrToComp.c_str()) <<endl;

// string_algo 1

cout << (to_lower_copy(lstr2) == to_lower_copy(lstrToComp)) <<endl;

// no changed to original values

cout << lstr2 <<" " <<lstrToComp <<endl;

// string_algo 2 best ways

cout << iequals(lstr2, lstrToComp) <<endl;

}

So we now have a better way than calling stricmp, as the example shows.
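When Boost is not available, a comparison in the spirit of iequals can be sketched with std::equal and a character predicate, without creating lower-cased temporaries. EqualsIgnoreCase is a hypothetical name, the lambda assumes C++11, and the comparison is per byte in the current C locale, so it is only a sketch for plain ASCII text.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Compares two strings character by character, ignoring case.
bool EqualsIgnoreCase(const std::string& lhs, const std::string& rhs)
{
    if (lhs.size() != rhs.size())
        return false;   // different lengths can never be equal
    return std::equal(lhs.begin(), lhs.end(), rhs.begin(),
                      [](unsigned char a, unsigned char b)
                      { return std::tolower(a) == std::tolower(b); });
}
```

The length check comes first because the three-iterator form of std::equal assumes the second range is at least as long as the first.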

3. Trimming

void TrimUsage()

{

// Only the left trim is shown; trimming both sides or the right is similar

string strToTrim=" hello world!";

cout <<"Orig string:[" <<strToTrim <<"]" <<endl;

// Standard form:

string str1;

for(string::size_type i = 0; i < strToTrim.size(); ++i)

{

if(strToTrim[i] != ' ')

{

// str1 = &(strToTrim[i]); works most of the time, but the standard does not guarantee it

str1 = strToTrim.c_str() + i;

break;

}

}

cout <<"Std trim string:[" <<str1 <<"]" <<endl;

// string_algo 1

string str2 = trim_left_copy(strToTrim);

cout <<"string_algo string:[" <<str2 <<"]" <<endl;

// string_algo 2

string str3 = trim_left_copy_if(strToTrim, is_any_of(" "));

cout <<"string_algo string2:[" <<str3 <<"]" <<endl;

}

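This shows how complicated a common operation is in std. Even implementing it directly with char* would be simpler and faster; the string version loses both efficiency and elegance. With a suitable library it is very simple. If std does offer a simpler way, I welcome suggestions, since after all I have no Boost available in my actual work.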
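One shorter standard-library possibility is std::string::find_first_not_of, which locates the first character that should be kept. The sketch below wraps it in a hypothetical helper, TrimLeftCopy, whose second parameter lists the characters to strip (an illustrative choice, not the string_algo interface).

```cpp
#include <string>

// Returns a copy of `text` without leading characters found in `chars`.
std::string TrimLeftCopy(const std::string& text, const std::string& chars = " ")
{
    const std::string::size_type first = text.find_first_not_of(chars);
    if (first == std::string::npos)
        return std::string();   // empty input, or nothing but trimmed characters
    return text.substr(first);
}
```

A right trim follows the same pattern with find_last_not_of, keeping the substring up to and including the position it returns.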

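4. Splitting

Anyone who does not regularly process large amounts of data will not appreciate how important and how common string splitting is. Among general-purpose data exchange formats, CSV (comma-separated values) is one of the most common. It is convenient not only for moving data between databases and between languages, but even for opening in Excel for analysis. Python's string has a built-in split function that is very handy and that I use frequently. C++ is not so convenient: you have to implement a split function yourself, which is a nuisance.

Example: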

void SplitUsage()

{

// A typical piece of CSV data

string strToSplit("hello,world,goodbye,goodbye,");

typedef vector< string > SplitVec_t;

SplitVec_t splitVec1;

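// std algorithm: string's native search functions always return an index rather than an iterator, which feels out of place

// One might as well use the standard algorithms. Were string and the STL designed separately, with string predating iterators?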

string::size_type liWordBegin = 0;

string::size_type liFind = strToSplit.find(',');

string lstrTemp;

while(liFind != string::npos)

{

lstrTemp.assign(strToSplit, liWordBegin, liFind - liWordBegin);

splitVec1.push_back(lstrTemp);

liWordBegin = liFind+1;

liFind = strToSplit.find(',', liWordBegin);

}

lstrTemp.assign(strToSplit, liWordBegin, liFind - liWordBegin);

splitVec1.push_back(lstrTemp);

BOOST_FOREACH(string str, splitVec1)

{

cout <<" split string:" <<str <<endl;

}

// string_algo

SplitVec_t splitVec2;

split( splitVec2, strToSplit, is_any_of(",") );

BOOST_FOREACH(string str, splitVec2)

{

cout <<" split string:" <<str <<endl;

}

}
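Another standard-library route is std::getline with a custom delimiter on a std::istringstream. The sketch below uses a hypothetical helper name, SplitFields. One difference from the hand-written loop above is worth noting: that loop pushes a final empty string when the input ends with a comma, whereas the getline version stops when the stream is exhausted and therefore drops that trailing empty field.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits `line` at every `delimiter`. Empty fields in the middle are kept;
// an empty field after a trailing delimiter is not produced.
std::vector<std::string> SplitFields(const std::string& line, char delimiter)
{
    std::vector<std::string> fields;
    std::istringstream stream(line);
    std::string field;
    while (std::getline(stream, field, delimiter))
        fields.push_back(field);
    return fields;
}
```

For strict CSV handling where a trailing empty column matters, the index-based loop (or a check for a delimiter at the end of the line) is the safer choice.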

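5. Others:

As for the remaining search and replace functions, the standard library is already fairly strong there, so string_algo is icing on the cake; my need for search and replace is not that pressing. Please refer to Boost's own documentation. Offering find iterators for searching does improve usability somewhat. As for regular expressions, the heavyweight of string processing, TR1 already includes a regex library, so prefer that in everyday use. Compared with string_algo, whose fate is to live on in Boost, TR1's regex will surely be more portable and more widely usable in future.

V. Thin mud will never stick to the wall

string_algo can make up for std::string's missing functionality, but what about string's performance problems? Because string is so weak, even stringstream has serious problems in use (with memory); search online if you are interested. In my early days at work I trusted the standard library too much and used stringstream to convert integers to strings. The experience was nothing but pain: the company's servers were all brought down twice because of it. Even as a local variable, stringstream would consume memory without limit, something I could never have imagined.

Real-world usage also reflects the sad state of std::string: how many real projects have not implemented their own string class?

Hoping that std::string will be reborn once the regex library arrives is not very realistic either. std::string may be a good way to simplify everyday programming, but in programs with even modest demands on performance and memory footprint, I expect its use will remain rare.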


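May 27, 2009

The terrifying Boost library: is there anything it does not have? It changed my plans for developing a cross-platform support library. What would I do if I could no longer use Boost?!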

write by 九天雁翎(JTianLing) -- www.jtianling.com


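Recently, taking asio as my starting point, I began learning some Boost libraries I had not used before, and gradually found Boost far more powerful than I had imagined. Previously I used only Boost's smart pointers. Later, after TR1 came out and while learning regular expressions, I tried boost::regex, which will surely enter the C++09 standard. I had not used much else: there is little time for study at work, my company's development forbids Boost entirely, and there is so much else to learn. Only recently, when I set out to learn network programming and found asio, did I gradually come back to Boost. On top of that, learning Python kept making me reflect on the existing C++ way of doing things (mostly on syntax; my level is not that high yet), and I often went back to look at the corresponding C++ usage. Comparing the syntactic strengths of a very slow dynamic language with C++, which lives for efficiency, is not quite fair, but it at least deepened my understanding of many things, as three consecutive articles show:

They are "Longing for the ultimate in simplicity, but reluctantly learning to use for_each in C++", "C++ actually needs lambda syntax more than Python does; a pity it has none" and "boost::function: letting C++ functions be used as first-class values too". I could perhaps have written one on boost::bind as well, but I find nested, repeated bind syntax rather off-putting, so I dropped the idea. boost::bind is more elegant than the standard library's current bind1st and bind2nd plus a pile of mem_funXXX helpers, but even if the new standard adds bind and mem_fn, it still feels like patching the language and is not all that elegant. Exactly how elegant it should be, I cannot really say. I will just say quietly: look at Python's list comprehensions.

However, learning more about Boost has greatly reduced my desire to write a portable network framework myself. Partly I lack the energy, and partly I find I cannot write anything better than what Boost already has (though perhaps something leaner). Below I list the original plan and compare it with what Boost already provides. For the detailed plan see my earlier article "Study/development plan outside work (1) -- developing a support library for windows/linux server programs".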

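Study/development plan outside work (1) -- developing a support library for windows/linux server programs

Simplify the development of server programs and design a sound server framework. --- main goal

Have a project (for me, usually a server program) compile and run on both windows and linux from one set of sources. -- secondary goal, but it must be achieved -_-!

For a support library, one very important point is to expose, as far as possible, a single OS-independent interface.

Requirements list: (only what I can think of now, with my knowledge still very limited; more to be added as needed)

Network framework support:

TCP on windows servers uses the IOCP model

TCP on linux servers uses the epoll model

UDP uses the same code on both

In addition, simple network applications (in my company these are what the client runs) use the select model. (With my shallow knowledge of networking I do not understand why. select is itself an I/O multiplexing model, but the clients actually use a single socket, a single connection and a single processing thread. Is it only to avoid a conflict between blocking on input and blocking in the socket API?)

Other server support libraries

1. Serialization support.

2. Logging support.

3. Script-language configuration file support. (A scripting system could be considered.)

4. Runtime exception dump support.

5. Multithreading support. (Many books say a single process is better on Linux, but then sharing one source base with windows would probably be even harder.)

6. A posix library that also runs on windows.

7. ODBC library / mySQL API support library (either one)

8. Multi-process support (optional).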

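Here is a comparison with what Boost already implements.

First, my goal: a portable network framework intended to simplify server development.

As a quasi-standard library, Boost's portability is more than good enough. Just look at the list of supported compilers/environments in its documentation; it goes far beyond my original goal of Windows/Linux portability.

Second, the core module, the network framework: boost::asio is implemented exactly as I had envisaged, with IOCP on windows and epoll on linux. It is a lightweight network library already in fairly wide use (see "Who is using Asio?") and might even enter the next C++ standard (though I personally think the chance is small), which again far exceeds my original goal. Of course I never aimed to match something like ACE; asio is very good as it is.

Then the other support libraries:

1. Serialization support. ==> The boost::serialization library: powerful, powerful, and powerful again. I have a series of articles on its power and usage: "Serialization support (1)", "Serialization support (2): Boost's serialization library", "Serialization support (3): using Boost's serialization library" and "Serialization support (4): the strengths of Boost's serialization library". Sadly, although B.S. is said to have remarked that the biggest regrets in current C++ are the lack of data persistence and garbage collection, boost::serialization still seems to have no hope of entering the next C++ standard. In other words, C++ will continue to have no standard way to serialize.

2. Logging support. ==> Boost reportedly had one once, but development stopped. I also recently saw a logging library based on standard iostream somewhere but cannot find it just now. Failing that, log4cxx is fine, and failing that, writing one is not hard (at work I once finished a logger that writes from a separate thread in one day, testing included).

3. Script-language configuration file support. (A scripting system could be considered.) ==> Even better here. boost::Program Options is exactly the matching choice. And if I want to use Python, I am not limited to the Python C API. Mapping C++ classes purely through the Python C API seems fairly hard to me, but with boost::Python things are different. Life can be better.

4. Runtime exception dump support. ==> Apart from Google's breakpad, which I looked into before, I do not know of a better choice; failing that I will have to do it myself. Note that this dump is not only for errors/exceptions: I should be able to take a core dump whenever I need one without affecting the running program.

5. Multithreading support. ==> boost::thread is made for exactly this, and its implementation is good, although implementing one myself would not be too hard either.

6. A posix library that also runs on windows. ==> boost::System is designed more like a collection of error codes, but once Boost is used heavily it is doubtful whether many posix facilities are still needed. boost::Filesystem and boost::Date Time, for instance, provide strong support.

8. Multi-process support ==> Process creation differs between Windows (CreateProcess) and Linux (fork), but not so much that it cannot be unified. The more troublesome part is that Windows has no concept of zombie processes, so a process ID cannot truly identify a process: a process with ID 100 exits, and a new process may then start with ID 100. The parent-child relationship is also much weaker than on Linux; for example there is no getppid system call. After CreateProcess, once the parent closes all handles to the child, the two are almost unrelated; about the only remaining link is that the child inherited the parent's kernel object list. On Linux, by contrast, the child must wait for the parent to "reap" it :) The real core problem, however, is IPC (inter-process communication), and happily boost::Interprocess meets that need perfectly.

As described above, a portable support library for network server development is basically a subset of Boost. We can also use libraries such as boost::Signals to dispatch and map network packets, and smart pointers, boost::function, boost::lambda, boost::bind and boost::foreach to simplify our programs. Can this C++ world get any better? Not until the next C++ standard arrives.

Perhaps the only problem with using Boost is this: having let me use such a good library, what would I do if I could no longer use it?!

Of course, always having to ship a lot of Boost DLLs with your program is something of a problem too.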


May 27, 2009

boost::function: letting C++ functions be used as first-class values too


write by 九天雁翎(JTianLing) -- www.jtianling.com


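While learning some of Python's more unusual syntax recently, I also looked into similar facilities in C++. Parts of this article overlap with two articles I wrote a few days ago, "Longing for the ultimate in simplicity, but reluctantly learning to use for_each in C++" and "C++ actually needs lambda syntax more than Python does; a pity it has none". I found that the equivalents all go through Boost, and in a rather contorted way, but at least they more or less get there. Then the day before yesterday my work happened to be supporting work, leaving me a little time, so I skimmed "Beyond the C++ Standard Library: An Introduction to Boost" and understood a bit more. A word of praise: it works well as a guide. With a rough idea from it, reading the Boost documentation is far more effective. The Boost documentation is already very good, but documentation is still not the same as teaching.

Of course, even if this library really does enter the C++09 standard, it is still not as simple and efficient as having functions as true first-class values. But that is B.S.'s principle: avoid extending the language and implement things in libraries wherever possible, until there is no other way. His statement in D&E that C++ needs no native concurrency support and a library will do (even though there was not even a standard library for it) is still fresh in my mind, yet the addition of a memory model for concurrency has basically been approved for the 09 standard. I do not know why. His view that a language need only meet its needs and should not become a patchwork of features evidently also keeps features such as reflection and closures out of the 09 standard. A sigh of resignation.

A brief statement of the problem: in languages where functions are first-class values, you can store, dynamically create and pass a function just like an ordinary integer. Python and Lua let you do this. In a language like C++, you need a function pointer to hold an ordinary function and a class object to hold a function object.

Taking the example from "C++ actually needs lambda syntax more than Python does; a pity it has none": in Python you can write: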

def add1(a,b):  return a + b
add2 = lambda a,b : a + b

class Add():
    def __init__(self):
        self._i = 0

    def reset(self):
        self._i = 0

    def add(self, a, b):
        self._i += a + b
        return self._i

addobj = Add()
add3 = addobj.add

print(add1(1,1))
print(add2(1,2))
print(add3(1,3))
addobj.reset()

fun = lambda f,a,b : f(a,b)

print(fun(add1, 1, 1))
print(fun(add2, 1, 2))
print(fun(add3, 1, 3))
print(fun(lambda a,b : a + b, 1, 4))

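Once a function has been defined, using it later is like using any ordinary function. The call does not care how the function was actually defined, whether with lambda syntax, as a full function definition, or even as a method of a class: the syntax is exactly the same.

In C++ you must distinguish whether it is a function object or a function pointer, and whether it is a free function or a member function of an object. With Boost you must add the objects produced by boost::lambda that you want to store. If functions really were first-class values, all these forms could share one syntax; try implementing the example above in C++ and you will see how much the syntax differs among them. Clearly we cannot hope for that for now, but fortunately there is boost::function.

In fact, as far as C++ is concerned, if all you need is a callback, templates already simplify things a good deal. Function templates deduce their template arguments automatically, so we can achieve some of the above with almost no awkward syntax, for ordinary functions, function objects and boost::lambda expressions alike.

For example: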

#include <cstdlib>

#include <typeinfo>

#include <list>

#include <iostream>

#include <boost/lambda/lambda.hpp>

#include <boost/bind.hpp>

using namespace std;

using namespace boost;

int add1(int a, int b)

{

return a + b;

}

class add2

{

public:

int operator()(int lhs, int rhs)

{

return lhs + rhs;

}

};

class CAdd

{

public:

int add3(int a, int b) { return a + b; }

};

template<typename FUN, typename T>

T fun(FUN function, T lhs, T rhs)

{

cout <<typeid(function).name() <<endl;

return function(lhs, rhs);

}

int add4(int a, int b, int c)

{

return a+b+c;

}

int main()

{

cout << fun(add1, 1, 1) <<endl;

cout << fun(add2(), 1, 2) <<endl;

cout << fun(lambda::_1+lambda::_2, 1, 3) <<endl;

cout << fun(bind(add4, 0, _1, _2), 1, 4) <<endl;

system("PAUSE");

}

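Up to here there is no great problem, and the syntax is concise enough; the power of templates is evident. But once you start using pointers to member functions, or need to keep the object produced by a lambda expression or by boost::bind for repeated use, you run into trouble. Leave aside member function pointers, which you need to have read "Inside the C++ Object Model" to roughly understand, which break the usual rules for C++ pointers, and which may even be larger than sizeof(void*). Without knowing the source of these Boost libraries, how are we to know what bind and lambda produce? I certainly do not. I think boost::function gives us a reasonably uniform interface for all of this, although, to be honest, in the cases listed in the example above boost::function is even more complicated than plain templates, since you at least have to state the return type and parameters (commonly called the function signature).

In the example above I deliberately used run-time type identification to print the types produced by lambda and bind. They are: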

int (__cdecl*)(int,int)

class add2

class boost::lambda::lambda_functor<class boost::lambda::lambda_functor_base<class boost::lambda::arithmetic_action<class boost::lambda::plus_action>,class boost::tuples::tuple<class boost::lambda::lambda_functor<struct boost::lambda::placeholder<1> >,class boost::lambda::lambda_functor<struct boost::lambda::placeholder<2> >,struct boost::tuples::null_type,struct boost::tuples::null_type,struct boost::tuples::null_type,struct boost::tuples::null_type,struct boost::tuples::null_type,struct boost::tuples::null_type,struct boost::tuples::null_type,struct boost::tuples::null_type> > >

class boost::_bi::bind_t<int,int (__cdecl*)(int,int,int),class boost::_bi::list3<class boost::_bi::value<int>,struct boost::arg<1>,struct boost::arg<2> > >

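When I already find a function pointer of the form int (*)(int,int) complicated, how am I supposed to declare lambda and bind types that are too long even to read? Now look at the same example using boost::function.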

#include "stdafx.h"

#include <list>

#include <iostream>

#include <boost/lambda/lambda.hpp>

#include <boost/bind.hpp>

#include <boost/function.hpp>

#include <boost/ref.hpp>

using namespace std;

using namespace boost;

int add1(int a, int b)

{

return a + b;

}

class add2

{

public:

int operator()(int lhs, int rhs)

{

return lhs + rhs;

}

};

class CAdd

{

public:

int add3(int a, int b) { return a + b; }

};

template<typename FUN, typename T>

T fun(FUN function, T lhs, T rhs)

{

cout <<typeid(function).name() <<endl;

return function(lhs, rhs);

}

int add4(int a, int b, int c)

{

return a+b+c;

}

int fun2(boost::function< int(int,int) > function, int a, int b )

{

cout <<typeid(function).name() <<endl;

return function(a,b);

}

int main()

{

cout << fun2(add1, 1, 1) <<endl;

cout << fun2(add2(), 1, 2) <<endl;

cout << fun2(lambda::_1+lambda::_2, 1, 3) <<endl;

cout << fun2(bind(add4, 0, _1, _2), 1, 4) <<endl;

system("PAUSE");

}

Again the type identified inside the function is printed:

class boost::function<int __cdecl(int,int)>

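You will see that with boost::function the printed type is completely uniform; compared with the previous output I would call it a delight to the eye. In particular, boost::function can also store these callables (the concept does not seem prominent in C++, but it is important in many languages) and invoke them through one interface. This is where its power really shows: as my title says, it lets C++ functions behave like first-class values (although they still are not) and unifies all callables.

For example: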

#include "stdafx.h"

#include <list>

#include <iostream>

#include <boost/lambda/lambda.hpp>

#include <boost/bind.hpp>

#include <boost/function.hpp>

#include <boost/ref.hpp>

using namespace std;

using namespace boost;

int add1(int a, int b)

{

return a + b;

}

class add2

{

public:

int operator()(int lhs, int rhs)

{

return lhs + rhs;

}

};

class CAdd

{

public:

int add3(int a, int b) { return a + b; }

};

template<typename FUN, typename T>

T fun(FUN function, T lhs, T rhs)

{

cout <<typeid(function).name() <<endl;

return function(lhs, rhs);

}

int add4(int a, int b, int c)

{

return a+b+c;

}

int fun2(boost::function< int(int,int) > function, int a, int b )

{

cout <<typeid(function).name() <<endl;

return function(a,b);

}

int main()

{

boost::function< int(int,int) > padd1 = add1;

boost::function< int(int,int) > padd2 = add2();

boost::function< int(int,int) > padd3 = lambda::_1+lambda::_2;

boost::function< int(int,int) > padd4 = bind(add4, 0, _1, _2);

cout << fun2(padd1, 1, 1) <<endl;

cout << fun2(padd2, 1, 2) <<endl;

cout << fun2(padd3, 1, 3) <<endl;

cout << fun2(padd4, 1, 4) <<endl;

system("PAUSE");

}

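When you find you can handle C++ callables like this, do you not feel that C++'s language features have taken a step forward? ^^ I certainly do; it even feels like using functions as first-class values in Python or Lua.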
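The examples above declare CAdd::add3 but never store it. The one case from the Python listing that is still missing, a bound method such as add3 = addobj.add, can be sketched as follows. This assumes a C++11 compiler and uses std::function and std::bind, the standardised counterparts of boost::function and boost::bind. Adder, AddFree, Apply and DemoBoundMember are hypothetical names for this illustration.

```cpp
#include <functional>

// Mirrors the Python Add class: a running total updated by each call.
class Adder
{
public:
    Adder() : total_(0) {}
    void Reset() { total_ = 0; }
    int Add(int a, int b) { total_ += a + b; return total_; }
private:
    int total_;
};

int AddFree(int a, int b) { return a + b; }

// Calls any callable with the signature int(int, int).
int Apply(const std::function<int(int, int)>& f, int a, int b) { return f(a, b); }

int DemoBoundMember()
{
    using namespace std::placeholders;
    Adder adder;
    std::function<int(int, int)> add1 = AddFree;
    // Bind the member function to a specific object, like addobj.add in Python.
    std::function<int(int, int)> add3 = std::bind(&Adder::Add, &adder, _1, _2);

    int first = Apply(add3, 1, 3);    // adder's running total becomes 4
    int second = Apply(add3, 1, 3);   // same object, so the total becomes 8
    return Apply(add1, 1, 1) + first + second;
}
```

Binding a pointer to the object (&adder) rather than a copy is what keeps the running total shared between calls; the object must then outlive the stored function.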

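The good news is that bind and function both have a fair chance of entering the C++09 standard ^^ As for lambda, it still looks uncertain. B.S. will no doubt keep saying that a good programming language is not a heap of useful features but one that does specific jobs well in specific domains. But having more features is surely better than not having them; people can always choose not to use them. Without a feature, it can only be implemented in contorted ways by strong developers like Boost's, which has even led C++ down the wrong path of chasing clever tricks. Had these features existed from the start, why would anyone do this? Then again, the pure-C users will speak up: mental burden. I would very much like to experience how large programs are really organised in pure C, and to appreciate the benefit of not carrying that mental burden (I have read many pure-C books, but have never had the chance to use pure C in real development, so my understanding of C remains mostly at the C++ level, which I rather regret).

References:

1. "Beyond the C++ Standard Library: An Introduction to Boost" by Björn Karlsson

2. boost.org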

