Question: server-side memory management and allocation

liam · posted 2005-11-7 15:01:00

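As everyone knows, a server has to move large volumes of data of many different sizes, and allocating all of it with new and delete will inevitably drag server performance down badly.
I am new to game development and came up with an approach for this. I am not sure whether it is suitable, so please help me judge its feasibility. I have very little experience, and theory and practice are a long way apart.

Memory is organized somewhat like paged memory management: every page has a fixed size and represents one block of memory.

Allocation works like this: first check whether the page most recently taken from the pool has enough free space left for the request. If it does, allocate from it; if not, take a fresh page from the pool. For example, suppose the previous operation took page1 from the pool and allocated 1K from it, leaving 4K free. The next piece of data needs 3K, so 3K more is allocated from page1, leaving 1K. Then another 3K request arrives; page1 no longer has room, so a new page, PAGE2, is taken from the pool and the allocation is made there. If a small block inside a page is released after use, the page's free space generally does not grow. Allocation within a page behaves like a stack: the free space grows only when the released block is the one directly adjacent to it.

Release works like this: a page is pushed back onto the stack of free pages only when every small block allocated from it has been released; otherwise it is not. Only whole pages are returned to the pool.

Have a look at my diagram and tell me what you think. Thanks!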
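The scheme described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions, not liam's actual code: the 5K page size is inferred from the "1K allocated, 4K left" example, the names `Page`, `PagePool`, `allocate`, `release` and `free_pages` are hypothetical, the caller is assumed to pass the block size back on release, and alignment and thread safety are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kPageSize = 5 * 1024;  // assumed page size

struct Page {
    char data[kPageSize];
    std::size_t top = 0;  // offset of the first unused byte (stack-like)
    int live = 0;         // blocks handed out and not yet released
};

class PagePool {
public:
    explicit PagePool(std::size_t page_count) : pages_(page_count) {
        for (Page& page : pages_) free_.push_back(&page);
    }

    // Allocate from the current page if it has room, else take a new page.
    void* allocate(std::size_t n) {
        if (n == 0 || n > kPageSize) return nullptr;
        if (current_ == nullptr || kPageSize - current_->top < n) {
            if (free_.empty()) return nullptr;  // pool exhausted
            current_ = free_.back();
            free_.pop_back();
        }
        void* p = current_->data + current_->top;
        current_->top += n;
        ++current_->live;
        return p;
    }

    // Release a block of n bytes previously returned by allocate().
    void release(void* p, std::size_t n) {
        Page* page = owner(p);
        assert(page != nullptr && page->live > 0);
        char* block = static_cast<char*>(p);
        // Free space grows only if the block adjoins it (stack behaviour).
        if (block + n == page->data + page->top) page->top -= n;
        // The page becomes reusable only when all of its blocks are released.
        if (--page->live == 0) {
            page->top = 0;
            if (page != current_) free_.push_back(page);
        }
    }

    std::size_t free_pages() const { return free_.size(); }

private:
    // Find the page that contains p by address range.
    Page* owner(void* p) {
        char* block = static_cast<char*>(p);
        for (Page& page : pages_)
            if (block >= page.data && block < page.data + kPageSize) return &page;
        return nullptr;
    }

    std::vector<Page> pages_;   // never resized, so Page addresses stay stable
    std::vector<Page*> free_;   // stack of completely free pages
    Page* current_ = nullptr;   // page most recently taken from the pool
};
```

One consequence visible in the sketch: a page with a single long-lived block stays out of the pool no matter how much of it has been released, which is the main cost of this design.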

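sssxueren · posted 2005-11-7 21:59:00

Actually, why not simply hand out whole blocks one at a time?

For example, create a memory pool made up of a series of 5 KB blocks. Whenever you need memory, take one block; when you are done, return it to the pool's queue. That takes far fewer operations, so it is naturally more efficient.

Also, Windows has the concept of memory locking: during kernel I/O operations the memory pages involved are locked, which is why Windows' default buffers are 4 KB, exactly one memory page.

I suggest capping the maximum message size and making the buffer twice that cap, which avoids having to merge messages.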
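A minimal sketch of the fixed-block idea from this reply, assuming a single-threaded server and a pool that never grows. The class name `BlockPool` and its methods are hypothetical; the block size is a constructor parameter, so 5 KB or 4 KB blocks both work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool: every block has the same size, so
// acquire and release are each a single push or pop on a free stack.
class BlockPool {
public:
    BlockPool(std::size_t block_size, std::size_t count)
        : block_size_(block_size), storage_(block_size * count) {
        assert(block_size > 0);
        free_.reserve(count);
        for (std::size_t i = 0; i < count; ++i)
            free_.push_back(&storage_[i * block_size]);
    }

    // Take one block, or nullptr if the pool is exhausted.
    char* acquire() {
        if (free_.empty()) return nullptr;
        char* p = free_.back();
        free_.pop_back();
        return p;
    }

    // Give back a block previously obtained from acquire().
    void release(char* p) { free_.push_back(p); }

    std::size_t available() const { return free_.size(); }
    std::size_t block_size() const { return block_size_; }

private:
    std::size_t block_size_;
    std::vector<char> storage_;  // one contiguous slab holding all blocks
    std::vector<char*> free_;    // blocks currently not in use
};
```

The trade-off is the one raised later in the thread: a 100-byte message still occupies a whole block, so memory is traded for constant-time allocation.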

千里马肝 · posted 2005-11-8 09:56:00

Go and read 《池内春秋》 by Hou Jie (侯捷).

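tarkey · posted 2005-11-8 10:32:00

Here is a method I have used before and found to work well.
Set up size classes of 16 bytes, 32 bytes, 64 bytes, 128 bytes and so on up to 1K, and allocate a fixed number of blocks of each class with new in advance. When allocating, hand out a block from the smallest class that fits the request. I believe an ordinary server program rarely has a single statement that needs more than 1K of memory. Web services are the exception, of course, but here we are talking about pure logic servers.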
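The "smallest class that fits" rule can be illustrated with a small lookup function. This sketch assumes the classes are the powers of two from 16 to 1024 bytes (the post only lists 16, 32, 64, 128 "up to 1k"), and the names `size_class_index` and `size_class_bytes` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMinClass = 16;    // smallest block size
constexpr std::size_t kMaxClass = 1024;  // largest block size served by the pool
constexpr int kNumClasses = 7;           // 16, 32, 64, 128, 256, 512, 1024

// Index of the smallest size class that can hold n bytes,
// or -1 if n is zero or larger than kMaxClass.
inline int size_class_index(std::size_t n) {
    if (n == 0 || n > kMaxClass) return -1;
    int index = 0;
    std::size_t size = kMinClass;
    while (size < n) {
        size <<= 1;
        ++index;
    }
    return index;
}

// Block size in bytes for a given class index.
inline std::size_t size_class_bytes(int index) {
    assert(index >= 0 && index < kNumClasses);
    return kMinClass << index;
}
```

A request of 100 bytes, for instance, maps to index 3, the 128-byte class; a request that returns -1 would fall back to the general-purpose allocator.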

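liam · posted 2005-11-9 13:42:00

sssxueren: Re: Question: server-side memory management and allocation

Actually, why not simply hand out whole blocks one at a time?

For example, create a memory pool made up of a series of 5 KB blocks. Whenever you need memory, take one block...

My main concern is that the data size varies and in practice there are many small blocks, so I am afraid that approach would waste a lot of memory.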

疯子阿虹 · posted 2005-11-9 19:50:00

A standard memory pool system

#include <cstdlib>   // malloc, free, realloc
#include <cstring>   // memcpy

// NOTE: this code appears to be adapted from SGI STL's
// __default_alloc_template. The macros __VOLATILE, __NODE_ALLOCATOR_LOCK,
// __NODE_ALLOCATOR_UNLOCK and __NODE_ALLOCATOR_THREADS are not defined in
// this post and must be supplied before the code will compile.
template< bool threads, int inst >
class CDefaultMemPool
{
public:

    //////////////////////////////////////////////////////////////////
    // Allocate n bytes.
    static void * allocate( size_t n );

    //////////////////////////////////////////////////////////////////
    // Return n bytes at p to the pool.
    static void deallocate( void *p, size_t n );

    //////////////////////////////////////////////////////////////////
    // Reallocate (use with care in C++: contents are copied with memcpy).
    static void * reallocate( void* p, size_t old_sz, size_t new_sz );

private:
    // Scoped lock. The constructor and destructor must be public so that
    // the member functions can create a lock object on the stack.
    class lock
    {
    public:
        lock() { __NODE_ALLOCATOR_LOCK; }
        ~lock() { __NODE_ALLOCATOR_UNLOCK; }
    };

    union obj
    {
        union obj* free_list_link; // next free block while on a free list
        char client_data[1];       // start of user memory while in use
    };

private:
    /////////////////////////////////////////////////////////////////
    // Alignment granularity, largest size served by the pool, and the
    // number of free lists. The first two are in bytes.
    static const int __ALIGN = 8;
    static const int __MAX_BYTES = 128;
    static const int __NFREELISTS = __MAX_BYTES/__ALIGN;

    /////////////////////////////////////////////////////////////////
    // Current state of the pool
    static char* start_free;
    static char* end_free;
    static size_t heap_size;

    /////////////////////////////////////////////////////////////////
    // Free lists, one per block size
    static obj * __VOLATILE free_list[__NFREELISTS];

#ifdef _MT
    /////////////////////////////////////////////////////////////////
    // Multithreading control for the pool (CRITICAL_SECTION needs <windows.h>)
    static CRITICAL_SECTION __node_allocator_lock;
    static bool __node_allocator_lock_initialized;
#endif

private:

    /////////////////////////////////////////////////////////////////
    // Round bytes up to the next multiple of __ALIGN.
    static size_t ROUND_UP( size_t bytes )
    {
        return ( ( ( bytes ) + __ALIGN-1 ) & ~( __ALIGN - 1 ) );
    }

    /////////////////////////////////////////////////////////////////
    // Index of the free list that serves blocks of the given size.
    static size_t FREELIST_INDEX(size_t bytes)
    {
        return (((bytes) + __ALIGN-1)/__ALIGN - 1);
    }

    /////////////////////////////////////////////////////////////////
    // Returns an object of size n, and optionally adds to size n free list.
    static void *refill( size_t n );

    /////////////////////////////////////////////////////////////////
    // Allocates a chunk for nobjs of size "size". nobjs may be reduced
    // if it is inconvenient to allocate the requested number.
    static char *chunk_alloc( size_t size, int &nobjs );
};

template <bool threads, int inst>
char *CDefaultMemPool<threads, inst>::start_free = 0;

template <bool threads, int inst>
char *CDefaultMemPool<threads, inst>::end_free = 0;

template <bool threads, int inst>
size_t CDefaultMemPool<threads, inst>::heap_size = 0;

template <bool threads, int inst>
typename CDefaultMemPool<threads, inst>::obj * __VOLATILE
CDefaultMemPool<threads, inst>::free_list[ __NFREELISTS ] = { 0 };

// Same guard as the member declarations inside the class.
#ifdef _MT
template <bool threads, int inst> CRITICAL_SECTION
CDefaultMemPool<threads, inst>::__node_allocator_lock;

template <bool threads, int inst> bool
CDefaultMemPool<threads, inst>::__node_allocator_lock_initialized
= false;
#endif

/* We allocate memory in large chunks in order to avoid fragmenting    */
/* the malloc heap too much.                                           */
/* We assume that size is properly aligned.                            */
/* We hold the allocation lock.                                        */
template <bool threads, int inst>
char*
CDefaultMemPool<threads, inst>::chunk_alloc(size_t size, int& nobjs)
{
    char* result = 0;                          // pointer handed back to the caller
    size_t total_bytes = size * nobjs;         // total bytes requested
    size_t bytes_left = end_free - start_free; // bytes still unused in the pool

    if ( bytes_left >= total_bytes )
    {
        // The pool can satisfy the whole request (all nobjs blocks):
        // carve it off the front of the pool.
        result = start_free;
        start_free += total_bytes;
        return( result );
    }
    else if ( bytes_left >= size )
    {
        // The pool cannot satisfy the whole request but holds at least one
        // block: hand out as many whole blocks as fit and update nobjs.
        nobjs = ( int )( bytes_left / size );
        total_bytes = size * nobjs;
        result = start_free;
        start_free += total_bytes;
        return( result );
    }
    else
    {
        // The pool cannot supply even one block: ask the system for twice
        // the request plus an extra amount that grows with the heap size.
        size_t bytes_to_get = ( 2 * total_bytes ) + ROUND_UP( heap_size >> 4 );

        // Put whatever is left in the pool on the free list of the matching
        // size so that it is not wasted.
        if ( bytes_left > 0 )
        {
            obj * __VOLATILE * my_free_list =
                free_list + FREELIST_INDEX( bytes_left );
            ( ( obj* )start_free )->free_list_link = ( *my_free_list );
            *my_free_list = ( obj* )start_free;
        }

        // Ask the system for more memory.
        start_free = ( char* )malloc( bytes_to_get );
        if ( 0 == start_free )
        {
            // The system is out of memory as well. Scavenge an unused block
            // of equal or larger size from the free lists and retry with it.
            obj * __VOLATILE * my_free_list, *p;
            // Try to make do with what we have.  That can't
            // hurt.  We do not try smaller requests, since that tends
            // to result in disaster on multi-process machines.

            for ( size_t i = size; i <= ( size_t )__MAX_BYTES; i += __ALIGN )
            {
                my_free_list = free_list + FREELIST_INDEX(i);
                p = *my_free_list;
                if (0 != p) {
                    *my_free_list = p -> free_list_link;
                    start_free = (char *)p;
                    end_free = start_free + i;
                    return(chunk_alloc(size, nobjs));
                    // Any leftover piece will eventually make it to the
                    // right free list.
                }
            }
            end_free = 0; // In case of exception.
            // Last resort. Calling malloc again would just return null a
            // second time; ::operator new throws std::bad_alloc on failure
            // (after running any installed new-handler), so start_free is
            // never null past this point.
            start_free = ( char* )::operator new( bytes_to_get );
        }

        heap_size += bytes_to_get;
        end_free = start_free + bytes_to_get;

        return( chunk_alloc( size, nobjs ) );
    }
}

/* Returns an object of size n, and optionally adds to size n free list.*/
/* We assume that n is properly aligned.                               */
/* We hold the allocation lock.                                        */
template <bool threads, int inst>
void* CDefaultMemPool<threads, inst>::refill( size_t n )
{
    int nobjs = 20;
    char * chunk = chunk_alloc( n, nobjs );
    obj * __VOLATILE * my_free_list;
    obj * result;
    obj * current_obj, * next_obj;

    // If the pool gave us only one block, return it directly.
    if (1 == nobjs) return( chunk );

    // Otherwise return the first block and thread the rest onto the free list.
    my_free_list = free_list + FREELIST_INDEX(n);
    result = ( obj* )chunk;
    *my_free_list = next_obj = ( obj* )( chunk + n );

    for ( int i = 1; ; i++ )
    {
        current_obj = next_obj;
        next_obj = ( obj* )( ( char* )next_obj + n );
        if ( nobjs - 1 == i )
        {
            // The last node of the free list must link to 0;
            // that is how the end of the list is recognized.
            current_obj->free_list_link = 0;
            break;
        }
        else
        {
            current_obj->free_list_link = next_obj;
        }
    }

    return ( result );
}

template <bool threads, int inst>
void*
CDefaultMemPool<threads, inst>::allocate( size_t n )
{
    obj * __VOLATILE * my_free_list;
    obj *  result;

    if ( n > (size_t) __MAX_BYTES )
    {
        // Requests too large for the pool go straight to the system
        // allocator. Without this return the code would fall through and
        // index past the end of free_list.
        return ( malloc( n ) );
    }

    // Find the free list for this size.
    my_free_list = free_list + FREELIST_INDEX(n);

    lock lock_instance;

    result = *my_free_list;
    if ( result == 0 )
    {
        // This free list has no blocks left:
        // refill it from the pool.
        void *r = refill( ROUND_UP( n ) );
        return r;
    }
    *my_free_list = result -> free_list_link;

    return ( result );
}

template <bool threads, int inst>
void
CDefaultMemPool<threads, inst>::deallocate( void *p, size_t n )
{
    obj *q = ( obj* )p;
    obj * __VOLATILE * my_free_list;

    if ( n > ( size_t )__MAX_BYTES )
    {
        // Not pool memory: it came from the system allocator, so give it
        // back there. Returning without freeing would leak the block.
        free( p );
        return;
    }

    // Find the free list for this size.
    my_free_list = free_list + FREELIST_INDEX(n);

    // Take the lock for multithreaded use.
    lock lock_instance;

    // Push the returned block onto the front of the free list.
    q -> free_list_link = *my_free_list;
    *my_free_list = q;
}

template < bool threads, int inst >
void *CDefaultMemPool<threads, inst>::
reallocate( void *p, size_t old_sz, size_t new_sz )
{
    void * result;
    size_t copy_sz;

    // If both the old size and the new size are too large for the pool,
    // hand the request to the system's realloc.
    if ( ( old_sz > ( size_t )__MAX_BYTES ) &&
         ( new_sz > ( size_t )__MAX_BYTES ) )
    {
        return ( realloc( p, new_sz ) );
    }

    if ( ROUND_UP( old_sz ) == ROUND_UP( new_sz ) )
    {
        // Both sizes round to the same block size, so the existing
        // block already fits and is returned unchanged.
        return ( p );
    }

    result = allocate( new_sz );
    copy_sz = new_sz > old_sz? old_sz : new_sz;
    memcpy( result, p, copy_sz );
    deallocate( p, old_sz );

    return( result );
}
//////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////

//////////////////////////////////////////////////////////////
// Several default memory pools
typedef CDefaultMemPool<__NODE_ALLOCATOR_THREADS, 0> CAlloc;
typedef CDefaultMemPool<false, 0> CSingleMemPool;
typedef CDefaultMemPool<true, 0> CMultiMemPool;
//////////////////////////////////////////////////////////////

//////////////////////////////////////////////////////////////
// Default node allocator.
// With a reasonable compiler, this should be roughly as fast as the
// original STL class-specific allocators, but with less fragmentation.
// Default_alloc_template parameters are experimental and MAY
// DISAPPEAR in the future.  Clients should just use alloc for now.
//
// Important implementation properties:
// 1. If the client request an object of size > __MAX_BYTES, the resulting
// object will be obtained directly from malloc.
// 2. In all other cases, we allocate an object of size exactly
// ROUND_UP(requested_size).  Thus the client has enough size
// information that we can return the object to the proper free list
// without permanently losing part of the object.
//

// The first template parameter specifies whether more than one thread
// may use this allocator.  It is safe to allocate an object from
// one instance of a default_alloc and deallocate it with another
// one.  This effectively transfers its ownership to the second one.
// This may have undesirable effects on reference locality.
//
// The second parameter is unreferenced and serves only to allow the
// creation of multiple default_alloc instances.
// Note that containers built on different allocator instances have
// different types, limiting the utility of this approach.
//////////////////////////////////////////////////////////////
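To make the arithmetic behind ROUND_UP and FREELIST_INDEX in the code above concrete, here is a small standalone sketch that reproduces the two formulas with the same constants (alignment 8, maximum 128). The free-function names `round_up` and `freelist_index` are hypothetical stand-ins for the class's private members.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kAlign = 8;       // mirrors __ALIGN
const std::size_t kMaxBytes = 128;  // mirrors __MAX_BYTES

// Round bytes up to the next multiple of kAlign. The bit mask works
// because kAlign is a power of two.
inline std::size_t round_up(std::size_t bytes) {
    return (bytes + kAlign - 1) & ~(kAlign - 1);
}

// Index of the free list serving a request of the given size.
// Sizes 1..8 map to list 0, 9..16 to list 1, ..., 121..128 to list 15.
// bytes must be at least 1: for 0 the subtraction wraps around.
inline std::size_t freelist_index(std::size_t bytes) {
    assert(bytes >= 1 && bytes <= kMaxBytes);
    return (bytes + kAlign - 1) / kAlign - 1;
}
```

For example, a 9-byte request is rounded up to 16 bytes and served from free list 1, so up to 7 bytes per block are lost to rounding. The sketch also shows that a zero-byte request has no valid free list, which the posted allocate() does not guard against.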

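sssxueren · posted 2005-11-10 13:54:00

From an efficiency standpoint, fixed-size blocks and a memory pool like the one in STLport are both very good choices. The latter is what another poster above described with the 16/64/128 size classes.

It mainly depends on what you use it for. For very low-level buffers this approach is really the best, since you are trading a small amount of space for speed. For higher-level buffers it matters much less.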

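liam · posted 2005-11-14 17:27:00

Thanks, everyone. I have learned a lot from this.

I have been in the game industry for less than two months, so most of my questions are really about feasibility.

Thank you all for the good suggestions.

The STL one is actually also a good example of load balancing. It would be great if it could be improved a bit further...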

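liam · posted 2005-11-22 10:53:00

sssxueren: Re: Question: server-side memory management and allocation

From an efficiency standpoint, fixed-size blocks and a memory pool like the one in STLport are both very good choices. The latter is what another poster above described with the 16/64/1...

A while ago I was chatting with a colleague, who described a situation they had run into: when they relied on the system buffers, the buffers ran out under a large number of users. Packet loss was severe, and in the worst cases the system crashed. That makes me think it is necessary to replace the system buffers with memory we allocate ourselves. What do you all think?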

lingjingqiu · posted 2005-11-22 12:22:00

Slab allocation, as suggested by 樱.
