
Memory in the Kernel

One of the Kernel's main tasks is to allocate memory to processes and kernel functions. There are three primary kernel-based memory allocation schemes:

  • Bootmem allocator - only used during system boot
  • Buddy Allocator - gives out whole pages
  • Slab Allocator - allocates smaller blocks of memory

Kmalloc

Kmalloc is used to grab small pieces of memory in kernel space.

Kmalloc is typically used in driver code. More information can be found in the man pages.

    // get memory
    char *p = kmalloc(size, flags);
    if (!p)
        return -ENOMEM;

    // free memory
    kfree(p);
    p = NULL;

kmalloc returns NULL if the allocation fails, so check the result before using the pointer.

The flags argument may be one of:

  • GFP_USER - Allocate memory on behalf of user. May sleep.
  • GFP_KERNEL - Allocate normal kernel ram. May sleep.
  • GFP_ATOMIC - Allocation will not sleep. Use inside interrupt handlers.

Additionally, the GFP_DMA flag may be set to indicate that the memory must be suitable for DMA. This can mean different things on different architectures. For example, on i386 it means that the memory must come from the first 16MB of physical memory and thus be suitable for devices restricted to 24-bit addressing.
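The 16MB figure follows directly from 24-bit addressing, since 2^24 bytes is 16MB. The userspace sketch below only illustrates that arithmetic; the helper name fits_24bit_dma and the DMA24_LIMIT macro are made up for this example and are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* 24 address bits can reach 2^24 bytes = 16 MB. */
#define DMA24_LIMIT (1UL << 24)

/*
 * Hypothetical helper: returns true if the physical buffer
 * [phys, phys + len) lies entirely below the 16 MB boundary,
 * i.e. a device limited to 24-bit addresses can reach all of it.
 */
static bool fits_24bit_dma(uint32_t phys, uint32_t len)
{
    if (len == 0)
        return false;
    /* Add in 64 bits so the sum cannot wrap around. */
    return (uint64_t)phys + len <= DMA24_LIMIT;
}
```

A one-page buffer at 0x00FFF000 ends exactly at the 16MB boundary and fits; the same buffer one byte longer does not.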

Get_Free_Page(s)

Get_free_page(s) is used to allocate larger contiguous blocks of memory.

The minimum allocation is one page of 4K (4096 bytes). Larger allocations grow in powers of 2; the exponent is called the “order”, so a request of order n returns 2^n pages.

Refer to the following sources:

  • linux-2.6.x/mm/page_alloc.c
  • linux-2.6.x/include/linux/gfp.h

The use of the kernel call is as follows:

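      char *kbuf;
      char *resp;
      int order = 2;    /* 2^2 = 4 pages = 16K */

      /* allocate a single page */
      kbuf = (char *)__get_free_page(GFP_KERNEL);
      if (!kbuf)
          return -ENOMEM;

      /* allocate 2^order contiguous pages */
      resp = (char *)__get_free_pages(GFP_KERNEL, order);
      if (!resp) {
          free_page((unsigned long)kbuf);    /* do not leak the first page */
          return -ENOMEM;
      }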

This call uses the buddy allocator, which hands out blocks whose page count is a power of 2. This means that you can get 1, 2, 4, 8, 16, 32, or 64 pages (each 4K bytes in size), but nothing in between.

The memory is contiguous. If no memory can be found, the call may block unless the GFP_ATOMIC flag is specified. With GFP_ATOMIC the call does not wait and instead returns a NULL pointer to indicate failure.
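To pick the order argument for a given buffer size, round the size up to the next power-of-two number of pages. The following is a minimal userspace sketch of that calculation, assuming 4K pages as in the text. The names order_for_size and bytes_for_order are made up for this example; the kernel provides its own get_order() helper.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumption: 4K pages, as in the text */

/*
 * Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) is at least `size` bytes.
 */
static int order_for_size(size_t size)
{
    int order = 0;
    size_t block = SKETCH_PAGE_SIZE;

    while (block < size) {
        block <<= 1;    /* double the block: 1, 2, 4, 8 ... pages */
        order++;
    }
    return order;
}

/* Number of bytes actually handed out for a given order. */
static size_t bytes_for_order(int order)
{
    return SKETCH_PAGE_SIZE << order;
}
```

For example, a 20000 byte request needs order 3, so the allocator hands out 32768 bytes even though 5 pages (20480 bytes) would have been enough.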

Vmalloc

Vmalloc is used on MMU systems. It allows buffers larger than the kmalloc limit (normally 128K) to be mapped. Memory is mapped one page at a time, and the resulting virtual memory is logically contiguous but may not be physically contiguous.

In an MMU based system, the physical memory is mapped into a special area reserved in the 4G kernel memory map for this sort of operation.
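The idea of a buffer that is contiguous in virtual memory but scattered in physical memory can be shown with a toy page table. This is a userspace sketch only, not kernel code: the table contents, the frame numbers, and the name sketch_virt_to_phys are all invented for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_NPAGES    4u

/*
 * Toy page table for one 16K buffer: index = virtual page within the
 * buffer, value = physical page frame number. The frame numbers are
 * made up and deliberately not adjacent.
 */
static const uint32_t sketch_page_table[SKETCH_NPAGES] = { 7, 2, 9, 4 };

/*
 * Translate a byte offset inside the virtually contiguous buffer into
 * a physical address. Returns UINT32_MAX if the offset is out of range.
 */
static uint32_t sketch_virt_to_phys(uint32_t offset)
{
    uint32_t vpage  = offset / SKETCH_PAGE_SIZE;
    uint32_t within = offset % SKETCH_PAGE_SIZE;

    if (vpage >= SKETCH_NPAGES)
        return UINT32_MAX;
    return sketch_page_table[vpage] * SKETCH_PAGE_SIZE + within;
}
```

Offsets 4095 and 4096 are neighbours in the virtual buffer, yet in this toy table they land in physical frames 7 and 2. This is why memory obtained this way is not suitable where a device needs one physically contiguous block.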

On noMMU systems the vmalloc function is defined in linux-2.6.x/mm/nommu.c. This code is linked into the kernel instead of the MMU based code.

This extract from the Kernel Makefile shows how:

mmu-y                   := nommu.o
mmu-$(CONFIG_MMU)       := fremap.o highmem.o madvise.o memory.o mincore.o \
                           mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \
                           shmem.o vmalloc.o

This extract first selects the noMMU code option and then overwrites that selection if CONFIG_MMU is set to “y”, indicating the use of an MMU.

noMMU Systems

In general the transition between MMU and noMMU operation is restricted to a minimal number of changes in a few source files.

For interest, see these files for more uses of the CONFIG_MMU configuration option:

   ./drivers/char/mem.c
   ./drivers/ieee1394/dma.c
   ./drivers/video/fbmem.c
   ./fs/compat.c
   ./fs/exec.c
   ./kernel/fork.c
   ./mm/filemap.c
   ./mm/nommu.c
   ./mm/page_alloc.c
   ./mm/slab.c
   ./include/linux/blkdev.h
   ./include/linux/kmalloc_sizes.h
   ./include/linux/rmap.h
   ./include/linux/swap.h

Silly noMMU Memory Tricks

Sometimes a noMMU system needs large buffers (for image capture, for example) that you don't want the kernel to manage, so that they cannot cause fragmentation. There are a few silly noMMU tricks for doing this.

Tell the Kernel you have less memory

Passing mem=nM in bootargs tells the kernel it has less memory than it really does. You can set this dynamically in the bootloader by setting the bootargs variable.

bf537> set bootargs root=/dev/mtdblock0 rw mem=50M

When this kernel boots, it will print out:

Board Memory: 50MB
Kernel Managed Memory: 50MB

Accessing more than this memory will result in a CPLB fault. To tell the kernel the hardware actually has more memory than the mem= variable, use:

bf537> set bootargs root=/dev/mtdblock0 rw mem=50M max_mem=64M$#

The range from 50M to 64M is the reserved memory; $ means enable data cache on this reserved memory, and # means enable instruction cache on this memory.

When this kernel boots, it will print out:

Board Memory: 64MB
Kernel Managed Memory: 50MB

You are free to use this unmanaged memory for anything you want, accessing it directly through memory pointers. For an example, see the mmap example.
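The location of the unmanaged region follows from the two boot arguments. The userspace sketch below works out its start address and size, assuming physical memory begins at address 0. The names sketch_parse_size, sketch_region, and sketch_reserved are hypothetical and are not part of the kernel or the bootloader.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse a size such as "50M" or "64M$#" into
 * bytes. Only a K or M suffix is understood; anything after the
 * suffix (such as the $ and # cache markers) is ignored.
 */
static unsigned long sketch_parse_size(const char *s)
{
    char *end;
    unsigned long value = strtoul(s, &end, 10);

    if (*end == 'M' || *end == 'm')
        value <<= 20;
    else if (*end == 'K' || *end == 'k')
        value <<= 10;
    return value;
}

struct sketch_region {
    unsigned long start;   /* first byte not managed by the kernel */
    unsigned long size;    /* bytes between mem= and max_mem= */
};

/* Assumption: physical memory starts at address 0. */
static struct sketch_region sketch_reserved(const char *mem,
                                            const char *max_mem)
{
    struct sketch_region r;
    unsigned long managed = sketch_parse_size(mem);
    unsigned long total   = sketch_parse_size(max_mem);

    r.start = managed;
    r.size  = (total > managed) ? total - managed : 0;
    return r;
}
```

With mem=50M and max_mem=64M$# this gives a region starting at 0x3200000 that is 14MB long, under the stated assumption about where physical memory begins.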

Start the Kernel at an offset

Normally, the kernel starts at 0x1000 (4k), but there is no reason why it can't start at 1Meg. This is also a Kconfig setting, since the kernel is linked to an absolute address. Under Blackfin Processor Options → Board customizations → Kernel load address for booting, just change the option. In very constrained systems, this can be set to 0x0. (Setting it to zero is not recommended, since null/bad pointer accesses are trapped on the first 1k (0x400) of data.)

Difference between GFP_KERNEL and GFP_DMA

On most architectures GFP_DMA is an extension to the GFP_KERNEL flag that requests memory from the first 16M of physical memory. This was (or still is) a restriction on some hardware with only 24-bit addressing abilities.

Other architectures can use the GFP_DMA flag to allocate memory best suited for DMA activities.

Problems in uClinux power of two memory allocation (defragmentation)

The buddy allocator provides memory in blocks whose sizes are increasing powers of 2. This makes for a fast, efficient allocator on systems with larger memories (desktop systems).

On really small embedded systems with only a few megabytes of memory, the buddy allocator can quickly use up the available memory, because every request is rounded up to the next power-of-2 block. This can cause the system to fail prematurely due to lack of memory.

The Buddy Allocator was optionally replaced on the 2.4 Kernels by the “Non Power of 2” allocator. This was slower than the Buddy Allocator but did not have the same memory size constraints. This means that systems that seemed to run out of memory using the buddy allocator would work using the “Non Power of 2” allocator.

This memory allocator has not (yet) been ported to the 2.6 kernel.
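The cost of power-of-2 rounding on a small system can be estimated with a little arithmetic. The userspace sketch below compares the pages consumed under power-of-2 rounding with an allocator that, as an assumption made purely for illustration, rounds requests up to whole pages only. It does not reproduce the actual “Non Power of 2” allocator, and the function names are made up.

```c
#define SKETCH_PAGE 4096UL   /* assumption: 4K pages */

/* Pages needed if requests are only rounded up to whole pages. */
static unsigned long pages_exact(unsigned long bytes)
{
    return (bytes + SKETCH_PAGE - 1) / SKETCH_PAGE;
}

/* Pages consumed when the page count is rounded up to a power of 2. */
static unsigned long pages_pow2(unsigned long bytes)
{
    unsigned long need = pages_exact(bytes);
    unsigned long pages = 1;

    while (pages < need)
        pages <<= 1;
    return pages;
}

/* Pages lost to rounding for a single request. */
static unsigned long pages_wasted(unsigned long bytes)
{
    return pages_pow2(bytes) - pages_exact(bytes);
}
```

A 132K request needs 33 pages but consumes 64 under power-of-2 rounding, so 31 pages (124K) sit unused. On a board with only a few megabytes of RAM, a handful of such requests makes a noticeable difference.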

Cache Management

A portion of the L1 internal memory can be configured as data and instruction caches. For the BF533 the following memory map is available:

Type of Memory               Size
Instruction SRAM or Cache    16K byte
Instruction SRAM             64K byte
Data SRAM or Cache           32K byte
Data SRAM                    32K byte
Scratchpad SRAM              4K byte

Instruction Cache

The IMEM_CONTROL register is used to set up the mode of the Instruction SRAM as either a cache or additional instruction storage. Write access to this area is only via a DMA port because the memory is effectively 64 bits wide. The instruction cache needs to be managed manually if the physical memory referenced may have changed. Instructions are available to invalidate a single cache line, an address, or the complete cache.

Code to manage flushing both the instruction and data caches is in the file linux-2.6.x/arch/blackfin/mach-common/flush.S.

Data Cache

The system has an optional 32K byte Data Cache.

This can be configured (in the kernel configuration system) in many ways:

  • CONFIG_BLKFIN_CACHE - Instruction Cache
  • CONFIG_BLKFIN_DCACHE - Data Cache
  • CONFIG_BLKFIN_CACHE_LOCK - provides code to lock the cache
  • CONFIG_BLKFIN_WB - Write Back Cache
  • CONFIG_BLKFIN_WT - Write Through Cache
  • CONFIG_UNCACHED_1MB - leave 1Mbyte at top of memory untouched by Linux

The following Data Cache types are available:

  • Write Through - each store goes to the cache and also updates physical memory.
  • Write Back - physical memory is updated only when the cache line is replaced.

An SSYNC instruction will flush the cache write buffer.

As with the instruction cache, care must be taken to invalidate the cache whenever some external system may have changed the physical memory directly. When using external memory, a cache flush ensures that any pending updates from the CPU are committed to the external memory. The cache flush is optimized out when the Write Through Cache option is selected.
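The difference between the two policies, and why a flush or an invalidate is needed, can be shown with a toy model of a single cached word. This is a userspace sketch for illustration only: it is not Blackfin code, and every name in it is invented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one cached word; not Blackfin code. */
struct sketch_line {
    uint32_t memory;   /* value held in physical (external) memory */
    uint32_t cached;   /* value held in the cache line */
    bool dirty;        /* cache holds data not yet written to memory */
};

/* Write Through: the store updates the cache and memory together. */
static void store_write_through(struct sketch_line *l, uint32_t v)
{
    l->cached = v;
    l->memory = v;
    l->dirty = false;
}

/* Write Back: the store updates the cache only; memory is now stale. */
static void store_write_back(struct sketch_line *l, uint32_t v)
{
    l->cached = v;
    l->dirty = true;
}

/* Flush: commit a pending CPU update to memory. */
static void flush_line(struct sketch_line *l)
{
    if (l->dirty) {
        l->memory = l->cached;
        l->dirty = false;
    }
}

/* Invalidate: drop the cached copy after an external agent wrote memory. */
static void invalidate_line(struct sketch_line *l)
{
    l->cached = l->memory;
    l->dirty = false;
}
```

In this model a write-through store never leaves the line dirty, so flush_line has nothing to do, which mirrors the flush being optimized out for the Write Through option. After a write-back store, memory still holds the old value until flush_line runs, which is the state a DMA device would otherwise read.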