Intel® Integrated Performance Primitives for Windows* OS on IA-32 Architecture
User’s Guide

March 2009

Document Number: 318254-007US
World Wide Web: http://developer.intel.com

Version   Date             Version Information
-001      July 2007        Original issue of Intel® Integrated Performance Primitives (Intel® IPP) for Windows* OS on IA-32 Architecture User’s Guide.
-002      November 2007    Documents Intel IPP 5.3 release
-003      February 2008    Documents Intel IPP 6.0 beta release
-004      September 2008   Documents Intel IPP 6.0 release
-005      January 2009     Documents Intel IPP 6.1 beta release
-007      March 2009       Documents Intel IPP 6.1 release

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

The software described in this document may contain software defects which may cause the product to deviate from published specifications. Current characterized software defects are available on request.

This document as well as the software described in it is furnished under license and may only be used or copied in accordance with the terms of the license. The information in this manual is furnished for informational use only, is subject to change without notice, and should not be construed as a commitment by Intel Corporation. Intel Corporation assumes no responsibility or liability for any errors or inaccuracies that may appear in this document or any software that may be provided in association with this document.

Except as permitted by such license, no part of this document may be reproduced, stored in a retrieval system, or transmitted in any form or by any means without the express written consent of Intel Corporation.

Developers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Improper use of reserved or undefined features or instructions may cause unpredictable behavior or failure in developer's software code when running on an Intel processor. Intel reserves these features or instructions for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from their unauthorized use.

BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Atom, Centrino Inside, Centrino logo, Core Inside, FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, IntelDX2, IntelDX4, IntelSX2, Intel Atom, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, Viiv Inside, vPro Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.

* Other names and brands may be claimed as the property of others.

Copyright © 2007 - 2009, Intel Corporation. All rights reserved.


Contents

Chapter 1 Overview
    Technical Support
    About This Document
    Purpose
    Audience
    Document Organization
    Notational Conventions

Chapter 2 Getting Started with Intel® IPP
    Intel IPP Basics
    Cross-Architecture Alignment
    Types of Input Data
    Domains
    Function Naming and Parameters
    Checking Your Installation
    Obtaining Version Information
    Building Your Application
    Setting Environment Variables
    Including Header Files
    Calling IPP Functions
    Before You Begin Using Intel IPP

Chapter 3 Intel® IPP Structure
    High-level Directory Structure
    Supplied Libraries
    Using Intel IPP Dynamic Link Libraries (DLLs)
    Using Intel IPP Static Libraries
    Contents of the Documentation Directory

Chapter 4 Configuring Your Development Environment
    Configuring Microsoft* Visual C++* .NET 2003 or Microsoft Visual C++* 2005 Software to Link with Intel IPP
    Creating Visual C++ 2005 Project Files for the Intel® IPP Samples
    Building a Microsoft* Visual C++ .NET* Solution for the UMC Sample Code
    Using the IntelliSense* Capability
    Using Intel® IPP with Intel® C++ Compiler
    Using Intel® IPP with Borland C++ Builder* Integrated Development Environment

Chapter 5 Linking Your Application with Intel® IPP
    Dispatching
    Processor Type and Features
    Selecting Between Linking Methods
    Dynamic Linking
    Static Linking (with Dispatching)
    Static Linking (without Dispatching)
    Building a Custom DLL
    Comparison of Intel IPP Linkage Methods
    Selecting the Intel IPP Libraries Needed by Your Application
    Dynamic Linkage
    Static Linkage with Dispatching
    Library Dependencies by Domain (Static Linkage Only)
    Linking Examples

Chapter 6 Supporting Multithreaded Applications
    Intel IPP Threading and OpenMP* Support
    Setting Number of Threads
    Using Shared L2 Cache
    Nested Parallelization
    Disabling Multithreading

Chapter 7 Managing Performance and Memory
    Memory Alignment
    Thresholding Data
    Reusing Buffers
    Using FFT
    Running Intel IPP Performance Test Tool
    Examples of Using Performance Test Tool Command Lines

Chapter 8 Using Intel® IPP with Programming Languages
    Language Support
    Using Intel IPP in Java* Applications

Appendix A Performance Test Tool Command Line Options

Appendix B Intel® IPP Samples
    Types of Intel IPP Sample Code
    Source Code Samples
    Using Intel IPP Samples
    System Requirements
    Building Source Code
    Running the Software
    Known Limitations

Index

Chapter 1 Overview

Intel® Integrated Performance Primitives (Intel® IPP) is a software library that provides a broad range of functionality, including general signal and image processing, computer vision, speech recognition, data compression, cryptography, string manipulation, audio processing, video coding, realistic rendering, and 3D data processing. It also includes more sophisticated primitives for constructing audio, video, and speech codecs such as MP3 (MPEG-1 Audio, Layer 3), MPEG-4, H.264, H.263, JPEG, JPEG2000, GSM-AMR, and G.723.

By supporting a variety of data types and layouts for each function and minimizing the number of data structures used, the Intel IPP library gives developers a rich set of options and the greatest flexibility when designing and optimizing applications, higher-level software components, and library functions.

Intel IPP for Windows* OS is delivered in separate packages for:
• Users who develop on 32-bit Intel architecture (Intel IPP for the Windows* OS on IA-32 Intel® Architecture)
• Users who develop on Intel® 64-based (former Intel EM64T) architecture (Intel IPP for the Windows* OS on Intel® 64 Architecture)
• Users who develop on the Intel® Itanium® 2 processor family (Intel IPP for the Windows* OS on IA-64 architecture)

Technical Support
Intel IPP provides a product web site that offers timely and comprehensive product information, including product features, white papers, and technical articles. For the latest information, see http://developer.intel.com/software/products/.


Intel also provides a support web site that contains a rich repository of self-help information, including getting started tips, known product issues, product errata, license information, and more (visit http://support.intel.com/support/).

Registering your product entitles you to one year of technical support and product updates through Intel® Premier Support. Intel Premier Support is an interactive issue management and communication web site providing the following services:
• Submit issues and review their status.
• Download product updates at any time of day.

To register your product, contact Intel, or seek product support, visit http://www.intel.com/software/products/support/ipp.

About This Document
This User's Guide explains how to make the most of Intel® IPP routines in Windows* applications running on IA-32 architecture. It describes features specific to this platform, as well as features that do not depend upon a particular architecture. After installation, you can find this document in the <install path>\doc directory (see Contents of the Documentation Directory).

Purpose
This document:
• Helps you start using the library by describing the steps you need to follow after installation of the product.
• Shows how to configure the library and your development environment to use the library.
• Acquaints you with the library structure.
• Explains in detail how to select the best linking method and how to link your application to the library, and provides simple usage examples.
• Explains how to thread your application using IPP software.
• Describes how to code, compile, and run your application with Intel IPP.
• Provides information about how to run performance tests of Intel IPP functions by using the Intel IPP Performance Test Tool.
• Describes the types of Intel IPP sample code available for developers to learn how to use Intel IPP and explains how to run the samples.


Audience
This guide is intended for Windows programmers with beginner to advanced software development experience.


Document Organization
The document contains the following chapters and appendices.

Chapter 1    Overview describes the document purpose and organization and explains notational conventions.
Chapter 2    Getting Started with Intel® IPP describes necessary steps and gives basic information needed to start using Intel IPP after its installation.
Chapter 3    Intel® IPP Structure describes the structure of the Intel IPP directory after installation and discusses the library types supplied.
Chapter 4    Configuring Your Development Environment explains how to configure Intel IPP and how to configure your environment for use with the library.
Chapter 5    Linking Your Application with Intel® IPP compares linking methods, helps you select a linking method for a particular purpose, describes the general link line syntax to be used for linking with the Intel IPP libraries, and discusses how to build custom dynamic libraries.
Chapter 6    Supporting Multithreaded Applications helps you set the number of threads in multithreaded applications, get information on the number of threads, and disable multithreading.
Chapter 7    Managing Performance and Memory discusses ways of improving Intel IPP performance and tells you how to create Intel IPP functions performance tests by using the Intel IPP Performance Test Tool.
Chapter 8    Using Intel® IPP with Programming Languages discusses some special aspects of using Intel IPP with different programming languages and Windows development environments.
Appendix A   Performance Test Tool Command Line Options gives brief descriptions of possible performance test tool command line options.
Appendix B   Intel® IPP Samples describes types of sample code available to demonstrate how to use Intel IPP, presents the source code example files by categories with links to view the sample code, and explains how to run the samples.

The document also includes an Index.


Notational Conventions
The document uses the following font conventions and symbols:

Table 1-1 Notational conventions

Italic
    Italic is used for emphasis and also indicates document names in body text, for example, see Intel IPP Reference Manual.
Monospace lowercase
    Indicates filenames, directory names, and pathnames, for example:
    \tools\env\ippenv.bat
Monospace lowercase mixed with uppercase
    Indicates code, commands, and command-line options, for example:
    ippsFFTGetBufSize_C_32fc( ctxN2, &sz );
UPPERCASE MONOSPACE
    Indicates system variables, for example, PATH.
Monospace italic
    Indicates a parameter in discussions, such as function parameters, for example, lda; makefile parameters, for example, functions_list; and so forth. When enclosed in angle brackets, indicates a placeholder for an identifier, an expression, a string, a symbol, or a value: <ipp directory>.
[ items ]
    Square brackets indicate that the items enclosed in brackets are optional.
{ item | item }
    Braces indicate that only one of the items listed between braces can be selected. A vertical bar ( | ) separates the items.

Chapter 2 Getting Started with Intel® IPP

This chapter helps you start using Intel® IPP by providing the basic information you need to know and describing the steps you need to follow after installation of the product.

Intel IPP Basics
Intel IPP is a collection of high-performance code that provides a broad range of functionality. This functionality includes general signal and image processing, computer vision, speech recognition, data compression, cryptography, string manipulation, audio processing, video coding, realistic rendering, 3D data processing, and matrix math. It also includes more sophisticated primitives for constructing audio, video, and speech codecs such as MP3 (MPEG-1 Audio, Layer 3), MPEG-4, H.264, H.263, JPEG, JPEG2000, GSM-AMR, and G.723.

Based on experience in developing and using the Intel Performance Libraries, Intel IPP has the following major distinctive features:
• Intel IPP provides basic low-level functions for creating applications in several different domains, such as signal processing, audio coding, speech recognition and coding, image processing, video coding, operations on small matrices, and realistic rendering functionality and 3D data processing. See detailed information in the section Domains.
• The Intel IPP functions follow the same interface conventions, including uniform naming rules and similar composition of prototypes for primitives that refer to different application domains. For information on function naming, see Function Naming and Parameters.
• The Intel IPP functions use an abstraction level that is best suited to achieving superior performance in application programs.

To speed up performance, Intel IPP functions are optimized to use all benefits of Intel® architecture processors. In addition, most Intel IPP functions do not use complicated data structures, which helps reduce overall execution overhead.


Intel IPP is well-suited for cross-platform applications. For example, the functions developed for IA-32 architecture-based platforms can be readily ported to Intel® Itanium®-based platforms (see Cross-Architecture Alignment).

Cross-Architecture Alignment
Intel IPP is designed to support application development on various Intel® architectures. This means that the API definition is common for all processors, while the underlying function implementation takes into account the variations in processor architectures. By providing a single cross-architecture API, Intel IPP allows software application repurposing and enables developers to port applications to the unique features of Intel® processor-based desktop, server, and mobile platforms. Developers can write their code once and realize application performance across many processor generations.

Types of Input Data
Intel IPP operations are divided into several groups according to the type of input data on which the operation is performed. The types for these groups are:

One-Dimensional Arrays and Signals
This group includes most functions operating on one-dimensional arrays of data. In many cases these arrays are signals and many of the operations are signal-processing operations. Examples of one-dimensional array operations include:
• vectorized scalar arithmetic, logical, statistical operations
• digital signal processing
• data compression
• audio processing and audio coding
• speech recognition and speech coding
• cryptography and data integrity
• string operations

Images
An image is a two-dimensional array of pixels. Images have some specific features that distinguish them from general two-dimensional arrays. Examples of image operations include:
• arithmetic, logical, statistical operations
• color conversion
• image filtering
• image linear and geometric transformations
• morphological operations
• computer vision
• image compression
• video coding

Matrices
This group includes functions operating on matrices and vectors that are one- and two-dimensional arrays, and on arrays of matrices and vectors. These arrays are treated as linear equations or data vectors and subjected to linear algebra operations. Examples of matrix operations include:
• vector and matrix algebra
• solving systems of linear equations
• solving least squares problem
• computing eigenvalue problem

3D objects
This group includes functions operating on 3D objects. In this case the input data depends on the techniques used. Examples of 3D operations include:
• realistic rendering
• resizing and affine transforming

The Intel IPP functions are primarily grouped according to the input data types listed above. Each group has its own prefix in the function name (see Function Naming and Parameters).

Core Functions
A few service functions in Intel IPP do not operate on any of these input data types. Such functions are used to detect and set system and Intel IPP configuration. Examples of such operations include getting the type of CPU, aligning pointers to the specified number of bytes, controlling the dispatcher of the merged static libraries, and so on. These functions are called core functions and have their own header file, static libraries, and DLLs.

Table 2-1
Code      Header File   Static Libraries              DLL               Prefix in Function Name
ippCore   ippcore.h     ippcorel.lib, ippcore_t.lib   ippcore-*.*.dll   ipp

Here *.* refers to the product version number, for example, 6.1.

Domains
For organizational purposes Intel IPP is internally divided into subdivisions of related functions. Each subdivision is called a domain (or functional domain) and generally has its own header file, static libraries, DLLs, and tests. These domains map easily to the types of input data and the corresponding prefixes. The Intel IPP Manual indicates in which header file each function can be found. The table below lists each domain's code, header and library names, and functional area.

Table 2-2
Code     Header File   Static Libraries   DLL             Prefix   Description
ippAC    ippac.h       ippac*.lib         ippac**.dll     ipps     audio coding
ippCC    ippcc.h       ippcc*.lib         ippcc**.dll     ippi     color conversion
ippCH    ippch.h       ippch*.lib         ippch**.dll     ipps     string operations
ippCP    ippcp.h       ippcp*.lib         ippcp**.dll     ipps     cryptography
ippCV    ippcv.h       ippcv*.lib         ippcv**.dll     ippi     computer vision
ippDC    ippdc.h       ippdc*.lib         ippdc**.dll     ipps     data compression
ippDI    ippdi.h       ippdi*.lib         ippdi**.dll     ipps     data integrity
ippGEN   ipps.h        ippgen*.lib        ippgen**.dll    ippg     generated functions
ippIP    ippi.h        ippi*.lib          ippi**.dll      ippi     image processing
ippJP    ippj.h        ippj*.lib          ippj**.dll      ippi     image compression
ippMX    ippm.h        ippm*.lib          ippm**.dll      ippm     small matrix operations
ippRR    ippr.h        ippr*.lib          ippr**.dll      ippr     realistic rendering and 3D data processing
ippSC    ippsc.h       ippsc*.lib         ippsc**.dll     ipps     speech coding
ippSP    ipps.h        ipps*.lib          ipps**.dll      ipps     signal processing
ippSR    ippsr.h       ippsr*.lib         ippsr**.dll     ipps     speech recognition
ippVC    ippvc.h       ippvc*.lib         ippvc**.dll     ippi     video coding
ippVM    ippvm.h       ippvm*.lib         ippvm**.dll     ipps     vector math

*  - refers to one of the following: emerged, merged, merged_t
** - refers to the processor-specific code and version number, for example, s8-6.1

Function Naming and Parameters
Function names in Intel IPP are structured to simplify their identification and use. Understanding the naming conventions saves time: in many cases you can derive the purpose of a function directly from its self-explanatory name. Naming conventions for the Intel IPP functions are similar for all covered domains. Intel IPP function names include a number of fields that indicate the data domain, operation, data type, and execution mode. Each field takes one of a fixed set of predefined values. Function names have the following general format:

ipp<data-domain><name>[_<datatype>][_<descriptor>](<parameters>);
The elements of this format are explained in the sections that follow.

Data-Domain
The data-domain is a single character indicating type of the input data. The current version of Intel IPP supports the following data-domains:

s for signals (expected data type is a 1D array)
g for signals of the fixed length (expected data type is a 1D array)
i for images and video (expected data type is a 2D array of pixels)
m for vectors and matrices (expected data type is a matrix or vector)
r for realistic rendering functionality and 3D data processing (expected data type depends on supported rendering techniques)

The core functions in Intel IPP do not operate on any of these types of input data (see Core Functions). These functions have ipp as a prefix without a data-domain field, for example, ippGetStatusString.

Name
The name identifies the algorithm or operation that the function performs. It has the following format:

<name> = <operation>[_modifier]
The operation field is one or more words, acronyms, and abbreviations that identify the base operation, for example Set, Copy. If the operation consists of several parts, each part starts with an uppercase character without underscore, for example, HilbertInitAlloc. The modifier, if present, denotes a slight modification or variation of the given function. For example, the modifier CToC in the function ippsFFTInv_CToC_32fc signifies that the inverse fast Fourier transform operates on complex data, performing a complex-to-complex (CToC) transform. Functions for matrix operations have an object type description as a modifier, for example, ippmMul_mv (multiplication of a matrix by a vector).
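The general format can be illustrated with a small splitter. This is a minimal sketch, not part of Intel IPP: the type IppNameFields and the function split_ipp_name are hypothetical names, and the rule "the first underscore-separated field that starts with a digit is the datatype" is an assumption made for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the fields of
   ipp<data-domain><name>[_<datatype>][_<descriptor>]. */
typedef struct {
    char domain;          /* 's', 'g', 'i', 'm', 'r', or 0 for core functions */
    char operation[64];
    char modifier[32];
    char datatype[32];
    char descriptor[32];
} IppNameFields;

/* Appends len characters of src to dst (capacity cap), joining
   repeated fields with '_' and truncating if dst is full. */
static void append_field(char *dst, size_t cap, const char *src, size_t len)
{
    size_t used = strlen(dst);
    if (used > 0 && used + 1 < cap) {
        dst[used++] = '_';
        dst[used] = '\0';
    }
    size_t room = cap - used - 1;
    if (len > room) len = room;
    memcpy(dst + used, src, len);
    dst[used + len] = '\0';
}

/* Splits a function name into its fields.
   Returns 1 on success, 0 if the name does not start with "ipp". */
int split_ipp_name(const char *fn, IppNameFields *out)
{
    memset(out, 0, sizeof *out);
    if (strncmp(fn, "ipp", 3) != 0) return 0;
    const char *p = fn + 3;

    /* A lowercase domain letter is followed by the capitalized operation;
       core functions such as ippGetStatusString have no domain letter. */
    if (*p != '\0' && strchr("sgimr", *p) != NULL && isupper((unsigned char)p[1]))
        out->domain = *p++;

    const char *end = strchr(p, '_');
    size_t len = end ? (size_t)(end - p) : strlen(p);
    append_field(out->operation, sizeof out->operation, p, len);

    int seen_type = 0;
    while (end != NULL) {
        p = end + 1;
        end = strchr(p, '_');
        len = end ? (size_t)(end - p) : strlen(p);
        if (!seen_type && isdigit((unsigned char)*p)) {
            append_field(out->datatype, sizeof out->datatype, p, len);
            seen_type = 1;
        } else if (!seen_type) {
            append_field(out->modifier, sizeof out->modifier, p, len);
        } else {
            append_field(out->descriptor, sizeof out->descriptor, p, len);
        }
    }
    return 1;
}
```

Under these assumptions, ippsFFTInv_CToC_32fc splits into domain s, operation FFTInv, modifier CToC, and datatype 32fc, while ippGetStatusString is reported as a core function with no domain.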

Data Types
The datatype field indicates data types used by the function in the following format:

<datatype> = <bit_depth><bit_interpretation>,
where

bit_depth = <1|8|16|32|64>
and

bit_interpretation = <u|s|f>[c].
Here u indicates “unsigned integer”, s indicates “signed integer”, f indicates “floating point”, and c indicates “complex”. For functions that operate on a single data type, the datatype field contains only one value. If a function operates on source and destination objects that have different data types, the respective data type identifiers are listed in the function name in order of source and destination as follows:

<datatype> = <src1Datatype>[src2Datatype][dstDatatype].
For example, the function ippsDotProd_16s16sc computes the dot product of 16-bit short and 16-bit complex short source vectors and stores the result in a 16-bit complex short destination vector. The dstDatatype modifier is not present in the name because the second operand and the result are of the same type.
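The datatype grammar above can be checked mechanically. The following is a minimal sketch under the stated grammar only; DataTypeId and parse_datatype are hypothetical names introduced for illustration and are not Intel IPP APIs.

```c
#include <ctype.h>
#include <stddef.h>

/* One <bit_depth><bit_interpretation> identifier (hypothetical type). */
typedef struct {
    int  bits;        /* 1, 8, 16, 32, or 64 */
    char kind;        /* 'u' unsigned, 's' signed, 'f' floating point */
    int  is_complex;  /* 1 if the identifier ends with 'c' */
} DataTypeId;

/* Parses a datatype field such as "16s16sc" into at most max entries.
   Returns the number of identifiers found, or -1 if the field does not
   match the grammar or does not fit into out. */
int parse_datatype(const char *s, DataTypeId *out, int max)
{
    int n = 0;
    while (*s != '\0') {
        int bits = 0;
        if (!isdigit((unsigned char)*s)) return -1;
        while (isdigit((unsigned char)*s)) {
            bits = bits * 10 + (*s - '0');
            if (bits > 64) return -1;      /* larger depths are not in the grammar */
            s++;
        }
        if (bits != 1 && bits != 8 && bits != 16 && bits != 32 && bits != 64)
            return -1;
        if (*s != 'u' && *s != 's' && *s != 'f') return -1;
        if (n == max) return -1;
        out[n].bits = bits;
        out[n].kind = *s++;
        out[n].is_complex = (*s == 'c');
        if (out[n].is_complex) s++;
        n++;
    }
    return n;
}
```

For the field 16s16sc from ippsDotProd_16s16sc, this yields two identifiers: a 16-bit signed source and a 16-bit signed complex source, with the destination type omitted because it matches the second operand.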


Descriptor
The optional descriptor field describes the data associated with the operation. It can contain implied parameters and/or indicate additional required parameters. To minimize the number of code branches in the function and thus reduce potentially unnecessary execution overhead, most of the general functions are split into separate primitive functions, with some of their parameters entering the primitive function name as descriptors. However, where the number of permutations of the function becomes large and unreasonable, some functions may still have parameters that determine internal operation (for example, ippiThreshold). The following descriptors are used in Intel IPP:

A     Image data contains an alpha channel as the last channel, requires C4, alpha channel is not processed.
A0    Image data contains an alpha channel as the first channel, requires C4, alpha channel is not processed.
Axx   Specifies the bits of accuracy of the result for advanced arithmetic operations.
C     The function operates on a specified channel of interest (COI) for each source image.
Cn    Image data is made of n discrete interleaved channels (n = 1, 2, 3, 4).
Dx    Signal is x-dimensional (default is D1).
I     The operation is performed in-place (default is not-in-place).
L     Layout description of the objects for matrix operation, or indicates that one pointer is used for each row in D2 array for signal processing.
M     The operation uses a mask to determine pixels to be processed.
P     Pointer description of the objects for matrix operation, or specified number of vectors to be processed for signal processing.
Pn    Image data is made of n discrete planar (non-interleaved) channels (n = 1, 2, 3, 4) with separate pointer to each plane.
R     The function operates on a defined region of interest (ROI) for each source image.
S     Standard description of the objects for matrix operation.
Sfs   Saturation and fixed scaling mode (default is saturation and no scaling).
s     Saturation and no scaling.

The descriptors in function names are always presented in alphabetical order.
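The difference between the Cn (interleaved) and Pn (planar) descriptors is easiest to see in how a pixel is addressed. The sketch below is illustrative only and assumes 8-bit channels and a row pitch ("step") given in bytes; the functions interleaved_offset, planar_offset, and split_c3_to_p3 are hypothetical helpers, not Intel IPP functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Cn layout: channels of one pixel are adjacent in memory.
   Returns the byte offset of channel c of pixel (x, y). */
size_t interleaved_offset(int x, int y, int c, int channels, int step)
{
    return (size_t)y * (size_t)step + (size_t)x * (size_t)channels + (size_t)c;
}

/* Pn layout: every channel lives in its own plane, reached through a
   separate pointer. Returns the byte offset of pixel (x, y) in one plane. */
size_t planar_offset(int x, int y, int step)
{
    return (size_t)y * (size_t)step + (size_t)x;
}

/* Converts a 3-channel interleaved image (C3) into three planes (P3). */
void split_c3_to_p3(const uint8_t *pSrc, int srcStep,
                    uint8_t *const pDst[3], int dstStep,
                    int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            for (int c = 0; c < 3; c++) {
                pDst[c][planar_offset(x, y, dstStep)] =
                    pSrc[interleaved_offset(x, y, c, 3, srcStep)];
            }
        }
    }
}
```

In the interleaved case a single pointer and step describe the whole image, whereas the planar case needs one pointer per channel, which is why Pn functions take a separate pointer to each plane.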


Some data descriptors are implied when dealing with certain operations. For example, the default for image processing functions is to operate on a two-dimensional image and to saturate the results without scaling them. In these cases, the implied abbreviations D2 (two-dimensional signal) and s (saturation and no scaling) are not included in the function name.
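To make "saturation" and "scaling" concrete, here is a minimal sketch of both modes on integer sums. It is not the library implementation: the function names are hypothetical, and the scaling rule used here (multiply the exact result by 2 to the power of minus scaleFactor, truncating toward zero) is an assumption for illustration; the rounding used by the actual library may differ.

```c
#include <stdint.h>

/* Clamps a wide intermediate value to the unsigned 8-bit range. */
uint8_t saturate_8u(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Clamps a wide intermediate value to the signed 16-bit range. */
int16_t saturate_16s(int64_t v)
{
    return (int16_t)(v < -32768 ? -32768 : (v > 32767 ? 32767 : v));
}

/* Saturation and no scaling: out-of-range sums clamp instead of wrapping. */
uint8_t add_8u_sat(uint8_t a, uint8_t b)
{
    return saturate_8u((int)a + (int)b);
}

/* Scaling followed by saturation (assumed Sfs-like behaviour).
   scaleFactor is assumed to lie in the range -31..31. */
int16_t add_16s_sfs(int16_t a, int16_t b, int scaleFactor)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (scaleFactor > 0)
        sum /= ((int64_t)1 << scaleFactor);    /* scale down, truncating */
    else if (scaleFactor < 0)
        sum *= ((int64_t)1 << -scaleFactor);   /* scale up */
    return saturate_16s(sum);
}
```

With no scaling, 30000 + 30000 saturates to 32767; with a scale factor of 1 the same sum is halved to 30000 and fits the 16-bit range without clamping.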

Parameters
The parameters field specifies the function parameters (arguments). The order of parameters is as follows:
1. All source operands. Constants follow arrays.
2. All destination operands. Constants follow arrays.
3. Other, operation-specific parameters.

Parameter names follow these conventions.

Arguments defined as pointers start with p, for example, pPhase, pSrc, pSeed; arguments defined as double pointers start with pp, for example, ppState; and arguments defined as values start with a lowercase letter, for example, val, src, srcLen. Each new part of an argument name starts with an uppercase character, without underscore, for example, pSrc, lenSrc, pDlyLine. Each argument name specifies its functionality. Source arguments are named pSrc or src, sometimes followed by names or numbers, for example, pSrc2, srcLen. Output arguments are named pDst or dst followed by names or numbers, for example, pDst1, dstLen. For in-place operations, the input/output argument contains the name pSrcDst.
Examples of function syntax:

ippsIIR_32f_I(Ipp32f* pSrcDst, int len, IppsIIRState_32f* pState);
ippiConvert_8u1u_C1R(const Ipp8u* pSrc, int srcStep, Ipp8u* pDst, int dstStep, int dstBitOffset, IppiSize roiSize, Ipp8u threshold);
ippmSub_vac_32f(const Ipp32f* pSrc, int srcStride0, int srcStride2, Ipp32f val, Ipp32f* pDst, int dstStride0, int dstStride2, int len, int count);
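The ordering and naming rules can be seen in a small stand-alone function written in the same style. This is a hypothetical example, not a library function: exampleAddC_8u_C1R, ExampleSize, and the integer status codes are invented for illustration, and the step arguments are assumed to be row pitches in bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a region-of-interest size type. */
typedef struct {
    int width;
    int height;
} ExampleSize;

/* Adds a constant to every pixel of a one-channel ROI, with saturation.
   Parameter order follows the convention: source array, source constant,
   destination array, then operation-specific parameters.
   Returns 0 on success, a negative value on bad arguments. */
int exampleAddC_8u_C1R(const uint8_t *pSrc, int srcStep, uint8_t value,
                       uint8_t *pDst, int dstStep, ExampleSize roiSize)
{
    if (pSrc == NULL || pDst == NULL) return -1;
    if (roiSize.width <= 0 || roiSize.height <= 0) return -2;

    for (int y = 0; y < roiSize.height; y++) {
        /* Steps advance from one image row to the next, so the ROI can be
           a window inside a larger image. */
        const uint8_t *srcRow = pSrc + (size_t)y * (size_t)srcStep;
        uint8_t *dstRow = pDst + (size_t)y * (size_t)dstStep;
        for (int x = 0; x < roiSize.width; x++) {
            int sum = srcRow[x] + value;
            dstRow[x] = (uint8_t)(sum > 255 ? 255 : sum);
        }
    }
    return 0;
}
```

To process a 2x2 window starting at pixel (1, 1) of a 4-pixel-wide image, a caller would pass pSrc pointing at that pixel, keep srcStep at the full row width of 4 bytes, and set roiSize to {2, 2}.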

Checking Your Installation
Once you complete the installation of Intel IPP, it is useful to follow these steps that confirm proper installation and configuration of the library.

2-8

Getting Started with Intel? IPP

2

1. Check that the directory you chose for installation has been created: <installation path>\Intel\IPP\6.1.x.xxx\ia32. The default installation directory is C:\Program Files\Intel\IPP\6.1.x.xxx\ia32.
2. Check that the file ippenv.bat is in the \tools\env directory. You can use this file to set the environment variables PATH, LIB, and INCLUDE in the user shell.
3. Check that the dispatching and processor-specific libraries are on the path.
4. Run ippiDemo.exe (or ippsDemo.exe) from the C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\demo directory.

If you receive the error message "This application has failed to start because ippcore.dll was not found" or "No DLL were found in the waterfall procedure", the operating system is unable to determine the location of the Intel IPP dynamic libraries. To solve this issue:
• Ensure that the Intel IPP directory is in the path. Before using the Intel IPP dynamic libraries, add C:\Program Files\Intel\IPP\6.1.x.xxx\ia32 to the PATH environment variable as described in Setting Environment Variables;
• Manually copy the contents of IPP\6.1.x.xxx\ia32\bin to the \system32 directory;
• Copy the contents of IPP\6.1.x.xxx\ia32\bin to the application directory.

Note that you need to delete all Intel IPP DLLs from previous releases from the C:\winnt\system and C:\winnt\system32 directories. Verify that paths to older library versions are not listed in the PATH environment variable.
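When troubleshooting the "DLL not found" errors described above, it can help to check programmatically whether a given directory is really listed in PATH. The following C helper is a minimal sketch of such a check; path_list_contains is a hypothetical name introduced here for illustration and is not part of Intel IPP. It assumes the Windows PATH format (entries separated by semicolons) and compares entries exactly, so differences in letter case or a trailing backslash are reported as a mismatch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an Intel IPP function.
   Returns 1 if `dir` is a complete entry of the semicolon-separated
   list `path_list`, and 0 otherwise. The comparison is exact. */
int path_list_contains(const char *path_list, const char *dir)
{
    size_t dir_len;
    const char *entry;

    if (path_list == NULL || dir == NULL)
        return 0;
    dir_len = strlen(dir);
    if (dir_len == 0)
        return 0;

    entry = path_list;
    while (*entry != '\0') {
        /* Find the end of the current entry. */
        const char *end = strchr(entry, ';');
        size_t len = (end != NULL) ? (size_t)(end - entry) : strlen(entry);

        if (len == dir_len && strncmp(entry, dir, len) == 0)
            return 1;
        if (end == NULL)
            break;              /* last entry checked */
        entry = end + 1;        /* move past the separator */
    }
    return 0;
}
```

A typical use would be to pass the result of getenv("PATH") as the first argument and the Intel IPP binaries directory as the second.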

Obtaining Version Information
To obtain information about the active library version including the version number, package ID, and the licensing information, call the ippGetLibVersion function. See the "Support Functions" chapter in the "Intel IPP Reference Manual" (v.1) for the function description and calling syntax. You may also get the version information in the ippversion.h file located in the \include directory.

Building Your Application
Follow the procedures described below to build the application.

Setting Environment Variables
The batch file ippenv.bat in the \tools\env directory sets the Intel IPP LIB, INCLUDE, and PATH environment variables for a command prompt session.


To set the environment variables outside of a single command prompt session, complete the following steps, for example in the Windows XP* OS:
1. Right-click the My Computer icon on your desktop or from the Windows Explorer* and select Properties (or open Control Panel and select System).
2. Select the Advanced tab.
3. Select the Environment Variables button.
4. Use the interface to set the environment variables for only the current user (top dialog box) or for anyone who uses the system (bottom dialog box).
5. Select the variable you wish to modify and click the Edit button.
6. Add the path to the related Intel IPP files to the existing list. For example:
• Select LIB and type in the directory for the Intel IPP stub libraries (default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\stublib),
• Select INCLUDE and type in the directory for the Intel IPP header files (default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\include),
• Select PATH and type in the directory for the Intel IPP binaries (default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\bin).
7. Click OK in the Edit User Variable dialog box.
8. Click OK in the Environment Variables dialog box.
9. Click OK in the System Properties dialog box.

For information on how to set up environment variables for threading, refer to Supporting Multithreaded Applications.

Including Header Files
Intel IPP functions and types are defined in several header files that are organized by the function domains and located in the \include directory. For example, the ippac.h file contains declarations for all audio coding and processing functions. The file ipp.h includes all Intel IPP header files. For forward compatibility, include only ipp.h in your program.

Calling IPP Functions
Due to the DLL dispatcher and merged static library mechanisms described in Linking Your Application with Intel® IPP, calling Intel IPP functions is as simple as calling any other C function. To call an Intel IPP function, do the following:


1. Include the ipp.h header file
2. Set up the function parameters
3. Call the function

The multiple versions of optimized code for each function are concealed under a single entry point. Refer to the “Intel IPP Reference Manual” for function descriptions, lists of required parameters, return values and so on.


Before You Begin Using Intel IPP
Before you start using Intel IPP, it is helpful to understand some basic concepts. Table 2-3 summarizes important things to consider before you start using Intel IPP.

Table 2-3 What you need to know before you get started

Function domains
Identify the Intel IPP function domain that meets your needs.
Reason: Knowing the function domain you intend to use narrows the search in the Reference Manuals for the specific routines you need. Besides, you may easily find a sample you would like to run from http://www.intel.com/software/products/ipp/samples.htm. Refer to Table 5-9 to understand what function domains are and what libraries are needed, and to Table 5-10 to understand what kind of cross-domain dependency is introduced.

Linking method
Decide what linking method is appropriate for linking.
Reason: If you choose a linking method that suits your application, you will get the best linking results. For information on the benefits of each linking method, linking command syntax and examples, as well as on other linking topics, such as how to create a custom dynamic library, see Linking Your Application with Intel® IPP.

Threading model
Select among the following options to determine how you are going to thread your application:
• Your application is already threaded.
• You may want to use the Intel® threading capability, that is, the Compatibility OpenMP* run-time library (libiomp), or a threading capability provided by a third-party compiler.
• You do not want to thread your application.
Reason: By default, Intel IPP uses the OpenMP* software to set the number of threads that will be used. If you need a different number, you have to set it yourself using one of the available mechanisms. For more information, see Supporting Multithreaded Applications.


Intel® IPP Structure

3

This chapter discusses the structure of Intel IPP after installation as well as the library types supplied.

High-level Directory Structure
Table 3-1 shows the high-level directory structure of Intel IPP after installation.

Table 3-1 High-level directory structure

Directory | File types
<ipp directory> | Main directory (by default: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32)
<ipp directory>\ippEULA.rtf | End User License Agreement for Intel IPP
<ipp directory>\bin | Intel IPP dynamic link libraries (DLLs)
<ipp directory>\demo | Executable programs that demonstrate various image and signal processing functionalities
<ipp directory>\doc | Intel IPP documentation files
<ipp directory>\include | Intel IPP header files
<ipp directory>\lib | Intel IPP static libraries
<ipp directory>\samples | Intel IPP application-level samples (see "Using Intel IPP Samples")
<ipp directory>\stublib | Intel IPP import libraries, used for linking DLLs
<ipp directory>\tools | Intel IPP Performance Test tool, linkage tools, and tool to set environment variables


Supplied Libraries
Table 3-2 lists the types of libraries in Intel IPP and shows examples of the library files supplied:

Table 3-2 Types of Libraries of Intel IPP

Library types | Description | Folder location | Example
Dynamic | Dynamic link libraries (DLLs) include both processor dispatchers and function implementations | \ia32\bin | ipps-6.1.dll, ippst7-6.1.dll
Static (import) | "Stub" static library files. They load the required DLLs and link to the correct entry points | \ia32\stublib | ipps.lib
Static merged | Contain function implementations for all supported processor types | \ia32\lib | ippsmerged.lib
Threaded static merged | Contain threaded function implementations | \ia32\lib | ippsmerged_t.lib
Static emerged | Contain dispatchers for the merged libraries | \ia32\lib | ippsemerged.lib

Using Intel IPP Dynamic Link Libraries (DLLs)
Intel IPP comes with the dynamic link libraries (DLLs) in the \ia32\bin directory. To load the Intel IPP DLLs and link to the correct entry points, use the "stub" library files in the \ia32\stublib directory that come with the Intel IPP package (see Table 3-1). To use the DLLs, link to the ipp*.lib files. You must either set your LIB environment variable using the ippenv.bat file or refer to these files using their full path. Linking these libraries is all you need to do to dynamically link to the DLL for the appropriate processor.

The DLLs ipp*-6.1.dll (* denotes the appropriate function domain) are "dispatcher" dynamic libraries. At run time, they detect the processor and load the correct processor-specific DLLs. This allows you to write code that calls the Intel IPP functions without worrying about which processor the code will execute on: the appropriate version is used automatically. These processor-specific libraries are named ipp*px-6.1.dll, ipp*w7-6.1.dll, ipp*t7-6.1.dll, ipp*v8-6.1.dll, and ipp*p8-6.1.dll (see Table 5-1). For example, in the \ia32\bin directory, ippiv8-6.1.dll is the image processing library optimized for the Intel® Core™ 2 Duo processors.

Once the "stub" static libraries are linked, the only action needed to use the Intel IPP DLLs is to ensure that the dispatching DLLs and the processor-specific DLLs are on the path. See also Selecting the Intel IPP Libraries Needed by Your Application.
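The DLL naming scheme described above (domain letter, optional processor code, and the "-6.1" version suffix) can be sketched in C as follows. The two helper functions are hypothetical and exist only to illustrate the pattern; they are not part of Intel IPP, and the "6.1" suffix is hard-coded on the assumption that the 6.1 release is installed. The sketch uses the C99 snprintf function, which older Microsoft compilers provide only as _snprintf.

```c
#include <stdio.h>

/* Hypothetical helper: writes the dispatcher DLL name for a function
   domain, for example "ipps-6.1.dll" for the domain "s". */
int ipp_dispatcher_dll_name(char *buf, size_t size, const char *domain)
{
    return snprintf(buf, size, "ipp%s-6.1.dll", domain);
}

/* Hypothetical helper: writes the processor-specific DLL name for a
   function domain and processor code, for example "ippiv8-6.1.dll"
   for the domain "i" and the code "v8". */
int ipp_cpu_dll_name(char *buf, size_t size,
                     const char *domain, const char *cpu_code)
{
    return snprintf(buf, size, "ipp%s%s-6.1.dll", domain, cpu_code);
}
```

Listing the names produced this way for the domains your application uses gives a quick checklist of the files that must be reachable through PATH.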

NOTE. You must include the appropriate libiomp5md.dll in your PATH environment variable. Include the directory bin when running on a system with IA-32 architecture.

The following simple example calls an IPP function; the code is in the file t1.cpp:

Example 3-1 Code calling example

#include <stdio.h>
#include <ipp.h>

int main()
{
    /* Query the name and version of the library that was actually loaded */
    const IppLibraryVersion* libver = ippGetLibVersion();
    printf("%s %s\n", libver->Name, libver->Version);
    return 0;
}

cmdlinetest>cl /nologo /Fet1.exe -I "C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\include" t1.cpp "C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\stublib\ippcore.lib"
t1.cpp
cmdlinetest>t1.exe
ippcore-6.1.dll 6.1 build 81

Using Intel IPP Static Libraries
The Intel IPP comes with "merged" static library files that contain every processor version of each function. These files reside in the \ia32\lib directory (see Table 3-1).


Just as with the dynamic dispatcher, the appropriate version of a function is executed when the function is called. This mechanism is not as convenient as the dynamic mechanism, but it can result in a smaller total code size in spite of the big size of the static libraries. To use these static libraries, link to the appropriate files ipp*merged.lib in the \lib directory. You will either need to set your LIB environment variable using the ippenv.bat file or refer to these files using their full path. See also Selecting the Intel IPP Libraries Needed by Your Application.

Contents of the Documentation Directory
Table 3-3 shows the content of the \doc subdirectory in the Intel IPP installation directory.

Table 3-3 Contents of the \doc Directory

File name | Description | Notes
ipp_documentation.htm | Documentation index. Lists the principal Intel IPP documents with appropriate links to the documents |
ReleaseNotes.pdf | General overview of the product and information about this release | Can be viewed prior to the product installation
README.txt | Initial User Information | Can be viewed prior to the product installation
INSTALL.htm | Installation guide | Can be viewed prior to the product installation
ThreadedFunctionsList.txt | List of all Intel IPP functions threaded with OpenMP* |
userguide_win_ia32.pdf | Intel® Integrated Performance Primitives User's Guide, this document |

Intel IPP Reference Manual (in four volumes):
ippsman.pdf, ippsman.chm | Signal Processing (vol.1) - contains detailed descriptions of Intel IPP functions and interfaces for signal processing, audio coding, speech recognition and coding, data compression and integrity, string operations and vector arithmetic. |
ippiman.pdf, ippiman.chm | Image and Video Processing (vol.2) - contains detailed descriptions of Intel IPP functions and interfaces for image processing and compression, color and format conversion, computer vision, video coding. |
ippmman.pdf, ippmman.chm | Small Matrices, Realistic Rendering (vol.3) - contains detailed descriptions of Intel IPP functions and interfaces for vector and matrix algebra, linear system solution, least squares and eigenvalue problems as well as for realistic rendering and 3D data processing. |
ippcpman.pdf, ippcpman.chm | Cryptography (vol.4) - contains detailed descriptions of Intel IPP functions and interfaces for cryptography. |


Configuring Your Development Environment

4

This chapter explains how to configure your development environment for use with Intel® IPP.

Configuring Microsoft* Visual C++* .NET 2003 or Microsoft* Visual C++* 2005 Software to Link with Intel IPP
To configure Microsoft* Visual C++* .NET 2003 or Microsoft* Visual C++* 2005 environments to link with Intel IPP, follow the steps below:
1. Select View > Solution Explorer (make sure this window is active).
2. Select Tools > Options > Projects and Solutions > VC++ Directories.
3. In the drop down menu Show directories for:, select Include Files, and type in the directory for the Intel IPP include files (for example, the default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\include).
4. In the drop down menu titled Show directories for:, select Library Files, and then type in the directory for the Intel IPP library files (for example, the default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\stublib or C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\lib).
5. In the drop down menu Show directories for:, select Executable Files and type in the directory for the Intel IPP executable files (for example, the default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\bin).
6. On the main toolbar, select Project > Properties > Linker > Input. In the Additional Dependencies line, add the libraries you wish to link to (for example, ipps.lib or ippsmerged.lib).

For more information on choosing the best linking method for your Intel IPP application, please refer to Linking Your Application with Intel® IPP.


Creating Visual C++ 2005 Project Files for the Intel® IPP Samples
To create Microsoft* Visual C++* 2005 project files for the Intel IPP samples, follow the steps below, which use the jpegview sample code as an example.
1. Download the media codec sample jpegview from http://www.intel.com/software/products/ipp/samples.htm.
2. Select File > New > Project from Existing Code.
3. Select Visual C++ as the type of project you would like to create from the dropdown menu in the popup window. Fill in the project file location and project name. Check the Add files to the project from these folders check box. Click Add, select the jpegview folder, and click OK.
4. Select View > Solution Explorer (make sure this window is active).
5. Select Projects > Properties.
6. Under Configuration Properties > C/C++ in the drop down menu titled Show directories for:, select Include Files and type in the directory for the Intel IPP include files (for example, default: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\include).
7. In the drop down menu titled Show directories for:, select Library Files and type in the directory for the Intel IPP library files (for example, default: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\stublib or C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\lib).
8. In the drop down menu titled Show directories for:, select Executable Files and type in the directory for the Intel IPP executable files (for example, default: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\bin).

Building a Microsoft* Visual C++ .NET* Solution for the UMC Sample Code
To generate the Microsoft* Visual C++ .NET* project and solution files for the Intel IPP UMC sample code, use the script file gen_vsproj.pl, which can be downloaded from http://www.intel.com/support/performancetools/libraries/ipp/win/ia/sb/cs-022835.htm. It works under Microsoft Windows* OS with ActivePerl* installed. This script can build solution files for Intel® C++ Compiler 10.0 and Microsoft* Visual C++ .NET* 2005. Here are some notes for using this script:


1. Put the script file in the Intel IPP UMC sample code folder: \ipp-samples\audio-video-codecs.
2. At the command line, use the gen_vsproj.pl command to create solution files. Typing "gen_vsproj.pl" will print all command messages. Here is an example that generates Microsoft Visual C++ .NET 2003 solutions for applications with the IA-32 architecture:

>gen_vsproj.pl -vs2003 -noicl -con application\* -gui application\umc_reverb_demo -dll plug-in\audio_codecs -lib codec\* core\* io\* pipeline\* plug-in\object_factory I"..." -win32 -L "..." "..." -l "..." ddraw.lib dsound.lib

Solution files are located in the ipp-samples\audio-video-codecs\application\xxxx directories. For details on running Intel IPP sample code, see Appendix , "Intel® IPP Samples".

Using the IntelliSense* Capability
IntelliSense is a set of native features of the Microsoft Visual Studio* IDE that make language references easily accessible. The user programming with Intel IPP in the Visual Studio Code Editor can employ two IntelliSense features: Complete Word and Parameter Info.

NOTE. Both features use header files. Therefore, to benefit from IntelliSense, make sure the path to the include files is specified in the Visual Studio or solution settings. See above sections on how to do this.

Complete Word
For a software library, the Complete Word feature types or prompts for the rest of the name defined in the header file once the first few characters of the name are typed in your code. Provided your C/C++ code contains the include statement with the appropriate Intel IPP header file, to complete the name of the function or named constant specified in the header file, follow these steps:


1. Type the first few characters of the name (for example, ippsFFT).
2. Press Alt+RIGHT ARROW or Ctrl+SPACEBAR. If you have typed enough characters to eliminate ambiguity in the name, the rest of the name is typed automatically. Otherwise, the pop-up list of the names specified in the header file opens (see Figure 4-1).
3. Select the name from the list, if needed.

Figure 4-1 IntelliSense Complete Word

Parameter Info
The Parameter Info feature displays the parameter list for a function to give information on the number and types of parameters.


Provided your C/C++ code contains the include statement with the appropriate Intel IPP header file, to get the list of parameters of a function specified in the header file, follow these steps:
1. Type the function name
2. Type the opening parenthesis

This makes the tooltip with the function API prototype pop up, and the current parameter in the API prototype is highlighted (see Figure 4-2).

Figure 4-2 IntelliSense Parameter Info


Using Intel® IPP with Intel® C++ Compiler
Using Intel IPP with the Intel C++ Compiler is similar to using Intel IPP with the Microsoft* C++ Compiler. In the Microsoft Visual C++* .NET environment, instead of providing settings at Tools > Options > Projects > VC++ Directories, provide settings at Tools > Options > Intel® C++ by following these steps:
1. Select View > Solution Explorer (and make sure this window is active).
2. Select Tools > Options > Intel® C++ > Intel® C++ XX.
3. In the drop down menu Show directories for:, select Include Files and then type in the directory for the Intel IPP include files (for example, the default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\include).
4. In the Include path, type in the directory for the Intel IPP library files (for example, the default is: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\stublib or C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\lib).
5. In the library path, type in the directory for the Intel IPP executable files (for example, the default: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\bin).
6. On the main toolbar, select Project > Properties > Linker > Input and in the Additional Dependencies line, add the libraries you wish to link to (for example, ipps.lib or ippsmerged.lib).

Using Intel® IPP with Borland C++ Builder* Integrated Development Environment
You cannot directly link to the Intel IPP static library in Borland C++ Builder*. Instead, try the following:
• Use the Intel IPP merged static libraries to create a custom DLL (export the functions called in your application). The custom DLL and merged static library tools are available as part of the integration sample package. The Intel IPP static libraries are located in the IPP\6.1.x.xxx\ia32\lib directory.
• Use IMPLIB (on the DLL) to create an import library in the Borland environment.


Linking Your Application with Intel® IPP

5

This chapter discusses linking Intel IPP to an application. It compares the linking methods with regard to development and target environments, installation specifications, run-time conditions, and other application requirements, to help you select the method that suits you best. It also shows the linking procedure for each method and gives linking examples.

Dispatching
Intel IPP uses code optimized for various central processing units (CPUs). Dispatching refers to detecting your CPU and selecting the Intel IPP binary that corresponds to the hardware that you are using. For example, in the \ia32\bin directory, the file ippiv8-6.1.dll contains the image processing library optimized for Intel® Core™ 2 Duo processors. A single Intel IPP function, for example ippsCopy_8u(), may have many versions, each one optimized to run on a specific Intel® processor with a specific architecture. For example, the version of this function optimized for the Pentium® 4 processor is w7_ippsCopy_8u(). Table 5-1 shows processor-specific codes used in Intel IPP.

Table 5-1 Identification of Codes Associated with Processor-Specific Libraries

Abbreviation | Meaning (IA-32 Intel® architecture)
px | C-optimized for all IA-32 processors
w7 | Optimized for processors with Intel® Streaming SIMD Extensions 2 (Intel SSE2)
t7 | Optimized for processors with Intel® Streaming SIMD Extensions 3 (Intel SSE3)
v8 | Optimized for processors with Intel® Supplemental Streaming SIMD Extensions 3 (Intel SSSE3)
p8 | Optimized for processors with Intel® Streaming SIMD Extensions 4.1 (SSE4.1)
s8 | Optimized for the Intel® Atom™ processor
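To make the relationship between instruction-set support and the processor codes in Table 5-1 concrete, here is a simplified C sketch that picks a code from a feature mask. It is an illustration only and does not reproduce the logic of the Intel IPP dispatcher: the FEAT_* macros are local stand-ins whose values follow the mask values listed in Table 5-2, pick_cpu_code is a hypothetical function, and the s8 code (Intel® Atom™ processor) is left out because it cannot be derived from these four bits alone.

```c
/* Local stand-ins for feature bits; values follow Table 5-2.
   These are not the identifiers from the Intel IPP headers. */
#define FEAT_SSE2   4u
#define FEAT_SSE3   8u
#define FEAT_SSSE3  16u
#define FEAT_SSE41  64u

/* Hypothetical helper: returns the most specific Table 5-1 code
   whose instruction set is present in `features`.
   Simplification: the Atom-specific code "s8" is never returned. */
const char *pick_cpu_code(unsigned int features)
{
    if (features & FEAT_SSE41) return "p8";
    if (features & FEAT_SSSE3) return "v8";
    if (features & FEAT_SSE3)  return "t7";
    if (features & FEAT_SSE2)  return "w7";
    return "px";    /* generic C-optimized code */
}
```

The checks run from the newest instruction set to the oldest, so a processor that supports several of them is mapped to the most specialized code.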

Processor Type and Features
Processor Features
To obtain information about the features of the processor used in your computer system, use function ippGetCpuFeatures, which is declared in the ippcore.h file. This function retrieves main CPU features returned by the function CPUID.1 and stores them consecutively in the mask that is returned by the function. Table 5-2 lists all CPU features that can be retrieved (see more details in the description of the function ippGetCpuFeatures in the Intel IPP Reference Manual, vol.1).

Table 5-2 Processor Features

Mask Value | Name | Feature
1 | ippCPUID_MMX | MMX™ technology
2 | ippCPUID_SSE | Intel® Streaming SIMD Extensions
4 | ippCPUID_SSE2 | Intel® Streaming SIMD Extensions 2
8 | ippCPUID_SSE3 | Intel® Streaming SIMD Extensions 3
16 | ippCPUID_SSSE3 | Supplemental Intel® Streaming SIMD Extensions 3
32 | ippCPUID_MOVBE | MOVBE instruction is supported
64 | ippCPUID_SSE41 | Intel® Streaming SIMD Extensions 4.1
128 | ippCPUID_SSE42 | Intel® Streaming SIMD Extensions 4.2
256 | ippCPUID_AVX | Intel® Advanced Vector Extensions (Intel AVX) instruction set is supported
512 | ippAVX_ENABLEDBYOS | The operating system supports Intel AVX
1024 | ippCPUID_AES | AES instruction is supported
2048 | ippCPUID_CLMUL | PCLMULQDQ instruction is supported
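Because each feature in Table 5-2 occupies one bit of the mask, a mask value can be decoded with bitwise AND. The following C sketch shows one way to turn a mask into a readable list. It is self-contained and does not call Intel IPP: the bit values are copied from Table 5-2, while the short feature labels, the feature_name structure, and the describe_features function are hypothetical names chosen for this illustration.

```c
#include <stdio.h>

/* One row of Table 5-2: mask bit and a short label (labels are ours). */
struct feature_name { unsigned int bit; const char *name; };

static const struct feature_name k_features[] = {
    {1u, "MMX"},      {2u, "SSE"},      {4u, "SSE2"},    {8u, "SSE3"},
    {16u, "SSSE3"},   {32u, "MOVBE"},   {64u, "SSE4.1"}, {128u, "SSE4.2"},
    {256u, "AVX"},    {512u, "AVX_OS"}, {1024u, "AES"},  {2048u, "PCLMULQDQ"}
};

/* Hypothetical helper: writes a space-separated list of the features
   set in `mask` into `buf` and returns how many names were written.
   Stops early, without a partial name, if `buf` is too small. */
int describe_features(unsigned int mask, char *buf, size_t size)
{
    size_t used = 0, i;
    int count = 0;

    if (buf == NULL || size == 0)
        return 0;
    buf[0] = '\0';

    for (i = 0; i < sizeof k_features / sizeof k_features[0]; i++) {
        if (mask & k_features[i].bit) {
            int n = snprintf(buf + used, size - used, "%s%s",
                             count ? " " : "", k_features[i].name);
            if (n < 0 || (size_t)n >= size - used) {
                buf[used] = '\0';   /* drop the truncated name */
                break;
            }
            used += (size_t)n;
            count++;
        }
    }
    return count;
}
```

In an application that links Intel IPP, the mask argument would be the value obtained from ippGetCpuFeatures; see the Reference Manual for the exact calling syntax.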


Processor Type
To detect the processor type used in your computer system, use function ippGetCpuType, which is declared in the ippcore.h file. It returns an appropriate IppCpuType variable value. All of the enumerated values are given in the ippdefs.h header file. For example, the return value ippCpuCoreDuo means that your system uses Intel? Core? Duo processor. Table 5-3 shows possible values of ippGetCpuType and their meaning.

Table 5-3 Detecting processor type. Returned values and their meaning

Returned Variable Value | Processor Type
ippCpuPP | Intel® Pentium® processor
ippCpuPMX | Pentium® processor with MMX™ technology
ippCpuPPR | Pentium® Pro processor
ippCpuPII | Pentium® II processor
ippCpuPIII | Pentium® III processor and Pentium® III Xeon® processor
ippCpuP4 | Pentium® 4 processor and Intel® Xeon® processor
ippCpuP4HT | Pentium® 4 processor with Hyper-Threading Technology
ippCpuP4HT2 | Pentium® Processor with Intel® Streaming SIMD Extensions 3
ippCpuCentrino | Intel® Centrino® mobile Technology
ippCpuCoreSolo | Intel® Core™ Solo processor
ippCpuCoreDuo | Intel® Core™ Duo processor
ippCpuITP | Intel® Itanium® processor
ippCpuITP2 | Intel® Itanium® 2 processor
ippCpuEM64T | Intel® 64 Instruction Set Architecture (ISA)
ippCpuC2D | Intel® Core™ 2 Duo Processor
ippCpuC2Q | Intel® Core™ 2 Quad processor
ippCpuPenryn | Intel® Core™ 2 processor with Intel® Streaming SIMD Extensions 4.1 instruction set
ippCpuBonnell | Intel® Atom™ processor
ippCpuNehalem | Intel® Core™ i7 processor
ippCpuSSE | Processor with Intel® Streaming SIMD Extensions instruction set
ippCpuSSE2 | Processor with Intel® Streaming SIMD Extensions 2 instruction set
ippCpuSSE3 | Processor with Intel® Streaming SIMD Extensions 3 instruction set
ippCpuSSSE3 | Processor with Supplemental Intel® Streaming SIMD Extensions 3 instruction set
ippCpuSSE41 | Processor with Intel® Streaming SIMD Extensions 4.1 instruction set
ippCpuSSE42 | Processor with Intel® Streaming SIMD Extensions 4.2 instruction set
ippCpuAVX | Processor supports Intel® Advanced Vector Extensions instruction set
ippCpuX8664 | Processor supports 64 bit extension
ippCpuUnknown | Unknown Processor

Selecting Between Linking Methods
You can use different linking methods for Intel IPP:
• Dynamic linking using the run-time dynamic link libraries (DLLs)
• Static linking with dispatching using emerged and merged static libraries
• Static linking without automatic dispatching using merged static libraries
• Dynamic linking with your own custom DLL.

Answering the following questions helps you select the linking method that best suits you:
• Are there limitations on how large the application executable can be?
• Are there limitations on how large the application installation package can be?
• Is the Intel IPP-based application a device driver or similar "ring 0" software that executes in the kernel mode at least some of the time?
• Will various users install the application on a range of processor types, or is the application explicitly supported only on a single type of processor? Is the application part of an embedded computer where only one type of processor is used?
• What resources are available for maintaining and updating customized Intel IPP components? What level of effort is acceptable for incorporating new processor optimizations into the application?
• How often will the application be updated? Will application components be distributed independently or will they always be packaged together?

Dynamic Linking
Dynamic linking is the simplest method and the most commonly used. It takes full advantage of the dynamic dispatching mechanism in the dynamic link libraries (DLLs) (see also Intel® IPP Structure). The following table summarizes the features of dynamic linking to help you understand the trade-offs of this linking method.

Table 5-4 Summary of Dynamic Linking Features

Benefits
• Automatic run-time dispatch of processor-specific optimizations
• Enabling updates with new processor optimizations without recompile/relink
• Reduction of disk space requirements for applications with multiple Intel IPP-based executables
• Enabling more efficient shared use of memory at run-time for multiple Intel IPP-based applications

Considerations
• Application executable requires access to Intel IPP run-time dynamic link libraries (DLLs) to run
• Not appropriate for kernel-mode/device-driver/ring-0 code
• Not appropriate for web applets/plug-ins that require very small download
• There is a one-time performance penalty when the Intel IPP DLLs are first loaded

To dynamically link with Intel IPP, follow these steps:
1. Include ipp.h, which includes the header files of all IPP domains.
2. Use the normal IPP function names when calling IPP functions.
3. Link the corresponding domain import libraries. For example, if you use the ippsCopy_8u function, link against ipps.lib.
4. Make sure that the run-time libraries, for example ipps-6.1.dll, are on the executable search path at run time. Run ippenv.bat from the \tools\env directory to ensure that an application built with the Intel IPP dynamic link libraries loads the appropriate processor-specific library.

Static Linking (with Dispatching)
Some applications use only a few Intel® IPP functions and require a small memory footprint. Using the static link libraries via the emerged and merged libraries offers both the benefits of a small footprint and optimization on multiple processors. The emerged libraries (such as ippsemerged.lib) provide an entry point for the non-decorated IPP functions (those with normal names) and the jump table to each processor-specific implementation. When linked with your application, each such entry point calls the corresponding function in the merged libraries (such as ippsmerged.lib) in accordance with the CPU setting detected by functions in ippcorel.lib. The emerged libraries do not contain any implementation code.

The emerged libraries must be initialized before any non-decorated functions can be called. You can choose the function ippStaticInit(), which initializes the library to use the best optimization available, or the function ippStaticInitCpu(), which lets you specify the CPU. In either case, one of these functions must be called before any other IPP functions. Otherwise, a "px" version of the IPP functions will be called, which can decrease the performance of your application. Example 5-1 illustrates the performance difference. This example appears in the t2.cpp file:

Example 5-1 Performance difference with and without calling StaticInit

#include <stdio.h>
#include <ipp.h>

int main()
{
    const int N = 20000, loops = 100;
    Ipp32f src[N], dst[N];
    unsigned int seed = 12345678, i;
    Ipp64s t1,t2;

    /// no StaticInit call, means PX code, not optimized
    ippsRandUniform_Direct_32f(src,N,0.0,1.0,&seed);
    t1=ippGetCpuClocks();
    for(i=0; i<loops; i++)
        ippsSqrt_32f(src,dst,N);
    t2=ippGetCpuClocks();
    printf("without StaticInit: %.1f clocks/element\n", (float)(t2-t1)/loops/N);

    ippStaticInit();
    t1=ippGetCpuClocks();
    for(i=0; i<loops; i++)
        ippsSqrt_32f(src,dst,N);
    t2=ippGetCpuClocks();
    printf("with StaticInit: %.1f clocks/element\n", (float)(t2-t1)/loops/N);
    return 0;
}

t2.cpp
cmdlinetest>t2
without StaticInit: 61.3 clocks/element
with StaticInit: 4.5 clocks/element
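The jump-table idea behind the emerged libraries, and the reason an uninitialized dispatcher falls back to the "px" code, can be sketched in plain C with a function pointer. This is a conceptual illustration under our own assumptions, not the Intel IPP implementation: px_copy, w7_copy, my_static_init, my_copy, and my_copy_impl_name are all hypothetical names, and both "implementations" are ordinary C so that the sketch runs anywhere.

```c
#include <stddef.h>
#include <string.h>

typedef void (*copy_fn)(const unsigned char *src, unsigned char *dst, size_t len);

/* Hypothetical generic implementation (plays the role of the "px" code). */
static void px_copy(const unsigned char *src, unsigned char *dst, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        dst[i] = src[i];
}

/* Hypothetical optimized implementation (plays the role of the "w7" code). */
static void w7_copy(const unsigned char *src, unsigned char *dst, size_t len)
{
    memcpy(dst, src, len);
}

/* One-entry jump table. Until the init function runs, it points at the
   generic version, which mirrors the "px" fallback described above. */
static copy_fn g_copy = px_copy;

/* Plays the role of the dispatcher initialization: selects the entry. */
void my_static_init(int cpu_has_sse2)
{
    g_copy = cpu_has_sse2 ? w7_copy : px_copy;
}

/* The single, non-decorated entry point that applications call. */
void my_copy(const unsigned char *src, unsigned char *dst, size_t len)
{
    g_copy(src, dst, len);
}

/* Reports which implementation is currently selected. */
const char *my_copy_impl_name(void)
{
    return (g_copy == w7_copy) ? "w7" : "px";
}
```

The caller always uses my_copy; only the one-time call to my_static_init decides which implementation runs, which is why forgetting the initialization call costs performance rather than correctness.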


When you perform static linking via the emerged libraries, there are things you should consider. Table 5-5 summarizes the pros and cons of this type of static linking.

Table 5-5 Summary of Features of the Static Linking (with Dispatching)

Benefits
• Dispatches processor-specific optimizations during run-time
• Creates a self-contained application executable
• Generates a smaller footprint than the full set of Intel IPP DLLs

Considerations
• Intel IPP code is duplicated for multiple Intel IPP-based applications because of static linking
• An additional function call for dispatcher initialization is needed (once) during program initialization

Follow these steps to use static linking with dispatching:
1. Include ipp.h in your code.
2. Before calling any Intel IPP functions, initialize the static dispatcher using either the function ippStaticInit() or ippStaticInitCpu(), which are declared in the header file ippcore.h.
3. Use the normal IPP function names to call IPP functions.
4. Link the corresponding emerged libraries followed by the merged libraries, and then ippcorel.lib. For example, if the function ippsCopy_8u() is used, the linked libraries are ippsemerged.lib, ippsmerged.lib, and ippcorel.lib.

Static Linking (without Dispatching)
This method links directly with the merged static libraries. You may want to use your own static dispatcher instead of the provided emerged dispatcher. The IPP sample mergelib demonstrates how to do this. Please refer to the latest updated sample from the Intel IPP samples directory: \ipp-samples\advanced-usage\linkage\mergelib at http://www.intel.com/software/products/ipp/samples.htm.

This method is appropriate when a self-contained application is needed, only one processor type is supported, and there are tight constraints on the executable size. A common case is an embedded application that is bundled with only one type of processor.

Intel® IPP User’s Guide

Table 5-6 summarizes basic features of this method of linking.

Table 5-6 Summary of Features of the Static Linking (without Dispatching)

Benefits
• Small executable size with support for only one processor type
• An executable suitable for kernel-mode/device-driver/ring-0 use*)
• An executable suitable for a Web applet or a plug-in requiring very small file download and support for only one processor type
• Self-contained application executable that does not require the Intel IPP run-time DLLs to run
• Smallest footprint for application package
• Smallest installation package
*) for not-threaded libraries only

Considerations
• The executable is optimized for only one processor type
• Updates to processor-specific optimizations require rebuild and/or relink

You can use alternatives to the above procedure. The Intel IPP package includes a set of processor-specific header files (such as ipp_w7.h). You can use these header files instead of the IPPCALL macro. Refer to Static linking to Intel® IPP Functions for One Processor in \ia32\tools\staticlib\readme.htm.


Building a Custom DLL
Some applications have few internal modules and the Intel IPP code needs to be shared by these modules only. In this case, you can use dynamic linking with the customized dynamic link library (DLL) containing only those Intel IPP functions that the application uses. Table 5-7 summarizes features of the custom DLLs.

Table 5-7 Custom DLL Features

Benefits
• Run-time dispatching of processor-specific optimizations
• Reduced hard-drive footprint compared with a full set of Intel IPP DLLs
• Smallest installation package to accommodate use of some of the same Intel IPP functions by multiple applications

Considerations
• Application executable requires access to the Intel compiler specific run-time libraries that are delivered with Intel IPP
• Developer resources are needed to create and maintain the custom DLLs
• Integration of new processor-specific optimizations requires rebuilding the custom DLLs
• Not appropriate for kernel-mode/device-driver/ring-0 code

To create a custom DLL, you need to create a separate build step or project that generates the DLL and stubs. The specially developed sample demonstrates how it can be done. Please refer to the latest updated custom dll sample from the Intel IPP samples directory: \ipp-samples\advanced-usage\linkage\customdll at http://www.intel.com/software/products/ipp/samples.htm.


Comparison of Intel IPP Linkage Methods
Table 5-8 gives a quick comparison of the IPP linkage methods.

Table 5-8 Intel IPP Linkage Method Summary Comparison

Feature | Dynamic Linkage | Static Linkage with Dispatching | Static Linkage without Dispatching | Using Custom DLL
Processor Updates | Automatic | Recompile & redistribute | Release new processor-specific application | Recompile & redistribute
Optimization | All processors | All processors | One processor | All processors
Build | Link to stub static libraries | Link to static libraries and static dispatchers | Link to merged libraries or threaded merged libraries | Build separate DLL
Calling | Regular names | Regular names | Processor-specific names | Regular names
Total Binary Size | Large | Small | Smallest | Small
Executable Size | Smallest | Small | Small | Smallest
Kernel Mode | No | Yes | Yes | No

Selecting the Intel IPP Libraries Needed by Your Application
Table 5-9 shows functional domains and the relevant header files and libraries used for each linkage method.

Table 5-9 Libraries Used for Each Linkage Method

Domain Description | Header Files | Dynamic Linking | Static Linking with Dispatching and Custom Dynamic Linking | Static Linking without Dispatching
Audio Coding | ippac.h | ippac.lib | ippacemerged.lib | ippacmerged.lib, ippacmerged_t.lib
Color Conversion | ippcc.h | ippcc.lib | ippccemerged.lib | ippccmerged.lib, ippccmerged_t.lib
String Processing | ippch.h | ippch.lib | ippchemerged.lib | ippchmerged.lib, ippchmerged_t.lib
Cryptography | ippcp.h | ippcp.lib | ippcpemerged.lib | ippcpmerged.lib, ippcpmerged_t.lib
Computer Vision | ippcv.h | ippcv.lib | ippcvemerged.lib | ippcvmerged.lib, ippcvmerged_t.lib
Data Compression | ippdc.h | ippdc.lib | ippdcemerged.lib | ippdcmerged.lib, ippdcmerged_t.lib
Data Integrity | ippdi.h | ippdi.lib | ippdiemerged.lib | ippdimerged.lib, ippdimerged_t.lib
Generated Functions | ipps.h | ippgen.lib | ippgenemerged.lib | ippgenmerged.lib, ippgenmerged_t.lib
Image Processing | ippi.h | ippi.lib | ippiemerged.lib | ippimerged.lib, ippimerged_t.lib
Image Compression | ippj.h | ippj.lib | ippjemerged.lib | ippjmerged.lib, ippjmerged_t.lib
Realistic Rendering and 3D Data Processing | ippr.h | ippr.lib | ippremerged.lib | ipprmerged.lib, ipprmerged_t.lib
Small Matrix Operations | ippm.h | ippm.lib | ippmemerged.lib | ippmmerged.lib, ippmmerged_t.lib
Signal Processing | ipps.h | ipps.lib | ippsemerged.lib | ippsmerged.lib, ippsmerged_t.lib
Speech Coding | ippsc.h | ippsc.lib | ippscemerged.lib | ippscmerged.lib, ippscmerged_t.lib
Speech Recognition | ippsr.h | ippsr.lib | ippsremerged.lib | ippsrmerged.lib, ippsrmerged_t.lib
Video Coding | ippvc.h | ippvc.lib | ippvcemerged.lib | ippvcmerged.lib, ippvcmerged_t.lib
Vector Math | ippvm.h | ippvm.lib | ippvmemerged.lib | ippvmmerged.lib, ippvmmerged_t.lib
Core Functions | ippcore.h | ippcore.lib | ippcorel.lib | ippcorel.lib, ippcore_t.lib

Dynamic Linkage
To use the dynamic linking libraries, you must link to ipp*.lib files in the \stublib directory, where * denotes the appropriate function domain. You must also link to all corresponding domain libraries used in your applications plus the libraries ipps.lib, ippcore.lib, and libiomp5md.lib.

For example, consider that your application uses three Intel IPP functions: ippiCopy_8u_C1R, ippiCanny_16s8u_C1R, and ippmMul_mc_32f. These three functions belong to the image processing, computer vision, and small matrix operations domains, respectively. To include these functions in your application, you must link to the following Intel IPP libraries:

ippi.lib
ippcv.lib
ippm.lib
ippcore.lib
libiomp5md.lib

Static Linkage with Dispatching
To use the static linking libraries, you need to link to ipp*emerged.lib, ipp*merged.lib, ippsemerged.lib, ippsmerged.lib, and ippcorel.lib. The * denotes the appropriate function domain. If you want to use the Intel IPP functions threaded with OpenMP*, you need to link to ipp*emerged.lib, ipp*merged_t.lib, ippsemerged.lib, ippsmerged_t.lib, ippcore_t.lib, and libiomp5mt.lib. All these libraries are located in the \lib directory containing domain-specific functions. Note that both the merged and emerged libraries for every domain you use, plus those for the signal processing domain, must be linked to your application.

For example, consider that your application uses three Intel IPP functions: ippiCopy_8u_C1R, ippiCanny_16s8u_C1R, and ippmMul_mc_32f. These three functions belong to the image processing, computer vision, and small matrix operations domains, respectively. If you want to use the threaded functions, you must link the following libraries to your application:

ippcvemerged.lib and ippcvmerged_t.lib
ippmemerged.lib and ippmmerged_t.lib
ippiemerged.lib and ippimerged_t.lib
ippsemerged.lib and ippsmerged_t.lib
ippcore_t.lib
libiomp5mt.lib

Library Dependencies by Domain (Static Linkage Only)
Table 5-10 lists library dependencies by domain. When you link to a certain library (for example, data compression domain), you must link to the libraries on which it depends (in our example, the signal processing and core functions).


Table 5-10 Library Dependencies by Domain

Domain | Library | Dependent on
Audio Coding | ippac | ippdc, ipps, ippcore
Color Conversion | ippcc | ippi, ipps, ippcore
Cryptography | ippcp | ippcore
Computer Vision | ippcv | ippi, ipps, ippcore
Data Compression | ippdc | ipps, ippcore
Data Integrity | ippdi | ippcore
Generated Functions | ippgen | ipps, ippcore
Image Processing | ippi | ipps, ippcore
Image Compression | ippj | ippi, ipps, ippcore
Small Matrix Operations | ippm | ippi, ipps, ippcore
Realistic Rendering and 3D Data Processing | ippr | ippi, ipps, ippcore
Signal Processing | ipps | ippcore
Speech Coding | ippsc | ipps, ippcore
Speech Recognition | ippsr | ipps, ippcore
String Processing | ippch | ipps, ippcore
Video Coding | ippvc | ippi, ipps, ippcore
Vector Math | ippvm | ippcore

Refer to Intel IPP Reference Manuals to find which domain your function belongs to.
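As an illustration of how the dependency table can be applied, the sketch below expands one domain into a link list for static linkage with dispatching (non-threaded libraries): emerged libraries first, then merged libraries, then ippcorel.lib. The dependency data is transcribed from Table 5-10; the helper static_link_list and its structure are hypothetical and are not part of Intel IPP or its tools.

```c
#include <stdio.h>
#include <string.h>

/* Non-core dependencies per domain, transcribed from Table 5-10.
   Every domain also depends on ippcore, which is appended separately. */
struct dep {
    const char *lib;
    const char *needs[2];
};

static const struct dep deps[] = {
    {"ippac",  {"ippdc", "ipps"}}, {"ippcc",  {"ippi", "ipps"}},
    {"ippcp",  {NULL, NULL}},      {"ippcv",  {"ippi", "ipps"}},
    {"ippdc",  {"ipps", NULL}},    {"ippdi",  {NULL, NULL}},
    {"ippgen", {"ipps", NULL}},    {"ippi",   {"ipps", NULL}},
    {"ippj",   {"ippi", "ipps"}},  {"ippm",   {"ippi", "ipps"}},
    {"ippr",   {"ippi", "ipps"}},  {"ipps",   {NULL, NULL}},
    {"ippsc",  {"ipps", NULL}},    {"ippsr",  {"ipps", NULL}},
    {"ippch",  {"ipps", NULL}},    {"ippvc",  {"ippi", "ipps"}},
    {"ippvm",  {NULL, NULL}},
};

/* Hypothetical helper: writes the link list for one domain into out.
   Returns 0 on success, -1 if the domain is unknown or out is too small. */
int static_link_list(const char *domain, char *out, size_t out_size)
{
    static const char *suffix[2] = {"emerged.lib", "merged.lib"};

    for (size_t i = 0; i < sizeof deps / sizeof deps[0]; i++) {
        if (strcmp(deps[i].lib, domain) != 0)
            continue;
        const char *chain[3] = {deps[i].lib, deps[i].needs[0], deps[i].needs[1]};
        size_t used = 0;
        int n;
        /* Pass 0 emits the emerged libraries, pass 1 the merged libraries. */
        for (int pass = 0; pass < 2; pass++) {
            for (int k = 0; k < 3 && chain[k] != NULL; k++) {
                n = snprintf(out + used, out_size - used, "%s%s ",
                             chain[k], suffix[pass]);
                if (n < 0 || (size_t)n >= out_size - used)
                    return -1;
                used += (size_t)n;
            }
        }
        /* The core library always comes last. */
        n = snprintf(out + used, out_size - used, "ippcorel.lib");
        return (n < 0 || (size_t)n >= out_size - used) ? -1 : 0;
    }
    return -1;
}
```

For the computer vision domain this sketch yields the ippcv, ippi, and ipps emerged libraries, then the matching merged libraries, then ippcorel.lib. For the threaded variant you would substitute the _t libraries and add libiomp5mt.lib as described under Static Linkage with Dispatching.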

Linking Examples
For more linking examples, please go to http://www.intel.com/software/products/ipp/samples.htm. For information on using sample code, please see “Intel® IPP Samples”.


Chapter 6: Supporting Multithreaded Applications

This chapter discusses the use of Intel® IPP in multithreaded applications.

Intel IPP Threading and OpenMP* Support
All Intel IPP functions are thread-safe in both dynamic and static libraries and can be used in multithreaded applications. Some Intel IPP functions contain OpenMP* code that significantly increases performance on multi-processor and multi-core systems. These functions include color conversion, filtering, convolution, cryptography, cross correlation, matrix computation, square distance, bit reduction, and others. Refer to the ThreadedFunctionsList.txt document in the \doc directory of the Intel IPP installation for the list of all threaded functions. See also http://www.intel.com/software/products/support/ipp for more topics related to Intel IPP threading and OpenMP* support, including older Intel IPP versions of threaded API.

Setting Number of Threads
The default number of threads for Intel IPP threaded libraries is equal to the number of processors in the system and does not depend on the value of the OMP_NUM_THREADS environment variable. To set another number of threads used by Intel IPP internally, call the function ippSetNumThreads(n) at the very beginning of an application. Here n is the desired number of threads (1,...). If internal parallelization is not desired, call ippSetNumThreads(1).


Using Shared L2 Cache
Some functions in the signal processing domain are threaded on two threads intended for the Intel® Core™2 processor family, and exploit the advantage of a merged L2 cache. These functions (single and double precision FFT, Div, Sqrt, and so on) achieve the maximum performance if both threads are executed on the same die. In this case, the threads work on the same shared L2 cache. For processors with two cores on the die, this condition is satisfied automatically. For processors with more than two cores, a special OpenMP environment variable must be set:

KMP_AFFINITY=compact

Otherwise, the performance may degrade significantly.

Nested Parallelization
If a multithreaded application created with OpenMP calls a threaded Intel IPP function, that function operates in a single thread because nested parallelization is disabled in OpenMP by default. If a multithreaded application created with other tools calls a threaded Intel IPP function, it is recommended that you disable multithreading in Intel IPP to avoid nested parallelization and possible performance degradation.

Disabling Multithreading
To disable multi-threading, call function ippSetNumThreads with parameter 1, or link your application with IPP non-threaded static libraries.


Chapter 7: Managing Performance and Memory

This chapter describes ways you can get the most out of the Intel® IPP software, such as aligning memory, thresholding denormal data, reusing buffers, and using Fast Fourier Transform (FFT) for algorithmic optimization (where appropriate). Finally, it explains how to run performance tests of Intel IPP functions with the Intel IPP Performance Test Tool and gives some examples of Performance Test Tool command lines.

Memory Alignment
The performance of Intel IPP functions can be significantly different when operating on aligned or misaligned data. Access to memory is faster if pointers to the data are aligned. Use the following Intel IPP functions for pointer alignment, memory allocation and deallocation:

void* ippAlignPtr( void* ptr, int alignBytes )
    Aligns a pointer, can align to 2/4/8/16/…

void* ippMalloc( int length )
    32-byte aligned memory allocation. Memory can be freed only with the function ippFree.

void ippFree( void* ptr )
    Frees memory allocated by the function ippMalloc.

Ipp<datatype>* ippsMalloc_<datatype>( int len )
    32-byte aligned memory allocation for signal elements of different data types. Memory can be freed only with the function ippsFree.

void ippsFree( void* ptr )
    Frees memory allocated by ippsMalloc.

Ipp<datatype>* ippiMalloc_<mod>( int widthPixels, int heightPixels, int* pStepBytes )
    32-byte aligned memory allocation for images where every line of the image is padded with zeros. Memory can be freed only with the function ippiFree.

void ippiFree( void* ptr )
    Frees memory allocated by ippiMalloc.

Example 7-1 demonstrates how the function ippiMalloc can be used. The amount of memory that can be allocated is determined by the operating system and system hardware, but it cannot exceed 2GB.

NOTE. Intel IPP memory functions are wrappers of the standard malloc and free functions that align the memory to a 32-byte boundary for optimal performance on the Intel architecture.

NOTE. The Intel IPP functions ippFree, ippsFree, and ippiFree can only be used to free memory allocated by the functions ippMalloc, ippsMalloc and ippiMalloc, respectively.

NOTE. The Intel IPP functions ippFree, ippsFree, and ippiFree cannot be used to free memory allocated by standard functions like malloc or calloc; nor can the memory allocated by the Intel IPP functions ippMalloc, ippsMalloc, and ippiMalloc be freed by the standard function free.
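To make the alignment arithmetic concrete, here is a minimal sketch in plain C of rounding a pointer up to a power-of-two boundary and of padding an image row to a multiple of an alignment. Both helpers (align_ptr_up and padded_step_bytes) are hypothetical illustrations of the idea; they are not the Intel IPP implementation, and the step value that ippiMalloc actually returns may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounds ptr up to the next multiple of align_bytes.
   align_bytes is assumed to be a power of two (2, 4, 8, 16, 32, ...). */
void *align_ptr_up(void *ptr, size_t align_bytes)
{
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t mask = (uintptr_t)align_bytes - 1;
    return (void *)((p + mask) & ~mask);
}

/* Hypothetical helper: distance in bytes between the starts of two
   consecutive image rows when each row is padded to align_bytes. */
size_t padded_step_bytes(size_t width_pixels, size_t channels,
                         size_t bytes_per_channel, size_t align_bytes)
{
    size_t row = width_pixels * channels * bytes_per_channel;
    return (row + align_bytes - 1) / align_bytes * align_bytes;
}
```

Under these assumptions, a 320-pixel row of 3-channel 8-bit data occupies 960 bytes, which is already a multiple of 32, while a 321-pixel row (963 bytes) would be padded to 992 bytes. This is why image code addresses pixels through the step in bytes rather than through the row width.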


Example 7-1 Calling the ippiMalloc function

#include "stdafx.h"
#include "ipp.h"
#include "tools.h"

int main(int argc, char *argv[])
{
    IppiSize size = {320, 240};

    int stride;
    Ipp8u* pSrc = ippiMalloc_8u_C3(size.width, size.height, &stride);
    ippiImageJaehne_8u_C3R(pSrc, stride, size);
    ipView_8u_C3R(pSrc, stride, size, "Source image", 0);

    int dstStride;
    Ipp8u* pDst = ippiMalloc_8u_C3(size.width, size.height, &dstStride);
    ippiCopy_8u_C3R(pSrc, stride, pDst, dstStride, size);
    ipView_8u_C3R(pDst, dstStride, size, "Destination image 1", 0);

    IppiSize ROISize = { size.width/2, size.height/2 };
    ippiCopy_8u_C3R(pSrc, stride, pDst, dstStride, ROISize);
    ipView_8u_C3R(pDst, dstStride, ROISize, "Destination image, small", 0);

    IppiPoint srcOffset = { size.width/4, size.height/4 };
    ippiCopy_8u_C3R(pSrc + srcOffset.x*3 + srcOffset.y*stride, stride, pDst, dstStride, ROISize);
    ipView_8u_C3R(pDst, dstStride, ROISize, "Destination image, small & shifted", 1);

    /// memory from ippiMalloc must be released with ippiFree
    ippiFree(pSrc);
    ippiFree(pDst);
    return 0;
}

Thresholding Data
Denormal numbers are the border values in the floating-point format and special case values for the processor. Operations on denormal data make processing slow, even if corresponding interrupts are disabled. Denormal data occurs, for example, in filtering by Infinite Impulse Response (IIR) and Finite Impulse Response (FIR) filters of the signal captured in fixed-point format and converted to the floating-point format. To avoid the slowdown effect in denormal data processing, the Intel IPP threshold functions can be applied to the input signal before filtering. For example:

if (denormal_data)
    ippsThreshold_LT_32f_I( src, len, 1e-6f );
ippsFIR_32f( src, dst, len, st );


The 1e-6 value is the threshold level; the input data below that level are set to zero. Because the Intel IPP threshold function is very fast, executing the two functions is faster than executing the filter alone if denormal numbers occur in the source data. Of course, if the denormal data arises inside the filtering procedure itself, the threshold functions do not help. In this case, for Intel processors beginning with the Intel® Pentium® 4 processor, it is possible to set special computation modes: flush-to-zero (FTZ) and denormals-are-zero (DAZ). You can use the functions ippsSetFlushToZero and ippsSetDenormAreZeros to enable these modes. Note that this setting takes effect only when computing is done with the Intel® Streaming SIMD Extensions (Intel® SSE) and Intel Streaming SIMD Extensions 2 (Intel SSE2) instructions.

Table 7-1 illustrates how denormal data may affect performance and shows the effect of thresholding denormal data. As you can see, thresholding costs only three additional clocks per element. Without it, denormal data can make the application run about 250 times slower.

Table 7-1 Performance Resulting from Thresholding Denormal Data

Data/Method | Normal | Denormal | Denormal + Threshold
CPU cycles per element | 46 | 11467 | 49
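The thresholding step can be illustrated in plain C. The sketch below zeroes every sample whose magnitude is below a given level, so that no denormal value reaches the filter. The helper flush_small_to_zero is hypothetical; it only illustrates the idea and is not the Intel IPP threshold function, whose exact behavior is defined in the reference manual.

```c
#include <stddef.h>

/* Hypothetical helper: sets every non-zero sample whose magnitude is
   below level to zero and returns how many samples were changed. */
size_t flush_small_to_zero(float *data, size_t len, float level)
{
    size_t flushed = 0;
    for (size_t i = 0; i < len; i++) {
        float v = data[i];
        float mag = (v < 0.0f) ? -v : v;   /* absolute value without libm */
        if (mag != 0.0f && mag < level) {
            data[i] = 0.0f;
            flushed++;
        }
    }
    return flushed;
}
```

With a level such as 1e-6f, which is far above the single-precision denormal range, any denormal input is replaced by zero before filtering, while samples of ordinary magnitude pass through unchanged.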

Reusing Buffers
Some Intel IPP functions require internal memory for various optimization strategies. At the same time, you should be aware that memory allocation inside of the function may have a negative impact on performance in some situations, such as in the case of cache misses. To avoid or minimize memory allocation and keep your data in warm cache, some functions, for example, Fourier transform functions, can use or reuse memory given as a parameter to the function. If you have to call a function, for example, an FFT function, many times, the reuse of an external buffer results in better performance. A common example of this kind of processing is to perform filtering using FFT, or to compute FFT as two FFTs in two separate threads:

ippsFFTInitAlloc_C_32fc( &ctxN2, order-1, IPP_FFT_DIV_INV_BY_N, ippAlgHintAccurate );
ippsFFTGetBufSize_C_32fc( ctxN2, &sz );
buffer = sz > 0 ? ippsMalloc_8u( sz ) : 0;

int phase = 0;
/// prepare source data for two FFTs
ippsSampleDown_32fc( x, fftlen, xleft, &fftlen2, 2, &phase );
phase = 1;
ippsSampleDown_32fc( x, fftlen, xrght, &fftlen2, 2, &phase );

ippsFFTFwd_CToC_32fc( xleft, Xleft, ctxN2, buffer );
ippsFFTFwd_CToC_32fc( xrght, Xrght, ctxN2, buffer );

The external buffer is not necessary. If the pointer to the buffer is 0, the function allocates memory inside.

Using FFT
Fast Fourier Transform (FFT) is a universal method to increase performance of data processing, especially in the field of digital signal processing where filtering is essential. The convolution theorem states that filtering of two signals in the spatial domain can be computed as point-wise multiplication in the frequency domain. The data transformation to and from the frequency domain is usually performed using the Fourier transform.

You can apply the Finite Impulse Response (FIR) filter to the input signal by using Intel IPP FFT functions, which are very fast on Intel® processors. You can also increase the data array length to the next greater power of two by padding the array with zeroes and then applying the forward FFT function to the input signal and the FIR filter coefficients. Fourier coefficients obtained in this way are multiplied point-wise and the result can easily be transformed back to the spatial domain. The performance gain achieved by using FFT may be very significant.

If the applied filter is the same for several processing iterations, then the FFT of the filter coefficients needs to be done only once. The twiddle tables and the bit reverse tables are created in the initialization function for the forward and inverse transforms at the same time. The main operations in this kind of filtering are presented below:

ippsFFTInitAlloc_R_32f( &pFFTSpec, fftord, IPP_FFT_DIV_INV_BY_N, ippAlgHintNone );

/// perform forward FFT to put source data xx to frequency domain
ippsFFTFwd_RToPack_32f( xx, XX, pFFTSpec, 0 );

/// perform forward FFT to put filter coefficients hh to frequency domain
ippsFFTFwd_RToPack_32f( hh, HH, pFFTSpec, 0 );

/// point-wise multiplication in frequency domain is convolution
ippsMulPack_32f_I( HH, XX, fftlen );

/// perform inverse FFT to get result yy in time domain
ippsFFTInv_PackToR_32f( XX, yy, pFFTSpec, 0 );

/// free FFT tables
ippsFFTFree_R_32f( pFFTSpec );

Another way to significantly improve performance is by using FFT and multiplication for processing large size data. Note that the zeros in the example above could be pointers to the external memory, which is another way to increase performance. Note that the Intel IPP signal processing FIR filter is implemented using FFT and you do not need to create a special implementation of the FIR functions.
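The transform-multiply-inverse pattern can be checked on small integer data with a self-contained sketch. To stay exact and avoid any dependency on a math library, the code below swaps the floating-point FFT for a naive O(n^2) number-theoretic transform modulo the prime 998244353, which obeys the same convolution theorem. It is not how Intel IPP implements filtering, and the names convolve, transform, and MAX_N are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MOD 998244353u   /* prime, MOD - 1 = 119 * 2^23, primitive root 3 */
#define MAX_N 64         /* largest padded length this sketch supports */

static uint32_t mul_mod(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b % MOD);
}

static uint32_t pow_mod(uint32_t base, uint32_t exp)
{
    uint32_t result = 1;
    while (exp) {
        if (exp & 1u)
            result = mul_mod(result, base);
        base = mul_mod(base, base);
        exp >>= 1;
    }
    return result;
}

/* Naive transform: out[k] = sum over j of in[j] * w^(j*k), modulo MOD. */
static void transform(const uint32_t *in, uint32_t *out, size_t n, uint32_t w)
{
    for (size_t k = 0; k < n; k++) {
        uint32_t wk = pow_mod(w, (uint32_t)k);
        uint32_t term = 1, acc = 0;              /* term holds w^(j*k) */
        for (size_t j = 0; j < n; j++) {
            acc = (uint32_t)(((uint64_t)acc + mul_mod(in[j], term)) % MOD);
            term = mul_mod(term, wk);
        }
        out[k] = acc;
    }
}

/* Linear convolution y = x * h for values whose results stay below MOD.
   Returns the output length x_len + h_len - 1, or 0 on invalid sizes. */
size_t convolve(const uint32_t *x, size_t x_len,
                const uint32_t *h, size_t h_len, uint32_t *y)
{
    if (x_len == 0 || h_len == 0)
        return 0;
    size_t out_len = x_len + h_len - 1, n = 1;
    while (n < out_len)
        n <<= 1;                                 /* next power of two */
    if (n > MAX_N)
        return 0;

    uint32_t xp[MAX_N] = {0}, hp[MAX_N] = {0}, X[MAX_N], H[MAX_N];
    for (size_t i = 0; i < x_len; i++) xp[i] = x[i] % MOD;  /* zero padding */
    for (size_t i = 0; i < h_len; i++) hp[i] = h[i] % MOD;

    uint32_t w = pow_mod(3, (MOD - 1) / (uint32_t)n);  /* n-th root of unity */
    transform(xp, X, n, w);                      /* forward transform of x */
    transform(hp, H, n, w);                      /* forward transform of h */
    for (size_t i = 0; i < n; i++)
        X[i] = mul_mod(X[i], H[i]);              /* point-wise product */

    /* Inverse transform: use w^-1 and divide by n, the counterpart of the
       IPP_FFT_DIV_INV_BY_N normalization flag in the listing above. */
    transform(X, xp, n, pow_mod(w, MOD - 2));
    uint32_t n_inv = pow_mod((uint32_t)n, MOD - 2);
    for (size_t i = 0; i < out_len; i++)
        y[i] = mul_mod(xp[i], n_inv);
    return out_len;
}
```

For example, convolving the signal {1, 2, 3} with the filter {1, 1} pads both to length 4 and returns {1, 3, 5, 3}, the same result as direct time-domain filtering. A production implementation would use a fast transform and, when the filter is fixed, compute its transform only once.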

Running Intel IPP Performance Test Tool
The Intel IPP Performance Test Tool is available for Windows* operating systems based on Intel® Pentium® processors and Intel® Itanium® processors. It is a full-featured timing system designed to do performance testing for Intel IPP functions on the same hardware platforms that are valid for the related Intel IPP libraries. It contains command line programs for testing the performance of each Intel IPP function in various ways.

You can use command line options to control the course of tests and generate the results in a desirable format. The results are saved in a .csv file. The course of timing is displayed on the console and can be saved in a .txt file. You can create a list of functions to be tested and set required parameters with which the function should be called during the performance test. The list of functions to be tested and their parameters can either be defined in the .ini file, or entered directly from the console.

In the enumeration mode, the Intel IPP performance test tool creates a list of the timed functions on the console and in the .txt or .csv files. Additionally, this performance test tool provides all performance test data in the .csv format. It contains data covering all domains and CPU types supported in Intel IPP. For example, you can read that reference data in sub-directory \tools\perfsys\data.

Once the Intel IPP package is installed, you can find the performance test .exe files located in the \ia32\tools\perfsys directory. For example, ps_ipps.exe is a tool to measure performance of the Intel IPP signal processing functions. Similarly, there are the appropriate executable files for each Intel IPP functional domain.


The command line format is:

<ps_FileName>.exe [option_1] [option_2] … [option_n]
A short reference for the command line options can be displayed on the console. To invoke the reference, just enter -? or -h in the command line:

ps_ipps.exe -h
The command line options can be divided into six groups by their functionality. You can enter options in an arbitrary order with at least one space between each option name. Some options (like -r, -R, -o, -O) may be entered several times with different file names, and option -f may be entered several times with different function patterns. For detailed descriptions of the performance test tool command line options, see Appendix A, “Performance Test Tool Command Line Options”.

Examples of Using Performance Test Tool Command Lines
The following examples illustrate how you can use common command lines for the performance test tool to generate IPP function performance data.

Example 1. Running in the standard mode:
ps_ippch.exe -B -r
This command causes all Intel IPP string functions to be tested by the default timing method on standard data (-B option). The results will be generated in file ps_ippch.csv (-r option).

Example 2. Testing selected functions:
ps_ipps.exe -fFIRLMS_32f -r firlms.csv
This command tests the FIR filter function FIRLMS_32f (-f option), and generates a .csv file named firlms.csv (-r option).

Example 3. Retrieving function lists:
ps_ippvc.exe -e -o vc_list.txt
This command causes the output file vc_list.txt (-o option) to list all Intel IPP video coding functions (-e option).

ps_ippvc.exe -e -r H264.csv -f H264
This command causes the list of functions with names containing H264 (-f option) that can be tested (-e option) to be displayed on the console and stored in file H264.csv (-r option).


Example 4. Launching performance test tool with the .ini file:
ps_ipps.exe -B -I
This command causes the .ini file ps_ipps.ini to be created after the first run (-I option) to test all signal processing functions using the default timing method on standard data (-B option).

ps_ipps.exe -i -r
This command causes the second run to test all functions using the timing procedure and all function parameter values specified in the ps_ipps.ini file (-i option) and generates the output file ps_ipps.csv (-r option). For detailed descriptions of performance test tool command line options, see Appendix A, “Performance Test Tool Command Line Options”.


Chapter 8: Using Intel® IPP with Programming Languages

This chapter describes how to use Intel IPP with different programming languages in the Windows* OS development environments, and gives information on relevant samples.

Language Support
In addition to the C programming language, Intel IPP functions are compatible with the following languages (download the samples from http://www.intel.com/software/products/ipp/samples.htm):

Table 8-1 Language support

Language | Environment | The Sample Description
C++ | Makefile | The sample shows how Intel IPP C-library functions can be overloaded in the C++ interface to create classes for easy signal and image manipulation.
C++ | Microsoft* eMbedded Visual C++* 4.0, or higher; WinCE 5.0 SDK for x86 | The sample shows how to use the Intel IPP libraries in the development of applications for the Windows* CE environment for x86.
Fortran | Makefile | N/A
C#* | Microsoft .NET C#* | The sample shows how to use the Intel IPP functions in a C# wrapper class.
Visual Basic* | Microsoft .NET Visual Basic* | The demo-application shows how to call Intel IPP functions from a Visual Basic wrapper class.
Object Pascal | Borland Delphi* | The sample shows how to use Intel IPP image processing primitives in Borland Delphi*.
Java* | Java Development Kit 1.5.0 | The sample shows how to use the Intel IPP image processing functions in a Java wrapper class.

Using Intel IPP in Java* Applications
You can call Intel IPP functions in your Java application by using the Java* Native Interface (JNI*). There is some overhead associated with JNI use, especially when the input data size is small. Combining several functions into one JNI call and using managed memory will help improve the overall performance.


Appendix A: Performance Test Tool Command Line Options

Table A-1 gives brief descriptions of possible command line options for the performance test tool (PTT).

Table A-1 Performance Test Tool Command Line Options

Option | Description

1. Adjusting Console Input
-A | Ask parameters before every test from console
-B | Batch mode

2. Managing Output
-r[<file-name>] | Create .csv file and write PS results
-R[<file-name>] | Add test results to .csv file
-H[ONLY] | Add 'Interest' column to table file [and run only hot tests]
-o[<file-name>] | Create .txt file and write console output
-O[<file-name>] | Add console output to .txt file
-L<ERR|WARN|PARM|INFO|TRACE> | Set detail level of the console output
-u[<file-name>] | Create .csv file and write summary table ('_sum' is added to default title name)
-U[<file-name>] | Add summary table to .csv file ('_sum' is added to default title name)
-e | Enumerate tests and exit
-g[<file-name>] | Signal file is created just at the end of the whole testing
-s[-] | Sort or don't sort functions (sort mode is default)

3. Selecting Functions for Testing
-f <or-pattern> | Run tests of functions with pattern in name, case sensitive
-f-<not-pattern> | Do not test functions with pattern in name, case sensitive
-f+<and-pattern> | Run only tests of functions with pattern in name, case sensitive
-f=<eq-pattern> | Run tests of functions with this full name, case sensitive
-F<func-name> | Start testing from function with this full name, case sensitive

4. Operation with .ini Files
-i[<file-name>] | Read PTT parameters from .ini file
-I[<file-name>] | Write PTT parameters to .ini file and exit
-P | Read tested function names from .ini file

5. Adjust default directories and file names for input & output
-n<title-name> | Set default title name for .ini file and output files
-p<dir-name> | Set default directory for .ini file and input test data files
-l<dir-name> | Set default directory for output files

6. Direct Data Input
-d<name>=<value> | Set PTT parameter value

7. Process priority
-Y<HIGH/NORMAL> | Set high or normal process priority (normal is default)

8. Setting environment
-N<num-threads> | Call ippSetNumThreads(<num-threads>)

9. Getting help
-h | Type short help and exit
-hh | Type extended help and exit
-h<option> | Type extended help for the specified option and exit


Appendix B: Intel® IPP Samples

This appendix describes the types of Intel® IPP sample code available for developers to learn how to use Intel IPP, gives the source code example files by categories with links to view the sample code, and explains how to build and run the sample applications. For information on configuring Microsoft* Visual* C++ project files for the Intel IPP samples, see “Creating Visual C++ 2005 Project Files for the Intel® IPP Samples”. For information on generating Microsoft Visual C++ .NET project and solution files for Intel IPP UMC sample code, see “Building a Microsoft* Visual C++ .NET* Solution for the UMC Sample Code”.

Types of Intel IPP Sample Code
There are three types of Intel IPP sample code available for developers to learn how to use the Intel Integrated Performance Primitives. Each type is designed to demonstrate how to build software with the Intel IPP functions. All types are listed in Table B-1.

Table B-1 Types of Intel IPP Sample Code

Type | Description
Application-level samples | These samples illustrate how to build a wide variety of applications such as encoders, decoders, viewers, and players using the Intel IPP APIs.
Source Code Samples | These platform independent examples show basic techniques for using Intel IPP functions to perform such operations as performance measurement, time-domain filtering, affine transformation, canny edge detection, and more. Each example consists of 1-3 source code files (.cpp).
Code examples | These code examples (or code snippets) are very short programs demonstrating how to call a particular Intel IPP function. Numerous code examples are contained in the Intel IPP Manual (.pdf) as part of the function descriptions.


NOTE. Intel IPP samples are intended only to demonstrate how to use the APIs and how to build applications in different development environments.

Source Code Samples
Table B-2 presents the list of files with source code for the Intel IPP samples. All these samples are created for Windows* OS, but they can be easily adapted for Linux* OS.

Table B-2 Source Files of the Intel IPP Sample Code

Category: Basic Techniques
Summary: Introduction to programming with Intel IPP functions
Description and Links:
• Performance measurement: GetClocks.cpp
• Copying data: Copy.cpp
• Optimizing table-based functions: LUT.cpp

Category: Digital Filtering
Summary: Fundamentals of signal processing
Description and Links:
• Executing the DFT: DFT.cpp
• Filtering with FFT: FFTFilter.cpp
• Time-domain filtering: FIR.cpp

Category: Audio Processing
Summary: Audio signal generation and manipulation
Description and Links:
• Generating DTMF tones: DTMF.cpp
• Using IIR to create an echo: IIR.cpp
• Using FIRMR to resample a signal: Resample.cpp

Category: Image Processing
Summary: Creating and processing a whole image or part of an image
Description and Links:
• Allocating, initializing, and copying an image: Copy.cpp
• Rectangle of interest sample wrapper: ROI.h ROI.cpp ROITest.cpp
• Mask image sample wrapper: Mask.h Mask.cpp MaskTest.cpp

Category: Image Filtering and Manipulation
Summary: General image affine transformations
Description and Links:
• Wrapper for resizing an image: Resize.h Resize.cpp ResizeTest.cpp
• Wrapper for rotating an image: Rotate.h Rotate.cpp RotateTest.cpp
• Wrapper for doing an affine transform on an image: Affine.h Affine.cpp AffineTest.cpp

Category: Graphics and Physics
Summary: Vector and small matrix arithmetic functions
Description and Links:
• ObjectViewer application: ObjectViewerDoc.cpp ObjectViewerDoc.h ObjectViewerView.cpp ObjectViewerView.h
• Transforming vertices and normals: CTestView::OnMutateModel
• Projecting an object onto a plane: CTestView::OnProjectPlane
• Drawing a triangle under the cursor: CTestView::Draw
• Performance comparison, vector vs. scalar: perform.cpp
• Performance comparison, buffered vs. unbuffered: perform2.cpp

Category: Special-Purpose Domains
Summary: Cryptography and computer vision usage
Description and Links:
• RSA key generation and encryption: rsa.cpp rsa.h rsatest.cpp bignum.h bignum.cpp
• Canny edge detection class: canny.cpp canny.h cannytest.cpp filter.h filter.cpp
• Gaussian pyramids class: pyramid.cpp pyramid.h pyramidtest.cpp

Using Intel IPP Samples
Download the Intel IPP samples from http://www.intel.com/software/products/ipp/samples.htm. These application-level samples are updated in each version of Intel IPP, so it is strongly recommended that you upgrade the Intel IPP samples whenever a new version of Intel IPP is available. Several common samples are also included with the product in the <ipp directory>\samples directory.

System Requirements
Refer to the readme.htm document in the root directory of each sample to learn the system requirements for that specific sample. The most common requirements are listed below.

Hardware requirements:
• A system based on an Intel® Pentium® processor, Intel® Xeon® processor, or a subsequent IA-32 architecture-based processor

Software requirements:
• Intel® IPP for the Windows* OS, version 6.1
• Microsoft* Windows Vista*, Microsoft Windows XP*, Microsoft Windows Server* 2008, or Microsoft Windows Server 2003 operating system
• Microsoft* DirectX* API: 9.0 SDK Update (February 2005) or SDK (December 2005)
• Intel® C++ Compiler for Windows* OS: versions 11.1, 11.0 or 10.1
• Intel® Parallel Composer
• Microsoft Visual C++* .NET 2008, or Microsoft Visual C++ .NET 2005 development systems
• When building for a processor supporting the Intel® 64 architecture, the Microsoft EM64T Platform SDK is required.
• When building for an Intel® Itanium® 2 processor, the Platform SDK for Microsoft Windows Server 2003 SP1 is required.
• Microsoft eMbedded Visual C++ 4.0 tool with Service Pack 4 (SP4).
• When building for the Windows CE OS version 5.0 for x86, the Standard Software Development Kit (SDK) for Windows CE 5.0 operating system is required.

Building Source Code
The build procedure is described in the readme.htm document for each sample. The most common steps are described below.
Set up your build environment by creating an environment variable named IPPROOT that points to the root directory of your Intel IPP installation. For example: C:\Program Files\Intel\IPP\6.1.x.xxx\ia32\.
To build the sample, change your current folder to the root sample folder and run the batch file build32.bat [option]. By default, the batch file searches for compilers in the order listed in the table below (assuming that each compiler is installed in its default directory). If you wish to use a specific version of the Intel C/C++ compiler or the Microsoft C/C++ .NET 2005 compiler, set an option for the batch file from the table below.

Table B-3 Options for Batch File

Option     Compiler
icl111     Intel C++ Compiler 11.1 for Windows OS
icl110     Intel C++ Compiler 11.0 for Windows OS
icl101     Intel C++ Compiler 10.1 for Windows OS
ipc2009    Intel Parallel Composer
cl9        Microsoft Visual C++ .NET 2008 software
cl8        Microsoft Visual C++ .NET 2005 software
cl7        Microsoft Visual C++ .NET 2003 software
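The build steps above (set IPPROOT, change to the sample root folder, run build32.bat with an optional compiler option) can be sketched as a small planning helper. This is a minimal illustration in Python and is not part of the Intel IPP samples: the function name plan_build and the example paths are hypothetical, and only the IPPROOT variable, the build32.bat file name, and the option names are taken from this guide.

```python
import os

# Option names as listed in Table B-3 of this guide.
BUILD_OPTIONS = ("icl111", "icl110", "icl101", "ipc2009", "cl9", "cl8", "cl7")


def plan_build(sample_root, option=None, env=None):
    """Return (working_dir, argv) for building one sample.

    Nothing is executed here; the helper only checks the preconditions
    described in the text and assembles the command line.
    """
    env = os.environ if env is None else env
    # The guide requires IPPROOT to point at the Intel IPP root directory.
    if not env.get("IPPROOT"):
        raise RuntimeError("IPPROOT is not set")
    # With no option, the batch file picks a compiler on its own.
    if option is not None and option not in BUILD_OPTIONS:
        raise ValueError("unknown build option: %r" % (option,))
    argv = ["build32.bat"] if option is None else ["build32.bat", option]
    return sample_root, argv
```

On a Windows system, the returned pair could then be handed to something like subprocess.run(["cmd", "/c"] + argv, cwd=working_dir); whether that fits a given sample should be checked against its readme.htm.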

NOTE. If you use the Windows CE OS version 5.0, you must set PLATFORM in wceplatform.bat to the target platform. Make sure you use a valid SDK for your platform. You will probably need to set the SDKROOT environment variable manually, for example: set SDKROOT=C:\Program Files\Windows CE Tools

After a successful build, the output file or files are placed in the corresponding sample directory: <install_dir>\ipp-samples\<sample-name>\bin\win32_<compiler>, where compiler = cl7|cl8|cl9|ipc2009|icl101|icl110|icl111.
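The output location pattern above can be expressed as a small helper, which may be handy in scripts that collect build results. This is a minimal sketch in Python under stated assumptions: the function name sample_output_dir and the example directory names are hypothetical, and only the path pattern and the compiler names come from this guide.

```python
import ntpath  # Windows path rules, usable on any host OS

# Compiler identifiers as listed in this guide.
COMPILERS = ("cl7", "cl8", "cl9", "ipc2009", "icl101", "icl110", "icl111")


def sample_output_dir(install_dir, sample_name, compiler):
    """Compose <install_dir>\\ipp-samples\\<sample-name>\\bin\\win32_<compiler>."""
    if compiler not in COMPILERS:
        raise ValueError("unknown compiler: %r" % (compiler,))
    return ntpath.join(install_dir, "ipp-samples", sample_name,
                       "bin", "win32_" + compiler)


# Hypothetical install directory and sample name, for illustration only.
print(sample_output_dir(r"C:\work", "demo-sample", "cl9"))
```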

Running the Software
To run each sample application, the Intel IPP dynamic link libraries must be on the system’s path. See “Setting Environment Variables” for more details. Refer to the readme.htm document in the directory of the specific sample for detailed instructions on how to run the application, the list of command line options, or menu commands.
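Before launching a sample, it can help to confirm that the directory holding the Intel IPP dynamic link libraries really is on the path. The following is a minimal sketch in Python, not an Intel-provided tool: the function name dir_on_path is hypothetical, and the directory passed in is whatever location holds the DLLs in your installation (the paths in the usage line are made up for illustration).

```python
import ntpath  # Windows path rules, usable on any host OS
import os


def dir_on_path(directory, path_value=None):
    """Return True if `directory` is an entry of a Windows-style PATH string.

    Comparison ignores case, trailing backslashes, and surrounding quotes.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = ntpath.normcase(ntpath.normpath(directory))
    for entry in path_value.split(";"):
        entry = entry.strip().strip('"')
        if entry and ntpath.normcase(ntpath.normpath(entry)) == wanted:
            return True
    return False


# Hypothetical directories, for illustration only.
print(dir_on_path(r"C:\ipp\bin", r"C:\Windows;C:\IPP\bin\;C:\tools"))
```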

Known Limitations
The applications created with the Intel IPP Samples are intended to demonstrate how to use the Intel IPP functions and help developers to create their own software. These sample applications have some limitations that are described in the section “Known Limitations” in the readme.htm document for each sample.


Index

B
building application, 2-9
building samples, B-5

C
calling functions, 2-10
checking your installation, 2-8
configuring environment, 4-1
controlling number of threads, 6-1

D
detecting processor type, 5-2
Disabling Multithreading, 6-2
dispatching, 5-1
document
    audience, 1-3
    filename, 3-4
    location, 3-1
    organisation, 1-4
    purpose, 1-2

H
header files, 2-10

I
Intel® IPP, 1-1
IntelliSense capability, 4-3
IntelliSense, in Visual Studio IDE, 4-3

J
java applications, 8-2

L
language support, 8-1
library dependencies by domain, 5-13
linking
    custom dynamic, 5-9
    dynamic, 5-5
    static, with dispatching, 5-5
    static, without dispatching, 5-7
linking examples, 5-14
linking models, 5-4
linking models comparison, 5-10

M
managing performance, 7-1
    memory alignment, 7-1
    reusing buffers, 7-4
    thresholding data, 7-3
    using FFT, 7-5
memory alignment, 7-1

N
notational conventions, 1-5

O
OpenMP support, 6-1

P
performance test tool, 7-6
    command line examples, 7-7
    command line options, A-1
processor-specific codes, 5-1

R
reusing buffers, 7-4
running samples, B-4
    known limitations, B-6

S
Sample, B-1
samples types, B-1
selecting
    libraries, 5-10
    linking models, 5-4
setting environment variables, 2-9
source code samples, B-2
structure
    by library types, 3-2
    documentation directory, 3-4
    high-level directory, 3-1
    supplied libraries, 3-2

T
technical support, 1-1
threading, 6-1
thresholding data, 7-3

U
UMC sample code, 4-2
using
    DLLs, 3-2
    static libraries, 3-3
using FFT, 7-5
using Intel IPP
    with compilers, 4-6
    with Java, 8-2
    with programming languages, 8-1
    with Visual C++.NET, 4-1

V
version information, 2-9

