Intel® C/C++ Compiler Demos at CGDC’98

Intel will be showing two compiler demos at CGDC’98, both of which showcase the latest Intel® C/C++ Compiler technology. These demos are the TimeDCT Demo and the Intel JPEG Library Looping Demo. The TimeDCT Demo and complete source code are available for download from this web page.

TimeDCT Demo Program

The TimeDCT demo program is a Windows* MFC application that demonstrates the performance capabilities of the Intel C/C++ Compiler, focusing on MMX™ technology. The demo calls an inverse Discrete Cosine Transform (iDCT) routine that is coded in four different ways and displays the time each version takes to execute.

The four methods of coding are:

  • Standard C
  • MMX technology assembly code
  • Intrinsic functions for MMX technology (supported only by the Intel C/C++ Compiler)
  • Ivec vector class library for MMX technology (supported only by the Intel C/C++ Compiler)

Three pieces of the demo come from the Intel JPEG Library: the C version of the algorithm, iDCT_AAN() in the aan_idct.cpp file; the MMX technology assembly version, MMX_iDCT8X8AAN() in the maan_idct.cpp file; and the timing code, clear/start/stoptimer() in the timing.cpp file. The library contains both hand-optimized MMX technology assembly routines and non-MMX technology C routines for encoding and decoding JPEG images, chosen according to the type of processor. It was developed in the Intel Architecture Labs and has been highly tuned for Intel Pentium® II processors.
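
For orientation, the transform that all four versions compute can be written directly from its definition. The sketch below is a naive 8-point one-dimensional inverse DCT in standard C++. It is not the AAN factorization used by iDCT_AAN(), and the name idct8_reference is hypothetical. An 8x8 block transform applies such a routine to every row and then to every column.

```cpp
#include <cmath>

// Naive 8-point 1-D inverse DCT with the usual JPEG scaling:
//   out[n] = 1/2 * sum over k of c(k) * in[k] * cos((2n + 1) * k * pi / 16),
// where c(0) = 1/sqrt(2) and c(k) = 1 for k > 0.
// Hypothetical reference routine for illustration only; the demo's
// iDCT_AAN() uses the faster AAN factorization, which is not shown here.
void idct8_reference(const double in[8], double out[8]) {
    const double pi = std::acos(-1.0);
    for (int n = 0; n < 8; ++n) {
        double sum = 0.0;
        for (int k = 0; k < 8; ++k) {
            const double ck = (k == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            sum += ck * in[k] * std::cos((2 * n + 1) * k * pi / 16.0);
        }
        out[n] = 0.5 * sum;
    }
}
```

With only the first (DC) coefficient set, every output sample takes the same value, which makes a convenient sanity check for any faster version.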

The intrinsics for MMX technology let the user code in C or C++, eliminating the time required to hand-optimize assembly language while offering nearly the same performance. There are intrinsics for virtually all of the MMX instructions; the programmer chooses the instructions, and the compiler allocates registers and schedules them. The intrinsics version of the iDCT, MMXIntrin_iDCT8X8AAN() in the iaan_idct.cpp file, implements the same algorithm as the assembly version but was coded in a fraction of the time, and it executes only about 7% slower.
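
As an illustration of the intrinsics style, the sketch below performs a four-lane 16-bit butterfly (a sum and a difference), the basic step of a fast DCT. It assumes a compiler that provides <mmintrin.h> and defines __MMX__, as GCC and Clang do on x86, and it falls back to scalar code elsewhere. The name butterfly4_i16 is hypothetical and is not taken from iaan_idct.cpp.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__MMX__)
#include <mmintrin.h>
#endif

// Computes sum[i] = a[i] + b[i] and diff[i] = a[i] - b[i] for four
// 16-bit lanes (wrapping, not saturating, arithmetic).
void butterfly4_i16(const std::int16_t a[4], const std::int16_t b[4],
                    std::int16_t sum[4], std::int16_t diff[4]) {
#if defined(__MMX__)
    __m64 va, vb;
    std::memcpy(&va, a, sizeof va);            // load four 16-bit lanes
    std::memcpy(&vb, b, sizeof vb);
    const __m64 vsum = _mm_add_pi16(va, vb);   // corresponds to PADDW
    const __m64 vdiff = _mm_sub_pi16(va, vb);  // corresponds to PSUBW
    std::memcpy(sum, &vsum, sizeof vsum);
    std::memcpy(diff, &vdiff, sizeof vdiff);
    _mm_empty();  // EMMS: release the MMX state before any floating-point code
#else
    // Scalar fallback for targets without MMX intrinsics.
    for (int i = 0; i < 4; ++i) {
        sum[i] = static_cast<std::int16_t>(a[i] + b[i]);
        diff[i] = static_cast<std::int16_t>(a[i] - b[i]);
    }
#endif
}
```

The code names the instructions it wants but never names a register; that part is left to the compiler.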

The vector class library for MMX technology wraps the intrinsics in C++ classes with overloaded operators, so developers can program in natural C++ without worrying about which intrinsic function to use. Because of object construction and copying, C++ can incur some overhead beyond that seen with intrinsics, but the classes further reduce the time spent programming for MMX technology and (in the future) the Katmai New Instructions. The Intel C/C++ Compiler supports three versions of the vector classes, one for each of the three data types supported by MMX technology: char, short, and int. The MMXIvec_IDCT8X8AAN() routine in the vaan_idct.cpp file uses the short version for 16-bit integers, I16vec4.
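
The idea behind the vector classes can be sketched with a small wrapper that overloads arithmetic operators on four 16-bit lanes. The Vec4i16 class below is a hypothetical, portable stand-in written for this illustration. It is not Intel's I16vec4, and it uses plain loops where a real vector class would call intrinsics.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 4 x 16-bit vector class (not Intel's I16vec4).
struct Vec4i16 {
    std::array<std::int16_t, 4> lane;
};

// Lane-wise addition; a real vector class would map this to one instruction.
inline Vec4i16 operator+(const Vec4i16& a, const Vec4i16& b) {
    Vec4i16 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r.lane[i] = static_cast<std::int16_t>(a.lane[i] + b.lane[i]);
    }
    return r;
}

// Lane-wise subtraction.
inline Vec4i16 operator-(const Vec4i16& a, const Vec4i16& b) {
    Vec4i16 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r.lane[i] = static_cast<std::int16_t>(a.lane[i] - b.lane[i]);
    }
    return r;
}

// With the operators in place, a butterfly reads like scalar arithmetic.
inline void butterfly_vec(const Vec4i16& a, const Vec4i16& b,
                          Vec4i16& sum, Vec4i16& diff) {
    sum = a + b;
    diff = a - b;
}
```

The temporaries returned by operator+ and operator- are the kind of object construction and copying that can add overhead relative to calling intrinsics directly.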

TimeDCT Demo Download

To run the demo, you need Windows* 95 or Windows NT* 4.0 running on a processor that supports MMX technology. Follow these steps:

  1. Download the TimeDCTDownload.zip file.
  2. Unzip the TimeDCTDownload.zip file.
  3. Run TimeDCT.exe in the Release directory.

The "Timing" menu lets you select among the different versions of the iDCT. For each version, the demo displays the clock counts required to make 100,000 calls to the routine, along with that count expressed as a percentage of the count for the assembly version. The download includes a Microsoft Visual C++* 5.0 project. To build it, unfortunately, you will need version 3.0 of the Intel C/C++ Compiler, which is not scheduled to go to beta until later this summer. At CGDC’98 we will be using an internal alpha version.
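
The figures the demo reports can be reproduced in outline as follows. This is a sketch only: it uses std::chrono::steady_clock ticks in place of the processor clock counts measured by the demo's timing code, and the names time_calls and percent_of_baseline are hypothetical.

```cpp
#include <chrono>
#include <cstdint>

// Calls fn() `calls` times and returns the elapsed steady_clock ticks.
// Assumption: wall-clock ticks stand in for processor clock counts.
template <typename Fn>
std::int64_t time_calls(Fn fn, int calls) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < calls; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    return static_cast<std::int64_t>((stop - start).count());
}

// Expresses a count as a percentage of the baseline (assembly) count.
// The baseline must be nonzero.
double percent_of_baseline(std::int64_t count, std::int64_t baseline) {
    return 100.0 * static_cast<double>(count) / static_cast<double>(baseline);
}
```

For example, a version that needs 107 ticks against an assembly baseline of 100 would be reported as 107%, which corresponds to running about 7% slower than the assembly version.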

The Intel JPEG Library Looping Demo

This demo can be seen only at CGDC’98. The demo was originally developed by the Intel Architecture Labs and uses the Intel JPEG Library. The demo program demonstrates the difference in performance that the coding method makes for the iDCT algorithm as applied to a complete application.

The program decodes 45 different JPEG images and displays them in a window along with the time taken to decode and display them. Even though the iDCT routine makes up only about 30% of the computation time in the whole program, the difference between using the C version and the others is very noticeable. The difference between using the assembly version and the intrinsics or Ivec versions, by contrast, is negligible, and the coding time for the intrinsics and Ivec versions was much shorter than that for the hand-tuned assembly version.
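
The effect of a 30% share can be estimated with Amdahl's law. The sketch below assumes the 30% figure is the iDCT's share of run time in the version being sped up. The name overall_speedup and the example factor of 4 are hypothetical, not measurements from the demo.

```cpp
// Amdahl's law: overall speedup when a fraction f of the run time is
// accelerated by a factor s and the remaining (1 - f) is unchanged.
// Assumes 0 <= f <= 1 and s > 0.
double overall_speedup(double f, double s) {
    return 1.0 / ((1.0 - f) + f / s);
}
```

Under these assumptions, overall_speedup(0.30, 4.0) is about 1.29, and no iDCT speedup can push the whole program beyond 1 / 0.7, or about 1.43. By the same reasoning, an iDCT that is 7% slower at a 30% share slows the whole program by only about 2%, which fits the negligible difference observed among the three faster versions.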


* Legal Information © 1998 Intel Corporation