Technical Note   : SLE00012
Author           : Scott Evans
Created/Modified : 20/03/2000
Description      : Code profiling
This technical note describes how a simple profiler can be used to
help visualise how well (or badly) your code is performing. Code for
such a profiling tool is provided, along with an example program,
with source, that shows the profiler in use.
The theory behind the profiler is really simple. Root counter 1 is
set by default to count scanlines. This gives us an easy way to time
our code. All we need to do is record the value of the root counter
before the start of the code we want to profile and then again at the
end of the code. The end time minus the start time gives us
approximately how long the code has taken to execute, in scanlines.
The profiler then scales this value and uses it to draw a bar on the
screen. You can then see how long a particular piece of code is
taking, which is very useful when trying to decide which functions
need to be optimised.
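The timing idea above can be sketched in plain C as follows. This is a minimal illustration, not the code from profile.c: the function names are hypothetical, and it assumes the root counter is a 16-bit value that wraps back to zero, so the subtraction is done in 16-bit unsigned arithmetic. On the Yaroze the two counter values would come from reading root counter 1 before and after the code being timed.

```c
#include <stdint.h>

/* Hypothetical helper: scanlines elapsed between two samples of a
   16-bit counter (assumed width). The cast keeps the result correct
   even if the counter wrapped past 0xFFFF once between the samples. */
static uint16_t elapsed_scanlines(uint16_t start, uint16_t end)
{
    return (uint16_t)(end - start);
}

/* Hypothetical helper: scale a scanline count to a bar width in
   pixels, where full_scale scanlines map to max_width pixels.
   The result is clamped so the bar never runs off the screen. */
static int bar_width(uint16_t scanlines, int full_scale, int max_width)
{
    long w = ((long)scanlines * max_width) / full_scale;
    return (w > max_width) ? max_width : (int)w;
}
```

For example, if half of the scanlines chosen as full scale have elapsed, bar_width() returns half of max_width, so the bar reaches the middle of the screen.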
The time taken by the GPU to draw the primitives is a little harder
to determine. Normally this is easy: you get the GPU to generate an
interrupt when it has finished drawing, and have the interrupt
handler call a function that records the value of root counter 1.
The Yaroze libraries are more limited, however, and the function
DrawSyncCallback(), which would be used to set up that interrupt, is
not available. At the moment I am not sure how to get around this,
but I am working on it. For now the drawing time is set to the
maximum value.
So, enough theory. Let's get down to business.
The profiler is contained in its own source file profile.c. To
add the profiler to your projects
you will need to add profile.c to your build and include profile.h in any files
that reference
the profile functions. You will also need to add the files gtypes.h (type
definitions) and fp.h
(fixed point macros) to your project.
Before the profiler can be used it needs to be initialised once
at the start of the program.
Calling PROFILE_Initialise() will initialise the profiler. It takes
three parameters: the screen width and height, which are used to
scale the profile bars so they fit the screen, and the maximum
number of frames the bars can display.
Example
To initialise the profiler for a 320x256 display and set the
maximum profile time to 3 frames:
    PROFILE_Initialise(320,256,3);
The initialise stage sets the default position of the profile
bars and calculates the number of
scanlines in one frame. It also sets a scale value to scale the
profile bars so they fit the current
screen width.
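The scale calculation described above might look something like the sketch below. It is an assumption about how such a scale could be computed, not the code from profile.c: the function names are hypothetical, the 16.16 fixed-point format is assumed (the macros in fp.h may use a different format), and the number of scanlines per frame is passed in as a parameter rather than calculated.

```c
#include <stdint.h>

#define FP_SHIFT 16   /* assumed 16.16 fixed point; fp.h may differ */

/* Hypothetical: pixels per scanline as a fixed-point value, chosen so
   that 'frames' whole frames span the full screen width. */
static int32_t profile_scale(int screen_width, int scanlines_per_frame,
                             int frames)
{
    int32_t total = (int32_t)scanlines_per_frame * frames;
    return (int32_t)(((int64_t)screen_width << FP_SHIFT) / total);
}

/* Hypothetical: convert a time in scanlines to a width in pixels
   using the scale above. The fractional part is truncated. */
static int scaled_width(int32_t scale, int scanlines)
{
    return (int)(((int64_t)scale * scanlines) >> FP_SHIFT);
}
```

Working out the scale once at initialisation means each reading only costs a multiply and a shift when the bars are drawn, which keeps the profiler's own overhead small.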
Once the profiler has been initialised you must call PROFILE_Start()
at the beginning of
each frame. This is usually just after the call to VSync().
Example
    while(1)
    {
        VSync(0);
        PROFILE_Start();
    }
This marks the start of a frame and sets an internal variable start_count to
the current
value of root counter 1. We do this so all our times will be
based on this reference count and
we do not need to reset the root counter every frame.
The function PROFILE_Read() is used to record the time taken for a
piece of code to execute. It takes three parameters, the red, green
and blue components of the colour to use for that section of the
profile bar. Depending on the number of readings taken, the CPU
profile bar is split into sections, each with its own colour. This
means you can time lots of different functions and assign a
different colour to each.
Example
    while(1)
    {
        PROFILE_Read(0x0,0x0,0x0);
        TestFunction1();
        PROFILE_Read(0x80,0x0,0x0);
        TestFunction2();
        PROFILE_Read(0x0,0x80,0x0);
        VSync(0);
        PROFILE_Start();
    }
So in the above example the red section of the CPU profile bar is
the time taken to execute the function TestFunction1() and the green
section is the time taken for TestFunction2() to execute. The black
section represents the time taken for the code between
PROFILE_Start() and the first PROFILE_Read().
There is a limit to the number of readings that can be taken in one
frame. It is set to 128 by default but can be changed by setting the
macro MAX_READINGS in profile.h.
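One way the readings could be stored is sketched below. The names sketch_start() and sketch_read(), and the layout of the Reading structure, are hypothetical and are not taken from profile.c. The sketch also assumes a 16-bit counter and that readings beyond MAX_READINGS are simply ignored; the real profiler may behave differently.

```c
#include <stdint.h>

#define MAX_READINGS 128   /* default limit quoted in the text */

typedef struct {
    uint16_t time;         /* scanlines since the start of the frame */
    uint8_t  r, g, b;      /* colour of this section of the bar */
} Reading;                 /* hypothetical layout */

static Reading  readings[MAX_READINGS];
static int      num_readings;
static uint16_t start_count;

/* Start of frame: remember the reference count, discard old readings. */
static void sketch_start(uint16_t counter_now)
{
    start_count  = counter_now;
    num_readings = 0;
}

/* Record one reading relative to start_count.
   Returns 1 on success, 0 if the table is already full. */
static int sketch_read(uint16_t counter_now,
                       uint8_t r, uint8_t g, uint8_t b)
{
    Reading *rd;

    if (num_readings >= MAX_READINGS)
        return 0;

    rd = &readings[num_readings++];
    rd->time = (uint16_t)(counter_now - start_count);
    rd->r = r;
    rd->g = g;
    rd->b = b;
    return 1;
}
```

Storing each time relative to the start of the frame means the length of one coloured section is just the difference between two neighbouring readings.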
The last thing to do is draw the profile bars. The bars are drawn
using the GsBOXF primitive.
The function PROFILE_Draw() adds the profile bars to the ordering table which is
passed in
as a parameter.
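How a list of readings might be turned into coloured bar sections can be sketched as follows. This is only an illustration under stated assumptions: the Reading and Segment structures and build_segments() are hypothetical, Segment is a plain stand-in for the GsBOXF primitive, and the step that adds each box to the ordering table is left out because it depends on the Yaroze libraries.

```c
typedef struct {
    int time;                 /* scanlines since the start of the frame */
    unsigned char r, g, b;    /* colour passed with the reading */
} Reading;                    /* hypothetical */

typedef struct {
    int x, w;                 /* left edge and width in pixels */
    unsigned char r, g, b;
} Segment;                    /* hypothetical stand-in for GsBOXF */

/* Build one segment per reading. Each segment runs from the previous
   reading's edge to this reading's edge, so neighbouring segments
   touch without gaps. full_scale scanlines map to max_width pixels.
   Returns the number of segments written to out. */
static int build_segments(const Reading *rd, int count, int x0,
                          int full_scale, int max_width, Segment *out)
{
    int i, prev_edge = 0;

    for (i = 0; i < count; i++) {
        int edge = (int)(((long)rd[i].time * max_width) / full_scale);

        if (edge > max_width) edge = max_width;   /* stay on screen */
        if (edge < prev_edge) edge = prev_edge;   /* never go backwards */

        out[i].x = x0 + prev_edge;
        out[i].w = edge - prev_edge;
        out[i].r = rd[i].r;
        out[i].g = rd[i].g;
        out[i].b = rd[i].b;
        prev_edge = edge;
    }
    return count;
}
```

Computing each edge from the cumulative time, rather than summing individually rounded widths, stops rounding errors from building up along the bar.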
Example
    while(1)
    {
        PROFILE_Read(0x0,0x0,0x0);
        TestFunction1();
        PROFILE_Read(0x80,0x0,0x0);
        TestFunction2();
        PROFILE_Read(0x0,0x80,0x0);
        while(DrawSync(1));
        PROFILE_Draw(ot);
        VSync(0);
        PROFILE_Start();
    }
Right then, that is all there is to it. You will find the complete
source code to the profiler, and the example program that
demonstrates the profiler in action, below. If you have any problems
let me know. Likewise, if you have any suggestions or improvements,
or know of a better way to do this, then let me know.
The source code and an example.