![[Images/tutorial.gif]](Images/tutorial.gif)
![[bar]](Images/rule.gif)
This section is about making programs run faster by changing the code. Unfortunately, the changes suggested here make the code either less flexible or more complicated, and they help only if the bottlenecks in the processing of your data are caused by I/O operations. Worse, there is no real guarantee that the program will run faster: it is up to you to try and see whether the changes are worth making.
Why bother, then? Again, the answer is up to you. If you think that the ideas here could improve your programs and are willing to try them, this page can give some hints. These hints can make your programs either more flexible but more complicated (treating data differently according to its data type) or more restrictive (treating the data as you expect it to be). This distinction will be easier to understand with the examples.
Speed problems are often related to memory: if you don't have enough memory, the kernel (?) will try to allocate memory in the swap region, which means lots of disk reading and writing (at least on a FreeBSD system). Some of the hints here are about how to avoid using too much memory while at the same time minimizing calls to kpds_get_data and kpds_put_data.
Before seeing what can be done in the code, let's look at a related topic.
![[bar2]](Images/rule2.gif)
If you're creating cantata workspaces with lots of files being passed from glyph to glyph, and it is taking too much time, you could consider using other data transports instead of the default files between glyphs. The Khoros data transport mechanism supports normal Unix files, pipes, virtual memory, shared memory, sockets and streams (depending on your Unix machine and Khoros configuration).
Here are some results I got converting a 288 kbyte RGB image to CMYK and back to RGB using the Color Toolbox. Times were measured with the csh built-in time command; I assume that the real time is the one that counts, since it includes I/O:
| Data transport mechanism | RGB to CMYK | CMYK to RGB |
| --- | --- | --- |
| File in the /tmp directory | 7.07 | 4.37 |
| Shared memory | 5.92 | 4.17 |
| Virtual memory (mmap) | 5.25 | 4.54 |
I was unable to get the other transport mechanisms to work, probably because I don't know enough about Unix to make the proper selections. I also got errors running the programs with shared and virtual memory for larger images. The bottom line is that it could make things faster when you have a workspace that reads and writes several small files, but again your mileage may vary, and the only way to see if it works is by trying.
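If you prefer to measure from inside a program instead of using the shell's time command, the distinction between CPU time and real time can be sketched as below. This is only an illustration in plain C: `wall_seconds`, `time_call` and `dummy_work` are hypothetical names, not Khoros functions, and `dummy_work` stands in for whatever code you want to time.

```c
#include <stddef.h>
#include <time.h>
#include <sys/time.h>

/* Wall-clock ("real") time in seconds, as reported by gettimeofday(). */
double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Run work() once and report the CPU time and the real time it took.
   Time spent waiting for I/O shows up in *real_s but not in *cpu_s. */
void time_call(void (*work)(void), double *cpu_s, double *real_s)
{
    clock_t c0 = clock();
    double  w0 = wall_seconds();

    work();

    *real_s = wall_seconds() - w0;
    *cpu_s  = (double)(clock() - c0) / CLOCKS_PER_SEC;
}

/* Hypothetical workload: replace it with the code to be measured. */
void dummy_work(void)
{
    volatile unsigned long sum = 0;
    unsigned long i;
    for (i = 0; i < 1000000UL; i++)
        sum += i;
}
```

A program dominated by disk access would typically show a real time much larger than its CPU time, which is the case this page is concerned with.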
For more information on the data transport mechanisms, please see the kman page for the kfopen and kopen functions.
![[bar]](Images/rule.gif)
Consider the following code:
```c
kobject in_obj    = NULL;  /* an input object */
kobject out_obj   = NULL;  /* an output object */
double  *inplane  = NULL;  /* data for an input plane */
double  *outplane = NULL;  /* data for an output plane */
int     w,h,d,t,e,type;    /* data attributes */

in_obj  = kpds_open_input_object("myobj.kdf");    /* open the input object */
out_obj = kpds_open_output_object("newobj.kdf");  /* open the output object */
kpds_get_attribute(in_obj,KPDS_VALUE_SIZE,&w,&h,&d,&t,&e);  /* get the object dimensions */
kpds_get_attribute(in_obj,KPDS_VALUE_DATA_TYPE,&type);      /* get the object stored data type */
if (type == KCOMPLEX || type == KDCOMPLEX)  /* won't process if data type is complex */
  {
  kprintf("Error, cannot process complex data.\n");
  kexit(KEXIT_FAILURE);
  }
kpds_set_attribute(in_obj,KPDS_VALUE_DATA_TYPE,KDOUBLE);  /* will force internal processing as double */
kpds_copy_object(in_obj,out_obj);  /* quick way to initialize the output object */
inplane  = (double *)kmalloc(w*h*sizeof(double));  /* one w x h plane of doubles */
outplane = (double *)kmalloc(w*h*sizeof(double));
/* do some processing in the planes here for all d*t*e planes -
   you must use kpds_get_data and kpds_put_data with the
   KPDS_VALUE_PLANE attribute */
kpds_close_object(in_obj);  /* close input object */
kpds_set_attribute(out_obj,KPDS_VALUE_DATA_TYPE,type);  /* reset data type to be the same as input */
kpds_close_object(out_obj);  /* close output object */
```
In the code above, data is processed as double regardless of its type (if it is complex the program will give an error message and won't process it). If we pass it a kobject with dimensions 1024x1024x1x1x3 (say, a medium RGB image), then even when processing the data by planes of 1024x1024, the memory allocated for the input and output planes adds up to 16 megabytes (two buffers of 1024 x 1024 doubles at 8 bytes each). That could be a heavy load for some machines, especially if there are other memory-consuming processes running at the same time.
Note that the call to kpds_set_attribute(out_obj,KPDS_VALUE_DATA_TYPE,type) can save disk space (and the time required to write it) if the input data is of a smaller type (long or int): the generated file will be smaller than the one we would get if we left the KPDS_VALUE_DATA_TYPE of the output object set to KDOUBLE (the value it inherited when we set it on the input object and then copied that object to the output). On the other hand, one must consider which processing is to be done on the data. Resetting the type won't make much sense if we want to extract the componentwise square root of the input: in this case the input could be integer but the output should be floating point. In other words, whether to reset the data type of the output object depends on the application.
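The memory figure can be checked with a few lines of plain C. This is just a sketch of the arithmetic; `plane_buffers_bytes` is a hypothetical helper, not part of Khoros.

```c
#include <stddef.h>

/* Bytes needed for `nbuffers` plane buffers of w x h elements each. */
size_t plane_buffers_bytes(size_t w, size_t h, size_t elem_size, size_t nbuffers)
{
    return w * h * elem_size * nbuffers;
}
```

With 8-byte doubles, `plane_buffers_bytes(1024, 1024, 8, 2)` gives 16777216 bytes, that is, 16 megabytes for the input and output planes together.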
Now consider the modified code:
```c
kobject in_obj    = NULL;  /* an input object */
kobject out_obj   = NULL;  /* an output object */
int     *inplane  = NULL;  /* data for an input plane */
int     *outplane = NULL;  /* data for an output plane */
int     w,h,d,t,e,type;    /* data attributes */

in_obj  = kpds_open_input_object("myobj.kdf");    /* open the input object */
out_obj = kpds_open_output_object("newobj.kdf");  /* open the output object */
kpds_get_attribute(in_obj,KPDS_VALUE_SIZE,&w,&h,&d,&t,&e);  /* get the object dimensions */
kpds_get_attribute(in_obj,KPDS_VALUE_DATA_TYPE,&type);      /* get the object stored data type */
if (type == KCOMPLEX || type == KDCOMPLEX)  /* won't process if data type is complex */
  {
  kprintf("Error, cannot process complex data.\n");
  kexit(KEXIT_FAILURE);
  }
if (type == KDOUBLE || type == KFLOAT ||  /* will process but warns the user */
    type == KLONG   || type == KULONG ||
    type == KUINT)  /* unsigned int range is actually different from signed int range */
  {
  kprintf("The data type of the input object requires more precision than the signed integer type.\n");
  kprintf("The object will be processed but some rounding errors and/or precision loss may occur.\n");
  }
kpds_set_attribute(in_obj,KPDS_VALUE_DATA_TYPE,KINT);  /* will force internal processing as integer */
kpds_copy_object(in_obj,out_obj);  /* quick way to initialize the output object */
inplane  = (int *)kmalloc(w*h*sizeof(int));  /* one w x h plane of ints */
outplane = (int *)kmalloc(w*h*sizeof(int));
/* do some processing in the planes here for all d*t*e planes -
   you must use kpds_get_data and kpds_put_data with the
   KPDS_VALUE_PLANE attribute */
kpds_close_object(in_obj);  /* close input object */
kpds_set_attribute(out_obj,KPDS_VALUE_DATA_TYPE,type);  /* reset data type to be the same as input */
kpds_close_object(out_obj);  /* close output object */
```
The memory required for the same kobject mentioned before adds up to 8 megabytes (two planes of 1024 x 1024 four-byte integers), half of the memory used when the data was processed as double. It seems an advantage to use integer data for processing, but on the other hand, if the data type of the kobject is double or float, the data will lose precision when cast to integer.
There are two options here. One is to declare that the program will work only with integer data and warn the user about floating point data types, as we did in the code above; this is simpler and quicker to program but limits the program's ability to work with general data. The other is to prepare a more complex program that deals with all (or at least some) data types, casting to the closest supported data type, so that it uses less memory if the data type is integer and switches to more memory (and precision) if the data type is floating point. The following code illustrates this approach, where four main data types will be considered: unsigned char (byte), integer, long and double:
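To get a feeling for how much a given data set would suffer, one could count the values that do not survive a conversion to int. The sketch below uses a plain C cast, which truncates the fractional part; it is an assumption that this matches what happens to your data, since the conversion Khoros performs may round or clip differently. The function names are hypothetical.

```c
#include <stddef.h>
#include <limits.h>

/* Returns 1 if v survives a round trip through int unchanged, 0 otherwise. */
int fits_int_exactly(double v)
{
    if (!(v >= (double)INT_MIN && v <= (double)INT_MAX))
        return 0;               /* out of range (or NaN): cannot be stored */
    return (double)(int)v == v; /* a fractional part is dropped by the cast */
}

/* Counts the values in a plane that would change if processed as int. */
size_t count_lossy_values(const double *plane, size_t n)
{
    size_t i, lossy = 0;
    for (i = 0; i < n; i++)
        if (!fits_int_exactly(plane[i]))
            lossy++;
    return lossy;
}
```

If the count is zero for typical inputs, processing as int costs nothing in precision; otherwise the warning printed by the code above is deserved.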
```c
kobject in_obj  = NULL;  /* an input object */
kobject out_obj = NULL;  /* an output object */
unsigned char *UCinplane = NULL,*UCoutplane = NULL;  /* data for planes of unsigned char */
int    *Iinplane = NULL,*Ioutplane = NULL;  /* data for planes of int size */
long   *Linplane = NULL,*Loutplane = NULL;  /* data for planes of long size */
double *Dinplane = NULL,*Doutplane = NULL;  /* data for planes of double size */
klist  *objlist = NULL;  /* list of stuff that will be free'd at the end */
int w,h,d,t,e,type;      /* data attributes */
int processtype = KDOUBLE;  /* internal type for processing; falls back to
                               the widest type if none of the tests match */
int cw,ch,cd,ct,ce;      /* counters for the dimensions */

in_obj  = kpds_open_input_object("myobj.kdf");    /* open the input object */
out_obj = kpds_open_output_object("newobj.kdf");  /* open the output object */
kpds_get_attribute(in_obj,KPDS_VALUE_SIZE,&w,&h,&d,&t,&e);  /* get the object dimensions */
kpds_get_attribute(in_obj,KPDS_VALUE_DATA_TYPE,&type);      /* get the object stored data type */
if (type == KCOMPLEX || type == KDCOMPLEX)  /* won't process if data type is complex */
  {
  kprintf("Error, cannot process complex data.\n");
  kexit(KEXIT_FAILURE);
  }
if (type == KDOUBLE || type == KFLOAT || type == KULONG)  /* data types KDOUBLE, KFLOAT and KULONG */
  processtype = KDOUBLE;  /* will be processed as KDOUBLE */
if (type == KLONG || type == KUINT)  /* data types KLONG, KUINT */
  processtype = KLONG;    /* will be processed as KLONG */
if (type == KINT || type == KUSHORT || type == KSHORT)  /* data types KINT, KUSHORT, KSHORT */
  processtype = KINT;     /* will be processed as KINT */
if (type == KUBYTE || type == KBYTE || type == KBIT)  /* data types KUBYTE, KBYTE, KBIT */
  processtype = KUBYTE;   /* will be processed as KUBYTE */
kpds_set_attribute(in_obj,KPDS_VALUE_DATA_TYPE,processtype);  /* will force internal processing as the chosen type */
kpds_copy_object(in_obj,out_obj);  /* quick way to initialize the output object */
switch(processtype)  /* allocate the planes only for the type that will be used */
  {
  case KUBYTE:
    {
    UCinplane  = (unsigned char *)kmalloc(w*h*sizeof(unsigned char));
    UCoutplane = (unsigned char *)kmalloc(w*h*sizeof(unsigned char));
    break;
    }
  case KINT:
    {
    Iinplane  = (int *)kmalloc(w*h*sizeof(int));
    Ioutplane = (int *)kmalloc(w*h*sizeof(int));
    break;
    }
  case KLONG:
    {
    Linplane  = (long *)kmalloc(w*h*sizeof(long));
    Loutplane = (long *)kmalloc(w*h*sizeof(long));
    break;
    }
  case KDOUBLE:
    {
    Dinplane  = (double *)kmalloc(w*h*sizeof(double));
    Doutplane = (double *)kmalloc(w*h*sizeof(double));
    break;
    }
  }
/* do the processing according to the data type */
switch(processtype)  /* get the plane of data according to its type */
  {
  case KUBYTE:
    {
    /* do some processing in the planes here for all d*t*e planes -
       you must use kpds_get_data and kpds_put_data with the
       KPDS_VALUE_PLANE attribute, getting the data in the variable
       UCinplane and putting data with the variable UCoutplane */
    break;
    }
  case KINT:
    {
    /* do some processing in the planes here for all d*t*e planes -
       you must use kpds_get_data and kpds_put_data with the
       KPDS_VALUE_PLANE attribute, getting the data in the variable
       Iinplane and putting data with the variable Ioutplane */
    break;
    }
  case KLONG:
    {
    /* do some processing in the planes here for all d*t*e planes -
       you must use kpds_get_data and kpds_put_data with the
       KPDS_VALUE_PLANE attribute, getting the data in the variable
       Linplane and putting data with the variable Loutplane */
    break;
    }
  case KDOUBLE:
    {
    /* do some processing in the planes here for all d*t*e planes -
       you must use kpds_get_data and kpds_put_data with the
       KPDS_VALUE_PLANE attribute, getting the data in the variable
       Dinplane and putting data with the variable Doutplane */
    break;
    }
  }
kpds_close_object(in_obj);  /* close input object */
kpds_set_attribute(out_obj,KPDS_VALUE_DATA_TYPE,type);  /* reset data type to be the same as input */
kpds_close_object(out_obj);  /* close output object */
(void)klist_free(objlist,(kfunc_void)lkcall_free);  /* free list of stuff */
```
This approach attempts to save memory to some extent: if the data type is int, it will not generalize and process as double; if the data type is unsigned char (byte), it will be processed as unsigned char and not as int. The code recognizes four general data types, and the input data type is cast to the closest one that does not lose precision. That can save some memory, since the buffers are allocated only for one particular type and not for all possible types.
The drawback is that the code is much more complicated. The second switch command hides the real processing of the program, which could consist of three for loops (for the depth, time and elements dimensions), getting the data from the input object with kpds_get_data, processing it, and putting it in the output object with kpds_put_data. All of this is repeated four times (once for each supported data type), with minor differences in the data manipulation.
The idea could be extended to all data types supported by Khoros, so that if the input object data type is float, it will be processed as float, and so on. That will further increase the complexity of the code, but will make it able to process any type of input data without loss and without using more memory than necessary.
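The type selection itself can be isolated in one small function, which also makes it easy to see that every stored type ends up with a defined processing type. The sketch below is plain C and does not use the Khoros headers: the `T_...` constants are hypothetical stand-ins for the Khoros data type constants, and the grouping simply repeats the one used in the code above.

```c
/* Hypothetical stand-ins for the Khoros data type constants; in a real
   program the constants from the Khoros headers would be used instead. */
typedef enum {
    T_BIT, T_BYTE, T_UBYTE, T_SHORT, T_USHORT, T_INT, T_UINT,
    T_LONG, T_ULONG, T_FLOAT, T_DOUBLE, T_COMPLEX, T_DCOMPLEX
} data_type;

/* Returns the type to process in, or -1 for types that are rejected. */
int choose_process_type(data_type stored)
{
    switch (stored) {
    case T_BIT: case T_BYTE: case T_UBYTE:
        return T_UBYTE;
    case T_SHORT: case T_USHORT: case T_INT:
        return T_INT;
    case T_UINT: case T_LONG:
        return T_LONG;
    case T_ULONG: case T_FLOAT: case T_DOUBLE:
        return T_DOUBLE;
    case T_COMPLEX: case T_DCOMPLEX:
    default:
        return -1;  /* complex or unknown: the caller should give up */
    }
}
```

Using a switch with a default branch means an unexpected type is reported instead of leaving the processing type undefined.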
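One possible way to avoid writing the same loop four times by hand is to let the C preprocessor generate one function per data type. The sketch below is only an illustration: the macro and function names are hypothetical, the operation (halving each value) is a placeholder for the real processing, and the calls to kpds_get_data and kpds_put_data that would surround it are left out.

```c
#include <stddef.h>

/* Generates one plane-processing function per element type.  The body
   (halve each value) is only a placeholder for the real processing. */
#define DEFINE_HALVE_PLANE(NAME, TYPE)                   \
    void NAME(const TYPE *in, TYPE *out, size_t n)       \
    {                                                    \
        size_t i;                                        \
        for (i = 0; i < n; i++)                          \
            out[i] = (TYPE)(in[i] / 2);                  \
    }

DEFINE_HALVE_PLANE(halve_plane_ubyte,  unsigned char)
DEFINE_HALVE_PLANE(halve_plane_int,    int)
DEFINE_HALVE_PLANE(halve_plane_long,   long)
DEFINE_HALVE_PLANE(halve_plane_double, double)
```

Each case of the second switch would then shrink to a call such as `halve_plane_int(Iinplane, Ioutplane, w*h)`. The price is that macro-generated code is harder to step through in a debugger.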
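If more types are added, the allocation part at least does not have to grow with them: a lookup of the element size lets a single untyped buffer replace the pairs of typed pointers. This is a sketch in standard C under stated assumptions: `process_type`, `element_size` and `alloc_plane` are hypothetical names, and `calloc` stands in for the kmalloc call used in the code above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the processing types used on this page. */
typedef enum { P_UBYTE, P_INT, P_LONG, P_DOUBLE } process_type;

/* Size in bytes of one element of the given processing type. */
size_t element_size(process_type t)
{
    switch (t) {
    case P_UBYTE:  return sizeof(unsigned char);
    case P_INT:    return sizeof(int);
    case P_LONG:   return sizeof(long);
    case P_DOUBLE: return sizeof(double);
    }
    return 0;  /* unknown type */
}

/* Allocates one zeroed w x h plane; the caller releases it with free(). */
void *alloc_plane(process_type t, size_t w, size_t h)
{
    size_t size = element_size(t);
    if (size == 0 || w == 0 || h == 0)
        return NULL;
    return calloc(w * h, size);
}
```

The buffer still has to be cast to the right pointer type before the data is touched, so this simplifies the allocation but not the processing loops.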
![[bar]](Images/rule.gif)
This section just presents some results of different methods to access data. The corresponding code is on the page Code Snippets - More examples for PDM and PDS. The data file has dimensions 128x128x1x3x6 (the dimensions that the call counts and memory sizes in the table correspond to) and data type integer. The following table shows the times and memory requirements for simple access to this data with different access methods (times are in seconds, will depend on your machine/OS/load/etc. and should be used only for comparison of the methods):
| Access by | Total calls to kpds_get_data | Integer data type (time) | Double data type (time) | Integer data type (memory) | Double data type (memory) |
| --- | --- | --- | --- | --- | --- |
| WxHxDxTxE Points | 294912 | 125.17 | 147.21 | 4 bytes (a single integer) | 8 bytes (a single double) |
| WxHxDxT Vectors | 49152 | 22.76 | 26.66 | 24 bytes | 48 bytes |
| DxTxE Planes | 18 | 0.56 | 0.79 | 65536 bytes | 131072 bytes |
| TxE Volumes | 18 | 0.58 | 0.77 | 65536 bytes | 131072 bytes |
| All data at once | 1 | 0.61 | 0.97 | 1179648 bytes | 2359296 bytes |
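The table above shows that reducing the number of calls to kpds_get_data pays off enormously, and it may suggest that getting all the data at once is the best choice. Note, however, that access by planes or volumes was already as fast as getting everything at once, while using 18 times less memory. Getting all the data at once will work well on machines with a lot of available RAM or with small objects. If the object is too large or there is not enough available RAM, the OS will probably try to allocate memory in the swap, which is on the hard disk, and that could slow down the program (I am not sure about this, but it seems to be consistent with some speed problems I had in FreeBSD).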
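The call counts and buffer sizes in the table follow directly from the object dimensions, so the trade-off can be estimated before writing any I/O code. The sketch below is plain C with hypothetical names (`access_cost`, `access_unit`, `cost_of_access`); it only does the arithmetic and does not call Khoros.

```c
#include <stddef.h>

typedef struct {
    size_t calls;          /* number of get-data calls needed */
    size_t bytes_per_call; /* size of the buffer each call fills */
} access_cost;

typedef enum { BY_POINT, BY_VECTOR, BY_PLANE, BY_VOLUME, BY_ALL } access_unit;

/* Cost of reading a w x h x d x t x e object with the given access unit. */
access_cost cost_of_access(access_unit unit, size_t w, size_t h, size_t d,
                           size_t t, size_t e, size_t elem_size)
{
    access_cost c = {0, 0};
    switch (unit) {
    case BY_POINT:  c.calls = w*h*d*t*e; c.bytes_per_call = elem_size;           break;
    case BY_VECTOR: c.calls = w*h*d*t;   c.bytes_per_call = e*elem_size;         break;
    case BY_PLANE:  c.calls = d*t*e;     c.bytes_per_call = w*h*elem_size;       break;
    case BY_VOLUME: c.calls = t*e;       c.bytes_per_call = w*h*d*elem_size;     break;
    case BY_ALL:    c.calls = 1;         c.bytes_per_call = w*h*d*t*e*elem_size; break;
    }
    return c;
}
```

For a 128x128x1x3x6 object with 4-byte integers, `cost_of_access(BY_PLANE, 128, 128, 1, 3, 6, 4)` gives 18 calls of 65536 bytes each, and `BY_POINT` gives 294912 calls of 4 bytes each, matching the rows of the table.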
![[bar]](Images/rule.gif)
Another way to possibly increase the processing speed of the programs is to compile them with optimization switches. Optimized programs should be smaller and execute faster (according to the gcc man page). Please check the man page for your compiler to see whether it supports optimization and which switches it uses.
You could probably also turn off the debugging information when you compile the program, which will make the final binary smaller and possibly increase speed. For gcc 2.7.2.1, the switch -g controls whether debugging information is included; please refer to the man page for your compiler to see how it deals with debugging information.
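Since the switches are set in configuration files rather than on a command line you type yourself, it can be useful to check whether optimization was actually in effect when a file was compiled. A minimal sketch, assuming gcc: its documentation lists `__OPTIMIZE__` as a macro that is predefined in all optimizing compilations; other compilers may not define it, and the function name is hypothetical.

```c
/* Returns 1 if this file was compiled with optimization enabled
   (as signalled by gcc's predefined __OPTIMIZE__ macro), 0 otherwise. */
int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}
```

Printing the result once at startup shows whether the optimization switch really reached the compile of your own code; it says nothing about how the libraries you link against were built.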
Now, the bad news. If you decide to change the parameters you will have to change the files $BOOTSTRAP/config/Site.YOURMACHINE and $BOOTSTRAP/config/YOURMACHINE.cf where YOURMACHINE is the name of your computer architecture (in my case, freebsd).
Please check the file $BOOTSTRAP/config/Site.YOURMACHINE: there is an option called #define UseOptimization (with lots of useful hints) that controls whether optimization is the default or not, and the hints in the file show how one can override the defaults to create optimized or debuggable programs. The problem is that your program will use Khoros functions, which were compiled according to the #define UseOptimization option in effect when Khoros was built. This means that even if you compile your programs with optimization, the Khoros libraries they use may not have been optimized. If you really want to use optimized code, your best bet would be to recompile the Khoros system.
The file $BOOTSTRAP/config/YOURMACHINE.cf also contains some #defines for gcc that define the level of optimization, but you should know which options are there and what they are for before changing them. Strong warning: messing with these parameters could cause compilation problems, so be sure of what you're doing before trying them!
Personally, I'd rather be able to debug programs than make them run a little faster. Again, the decision is up to you.
![[bar]](Images/rule.gif)