Programming Guidelines

From ReactOS
Jump to: navigation, search

Here are some hints to write good code, originally collected in the context of the ReactOS project.

For advice on writing secure code please see Secure Programming.

It's more important to be correct than to be fast

Premature optimization is the root of all evil.

With respect to writing time and memory efficient programs, I would put forward that one should not concern themselves with execution speed while building code. Such concerns will be mostly a waste of time. Build the program to be robust and easily maintainable, then if it exhibits performance that is not acceptable, profile it and improve only those sections that most need to be optimized. -RexJolliff

Small and smart solutions to solve a problem aren't a shame

Make your code small and smart while preserving its readability and understandability. Follow the guide line "less code – less bugs – more readable – easier to review". Think twice about your solution. Don't misunderstand this rule as optimization rule; maybe it affects the speed and size of the resulting application/driver/library, but that is not the (primary) intended reason. If the smart solution needs more changes, try to adapt as much as possible to make it smarter and more readable. Your credit is not derived from the count of lines you wrote.

Write thread safe code

Especially be aware of this if you write a library; your library may be used by an MT-application. This applies anyway to the kernel, since the kernel is a library which has to be thread-safe per se.

What differentiates thread safe code from thread unsafe code is the use of global variables. Thread safe code avoids nearly all global variables. This includes also local static variables and class variables. One has to decide carefully whether a variable or data structure is local, thread-global or process-global. Try to eliminate global variables. If this isn't possible and you need a thread-global variable then thread local storage (TLS) is the right thing. If you need a process global variable, then the right thing is a classic global variable. However don't forget to declare it with the volatile modifier. If you don't declare it as volatile, it may happen that two threads hold a copy of the variable in their contexts. So data consistency may not be guaranteed.

Global data structures are little more difficult. A thread-global structure like a list is no problem since only the owning thread knows about it. However a process-global structure which gets accessed by multiple threads needs more care. You have to synchronize the accesses of the threads with so called synchronization objects the kernel provides. These help your threads to guarantee a mutual exclusive access to common used data structures. The Mutex is commonly used for this purpose. One Mutex per list is mostly okay. However if such a structure gets accessed very hard, one should consider to use more than one Mutex. One per element is a waste of resources. So using Mutexes for several ranges would be a good compromise.

The same applies to kernel-mode. It's just harder to do and one uses spinlocks to even synchronize threads over multiple processors.

 volatile int really_global_i;
 DWORD tlsi = TlsAlloc();
 TlsSetValue( tlsi, 95 );

Short:

  • Avoid global variables.
  • Use TLS for thread-global variables.
  • Find the right synchronization granularity for global structures.
  • Write multiprocessor-aware code.

Use multithreading where possible and useful but judicious

Multithreading is not the holy grail. However there exist multiple examples out in the world where MT would have been appropriate. Win32-GUI apps usually have one GUI-thread and a bunch of workingthreads. It's also possible to have many GUI-threads but this gets complex very fast; if you don't take precautions it is not allowed to touch the GUI from more than one thread.

It's not that simple to write MT-GUI-apps. However some things are just intuitively threadable, so have a try.

Another thing are server applications. It's just a have to for server apps to be multi-threaded. Fortunately it is no big problem to make a server use multiple threads. However using too many threads is not good, either, a better strategy is to have a pool of sleeping server threads. If a request arrives, it is queued and dispatched to one of the threads. Win32 provides for this purpose the so called I/O-Completion ports. Using this mechanism, all this dispatch and queue work is a minor matter for you.

Using multiple threads which execute the same code in parallel leads us to another problem. If such a program is run on an SMP or an SMT capable system, a so called cache-aliasing happens. Processor caches are organized in so called cache lines, each of which consists of 32, 64 or more bytes. These lines are circular mapped to memory addresses. Result is that, addresses that map to the same cache line, repeat all 8 MB or so.

On SMT systems this means that two threads with the same memory access pattern hinder each other. Same memory access pattern happens if the same code was started nearly at the same time and the memory places are 8 MB off.

On SMT systems this means that if a thread reads a byte, the processor loads the whole cache line. The same happens on the other virtual processor but some MB off (which maps to the same cache-line). Next time the first processor acceses its address again. Now these two processors have to sync their caches with each other. This happens again and again. Result is that such a program runs faster on an uniprocessor system :-( This gets even worse on SMT-systems, because the two virtual processors share the same cache. This means also SMT and Cache-aliasing.

Try to open files shared (esp. SHAREMODE_DELETE)

Make programs support multiple instances

Avoid data corruption on crashes

It's not a nice thing if your application (or the system) crashes, and the user is left with corrupted files.

For the system, the bugcheck functions make sure the system stops, hopefully before damage is done. Use sanity checks to make sure you catch corruptions. For generic applications, it's nice to keep backup files so you have something to go back to in case of problems.

Write preemptive code

Use spinlocks rarely but where needed

Raise IRQL as short as possible – think of writing an DPC

Remember Unicode/non-Unicode

Do not make your programs use only UNICODE or only ANSI. Include <tchar.h> and use TCHAR instead of char, macro _T() for text strings, _tcscpy(), _tcscmp(), … or lstrcpy(), lstrcmp(), … macros for manipulating strings. To calculate length in TCHARs of the TCHAR array, use sizeof( array ) / sizeof( array[0] ) or sizeof( array ) / sizeof( TCHAR ).

Don't hardcode English phrases into source code – if you must, collect them in one place (easier to localize)

1 – An approach is to first write hardcoded phrases like we all usually use to do:

this should be the first step when dealing with bad written code.
let's forget unicode/nounicode now
    printf("my hardcoded string");

2 – Then make them macros like this:

    #define HARDSTRING "my hardcoded string"
    printf(HARDSTRING);

3 – and finally:

    char* AllStrings[NUMSTRINGS];
    #define HARDSTRING AllStrings[1]
 
    printf(HARDSTRING); // <--- not touched anymore

You could jump directly into step 2 when developing to save yourself some time avoiding you to then search for strings in your code.

Don't create a thread per connected user, use I/O Completion Ports instead

Having one thread to serve all users is a bad idea. It forces users to wait an undefined time rather than being served slower. A solution is to create one thread for every user. However this is a bad idea, too. It's OK if you serve a determinable maximum of queries. But if you can't determine how many queries will arrive, you better use the comfortable I/O Completion Ports. The thing with creating one and another server thread is: Mostly your thread does I/O operations (HD access). Invoking too many threads doesn't hurt the OS but your I/O subsystem. You get something like thrashing. The single thread's I/Os hinder each other to be finished and the whole thing gets slower and capacity melts.

So restricting to a number of server threads is the best idea. One could program such a behavior by hand or use the I/O completion ports. Example:

I/O completion ports are a great multithreaded model made to be fast and efficient. Put it this way. You create the threads once, that get wait there until they are required to do some job. They will eat 0 CPU time when waiting and will only wake up when needed. On the other side it gets the benefit of multithreaded code that won't cause delays when some operation blocks and then you could serve all those threads using only one thread for CPU. That means it won't be slower than having only one thread, even more your code probably behaves faster in SMP systems. They are of course harder to use than single threaded code or to create threads for each action or to have a pool of threads that wake from time to time checking for a job but actually the increase in complexity pays off.

Think of changing writing directions

Left to right reading order is not the only direction to read and write text on the globe. There exist also right to left reading order as in Hebrew and top to down reading order as in Japanese. Whereas one legal reading order in Japanese is also left to right. So you can restrict yourself to these two orders. This means that <help me>

Create a GUI with layoutmanager rather than pixel positioning of controls

If you use pixel positioning of controls, your GUI will no longer look right if the texts change due to localization, if the user selects a font you didn't expect (like large fonts for the visually impaired) and if the writing direction changes. Also probably your dialogs will not be resizable.

Write time and memory efficient programs

This conflicts with the earlier statement about optimizations. Keep in mind that that statement does not mean you should waste memory or cycles though.

Memory sizes get always bigger and processors get always faster. But programs get always less efficient. OK, enough. This seems to be just an advice, because everyone has to decide and design ones application by ones own. At least these hints: Keep in mind that you can use memory mapped files. This gives you easier access to your file data and you do not have to rebuild the whole data world in memory. This is a thing you should always avoid: like some bad games, occupy one gig on HD and if started, the same amount in swapspace. If you work object oriented, use references for object parameters. This avoids a copy constructor to be called twice. One time for the temporal object, one time for the higher level variable. Avoid copying of redundant data through an object hierarchy. This means, do not pass a set of variables to the next deeper object and so on but use an intelligent design with pointers.

Avoid memory leaks

Easier said than done. Memory leaks are always an offense. There's nothing one can really do to never have a memory leak. However take these hints. You can use a language which has a garbage collector, like Eiffel. In C, your only option is to be more careful and always write pairs of malloc and free or use reference counting. If you use C++ there exist the techniques of auto pointers and smart pointers. For how to use them, see the corresponding literature.

-- There are garbage collectors for C and C++ – Jakov

Put plenty of comments into your code

  • Think of block comments which explain a whole module and the play tog.

Find the right balance between abstraction and straight forward

Abstraction is a useful programming concept. It's easy to overdo it though.

Don't make redundant copies of program data

If you copy a piece of code you will have to maintain the code twice. Typically people working on your code will only work on one of the copies, so you have to make sure they keep in sync.

  • to be continued.......

If using assembler, always implement an alternative way in C or what you use

ReactOS is or will be a multi platform OS. Having nice fast assembler pieces for time critical operations is good. However, there has to be an alternative written in C, always. Only this guarantee enables ROS to compile on every of it's target platforms. At last one hint: The first goal is always to make a piece of code running. If we find this piece to be a bottleneck, we'll optimize it and possibly write it in assembler.

Preserve the meta information of files you use

On ReactOS file systems, files may have lots of additional information. This includes timestamps, attributes, access control lists, multiple filestreams and extended attributes. Not every file system provides every feature. But if it does, it is annoying for a user who works with these pieces of information, if a program destroys the meta information. At most this happens when copying or packing files or when saving a file. These programs just create a new file and store the content from the old one to the new. Afterwards they delete the old and rename the new one. Generally the optimal way would be to memory map the original and use its mapped data to directly for r/o access. If changes are required, because the user works with the program, a self made copy-on-write mechanism goes on and makes a working copy. If the user wants to save his changes, the changed structures are written back through the memorymapping.

If this is not an alternative for you, remember to use the parameter hTemplateFile of the CreateFile call.

Do not use absolute paths; don't imply path names

Ever used one of these programs that can only be installed on the C drive? Obvious but quite annoying, isn't it?

As you know, drive letters (what a pain) vary from system to system. So including drive letters in paths is bad. Rather use relative paths or local paths. Also paths are no axioms on windows systems. The „Program files“-directory for example changes it's spelling from localization to localization. On Win32 there do exist the functions GetWindowsDirectory, GetSystemDirectory, SHGetPathFromIDList, SHGetSpecialFolderPath and SHGetFolderPath. With these functions you can resolve the differences between the single installations and localizations.

You can use environment variables too.

Avoid directly assigning allocated memory to a pointer

Avoid things like this:

char *ptr;
ptr = malloc(size);

functions like malloc can return NULL anytime and unfortunately it will happen when you least expect it. Remember that in kernel mode there are malloc equivalents that behave the same way and that means a BSOD.

One suggestion is to call a function that always check for null before calling malloc like functions, that throws an exception back so you don't forget about the check. Of course you have to remember now to write the exception handler but this way you cover large blocks and save yourself some code and make the compiler throw a lot less conditional jumps. That means you write less code, with more chances to be correct and also faster!

Beware with the C syntax

One thing C syntax never needed is the possibility to write code like this:

 if (a = 5)
 {
 ...
 }

What does that means! To a programmer used to C that is very simple but for beginners is really confusing.

But the worst side effect is that probably you wanted to simply test if a was equal to 5 and is a common error to not write it the way you really wanted.

 if (a == 5)
 {
 ...
 }

Hungarian Notation Prefixes

Main article: Hungarian Notation

Large projects with many folks working on them should have common ground. Hunagrian Notation Prefixes naming convention provides marvelous consistency and self documentation.

Type Prefix Example
int n nCount
float f fBaby
double d dRoot
char c cKey
string (null terminated) sz szInput[]
global g_ g_nScore
pointer p (always first) *pnArray
struct S SCast
long l lHours
unsigned u uAge

See also

Personal tools
Namespaces

Variants
Actions
Navigation
Tools