Thursday, April 18, 2013

Algorithm Complexity and its Importance


1) Introduction:

    One thing I notice as I read forums and blogs is that a lot of people are moving to more modern and simpler languages than C, such as Python, Java and C#. Most of them don't even learn the basic concepts of algorithm complexity and data structures. I believe this trend is creating a generation of programmers who have no idea how to solve big problems by themselves.
   I want to be perfectly clear here: I am not saying that you shouldn't stick to your favorite language. I just strongly suggest that every programmer should at least learn the differences between data structures, when to use one over another and, especially, how their performance scales.
    I have lost count of how many times I have seen people who don't know the difference between a hash table and a binary tree, or who just believe that the <insert language name here> optimizer will do a better job than they could.
    In this post I will walk through a real-world problem I faced in my still few years as a professional programmer. I will not, of course, reveal the name or the context of the application.

2) The Problem:

    The system I was working on needs to keep track of the files that are saved on disk (in this case, an NFS). This is done by saving some data in a text file.
    The system is accessible to users via the network and, when a user logs in, the system loads the text file's information into memory. Since multiple processes access the data at the same time, the system needs to check whether any files were removed and update the internal list and the text file accordingly (we can't use locks in the system because of the NFS implementation used). Another important mechanism to avoid collisions was the filename pattern: it was based on the timestamp of the file's creation, the PID of the process that created it, the size of the file and the name of the machine that was running the process.
    One day I received the task of looking into the function that checks whether the files in the list still exist in the text file, because it was REALLY slow (several minutes in some cases). Since the program was accessed via the internet, the session would time out before the check finished and the user couldn't access their information.
   Please keep in mind that this is a pretty old system now and, when it was written, the number of files users could keep was really small. As it scaled over the years, the CPU usage got way too high and something had to be done.

3) The Original Approach

   The solution the program originally used was the most "intuitive" one. As a step-by-step description, it would be:
  1. Find the text file size using stat.
  2. Create a buffer with malloc that would fit the whole text.
  3. Load the text into the buffer.
  4. Mark each end of line as end of string.
  5. Read the nth entry of the files list (where n starts from 1)
  6. Check each line of the buffer to find whether it is the nth entry of the file list (there was a small optimization here: it would mark a line as read, so it wouldn't check the same line twice).
  7. Increase the value of n by one.
  8. If n < number of files in the list, go to 5, else go to 9.
  9. Free the buffer and finish.
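    The step-by-step above boils down to a nested linear scan. Here is a minimal sketch of the core of steps 5 to 8 (all names and data are made up for illustration, not the real system's code):

```c
#include <assert.h>
#include <string.h>

/* Linear search: compare the wanted name against every line of the buffer. */
int find_entry(char **lines, int n_lines, const char *name) {
    for (int i = 0; i < n_lines; i++)
        if (lines[i] != NULL && strcmp(lines[i], name) == 0)
            return i;            /* found: return the line index */
    return -1;                   /* scanned every line without a match */
}

/* For each file in the list, scan the whole buffer: O(n * m) comparisons. */
int count_found(char **lines, int n_lines, char **files, int n_files) {
    int found = 0;
    for (int n = 0; n < n_files; n++)
        if (find_entry(lines, n_lines, files[n]) >= 0)
            found++;
    return found;
}
```

With n files in the list and m lines in the buffer, this does up to n x m string comparisons, which is where the quadratic behavior discussed in section 7 comes from.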
    One important note is that the algorithm would also react to new files and to files that were not found, but this is not relevant here, as we are only interested in the search mechanism.
    So, why was this algorithm slow? The first reason is that it was using a linear search. For each entry in the file list, it would have to go through a lot of records; in the worst-case scenario (which would happen every time the file was not in the list), it would have to look at every record.
    The second is that the data wasn't binary: the algorithm was comparing two strings to check whether the name of the file matched an entry in the list.

4) Sandboxing and New Approaches:

    As I said before, this is a small part of a big program. Testing any change in the program itself would require me to create a user, create several fake files and go through the authentication process and the normal sequence of operations to get to the point where the problematic algorithm would run (not to mention that I could end up hitting a timeout).
    To avoid all those steps, I created a small standalone program that would test the algorithm for me. I also coded a Python script to generate a fake test list, as big as I needed. This was my sandbox environment.
    As for new approaches, I decided to use an ordered array (on which I could use a binary search), and my contractors asked me to try a hash table too. For the first solution, I used two different ordering methods: one parsing the entries' names so that I could handle the timestamps, sizes and PIDs as integers, and the other using the MD5 of the filename. I will explain the methods in more detail in the next two sections.

5) Hash Tables 101

    Most people have already used a hash table; they come in most languages as a ready-to-use container under several names, such as map, dict or hashmap. Still, very few people actually know what a hash table is or how it works.
    A hash table implementation is just an array plus a function that maps a key to its value's real index in the array. For example, let's say we need to map an ID to an employee name. Then we:
  • Create an array of n positions (let's use 211 for this example).
  • Create a function that would receive an integer key and map it to an index of the array (in this example, we are going to use id % 211).
    Let's consider the following entry data set:
  1. John Doe, id 67302
  2. Alice A., id 98321
  3. Floating Duck, id 48021
  4. Burning Witch, id 20109
  5. King Julian, id 32302
    Our function would convert those ids to the following indexes of the array: 204, 206, 124, 64 and 19 (respectively). To get an employee's name by id, all we have to do is feed the id to our hash function and return the data at that position of the array (extra steps ahead). So far we have seen that hash tables are fast at inserting and recovering data and, in this basic model, really easy to implement; now let's move on to their problems.
    The first problem is called a collision. To understand it, imagine we want to add a new Employee to the hash table: Jon Snow, id: 96657. When we apply our hash table function to Snow's id we get 19 as result, but the index 19 of the hash table is already being used by King Julian. In other words, a collision happens when two or more keys generate the same result.
    There are several ways to solve a collision, for instance: using a small list in each position instead of mapping the key directly to the wanted value, using the closest empty entry, and many others. The main point is that all of them introduce one big problem: they make the search for values slower.
   It is also important to notice that, since collisions may happen, we have to add one extra step to our search algorithm: we need to store the key in the entry and check that the value our hash function points us to is the one we are actually looking for, not just a collision. Keeping the key with the record is also what lets any collision strategy match the candidate entries against the one we want.
   The second problem is the limited space. As you add more entries, you will eventually need to expand the table (also, the more entries, the more likely collisions become). After the table expansion, the hash function must be updated so that its results cover the whole new size. For instance, in our example we are using the mod operation, which will never produce a number equal to or bigger than 211. So, if we want to increase the table to 483 positions, we will have to recalculate all the keys, which can take quite some time as the number of keys grows.
   Finally, the last problem is that the memory cost is pretty big: in our example we had to create 211 elements to use only 5. We need to create a lot of positions both to have a decent space to map our entries to and because we definitely want to avoid expanding the table.
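    To make the example concrete, here is a minimal sketch of such a table in C, using the id % 211 function from above and the "closest empty entry" strategy (linear probing) to resolve collisions. The employee data is the example set from this section:

```c
#include <assert.h>
#include <string.h>

#define TABLE_SIZE 211

struct slot {
    int id;               /* the key is stored so collisions can be told apart */
    const char *name;
    int used;
};

static struct slot table[TABLE_SIZE];

void hash_put(int id, const char *name) {
    int i = id % TABLE_SIZE;
    while (table[i].used)             /* collision: probe the next position */
        i = (i + 1) % TABLE_SIZE;
    table[i].id = id;
    table[i].name = name;
    table[i].used = 1;
}

const char *hash_get(int id) {
    int i = id % TABLE_SIZE;
    while (table[i].used) {
        if (table[i].id == id)        /* check the key, not just the index */
            return table[i].name;
        i = (i + 1) % TABLE_SIZE;     /* walk past colliding entries */
    }
    return NULL;                      /* hit an empty slot: not present */
}
```

Inserting King Julian (id 32302) and then Jon Snow (id 96657), both mapping to index 19, makes the second insert probe forward to slot 20; hash_get still finds either one because it compares the stored key.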
    
6) Binary search 101
    Binary search is a simple, yet very powerful way of reducing the search space when trying to find an element. Let's take a look at how it works.
    First, let's say we have an array of integers with these values:

318, 394, 73, 950, 286, 457, 825, 487, 504, 876,
 521, 560, 192, 1, 627, 483, 717, 338, 70, 541

    If we want to find 717 in this array, we have to look for it element by element until we find it, after seventeen tries. If we sort the array, we get this:

1, 70, 73, 192, 286, 318, 338, 394, 457, 483,
 487, 504, 521, 541, 560, 627, 717, 825, 876, 950

    If we look at a random position of the array, for instance the 7th one, we find the value 338. Knowing that the array is now sorted, we can be sure that the number we are looking for is not in the first seven positions, since 338 is smaller than 717 and every number before 338 is smaller still.
    This example illustrates how binary search works: in a single check, we eliminated several possible positions. A real search would actually look at the middle point of the array and set it as its new upper or lower limit, then get a new mid point and so on. Here is the step by step in this array:
  1. lower = 1 upper = 20 mid = 10. Value = 483.
  2. lower = 10 upper = 20 mid = 15. Value = 560.
  3. lower = 15 upper = 20 mid = 17. Value = 717.
    The problem with a binary search is that we need to sort the array before we can apply it. Also, we need either a key that is unique to each record or binary information that gives every record a unique way to identify it.
    We must keep in mind that none of these problems is especially hard to solve. The sort must be run only once, so it is not likely to have a huge impact. The unique key is usually easy to produce, even more so in this case, as the filename pattern was already designed to avoid that kind of clash.
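    The walkthrough above can be written as a small C function (this version uses 0-based indexes, so the 10th element is index 9):

```c
#include <assert.h>

/* Iterative binary search over a sorted int array; returns the index or -1. */
int binary_search(const int *a, int n, int target) {
    int lower = 0, upper = n - 1;
    while (lower <= upper) {
        int mid = lower + (upper - lower) / 2;   /* midpoint without overflow */
        if (a[mid] == target)
            return mid;
        if (a[mid] < target)
            lower = mid + 1;    /* target can only be in the upper half */
        else
            upper = mid - 1;    /* target can only be in the lower half */
    }
    return -1;                  /* the range collapsed: not present */
}

/* The sorted array from the example above. */
static const int sorted_example[20] = {
    1, 70, 73, 192, 286, 318, 338, 394, 457, 483,
    487, 504, 521, 541, 560, 627, 717, 825, 876, 950
};
```

binary_search(sorted_example, 20, 717) returns 16 (the 17th element) after only a handful of comparisons, against seventeen for the element-by-element scan.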

7) Implementations used and complexity:

   To test which algorithm to choose I coded three solutions:
  • A hash table, using the filename as a key.
  • A binary search using the name MD5 as a key.
  • A binary search using the name data as multiple keys.
    To implement the hash table solution I used the functions from the <search.h> header. The functions are: hcreate_r (to create the table, thread-safe), hsearch_r (to insert entries and to search, also thread-safe) and hdestroy_r (to free the used memory). In this implementation I also needed a char** to hold the keys; I marked the final entry as NULL so that I could free the keys later.
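    For illustration, here is a hedged sketch of how those functions fit together (the filename keys are made up; hcreate_r and friends are glibc extensions, so _GNU_SOURCE is required and the hsearch_data struct must be zeroed before use):

```c
#define _GNU_SOURCE    /* hcreate_r, hsearch_r and hdestroy_r are glibc extensions */
#include <assert.h>
#include <search.h>
#include <string.h>

static struct hsearch_data htab;

/* Fills the table with a couple of made-up filename keys; the data is an int tag. */
int table_init(void) {
    memset(&htab, 0, sizeof htab);     /* hsearch_data must be zeroed first */
    if (hcreate_r(32, &htab) == 0)
        return 0;
    static char *keys[] = { "1366281723_504_1024_hosta",
                            "1366300000_812_2048_hostb", NULL };
    for (long i = 0; keys[i] != NULL; i++) {
        ENTRY e = { keys[i], (void *)(i + 1) }, *ep;
        if (hsearch_r(e, ENTER, &ep, &htab) == 0)
            return 0;
    }
    return 1;
}

/* Returns the tag stored for a key, or -1 when the key is absent. */
long table_lookup(const char *key) {
    ENTRY query = { (char *)key, NULL }, *found = NULL;
    if (hsearch_r(query, FIND, &found, &htab) == 0)
        return -1;
    return (long)found->data;
}
```

When you are done, hdestroy_r(&htab) frees the table; unlike the real system, the keys here are string literals, so nothing else needs freeing.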
    To implement the binary search I used the <stdlib.h> header: the qsort function to sort the array and the bsearch function to execute the binary search. In the implementation that used MD5, I used the <openssl/md5.h> header, from OpenSSL, and its MD5 function to generate the keys.
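    Here is a minimal sketch of that qsort/bsearch combination over filename strings (the names below are invented, just following the timestamp_pid_size_host idea from section 2):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort/bsearch comparator: the arguments point at the array elements,
 * which are themselves char pointers. */
static int cmp_names(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Made-up filenames following the timestamp_pid_size_host pattern. */
static const char *files[] = {
    "1366300000_812_2048_hostb",
    "1366281723_504_1024_hosta",
    "1366295555_211_4096_hostc",
};
static const size_t n_files = sizeof files / sizeof files[0];

/* Sort once up front... */
void sort_files(void) {
    qsort(files, n_files, sizeof files[0], cmp_names);
}

/* ...then every lookup is an O(log n) binary search. */
const char *find_file(const char *wanted) {
    const char **hit = bsearch(&wanted, files, n_files,
                               sizeof files[0], cmp_names);
    return hit != NULL ? *hit : NULL;
}
```

Note that bsearch receives a pointer to the key pointer, because the comparator always dereferences one level to get at the strings.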
    From the complexity point of view, the average case for the search would be (assuming the parsing of the entries is always O(1)):
  • Original method: n * n = O(n^2)
  • Hash: n * (1 + n/k) = O(n)
  • Binary Search: n * log2n  = O(n * log2 n)
    And a worst case scenario:
  • Original method: n * n = O(n^2)
  • Hash: n * n = O(n^2)
  • Binary Search: n * log2 n = O(n * log2 n)
    As we can see, the average scenario tells us that the hash table is our best bet. In the worst case, though, the hash table degrades to the same big-O complexity as the original method, while the binary search keeps its O(n * log2 n).


8) Results and conclusions:

   Using my fake file list generator I created lists with 150, 1500, 15000 and 150000 entries. Here are the results (times are in microseconds; I used dots as thousands separators to make big numbers easier to read, not as decimal points):

150 entries:

Total time of the method 'Comparative Search' is 265 us.
Total time of the method 'Hash Table' is 169 us.
Total time of the method 'Binary Search' is 361 us.
Total time of the method 'Binary Search with md5' is 339 us.

1500 entries:

Total time of the method 'Comparative Search' is 21.885 us.
Total time of the method 'Hash Table' is 1.301 us.
Total time of the method 'Binary Search' is 1.616 us.
Total time of the method 'Binary Search with md5' is 1.747 us.

15000 entries:

Total time of the method 'Comparative Search' is 2.103.593 us.
Total time of the method 'Hash Table' is 34.684 us.
Total time of the method 'Binary Search' is 15.006 us.
Total time of the method 'Binary Search with md5' is 16.783 us.

150000 entries:

Total time of the method 'Comparative Search' is 215.411.798 us.
Total time of the method 'Hash Table' is 400.879 us.
Total time of the method 'Binary Search' is 162.278 us.
Total time of the method 'Binary Search with md5' is 188.064 us.

    Looking at those results we can clearly see that, as the number of entries increases, the time taken by the old algorithm (called Comparative Search) grows quadratically: every time the number of entries increases 10 times, the processing time for the comparative search increases nearly 100 times.
    Another thing to notice is that we would expect the hash table to be faster than the binary search, but the numbers don't reflect that. The reason is the time lost due to collisions. The lesson we should learn is that hash tables are not that great when you have a lot of insertions and you can't determine the keys beforehand.
   A final side note: some people may call me a cheater because I left the initialization times out of the measurements and quicksort's worst case is O(n^2). I want to make it clear that this worst case is actually impossible here due to the way the filename generation works. If you want to take initialization into account, the average initialization time for the hash table is about 10% of the search time, while for the binary search it is about 20%.

9) References and Further Reading:

Hash Tables: Wikipedia.
Big O Notation: Wikipedia.
Quicksort: Wikipedia.
Hash table C functions: search.h information, hcreate man page, hsearch man page and hdestroy man page.
Binary search C functions: qsort man page and bsearch man page.
This site, with visual comparisons of several different sorting algorithms, including several special cases.

Saturday, October 20, 2012

Using gdb to debug your code

Recently I had to adapt an open source project to the needs of the client I am currently working for. The way the code was written made it a nightmare to track the function calls and the values and types of the variables (the program was written using a LOT of void pointers and function pointers).

To help me understand the program's flow I used GDB (the GNU Debugger). It wasn't my first time using it, but in the past I mostly used it with core files. This time I learned some new tricks that I want to share.

In this post I will talk about the preparation and the process of using GDB. It will be divided into a few sections: introduction to GDB, the debug flag, the core files and useful commands.

1) Introduction to GDB:

GDB is a debugging program that allows you to retrieve a lot of information about your application. You can check your stack trace, inspect variable values, add breakpoints, run a program step by step and so on.

There are three ways of using GDB that I will explain in this post. The first one is running your program directly under it, which is useful when your program is not behaving as expected (so you can add breakpoints or run step by step) or when you can reproduce the case where it crashes. The second is when your program breaks sometimes, but you don't know how it broke (very common in client-server applications) - in this case you will need to allow your system to create a core file. The third method is attaching to a process by its PID, which is very useful to debug client-server applications that spawn child processes to run specific tasks.

2) The Debug Flag:

In order to use gdb you must have a few things set in your build. The most important one is turning on your compiler's debug flag.

If you are compiling your program on the command line, all you have to do is add a -g to it. For instance, if your line is:
gcc -O2 example.c -o example

It should become:
gcc -g -O2 example.c -o example

If you are using a Makefile, you have to do pretty much the same thing: find the line with the compiler flags and add the "-g" option to it. Don't forget to run make clean before running make, since changing only the Makefile won't make it recompile the project.

If you are using an interface, you will have to find where you turn the debug flag on. When I use code::blocks the flag is on Settings > Compiler and debugger... > Compiler settings > Compiler Flags; the option is "Produce debugging symbols  [-g]".

Finally, you should know that there are several debug flags. -g is the most general one and has always worked fine for me. If you want to learn more about the other debug flags, you may find a full explanation here.

3) Core Files

If your program is crashing under an unknown situation, you can configure your Linux system to create a core file. This file has all the information about the registers, variables and stack trace of the program at the moment it crashed, and can be used with GDB to find out why it crashed.

In order to create a core file you must first enable it. To do so you have to run the following command:
ulimit -c unlimited

To turn it off:
ulimit -c 0

This will allow your system to create core files of any size. It is important to notice that this limit only lasts for your current session (in other words, when you log out or turn off your PC, it resets to the default value) and is only valid for your user. If the program is run by another user, you must change this value under that user's session.

By default the core will be written to the running binary's directory, so the user running the program must have write permission there. If you don't want to change the permissions on the binary's directory (to avoid security issues), you can change the path by editing the configuration file at /proc/sys/kernel/core_pattern. You can also change the name format (for instance, Ubuntu's default format is just "core", which doesn't say much, huh?). I like to use core.<binary>.<pid> and to save them all in the tmp directory (so I don't have to remove them manually). To do so you must run:
echo "/tmp/core.%e.%p" | sudo tee /proc/sys/kernel/core_pattern

For more information on cores, you can check this link or simply type on your terminal:
man core

Finally, if your system is still not creating cores, it may be because your program changes its user during execution (again, to avoid security issues). In this case you must change your code to call the function prctl with PR_SET_DUMPABLE as the option and 1 as the first argument; the other arguments are not used. The line added should be (this function returns an integer, so you may want to check it):
prctl(PR_SET_DUMPABLE, 1, 0, 0, 0);

For more information on prctl check this link or type man prctl on your terminal.

4) Running GDB and Useful Commands:

As I said in the introduction, I will describe how to use GDB in three ways. If you just want to run your program simply use:
gdb ./<your binary>

If your application requires arguments to be passed, you must use:
gdb --args ./<your binary> <arg 1> <arg 2> ... <arg n>

If you are going to use a core file you need to run:
gdb -c <core file> ./<your binary>

Finally, if you want to attach to a running process, you should use:
gdb -p <process pid>

Below I have compiled a list of the commands that I find most useful (in no particular order). For convenience I will add numbers after each command representing which kinds of use it is valid for: (1) is for running your binary in gdb, (2) for running from a core file and (3) for attaching to a process.
  • break <file:line> - Add a breakpoint, the program will stop if that line is reached. (1)(3)
  • run - Start running the program. (1)
  • step - run one operation (mostly a code sequence with no function calls, such as an if with many statements or a mathematical operation; if there is a function call, it will step into the first operation of that function). (1)(3)
  • stepi - run one assembly operation (for instance, an if with only one simple comparison - in other words, comparing two variables, two constants or a variable and a constant - will take two stepi, one for the compare and another for the branch). (1)(3)
  • until - run the current line completely; for instance, if I have x = <some function>, it will run until x receives a value, no matter how many operations <some function> performs. (1)(3)
  • finish - run the current function until it returns (and prints the value returned). (1)(3)
  • continue - run the program until a breakpoint is reached or the program is finished.(1)(3)
  • bt - prints the stack trace. (1)(2)(3)
  • bt full - prints the stack trace and the values of the arguments of each function. (1)(2)(3)
  • print <variable name> - print the value of the variable. You may use pointer arithmetic (such as *<variable name>) and struct accessing (<variable name>.<element> or <variable name>-><element>). (1)(2)(3)
  • up - goes up one level of the stack. (2) 
  • down - goes down one level of the stack (2).
  • help - print a help message; you can also use help <command> to read a help message about that specific command, including arguments and syntax. (1)(2)(3)
Another useful tip: simply pressing enter repeats the last command. Also, there are shorter versions of some commands (such as "s" for step).

Well, that is it for today's post, hope it helps.

Saturday, September 22, 2012

A* implementation for grids in C


Hello there, for the very first post I wanted to do something big (and by that I mean something useful that would take some time to code properly). So I decided to share my implementation of the A* path finding algorithm in C.

I will split this post in a few sections, as I want to explain the problem that A* solves, the algorithm limitations, how you interface it with your own code and customize it to your own needs. If you are in a hurry, the code download link is on section 6.

1) Introduction - The Path Finding Problem:

Look at this picture, it is a screenshot from the game Lufia II - Rise of the Sinistrals (for the SNES).



Let's imagine that we want to take the main character (the guy with red hair) from his starting position to point A. It would be a simple task: all we need to do is move him a few cells up along the Y axis.

On the other hand, if we want to move the character from his starting position to point B, we have a totally different problem. We can't just move in a straight line, since there are several bushes in the way (in this game those are not considered walkable cells, unless you cut them). To find a path, we can use several solutions; the fastest one known as of the date of this post is A* (normally read as "A star").

The algorithm is a modification of Dijkstra's algorithm that uses a heuristic to reduce the search space. Basically, it will look at the starting point and mark it as the current point; it will then check all the points around it and compute a score for each one. The score uses both a precise distance function (used to determine the distance between the current point and a neighbor) and a heuristic function (used to estimate the distance between the neighbor and the goal). After a neighbor point is analyzed, it is added to a list known as the open list, and each neighbor is marked as having the current point as its parent (this will be used later on).

After checking all the points around the current point, the algorithm removes it from the open list and adds it to another list (known as the closed list). It then picks a new point to be set as current: the one with the lowest score in the open list.

The algorithm then repeats pretty much the same logic as described above, with two differences. The first is that it will ignore points that are in the closed list. The second is that, for a neighbor already in the open list, it checks whether reaching that neighbor through the current point is cheaper than the path through the point currently marked as its parent. If it is, the neighbor's parent is changed to the current point and its score is updated.

The algorithm ends when the goal is reached or when the open list is empty (in which case there is no path to the goal). If a path was found, it can be recreated by looking at the goal's parent point, then the parent of that point, and so on, until the starting point is reached.
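To make the description concrete, here is a compact sketch of A* on a small hardcoded grid (my assumptions for the sketch: 4-directional movement, unit step cost and the Manhattan distance as the heuristic; the open and closed lists are plain per-cell flags with a linear scan for the minimum, which matches the simple-list approach my implementation uses):

```c
#include <assert.h>

#define W 5
#define H 5
#define INF 1000000000

/* 0 = walkable, 1 = blocked (a made-up map just for this sketch). */
static const int grid[H][W] = {
    {0,0,0,0,0},
    {0,1,1,1,0},
    {0,0,0,1,0},
    {1,1,0,1,0},
    {0,0,0,0,0},
};

/* Manhattan distance: the heuristic estimate from (x,y) to the goal. */
static int heuristic(int x, int y, int gx, int gy) {
    int dx = x > gx ? x - gx : gx - x;
    int dy = y > gy ? y - gy : gy - y;
    return dx + dy;
}

/* Returns the cost of the cheapest path from (sx,sy) to (gx,gy), or -1. */
int astar(int sx, int sy, int gx, int gy) {
    static const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
    int g[H][W], open[H][W], closed[H][W];
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            g[y][x] = INF;
            open[y][x] = closed[y][x] = 0;
        }
    g[sy][sx] = 0;
    open[sy][sx] = 1;
    for (;;) {
        /* Pick the open cell with the lowest score g + h (linear scan). */
        int bx = -1, by = -1, best = INF;
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++)
                if (open[y][x] && g[y][x] + heuristic(x, y, gx, gy) < best) {
                    best = g[y][x] + heuristic(x, y, gx, gy);
                    bx = x; by = y;
                }
        if (bx < 0)
            return -1;                       /* open list empty: no path */
        if (bx == gx && by == gy)
            return g[by][bx];                /* goal reached */
        open[by][bx] = 0;
        closed[by][bx] = 1;                  /* move current to the closed list */
        for (int i = 0; i < 4; i++) {
            int nx = bx + dx[i], ny = by + dy[i];
            if (nx < 0 || nx >= W || ny < 0 || ny >= H) continue;
            if (grid[ny][nx] || closed[ny][nx]) continue;
            if (g[by][bx] + 1 < g[ny][nx]) { /* cheaper path found: re-parent */
                g[ny][nx] = g[by][bx] + 1;
                open[ny][nx] = 1;
            }
        }
    }
}
```

The sketch only returns the path cost; the real implementation also keeps a parent pointer per cell so the path itself can be rebuilt backwards from the goal.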

If you want to read more about the subjects related to the A* and its implementation, here are some links (the fourth one has an illustrated explanation, bigger, but way easier to understand than mine):
A* Algorithm on Wikipedia.
Dijkstra Algorithm on Wikipedia.
Dynamic programming on Wikipedia.
A nice tutorial on how A* works.
Great info on path finding (and AI in general).

2) Algorithm Limitations:

Although A* is known to be the fastest way to solve a path finding problem, the implementation I am publishing here is not the fastest possible.

The reason for this limitation is the data structure I used to save the list of cells that need to be checked. During the search for the best path, the algorithm does two main operations: inserting candidate cells (at this point, they already have a score) into a list (known as the open list) and finding the cell with the lowest score in the open list.

In my list implementation, the complexity of these operations is O(1) for inserting and O(n) for finding the lowest-score cell. If the implementation used a heap instead, the complexities would be O(log n) and O(1), respectively.

The reason I didn't use a heap is that I already had a list working in my game and this implementation is already fast enough for my needs.

Another limitation is that INFINITY is defined as 1000000000. This can be easily changed, but it can cause problems for users with their own score functions (if you are using the default one, you would only start having problems around the 71428571st cell, which is very unlikely).

3) Interfacing with your Application:

The first thing to do is to determine if you need to change anything in the base code. If you just compile and use it, it will obey the following rules:
 - The agent may move diagonally, but not if there is something in its way. For instance, if there is a wall to its right, it can't move to the upper-right or lower-right cells.
- The cost to move in a diagonal is bigger than the cost to move in a straight line (14 vs 10 per cell, respectively);
- The algorithm will iterate 1000 times before it stops and assume there is no path.

If your movement rules are different, check in the next section to learn how to change the implementation's behavior.

Now, the first thing to do is writing a function with the following prototype:
int name_of_your_function (void* yourGrid, int x, int y);

This function will be used by my implementation to find out whether the cell at indexes x and y is walkable or not. The first argument (the void*) is a pointer to your grid; you should cast it to your own implementation's type and return 0 if the cell is not walkable, and 1 if it is. A very basic example would be:
int walkable_interface(void* pt, int x, int y){
    int ** matrix = (int**)pt;
    return matrix[y][x];
}

In this example the grid is simply a two-dimensional int pointer that maps directly (1 or 0) whether the region is walkable or not. Since a real grid implementation is likely to be much more complex, it is passed simply as a void*.

The next step is creating a pathFindingStruct pointer; this can be done with the function:
pathFindingStruct* createPathFindingStruct(void* , unsigned , unsigned , int (*)(void*, int, int));

Where:
- The first argument is a pointer to your grid.
- The second argument is the number of cells in the x axis of your grid.
- The third argument is the number of cells in the y axis of your grid.
- The fourth argument is the function described just above.
 
To search for a path between two points, all you have to do is call this function:
list* processPathFinding(pathFindingStruct* , unsigned , unsigned , unsigned , unsigned );

Where:
- The first is the pathFindingStruct* you just created.
- The second argument is the index of the x cell where the path should start.
- The third argument is the index of the y cell where the path should start.
- The fourth argument is the index of the x cell you want to get to.
- The fifth argument is the index of the y cell you want to get to.

This function will return a list of cellReference pointers. The cellReference struct has two unsigned int elements x and y. They are the cell indexes of the path found from the starting point to the end point.

To recover the results from a list you should use the listIterator type. The list iterator functions that are available are:
listIterator* createIteratorForList(list* l);
void startListIterator(list* l, listIterator* it);
void* getFirstListElement(listIterator*);
void* getNextListElement(listIterator*);
void* getCurrentListElement(listIterator*);
void resetListIterator(listIterator*);

If you create a list iterator with the first function, you must free it using the free function. The easiest way is to declare a listIterator variable and use the second function, so you don't need to free anything. Freeing a list iterator will not free the list nor the list's elements; when you no longer need a list, you should free it with this function:
freeList(&listReturned, free);

Where the second argument is the function that will be used to free the list elements. An example of the use of most of the functions described is:

pathFindingStruct* pathFinder = createPathFindingStruct(mainGrid, len, lines, walkable_interface);
if (pathFinder == NULL){
    printf("Error.");
}
list* pathTo = processPathFinding(pathFinder, begin.x, begin.y, end.x, end.y);
listIterator it;
startListIterator(pathTo, &it);
cellReference* pathCell;
for (pathCell = getFirstListElement(&it); pathCell != NULL; pathCell = getNextListElement(&it)){
  //Do something...
}
freeList(&pathTo, free);
freePathFindingStruct(pathFinder);

You may reuse the pathFindingStruct for as many searches as you like. When you don't need it anymore don't forget to free it by using:
void freePathFindingStruct(pathFindingStruct*)

4) How to Customize the Algorithm:

The implementation offers some limited customization, if you look at the pathfinding.h file you will find the following defines:
#define CONSTANT_STRAIGHT_COST 10
#define CONSTANT_DIAGONAL_COST 14
#define PATHFINDING_INFINITY 1000000000
#define PATHFINDING_MAY_MOVE_DIAGONALLY 1
#define PATHFINDING_MAX_ITER 1000

The first two lines define the cost of moving one cell. The third one defines the cost considered infinity; this is used when finding the lowest cost, and if a cost passes this value the implementation will not work properly. The fourth one defines whether the agent may move diagonally; you must comment this line out if you don't want the algorithm to consider diagonals (this also changes the default heuristic to the Manhattan distance). Finally, the last one is the default maximum number of iterations.

You can also change the maximum iteration number, heuristic function and neighbor distance function by using the following functions:
void setPathFindingMaxIter(pathFindingStruct*, unsigned );
void setPathFindingHeuristic(pathFindingStruct*, int (*)(void*, int, int, int, int));
void setPathFindingNeighborDistanceFunction(pathFindingStruct*, int (*)(void*, int, int, int, int));

The distance functions must have the following form:
int yourFunctionName(void* , int , int , int , int );

Where:
- The first argument shall be a void* to the grid that was passed to the createPathFindingStruct.
- The second and third arguments shall be the current point x and y indexes, respectively.
- The fourth and fifth arguments shall be either the neighbor point or the ending point (depending on which function you are setting), x and y indexes respectively.

5) Examples of Use:

In the download you can find a full example of use, just get it and you will have six files:
- main.c
- list.c
- list.h
- pathfinding.c
- pathfinding.h
- input_example.txt

On Windows I used code::blocks with MinGW to compile, but it should build in any IDE. On Linux I used vim and gcc; in that case, all you need to do is run this command:
gcc -O2 main.c list.c pathfinding.c -o pathfinding_example

When you have the executable, you need to run it with a text file as argument. The input file must have up to 100 lines and columns, and all the lines must have the same length. The text may have the following characters:
- ".": a walkable cell.
- "0": a non-walkable cell.
- "s": the starting point.
- "e": the goal.

The file named "input_example.txt" can be used as a base; you can also use it to check that the program works. The program will print the text of the input file with the path marked with the letter "x". Here are some examples of an input file followed by the generated output:
.s...........
......0......
......0......
......0......
00000000000.0
.............
.............
.e...........
.............
.............
.............
.xxxxxxx....
......0.x...
......0..x..
......0...xx
00000000000x
...........x
..........x.
.xxxxxxxxx..
............
............
............

.........
.....0...
s....0...
0000000.0
e....0...
.00..0...
.....0.00
.........

....xxx..
...x.0x..
xxx..0.x.
0000000x0
xxxx.0.x.
.00x.0x..
...x.0x00
....xxx..

..........0......0.......
.....0....0........0.....
s....0....0......00000.00
0000000.000......0.......
..........00000.0000.....
.00000....0........0000.0
.....000000........0.....
...................0...e.

....xxx...0......0xxx....
...x.0.x..0.....xxx0.xx..
xxx..0.x..0.....x00000x00
0000000x000....x.0....x..
xxxxxxxx..00000x0000...x.
x00000....0....x...0000x0
x....000000...x....0...x.
.xxxxxxxxxxxxx.....0...x.

6) Code Download and Considerations:

You may download the code here.

Well, that is it. If you find any bugs or leaks, or want to ask a question or make a suggestion, feel free to leave a comment.