Why SW developers should write their own backup app and how

We live in comfortable times with real-time backups to the cloud. But what will you do if the cloud service provider kicks you out, or a virus deletes the data you have collected all your life? Write your own app for a tailor-made and efficient backup.

Why should you write your own backup app?

Many of us depend on services like OneDrive or Google Drive for real-time backup and on GitHub or Azure DevOps to store our source code. If you do only that, you risk losing all your backup data, because:

  • A service provider like Google can lock you out of your account; there is nothing you can do about it, and you lose all your backups, pictures and emails.
  • A virus can encrypt all your drives, and since the encrypted files get automatically copied to the cloud, you again lose all your data.
  • You might delete a file by mistake but realise it only after some time. By then, the file will also have been deleted in the cloud.

Another reason for writing your own backup program is that copying all your data to the cloud can be too expensive, for example video files you made yourself or music you downloaded.

Organising your directory structure according to your backup needs

Your files have different backup needs depending on how you use them and how big they are:

  1. Important files, normally not that big, like MS Office documents: copy immediately online to the cloud and copy all of them to an external drive from time to time.
  2. Software development files: push often to GitHub and copy all these files, without the obj and bin directories, to an external drive from time to time. You could also skip the .git directories, but if you push seldom, you should back them up too.
  3. Bigger files with rarely changing content, like pictures or movies: copy new files and update changed files on the external drive.
  4. Archive: big files and collections of many files which will not change again, i.e. files you collected over many years. Copy everything once when you get a new external drive; there is no need to include them in the regular backup runs.

It can be useful to group your directories into a few main directories (cloud, repos, archive) according to their backup needs.

Keep copies of important files with their original content, even when they later get deleted or changed. If you discover some months later that the deletion or change was a mistake, you can still fetch the original version.

Two-Drive Backup

If you have two drives in your computer, the fastest backup is from one drive to the other. This kind of backup is useful in case the first drive fails or you make mistakes on it. But having just that as a backup is not good enough: once my computer got stolen, and without an external backup I would have lost the files on both drives. For such reasons, you must have an external drive and keep it at a different location than the PC. When online backup was not yet possible, I used to make a quick daily backup to the second drive in my PC and occasional backups to my external drive. Now that I have cloud storage, I feel daily backups are no longer needed.

However, it is really up to you what you back up, where and how often, and your needs might differ from others'. SW developers, for example, don't need to back up obj and bin directories, because they are big and Visual Studio can recreate them. A downloaded backup program probably wouldn't support this kind of functionality, but if you write it yourself, you can fit it precisely to your needs.
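Skipping such directories during the backup can be sketched as a filter applied while enumerating files. This is a minimal sketch, not MyBackup's actual code; the excluded directory names are just examples you would adjust to your own needs:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class BackupFilter {
  // Directory names a SW developer typically does not want to back up.
  // This set is an example; extend it (e.g. with ".git") as needed.
  static readonly HashSet<string> excluded =
    new(StringComparer.OrdinalIgnoreCase) { "obj", "bin" };

  // Enumerates all files under root, skipping excluded directories entirely,
  // so nothing inside obj or bin is ever visited.
  public static IEnumerable<string> FilesToBackup(string root) {
    foreach (string file in Directory.EnumerateFiles(root))
      yield return file;
    foreach (string dir in Directory.EnumerateDirectories(root)) {
      if (excluded.Contains(Path.GetFileName(dir))) continue; // skip obj, bin
      foreach (string file in FilesToBackup(dir))
        yield return file;
    }
  }
}
```

Because whole subtrees are pruned before they are entered, the backup never even pays the cost of listing the thousands of files Visual Studio generates.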

Cleaning up Backup Directories

When you back up all important files regularly to your external hard disk, it might fill up quickly, meaning that from time to time you need to delete some of the old backups. The easiest approach would be to just delete the oldest backup, but I recommend another procedure:

1) I usually fill the external hard disk up to 90%. When that happens, I delete every second backup directory.

2..4) Each time the disk reaches 90% again, I delete every second backup directory again.

This approach leaves me with the very first backup, a few old backups and many recent backups, which has 2 advantages:

  1. I don’t lose my earliest files, the ones I had when I first got the external drive.
  2. Mistakes are often made in recent files, so it helps to have several copies from the last few months; for older times it is not necessary to keep so much detail.

Of course, in reality I am not so disciplined that I do backups precisely every month. Actually, I back up to the external drive irregularly, only when I feel I have made many changes I don’t want to lose. Interestingly, deleting every second backup directory works well for that case too.
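The purge step described above can be sketched in a few lines. This is a simplified sketch, assuming the date-named directories sort chronologically by name; detecting the 90% threshold and the actual deletion are left out:

```csharp
using System;
using System.Linq;

public static class Purger {
  // Picks every second date directory for deletion. Index 0, the very
  // first backup ever made, is always kept; the directories at odd
  // indexes (1, 3, 5, ...) are returned for deletion.
  public static string[] SelectForDeletion(string[] dateDirectories) {
    var sorted = dateDirectories.OrderBy(d => d).ToArray();
    return sorted.Where((_, index) => index % 2 == 1).ToArray();
  }
}
```

Applying this repeatedly thins out the old backups more and more each round, while newly added backups stay dense, which produces exactly the distribution described above: the first backup, a few old ones, many recent ones.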

Writing your own Backup App

The easiest way is to start with an existing app like MyBackup and then add the functionality you need.

Overview MyBackup App

You can get it from GitHub at this link: github.com/PeterHuberSg/MyBackup

In the upper part of the window, the user enters on the left side which directories should be copied completely on each backup. The program first creates a new directory at the Backup Path, named after the date of the backup run. Each directory listed on the left side then gets completely copied into that date directory.

The directories on the right side get completely copied into the Backup Path directory during the first backup run; in the following runs, only newly added and changed files (same name, but different date or size) are copied over. For updated files, the old version gets lost.
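The "same name, but different date or size" check could look like the following sketch. It is an illustration under stated assumptions, not MyBackup's actual code; the class and method names are made up for the example:

```csharp
using System;
using System.IO;

public static class IncrementalCopy {
  // Returns true when the source file is new or has changed compared to
  // the copy in the backup directory, i.e. when it must be copied again.
  public static bool NeedsCopy(FileInfo source, string backupPath) {
    var target = new FileInfo(Path.Combine(backupPath, source.Name));
    if (!target.Exists) return true;                           // newly added
    return source.Length != target.Length ||                   // different size
           source.LastWriteTimeUtc != target.LastWriteTimeUtc; // different date
  }

  // Copies the file over when needed; the old version gets overwritten.
  public static void CopyIfNeeded(FileInfo source, string backupPath) {
    if (!NeedsCopy(source, backupPath)) return;
    var targetPath = Path.Combine(backupPath, source.Name);
    source.CopyTo(targetPath, overwrite: true);
    // Keep the dates in sync so the next run sees the file as unchanged.
    File.SetLastWriteTimeUtc(targetPath, source.LastWriteTimeUtc);
  }
}
```

Comparing size and last-write time is cheap because both come from the directory metadata; no file content needs to be read for unchanged files, which is why a run over thousands of unchanged files finishes quickly.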

Execute starts the backup, Purge deletes every second of the date directories.

The lower part of the window shows the user which top-level directories have been backed up already, and the last line shows which file is presently being copied. A backup might cover thousands of files, and of course the user will not be able to see all of them, because the small ones get overwritten quickly or, for performance reasons, are not displayed at all. But if the user can read a file name, he knows which big file is being copied right now and why the backup is not proceeding faster.


While the backup is running, the user can scroll, which stops autoscrolling; once the user scrolls back to the end, autoscrolling continues.


Once the backup is completed, I like to check in the output which directories are big and then decide if I need to clean them up.

Design Details

Here are some pointers to get you quickly started if you want to adapt MyBackup to your needs.


Normally I set up a WPF app to have 3 libraries:

  • GUI
  • Business Logic
  • Tests of Business Logic

But since MyBackup does not need much code and interacts closely with the GUI all the time, I put almost everything into MainWindow.xaml.cs and a few other C# files.

Multi Threading

The WPF design is quite cleverly organised. All UI activities run on the WPF thread, meaning there are no multithreading problems when coding the UI. However, the backup activity should run on a different thread, otherwise the UI freezes. Normally one would create a Task, which runs on a thread-pool thread. In this app, however, I decided to use a dedicated Thread, because the backup needs it for most of the app's runtime.
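Starting such a dedicated thread can be sketched as follows. The class and method names are assumptions for illustration, not MyBackup's actual code; in a real WPF app the completion callback would be marshalled back to the UI thread, e.g. via Dispatcher.BeginInvoke:

```csharp
using System;
using System.Threading;

public class BackupRunner {
  volatile bool isRunning;

  // Runs doBackup on its own dedicated thread instead of a thread-pool
  // Task, because the work occupies the thread for most of the app's
  // runtime. IsBackground ensures the thread dies with the application.
  public Thread Start(Action doBackup, Action onCompleted) {
    isRunning = true;
    var thread = new Thread(() => {
      try {
        doBackup();     // long-running work; the UI thread stays responsive
      } finally {
        isRunning = false;
        onCompleted();  // in WPF: dispatch this back to the UI thread
      }
    }) { IsBackground = true, Name = "BackupThread" };
    thread.Start();
    return thread;
  }

  public bool IsRunning => isRunning;
}
```

A thread-pool Task would work too, but occupying a pool thread for minutes defeats the pool's purpose; a named dedicated thread is also easier to spot in the debugger.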


This thread should report to the user what it is doing, basically showing which file is currently being processed. In practice I noticed these are on average hundreds of files per second! Of course, not all these files actually get copied; most of the time they haven't changed and nothing needs to be done. However, interrupting the WPF thread hundreds of times a second is a bad idea. Using LogViewer solves that problem.

LogViewer


LogViewer meets the following specifications:

  • Multithreading safe message logging
  • Support fonts, bold, etc., while the code using the LogViewer does not need any WPF dependencies.
  • Differentiate between temporary and permanent messages. A temporary message gets overwritten by any following message, while a permanent message is still shown to the user once the background task has finished.
  • Collect all messages received and transfer them together to the WPF thread every 0.1 sec.
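The thread-safe collecting and the 0.1 sec batch transfer can be sketched with a lock-protected buffer that a timer drains. This is a simplified sketch of the idea only; the real LogViewer additionally handles fonts and temporary messages, and in WPF the flush delegate would run on the UI thread (e.g. via a DispatcherTimer):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class MessageBuffer {
  readonly object bufferLock = new();
  readonly List<string> messages = new();
  readonly Timer timer;
  readonly Action<IReadOnlyList<string>> flush; // in WPF: runs on UI thread

  public MessageBuffer(Action<IReadOnlyList<string>> flush) {
    this.flush = flush;
    // Drain the buffer every 0.1 sec instead of interrupting the UI
    // thread once per message.
    timer = new Timer(_ => Flush(), null, 100, 100);
  }

  // Multithreading safe: any worker thread may log at any time.
  public void Log(string message) {
    lock (bufferLock) messages.Add(message);
  }

  void Flush() {
    List<string> batch;
    lock (bufferLock) {
      if (messages.Count == 0) return;  // nothing to transfer this tick
      batch = new List<string>(messages);
      messages.Clear();
    }
    flush(batch); // hand the whole batch over in one call
  }
}
```

The worker thread only ever takes a short lock to append a string, so logging hundreds of messages per second stays cheap, while the UI is bothered at most ten times a second.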

Implementing this was quite a challenge; for details on how it was achieved, read my article.

Storing the Setup Data

The user doesn’t want to key in which directories need to be backed up each time MyBackup runs. In the maybe-not-so-good old days of .NET Framework, one could use Properties.Settings to store such data easily. In .NET 6 the WPF template no longer provides this, and I took that as a hint to implement a simple data store myself: a text file storing all TextBox.Text data the user has entered, separated by ‘<-=#=->’, a character sequence the user hopefully never enters.
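Such a store fits in a few lines. The separator is the one named above; the class name and file path handling are made up for this sketch:

```csharp
using System;
using System.IO;

public static class SettingsStore {
  const string separator = "<-=#=->"; // hopefully never typed by the user

  // Writes all TextBox.Text values into one text file, joined by the
  // separator sequence.
  public static void Save(string path, string[] values) =>
    File.WriteAllText(path, string.Join(separator, values));

  // Reads the values back in the same order; returns an empty array when
  // the app runs for the first time and no settings file exists yet.
  public static string[] Load(string path) =>
    File.Exists(path)
      ? File.ReadAllText(path).Split(separator)
      : Array.Empty<string>();
}
```

The obvious limitation is the one the text mentions: if a user ever types the separator sequence into a TextBox, the round trip breaks, which is why the sequence was chosen to be so unlikely.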

Further Reading

If you have read this far, you might be truly interested in WPF, in which case I warmly recommend some of my other WPF articles. The present one, I feel, is not that interesting to read, but some of my articles are really helpful, giving you insights into WPF which you will not find anywhere else.

My Projects on GitHub

I also wrote several open-source projects on GitHub:

StorageLib A truly amazing replacement for databases in single-user WPF apps. The programmer just defines the C# classes he wants to use, and the code generator writes all the code for creating, updating and deleting the data permanently on a local disk. This speeds up the developer's work greatly, because he no longer needs SQL, Entity Framework, etc. Queries are done in Linq and run super fast; the code executes much faster than any database. Supports transactions, backups and much more. Has run error-free for 5+ years in several projects.
TracerLib Efficiently collects multithreaded trace data in real time, which can be stored in a file or just kept in RAM and discarded after some time. This is useful for Exception handling, because it allows telling the user what happened before (!) the Exception occurred.
WpfWindowsLib WPF Controls for data entry, detecting if required data is missing or data has been changed.

Last But Not Least a Great Game

I wrote MasterGrab 6 years ago, and since then I have played it nearly every day before I start programming. It takes about 10 minutes to beat 3 robots which try to grab all 200 countries on a random map. The game finishes once one player owns all the countries. The game is fun and fresh every day, because the map looks completely different each time. The robots bring some dynamics into the game; they compete against each other as much as against the human player. If you like, you can even write your own robot, since the game is open source. I wrote my robot in about 2 weeks (the whole game took a year), but I am surprised how hard it is to beat the robots. When playing against them, one has to develop a strategy so that the robots attack each other instead of you. I will write a CodeProject article about it sooner or later, but you can already download and play it. There is good help in the application explaining how to play:

https://github.com/PeterHuberSg/MasterGrab