I’m sure you have found yourself having to fix a piece of code that:
- was not written by you
- has no clear set of requirements, or has clear requirements but is unnecessarily complex (at first glance it looks like you have run into the dreaded “big ball of mud”)
My first question here is where the break-even point lies between fixing and rewriting. I tend to prefer rewriting code if:
- besides needing fixes, it also has performance problems
- code needs to be fixed/changed often
- code is a company asset
- code complexity is exposed to customers
- coder turnover is high
Remember that:
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
— Kernighan and Plauger in “The Elements of Programming Style”
I recently had a chance to try out SmartOS again, after my last experience porting applications to it (2012), and I was impressed by the virtualization features that the OS makes available directly. A single set of commands lets you download images and run either VM-like or namespace-like containers. Zones also let you run Debian/CentOS applications on SmartOS inside an lx zone, with system call translation … Awesome.
If you’re interested in digging deeper, take a look at this post from Tim Boudreau.
Yes, I like the Go programming language. I like it so much that I have to resist becoming a fanboy. I’m trying to understand where all this enthusiasm comes from (I’m a seasoned coder), so here’s an attempt to find out why:
- Code readability (and maintainability) first, language features second
- Integrated test environment: go test <package> runs all tests for the package. Unit testing is built in.
- Code coverage is built in (with some limitations; for example, it will not work if you use cgo).
- Integrated toolchain: no need for a makefile, at the cost of a rigid source tree layout.
- Extensive standard library containing everything you need for server-side/network programming
- Good concurrency features/model built into the language (sync package, goroutines, channels), and cheap goroutines thanks to small, growable stacks (segmented in early releases, contiguous since Go 1.3).
- Basic set of OOP features, centered on composition rather than inheritance: it is harder to mess up your code, at the cost of not being perceived as an OO language by OO fanboys. For more details on whether Go is OO or not, go here.
- Go is backed by some Famous Names in computing, and this inspires confidence.
- CamelCase 🙂 ? Naaah, I hate camel case, but I like the choice of having a standard style for code, comments and indentation, all enforced by the toolchain via go fmt so that all code looks consistent.
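Two of the points above, composition instead of inheritance and the built-in concurrency primitives, can be sketched in a few lines. This is only an illustration: the Named and Worker types and the runAll function are hypothetical names invented for the example, not taken from any real project.

```go
package main

import (
	"fmt"
	"sync"
)

// Named is a small reusable piece of behaviour (hypothetical type).
type Named struct {
	Name string
}

// Describe is promoted to any struct that embeds Named.
func (n Named) Describe() string {
	return "worker " + n.Name
}

// Worker reuses Named by embedding it; no inheritance is involved.
type Worker struct {
	Named
	Factor int
}

// Process does the worker's job on one input value.
func (w Worker) Process(x int) int {
	return x * w.Factor
}

// runAll starts one goroutine per input and collects the results over a
// buffered channel; the WaitGroup tells us when every goroutine is done.
func runAll(w Worker, inputs []int) int {
	results := make(chan int, len(inputs))
	var wg sync.WaitGroup
	for _, x := range inputs {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			results <- w.Process(v)
		}(x)
	}
	wg.Wait()
	close(results)

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	w := Worker{Named: Named{Name: "a"}, Factor: 2}
	fmt.Println(w.Describe(), "total:", runAll(w, []int{1, 2, 3}))
}
```

Worker gets Describe for free from the embedded Named, yet there is no class hierarchy to reason about, and the concurrent part needs nothing outside the standard library.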
So basically I like the fact that Go is a very opinionated language. You may or may not like the individual decisions, but what I like most is that someone took care of making them for you (so you don’t have to enforce them team-wide or company-wide).
Also an interesting read on how and why Go was born; quoting from Rob Pike’s speech at the Go conference in SF, 2012:
“To put it another way, oversimplifying of course:
Python and Ruby programmers come to Go because they don’t have to surrender much expressiveness, but gain performance and get to play with concurrency.
C++ programmers don’t come to Go because they have fought hard to gain exquisite control of their programming domain, and don’t want to surrender any of it. To them, software isn’t just about getting the job done, it’s about doing it a certain way.
The issue, then, is that Go’s success would contradict their world view.
And we should have realized that from the beginning. People who are excited about C++11’s new features are not going to care about a language that has so much less. Even if, in the end, it offers so much more.”
The new year is a time for self-examination. One of the most frustrating things for a coder is writing code with bugs, and bugs are almost always directly related to bad mental habits, imho. This is not a complete list of any kind; it is a set of well-known and widespread habits that you have surely already encountered in your life as a coder. I’m writing them down just as a reminder to myself:
- The “Let’s do it this way for now (I know this cannot be the final way of doing it), because I don’t want to stop for 5 minutes and think about it” attitude. This is the worst of all bad habits in my opinion. It is this attitude that generates most production bugs, because at the only moment you had to focus on that specific issue you decided to skip over it for the sake of continuity in your train of thought (which is a good thing in itself, but bad if it admits no exceptions). That moment will never come again: that “preliminary” code will go straight to production, and the issue will never be considered again until it generates a bug.
- Using the “quick and dirty” way to do things even when there is no real need for that type of approach. This is related to the fact that we are almost always pushed to deliver fast, and over time the “quick and dirty” approach becomes the standard one, always, regardless of requirements.
- Unreadable code: this does not necessarily generate bugs, but it makes them difficult to fix. It is caused by:
  - coder ego: “nobody will ever be able to understand my code without spending an hour on 10 lines; I will be the only one able to maintain it”.
  - “This way is faster (probably)” (note the “probably”, because nobody ever measures code speed). Modern compilers/CPUs do things we can’t imagine in terms of optimization, but “I can do better”.
- Commenting out unused code or, worse, gating it with a feature flag. Code that has no purpose is a major source of distraction and confusion. Today’s version control systems make it easy to revert any change; there’s no reason not to remove dead code and other bloat.
- Over-engineering code or overdoing features: this one is so big that it needs a separate post, but we might try to summarize it as:
  - more code, more bugs
  - more code, more tests to write, more time
- “I’m gonna do this in 10 minutes” attitude
One of my main tasks from 2015 on has been optimizing the performance of APIs in various languages (mainly C/C++). This post tries to recap best practices in this area.
For those who, like me, have worked in IT since the Z80, let me say that CPUs have changed, a lot. Variability in computing time on modern computer architectures is simply unavoidable: while we can guarantee the results of a computation, we cannot guarantee how fast that computation will be:
“Computers can reproduce answers, not performance” : Bryce Adelstein Lelbach, https://youtu.be/zWxSZcpeS8Q?t=6m45s
The reasons for variance in computation time can be summarized as:
- Hardware jitter : instruction pipelines, cpu frequency scaling and power management, shared caches and many other things
- OS activities : a huge list of things the kernel can do to screw up your benchmark performance
- Observer effect: every time we instrument code to measure performance, we introduce variance.
Warming up the CPU also seems to have become necessary to get meaningful results. Running hot versus cold on a single piece of code is well described here: https://youtu.be/zWxSZcpeS8Q?t=18m51s
You have to measure. There is no other way: things that, in your experience, look faster when done a certain way may turn out to be slower when measured, so put away all your preconceptions and prepare to A/B test your code for performance. Here are some hints, by no means a complete list:
1) make sure your code is doing what you expect. Profile your code compiled without the optimizer and check that you are not calling unwanted code (valgrind/kcachegrind for profiling)
2) measure/time your code: on Linux/C I use this code for durations and the GNU Scientific Library (libgsl) for the related math. Check out chrono for C++ and/or Google Benchmark for a complete framework.
3) as mentioned above, warm up the CPU before measuring by running your code a large number of times. Then measure the average execution time over a large number of runs. Ideally, your measurement is good when the results have a “normal” distribution. Narrow down the code you measure until you get normally distributed results.