Dawid Ciężarkiewicz aka `dpc`

contrarian notes on software engineering, Open Source hacking, cryptocurrencies etc.


In this post, I will describe how I refactored a fairly complicated Rust codebase (rdedup) to optimize performance and utilize 100% of CPU cores.

This will serve as documentation for rdedup.

Other reasons it might be interesting:

  • I explain some details of deduplication in rdedup.
  • I show an interesting approach to zero-copy data stream processing in Rust.
  • I show how to optimize fsync calls.
  • I share tips on working on a performance-oriented Rust codebase.


The biggest strength of #Go, IMO, was the fad created by the fact that it is “backed by Google”. That gave Go immediate traction and bootstrapped a decently sized ecosystem. Everybody knows about it and has a somewhat positive attitude, thinking “it’s simple, fast, and easy to learn”.

I enjoy (crude but still) static typing, compiling to native code, and most of all: native green threads, which make Go quite productive for server-side code. I just had to get used to many workarounds for the lack of generics, remember to avoid all the Go landmines, and ignore the poor expressiveness.

My favorite thing about Go is that it produces static, native binaries. Unlike with software written in Python, getting software written in Go to actually run is always painless.

However, overall, Go is a poorly designed language full of painful archaisms. It ignores multiple great ideas from programming languages research and other PL experiences.

“Go’s simplicity is syntactic. The complexity is in semantics and runtime behavior.”

Every time I write code in Go, I get the job done, but I feel deeply disappointed.



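#!/bin/sh
# Reattach to (or spawn new if not existing) tmux session
# tmux-session <session_name> [ <session_directory> ]

export STY="tmux-$1"
if [ -n "$2" ]; then
    # The body of this branch is missing from the source. Presumably it
    # points RC at a tmux config file under "$2"; left as a no-op here.
    :
fi

# Resolve the config path only when one is set
[ -n "$RC" ] && RC="$(readlink -f "$RC")"

if ! tmux has-session -t "$1" 2>/dev/null ; then
    if [ -n "$RC" ] && [ -f "$RC" ] ; then
        # Create the session detached and load the config file inside it
        tmux new-session -d -s "$1" "tmux move-window -t 9; exec tmux source-file \"$RC\""
    else
        tmux new-session -d -s "$1"
    fi
fi

exec tmux attach-session -t "$1"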


#!/bin/sh
# Spawn tmux session in current directory
# use path's sha256 hash as session name

exec "$HOME/bin/tmux-session" "$(echo "$PWD" | sha256sum | awk '{ print $1 }')" "$PWD"
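
Because the session name is just a hash of the path, running the script again from the same directory reattaches to the same session. A minimal sketch of that mapping, using a hypothetical helper `session_name_for` (my name, not part of the scripts above) that repeats the same pipeline:

```shell
#!/bin/sh
# Hypothetical helper, not part of the original scripts: print the session
# name that the hash-based launcher would derive for a directory.
session_name_for() {
    # Same pipeline as the launcher: echo appends a newline, which is
    # hashed together with the path; awk keeps only the hex digest.
    echo "$1" | sha256sum | awk '{ print $1 }'
}
```

With it, something like `tmux has-session -t "$(session_name_for "$PWD")"` should tell you whether the current directory already has a session.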

#shell #tool

Having a lot of RAM is relatively cheap nowadays, and Linux can make good use of it. With tools like preload, most Linux distributions try to proactively read things that you might want to use soon.

However, if your desktop has a ridiculous amount of memory (mine has 32GB), it may take ages for these tools to make use of all that memory. And why would you pay for it and then let it sit idle instead of working for you?

The thing is: you can do much better, because you know what you are going to use in the future.

So, as always, let’s write a tiny script under the name precache.


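#!/bin/sh
# Read every regular file under the given directory (default: current one) at idle CPU and I/O priority
exec nice -n 20 ionice -c 3 find "${1:-.}" -xdev -type f \
    -exec nice -n 20 ionice -c 3 cat '{}' \; > /dev/null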

Personally I keep it as $HOME/bin/precache.
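
Before precaching a big tree, it can help to check whether it would fit in memory at all. A rough sketch with two hypothetical helpers, `tree_bytes` and `available_bytes` (my names, not part of precache), assuming GNU find for `-printf` and a Linux `/proc/meminfo` that reports `MemAvailable`:

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original precache script.

# Total size in bytes of the regular files precache would read under $1.
# Assumes GNU find (for -printf '%s', the file size in bytes).
tree_bytes() {
    find "${1:-.}" -xdev -type f -printf '%s\n' \
        | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# The kernel's estimate of memory available without swapping, in bytes.
# Assumes Linux; /proc/meminfo reports MemAvailable in kB.
available_bytes() {
    awk '/^MemAvailable:/ { printf "%.0f\n", $2 * 1024 }' /proc/meminfo
}
```

If `tree_bytes` for a directory comes out larger than `available_bytes`, the files read first will probably be evicted again before the run finishes, so it makes more sense to precache a smaller subtree.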