Julia for Analysts: Tips for Better Beginnings - Sharpen your Axe (#1)

19 August 2022

TL;DR

Tips on preparing your tools before you start coding in Julia, for an easier and more productive learning journey: get a better terminal and configure it, install pipx and juliaup, and set up your startup.jl file.

Introduction

If I were to learn Julia all over again, I would do a few things differently.

If you are a data professional (analyst/scientist/engineer) looking to minimize the time to learn enough Julia to be dangerous, this series is for you.

Invest in Sharpening your Axe 🪓

"Give me six hours to chop down a tree and I will spend the first four sharpening the axe." – Abraham Lincoln

President Lincoln knew what all developers learn early on: invest time in setting up your tools.

Unfortunately, no one tells you that when you're a self-taught data analyst/scientist/engineer.

General Tips

Set up your Terminal

Whether you like it or not, you will likely spend a lot of time in the terminal, so you might as well set it up well.

On a Mac, I'd recommend iTerm2 + zsh + oh-my-zsh. See an installation guide.

Benefits include autocompletion, great highlighting, clear information on whether you're in a Git repository (and on which branch), and many more!

Difficulty: Low
Downsides: None

Configure your Terminal and Add Secrets

Do you often call the same long commands? Do you need some secrets or configurations to access the same data warehouse across many projects?

Invest time in setting up the default configuration of your shell (eg, the ~/.zshrc file if you use zsh).

Tricks

# DATABASE CONNECTIONS
export DB_ORACLE_IP="..." # add your details
export DB_ORACLE_USERNAME="..." # add your details
export DB_ORACLE_PASSWORD="..." # add your details

# you can even have aliases for the same value
export MYDB_PASSWORD="$DB_ORACLE_PASSWORD" # copies the value set above

- Now, try accessing these values from the shell (echo $DB_ORACLE_IP), Python (os.environ["DB_ORACLE_IP"], after import os), or Julia (ENV["DB_ORACLE_IP"]).

On the last point, there are several benefits:

Difficulty: Low
Downsides: None
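
Once several projects rely on the same variables, it can help to fail early when one of them is missing. Here is a minimal sketch of a helper you could keep in the same ~/.zshrc; the name require_env is my own invention for illustration, not a standard command.

```shell
# Hypothetical helper: check that every named environment variable is set
# and non-empty. Returns 1 and reports the first missing one.
require_env() {
    local name value
    for name in "$@"; do
        # Look up the variable whose name is stored in $name
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "Missing environment variable: $name" >&2
            return 1
        fi
    done
}
```

For example, require_env DB_ORACLE_IP DB_ORACLE_USERNAME && echo "ready to connect" prints the message only when both variables are defined.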

Use Pipx

Start using pipx for all Python-based CLI applications (eg, AWS CLI, black, flake8, jupyter, language servers, mlflow)

A lot of Python-based applications ask you to "simply install with pip" (eg, pip install ABC). Unfortunately, doing that can upgrade or downgrade packages that other tools in your global environment depend on (ie, break things)!

You could create small environments for each application to be able to independently remove them / update them. That is exactly what pipx does for you (and more)!

Try it out! On macOS:

brew install pipx
pipx ensurepath

No more pip install...

Difficulty: Low
Downsides: None
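
To make the workflow concrete, here is a small sketch that installs a few CLI tools, each into its own isolated environment. The function name and the tool list are just examples; it assumes pipx is already on your PATH.

```shell
# Hypothetical helper: install a handful of Python CLI tools with pipx.
install_cli_tools() {
    local tool
    for tool in black flake8 mlflow; do
        # Each tool gets its own virtual environment managed by pipx
        pipx install "$tool" || echo "could not install $tool" >&2
    done
    # Show everything pipx currently manages
    pipx list
}
```

Later on, pipx upgrade-all updates every tool at once and pipx uninstall black removes a single one without touching the others.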

Use Mamba (/Conda)

Before Julia, having clean environments was not easy. If you use Python, the closest thing you can get is Mamba.

It makes creating and managing separate Python environments easy and, unlike Conda, it's really fast.

Read more: Installation instructions

Difficulty: Low
Downsides: None
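
As a rough sketch of "one environment per project", assuming mamba is installed: the helper name, the environment name, and the pinned Python version below are all placeholders.

```shell
# Hypothetical helper: create a named environment with Python plus any
# extra packages passed as arguments.
new_py_env() {
    local name
    name="$1"
    shift
    mamba create --yes --name "$name" python=3.11 "$@"
}
```

For example, new_py_env churn-analysis pandas creates the environment, and mamba run -n churn-analysis python --version runs a command inside it without activating it first.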

Julia-specific Tips

Use Juliaup

Install juliaup to install and automatically update your Julia version as well as to switch between different versions.

Installing it on macOS or Linux is as simple as running curl -fsSL https://install.julialang.org | sh
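
Once juliaup is installed, day-to-day use comes down to a handful of subcommands. The sketch below wraps them in a function purely for illustration; the function name is made up and the version 1.10 is only an example.

```shell
# Hypothetical helper: keep a second Julia version around and stay up to date.
setup_julia_versions() {
    juliaup add 1.10         # install an extra version next to the default
    juliaup default release  # plain `julia` starts the latest stable release
    juliaup update           # update all installed channels
    juliaup status           # list what is installed and which is the default
}
```

After that, julia +1.10 starts the older version for a single session, while julia keeps using the default.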

Create startup.jl

Similar to the theme of setting up your terminal with configurations, you can do the same for Julia.

You can have all frequently used packages loaded automatically when you start the Julia REPL by adding them to a file called startup.jl (Documentation).

Example:

# what text editor to use when edit() is called
# "code" assumes you can call VS Code from your shell
ENV["JULIA_EDITOR"] = "code"
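using Pkg
import REPL
using OhMyREPL        # syntax and bracket highlighting in the REPL
using TheFix
TheFix.@safeword fix true
using BenchmarkTools  # @btime and @benchmark
using Revise          # picks up code changes without restarting Julia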

@async begin
    # reinstall keybindings to work around https://github.com/KristofferC/OhMyREPL.jl/issues/166
    sleep(1)
    OhMyREPL.Prompt.insert_keybindings()
end

# automatically activate a project environment if available
if isfile("Project.toml") && isfile("Manifest.toml")
    Pkg.activate(".")
end

Save this file in ~/.julia/config/startup.jl (where ~ is your user's home directory). You will thank me later.

If something isn't working, you can skip loading startup.jl by starting Julia with julia --startup-file=no.
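
The config folder usually does not exist on a fresh install, so you may need to create it first. A minimal sketch, assuming the default depot location ~/.julia (the function name and its optional argument are only for illustration):

```shell
# Hypothetical helper: create <depot>/config/startup.jl if it is missing
# and print its path. Never overwrites an existing file.
init_startup_file() {
    local depot
    depot="${1:-$HOME/.julia}"
    mkdir -p "$depot/config"
    [ -f "$depot/config/startup.jl" ] || : > "$depot/config/startup.jl"
    echo "$depot/config/startup.jl"
}
```

Running init_startup_file prints the path, which you can then open in your editor to paste in the example above.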

There are a few packages that haven't made it to my startup.jl file yet, but I would suggest you consider them:

Difficulty: Low
Downsides: Slightly slower start-up time of Julia REPL (if you add too many packages)

(Advanced) Precompile your Sysimage

No beginner should ever start here, but you might want it within your first weeks or months.

There is an infamous wait the first time a command runs (eg, waiting for the first plot) or, more generally, while the Julia REPL loads your startup.jl file. If you find it frustrating, use PackageCompiler.jl (see its docs) to create a system image (sysimage) with the packages and functions you use preloaded and precompiled.

There is an alternative solution below.

Difficulty: Medium
Downsides: Lost flexibility / ability to easily update (eg, you won't be able to easily update your Julia or those precompiled packages)
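
For orientation, here is a rough sketch of what building and using a sysimage can look like from the shell. It assumes PackageCompiler.jl and the listed packages are already installed in the active environment; the function name, package list, and file name are examples, so check the PackageCompiler.jl docs for the options that fit your setup.

```shell
# Hypothetical helper: build a custom sysimage with two packages baked in.
# The .so extension is the Linux convention; as far as I know the
# PackageCompiler docs suggest .dylib on macOS and .dll on Windows.
build_sysimage() {
    local image
    image="${1:-sys_custom.so}"
    julia -e "using PackageCompiler; create_sysimage([\"OhMyREPL\", \"BenchmarkTools\"]; sysimage_path=\"$image\")"
}
```

Building can take several minutes. Afterwards, julia --sysimage sys_custom.so starts Julia with those packages already loaded.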

Use persistent sessions (tmux)

Following on from the previous point, there is a different way to mostly avoid the "time-to-first-X" (ie, the first compilation).

You can use persistent sessions with tmux (=Terminal MUltipleXer, or others like Screen, Dtach, Abduco+Dvtm). Persistent means that instead of closing the Julia REPL every time, you just disconnect and reconnect later on. It will remember all your loaded functions, variables, packages, etc.

Note: This is incredibly useful if you use the Julia REPL as your super-calculator / Excel replacement.

Try it:
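
A minimal sketch of a persistent Julia session, assuming tmux and julia are both on your PATH. The helper name jl and the session name "julia" are my own choices for illustration.

```shell
# Hypothetical helper for ~/.zshrc: open, or return to, a long-lived REPL.
# -A attaches to the session named "julia" if it already exists,
# otherwise a new session is created that runs julia.
jl() {
    tmux new-session -A -s julia julia
}
```

Run jl, do some work, then detach with Ctrl-b followed by d (the default tmux key binding). Running jl again later brings you back to the same REPL with everything still loaded; tmux ls lists the sessions that are still alive.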

Read more: Quick and Easy Guide to Tmux

Difficulty: Low
Downsides: None (except for the layer of complexity)

CC BY-SA 4.0 Jan Siml. Last modified: December 09, 2024. Website built with Franklin.jl and the Julia programming language.