Split a large CSV into smaller files (1000 lines each)
split -l 1000 bigfile.csv smallfile_
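A plain line-based split leaves the header row only in the first output file. If each chunk needs to be loadable on its own, the split can be wrapped as sketched below. The function name `split_csv_with_header` and its arguments are illustrative, and the sketch assumes the CSV has a single header line and no quoted fields that contain newlines.

```shell
# Split a CSV into chunks that each repeat the header row.
# Arguments (all illustrative): input file, data rows per chunk, output prefix.
split_csv_with_header() {
    input=$1
    rows=$2
    prefix=$3

    # Assumption: the first line of the file is the header.
    header=$(head -n 1 "$input")

    # Split everything after the header into raw chunks (prefix + aa, ab, ...).
    tail -n +2 "$input" | split -l "$rows" - "$prefix"

    # Prepend the header to each raw chunk and give it a .csv extension.
    for chunk in "$prefix"*; do
        case $chunk in *.csv) continue ;; esac   # skip files already finished
        { printf '%s\n' "$header"; cat "$chunk"; } > "$chunk.csv"
        rm "$chunk"
    done
}
```

Calling `split_csv_with_header bigfile.csv 1000 smallfile_` should leave files named `smallfile_aa.csv`, `smallfile_ab.csv`, and so on, each starting with the header.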
Find the 10 largest directories in the current directory
du -h --max-depth=1 | sort -hr | head -n 10
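The `--max-depth` option is a GNU extension, and the `du` that ships with macOS does not accept it. A more portable variant, sketched here under the assumption that the local `du` supports the short `-d` option (GNU and BSD versions both do), reports sizes in KiB and sorts them numerically. The function name `largest_dirs` is illustrative.

```shell
# List the largest entries one level below a directory, sizes in KiB.
# Arguments (illustrative): directory (default .), number of rows (default 10).
largest_dirs() {
    dir=${1:-.}
    count=${2:-10}
    # -k: report sizes in 1 KiB blocks; -d 1: descend one level only.
    # The directory's own total is included as one of the rows.
    du -k -d 1 "$dir" | sort -rn | head -n "$count"
}
```

For example, `largest_dirs ~/projects 5` would print the five largest rows for a hypothetical `~/projects` directory.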
Tmux
Tmux is a terminal multiplexer that lets you run multiple terminal sessions inside one window. It's especially useful for long-running data science jobs that need to keep running after an SSH connection drops.
Install tmux (macOS)
brew install tmux
Start a new session
tmux
Start a new named session
tmux new -s my_session
List all sessions
tmux ls
Attach to an existing session
tmux attach -t my_session
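Attaching fails when the named session does not exist yet. One way to cover both cases is the `-A` flag of `new-session`, which attaches when the session exists and creates it otherwise. The wrapper name `attach_or_create` is illustrative.

```shell
# Attach to a named session, creating it first if it does not exist.
# The default name "my_session" mirrors the examples in this section.
attach_or_create() {
    session=${1:-my_session}
    # -A: behave like attach-session if the session already exists.
    tmux new-session -A -s "$session"
}
```

Running `attach_or_create my_session` after every SSH login gives the same result whether or not the session survived.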
Detach from current session
Press Ctrl+b then d
Kill a session
tmux kill-session -t my_session
Split pane horizontally (new pane below)
Press Ctrl+b then "
Split pane vertically (new pane to the right)
Press Ctrl+b then %
Navigate between panes
Press Ctrl+b then arrow keys
Create a new window
Press Ctrl+b then c
Switch between windows
Press Ctrl+b then window number (0-9)
Rename current window
Press Ctrl+b then ,
Scroll mode (view history)
Press Ctrl+b then [, use arrow keys to scroll, press q to exit
Run a long training job in tmux
tmux new -s training
python train_model.py
# Press Ctrl+b then d to detach
# Reconnect later with: tmux attach -t training
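The same job can also be started without attaching at all, which is handy in scripts. The sketch below uses the `-d` flag of `new-session` to create the session in the background and pipes the output through `tee` so a log file survives. The function name `start_training_session`, the log file name `train.log`, and the default arguments are illustrative. The sketch assumes none of the arguments contain spaces.

```shell
# Start a training script in a detached tmux session and keep a log file.
# Arguments (illustrative): session name, script path, log path.
start_training_session() {
    session=${1:-training}
    script=${2:-train_model.py}
    log=${3:-train.log}
    # -d: create the session without attaching to it.
    # 2>&1 | tee: send stdout and stderr to both the pane and the log file.
    tmux new-session -d -s "$session" "python $script 2>&1 | tee $log"
}
```

While the job runs, `tmux attach -t training` shows the live output. The session closes when the script exits, so the log file is what remains afterwards.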
Feel free to copy, modify, and combine these snippets for your data science projects!