Forget the commands. This is about reading Linux the way a detective reads a crime scene — every file a witness, every directory a trail of decisions made by engineers solving real problems.
Most people use Linux. Fewer people read Linux. The file system isn't just a place to store files — it's a live, annotated record of how the operating system thinks: how it trusts users, how it finds machines on a network, how it boots, how it dies, how it remembers. Every directory exists because an engineer once faced a problem and decided a file was the answer.
These are eleven things I found when I stopped typing commands and started opening files and asking: why does this exist?
/etc/passwd is readable by every user on the system. Open it and you see every
account: username, UID, GID, home directory, default shell. What you won't
see is a password. The second field is just an x.
Originally, Unix stored hashed passwords directly in /etc/passwd. That worked
when only root could read it — but many programs (like ls resolving usernames,
or mail software looking up UIDs) need to read user data. Making /etc/passwd
root-only broke all of them. The solution was a deliberate split: non-secret user data
stays world-readable in /etc/passwd; the hashed secrets move to
/etc/shadow, readable only by root and the shadow group.
The x is literally a redirect: "the secret is elsewhere."
It solves a classic security vs. usability conflict. You can't lock the user database away completely because the entire system needs to read usernames. The shadow file pattern separates identification (public) from authentication (secret) — a principle that appears repeatedly in modern security design.
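The seven-field format makes the split concrete. A minimal Python sketch that parses one /etc/passwd record — the account shown is invented, not from a real system:

```python
# Parse one /etc/passwd record: 7 colon-separated fields.
# The sample line is illustrative, not taken from a real system.
sample = "alice:x:1000:1000:Alice Example:/home/alice:/bin/bash"

user, pw, uid, gid, gecos, home, shell = sample.split(":")

print(user, uid, shell)   # identification data: world-readable
assert pw == "x"          # the "x" redirect: the real hash lives in /etc/shadow
```

Everything a tool like `ls` needs — name, UID, shell — is right there; the one field that must stay secret is a pointer elsewhere.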
The /etc/shadow hash prefix reveals the algorithm: $y$ = yescrypt
(the default since Ubuntu 22.04), $6$ = SHA-512, $1$ = MD5 (a red flag
on any production server). The numeric fields encode a complete password policy:
last change date, minimum age, maximum age, warning period. A value of ! or !!
in the hash field means the account exists but has no usable password, so password
login is impossible — the standard pattern for service accounts that authenticate
only via SSH keys, never passwords.
| Field in /etc/shadow | Meaning |
|---|---|
| $y$j9T$... | Hashed password (yescrypt algorithm) |
| 19800 | Days since Unix epoch when password was last changed |
| 0 | Minimum days before password can be changed again |
| 99999 | Maximum days password is valid (≈ never expires) |
| 7 | Warning days shown before expiry |
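The policy fields decode mechanically. A sketch that interprets a sample shadow entry (invented values, matching the table above):

```python
# Decode the policy fields of a sample /etc/shadow entry (illustrative values).
sample = "alice:$y$j9T$examplehash:19800:0:99999:7:::"
name, hash_, lastchg, min_age, max_age, warn, *_ = sample.split(":")

# The first three characters of the hash name the algorithm.
algo = {"$y$": "yescrypt", "$6$": "SHA-512", "$1$": "MD5"}.get(hash_[:3], "unknown")
print(f"{name}: algorithm={algo}, last change day {lastchg}, "
      f"valid {max_age} days, warn {warn} days before expiry")
```

A `$1$` result from this lookup on a production box is the red flag mentioned above.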
Most people's mental model is: name → DNS → IP. The real chain on Linux has three
stages controlled by three separate files. Understanding the order explains a lot of
"mysterious" networking behavior — including why editing /etc/hosts can
override production DNS.
/etc/nsswitch.conf (Name Service Switch) is the orchestrator. It was designed
so that user/group/hostname lookups could come from multiple backends — local files,
LDAP, NIS, DNS — without changing any application code. The application calls
getaddrinfo(); nsswitch decides who answers.
It solves the "where do names come from?" problem in a way that's configurable without
recompilation. Developers exploit this daily: add 127.0.0.1 api.prod.company.com
to /etc/hosts and your machine routes that domain locally, before any DNS
server is ever consulted. This is why it's the first line of defense in penetration testing
and local development alike.
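The lookup order is readable straight out of the file. A sketch that extracts it from a sample `hosts:` line — the line is typical of an Ubuntu system with systemd-resolved, but illustrative, not read from a live machine:

```python
# Determine hostname lookup order from a sample nsswitch.conf "hosts:" line.
line = "hosts: files mdns4_minimal [NOTFOUND=return] dns"

# Drop the key and the bracketed action directives; what remains is the order.
sources = [tok for tok in line.split()[1:] if not tok.startswith("[")]
print(sources)   # "files" comes first: /etc/hosts wins before DNS is consulted
```

Because `files` precedes `dns`, the /etc/hosts override described above works by construction.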
/etc/resolv.conf isn't even a real file — it's a symlink to a path managed
by systemd-resolved. The "nameserver" is 127.0.0.53 — a loopback
address for a local DNS daemon that handles caching, DNSSEC validation, and per-interface
DNS. My cache showed 847 entries from a single session. This means your DNS cache survives
browser restarts. Tools like nslookup that bypass libc and contact DNS
directly will miss this cache entirely — which is why they sometimes give different answers.
If /etc/resolv.conf points directly to 8.8.8.8 without
going through systemd-resolved, every DNS query leaves the machine unencrypted,
unvalidated, and fully visible to anyone on your network. Every hostname you visit
is legible as plaintext.
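Checking which situation you're in takes one parse. A sketch over sample resolv.conf content (illustrative text, not a live file):

```python
# Is this resolv.conf pointing at the systemd-resolved stub (127.0.0.53),
# or sending every query straight to an external server?
resolv = """\
nameserver 127.0.0.53
options edns0 trust-ad
search lan
"""

servers = [ln.split()[1] for ln in resolv.splitlines() if ln.startswith("nameserver")]
stubbed = servers == ["127.0.0.53"]
print("local stub resolver" if stubbed else f"direct, unencrypted DNS via {servers}")
```

On a real machine you'd read `/etc/resolv.conf` itself; the loopback nameserver is the tell that a caching daemon sits in the path.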
Every decision your machine makes about where to send a network packet is determined by
the routing table. The kernel exposes this as a readable file at /proc/net/route.
The catch: values are written in hexadecimal, little-endian byte order. Most people use
ip route and never see the raw data — but looking at it directly reveals
something interesting about how Linux abstracts hardware.
The kernel maintains this routing table in memory to make packet-forwarding decisions
at wire speed. Exposing it through /proc is a Unix philosophy win: instead of
writing a special system call for "read the routing table," the kernel just makes it a
file. Tools like netstat -rn don't maintain their own data — they read this
exact file and translate the hex to human form; the modern ip route queries
the same kernel table over netlink instead.
It answers definitively: where will this packet go? When routing behaves unexpectedly (VPN not routing certain traffic, wrong interface chosen), this file is the ground truth. No caching, no abstraction layer — this is the kernel's actual decision table at that exact moment.
The little-endian hex encoding isn't an accident — it's the native byte order of
x86 processors. The kernel writes these fields in the CPU's natural format, not in
a human-friendly one. This is a reminder that /proc is not designed for
humans; it's a kernel-to-userspace interface where the kernel writes values as-is and
expects userspace tools to handle the translation. When you use ip route,
you're trusting iproute2 to decode the kernel's raw data correctly on your behalf.
Every running process gets its own directory under /proc/ named by its PID.
This directory doesn't exist on disk — the kernel conjures it in memory for the lifetime
of the process. The moment the process dies, the directory vanishes. I picked a running
Firefox instance (PID 3204) and walked through its directory like an autopsy.
The /proc/[pid]/fd/ directory contains symlinks to every file descriptor
the process currently has open — files, sockets, pipes. File descriptors 0, 1, and 2
are always stdin, stdout, and stderr; Firefox's all point to /dev/null, meaning it
discards terminal output (it logs elsewhere). The maps file shows the full
virtual memory layout — every shared library loaded, every anonymous allocation, the stack
and heap boundaries.
Without /proc, you'd need a special kernel API to inspect any running process.
Instead, every tool that monitors processes — top, htop,
lsof, strace — reads from /proc. It's how the kernel
makes its internal state visible without needing to write a custom tool for every possible
inspection.
/proc/[pid]/environ captures the environment variables at process
launch time and freezes them. Even if someone later changes $PATH
system-wide, a process retains the environment it was born with. This is a forensics
goldmine: reading a suspicious process's environ reveals what $HOME,
$PATH, or embedded secrets it was started with. I found
DBUS_SESSION_BUS_ADDRESS revealing exactly which desktop session spawned Firefox.
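The file's format is the only trick: entries are NUL-separated, not newline-separated. A sketch over sample bytes shaped exactly like a real environ file (the values are invented):

```python
# /proc/[pid]/environ is NUL-separated, not newline-separated.
# Sample bytes shaped like a real environ file; the values are invented.
raw = (b"HOME=/home/alice\x00"
       b"PATH=/usr/bin:/bin\x00"
       b"DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus\x00")

env = dict(entry.split("=", 1)
           for entry in raw.decode().split("\x00") if entry)
print(env["PATH"])   # frozen at launch time, regardless of later changes
```

On a live Linux system, `open("/proc/self/environ", "rb").read()` yields bytes in exactly this format.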
Services in Linux aren't magic — they're plain text files called unit files.
There are two locations, and understanding why both exist is the first step:
/lib/systemd/system/ contains vendor-supplied defaults (installed by packages);
/etc/systemd/system/ contains admin overrides. Files in /etc
take precedence, so you can customize any service without touching package files.
Before systemd, service management was a chaos of shell scripts in
/etc/init.d/ — different for every distro, impossible to parallelize,
with no standard dependency mechanism. Unit files replaced all of that with a
declarative format: you declare what a service is and what it
depends on, and systemd figures out the order.
The headline feature is dependency resolution. The After=network.target line means "don't start
nginx until the network is ready." systemd builds a dependency graph and boots services
in parallel where possible, reducing boot time. Before this, services started in a fixed
serial order and frequently failed because a dependency hadn't initialized yet.
PrivateTmp=yes is a single line that gives the service its own isolated
/tmp namespace — it can't read or pollute the system's /tmp.
This is a Linux namespace feature (the same mechanism Docker uses) expressed as a unit
file option. NoNewPrivileges=yes means even if nginx were exploited and
called a setuid binary, it couldn't escalate. Security hardening is declarative and
human-readable, right there in the service definition.
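Because unit files are INI-style text, they parse with stock tools. A sketch using Python's configparser on a minimal, invented unit (the real nginx.service has many more directives):

```python
import configparser

# A minimal, illustrative unit file — not the real nginx.service.
unit = """\
[Unit]
Description=Demo web service
After=network.target

[Service]
ExecStart=/usr/sbin/nginx
PrivateTmp=yes
NoNewPrivileges=yes
"""

cp = configparser.ConfigParser()
cp.optionxform = str          # unit file keys are case-sensitive; keep them as-is
cp.read_string(unit)

print(cp["Unit"]["After"])           # the dependency: wait for the network
print(cp["Service"]["PrivateTmp"])   # namespace isolation, declared in one line
```

One caveat of this sketch: real units may repeat keys (multiple `After=` lines), which plain configparser rejects — systemd's own parser handles that.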
/var/log/auth.log records every authentication event on the system:
every SSH login attempt (successful or failed), every sudo command,
every su invocation, every PAM authentication event. Reading it for
the first time is alarming.
Authentication logging is a legal and operational requirement. A server without auth logs can't tell you when a break-in happened, who did it, or what they ran. The log format includes timestamp, hostname, daemon name, PID, and the full event — enough to reconstruct the sequence of events after a compromise.
It solves post-incident forensics and real-time monitoring. The 312-attempt SSH dictionary attack visible in my log — targeting root, then admin, ubuntu, pi, test, user in order — shows that every internet-facing server gets this constantly. The log is the evidence that lets you identify, block, and report the attacker.
Modern Ubuntu stores logs in two places simultaneously: the traditional text
/var/log/auth.log (for compatibility with old tools) and the
structured binary journal at /var/log/journal/. The journal preserves
log level, PID, unit name, and boot session per entry. Running
journalctl -b -1 -p err shows all errors from the previous boot
— invaluable when diagnosing a crash. The journal stores boot sessions separately,
so you can trace exactly what was happening in the seconds before a kernel panic.
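Mining the text log is a one-regex job. A sketch that counts failed SSH attempts per source IP, over invented sample lines in the traditional syslog format:

```python
import re
from collections import Counter

# Illustrative auth.log lines in the traditional syslog format (invented hosts/IPs).
log = """\
Mar 12 04:11:02 web1 sshd[912]: Failed password for root from 203.0.113.7 port 40122 ssh2
Mar 12 04:11:05 web1 sshd[913]: Failed password for invalid user admin from 203.0.113.7 port 40130 ssh2
Mar 12 04:12:19 web1 sshd[920]: Accepted publickey for deploy from 198.51.100.4 port 50412 ssh2
"""

failed = re.findall(r"Failed password for (?:invalid user )?(\S+) from (\S+)", log)
print(Counter(ip for _, ip in failed))   # one attacker IP, two usernames tried
```

This is essentially what fail2ban does continuously: pattern-match the log, threshold the counter, block the IP.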
The free command is a simplified summary. /proc/meminfo is the raw
truth. Reading it carefully reveals how Linux's memory model actually works — and why
"used memory" is a misleading concept that causes unnecessary panic.
The apparent paradox: only 412 MB free, yet 9.8 GB available. The difference is the
page cache. Linux aggressively uses idle RAM to cache recently-read
file data because unused RAM is wasted RAM. When an application needs more memory,
the kernel evicts cold cache pages instantly. This is not memory pressure — it's the
system working correctly. MemFree is nearly irrelevant; MemAvailable
is what matters.
It gives the kernel — and tools that read it — a real-time view of physical memory
state without requiring an expensive system call. The Linux OOM (Out-of-Memory) killer
reads from /proc/meminfo to decide when memory is critically low and which
processes to terminate to recover it.
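The file's key: value kB format parses trivially. A sketch over sample content (values invented to match the numbers above):

```python
# Parse the fields that matter from a sample /proc/meminfo (values in kB, invented).
meminfo = """\
MemTotal:       16303428 kB
MemFree:          421888 kB
MemAvailable:   10035200 kB
Cached:          8812544 kB
Dirty:             14208 kB
"""

mem = {line.split(":")[0]: int(line.split()[1]) for line in meminfo.splitlines()}

# "Free" looks scary; "available" counts the cache the kernel can evict instantly.
print(f"free: {mem['MemFree'] // 1024} MB, "
      f"available: {mem['MemAvailable'] // 1024} MB")
```

The gap between the two numbers is the page cache — RAM doing useful work until someone needs it.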
The Dirty field (14,208 kB) represents data written to the page cache but
not yet flushed to disk. If power is cut right now, those 14 MB of writes are lost.
The kernel flushes dirty pages on a timer controlled by
/proc/sys/vm/dirty_writeback_centisecs (default: 500 = every 5 seconds).
This is the fundamental write-durability trade-off every database must navigate.
Databases that call fsync() are explicitly forcing dirty pages to disk
rather than trusting this timer — because for them, losing a transaction is worse
than a few milliseconds of latency.
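The fsync() pattern is visible in a few lines. A sketch of the write-then-force-durable sequence, using a throwaway temp file:

```python
import os
import tempfile

# Force a write through the page cache to stable storage —
# the per-transaction discipline databases follow.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"committed transaction\n")  # lands in the page cache ("Dirty")
    os.fsync(fd)                              # block until the device confirms it
finally:
    os.close(fd)
    os.remove(path)
print("durable before the dirty-writeback timer ever fired")
```

Without the `fsync`, the bytes would sit in the Dirty pool for up to the writeback interval — exactly the window a power cut erases.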
/dev/ doesn't hold regular files — it holds device nodes, kernel
interfaces that look like files but are gateways to kernel behavior. Three of them are
used so frequently they've become idioms:
/dev/null is the void — write to it and data disappears; read from it and
you get immediate EOF. The classic 2>/dev/null silences a command's
error output by routing it here. /dev/zero produces infinite null bytes,
used to create blank disk images or wipe sensitive memory regions.
/dev/urandom reads from the kernel's entropy pool — randomness gathered
from hardware timing jitter, interrupt timing, and other unpredictable physical events.
Every SSH key ever generated, every TLS session, every UUID on your system ultimately
gets its seed from this file.
They solve the problem of "where do I send data I don't want?" and "where do I get data I can't predict?" in a way that's composable with Unix pipes. Because they look like files, they work in any context where a file path is expected — no special API required.
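Both idioms are reachable from any language precisely because they're just paths. A sketch using Python's portable wrappers (`os.devnull` is /dev/null on Linux; `os.urandom` draws from the same kernel CSPRNG as /dev/urandom):

```python
import os

# The void: anything written here disappears.
with open(os.devnull, "w") as sink:
    sink.write("noise nobody will ever see\n")

# The entropy pool: os.urandom reads from the kernel CSPRNG,
# the same source backing /dev/urandom.
key_material = os.urandom(16)
print(len(key_material), "unpredictable bytes")
```

No special API, no library — just a file you open and a file you read, composable with everything else that speaks paths.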
There's a historical debate: /dev/random vs /dev/urandom.
Old wisdom said /dev/random was "more secure" because it blocked when
entropy ran low. Modern Linux (kernel 5.6+) made both equivalent — the kernel's CSPRNG
is always sufficiently seeded. Old Java VMs that read /dev/random caused
real production outages, stalling for entropy the kernel effectively already had. The security
model changed; the file interface stayed the same; old code assumptions broke silently.
/etc/fstab (File System TABle) is the persistent mount configuration.
Every filesystem that should be automatically mounted at boot is listed here. Reading it
carefully reveals architectural decisions — and security policies — baked into the boot process.
Device names like /dev/sda are assigned by the kernel based on detection order
at boot. Add a second disk and the drive that was /dev/sda may come up as
/dev/sdb. UUIDs are baked into the filesystem itself and never change, regardless
of hardware changes. A machine using device names in fstab could fail to boot
after adding a new drive.
It makes the storage layout declarative and persistent across reboots, while the
pass column controls filesystem check order: 1 = check
first (root), 2 = check after root, 0 = skip (swap doesn't
need fsck). The errors=remount-ro option means if the root filesystem
encounters an error, mount it read-only rather than risk corruption.
The last line mounts /tmp as tmpfs — a RAM-backed filesystem,
lost on reboot. The nosuid flag means setuid executables in /tmp
won't escalate privileges — a direct response to historical exploits where attackers
planted setuid binaries in world-writable temp directories. The
security policy is written directly into the mount configuration. A single
mount option is doing the work of an intrusion prevention system.
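Each fstab line is six whitespace-separated fields: device, mount point, filesystem type, options, dump flag, fsck pass. A sketch splitting an invented entry (the UUID is made up):

```python
# Decode one illustrative /etc/fstab entry (the UUID is invented).
line = "UUID=3c1c1c44-1d2b-4bd8-a2a5-6f1f0e7d9c11 / ext4 errors=remount-ro 0 1"

device, mountpoint, fstype, options, dump, passno = line.split()

opts = options.split(",")   # options are a comma-separated list
print(f"{mountpoint} on {fstype}, options {opts}, fsck pass {passno}")
```

Pass 1 on `/` means the root filesystem is checked first at boot, exactly as the pass-column rules above describe.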
Every process on Linux has a parent, except one. Process ID 1 is the first userspace process started by the kernel after boot. Everything else — every shell, every server, every desktop — is a descendant. On modern Ubuntu, PID 1 is systemd.
PID 1 has immunities no other process has. The Out-Of-Memory killer is hardcoded to
never kill it — its oom_score_adj of -1000 is the minimum
possible. If PID 1 dies, the kernel doesn't gracefully shut down — it panics.
/sbin/init is a symlink to systemd, but this is a convention: in containers,
PID 1 is often a minimal shell or tini; the kernel only cares that
something occupies that slot and never terminates.
PID 1 solves the "orphan processes" problem. When a parent process dies before its
children, those children are re-parented to PID 1. PID 1 must then call
wait() to collect their exit status and prevent zombie processes accumulating.
This is why poorly-written container entrypoints cause zombie accumulation — a shell
script that forks and doesn't wait() will silently build up hundreds of
defunct entries over time.
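The fix is a single wait() call. A sketch of the reaping discipline a correct entrypoint follows, using a child that exits immediately:

```python
import subprocess
import sys

# Spawn a child that exits at once. Until someone wait()s on it, the kernel
# keeps its exit status around — on Linux, that is a zombie ("defunct") entry.
child = subprocess.Popen([sys.executable, "-c", "raise SystemExit(0)"])

status = child.wait()   # reap: collect the exit status; the zombie disappears
print("reaped with status", status)
```

Init systems like tini exist to do exactly this in containers: sit at PID 1 and wait() on whatever gets re-parented to them.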
/proc/sysrq-trigger is a write-only file that sends magic system requests
directly to the kernel, bypassing everything. Writing b to it reboots
immediately — no sync, no unmounting. Writing s syncs all filesystems.
Writing o powers off. Writing t dumps all running thread
backtraces to the kernel log. This exists because sometimes the kernel itself is more
trustworthy than the processes running on top of it.
/proc/sys/ is a writable directory tree that exposes live kernel parameters —
and lets you change them instantly, without rebooting. It's the kernel's
control panel, expressed as files.
Changing kernel parameters once required recompiling the kernel. The sysctl
interface — exposed via /proc/sys/ — was created so administrators
could tune kernel behavior at runtime without downtime. The parameters cover
networking behavior, virtual memory policy, filesystem limits, and security features.
High-performance applications like databases, web servers, and networking tools
often require kernel tuning to operate correctly at scale.
Redis documentation, for example, tells you to set vm.overcommit_memory=1
via sysctl. A single echo to a /proc/sys/ file is all it takes —
no restart, no recompile, live effect.
net.ipv4.ip_forward = 0 on my machine means Linux will drop packets
that arrive on one network interface destined for another. Writing 1
to that file turns the machine into a router instantly. This is exactly how Docker and
Kubernetes enable container networking — they write 1 to
/proc/sys/net/ipv4/ip_forward during startup and set up iptables rules
to route traffic between containers and the host. Every container networking feature
you've ever used was enabled by a write to this single file.
There's a persistence gap: writes to /proc/sys/ survive only until the
next reboot. Permanent changes require /etc/sysctl.conf or a file in
/etc/sysctl.d/. This two-tier system (temporary vs. persistent) mirrors
the pattern seen everywhere in Linux: /proc for live state,
/etc for persistence. Any system tuning guide that tells you to write
directly to /proc/sys/ and doesn't mention sysctl.conf
is giving you half the answer — your changes will vanish on the next restart.
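The persistent half is just another text file. A sketch that renders a sysctl.d fragment from a dict — the keys and values are illustrative tuning examples, not recommendations:

```python
# Render the persistent counterpart of runtime sysctl writes: a sysctl.d fragment.
# Keys and values are illustrative examples, not tuning recommendations.
settings = {
    "net.ipv4.ip_forward": "1",
    "vm.overcommit_memory": "1",
}

fragment = "\n".join(f"{key} = {value}" for key, value in sorted(settings.items()))
print(fragment)
```

Drop the output into a file under /etc/sysctl.d/ and apply it without a reboot via `sysctl --system` — the runtime write and the persistent file together are the whole answer.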
Every finding in this report traces to the same principle: Linux externalizes its
reasoning. Passwords split across two files because usability and security conflict.
The routing table is a file because files compose with pipes. PID 1 can't be killed
because the system dies without it. /tmp is mounted nosuid
because attackers exploited it. The entropy pool is a file because randomness is
a resource, and Unix treats resources as files.
The file system is not just where data lives. It's the operating system's
argument about how a computer should work — written in directories you can
read, files you can open, and parameters you can change with a single echo.
The most important skill in Linux isn't knowing commands. It's knowing which file to open.