So, you'd install Debian from a USB drive (usually), quite normally, using the Debian 12 amd64 netinst image (that's the default that you get when you click on "download" on debian.org). Go for the "graphical install".
If you have DHCP working in your network (or IPv6 autoconf), then you don't have to set up anything for networking – just go with the defaults.
At some point the installer will ask you to select the scheme with which you want to set up your disks. Use
Guided - use entire disk and set up LVM
In the next dialog, select the drive you want to boot from as the installation target. (If in doubt, use the SSD; it's going to be the one that has to wake up the most, and is fastest at that.) Select "yes" when asked whether to write the changes to disk.
You'll be asked afterwards
You may use the whole volume group for guided partitioning, or part of it.…
In that dialog, only use 20 GB for now. This is the size of the system installation volume, and we don't need much for that. The Linux Logical Volume Manager (LVM) that we're using also allows us to add more space at runtime, at any later point in time, if we need it. Neat.
Again, confirm you want changes to be written to disk.
The installer starts installing the base system. Finish the installation, making sure that you choose "SSH server" and "standard system utilities" in your Software selection. Since this box is going to be a screenless server sitting in a corner, we don't start with installing any graphical desktop environment :)
Things will get installed, and you'll be asked where to install the bootloader to. Same drive as before.
After the installation has finished, you can reboot and will be greeted by a rather dull prompt:
Debian GNU/Linux 12 debian tty1
debian login:
Use the normal user credentials you specified during installation.
ip address
will tell you this machine's IP address.
Use another laptop to ssh into that freshly set up machine – we'll not be sitting in front of it to configure it any further.
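If the full `ip address` output is too noisy, a small filter can pull out just the address. This is a minimal sketch: `global_ipv4` is a made-up helper name, and the interface name, user name and address in the comments are placeholders, not values from your machine.

```shell
# Print the IPv4 addresses found in the output of
#   ip -brief -4 address show scope global
# Each input line looks like "enp3s0  UP  192.168.1.23/24" (example values).
global_ipv4() {
    # Fields 3 and onwards are addresses in CIDR notation; strip the prefix length.
    awk '{ for (i = 3; i <= NF; i++) { split($i, a, "/"); print a[1] } }'
}

# Usage on the server:
#   ip -brief -4 address show scope global | global_ipv4
# Then, from the laptop (placeholder user and address):
#   ssh youruser@192.168.1.23
```

The `scope global` part leaves out the loopback address, so what remains is what you'd type into ssh.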
Once logged in via SSH (this of course also works when logged in locally),
sudo vgs
will show the volume groups; you should have exactly one, debian-vg.
sudo lvs
will show you the logical volumes; you should have two: root, which contains the filesystem with the system, and swap_1, which is disk space used in case you run out of RAM.
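If you'd rather check that layout from a script than by eye, something along these lines should do. It is only a sketch: `has_lv` is a hypothetical helper, and it assumes the volume group is really called debian-vg as above.

```shell
# Succeeds if the logical volume name given as $1 appears on stdin, which is
# expected to be the output of: lvs --noheadings -o lv_name <volume group>
# (lvs indents the names; awk's field splitting takes care of that).
has_lv() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Usage:
#   sudo lvs --noheadings -o lv_name debian-vg | has_lv root && echo "root is there"
#   sudo lvs --noheadings -o lv_name debian-vg | has_lv swap_1 && echo "swap_1 is there"
```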
We'll want to use all the other disks as well, to
- add them to the volume group
- create a new, large logical volume that spans multiple disks,
- format that new logical volume with XFS, and finally
- use that as a CIFS/Windows share.
So, let's first figure out where these drives are. sudo pvs shows you the currently used physical volumes, i.e., the actual storage devices that the volume group is using. There should be exactly one. Something like /dev/sda5. So, /dev/sda would be the one disk we set up during installation.
Run sudo lsblk. You get a list of all disks. Your other disks are also going to be there. Note their names. Let's assume they are sdb, sdc, sdd and so on. (They might really not be!)
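Since the next command is destructive, it pays to look at whole disks only, with their sizes, before picking names. A minimal sketch, assuming the lsblk columns NAME, SIZE and TYPE; `list_disks` is a made-up helper name:

```shell
# Reads the output of `lsblk -dn -o NAME,SIZE,TYPE` on stdin and prints one
# line per whole disk (TYPE "disk"), as "/dev/<name> <size>".
# Optical drives (TYPE "rom") and the like are skipped.
list_disks() {
    awk '$3 == "disk" { print "/dev/" $1, $2 }'
}

# Usage:
#   lsblk -dn -o NAME,SIZE,TYPE | list_disks
```

Compare the sizes against what you know about your drives, and against the device that sudo pvs already lists, so you don't hand the system disk to the next step.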
sudo pvcreate /dev/sdb /dev/sdc /dev/sdd
will do irreparable damage to the partition tables of these disks, and prepare them to be added to your volume group.
sudo vgextend debian-vg /dev/sdb /dev/sdc /dev/sdd
will add them to the debian-vg volume group.
Check sudo vgs again, and see how much free space (the VFree column) you now have!
Let's make a new logical volume called datavolume that uses half that space:
sudo lvcreate -n datavolume -l 50%FREE debian-vg
Neat, sudo lvs will now show that, and sudo vgs will show the reduction in unused space.
Let's format that new volume with XFS:
sudo mkfs.xfs /dev/debian-vg/datavolume
Oops! That would fail, because we've yet to install the mkfs.xfs program:
sudo apt update
sudo apt install xfsprogs
Let's try that again:
sudo mkfs.xfs /dev/debian-vg/datavolume
That should work. Let's make a new directory where we'll mount that volume, so that we can access files stored on it:
sudo mkdir -p /srv/data
Run sudo nano /etc/fstab. Add a line
/dev/mapper/debian--vg-datavolume /srv/data xfs noatime 0 0
Save (Ctrl+o) and quit (Ctrl+x). sudo mount --all will now go ahead and do that mounting. It happens automatically on boot from then on.
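In case the double hyphen in that fstab line looks like a typo: it isn't. Device-mapper joins the volume group and logical volume names with a single hyphen, so any hyphen inside a name gets doubled. A small sketch of that rule (`dm_path` is a made-up helper, not an LVM tool):

```shell
# Build the /dev/mapper path for a volume group ($1) and logical volume ($2).
# Hyphens inside either name are doubled; a single hyphen separates the two.
dm_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# Usage:
#   dm_path debian-vg datavolume
```

Both /dev/debian-vg/datavolume and the /dev/mapper path refer to the same volume, so either spelling works in fstab.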
As we discussed before, you wanted to regularly deduplicate. So, install the duperemove program (sudo apt install duperemove). You can manually run it like this:
sudo duperemove -rd --hashfile=/var/lib/data-deduplication.db /srv/data
which will go through all the (currently unindexed) contents of /srv/data, write the hashes of all blocks to the file /var/lib/data-deduplication.db (so that we don't have to do a complete scan next time we run!), compare hashes, then let Linux verify the contents are actually the same (not just the hashes), and deduplicate these.
You can put this command in a systemd service file and use that in a systemd timer to do things like execute once a week, or after 1 hour of idleness, or something, but I'll be honest: my answer here is getting a tad long already. If that kind of automation is interesting, just ask a new question ("how do I run duperemove… once a week?" with a link to this answer).
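Just so you have an idea of what that would involve, here is a rough sketch that writes a oneshot service and a weekly timer. The unit name data-dedupe and the helper `write_dedupe_units` are my inventions for illustration; the weekly schedule and Persistent=true (catch up on a run that was missed while the box was off) are assumptions about what you'd want. The path to duperemove is passed in rather than hard-coded, because I'm not going to guess where your system installed it.

```shell
# Write data-dedupe.service and data-dedupe.timer (hypothetical names) into
# the directory $1, using the duperemove binary at the absolute path $2.
write_dedupe_units() {
    unit_dir=$1
    duperemove=$2
    cat > "$unit_dir/data-dedupe.service" <<EOF
[Unit]
Description=Deduplicate /srv/data with duperemove

[Service]
Type=oneshot
ExecStart=$duperemove -rd --hashfile=/var/lib/data-deduplication.db /srv/data
EOF
    cat > "$unit_dir/data-dedupe.timer" <<EOF
[Unit]
Description=Run data-dedupe.service once a week

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Usage, as root:
#   write_dedupe_units /etc/systemd/system "$(command -v duperemove)"
#   systemctl daemon-reload
#   systemctl enable --now data-dedupe.timer
```

systemctl list-timers will then show when the next run is due.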
About how to connect your Windows clients: the usual way here is setting up Samba, so you get a Windows share. That's not hard, but it's covered in so many places already. You'll want to share subdirectories of /srv/data. (Also, again, asking questions is always welcome.)
If this is for backup purposes only, there are also alternatives, like Unison, which people use quite successfully, that work with direct logins to the server via SSH and do not require you to set up Samba. Depends on your needs, really!
Running duperemove on a reflink-supporting file system (btrfs, XFS) once in a while, or just doing incremental backups, would totally do to achieve the desired deduplication of data.