0

Situation: I want to reinstall a homelab server (currently Windows) as a Linux-based server.

Server purpose: backup system (mostly offline)

I currently have an HP ProLiant MicroServer N54L
Turion II Neo N54L 2.2 GHz, 4 GB RAM

https://geizhals.at/a688459.html

Setup
6 physical disks (5 HDD, 1 SSD) pooled into a JBOD Storage Space (15.6 TiB)
1 LUN, formatted as NTFS
Files are shared via a Windows share (SMB/CIFS)
No special NTFS permissions (since it is just me)
Windows Server 2012 R2 (soon EOL)
Deduplication enabled, which saved almost 4.5 TiB of data
Deduplication mode = general purpose file server

Clients
Mostly Windows, perhaps some Linux in the near future.
They access the server via SMB/CIFS and RDP (for management).

Yes, the server is slow, but its only purpose is archiving: it is mostly turned off, and I only sometimes access the data (single user, no parallel access needed). It works OK as it is now.

Goal
Since I want to use Linux a lot more and Server 2012 R2 is end-of-life, I want to reinstall the system with GNU/Linux, providing the same functionality on the same hardware.

Whenever I read about deduplication, it is always ZFS or BTRFS, but LOTS of RAM needed. Or OpenMediaVault with BorgBackup... but then the clients also need BorgBackup (and the clients will still be Windows).

What would be the nearest equivalent Linux setup?

12
  • Don't know where the "LOTS of RAM needed" comes from. Sure, if you build a high-performance NAS for dozens of clients and dozens of disks, then even the modest buffers desirable for maximum throughput add up to a couple of gigabytes, but that's absolutely not your use case. I bet in your case, simply running duperemove on a reflink-supporting file system (btrfs, xfs) once in a while, or just doing incremental backups, would totally do to achieve the desired deduplication of data. Commented Aug 9, 2023 at 17:16
  • @MarcusMüller, ZFS is quite "hungry" with deduplication: "For every TB of pool data, you should expect 5 GB of dedup table data, assuming an average block size of 64K" superuser.com/a/1169159/409497 Commented Aug 9, 2023 at 17:23
  • @RomeoNinov jup jup, if you need instant deduplication, that's what you pay. But in a backup application, "back up with full duplication, then manually run deduplication on the stored data after" is feasible. Commented Aug 9, 2023 at 18:05
  • @MarcusMüller, yep, if it's for backups that's an option. But unfortunately my machines are "hungry" and constantly under I/O load :D Commented Aug 9, 2023 at 18:12
  • Hi. So just a (for example) Debian system, using BTRFS or XFS, with duperemove scheduled to run repeatedly? Sorry, is there a sample how-to, just a quick tutorial on how to set this up? I am not that good with all the Linux stuff yet. Commented Aug 9, 2023 at 18:27

2 Answers

0

Have a look at TrueNAS SCALE for your file server: https://www.truenas.com/truenas-scale/ Instead of RDP, there is a web admin panel.

And I think that if you want to use deduplication on your ZFS pool, you will need to add RAM.

1
  • I already tried "OpenMediaVault", which would be OK. But ZFS is, as far as I know, very hungry for RAM. Seeing that the pool is 15.6 TiB, I will not be able to provide that much. Currently I manage with only 4 GB. Commented Aug 10, 2023 at 11:37
0

So, you'd install Debian from a USB drive (usually) quite normally, using the Debian 12 amd64 netinst image (that's the default that you get when you click on "download" on debian.org). Go for the "graphical install".

If you have DHCP working in your network (or IPv6 autoconf), then you don't have to set up anything for networking – just go with the defaults.

At some point the installer will ask you to select the scheme with which you want to set up your disks. Use

Guided - use entire disk and set up LVM

In the next dialog, select the drive you want to boot from as the installation target. (If in doubt, use the SSD; it's going to be the one that has to wake up the most, and it is the fastest at that.) Select "yes" when asked whether to write the changes to disk.

You'll be asked afterwards

You may use the whole volume group for guided partitioning, or part of it.…

In that dialog, only use 20 GB for now. This is the size of system installation volume, and we don't need much for that. The Linux Logical Volume Manager (LVM) that we're using also allows us to, at runtime and at any later point in time, add more space if we need it. Neat.
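Growing the system volume later could look roughly like the sketch below. It assumes the default names the Debian installer creates (volume group debian-vg, logical volume root) and an arbitrary increment of 5 GB; the grow_root function, the run helper and the DRY_RUN switch are made up for this example. By default it only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: grow the root logical volume and its filesystem while the system is running.
# Assumptions: VG "debian-vg" and LV "root" (installer defaults); +5G is an arbitrary example.

# Hypothetical helper: print the command when DRY_RUN=1 (the default), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

grow_root() {
    # --resizefs grows the filesystem together with the logical volume.
    run sudo lvextend --resizefs --size +5G /dev/debian-vg/root
}

grow_root
```

Setting DRY_RUN=0 in the environment would actually execute the command; sudo vgs shows beforehand whether the volume group still has that much free space.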

Again, confirm you want changes to be written to disk.

The installer starts installing the base system. Finish the installation, and make sure that you choose "SSH server" and "standard system utilities" in the software selection. Since this box is going to be a screenless server sitting in a corner, we don't start with installing any graphical desktop environment :)

Things will get installed, and you'll be asked where to install the bootloader to. Same drive as before.

After the installation has finished, you can reboot and will be greeted by a rather dull prompt:

Debian GNU/Linux 12 debian tty1
debian login: 

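Use the normal user credentials you specified during installation. (The sudo commands below assume that you left the root password empty in the installer; if you set one, Debian does not set up sudo for your user, and you would run the commands as root instead.)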

ip address 

will tell you this machine's IP address.

Use another laptop to ssh into that freshly set up machine – we'll not be sitting in front of it to configure it any further.

Once logged in via SSH (this of course also works when logged in locally),

sudo vgs 

will show the volume groups; you should have exactly one, debian-vg.

sudo lvs 

will show you the logical volumes; you should have two: root, which contains the filesystem with the system, and swap_1, which is disk space used in case you run out of RAM.

We'll want to use all the other disks as well, to

  1. add them to the volume group
  2. create a new, large logical volume that spans multiple disks,
  3. format that new logical volume with XFS, and finally
  4. use that as CIFS/Windows share

So, let's first figure out where these drives are. sudo pvs shows you the currently used physical volumes, i.e., the actual storage devices that the volume group is using. There should be exactly one. Something like /dev/sda5. So, /dev/sda would be the one disk we set up during installation.

Run sudo lsblk. You get a list of all disks. Your other disks are also going to be there. Note their names. Let's assume they are sdb, sdc, sdd and so on. (They might really not be!)

sudo pvcreate /dev/sdb /dev/sdc /dev/sdd 

will do irreparable damage to the partition tables of these disks (anything still stored on them is lost, so copy your data elsewhere first), and prepare them to be added to your volume group.

sudo vgextend debian-vg /dev/sdb /dev/sdc /dev/sdd 

will add them to the debian-vg volume group.

Check sudo vgs again, and see how much free space (the VFree column) you now have!

Let's make a new logical volume called datavolume that uses half that space:

sudo lvcreate -n datavolume -l 50%FREE debian-vg 

Neat, sudo lvs will now show that, and sudo vgs will show the reduction in unused space.

Let's format that new volume with XFS:

sudo mkfs.xfs /dev/debian-vg/datavolume 

Oops! That would fail, because we have yet to install the mkfs.xfs program:

sudo apt update
sudo apt install xfsprogs

Let's try that again:

sudo mkfs.xfs /dev/debian-vg/datavolume 

That should work. Let's make a new directory where we'll mount that volume, so that we can access files stored on it:

sudo mkdir -p /srv/data

Run sudo nano /etc/fstab. Add a line

/dev/mapper/debian--vg-datavolume /srv/data xfs noatime 0 0 

Save (Ctrl+O) and quit (Ctrl+X). sudo mount --all will now mount the volume; from then on, this happens automatically at boot.
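The storage steps above, from pvcreate up to creating the mount point, can be collected in one script. The following is only a sketch: it assumes the extra disks really are /dev/sdb, /dev/sdc and /dev/sdd (check with lsblk first), and the setup_storage function, the run helper and the DISKS and DRY_RUN variables are made up for this example. By default it prints the commands rather than running them, since pvcreate is destructive. The /etc/fstab entry is left as a manual step.

```shell
#!/bin/sh
# Sketch: the storage steps from this answer, collected in one function.
# Assumption: the extra disks are /dev/sdb, /dev/sdc and /dev/sdd (verify with lsblk!).
DISKS="${DISKS:-/dev/sdb /dev/sdc /dev/sdd}"

# Hypothetical helper: print the command when DRY_RUN=1 (the default), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

setup_storage() {
    # $DISKS is deliberately unquoted so that each disk becomes its own argument.
    # The && chain stops at the first command that fails.
    run sudo pvcreate $DISKS &&
    run sudo vgextend debian-vg $DISKS &&
    run sudo lvcreate -n datavolume -l 50%FREE debian-vg &&
    run sudo apt update &&
    run sudo apt install xfsprogs &&
    run sudo mkfs.xfs /dev/debian-vg/datavolume &&
    run sudo mkdir -p /srv/data
}

setup_storage
```

Reading the printed commands and comparing the disk names against the lsblk output before running the script with DRY_RUN=0 is the whole point of the dry-run default.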

As we discussed before, you wanted to regularly deduplicate. So, install the duperemove program (sudo apt install duperemove). You can manually run it like this:

sudo duperemove -rd --hashfile=/var/lib/data-deduplication.db /srv/data 

which will go through all the (currently unindexed) contents of /srv/data, write the hashes of all blocks to the file /var/lib/data-deduplication.db (so that we don't have to do a complete scan next time we run!), compare hashes, then let Linux verify the contents are actually the same (not just the hashes), and deduplicate these.
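To get a feeling for the "hash, then compare" idea, here is a small sketch that lists files with identical contents. Note that this is not what duperemove does internally: it compares whole files, whereas duperemove works on blocks and can also deduplicate files that are only partly identical. The find_duplicate_files function is a made-up helper, and the demonstration runs in a throwaway directory rather than on /srv/data.

```shell
#!/bin/sh
# Sketch: list files under a directory whose contents are byte-for-byte identical.
# This compares whole files; duperemove is more fine-grained (block level).

# Hypothetical helper, not part of duperemove.
find_duplicate_files() {
    # sha256sum prints "<64 hex digits>  <path>"; sorting puts equal hashes next to each other.
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '
            # Appending "" forces a string comparison; the path starts at character 67.
            { hash = $1 ""; path = substr($0, 67) }
            hash == prev { if (!shown) print prevpath; print path; shown = 1 }
            hash != prev { shown = 0 }
            { prev = hash; prevpath = path }
        '
}

# Small demonstration in a throwaway directory.
demo_dir="$(mktemp -d)"
printf 'same content\n' > "$demo_dir/a.txt"
printf 'same content\n' > "$demo_dir/b.txt"
printf 'other content\n' > "$demo_dir/c.txt"
find_duplicate_files "$demo_dir"
```

In the demonstration, a.txt and b.txt are listed and c.txt is not. Running the function on /srv/data only reads files, so it can serve as a rough preview of how much whole-file duplication there is.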

You can put this command in a systemd service file and use that in a systemd timer to do things like execute once a week, or after 1 hour of idleness, or something, but I'll be honest: my answer here is getting a tad long already. If that kind of automation is interesting, just ask a new question ("how do I run duperemove… once a week?" with a link to this answer).
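For orientation, a weekly timer could look roughly like the sketch below. It writes a service and a timer unit into a staging directory so that they can be reviewed first. The unit names, the STAGE_DIR and DUPEREMOVE_BIN variables and the path /usr/bin/duperemove are assumptions made for this example; command -v duperemove shows where the program actually lives on your system.

```shell
#!/bin/sh
# Sketch: write a systemd service and timer for a weekly duperemove run into a
# staging directory for review. All unit and variable names are made up for this example.
stage_dir="${STAGE_DIR:-$(mktemp -d)}"
# Assumption: duperemove is installed as /usr/bin/duperemove; verify with "command -v duperemove".
duperemove_bin="${DUPEREMOVE_BIN:-/usr/bin/duperemove}"

mkdir -p "$stage_dir"

# The service runs the same command as shown above, once per activation.
cat > "$stage_dir/duperemove-data.service" <<EOF
[Unit]
Description=Deduplicate /srv/data with duperemove

[Service]
Type=oneshot
ExecStart=$duperemove_bin -rd --hashfile=/var/lib/data-deduplication.db /srv/data
EOF

# The timer starts the service of the same name once a week.
cat > "$stage_dir/duperemove-data.timer" <<'EOF'
[Unit]
Description=Weekly deduplication of /srv/data

[Timer]
OnCalendar=weekly
# Catch up after boot if the machine was off at the scheduled time.
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "Unit files written to $stage_dir"
```

After reviewing the two files, they could be copied to /etc/systemd/system/ and activated with sudo systemctl daemon-reload followed by sudo systemctl enable --now duperemove-data.timer. Persistent=true is there because the server is mostly turned off: a missed run is caught up on the next boot.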

About how to connect your Windows clients: the usual way here is setting up Samba, so you get a Windows share. That's not hard, and it's covered in many places already. You'll want to share subdirectories of /srv/data. (Also, again, asking questions is always welcome.)
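As a starting point only, a share definition might look like the sketch below. The share name archive, the directory /srv/data/archive and the user david are placeholders made up for this example, as is the SHARE_FILE variable. The script just writes the definition to a file for review; it does not touch the Samba configuration.

```shell
#!/bin/sh
# Sketch: generate a Samba share definition for review.
# Assumptions: share name "archive", directory /srv/data/archive and user "david"
# are placeholders; replace them with your own.
share_file="${SHARE_FILE:-$(mktemp)}"

cat > "$share_file" <<'EOF'
[archive]
   path = /srv/data/archive
   read only = no
   valid users = david
EOF

echo "Share definition written to $share_file"
```

With Samba installed (sudo apt install samba), the definition would be appended to /etc/samba/smb.conf and checked with testparm. The user needs to exist on the server, have write permission on the directory and get a Samba password via sudo smbpasswd -a david; sudo systemctl restart smbd then applies the change.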

If this is for backup purposes only, there are also alternatives, like Unison, which people use quite successfully; they work with direct logins to the server via SSH and do not require you to set up Samba. Depends on your needs, really!

2
  • Thank you!!!! I need a bit of time to install and test it out. Just wanted to say "thanks" immediately! Commented Aug 10, 2023 at 13:26
  • @David You're more than welcome Commented Aug 10, 2023 at 13:27
