Utilities in the Enterprise
Modern enterprise data centers are a complex mix of different technologies geared towards accomplishing business goals. Some of these technologies are pricy, big-name business solutions, but some are simple tools and utilities that facilitate processes. Linux sysadmins have been using rsync (remote synchronization) to move and mirror files for two decades, though versions of it now run on nearly every platform. Its lightweight build, small footprint, and usability make it a good choice for simple file copy operations. But this same asset is also a liability for many utilities: designed purely for functionality, they may not automatically account for potential risks to enterprise data. Using rsync successfully in the enterprise means protecting the data it transfers from accidental exposure.
About Rsync
One of the great advantages of rsync over other similar utilities is that it is able to easily transfer only the delta between systems. For example, if you set up rsync on a file server and connect a backup server as the mirror, the initial sync will move every file in the specified path. After that first sync, rsync will only move the changes, keeping the mirror identical to the primary server and minimizing network traffic. This type of file copy procedure is extremely common for most organizations, and without process guidelines, techniques and utilities vary widely among individual admins.
Despite its compact build, rsync does have security options that can protect the data it transfers. But like many pared-down tools, it does not invoke them by default, and the burden therefore rests on the person setting it up to configure it securely.
Why it Matters
Data Exposure
Utilities are data agnostic. They don’t know sensitive from not sensitive, and they don’t separate out dangerous from harmless. They only do what you tell them, and in the case of most Linux-based utilities, only exactly what you tell them. When rsync is used by businesses, the government, and other large organizations, the files being transferred may contain extremely sensitive information. Although rsync can move these files the same as it could if they contained gibberish, the risk to the business can be severe if that information leaks.
Data exposure has become a prominent business risk, and organizations that have experienced such a leak have also had to endure the associated financial and reputational damage. Rsync can be a powerful utility for simple file mirroring or transfer, but the level of care taken to configure it should be commensurate with the sensitivity of the data being transmitted.
Important Configurations
Rsync vs. Rsyncd
Before we dive into the configurations themselves, it’s important to note that there are two different ways to use rsync. One is a command line utility where all of the details are passed as arguments; this is rsync. The other is the daemonized version, started with rsync --daemon and commonly known as rsyncd, which listens on a designated port as a service. Rsyncd relies on rsyncd.conf for its configurations, where each sync path has its own block of options. Rsyncd is the usual vector for data exposure involving rsync, as without the proper protection it can be opened by an anonymous third party. For our purposes, we will focus on rsyncd, which is the most common way rsync is utilized at scale.
Rsyncd.conf
The rsync daemon depends on rsyncd.conf for its authentication, access, logging, and available modules. In service mode, rsync can provide details for many different synchronization paths. Organizations that rely on rsync may find these paths accumulating over time. Because each path requires separate configuration, it’s easy for one to fall through the cracks and omit an important directive.
Example rsyncd.conf
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsync.log
port = 12000
[files]
path = /var/storage/
comment = Primary file server
timeout = 300
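In its simplest form, the rsyncd.conf file declares the following global parameters:
- Process ID file path - Where the daemon writes its process ID, which service scripts and monitoring can use to find the running daemon
- Lock file path - Used by the daemon to enforce the max connections limit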
- Log file path - Important for error handling, service monitoring, and troubleshooting
- Port to listen on - The default rsync port is 873, but can be overridden here
Other global options exist, such as specifying the IP address to listen on, advanced socket options, and the ability to send a message of the day (MOTD), essentially a service banner, to users of the rsync service.
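As a small illustration of the global options mentioned above, the fragment below restricts the daemon to a single interface. It is only a sketch: the address 192.0.2.10 is a placeholder from the documentation range, not a value taken from the example server.

```ini
# Global parameters appear before the first [module] header.
# Listen on one internal interface instead of every address.
# 192.0.2.10 is a placeholder; substitute the server's own address.
address = 192.0.2.10
port = 12000
```

Note that rsyncd.conf comments must sit on their own lines; text after a value on the same line is read as part of the value.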
Additionally, it can include any number of “modules,” or file paths to synchronize. In the example above, our [files] block denotes the sole module for this system. Under that module, many directives can be set. The most important among these are:
- Path - This is the directory the module makes available, and it must be set for every module. Note the trailing forward slash in the path /var/storage/. On the rsync command line, a trailing slash on a source path indicates that the contents of the folder, and not the folder itself, will be synchronized. Without the trailing slash, rsync will place a copy of the folder inside of the destination, instead of just copying the contents.
- Timeout - The idle timeout for a connection, in seconds. Setting it helps prevent stalled or half-open connections from swamping the server. The default of 0 means no timeout.
- Read Only/Write Only - These parameters restrict access by function. Read only allows only downloads, while write only allows only uploads.
- Max connections - By default this option is set to 0, which means unlimited connections, but occasionally throttling connections is necessary for performance or protection from denial-of-service type attacks.
- Include/Exclude - These parameters allow rsync to act more granularly within the specified path by excluding files and directories, and allowing includes of specific files from those exclusions.
- Incoming/Outgoing chmod - The chmod parameters allow rsync to adjust the permission bits on files during the transfer process. This can be crucial when the source and destination should have different permission sets.
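A module that uses several of the directives above might look like the following sketch. The connection limit and exclude patterns are illustrative values chosen for this example, not recommendations from the original configuration.

```ini
[files]
path = /var/storage/
comment = Primary file server
# Drop connections that stay idle for more than 300 seconds.
timeout = 300
# Clients may download from this module but not upload to it.
read only = true
# Cap simultaneous connections (0 would mean no limit).
max connections = 10
# Keep the tmp/ directory and any .bak files out of transfers.
exclude = tmp/ *.bak
```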
However, the directives that set security are even more important. Because rsync is a lean utility, its access restrictions are not engaged by default. This requires administrators to understand and validate their rsync module configurations in order to properly limit access to the information they handle. Let’s look at each of them in detail.
List
The list option allows rsync to “hide” a module from anyone who doesn’t know what they’re looking for. When the rsync daemon is queried for available modules, those set to list = false will be omitted from the results. While this kind of security through obscurity is not enough on its own, it is one additional layer you can add to protect particularly sensitive file paths. By default, modules are listable, so this parameter must be explicitly set to false for hidden modules.
List = true
The module is visible when the rsync daemon is queried for available paths.
List = false
The module is not visible in the daemon’s list and must be accessed directly.
Default: True. The module will be visible.
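A hidden module could be declared along these lines. The module name and path are hypothetical and only serve to show where the directive goes.

```ini
# Hypothetical module holding sensitive files.
[payroll]
path = /var/storage/payroll/
comment = Hidden module
# Omit this module when a client asks the daemon for its module list.
list = false
```

A client that already knows the module name can still request it directly, so list = false only makes sense alongside the access controls described in the following sections.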
Hosts Allow/Deny
The most basic way to protect rsync modules from accidental exposure is to restrict which external machines can talk to them. By using the hosts allow and hosts deny directives, rsync can enforce a policy of least privilege by permitting only those clients necessary for business goals. When hosts allow is used on its own, all unspecified source IPs are rejected automatically. This drastically narrows the attack surface of the rsync server and should always be established for even somewhat sensitive information. Hosts deny blocks specific addresses or ranges. It cannot carve an exception out of a range that hosts allow already permits, because the allow list is consulted first.
Hosts allow = [IP address, IP range, hostname]
Specified clients will be allowed, even if they also match the hosts deny list. If no hosts deny list is set, all others will be blocked.
Hosts deny = [IP address, IP range, hostname]
The specified clients will be blocked, unless they match the hosts allow list. All others will be allowed.
Default: All hosts are allowed.
When used in conjunction, the hosts allow directive is checked first, and a client that matches is admitted immediately. Only clients that do not match are then checked against hosts deny, where a match means they are rejected. A client that matches neither list is allowed to connect, so a module that sets both directives usually needs a catch-all such as hosts deny = * to keep the allow list exclusive.
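The simplest restrictive setup uses hosts allow alone, as in this sketch. The subnet and hostname are placeholders chosen for illustration.

```ini
[files]
path = /var/storage/
# Only the backup subnet and one named host may connect.
# With hosts allow set and no hosts deny, every other client is rejected.
hosts allow = 10.0.5.0/24 backup01.example.internal
```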
Auth Users and the Secrets File
IP and hostname restrictions narrow the attack surface by device, but any user on those allowed devices will be able to access the rsync module. The auth users directive narrows the attack surface by user, limiting access to only specified accounts, regardless of device. When auth users is enabled and given a list of usernames, only those users can connect to the rsync daemon.
The auth users directive relies on a “secrets” file, for example, /etc/rsyncd/rsyncd.secrets. This file contains the username and password combinations for rsync accounts. It’s critical to note that the secrets file is stored in plain text, including passwords. This means the file should be heavily restricted.
If the auth users directive is absent, the default is to allow all users. And just like that, if your rsync server is available from the internet, you have a data leak. The most important takeaway to remember when building a secure rsync setup is that by default, anyone can access the path. Failure to correctly configure the auth users and hosts allow/deny settings turns whatever data is being synchronized into a public facing webpage. Anybody who finds the rsync server can pull the contents anonymously, without needing a password. Incidentally, finding internet exposed rsync hosts is trivial when the default port is being used. It is always recommended to limit access to rsync by user and device. Every layer reduces the risk of data exposure.
Auth users = admin1,support,serviceadmin
The specified users will be allowed to authenticate to rsync. Default: All users are allowed.
Secrets file = /etc/rsyncd/rsyncd.secrets
This specifies the location of the username and password combinations used by the auth users directive.
Default: None. Must be used in conjunction with auth users.
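Put together, the two directives sit inside the module block as sketched below. The secrets file itself is a separate plain-text file with one username:password pair per line; the password shown in the comment is a made-up placeholder.

```ini
[files]
path = /var/storage/
# Only these rsync accounts may authenticate to the module.
auth users = admin1, support, serviceadmin
# Plain-text file of username:password lines, one per account, e.g.
#   admin1:use-a-long-random-password
secrets file = /etc/rsyncd/rsyncd.secrets
```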
Strict Modes
Having a plain text file with usernames and passwords, like that of the rsync “secrets” file, is not a great idea. It illustrates the kind of risk companies must be willing to accept in order to use rsync's functionality in the enterprise. However, there is another directive, called strict modes, that can offset the risk of the secrets file being compromised to some degree. Strict modes checks that the secrets file can only be read by the account under which the rsync daemon is running. For instance, if rsyncd is running under our dedicated rsync user (as it should, with minimal privileges) then only the rsync user should have access to read the secrets file. The daemon checks the file permissions when a client authenticates, and if they are too loose it refuses to use the secrets file, so authentication fails. This is some nice additional validation that the plain text passwords in the secrets file won’t be accessed by unauthorized users.
That said, most enterprise class technology would never store passwords unencrypted in a text file. This is a qualitative difference between tools geared towards maximum functionality and platforms designed with business risks in mind. However, with the proper care, even rsync can be fairly well protected against accidental and malicious access.
Strict modes = true (default)
The secrets file will be checked for proper permissions, and authentication will fail if other users can read it.
Strict modes = false
The secrets file will not be checked for the proper permissions.
Default: True. The check only has an effect when auth users and secrets file are also configured.
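A module that relies on the secrets file can state the directive explicitly, which keeps the behaviour visible to whoever audits the configuration later. This is a minimal sketch reusing the names from the earlier examples; the secrets file would also need permissions that exclude other users, such as mode 600 with the daemon's account as owner.

```ini
[files]
path = /var/storage/
auth users = admin1
secrets file = /etc/rsyncd/rsyncd.secrets
# Refuse to use the secrets file if accounts other than the daemon's can read it.
strict modes = true
```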
Encryption
Encryption is one area where rsync and rsyncd differ greatly. When rsync is used on the command line against a remote host, the transfer runs over a remote shell, which is SSH by default in current versions, so the traffic is encrypted. The rsync daemon, however, does not encrypt traffic. This means that an rsyncd transfer can be sniffed in transit by a third party, granting them access to whatever information is being transferred. Therefore, rsync operations happening openly across the internet are extremely vulnerable to data exposure.
All rsyncd traffic should occur within a protected intranet or inside of an encrypted tunnel or VPN. At the enterprise level, there is no excuse for passing unencrypted data across the net. Alternative simple file copy solutions such as SCP and SFTP run over SSH and are encrypted by default.
Default: Unencrypted on rsyncd.
Open Port
If rsync is open to the net, anyone who scans the server will find an open port. Changing the port from 873 in the rsyncd.conf file can help obfuscate this, but ultimately if the rsync port is exposed, someone will eventually find it and see what they can do. Like any enterprise service, access to the rsync port should be limited in scope. Firewall ACLs can block unauthorized source IPs, much like the hosts allow and hosts deny directives in rsync itself.
Consider the operations being carried out by rsync. Is the data being copied important? If so, internet facing rsync is a massive vector of risk, and even with careful configuration can prove dangerous over time.
Default: Port 873.
Conclusion
Building a secure rsync setup for enterprise operations requires applying multiple layers of protection, each helping to minimize the surface area of the daemon and limit the remote connections that will be allowed access.
- Only allow necessary remote hosts and user accounts
- Enforce strict modes to validate secrets file access
- Encrypt all rsync transmissions through a tunnel or limit them to an intranet
By following these three rules on every rsync module, you can reduce the chances of rsync-based data exposure significantly, allowing you to take advantage of the functionality of rsync without succumbing to its risks. But whether it’s an enterprise platform or a simple utility, misconfigurations will be the number one risk. People make mistakes all the time, and without the right process controls, those mistakes can come back around as a data breach or major outage. It’s fun to talk about 0-day exploits and fancy hacking methods, but an unprotected rsync server is far more likely and every bit as dangerous.
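As a closing sketch, the example rsyncd.conf from earlier could be hardened along these lines. The allowed subnet and account names are placeholders, and encryption does not appear because it has to be provided outside the file, by a tunnel, VPN, or private network.

```ini
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsync.log
port = 12000

[files]
path = /var/storage/
comment = Primary file server
timeout = 300
read only = true
# Hide the module from listings.
list = false
# Limit access by device: only the backup subnet may connect.
hosts allow = 10.0.5.0/24
# Limit access by user.
auth users = admin1, support, serviceadmin
secrets file = /etc/rsyncd/rsyncd.secrets
# Validate permissions on the secrets file.
strict modes = true
```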