File Permissions

Steven Dieffenbach edited this page Sep 21, 2016 · 9 revisions

File permissions in HDFS provide base-level file system confidentiality. Multiple, configurable levels of file security are available in HDFS. These options include POSIX-like permissions, access control lists (ACLs), simple web-server user credentialing, and Kerberized user credentialing.

I/O Permissions Interactions

When a client communicates with the NameNode, the NameNode evaluates the request against the file metadata (which includes the FsPermission and FsAction objects) to determine whether the client process has sufficient permissions for the given blocks.

When a client communicates with a DataNode, a BlockPoolTokenSecretManager object is used during the DataTransfer to create an access token for the given block. The BlockPoolTokenSecretManager obtains a token from the BlockTokenSecretManager for the block. The BlockTokenSecretManager obtains the user information from UserGroupInformation when creating the token, which allows the client process to be verified as having access to the files. Specifically, the BlockTokenIdentifier object holds the permitted access modes. These can be checked by calling BlockTokenIdentifier.getAccessModes().contains() with the BlockTokenIdentifier.AccessMode in which the file is to be opened.
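The access-mode check described above can be sketched in plain Java. This is a simplified stand-in, not Hadoop's actual classes: BlockTokenSketch and its permits() helper are hypothetical names, and only the idea of getAccessModes().contains() with an AccessMode value is taken from the text. Real block tokens are also signed and carry expiry information, which this sketch leaves out.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified stand-in for a block access token. NOT Hadoop's BlockTokenIdentifier.
final class BlockTokenSketch {
    // Modeled on the idea of BlockTokenIdentifier.AccessMode.
    enum AccessMode { READ, WRITE, COPY, REPLACE }

    private final String userId;              // identity the token was issued to
    private final long blockId;               // block the token is valid for
    private final Set<AccessMode> accessModes;

    BlockTokenSketch(String userId, long blockId, Set<AccessMode> modes) {
        this.userId = userId;
        this.blockId = blockId;
        // EnumSet.copyOf rejects an empty non-EnumSet collection, so handle that case.
        this.accessModes = modes.isEmpty()
                ? EnumSet.noneOf(AccessMode.class)
                : EnumSet.copyOf(modes);
    }

    String getUserId() { return userId; }

    Set<AccessMode> getAccessModes() { return accessModes; }

    // Hypothetical DataNode-side check: the token must name this block
    // and include the mode the client wants to use.
    boolean permits(long requestedBlockId, AccessMode mode) {
        return blockId == requestedBlockId && getAccessModes().contains(mode);
    }
}
```

A token issued with only READ for block 42 would pass permits(42L, AccessMode.READ) and fail both a WRITE request and a READ request for any other block.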

Basic Permissions

The basic level of user and group file permissions in HDFS is provided by default. These permissions are similar to POSIX permissions but somewhat simpler. In HDFS, the only types of data considered for permissions are files and directories. The 'r' and 'w' permissions are supported for files (for reads and for writes/appends). The 'r', 'w', and 'x' permissions are supported for directories (for 'ls'-like operations, file creation and deletion, and child directory access, respectively).
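A POSIX-like check of this kind can be sketched as follows. This is a minimal illustration in plain Java rather than Hadoop's implementation: PermissionCheckSketch and isAllowed() are hypothetical names, and the sketch assumes the usual rule that exactly one permission class applies (owner first, then group, then other).

```java
import java.util.Set;

// Minimal sketch of a POSIX-like permission check. NOT Hadoop's code.
final class PermissionCheckSketch {
    static final int READ = 4;     // 'r'
    static final int WRITE = 2;    // 'w'
    static final int EXECUTE = 1;  // 'x' (directories only in HDFS)

    // mode is the nine rwx bits, e.g. 0640; requested is a combination of
    // READ, WRITE, and EXECUTE.
    static boolean isAllowed(int mode, String owner, String group,
                             String user, Set<String> userGroups, int requested) {
        int granted;
        if (user.equals(owner)) {
            granted = (mode >> 6) & 7;        // owner class applies
        } else if (userGroups.contains(group)) {
            granted = (mode >> 3) & 7;        // group class applies
        } else {
            granted = mode & 7;               // other class applies
        }
        // Every requested bit must be present in the granted bits.
        return (granted & requested) == requested;
    }
}
```

With mode 0640, owner alice, and group staff, alice may read and write, a member of staff may only read, and everyone else is denied. Note that the classes are not cumulative: with mode 0077 the owner is denied even though group and other have full access.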

In the Hadoop project, the FsAction object describes what type of permission an action requires in HDFS (using the 'r', 'w', and 'x' options as in POSIX). The FsPermission object holds the permission flags that exist for a given file or directory. An FsPermission can be defined with a user, group, and other action (or with a string-represented mode). It supports multiple methods, including application of a umask, the sticky bit, and multiple types of permissions checking.
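The umask and sticky-bit arithmetic behind such an object can be illustrated with plain octal modes. This is a sketch under stated assumptions, not FsPermission itself: UmaskSketch and its methods are hypothetical names, and applyUmask() models only the nine rwx bits.

```java
// Sketch of umask and sticky-bit handling on octal modes. NOT Hadoop's FsPermission.
final class UmaskSketch {
    static final int STICKY_BIT = 01000;

    // Clears every bit of requestedMode that is set in umask (rwx bits only).
    static int applyUmask(int requestedMode, int umask) {
        return requestedMode & ~umask & 0777;
    }

    static boolean hasStickyBit(int mode) {
        return (mode & STICKY_BIT) != 0;
    }

    // Renders the nine rwx bits as a string such as "rw-r--r--".
    static String toSymbolic(int mode) {
        char[] flags = {'r', 'w', 'x'};
        StringBuilder sb = new StringBuilder(9);
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = (mode & (1 << bit)) != 0;
            sb.append(set ? flags[(8 - bit) % 3] : '-');
        }
        return sb.toString();
    }
}
```

For example, applying a umask of 022 to a requested mode of 0666 yields 0644, which toSymbolic() renders as rw-r--r--.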

As stated, the only types of data considered are files and directories. HDFS therefore has no need to consider the POSIX permission settings for executable files, which simplifies these checks.

Access Control Lists

Access Control Lists (ACLs) can be enabled in HDFS through a configuration option (they are disabled by default). These ACLs are similar to POSIX ACLs. An access control list is implemented as a list of (type, [name], permissions) entries. Each entry specifies its type (e.g. whether it applies to users, groups, etc.), optionally the name of the entity it applies to (e.g. the group admin), and the permission flags (e.g. rw-).
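The (type, [name], permissions) structure can be sketched as a small value class. This is an illustration only: AclEntrySketch is a hypothetical name, and the colon-separated text form (such as group:admin:rw-) is assumed here because it is the conventional way to write POSIX-style ACL entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch of a single ACL entry: (type, [name], permissions). NOT Hadoop's AclEntry.
final class AclEntrySketch {
    enum Type { USER, GROUP, MASK, OTHER }

    final Type type;
    final String name;   // null when the entry is unnamed, e.g. "other::r--"
    final String perms;  // three flags, e.g. "rw-"

    AclEntrySketch(Type type, String name, String perms) {
        if (!perms.matches("[r-][w-][x-]")) {
            throw new IllegalArgumentException("bad permission flags: " + perms);
        }
        this.type = type;
        this.name = name;
        this.perms = perms;
    }

    // Parses one entry written as type:name:perms; the name part may be empty.
    static AclEntrySketch parse(String spec) {
        String[] parts = spec.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected type:name:perms, got " + spec);
        }
        Type type = Type.valueOf(parts[0].toUpperCase(Locale.ROOT));
        String name = parts[1].isEmpty() ? null : parts[1];
        return new AclEntrySketch(type, name, parts[2]);
    }

    // Parses a comma-separated list of entries.
    static List<AclEntrySketch> parseList(String specs) {
        List<AclEntrySketch> entries = new ArrayList<>();
        for (String spec : specs.split(",")) {
            entries.add(parse(spec.trim()));
        }
        return entries;
    }

    // True if the entry grants the given flag ('r', 'w', or 'x').
    boolean allows(char flag) {
        return (flag == 'r' || flag == 'w' || flag == 'x') && perms.indexOf(flag) >= 0;
    }

    @Override
    public String toString() {
        return type.name().toLowerCase(Locale.ROOT) + ":" + (name == null ? "" : name) + ":" + perms;
    }
}
```

Parsing group:admin:rw- gives an entry of type GROUP named admin that allows 'r' and 'w' but not 'x'. Each file or directory with an ACL carries a list of such entries, compared with the nine fixed bits of the basic permissions.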

ACLs add a further level of configurable security to files and directories in HDFS, but storing them takes more space than the basic permissions detailed above.

Simple User Credentials

In order for a client process to be identified, HDFS must have a method for assigning it credentials for the permissions checking above. The basic level of credentialing is called "Simple User Credentials." This method takes the user identity supplied over the server connection without verifying it, so it should not be relied on for protection against malicious actors.
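Why this mode offers no protection against a malicious client can be sketched as follows. The sketch assumes a setup in which the client reports its own user name, falling back to the operating-system user and honoring an override from the HADOOP_USER_NAME environment variable. SimpleIdentitySketch and resolveUser() are hypothetical names, not Hadoop's API.

```java
import java.util.Map;

// Sketch of unauthenticated ("simple") identity resolution. NOT Hadoop's code.
final class SimpleIdentitySketch {
    // Assumed override variable; whoever controls the client environment controls it.
    static final String OVERRIDE_VAR = "HADOOP_USER_NAME";

    // Returns the user name the client will claim. Nothing here proves the claim.
    static String resolveUser(Map<String, String> env, String osUser) {
        String override = env.get(OVERRIDE_VAR);
        if (override != null && !override.isEmpty()) {
            return override;   // the client can claim to be anyone
        }
        return osUser;         // otherwise trust the local OS account name
    }
}
```

A client would typically call resolveUser(System.getenv(), System.getProperty("user.name")). Because the result is entirely under the client's control, the permission checks above are only as trustworthy as the client itself.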

Kerberized User Credentials

An additional level of file system confidentiality may be obtained by using Kerberos methods to authenticate client processes. Using Kerberos tickets, an HDFS user identity can be assigned to a client process securely, which allows for a stronger permissions system.

In Code
