Advanced Kerberos
Kerberos protects all Hadoop services (Spark, YARN, HDFS, etc.), so a valid Ticket Granting Ticket (TGT) is required to access them. This page explains how to use Kerberos within Terrascope.
To create a TGT, execute the kinit command:
kinit
Password for username@VGT.VITO.BE:
Enter the Terrascope password when prompted.
To check the validity period of the ticket, use the klist command:
klist
Ticket cache: FILE:/tmp/krb5cc_30320
Default principal: username@VGT.VITO.BE
To explicitly destroy the TGT, use:
kdestroy
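The three commands above can be combined so that a password is only requested when it is actually needed. The following is a minimal sketch, assuming the MIT Kerberos client tools, whose klist accepts -s to test silently for a valid ticket. The helper name ensure_tgt and the KLIST and KINIT override variables are illustrative and not part of Terrascope.

```shell
#!/bin/sh
# Sketch: obtain a TGT only when the cache holds no valid one.
# ensure_tgt is a hypothetical helper name, not a Terrascope tool.

# KLIST and KINIT may be overridden, e.g. to point at another install.
KLIST="${KLIST:-klist}"
KINIT="${KINIT:-kinit}"

ensure_tgt() {
    # "klist -s" prints nothing and exits 0 only if the credentials
    # cache can be read and the ticket has not expired (MIT Kerberos).
    if "$KLIST" -s; then
        echo "valid ticket found"
    else
        echo "no valid ticket, running kinit"
        "$KINIT"    # prompts for the Terrascope password
    fi
}
```

Calling ensure_tgt at the start of an interactive session leaves an existing ticket untouched and falls back to the password prompt otherwise.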
TGT lifetime
By default, a TGT has a lifetime of 24 hours. To request the maximum lifetime of 2 days, use:
kinit -l 2d
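As long as a ticket is still valid, it can be renewed, up to one week after it was issued. A request that exceeds these limits is capped. In the example below, a requested lifetime of 4 days and a renewal period of 20 days are reduced to 2 days and 7 days, respectively: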
kinit -l 4d -r 20d
klist
Ticket cache: FILE:/tmp/krb5cc_30320
Default principal: username@VGT.VITO.BE
Valid starting       Expires              Service principal
11/10/2022 16:15:14  11/12/2022 16:05:14  krbtgt/VGT.VITO.BE@VGT.VITO.BE
        renew until 11/17/2022 16:15:14
Clients must renew their tickets before they expire. When a Spark job is submitted, this is typically handled automatically.
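For cases where renewal is not handled automatically, it can be done by hand. The following is a minimal sketch, assuming the MIT Kerberos tools, where kinit -R renews the existing TGT and klist prints a "renew until" line as in the output above. The helper names renew_until and renew_tgt and the KINIT override variable are illustrative.

```shell
#!/bin/sh
# Sketch: inspect the renewal deadline and renew the current TGT.
# renew_until and renew_tgt are hypothetical helper names.

KINIT="${KINIT:-kinit}"

# Read klist output on stdin and print the first "renew until" timestamp.
renew_until() {
    sed -n 's/^[[:space:]]*renew until //p' | head -n 1
}

# Renew the existing TGT without asking for a password.
renew_tgt() {
    # "kinit -R" only succeeds while the ticket is still valid and the
    # "renew until" time has not passed.
    if "$KINIT" -R; then
        echo "ticket renewed"
    else
        echo "renewal failed, run kinit again" >&2
        return 1
    fi
}
```

For example, klist | renew_until prints the renewal deadline, and renew_tgt can be run before the ticket expires. Once the deadline has passed, a new TGT has to be created with kinit.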
Keytabs
Entering a password works when Spark jobs are started interactively, but not when they are started from a workflow. In that case, request an additional Terrascope account with a keytab file, which replaces the password. To create a TGT from a keytab file, use:
kinit -kt /path/to/username.keytab username@VGT.VITO.BE
Do not generate a keytab for a personal Terrascope account: doing so invalidates the password and blocks access to the User Virtual Machine and Jupyter Notebooks.
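In a workflow, it helps to fail early when the keytab is missing instead of letting a later Hadoop call fail. The following is a minimal sketch of such a wrapper. The helper name kinit_from_keytab and the KINIT override variable are illustrative, and the path and principal are the placeholders used above.

```shell
#!/bin/sh
# Sketch: non-interactive TGT creation for a workflow step.
# kinit_from_keytab is a hypothetical helper name.

KINIT="${KINIT:-kinit}"

# Usage: kinit_from_keytab KEYTAB PRINCIPAL
kinit_from_keytab() {
    keytab="$1"
    principal="$2"
    # Stop before calling kinit if the keytab cannot be read.
    if [ ! -r "$keytab" ]; then
        echo "keytab not readable: $keytab" >&2
        return 1
    fi
    # -k: authenticate with a keytab instead of a password
    # -t: path of the keytab file to use
    "$KINIT" -kt "$keytab" "$principal"
}
```

A workflow step could then start with kinit_from_keytab /path/to/username.keytab username@VGT.VITO.BE and abort if the call returns a non-zero status.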
Delegation tokens
In theory, Kerberos alone could handle all authentication. In a distributed system like Hadoop, however, having every client contact the Kerberos service for every interaction could overload it. To avoid this, Hadoop introduced delegation tokens: once a client has authenticated with Kerberos, it obtains a delegation token and uses that token to authenticate against other Hadoop services instead of going back to Kerberos each time.
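The flow described above can be made visible with the standard Hadoop client. The following is a minimal sketch, assuming the hdfs command is on the PATH and a valid TGT exists. Spark normally obtains delegation tokens itself when a job is submitted, so this is an illustration of the mechanism, not a required step on Terrascope. The helper name fetch_token, the HDFS override variable, and the token file path are illustrative.

```shell
#!/bin/sh
# Sketch: fetch an HDFS delegation token once, then reuse it.
# fetch_token is a hypothetical helper name.

HDFS="${HDFS:-hdfs}"

# Usage: fetch_token TOKEN_FILE
fetch_token() {
    token_file="$1"
    # "hdfs fetchdt" authenticates with Kerberos (using the TGT) and
    # writes the resulting delegation token to the given file.
    "$HDFS" fetchdt "$token_file" || return 1
    # Hadoop clients load tokens from the file named by this variable
    # and use them instead of contacting the Kerberos service again.
    HADOOP_TOKEN_FILE_LOCATION="$token_file"
    export HADOOP_TOKEN_FILE_LOCATION
}
```

After fetch_token /tmp/my.token succeeds, Hadoop commands started from the same shell authenticate with the stored token until it expires.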