Alright, let's talk about troubleshooting LDAP. Starting from the beginning, the first thing you might have to troubleshoot is validation failures.
You could see these if your certificate is not right, if your group is not set up properly on the Active Directory or OpenLDAP server, or if your username is not entered in the correct format.
So, for example, if the password is incorrect for the query user, you will see it says "Failed to validate LDAP configuration details", and you will not be able to hit the "Submit" button.
One thing I want to call out is when you upload the certificate here, the "AD-cert.cer" that you see there: once you click the "Validate" button, ACM stores it in the certificates directory, which is located at "/usr/local/dataprotection/var/configmgr/server_data/certificates".
Once the certificate gets stored there, ACM also converts that certificate from ".cer" to ".pem" format. Now, I've made another video to go through what a successful validation looks like.
And if you have a failed validation, what kind of things could you look for in the log to troubleshoot? So, let's go right ahead with that.
So, in order to do so, again, come to the "Logs Directory", press Escape, then Shift+G, and then search for this: "configure LDAP on an appliance".
Once you search for that, it should get you to a message like this, where it's trying to do the "testLDAPconnection". If you scroll down a little further, it does a ping test and then it verifies whether the user is successfully authenticated and is a member of the group.
And you'll see a message like this: "LDAP connection tested successfully". Now, let's review the logs for an LDAP validation failure.
Here, I enter the details, and I'll purposely enter an incorrect value in one of the fields, and we'll see how that goes. This is a secure one, so we'll check that box.
We'll upload the certificate, and then we'll try to validate it. It looks like the validation failed. You can hover over that red exclamation mark. It says "Failed to validate LDAP configuration details".
So again, come to the ACM logs directory, view "server.log", press Shift+G to go to the end of the log, type ?, and then search for this: LDAP Util: testConnection, with a capital "C", one word.
And you’ll see a message like this, and it's clear that it was an issue with either the username or password. In this case, it was a password issue, and we see that the test connection failed.
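The log search shown in the demo can also be done non-interactively. This is a minimal sketch, assuming you pass in the path to "server.log" yourself; the function name `last_log_match` is made up for illustration, and the search string is only the "testConnection" part of the message mentioned above.

```shell
#!/bin/sh
# last_log_match FILE PATTERN  (hypothetical helper name)
# Prints the last line in FILE that contains PATTERN, prefixed with its
# line number. This mirrors jumping to the end of the log and searching
# backward for the most recent occurrence.
last_log_match() {
    # -F treats PATTERN as a fixed string, -n adds the line number.
    grep -n -F -- "$2" "$1" | tail -n 1
}

# Example usage (replace the path with your ACM logs directory):
# last_log_match /path/to/server.log "testConnection"
```

Once you have the line number, you can open the log at that line and read the surrounding messages for the actual failure reason.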
So, with that, let's go through some of the gotchas to know about when you're entering these details. When it asks for "Server Hostname" or "LDAP Hostname", users must provide the FQDN, or fully qualified domain name. IP addresses do not work.
For the query username, users must provide the username in User Principal Name (UPN) format. An example would be "Abc@domain.com". That works all the time.
Admin group settings. Now, this is the admin group that is set up at the Active Directory or OpenLDAP level. When your customer creates the group, they need to ensure that the scope of the group is set to "Global" and the type is "Security".
Now, the query user must be a member of the LDAP admin group. Best practice is generally to use lower case for all values. We also have a code defect that causes trouble when you're using values in upper case, so be aware of that.
For secure LDAP configurations, users must provide the root CA certificate in ".cer" format. And lastly, nested groups are not allowed, meaning the user should be a direct member of the LDAP admin group.
Now, this user we're talking about is the LDAP query user. For example, you can't have the LDAP query user as part of a group and then make that group a member of the admin group. That will not work.
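Some of the gotchas above can be checked before you even open the dialog. The following is a rough sketch, not what ACM itself does: the helper names are hypothetical, and the patterns are simplified assumptions about what an FQDN and a UPN look like.

```shell
#!/bin/sh
# Hypothetical pre-checks for the values entered in the LDAP dialog.

# True when the value looks like a dotted IPv4 address.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
}

# True when the value has at least two dot-separated labels and is not an IP.
looks_like_fqdn() {
    ! is_ipv4 "$1" &&
        printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$'
}

# True when the value has the user@domain shape of a User Principal Name.
is_upn() {
    printf '%s\n' "$1" |
        grep -Eq '^[^@[:space:]]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$'
}

# True when the value contains no upper-case letters.
is_lowercase() {
    [ "$1" = "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" ]
}

# Example usage:
# looks_like_fqdn "dc.x400.sh"   || echo "hostname must be an FQDN, not an IP"
# is_upn "ldap.query@x400.sh"    || echo "query user must be in UPN format"
# is_lowercase "dp_admin"        || echo "prefer lower case for all values"
```

These checks only cover the format. Group scope, group type, and direct membership still have to be confirmed on the Active Directory or OpenLDAP side.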
There's one more requirement that we wanted to cover. For LDAP integration to work successfully on Protection Storage, or Data Domain, the LDAP query user must have "Create/Remove" and "Full Control" permissions for computer objects, because the integration creates a computer object.
So, how do you do that? In Active Directory, locate the LDAP query user and go to the parent OU, or Organizational Unit, where that user lives. Right-click on that, and click on "Delegate Control". When you do that, you'll get a wizard like this. You can click "Next" and add your user, the query user.
Next, create a custom task to delegate. When you do that, you can select "Computer objects" here and ensure you select "Create selected objects in this folder" and "Delete selected objects in this folder". Once you do that, you can click "Next". Ensure you give it "Full Control" privileges, then click "Next" and "Finish".
Once you do that, that will ensure you have a smooth configuration on the Protection Storage, or the Data Domain side as well. Now, when it comes to troubleshooting LDAP type issues, it's very important to go in a systematic order. I always recommend starting with a ping test.
Ensure connectivity using the ping command, and always use the hostname or FQDN for ping. I would suggest pinging the IP address as well, and there's a reason for that. So, you can issue "ping -c 4" followed by the FQDN, which sends four packets, and confirm that the ping was successful.
Now, the reason I said to test the ping to both IP and hostname is that sometimes ping to the IP works but ping to the hostname does not work from ACM, or from one or more components within the IDPA. The reason for that could be that the DNS search domain is missing in the "/etc/resolv.conf" file, which can cause ping failures to the LDAP server hostname.
Honestly, this is the most common issue that we get. So, if you're able to ping the IP address but cannot ping the hostname, always check the "/etc/resolv.conf" file on each of the IDPA components. Ensure that the search domain for the Active Directory, or for the environment, is added so that resolution works.
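The ping and "/etc/resolv.conf" checks described above can be sketched as two small helpers. The function names are hypothetical, and the hostname, IP, and domain you pass in are your own values; nothing here is specific to IDPA beyond the file path mentioned above.

```shell
#!/bin/sh
# check_ping FQDN IP  (hypothetical helper name)
# Pings both the IP and the hostname with four packets each and reports
# which of the two worked, so a DNS-only failure stands out.
check_ping() {
    fqdn=$1
    ip=$2
    ping -c 4 "$ip" >/dev/null 2>&1 && ip_ok=yes || ip_ok=no
    ping -c 4 "$fqdn" >/dev/null 2>&1 && name_ok=yes || name_ok=no
    echo "ip=$ip_ok hostname=$name_ok"
}

# has_search_domain DOMAIN [RESOLV_FILE]  (hypothetical helper name)
# Succeeds when DOMAIN is listed on a "search" line of the resolver file.
has_search_domain() {
    domain=$1
    file=${2:-/etc/resolv.conf}
    awk -v d="$domain" '
        $1 == "search" { for (i = 2; i <= NF; i++) if ($i == d) found = 1 }
        END { exit !found }
    ' "$file"
}

# Example usage:
# check_ping dc.x400.sh <IP-of-LDAP-server>
# has_search_domain x400.sh || echo "search domain missing in /etc/resolv.conf"
```

If `check_ping` reports "ip=yes hostname=no", that points to name resolution, and `has_search_domain` is the next thing to run on that component.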
Then, I would suggest certain port checks. So, what are the port requirements for LDAP integration? TCP ports 389 and 636 must be open for communication between IDPA components and the Active Directory or OpenLDAP server. TCP ports 88 and 464 must be open for Kerberos authentication between Protection Software, which is your Avamar, Protection Storage, which is your Data Domain, and the Active Directory or OpenLDAP server.
How would you test port connectivity? You can do that via the "curl" command. We saw that in the demo. So, you can issue "curl -kv" followed by the FQDN of the LDAP server, a colon, and the port number. Remember, 389 is for non-secure and 636 is for secure. And when you issue that, this is what you're looking for: "Connected to dc.x400.sh" on port 636. Next, troubleshooting using "ldapsearch", which I feel is a very important tool.
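Before moving on to ldapsearch, here is how the curl port check above might look as a small script. This is a sketch under assumptions: the function names and the "secure" / "non-secure" mode labels are made up for illustration, and the timeouts are arbitrary values added so the command does not hang on a port that accepts the connection but never answers HTTP.

```shell
#!/bin/sh
# ldap_port MODE  (hypothetical helper name)
# Maps the LDAP mode to the TCP port named in the port requirements.
ldap_port() {
    case $1 in
        secure) echo 636 ;;
        non-secure) echo 389 ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# check_ldap_port FQDN MODE  (hypothetical helper name)
# Runs curl in verbose mode and keeps only the "Connected to" line,
# which is the line to look for in the output.
check_ldap_port() {
    host=$1
    port=$(ldap_port "$2") || return 1
    curl -kv --connect-timeout 5 --max-time 10 "$host:$port" 2>&1 |
        grep -i 'Connected to'
}

# Example usage:
# check_ldap_port dc.x400.sh secure
```

If nothing is printed, the connection was not established, and a firewall or a wrong hostname is the usual suspect.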
ldapsearch can be used if your validation is failing or if your configuration is failing, because it does something similar to "Validate": using the LDAP details that you have, it performs a test connection and confirms whether we can connect or not. Basically, "ldapsearch" is a command-line tool that opens a connection to an LDAP server, binds to it, and performs a search using a filter.
The results are then displayed in LDIF, the LDAP Data Interchange Format. The ldapsearch tool can be used on IDPA components like ACM, DPC, Search, Avamar, etc., to test the connection with the LDAP server and validate the settings. Now, the syntax is different for non-secure and secure LDAP.
For non-secure, you can issue the command "ldapsearch -h" and put the FQDN of the LDAP server, be it OpenLDAP or Active Directory. With "-p", put the port number. With "-D", put the user credentials, "LDAP_Query@x400.sh", for example, or something in that format. With "-b", you put the base DN. This is where the domain components go, something like "dc=x400,dc=sh". And then, with "-w", you put in the query password.
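The non-secure syntax just described can be put together like this. It is a sketch only: the wrapper names and the `DRY_RUN` switch are hypothetical, added so you can print the command and review it before running it. The flags are the ones named above; some ldapsearch builds may also need "-x" for a simple bind, which is an assumption to verify on your system.

```shell
#!/bin/sh
# run CMD [ARGS...]  (hypothetical helper)
# Prints the command when DRY_RUN=1, otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# ldap_bind_test FQDN PORT BIND_UPN BASE_DN PASSWORD  (hypothetical helper)
# Builds the non-secure ldapsearch call with the options described above.
# Note: in dry-run mode the password is printed too, so use a dummy value.
ldap_bind_test() {
    run ldapsearch -h "$1" -p "$2" -D "$3" -b "$4" -w "$5"
}

# Example usage:
# DRY_RUN=1 ldap_bind_test dc.x400.sh 389 LDAP_Query@x400.sh dc=x400,dc=sh dummy
# ldap_bind_test dc.x400.sh 389 LDAP_Query@x400.sh dc=x400,dc=sh 'real-password'
```

A successful run returns entries in LDIF; a bind failure here usually matches what you would see in "server.log" for the failed validation.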
Similarly, for secure LDAP, you can do so as well: "ldapsearch -H ldaps://" and put the URL there, so the LDAP server FQDN, a colon, and the port number. The other thing when you're configuring secure LDAP is "openssl", for validating the certificates. Run "openssl s_client -connect" and put in the FQDN of the Active Directory or OpenLDAP server, a colon, and the port number, which is 636.
What you should notice is that it shows "Connected". That means that ACM was able to connect on port 636 with that certificate. Now, you may notice when you issue this command that it reports it was unable to validate or verify the chain. And that's fine, because what you're doing here is simply connecting; you're not passing a certificate to validate the chain. Again, the output above was truncated, so be aware of that.
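The two secure checks above, ldapsearch over ldaps:// and the openssl connection test, can be sketched the same way. The wrapper names and the `DRY_RUN` switch are hypothetical. For the secure ldapsearch, the "-D", "-b", and "-w" options are assumed to be the same as in the non-secure form, since only the URL part was spelled out.

```shell
#!/bin/sh
# run CMD [ARGS...]  (hypothetical helper)
# Prints the command when DRY_RUN=1, otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# ldaps_bind_test FQDN BIND_UPN BASE_DN PASSWORD  (hypothetical helper)
# Uses -H with an ldaps:// URL on port 636 instead of -h and -p.
ldaps_bind_test() {
    run ldapsearch -H "ldaps://$1:636" -D "$2" -b "$3" -w "$4"
}

# ldaps_tls_probe FQDN  (hypothetical helper)
# Opens a TLS connection to port 636. Standard input is redirected from
# /dev/null so s_client exits after the handshake instead of waiting.
ldaps_tls_probe() {
    run openssl s_client -connect "$1:636" </dev/null
}

# Example usage:
# ldaps_tls_probe dc.x400.sh
# ldaps_bind_test dc.x400.sh LDAP_Query@x400.sh dc=x400,dc=sh 'real-password'
```

Run the openssl probe first. If that cannot connect, there is no point in testing the bind yet.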
Now, validating query user and admin group details. Let's say you find out that certain things are not right: maybe the group is not right, or the details for the query user are not right. That's where you move on to your next step of troubleshooting. And what do you do? You use PowerShell on the Active Directory server to fetch the user and group objects in DN, or Distinguished Name, format.
So, you have your "Get-ADUser" cmdlet, which gets a specified user object, or performs a search to get multiple user objects. And then you have your "Get-ADGroup" cmdlet, which could be used to get a group, or performs a search to retrieve multiple groups from an Active Directory. We have a screenshot of the output on the left for the user. On the right, for the group. You can see the kind of output that you would receive.
You see the "Distinguished Name". So, this user is "ldap.query". It's present in the "Users" OU. The domain component is "x400.sh". And just by looking at this, I'm able to verify the "User Principal Name". That's what I need to enter in the "Configure LDAP" pop-up, or dialog box.
So, you can confirm all these details here. Now, when it comes to the group, again, you have the DN format and the location of the group. It's again in the "Users" OU. The domain components are "x400" and "sh". Group category is "Security". Remember I told you the group category should be "Security" and the group scope should be "Global". That's, again, something that we can validate from here.
It also shows us the "SamAccountName" and the name of the group, which is "dp_admin". So, this is a good way for you to validate at the customer's Active Directory level as well. Now, something to note is that once LDAP is configured on the IDPA, as I said in my previous slides, ACM stores the details of the LDAP configuration in "ldapconfig.xml". It also stores the password for this query user in a file called "Component Credentials.ReadXml".
And as part of monitoring the components, every few minutes ACM will test an LDAP login against each of the components using these credentials and details. Why does it do so? It wants to ensure that the LDAP configuration is working and that the LDAP query password stored on ACM and on the other components is actually in sync, that it has not changed. That's what it verifies.
Now, in cases where a customer has changed the query user password, LDAP authentication will stop working on all the components, and on the ACM UI you will see an error message which says "LDAP password out of sync". If that is the case, you can come to the ACM GUI, click on the same "Configure external LDAP" pop-up, and click on the check box which says "Update external LDAP password".
And when you do so, you can see all the other details are grayed out, and you can just update the password. When you do so, it updates the password on ACM itself, meaning the "Component Credentials.ReadXml" file that stores the credentials gets updated. It also updates the configured password on each of the point products, or components, of IDPA to ensure the authentication still works. So, this is one thing to really take care of.
Now, one of the most common reasons why configuration fails, or why users are having some sort of authentication issue, is that the time is not in sync. Time is a very sensitive parameter when it comes to LDAP. The time must be in sync between the IDPA components and the Active Directory or OpenLDAP server. Users may experience configuration failures, and even authentication or login failures, if the time is out of sync.
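One way to make "in sync" concrete is to compare epoch timestamps taken on two systems. This is a sketch with hypothetical function names. The 300-second default is an assumption borrowed from the usual Kerberos clock-skew tolerance of five minutes; it is not a documented IDPA threshold.

```shell
#!/bin/sh
# clock_skew_seconds EPOCH_A EPOCH_B  (hypothetical helper name)
# Prints the absolute difference between two epoch timestamps.
clock_skew_seconds() {
    if [ "$1" -ge "$2" ]; then
        echo $(($1 - $2))
    else
        echo $(($2 - $1))
    fi
}

# time_in_sync EPOCH_A EPOCH_B [MAX_SKEW]  (hypothetical helper name)
# Succeeds when the skew is within MAX_SKEW seconds (assumed default: 300).
time_in_sync() {
    max=${3:-300}
    [ "$(clock_skew_seconds "$1" "$2")" -le "$max" ]
}

# Example usage, assuming you collect "date +%s" on each system yourself:
# local_epoch=$(date +%s)
# time_in_sync "$local_epoch" "$remote_epoch" || echo "time is out of sync"
```

Collect the two timestamps as close together as possible, otherwise the delay between the two readings shows up as skew.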
Now, super important, because we use these things day in, day out: the log locations and the log files themselves. What are those? So, when troubleshooting LDAP issues, users must analyze the following log on ACM for any configuration, integration, validation, or monitoring type of errors, and that is your "server.log".
We saw that in the demo as well. Now, if, let's say, there's a configuration failure on one or more components, then we will need to analyze the logs on that particular component where the LDAP configuration failed as well as the "server.log" from the ACM. And here are the log locations. On ACM, we have the "server.log". On Data Protection Central, we have "elg.log".
This is the whole path that's provided here. We have Search. Search has "cis.log". Protection Software has your "userauthentication.log". And there are some instances where you can find some details in "mcserver.log.0" as well, which is in the same location.
For your Protection Storage, or Data Domain, you can review the "messages.engineering" log, and you will need Bash access for that, so be aware of that. And lastly, DPA. You can see the "server.log", which is in the "/opt/emc/dpa/services/logs" directory.
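As a quick reference, the log files named above can be kept in a small lookup. The function name and the component keys are made up for illustration. Only file names are listed where the full path was shown on the slide rather than spoken, so fill in the directories from your own environment.

```shell
#!/bin/sh
# ldap_logs_for COMPONENT  (hypothetical helper name)
# Prints the log file(s) to review for LDAP issues on a given component.
ldap_logs_for() {
    case $1 in
        acm) echo "server.log" ;;
        dpc) echo "elg.log" ;;
        search) echo "cis.log" ;;
        avamar) echo "userauthentication.log mcserver.log.0" ;;
        datadomain) echo "messages.engineering" ;;
        dpa) echo "/opt/emc/dpa/services/logs/server.log" ;;
        *) echo "unknown component: $1" >&2; return 1 ;;
    esac
}

# Example usage:
# ldap_logs_for avamar
```

Whatever component failed, pair its log with "server.log" from ACM, since ACM drives the configuration.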