Automate your DataLayer tests with Selenium & Python
Let’s face it: nobody enjoys running DataLayer quality checks on a regular basis. Most of us prefer developing a cool new feature or diving deep into a new analysis. But to do any of this while producing value, data quality is a must (“shit in, shit out”). It is therefore crucial to know whether all the pages in your CMS push the correct page attributes into the DataLayer.

To automate this and other test cases, I’ve developed a handful of Python modules for such quality assurance tasks. The prerequisite is that you have Python (3.10 or above) installed on your machine.

The Setup

The first function retrieves the sitemap.xml file and returns a list of all URLs it contains. Optionally, you can use the limit argument to cap the length of the list. The code for this and all other functions is in the GitHub repo linked at the end of this post.

The next thing you want to do is take this list of URLs and use selenium to open each webpage and retrieve the DataLayer. This part is a little trickier and needs to take some optional actions into account.

The URL argument is obvious: it’s the URL of the webpage you want to visit to retrieve the DataLayer object from.

The index argument tells the function which occurrence of a specific DataLayer event you want to retrieve. If you have multiple scroll events and want to check the DataLayer for the first one, the index is 0; for the third one, it’s 2.

The event argument is “None” by default. In this case, you get the object from the DataLayer that sits at the position given by the index argument. If you want to check e.g. page information that is populated on load, you can often leave the event as it is and set the index to 0, as this information is often the first element.

The navigation_steps argument is the fun part. It takes a list of instructions that selenium executes to simulate user behavior on your webpage. You can click stuff, scroll through it or even submit a form.
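To make the index, event and navigation_steps arguments more tangible, here is a minimal sketch rather than the actual code from the repo. The function name select_datalayer_entry, the step keys “action” and “selector”, the CSS selectors and the sample DataLayer are all assumptions made up for illustration; only the “wait” key and the index/event behavior are taken from the description in this post. In a real run, the DataLayer list would typically come from selenium via driver.execute_script("return window.dataLayer;").

```python
from typing import Any, Optional


def select_datalayer_entry(
    data_layer: list[dict[str, Any]],
    index: int = 0,
    event: Optional[str] = None,
) -> Optional[dict[str, Any]]:
    """Pick one entry from a DataLayer list (hypothetical helper).

    With event=None, the entry at position `index` is returned.
    With an event name, only entries with that "event" value are
    considered and `index` counts occurrences of that event.
    Returns None when no such entry exists.
    """
    if event is None:
        candidates = data_layer
    else:
        candidates = [
            entry
            for entry in data_layer
            if isinstance(entry, dict) and entry.get("event") == event
        ]
    if 0 <= index < len(candidates):
        return candidates[index]
    return None


# Hypothetical navigation steps: the key names "action" and "selector"
# and the CSS selectors are assumptions, not the repo's actual format.
navigation_steps = [
    {"action": "click", "selector": "#consent-accept", "wait": 10},
    {"action": "scroll", "selector": "footer", "wait": 5},
    {"action": "scroll", "selector": "h1", "wait": 5},
]

# Made-up DataLayer content, standing in for what selenium would return.
sample_data_layer = [
    {"page_type": "article", "page_language": "en"},
    {"event": "scroll", "percent": 25},
    {"event": "scroll", "percent": 50},
    {"event": "scroll", "percent": 75},
]

page_info = select_datalayer_entry(sample_data_layer, index=0)
third_scroll = select_datalayer_entry(sample_data_layer, index=2, event="scroll")
```

Returning None for a missing entry, instead of raising an error, is a design choice of this sketch: it lets a test run continue with the next URL and report the gap afterwards.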
Such a list can, for example, instruct selenium to click a button in the consent manager, scroll to the page’s footer and then scroll back to the H1. The “wait” key of each step sets the maximum number of seconds selenium is allowed to wait for the element to appear on the page. My function only includes clicking and scrolling as possible actions, but feel free to add more.

The last two arguments instruct selenium to wait for your CMP to be loaded and visible. Just pass the CSS selector for an element within the CMP in the cmp_selector argument and that’s it.

A separate function checks for missing keys. All you need to do is define a list of keys that must be included in your DataLayer. The function returns a list of all those that are missing.

Before we put it all together, we need to add the ability to send an email (e.g. to yourself or someone managing your website). Of course, you could use another communication tool instead, such as Slack or Microsoft Teams webhooks. But since email is commonly available, I am sticking to it this time. The repo contains a generic function that sends an email using the SMTP credentials of your mail server. There are several posts and tutorials on how to get these for almost any provider.

So let’s put it all together

First, we take all the above functions (except for the one for emails) and put them into a functions.py file as a collection of utility functions. Then we create a file “send_mail.py” and add the mail function as well as all necessary modules to it. Lastly, we create “main.py” to tie everything together.

When you run main.py, selenium opens the first 10 pages in your sitemap.xml file (because the limit argument is “10”), checks whether the defined DataLayer keys exist, saves the missing ones to a file and sends that file via email.

I hope this helps you in your journey to automating DataLayer QAs!
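For readers who want a feel for the sitemap and missing-key steps before opening the repo, here is a rough sketch using only the standard library. It is not the code from the repo: the function names parse_sitemap_urls, fetch_sitemap_urls and find_missing_keys, the example URLs and the key names are assumptions for illustration. It also assumes a plain sitemap (a urlset) and not a sitemap index.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Any, Optional

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap_urls(xml_content, limit: Optional[int] = None) -> list[str]:
    """Return all <loc> URLs from sitemap XML (str or bytes).

    If `limit` is given, only the first `limit` URLs are returned.
    """
    root = ET.fromstring(xml_content)
    urls = [
        loc.text.strip()
        for loc in root.iter(f"{SITEMAP_NS}loc")
        if loc.text
    ]
    return urls if limit is None else urls[:limit]


def fetch_sitemap_urls(sitemap_url: str, limit: Optional[int] = None) -> list[str]:
    """Download a sitemap.xml and return the URLs it lists."""
    with urllib.request.urlopen(sitemap_url, timeout=30) as response:
        return parse_sitemap_urls(response.read(), limit)


def find_missing_keys(entry: dict[str, Any], required_keys: list[str]) -> list[str]:
    """Return the required keys that are absent from a DataLayer entry."""
    return [key for key in required_keys if key not in entry]


# Made-up sitemap content and DataLayer entry for a dry run without network.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "<url><loc>https://example.com/blog</loc></url>"
    "</urlset>"
)
urls = parse_sitemap_urls(sample_sitemap, limit=2)

required_keys = ["page_type", "page_language", "page_id"]  # hypothetical keys
missing = find_missing_keys({"page_type": "article"}, required_keys)
```

Keeping the XML parsing separate from the download makes the parsing easy to try out on a local string, as in the dry run above, without hitting a real website.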
Feel free to check out the GitHub repo with all files: https://github.com/ramonseradj/static_cms_qa_public