
Every company struggles to keep only the documents it wants and will need in the future. That keeps the total number of files down but, even more importantly, it makes sure we’re not violating any GDPR legislation. That matters to our Privacy Officer, and of course to all of us.


GDPR, we love (to hate) it

We just have to comply with the rules, sounds simple enough! But what sounds simple might be more complicated in practice. The daily hustle and bustle occupies our minds and might make us forget to clean up after a project is done. And isn’t GDPR sometimes just inconvenient too… We just need to share that resume and we need to do it now! And yes, there might be some contact information in that Excel file, but you will remove that as soon as you do not need it anymore, right?

Right… Wrong! Well, at least sometimes. We are all human, which means our actions are not always in line with our intentions. That doesn’t mean we are willfully violating the GDPR legislation, but it also doesn’t mean violations never happen.

Your Privacy Officer may be aware of this and is trying to encourage all users to clean up: check your downloads folder, delete stored attachments, empty your trash, clean up the project folder at the end of the project. But that doesn’t mean an extra check wouldn’t be an excellent idea.

Excuse me, how many?!

At Cmotions, we knew we might be at risk, simply due to the sheer number of files on our filesystem, even though we only keep our own project files and don’t store our customers’ data anywhere on our own filesystem. That’s why our Privacy Officer tried to come up with rules to eliminate GDPR-sensitive files as much as possible. To the other employees, it felt like these rules were not doing what they were meant to do, and we, as data professionals, were convinced we should be able to do better. This is when we first came up with the idea of creating a Python package to do these checks for us. The idea was to make this work a lot easier and to solve the problems described above. With just a few clicks you should be able to see a list of files you need to check for GDPR-sensitive information. Preferably, you should also be able to see which GDPR rule was violated and how.
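To make the idea concrete, here is a minimal sketch of what such a check could look like. This is not the DriveScanner code or its API: the names `RULES`, `scan_text` and `scan_folder` are hypothetical, and the two regular expressions (email addresses and IBAN-like strings) are simplified assumptions chosen only for illustration.

```python
import re
from pathlib import Path

# Hypothetical, deliberately simple rules; real checks would need to be stricter.
RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b"),
}


def scan_text(text):
    """Return {rule_name: [matches]} for every rule that finds something."""
    hits = {}
    for name, pattern in RULES.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits


def scan_folder(root, suffixes=(".txt", ".csv")):
    """Walk a folder and report which plain-text files trigger which rule."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            # Ignore undecodable bytes so one odd file does not stop the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits = scan_text(text)
            if hits:
                report[str(path)] = hits
    return report
```

Calling `scan_folder` on a project folder would return a dictionary that maps each flagged file to the rules it triggered and the matching text, which is the kind of "which rule, and how" overview described above. This sketch only reads plain-text files; formats such as Excel or PDF would need their own readers.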

With this in mind, we started building our Python package ‘DriveScanner’, and now we’re proud to share our first version with you. It might not be perfect yet, since it’s a work in progress, but what better way to improve than with the help of our community? Check out the code in our repository, or simply start using the package by installing it with pip: pip install drivescanner.

Principal Consultant & Data Scientist